Re: Missing data in range query

2014-10-08 Thread Robert Coli
On Tue, Oct 7, 2014 at 3:11 PM, Owen Kim ohech...@gmail.com wrote:

 Sigh, it is a bit grating. I (genuinely) appreciate your acknowledgement
 of that. Though, I didn't intend for the question to be about
 supercolumns.


(Yep, understand tho that if you hadn't been told that advice before, it
would grate a lot less. I will try to remember that Owen Kim has received
this piece of info, and will do my best to not repeat it to you... :D)


 It is possible I'm hitting an odd edge case though I'm having trouble
 reproducing the issue in a controlled environment since there seems to be a
 timing element to it, or at least it's not consistently happening. I
 haven't been able to reproduce it on a single node test cluster. I'm moving
 on to test a larger one now.


Right, my hypothesis is that there is something within the supercolumn
write path which differs from the non-supercolumn write path. In theory
this should be less likely since the 1.2-era supercolumn rewrite.

To be clear, are you reading back via PK? No secondary indexes involved,
right? The only bells your symptoms are ringing are secondary index bugs...

=Rob


Re: assertion error on joining

2014-10-08 Thread Kais Ahmed
I found the problem.

JIRA ticket:
https://issues.apache.org/jira/browse/CASSANDRA-8081

2014-10-06 18:45 GMT+02:00 Kais Ahmed k...@neteck-fr.com:

 Hi all,

 I'm a bit stuck. I want to expand my C* 2.0.6 cluster, but I encountered
 an error on the new node.

 ERROR [FlushWriter:2] 2014-10-06 16:15:35,147 CassandraDaemon.java (line
 199) Exception in thread Thread[FlushWriter:2,5,main]
 java.lang.AssertionError: 394920
 at
 org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
 at
 org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
 at
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
 ...

 This assertion is here:

 public static void writeWithShortLength(ByteBuffer buffer, DataOutput out) throws IOException
 {
     int length = buffer.remaining();
     --> assert 0 <= length && length <= FBUtilities.MAX_UNSIGNED_SHORT : length;
     out.writeShort(length);
     write(buffer, out); // writing data bytes to output source
 }
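
For context, FBUtilities.MAX_UNSIGNED_SHORT is 0xFFFF (65535), the largest length
that fits in the two-byte prefix written by writeShort(); the assertion fires
because the buffer being flushed (likely the row key, given maybeWriteRowHeader in
the stack trace) serializes to 394,920 bytes. A minimal illustration of the failing
check, not Cassandra source:

public class ShortLengthCheck {
    public static void main(String[] args) {
        int maxUnsignedShort = 0xFFFF; // FBUtilities.MAX_UNSIGNED_SHORT == 65535
        int length = 394920;           // the value reported by the AssertionError above
        // A serialized key must fit in a two-byte length prefix, so this prints false:
        System.out.println(0 <= length && length <= maxUnsignedShort);
    }
}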

 But I don't know what I can do to complete the bootstrap.

 Thanks,




How to enable client-to-node encrypted communication with the Astyanax Cassandra client

2014-10-08 Thread Lu, Boying
Hi, All,

I'm trying to enable client-to-node encrypted communication in Cassandra (2.0.7) 
with the Astyanax client library (version 1.56.48).

I found the links about how to enable this feature:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureSSLClientToNode_t.html
But it only describes how to set things up on the server side, not the client side.

Here is my configuration on the server side (in yaml):
client_encryption_options:
    enabled: true
    keystore: full-path-to-keystore-file      # same file used by Cassandra server
    keystore_password: some-password
    truststore: fullpath-to-truststore-file   # same file used by Cassandra server
    truststore_password: some-password
    # More advanced defaults below:
    # protocol: TLS
    # algorithm: SunX509
    # store_type: JKS
    cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA]
    require_client_auth: true

http://www.datastax.com/dev/blog/accessing-secure-dse-clusters-with-cql-native-protocol
This link says something about the client side, but not how to do it with the 
Astyanax client library.

Searching the Astyanax source code, I found the class SSLConnectionContext, which 
may be useful. Here is my code snippet:
AstyanaxContext<Cluster> clusterContext = new AstyanaxContext.Builder()
    .forCluster(clusterName)
    .forKeyspace(keyspaceName)
    .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
        .setRetryPolicy(new QueryRetryPolicy(10, 1000)))
    .withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl(_clusterName)
        .setMaxConnsPerHost(1)
        .setAuthenticationCredentials(credentials)
        .setSSLConnectionContext(sslContext)
        .setSeeds(String.format("%1$s:%2$d", uri.getHost(), uri.getPort()))
    )
    .buildCluster(ThriftFamilyFactory.getInstance());

But when I tried to connect to the Cassandra server, I got the following error:
Caused by: org.apache.thrift.transport.TTransportException: 
javax.net.ssl.SSLHandshakeException: Remote host closed connection during 
handshake
at 
org.apache.thrift.transport.TIOStreamTransport.flush(TIOStreamTransport.java:161)
at 
org.apache.thrift.transport.TFramedTransport.flush(TFramedTransport.java:158)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:65)
at 
org.apache.cassandra.thrift.Cassandra$Client.send_login(Cassandra.java:567)
at 
org.apache.cassandra.thrift.Cassandra$Client.login(Cassandra.java:559)
at 
com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.open(ThriftSyncConnectionFactoryImpl.java:203)
... 6 more

It looks like my SSL settings are incorrect.

Does anyone know how to resolve this issue?
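
One way to narrow this down is to check the TLS material independently of Astyanax.
Below is a minimal JSSE sketch (standard JDK only); the host, store paths, passwords,
and port are placeholders to substitute, with 9160 assumed as the Thrift rpc_port.
Note that with require_client_auth: true on the server, the client must also present
a keystore whose certificate is in the server's truststore.

import java.io.FileInputStream;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManagerFactory;

public class SslHandshakeCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder paths and passwords -- substitute your own client-side stores.
        KeyStore keyStore = KeyStore.getInstance("JKS");
        keyStore.load(new FileInputStream("/path/to/client-keystore.jks"),
                "keystore-password".toCharArray());
        KeyStore trustStore = KeyStore.getInstance("JKS");
        trustStore.load(new FileInputStream("/path/to/client-truststore.jks"),
                "truststore-password".toCharArray());

        KeyManagerFactory kmf = KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, "keystore-password".toCharArray());
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(trustStore);

        SSLContext ctx = SSLContext.getInstance("TLS");
        ctx.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);

        // 9160 is the default Thrift rpc_port that Astyanax connects to.
        try (SSLSocket socket = (SSLSocket) ctx.getSocketFactory().createSocket("cassandra-host", 9160)) {
            socket.setEnabledCipherSuites(new String[] { "TLS_RSA_WITH_AES_128_CBC_SHA" });
            socket.startHandshake(); // throws SSLHandshakeException if stores/ciphers don't line up
            System.out.println("Handshake OK: " + socket.getSession().getCipherSuite());
        }
    }
}

If the handshake succeeds here but still fails through Astyanax, the problem is in how
sslContext is constructed or passed; if it fails here too, the keystore/truststore
contents themselves are the issue.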

Thanks

Boying


RE: Doubts with the values of the parameter broadcast_rpc_address

2014-10-08 Thread Ricard Mestre Subirats
Hi Tyler,

I tried the configuration you suggested and it didn't work. I'm going to 
explain our scenario and the configurations we tried that didn't work.
We work with VMware virtual machines and we need to configure a 3-node Cassandra 
cluster. We configured cassandra.yaml with the following values:

At the machine with IP 192.168.150.112:
-cluster_name: 'CassandraCluster1'
-seeds: 192.168.150.112
-listen_address: 192.168.150.112
-rpc_address: 0.0.0.0
-broadcast_rpc_address: 192.168.150.112

At the machine with IP 192.168.150.113:
-cluster_name: 'CassandraCluster1'
-seeds: 192.168.150.112
-listen_address: 192.168.150.113
-rpc_address: 0.0.0.0
-broadcast_rpc_address: 192.168.150.113

Then, if we start the service and execute “nodetool status” the result is the 
following:
nodetool: Failed to connect to '127.0.0.1:7199' - NoRouteToHostException: 
'There is not any route to the `host''.

We tested the following configuration too:
At the machine with IP 192.168.150.112:
-cluster_name: 'CassandraCluster1'
-seeds: 192.168.150.112
-listen_address: 192.168.150.112
-rpc_address: localhost
-broadcast_rpc_address: 0.0.0.0

At the machine with IP 192.168.150.113:
-cluster_name: 'CassandraCluster1'
-seeds: 192.168.150.112
-listen_address: 192.168.150.113
-rpc_address: localhost
-broadcast_rpc_address: 0.0.0.0

The result when starting the service is:
broadcast_rpc_address cannot be 0.0.0.0
Fatal configuration error; unable to start. See log for stacktrace.

Can you give us any advice on getting the Cassandra cluster working? 
What is the right configuration in our case?

Thank you!

Ricard

From: Tyler Hobbs [mailto:ty...@datastax.com]
Sent: Tuesday, 7 October 2014 21:00
To: user@cassandra.apache.org
Subject: Re: Doubts with the values of the parameter broadcast_rpc_address

The broadcast_rpc_address should be an IP address that drivers/clients can 
connect to.  This is what will show up in the system.peers table under 
rpc_address.  In most cases it should match the value of broadcast_address 
(or listen_address, if broadcast_address isn't set).
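
To verify what clients will actually see, you can read system.peers directly. Here is
a minimal sketch using the DataStax Java driver (driver 2.x API assumed; 192.168.150.112
is taken from the configuration above and 9042 is the default native-protocol port,
so adjust if you changed native_transport_port):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class PeerRpcAddressCheck {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("192.168.150.112").build();
             Session session = cluster.connect()) {
            // Each peer row should report the broadcast_rpc_address you configured,
            // not 0.0.0.0.
            for (Row row : session.execute("SELECT peer, rpc_address FROM system.peers")) {
                System.out.println(row.getInet("peer") + " -> " + row.getInet("rpc_address"));
            }
        }
    }
}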

On Tue, Oct 7, 2014 at 6:04 AM, Ricard Mestre Subirats 
ricard.mestre.subir...@everis.com 
wrote:
Hi everyone,

We were working with Cassandra clusters on version 2.0 and now we want to work 
with clusters on version 2.1. We configured cassandra.yaml as we configured it in 
the previous version, but when starting the service there is a fatal error. The 
log tells us that if rpc_address is configured to 0.0.0.0, then 
broadcast_rpc_address has to be set too. But we don't know the possible values 
for this parameter.

Can anyone explain to us the functionality of this new parameter and a possible 
value?

Thank you very much!

Ricard






--
Tyler Hobbs
DataStax http://datastax.com/




Re: How to enable client-to-node encrypted communication with the Astyanax Cassandra client

2014-10-08 Thread Ben Bromhead
Haven't personally followed this but give it a go:
http://lyubent.github.io/security/planetcassandra/2013/05/31/ssl-for-astyanax.html

-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
http://twitter.com/instaclustr | +61 415 936 359


Re: Doubts with the values of the parameter broadcast_rpc_address

2014-10-08 Thread Tyler Hobbs
On Wed, Oct 8, 2014 at 5:20 AM, Ricard Mestre Subirats 
ricard.mestre.subir...@everis.com wrote:

 At the machine with IP 192.168.150.112:

 -cluster_name: 'CassandraCluster1'

 -seeds: 192.168.150.112

 -listen_address: 192.168.150.112

 -rpc_address: 0.0.0.0

 -broadcast_rpc_address: 192.168.150.112



 At the machine with IP 192.168.150.113:

 -cluster_name: 'CassandraCluster1'

 -seeds: 192.168.150.112

 -listen_address: 192.168.150.113

 -rpc_address: 0.0.0.0

 -broadcast_rpc_address: 192.168.150.113


This is the correct configuration.




 Then, if we start the service and execute “nodetool status” the result is
 the following:

 nodetool: Failed to connect to '127.0.0.1:7199' -
 NoRouteToHostException: 'There is not any route to the `host''.


Nodetool does not (generally) use rpc_address/broadcast_rpc_address,
because it's not using the normal API; it uses JMX.  This is a different
problem.  If you want to check rpc_address/broadcast_rpc_address, use cqlsh
(and pass an address).

You can specify a hostname for nodetool with the -h option: nodetool -h
192.168.150.112 status.  Depending on your setup, you may also need to edit
the line in conf/cassandra-env.sh that sets this option:
-Djava.rmi.server.hostname=<public name>


-- 
Tyler Hobbs
DataStax http://datastax.com/


significant NICE cpu usage

2014-10-08 Thread Ian Rose
Hi -

We are running a small 3-node cassandra cluster on Google Compute Engine.
I notice that our machines are reporting (via a collectd agent, confirmed
by `top`) a significant amount of cpu time in the NICE state.  For example,
one of our machines is an n1-highmem-4 (4 cores, 26 GB RAM).  Here is the
cpu line from top, just now:

%Cpu(s): 10.3 us, 15.7 sy, 22.8 ni, 44.4 id,  5.3 wa,  0.0 hi,  1.6 si,
 0.0 st

So the cpus are spending 20% of the time in NICE, which seems strange.

Now I have no direct evidence that Cassandra has anything to do with this,
but we have several other backend services that run on other nodes and none
of them have shown any significant NICE usage at all.  I have tried
searching for NICE processes to see if one of them is the source, but I
can't find anything that looks viable.  For example, this command (which I
*think* is correct, but please sanity check me) shows that none of the
processes with a negative priority have noticeable cpu usage.

$ ps -eo nice,pcpu,pid,args | grep '^\s*\-[1-9]'
-20  0.0     5 [kworker/0:0H]
-20  0.0    15 [kworker/1:0H]
-20  0.0    20 [kworker/2:0H]
-20  0.0    25 [kworker/3:0H]
-20  0.0    26 [khelper]
-20  0.0    28 [netns]
-20  0.0    29 [writeback]
-20  0.0    32 [kintegrityd]
-20  0.0    33 [bioset]
-20  0.0    34 [crypto]
-20  0.0    35 [kblockd]
-20  0.0    47 [kthrotld]
-20  0.0    48 [ipv6_addrconf]
-20  0.0    49 [deferwq]
-20  0.0   162 [scsi_tmf_0]
-20  0.0   179 [kworker/0:1H]
-20  0.0   197 [ext4-rsv-conver]
-20  0.0   214 [kworker/1:1H]
-20  0.0   480 [kworker/3:1H]
-20  0.0   481 [kworker/2:1H]
-20  0.0  1421 [ext4-rsv-conver]

By comparison, here is the cassandra process (just to verify that pcpu is
showing real values):

$ ps -eo nice,pcpu,pid,args | grep cassandra
  0  217  2498 java -ea
-javaagent:/mnt/data-1/mn/cassandra/bin/../lib/jamm-0.2.5.jar [...]

At this point I'm a bit stumped...  any ideas?

Cheers,
Ian


Re: significant NICE cpu usage

2014-10-08 Thread Andras Szerdahelyi
Hello,

AFAIK compaction threads run with a lower priority; I believe that will show up 
as niced.

Regards,
Andras




Consistency Levels

2014-10-08 Thread William Katsak

Hello,

I was wondering if anyone (DataStax?) has any usage data about 
consistency levels. For example, what consistency levels are real 
applications using in real production scenarios? Who is using eventual 
consistency (ONE-ONE) in production vs. strong consistency 
(QUORUM-QUORUM, ONE-ALL)? Obviously it depends on the application, but I 
am trying to collect some information on this.


I saw the talk from Christos Kalantzis (from Cassandra13 I think) about 
Netflix using eventual consistency, but I was wondering if there is any 
more data out there.


Thanks in advance,

Bill Katsak
Ph.D. Student
Department of Computer Science
Rutgers University


unsubscribe

2014-10-08 Thread Alexis


Re: Consistency Levels

2014-10-08 Thread DuyHai Doan
One should be careful about using ALL consistency because by doing so, you
sacrifice high availability (losing one node of the replica set prevents you
from writing/reading with ALL). Lots of people choose Cassandra for high
availability, so using ALL is kind of a showstopper.

Of course there are specific cases where such a level can be relevant, but
generally I advise people to use the QUORUM/QUORUM pair rather than
ONE/ALL or ALL/ONE if they want stronger consistency than ONE/ONE. The
latter combination is used for low latency when immediate consistency is
not a requirement; users rely on the anti-entropy processes
(read repair, consistent reads, scheduled repair) to make the data converge.
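
To make the QUORUM/QUORUM recommendation concrete, here is the usual overlap
arithmetic as a small sketch (RF=3 assumed for illustration; quorum is floor(RF/2) + 1):

public class QuorumOverlap {
    public static void main(String[] args) {
        int rf = 3;              // replication factor (assumed for illustration)
        int quorum = rf / 2 + 1; // 2 when RF = 3
        // A quorum write and a quorum read overlap in at least one replica
        // whenever W + R > RF, so the read is guaranteed to see the latest write.
        System.out.println(quorum + quorum > rf); // true
    }
}

With ONE/ONE both sides drop to 1, so 1 + 1 is not greater than 3 and a read can
miss the most recent write until the anti-entropy processes catch up.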






RE: Dynamic schema modification an anti-pattern?

2014-10-08 Thread Syed Masthan
I would definitely use dynamic columns for this instead of modifying the schema 
dynamically. That sounds like an anti-pattern to me.
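
For illustration, a minimal sketch of what the dynamic-column approach could look
like with Astyanax. The column family name, keyspace handle, and serializers are
placeholders, and the method names are assumed from the Astyanax 1.56.x Thrift API,
so verify them against the version you use:

import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.MutationBatch;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.serializers.StringSerializer;

public class EntityAttributeWriter {
    // One static column family definition; attribute names become dynamic column
    // names under each entity's row, so no schema change is needed per attribute.
    private static final ColumnFamily<String, String> CF_ENTITY =
            ColumnFamily.newColumnFamily("Entity", StringSerializer.get(), StringSerializer.get());

    public static void writeAttribute(Keyspace keyspace, String entityId,
                                      String attrName, String value) throws Exception {
        MutationBatch batch = keyspace.prepareMutationBatch();
        // A new attribute name is just another column in the wide row.
        batch.withRow(CF_ENTITY, entityId).putColumn(attrName, value);
        batch.execute();
    }
}

Reads then slice the row by column (attribute) name, which keeps all of an entity's
attributes in one partition without any schema alterations.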



From: toddf...@gmail.com [mailto:toddf...@gmail.com] On Behalf Of Todd Fast
Sent: Monday, October 06, 2014 11:57 PM
To: Cassandra Users
Subject: Dynamic schema modification an anti-pattern?

There is a team at my work building an entity-attribute-value (EAV) store using 
Cassandra. There is a column family, called Entity, where the partition key is 
the UUID of the entity, and the columns are the attribute names with their 
values. Each entity will contain hundreds to thousands of attributes, out of a 
list of up to potentially ten thousand known attribute names.

However, instead of using wide rows with dynamic columns (and serializing type 
info with the value), they are trying to use a static column family and modify 
the schema dynamically as new named attributes are created.

(I believe one of the main drivers of this approach is to use collection 
columns for certain attributes, and perhaps to preserve type metadata for a 
given attribute.)

This approach goes against everything I've seen and done in Cassandra, and is 
generally an anti-pattern for most persistence stores, but I want to gather 
feedback before taking the next step with the team.

Do others consider this approach an anti-pattern, and if so, what are the 
practical downsides?

For one, this means that the Entity schema would contain the superset of all 
columns for all rows. What is the impact of having thousands of column names 
in the schema? And what are the implications of modifying the schema 
dynamically on a decent-sized cluster (5 nodes now, growing to 10s later) under 
load?

Thanks,
Todd


Re: Missing data in range query

2014-10-08 Thread Owen Kim
Nope. No secondary index. Just a slice query on the PK.






Re: Consistency Levels

2014-10-08 Thread Robert Coli

Anecdotally, my sense of the typical deploy is that it is RF=N=3, with
CL.ONE.

=Rob


Re: significant NICE cpu usage

2014-10-08 Thread Ian Rose
Ah, thanks Andras.

Running `ps -T` I do in fact see that most Cassandra threads have priority
0 but a few are at priority 4, and these also show non-trivial pcpu.  So
that seems like it!

- Ian





Re: Consistency Levels

2014-10-08 Thread Jack Krupansky
I don't know of any such data collected by DataStax - it's not like we're 
the NSA, sniffing all requests.


ONE is certainly fast, but only fine if you don't have an immediate need to 
read the data or don't need the absolutely most recent value.


To be clear, even QUORUM write is eventual consistency - to all nodes beyond 
the immediate quorum.


-- Jack Krupansky




Re: Consistency Levels

2014-10-08 Thread William Katsak

Thanks.

I am thinking more in terms of the combination of read/write. If I am 
correct, QUORUM reads and QUORUM writes (or ONE-ALL) should deliver 
strong consistency in the absence of failures, correct? Or is this 
still considered eventual consistency, somehow?


-Bill




--

William Katsak wkat...@cs.rutgers.edu
Ph.D. Student
Rutgers University
Department of Computer Science



Re: Consistency Levels

2014-10-08 Thread Jack Krupansky
It would certainly depend on the nuances of your definitions! Especially 
since you added this monster of a caveat: "in the absence of failures".


Here's a scenario for you: 1) Write at quorum, 2) Add 3 nodes, 3) 
Immediately read at [the new] quorum - no guarantee that the new nodes will 
have fully bootstrapped.


Who knows how many other modalities there might be - despite however many 
caveats you want to tack on.


Strong consistency is a model, not necessarily a reality at any point in 
time even if it was a reality at a prior point in time.


If I deliberately decommission a node, that isn't necessarily a "failure", 
is it?


All of that said, it depends on where you're trying to get to.

-- Jack Krupansky
