Re: Linear scalability problems

2013-04-05 Thread Hiller, Dean
If you double your nodes, you should be doubling your webservers too (that is, if 
you are trying to prove it scales linearly).  We had to spend time finding the 
correct ratio for our application (it ended up being 19 webservers to 20 data 
nodes, so now we just assume 1 to 1). You can use Amazon to find that info very 
cheaply.

Dean

From: Anand Somani meatfor...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Thursday, April 4, 2013 1:05 PM
To: user@cassandra.apache.org
Subject: Re: Linear scalability problems

RF=3.

On Thu, Apr 4, 2013 at 7:08 AM, Cem Cayiroglu cayiro...@gmail.com wrote:
What was the RF before adding nodes?

Sent from my iPhone

On 04 Apr 2013, at 15:12, Anand Somani meatfor...@gmail.com wrote:

We are using a single process with multiple threads, will look at client side 
delays.

Thanks

On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs ty...@datastax.com wrote:
If I had to guess, I would say that your client is the bottleneck, not the 
cluster.  Are you inserting data with multiple threads or processes?
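For what it's worth, a minimal sketch of fanning the inserts out over a thread 
pool (purely illustrative; hasMoreRows() and insertOne() stand in for your work 
source and your Thrift insert call):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Drive the insert loop from many threads so a single client can keep
    // a 6 node cluster busy; 32 threads is just an arbitrary starting point.
    ExecutorService pool = Executors.newFixedThreadPool(32);
    for (int t = 0; t < 32; t++) {
        pool.submit(new Runnable() {
            public void run() {
                while (hasMoreRows()) {   // hypothetical work source
                    insertOne();          // hypothetical single insert
                }
            }
        });
    }
    pool.shutdown();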


On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani meatfor...@gmail.com wrote:
Hi,

I am running some tests trying to scale out our application from a 3 node 
cluster to a 6 node cluster. The thing I observed is that with the 3 node 
cluster I was able to handle about 41 req/second, so I added 3 more nodes 
thinking it should close to double, but instead it only goes up to about 47 
req/second!! I am doing something wrong and it is not obvious, so I wanted some 
help on what stats I could/should monitor to tell me things like whether a node 
gets more requests or whether the load distribution is not random enough.

Note I am using direct Thrift (old code base) and Cassandra 1.1.6. The data 
model is for storing blobs (split across columns) and has around 6 CFs, RF=3, 
and all operations are at quorum. Also, at the end of the run nodetool ring 
reports the same data size.
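For the monitoring question, a rough starting checklist (not exhaustive):

    nodetool tpstats   # pending/blocked stages point at an overloaded node
    nodetool cfstats   # per-CF read/write counts; compare them across nodes
    nodetool ring      # token ownership, to sanity-check the distribution

plus iostat/vmstat on each node, and the same on the client machine to rule 
the client out.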

Thanks
Anand



--
Tyler Hobbs
DataStax
http://datastax.com/




Re: Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-05 Thread Edward Capriolo
Since there are few column names, what you can do is this: make a reverse
index with a low read repair chance, and be aggressive with compaction. It will
be many extra writes but that is OK.

The other option is to turn on the row cache and do a read before write. It is a
good case for the row cache because it is a very small data set.
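For illustration, a minimal sketch of that extra index write using Astyanax
(CF names and the index row key are made up; the index is one wide row of
empty columns):

    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;

    ColumnFamily<String, String> CF_DATA = new ColumnFamily<String, String>(
            "Data", StringSerializer.get(), StringSerializer.get());
    ColumnFamily<String, String> CF_NAMES = new ColumnFamily<String, String>(
            "ColumnNames", StringSerializer.get(), StringSerializer.get());

    void write(Keyspace ks, String row, String name, String value)
            throws ConnectionException {
        MutationBatch m = ks.prepareMutationBatch();
        m.withRow(CF_DATA, row).putColumn(name, value, null);
        // Reverse index: one wide row whose column names are every name written.
        m.withRow(CF_NAMES, "all").putEmptyColumn(name, null);
        m.execute();
    }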

On Thursday, April 4, 2013, Drew Kutcharian d...@venarc.com wrote:
 I don't really need to answer "what rows contain a column named X", so no
need for a reverse index here. All I want is a distinct set of all the
column names, so I can answer "what are all the available column names?"

 On Apr 4, 2013, at 4:20 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Your reverse index of which rows contain a column named X will have
very wide rows. You could look at Cassandra's secondary indexing, or
possibly look at a Solandra/Solr approach. Another option is to shift
the problem slightly: which rows have column X that was added between time
y and time z. Remember, with few distinct column names, that reverse index
of column to row is going to be a very big list.


 On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian d...@venarc.com wrote:

 Hi Edward,
 I anticipate that the column names will be reused a lot. For example,
key1 will be in many rows. So I think the number of distinct column names
will be much much smaller than the number of rows. Is there a way to have a
separate CF that keeps track of the column names?
 What I was thinking was to have a separate CF that I write only the
column name with a null value in there every time I write a key/value to
the main CF. In this case if that column name exists, then it will just be
overridden. Now if I wanted to get all the column names, then I can just
query that CF. Not sure if that's the best approach at high load (100k
inserts a second).
 -- Drew

 On Apr 4, 2013, at 12:02 PM, Edward Capriolo edlinuxg...@gmail.com
wrote:

 You can not get only the column names (which you are calling keys); you
can use get_range_slices, which returns all the columns. When you specify an
empty byte array (new byte[0]) as the start and finish you get back all
the columns. From there you can return only the column names to the user in a
format that you like.
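Roughly, with the raw Thrift API that looks like this (a sketch only, assuming
an open Cassandra.Client and a CF called MyCF; org.apache.cassandra.thrift.*
and java.nio.ByteBuffer imports omitted):

    SlicePredicate predicate = new SlicePredicate();
    predicate.setSlice_range(new SliceRange(
            ByteBuffer.wrap(new byte[0]),   // empty start = first column
            ByteBuffer.wrap(new byte[0]),   // empty finish = last column
            false, 1000));                  // not reversed, up to 1000 columns
    KeyRange range = new KeyRange(100);     // up to 100 rows per page
    range.setStart_key(new byte[0]);
    range.setEnd_key(new byte[0]);
    List<KeySlice> rows = client.get_range_slices(
            new ColumnParent("MyCF"), predicate, range, ConsistencyLevel.ONE);
    for (KeySlice row : rows) {
        for (ColumnOrSuperColumn cosc : row.getColumns()) {
            byte[] columnName = cosc.getColumn().getName();  // collect these
        }
    }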


 On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian d...@venarc.com wrote:

 Hey Guys,

 I'm working on a project and one of the requirements is to have a
schema-free CF where end users can insert arbitrary key/value pairs per
row. What would be the best way to know what are all the keys that were
inserted (preferably w/o any locking). For example,

 Row1 = key1 -> XXX, key2 -> XXX
 Row2 = key1 -> XXX, key3 -> XXX
 Row3 = key4 -> XXX, key5 -> XXX
 Row4 = key2 -> XXX, key5 -> XXX
 …

 The query would be "give me all the inserted keys" and the response would
be {key1, key2, key3, key4, key5}

 Thanks,

 Drew








Re: Really have to repair ?

2013-04-05 Thread Jean-Armel Luce
Hi Cyril,

According to the documentation (
http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair),
I understand that it is not necessary to repair every node before
gc_grace_seconds expires if you are sure that you never miss running a repair
each time a node is down for longer than gc_grace_seconds.

*IF* your operations team is sufficiently on the ball, you can get by
without repair as long as you do not have hardware failure -- in that case,
HintedHandoff is adequate to repair successful updates that some replicas
have missed

Am I wrong ? Thoughts ?
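For reference, the two settings under discussion, with their 1.2-era defaults
(values shown only as an example, not a recommendation):

    # cassandra.yaml: how long hints are collected for a dead node
    max_hint_window_in_ms: 10800000   # 3 hours

    -- per column family, via CQL3; default 864000 = 10 days
    ALTER TABLE ks.cf WITH gc_grace_seconds = 864000;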




2013/4/4 cscetbon@orange.com

 Hi,

 I know that deleted rows can reappear if nodetool repair is not run on every
 node within *gc_grace_seconds*. However, do we really need to obey this
 rule if we run nodetool repair on nodes that are down for more than
 *max_hint_window_in_ms* milliseconds ?

 Thanks
 --
 Cyril SCETBON

 _

 Ce message et ses pieces jointes peuvent contenir des informations 
 confidentielles ou privilegiees et ne doivent donc
 pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu 
 ce message par erreur, veuillez le signaler
 a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
 electroniques etant susceptibles d'alteration,
 France Telecom - Orange decline toute responsabilite si ce message a ete 
 altere, deforme ou falsifie. Merci.

 This message and its attachments may contain confidential or privileged 
 information that may be protected by law;
 they should not be distributed, used or copied without authorisation.
 If you have received this email in error, please notify the sender and delete 
 this message and its attachments.
 As emails may be altered, France Telecom - Orange is not liable for messages 
 that have been modified, changed or falsified.
 Thank you.




Re: Really have to repair ?

2013-04-05 Thread cscetbon.ext
That's exactly what I understood and why I was using the max_hint_window_in_ms 
threshold to force a manual repair.
--
Cyril SCETBON

On Apr 5, 2013, at 5:22 PM, Jean-Armel Luce jaluc...@gmail.com wrote:

Hi Cyril,

According to the documentation 
(http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair), I 
understand that it is not necessary to repair every node before 
gc_grace_seconds expires if you are sure that you never miss running a repair 
each time a node is down for longer than gc_grace_seconds.

*IF* your operations team is sufficiently on the ball, you can get by without 
repair as long as you do not have hardware failure -- in that case, 
HintedHandoff is adequate to repair successful updates that some replicas have 
missed

Am I wrong ? Thoughts ?




2013/4/4 cscetbon@orange.com
Hi,

I know that deleted rows can reappear if nodetool repair is not run on every 
node within gc_grace_seconds. However, do we really need to obey this rule if 
we run nodetool repair on nodes that are down for more than 
max_hint_window_in_ms milliseconds?

Thanks
--
Cyril SCETBON


_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.




_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.



Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10

2013-04-05 Thread aaron morton
 skipping sstable due to bloom filter debug messages
What were these messages?

Do you have the logs from the startup? 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 6:11 AM, Arya Goudarzi gouda...@gmail.com wrote:

 Hi,
 
 I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to 1.2.3. 
 During startup while tailing C*'s system.log, I observed a series of SSTable 
 batch load messages and skipping sstable due to bloom filter debug messages 
 which is normal for startup, but when it reached loading saved key caches, it 
 gets stuck forever. The I/O wait stays high in the CPU graph and I/O ops are 
 sent to disk, but C* never passes that step of loading the key cache file 
 successfully. The saved key cache file was about 75MB on one node and 125MB 
 on the other node and they were for different CFs. 
 
 [image.jpeg: graph of CPU I/O wait during startup]
 
 The CPU I/O wait constantly stayed at 40%~ while system.log was stuck at 
 loading one saved key cache file. I have marked that on the graph above. The 
 workaround was to delete the saved cache files and things loaded fine (See 
 marked Normal Startup). 
 
 These machines are m1.xlarge EC2 instances. And this issue happened on both 
 nodes upgraded. This did not happen during exercise of upgrade from 1.1.6 to 
 1.2.2 using the same snapshot. 
 
 Should I raise a JIRA? 
 
 -Arya



Re: Data Model and Query

2013-04-05 Thread aaron morton
 What's the recommendation on querying a data model like StartDate > “X” and 
 counter > “Y”?
 
 
it's not possible. 

If you are using secondary indexes you have to have an equals clause in the 
statement. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 6:53 AM, shubham srivastava shubha...@gmail.com wrote:

 Hi,
 
  
 What's the recommendation on querying a data model like StartDate > “X” and 
 counter > “Y”. It's kind of range queries across multiple columns and key.
 
 I have the flexibility for modelling the data for the above query accordingly.
 
 
  
 Regards,
 
 Shubham
 



Re: Issues running Bulkloader program on AIX server

2013-04-05 Thread aaron morton
 Caused by: java.lang.UnsatisfiedLinkError: snappyjava (Not found in 
 java.library.path)

You do not have the snappy compression library installed. 

http://www.datastax.com/docs/1.1/troubleshooting/index#cannot-initialize-class-org-xerial-snappy-snappy
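If the native library is installed somewhere non-standard, pointing the JVM at
it may be enough (paths and jar name illustrative):

    java -Djava.library.path=/opt/snappy/native -cp bulkloader.jar BulkLoadExample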

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 1:36 PM, praveen.akun...@wipro.com wrote:

 Hi All, 
 
 Sorry, my environment is as below: 
 
 3 node cluster with Cassandra 1.1.9 provided with DSE 3.0 on Linux
 We are trying to run the bulk loader from an AIX 6.1 server. Java version is 1.5. 
 
 Regards, 
 Praveen
 
 From: Praveen Akunuru praveen.akun...@wipro.com
 Date: Thursday, April 4, 2013 12:21 PM
 To: user@cassandra.apache.org
 Subject: Issues running Bulkloader program on AIX server
 
 Hi All, 
 
 I am facing issues running a Java bulkloader program from an AIX server. 
 The program works fine on a Linux server. I am receiving the below error 
 on AIX. Can anyone help me get this working?
 
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at 
 org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
 at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
 at org.xerial.snappy.Snappy.<clinit>(Snappy.java:44)
 at java.lang.J9VMInternals.initializeImpl(Native Method)
 at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.<clinit>(SnappyCompressor.java:37)
 at java.lang.J9VMInternals.initializeImpl(Native Method)
 at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
 at org.apache.cassandra.config.CFMetaData.<clinit>(CFMetaData.java:82)
 at java.lang.J9VMInternals.initializeImpl(Native Method)
 at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
 at 
 org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:80)
 at 
 org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.<init>(SSTableSimpleUnsortedWriter.java:93)
 at BulkLoadExample.main(BulkLoadExample.java:55)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
 at java.lang.reflect.Method.invoke(Method.java:611)
 at 
 org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
 Caused by: java.lang.UnsatisfiedLinkError: snappyjava (Not found in 
 java.library.path)
 at java.lang.ClassLoader.loadLibraryWithPath(ClassLoader.java:1011)
 at 
 java.lang.ClassLoader.loadLibraryWithClassLoader(ClassLoader.java:975)
 at java.lang.System.loadLibrary(System.java:469)
 at 
 org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
 ... 25 more
 log4j:WARN No appenders could be found for logger 
 (org.apache.cassandra.io.compress.SnappyCompressor).
 log4j:WARN Please initialize the log4j system properly.
 log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
 info.
 Unhandled exception
 Type=Segmentation error vmState=0x
 J9Generic_Signal_Number=0004 Signal_Number=000b Error_Value= 
 Signal_Code=0032
 Handler1=09001000A06FF5A0 Handler2=09001000A06F60F0
 
 Regards, 
 Praveen
 

Re: Lost data after expanding cluster c* 1.2.3-1

2013-04-05 Thread aaron morton
 but nothing's happening. How can I monitor the progress? And how can I know 
 when it's finished?

check nodetool compactionstats

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 2:51 PM, Kais Ahmed k...@neteck-fr.com wrote:

 Hi aaron,
 
 I ran the command nodetool rebuild_index <host> <keyspace> <cf> on all the nodes; 
 in the log I see:
 
 INFO [RMI TCP Connection(5422)-10.34.139.xxx] 2013-04-04 08:31:53,641 
 ColumnFamilyStore.java (line 558) User Requested secondary index re-build for 
 ...
 
 but nothing's happening. How can I monitor the progress? And how can I know 
 when it's finished?
 
 Thanks,
  
 
 2013/4/2 aaron morton aa...@thelastpickle.com
 The problem came from me not setting auto_bootstrap to true for the new 
 nodes; it is not in this documentation 
 (http://www.datastax.com/docs/1.2/install/expand_ami)
 auto_bootstrap defaults to True if not specified in the yaml. 
 
 Can I do that at any time, or only when the cluster is not loaded?
 Not sure what the question is. 
 Both those operations are online operations you can do while the node is 
 processing requests. 
  
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 1/04/2013, at 9:26 PM, Kais Ahmed k...@neteck-fr.com wrote:
 
  At this moment the errors started: we saw that members and other data were 
  gone. At this moment nodetool status returned (in red color, the 3 new 
  nodes)
  What errors?
 The errors were on my side, in the application; not Cassandra errors
 
  I put for each of them seeds = A's IP, and started each at two minute 
  intervals.
  When I'm making changes I tend to change a single node first, confirm 
  everything is OK and then do a bulk change.
 Thank you for that advice.
 
 I'm not sure what or why it went wrong, but that should get you to a stable 
 place. If you have any problems keep an eye on the logs for errors or 
 warnings.
 The problem came from me not setting auto_bootstrap to true for the new 
 nodes; it is not in this documentation 
 (http://www.datastax.com/docs/1.2/install/expand_ami)
 
 if you are using secondary indexes use nodetool rebuild_index to rebuild 
 those.
 Can I do that at any time, or only when the cluster is not loaded?
 
 Thanks aaron,
 
 2013/4/1 aaron morton aa...@thelastpickle.com
 Please do not rely on colour in your emails, the best way to get your emails 
 accepted by the Apache mail servers is to use plain text.
 
  At this moment the errors started: we saw that members and other data were 
  gone. At this moment nodetool status returned (in red color, the 3 new 
  nodes)
 What errors?
 
  I put for each of them seeds = A's IP, and started each at two minute 
  intervals.
 When I'm making changes I tend to change a single node first, confirm 
 everything is OK and then do a bulk change.
 
 Now the cluster seems to work normally, but I can't use the secondary indexes 
 for the moment; the query answers are random
 run nodetool repair -pr on each node, let it finish before starting the next 
 one.
 if you are using secondary indexes use nodetool rebuild_index to rebuild 
 those.
 Add one node new node to the cluster and confirm everything is ok, then add 
 the remaining ones.
 
 I'm not sure what or why it went wrong, but that should get you to a stable 
 place. If you have any problems keep an eye on the logs for errors or 
 warnings.
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 31/03/2013, at 10:01 PM, Kais Ahmed k...@neteck-fr.com wrote:
 
  Hi aaron,
 
  Thanks for the reply, I will try to explain what happened exactly
 
  I had a 4-node C* cluster [A,B,C,D] (version 1.2.3-1) started with the EC2 AMI 
  (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2) with
  this config: --clustername myDSCcluster --totalnodes 4 --version community
 
  Two days after this cluster went into production, I saw that the cluster was 
  overloaded, and I wanted to extend it by adding 3 more nodes.
 
  I created a new cluster with 3 C* nodes [D,E,F]  
  (https://aws.amazon.com/amis/datastax-auto-clustering-ami-2-2)
 
  And followed the documentation 
  (http://www.datastax.com/docs/1.2/install/expand_ami) for adding them to 
  the ring.
  I put for each of them seeds = A's IP, and started each at two minute 
  intervals.
 
  At this moment the errors started: we saw that members and other data were 
  gone. At this moment nodetool status returned (in red color, the 3 new 
  nodes)
 
  Datacenter: eu-west
  ===
  Status=Up/Down
   |/ State=Normal/Leaving/Joining/Moving
   --  Address        Load      Tokens  Owns   Host ID                               Rack
   UN  10.34.142.xxx  10.79 GB  256     15.4%  4e2e26b8-aa38-428c-a8f5-e86c13eb4442  1b
   UN  10.32.49.xxx   1.48 MB   256     13.7%  e86f67b6-d7cb-4b47-b090-3824a5887145  1b
  UN  

Re: upgrading 1.1.x to 1.2.x via sstableloader

2013-04-05 Thread aaron morton
  Is it safe to change sstable file name to avoid name collisions?
Yes. 
Make sure to change the generation number for all the components. 
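Something like this shell loop does it for one sstable at a time (using the
generation 47 file from the error below purely as an example, bumped to 147):

    for f in yyy-hf-47-*; do mv "$f" "${f/-hf-47-/-hf-147-}"; done

The glob catches every component (Data.db, Index.db, Filter.db, Statistics.db,
and so on) so they all keep matching generation numbers.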

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 3:01 PM, Michał Czerwiński mic...@qubitproducts.com wrote:

 I see, thanks for the reply!
 
 One more question:
 
 I can see that multiple nodes have the same sstable names for a certain 
 keyspace / CF.
 I am moving from an 8-node cluster to a 6-node cluster, so at some point when 
 putting sstables in place I would overwrite files from another node. What is 
 the best way to solve this problem? Is it safe to change the sstable file 
 names to avoid name collisions?
 
 
 
 On 4 April 2013 02:54, aaron morton aa...@thelastpickle.com wrote:
  java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db 
  is not compatible with current version ib
 You cannot stream files that have a different on-disk format.
 
 1.2 can read the old files, but cannot accept them as streams. You can copy 
 the files to the new machines and use nodetool refresh to load them, then 
 upgradesstables to re-write them before running repair.
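In command form, per node, that is roughly (keyspace/CF names illustrative):

    # after copying the old sstables into the data directory:
    nodetool refresh myks mycf
    nodetool upgradesstables myks mycf
    nodetool repair myks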
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 3/04/2013, at 10:53 PM, Michał Czerwiński mic...@qubitproducts.com wrote:
 
  Does anyone know what is the best process to move data from Cassandra 1.1.x 
  (1.1.7 to be more precise) to Cassandra 1.2.3?
 
  I am trying to use sstableloader and stream data to a new cluster but I get.
 
  ERROR [Thread-125] 2013-04-03 16:37:27,330 IncomingTcpConnection.java (line 
  183) Received stream using protocol version 5 (my version 6). Terminating 
  connection
 
  ERROR [Thread-141] 2013-04-03 16:38:05,704 CassandraDaemon.java (line 164) 
  Exception in thread Thread[Thread-141,5,main]
 
  java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db 
  is not compatible with current version ib
 
  at 
  org.apache.cassandra.streaming.StreamIn.getContextMapping(StreamIn.java:77)
 
  at 
  org.apache.cassandra.streaming.IncomingStreamReader.<init>(IncomingStreamReader.java:87)
 
  at 
  org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)
 
  at 
  org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)
 
  at 
  org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
 
 
 
  I've changed Murmur3Partitioner to RandomPartitioner already, and I've 
  noticed I am not able to use 1.1.7's sstableloader, so I copied sstables to 
  the new nodes and tried doing it locally on Cassandra 1.2.3, but it seems the 
  protocol versions do not match (see error above).
 
  The reason why I want to use sstableloader is that I have a different number 
  of nodes and would like to avoid using rsync and then repair/cleanup of the 
  excess data.
 
  Thanks!
 
 
 



Re: nodetool status inconsistencies, repair performance and system keyspace compactions

2013-04-05 Thread aaron morton
Monitor the repair using nodetool compactionstats to see the merkle trees being 
created, and nodetool netstats to see data streaming. 

Also look in the logs for messages from AntiEntropyService.java , that will 
tell you how long the node waited for each replica to get back to it. 
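Concretely:

    nodetool compactionstats   # merkle tree building shows up as Validation
    nodetool netstats          # shows the streams once differences are found
    grep AntiEntropyService /var/log/cassandra/system.log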

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 5:42 PM, Ondřej Černoš cern...@gmail.com wrote:

 Hi,
 
 most has been resolved - the "failed to uncompress" error was really a
 bug in Cassandra (see
 https://issues.apache.org/jira/browse/CASSANDRA-5391) and the problem
 with different load reporting is a change between 1.2.1 (which reports 100%
 for the 3 replicas/3 nodes/2 DCs setup I have) and 1.2.3, which reports the
 fraction. Is this correct?
 
 Anyway, the nodetool repair still takes ages to finish, considering
 only megabytes of not changing data are involved in my test:
 
 [root@host:/etc/puppet] nodetool repair ks
 [2013-04-04 13:26:46,618] Starting repair command #1, repairing 1536
 ranges for keyspace ks
 [2013-04-04 13:47:17,007] Repair session
 88ebc700-9d1a-11e2-a0a1-05b94e1385c7 for range
 (-2270395505556181001,-2268004533044804266] finished
 ...
 [2013-04-04 13:47:17,063] Repair session
 65d31180-9d1d-11e2-a0a1-05b94e1385c7 for range
 (1069254279177813908,1070290707448386360] finished
 [2013-04-04 13:47:17,063] Repair command #1 finished
 
 This is the status before the repair (by the way, after the datacenter
 has been bootstrapped from the remote one):
 
 [root@host:/etc/puppet] nodetool status
 Datacenter: us-east
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load     Tokens  Owns   Host ID                               Rack
 UN  xxx.xxx.xxx.xxx  5.74 MB  256     17.1%  06ff8328-32a3-4196-a31f-1e0f608d0638  1d
 UN  xxx.xxx.xxx.xxx  5.73 MB  256     15.3%  7a96bf16-e268-433a-9912-a0cf1668184e  1d
 UN  xxx.xxx.xxx.xxx  5.72 MB  256     17.5%  67a68a2a-12a8-459d-9d18-221426646e84  1d
 Datacenter: na-dev
 ==
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load     Tokens  Owns   Host ID                               Rack
 UN  xxx.xxx.xxx.xxx  5.74 MB  256     16.4%  eb86aaae-ef0d-40aa-9b74-2b9704c77c0a  cmp02
 UN  xxx.xxx.xxx.xxx  5.74 MB  256     17.0%  cd24af74-7f6a-4eaa-814f-62474b4e4df1  cmp01
 UN  xxx.xxx.xxx.xxx  5.74 MB  256     16.7%  1a55cfd4-bb30-4250-b868-a9ae13d81ae1  cmp05
 
 Why does it take 20 minutes to finish? Fortunately the big number of
 compactions I reported in the previous email was not triggered.
 
 And is there documentation where I could find the exact semantics of
 repair when vnodes are used (and what -pr means in such a setup) and
 when run in a multiple datacenter setup? I still don't quite get it.
 
 regards,
 Ondřej Černoš
 
 
 On Thu, Mar 28, 2013 at 3:30 AM, aaron morton aa...@thelastpickle.com wrote:
 During one of my tests - see this thread in this mailing list:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html
 
 That thread has been updated, check the bug ondrej created.
 
 How will this perform in production with much bigger data if repair
 takes 25 minutes on 7MB and 11k compactions were triggered by the
 repair run?
 
 Seems a little odd.
 See what happens the next time you run repair.
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 27/03/2013, at 2:36 AM, Ondřej Černoš cern...@gmail.com wrote:
 
 Hi all,
 
 I have 2 DCs, 3 nodes each, RF:3, I use local quorum for both reads and
 writes.
 
 Currently I test various operational qualities of the setup.
 
 During one of my tests - see this thread in this mailing list:
 http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/java-io-IOException-FAILED-TO-UNCOMPRESS-5-exception-when-running-nodetool-rebuild-td7586494.html
 - I ran into this situation:
 
 - all nodes have all data and agree on it:
 
 [user@host1-dc1:~] nodetool status
 
 Datacenter: na-prod
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load     Tokens  Owns (effective)  Host ID                               Rack
 UN  XXX.XXX.XXX.XXX  7.74 MB  256     100.0%            0b1f1d79-52af-4d1b-a86d-bf4b65a05c49  cmp17
 UN  XXX.XXX.XXX.XXX  7.74 MB  256     100.0%            039f206e-da22-44b5-83bd-2513f96ddeac  cmp10
 UN  XXX.XXX.XXX.XXX  7.72 MB  256     100.0%            007097e9-17e6-43f7-8dfc-37b082a784c4  cmp11
 Datacenter: us-east
 ===
 Status=Up/Down
 |/ State=Normal/Leaving/Joining/Moving
 --  Address          Load     Tokens  Owns (effective)  Host ID                               Rack
 UN  

Re: Repair hangs when merkle tree request is not acknowledged

2013-04-05 Thread aaron morton
 A repair on a certain CF will fail, and I run it again and again, eventually 
 it will succeed.
How does it fail?

Can you see the repair start on the other node ? 
If you are getting errors in the log about streaming failing because a node 
died, and the FailureDetector is in the call stack, change the 
phi_convict_threshold. You can set it in the yaml file or via JMX on the 
FailureDetectorMBean, in either case boost it from 8 to 16 to get the repair 
through. This will make it less likely that a node is marked as down, you 
probably want to run with 8 or a little bit higher normally. 
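In cassandra.yaml that is the single line (16 being the temporary value 
suggested above):

    phi_convict_threshold: 16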

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 6:41 PM, Paul Sudol paulsu...@gmail.com wrote:

 Hello,
 
 I have a cluster with 4 nodes, 2 nodes in 2 data centers. I had a hardware 
 failure in one DC and had to replace the nodes. I'm running 1.2.3 on all of 
 the nodes now. I was able to run nodetool rebuild on the two replacement 
 nodes, but now I cannot finish a repair on any of them. I have 18 column 
 families, if I run a repair on a single CF at a time, I can get the node 
 repaired eventually. A repair on a certain CF will fail, and I run it again 
 and again, eventually it will succeed.
 
 I've got an RF of 2, 1 copy in each DC, so the repair needs to pull data from 
 the other DC to finish its repair.
 
 The problem seems to be that the merkle tree request sometimes is not 
 received by the node in the other DC. Usually when the merkle tree request is 
 sent, the nodes that it was sent to start a compaction/validation. In certain 
 cases this does not happen, only the node that I ran the repair on will begin 
 compaction/validation and send the merkle tree to itself. Then it's waiting 
 for a merkle tree from the other node, and it will never get it. After about 
 24 hours it will time out and say the node in question died.
 
 Is there a setting I can use to force the merkle tree request to be 
 acknowledged or resent if it's not acknowledged? I setup NTPD on all the 
 nodes and tried the cross_node_timeout, but that did not help.
 
 Thanks in advance,
 
 Paul



Re: gossip not working

2013-04-05 Thread aaron morton
Starting the node with the JVM option -Dcassandra.load_ring_state=false in 
cassandra-env.sh sometimes works. 
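In cassandra-env.sh that is one added line; remove it again after a clean 
start, since it is only a one-off:

    JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false"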

If not post the output from nodetool gossipinfo

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 5/04/2013, at 9:38 AM, S C as...@outlook.com wrote:

 Is there a way to force gossip among the nodes?
 
 From: as...@outlook.com
 To: user@cassandra.apache.org
 Subject: RE: gossip not working
 Date: Thu, 4 Apr 2013 19:59:45 -0500
 
 I am not seeing anything in the logs other than "Starting up server gossip", 
 and there is no firewall between the nodes.
 From: paulsu...@gmail.com
 Subject: Re: gossip not working
 Date: Thu, 4 Apr 2013 18:49:29 -0500
 To: user@cassandra.apache.org
 
 What errors are you seeing in the log files of the down nodes? Did you run 
 upgradesstables? You need to upgradesstables when moving from < 1.1.7 to 1.1.9
 
 On Apr 4, 2013, at 6:11 PM, S C as...@outlook.com wrote:
 
 I was in the middle of an upgrade to 1.1.9. I brought up one node with 1.1.9 
 while the others were running 1.1.5. Once one of the nodes was on 1.1.9 it no 
 longer recognized the other nodes in the ring.
 
 On 192.168.56.10 and 11
 
 192.168.56.10  DC1-Cass  RAC1  Up    Normal  28.06 GB  50.00%  0
 192.168.56.11  DC1-Cass  RAC1  Up    Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
 192.168.56.12  DC1-Cass  RAC1  Down  Normal  29.02 GB  25.00%  85070591730234615865843651857942052864
 
 
 On 192.168.56.12
 
 192.168.56.10  DC1-Cass  RAC1  Down  Normal  28.06 GB  50.00%  0
 192.168.56.11  DC1-Cass  RAC1  Down  Normal  31.59 GB  25.00%  42535295865117307932921825928971026432
 192.168.56.12  DC1-Cass  RAC1  Up    Normal  29.02 GB  25.00%  85070591730234615865843651857942052864
 
 
 I do not see anything in the logs that tells me that there is a gossip issue.
 
 nodetool info
 Token: 85070591730234615865843651857942052864
 Gossip active: true
 Thrift active: true
 Load : 29.05 GB
 Generation No: 1365114563
 Uptime (seconds) : 2127
 Heap Memory (MB) : 848.71 / 7945.94
 Exceptions   : 0
 Key Cache: size 2208 (bytes), capacity 104857584 (bytes), 1056 hits, 
 1099 requests, 0.961 recent hit rate, 14400 save period in seconds
 Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, 
 NaN recent hit rate, 0 save period in seconds
 
 nodetool info
 Token: 42535295865117307932921825928971026432
 Gossip active: true
 Thrift active: true
 Load : 31.59 GB
 Generation No: 1364413038
 Uptime (seconds) : 703904
 Heap Memory (MB) : 733.02 / 7945.94
 Exceptions   : 1
 Key Cache: size 3693312 (bytes), capacity 104857584 (bytes), 26071678 
 hits, 26616282 requests, 0.980 recent hit rate, 14400 save period in seconds
 Row Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, 
 NaN recent hit rate, 0 save period in seconds
 
 
 
 There is no firewall between the nodes, and the nodes can reach each other on 
 the storage port. 
 What else should I be looking at to find the root cause? Appreciate your inputs.



Re: Repair hangs when merkle tree request is not acknowledged

2013-04-05 Thread Paul Sudol
 How does it fail?
If I wait 24 hours, the repair command will return an error saying that the 
node died… but the node really didn't die, I watch it the whole time.
I have the DEBUG messages on in the log files, when the node I'm repairing 
sends out a merkle tree request, I will normally see, {ColumnFamilyStore.java 
(line 700) forceFlush requested but everything is clean in COLUMN FAMILY 
NAME}, in the log of the node that should be generating the merkle tree 
request. (in addition, when I run nodetool -h localhost compactionstats, I will 
see activity).

When the node that should be generating a merkle tree does not have this 
message, and has no activity to see via nodetool compactionstats, it will fail.

There are no errors about streaming, it does not even get to the point of 
streaming. One node will send requests for merkle trees, and sometimes, the 
node in the other data center just doesn't get the message. At least that's 
what it looks like.

Should I still try the phi_convict_threshold?

Thanks!

Paul



On Apr 5, 2013, at 12:19 PM, aaron morton aa...@thelastpickle.com wrote:

 A repair on a certain CF will fail, and I run it again and again, eventually 
 it will succeed.
 
 
 Can you see the repair start on the other node ? 
 If you are getting errors in the log about streaming failing because a node 
 died, and the FailureDetector is in the call stack, change the 
 phi_convict_threshold. You can set it in the yaml file or via JMX on the 
 FailureDetectorMBean, in either case boost it from 8 to 16 to get the repair 
 through. This will make it less likely that a node is marked as down, you 
 probably want to run with 8 or a little bit higher normally. 
 
 Cheers
 
 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 4/04/2013, at 6:41 PM, Paul Sudol paulsu...@gmail.com wrote:
 
 Hello,
 
 I have a cluster with 4 nodes, 2 nodes in 2 data centers. I had a hardware 
 failure in one DC and had to replace the nodes. I'm running 1.2.3 on all of 
 the nodes now. I was able to run nodetool rebuild on the two replacement 
 nodes, but now I cannot finish a repair on any of them. I have 18 column 
 families, if I run a repair on a single CF at a time, I can get the node 
 repaired eventually. A repair on a certain CF will fail, and I run it again 
 and again, eventually it will succeed.
 
 I've got an RF of 2, 1 copy in each DC, so the repair needs to pull data 
 from the other DC to finish it's repair.
 
 The problem seems to be that the merkle tree request sometimes is not 
 received by the node in the other DC. Usually when the merkle tree request 
 is sent, the nodes that it was sent to start a compaction/validation. In 
 certain cases this does not happen, only the node that I ran the repair on 
 will begin compaction/validation and send the merkle tree to itself. Then 
 it's waiting for a merkle tree from the other node, and it will never get 
 it. After about 24 hours it will time out and say the node in question died.
 
 Is there a setting I can use to force the merkle tree request to be 
 acknowledged or resent if it's not acknowledged? I setup NTPD on all the 
 nodes and tried the cross_node_timeout, but that did not help.
 
 Thanks in advance,
 
 Paul
 



Re: Data Modeling: How to keep track of arbitrarily inserted column names?

2013-04-05 Thread Drew Kutcharian
One thing I can do is to have a client-side cache of the keys to reduce the 
number of updates.
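A minimal sketch of such a cache with plain JDK classes (names illustrative):

    import java.util.Collections;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;

    public class SeenColumnNames {
        // Column names this client has already recorded in the index CF.
        private final Set<String> seen =
                Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());

        /** True only the first time this JVM sees the name. */
        public boolean firstSighting(String columnName) {
            return seen.add(columnName);
        }
    }

Only write the column name to the index CF when firstSighting() returns true; 
that cuts the duplicate updates without any locking.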


On Apr 5, 2013, at 6:14 AM, Edward Capriolo edlinuxg...@gmail.com wrote:

 Since there are few column names, what you can do is this: make a reverse 
 index with a low read repair chance, and be aggressive with compaction. It will 
 mean many extra writes but that is OK. 
 
 The other option is to turn on the row cache and do a read before write. It is a 
 good case for the row cache because it is a very small data set.
 
 On Thursday, April 4, 2013, Drew Kutcharian d...@venarc.com wrote:
  I don't really need to answer "what rows contain a column named X", so no 
 need for a reverse index here. All I want is a distinct set of all the 
 column names, so I can answer "what are all the available column names?"
 
  On Apr 4, 2013, at 4:20 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
 
  Your reverse index of which rows contain a column named X will have very 
 wide rows. You could look at Cassandra's secondary indexing, or possibly 
 look at a Solandra/Solr approach. Another option is to shift the 
 problem slightly: which rows have column X that was added between time y 
 and time z. Remember, with few distinct column names, that reverse index of 
 column to row is going to be a very big list.
 
 
  On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian d...@venarc.com wrote:
 
  Hi Edward,
  I anticipate that the column names will be reused a lot. For example, key1 
  will be in many rows. So I think the number of distinct column names will 
  be much much smaller than the number of rows. Is there a way to have a 
  separate CF that keeps track of the column names? 
  What I was thinking was to have a separate CF that I write only the column 
  name with a null value in there every time I write a key/value to the main 
  CF. In this case if that column name exists, then it will just be 
  overridden. Now if I wanted to get all the column names, then I can just 
  query that CF. Not sure if that's the best approach at high load (100k 
  inserts a second).
  -- Drew
 
  On Apr 4, 2013, at 12:02 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
 
  You can not get only the column names (which you are calling keys); you can 
 use get_range_slices, which returns all the columns. When you specify an 
 empty byte array (new byte[0]) as the start and finish you get back all 
 the columns. From there you can return only the column names to the user in a 
 format that you like.
 
 
  On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian d...@venarc.com wrote:
 
  Hey Guys,
 
  I'm working on a project and one of the requirements is to have a 
 schema-free CF where end users can insert arbitrary key/value pairs per row. 
  What would be the best way to know what are all the keys that were 
  inserted (preferably w/o any locking). For example,
 
  Row1 = key1 -> XXX, key2 -> XXX
  Row2 = key1 -> XXX, key3 -> XXX
  Row3 = key4 -> XXX, key5 -> XXX
  Row4 = key2 -> XXX, key5 -> XXX
  …
 
  The query would be "give me all the inserted keys" and the response would 
 be {key1, key2, key3, key4, key5}
 
  Thanks,
 
  Drew
 
 
 
 
 
 



Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10

2013-04-05 Thread Arya Goudarzi
Here is a chunk of bloom filter sstable skip messages from the node I
enabled DEBUG on:

DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39459
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39483
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39332
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39335
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39438
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39478
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39456
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39469
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39334
DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
737) Bloom filter allows skipping sstable 39406

This is the last chunk of log before C* gets stuck, right before I stop the
process, remove key caches and start again (This is from another node that
I upgraded 2 days ago):

INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499
(5273270 bytes)
 INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755
(5264359 bytes)
 INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762
(5260887 bytes)
 INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java (line
166) Opening
/var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886
(5262864 bytes)
 INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line
112) reading saved cache
/var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache


I finally upgraded all 12 nodes in our test environment yesterday. This
issue seemed to exist on 7 of the 12 nodes. They didn't always get
stuck loading the same CF's saved KeyCache.


On Fri, Apr 5, 2013 at 9:56 AM, aaron morton aa...@thelastpickle.com wrote:

 skipping sstable due to bloom filter debug messages

 What were these messages?

 Do you have the logs from the start up ?

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/04/2013, at 6:11 AM, Arya Goudarzi gouda...@gmail.com wrote:

 Hi,

 I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to
 1.2.3. During startup while tailing C*'s system.log, I observed a series of
 SSTable batch load messages and skipping sstable due to bloom filter debug
 messages which is normal for startup, but when it reached loading saved key
 caches, it gets stuck forever. The I/O wait stays high in the CPU graph and
 I/O ops are sent to disk, but C* never passes that step of loading the key
 cache file successfully. The saved key cache file was about 75MB on one
 node and 125MB on the other node and they were for different CFs.

 [image.jpeg: graph of CPU I/O wait during startup]

 The CPU I/O wait constantly stayed at 40%~ while system.log was stuck at
 loading one saved key cache file. I have marked that on the graph above.
 The workaround was to delete the saved cache files and things loaded fine
 (See marked Normal Startup).

 These machines are m1.xlarge EC2 instances. And this issue happened on
 both nodes upgraded. This did not happen during exercise of upgrade from
 1.1.6 to 1.2.2 using the same snapshot.

 Should I raise a JIRA?

 -Arya





Re: Data Model and Query

2013-04-05 Thread Hiller, Dean
I would partition either with cassandra's partitioning or PlayOrm partitioning 
and query like so

Where beginOfMonth=x and startDate > X and counter > Y.  This only 
returns stuff after X in that partition though, so you may need to run multiple 
queries like this, and if you have billions of rows it could take some 
time. Instead you may want startDate > X and startDate < Z such that Z and X 
are in the same month, or if they span 2-3 partitions, then just run the 2-3 
queries.  I don't know enough detail on your use case to know if this works for 
you though.
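In CQL3 terms the same idea might look like this (schema and names invented
for illustration; the counter > Y part still has to be filtered client side,
since you cannot range over two different columns in one query):

    CREATE TABLE events (
        month      text,        -- the partition, e.g. '2013-04'
        start_date timestamp,   -- clustering column, so range queries work
        id         uuid,
        counter    bigint,
        PRIMARY KEY (month, start_date, id)
    );

    SELECT * FROM events
    WHERE month = '2013-04'
      AND start_date > '2013-04-05' AND start_date < '2013-04-15';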

Dean

From: aaron morton aa...@thelastpickle.com
Reply-To: user@cassandra.apache.org
Date: Friday, April 5, 2013 10:59 AM
To: user@cassandra.apache.org
Subject: Re: Data Model and Query


What's the recommendation on querying a data model like StartDate > “X” and 
counter > “Y”.

it's not possible.

If you are using secondary indexes you have to have an equals clause in the 
statement.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 6:53 AM, shubham srivastava shubha...@gmail.com wrote:


Hi,



What's the recommendation on querying a data model like StartDate > “X” and 
counter > “Y”. It's kind of range queries across multiple columns and key.

I have the flexibility for modelling the data for the above query accordingly.



Regards,

Shubham



Re: Problem loading Saved Key-Cache on 1.2.3 after upgrading from 1.1.10

2013-04-05 Thread Edward Capriolo
This has happened before: the saved cache files were not compatible between
0.6 and 0.7. I have run into this a couple of other times. The good
news is the saved key cache is just an optimization; you can blow it away
and it is not usually a big deal.
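Concretely, with the node stopped (the path matches the default layout in the
quoted log lines below):

    rm /var/lib/cassandra/saved_caches/*-KeyCache*

and if you want to stop the cache being written out at all, cassandra.yaml has:

    key_cache_save_period: 0   # 0 disables saving the key cache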




On Fri, Apr 5, 2013 at 2:55 PM, Arya Goudarzi gouda...@gmail.com wrote:

 Here is a chunk of bloom filter sstable skip messages from the node I
 enabled DEBUG on:

 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39459
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39483
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39332
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39335
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39438
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39478
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39456
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,450 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39469
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39334
 DEBUG [OptionalTasks:1] 2013-04-04 02:44:01,451 SSTableReader.java (line
 737) Bloom filter allows skipping sstable 39406

 This is the last chunk of log before C* gets stuck, right before I stop
 the process, remove key caches and start again (This is from another node
 that I upgraded 2 days ago):

 INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,769 SSTableReader.java (line
 166) Opening
 /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-316499
 (5273270 bytes)
  INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,858 SSTableReader.java
 (line 166) Opening
 /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314755
 (5264359 bytes)
  INFO [SSTableBatchOpen:2] 2013-04-03 01:59:39,894 SSTableReader.java
 (line 166) Opening
 /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-314762
 (5260887 bytes)
  INFO [SSTableBatchOpen:1] 2013-04-03 01:59:39,980 SSTableReader.java
 (line 166) Opening
 /var/lib/cassandra/data/cardspring_production/UniqueIndexes/cardspring_production-UniqueIndexes-hf-315886
 (5262864 bytes)
  INFO [OptionalTasks:1] 2013-04-03 01:59:40,298 AutoSavingCache.java (line
 112) reading saved cache
 /var/lib/cassandra/saved_caches/cardspring_production-UniqueIndexes-KeyCache


  I finally upgraded all 12 nodes in our test environment yesterday. This
  issue seemed to exist on 7 of the 12 nodes. They didn't always get
  stuck loading the same CF's saved KeyCache.


  On Fri, Apr 5, 2013 at 9:56 AM, aaron morton aa...@thelastpickle.com wrote:

 skipping sstable due to bloom filter debug messages

 What were these messages?

 Do you have the logs from the start up ?

 Cheers

-
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 4/04/2013, at 6:11 AM, Arya Goudarzi gouda...@gmail.com wrote:

 Hi,

  I have upgraded 2 nodes out of a 12 node test cluster from 1.1.10 to
 1.2.3. During startup while tailing C*'s system.log, I observed a series of
 SSTable batch load messages and skipping sstable due to bloom filter debug
 messages which is normal for startup, but when it reached loading saved key
 caches, it gets stuck forever. The I/O wait stays high in the CPU graph and
 I/O ops are sent to disk, but C* never passes that step of loading the key
 cache file successfully. The saved key cache file was about 75MB on one
 node and 125MB on the other node and they were for different CFs.

  [image.jpeg: graph of CPU I/O wait during startup]

 The CPU I/O wait constantly stayed at 40%~ while system.log was stuck at
 loading one saved key cache file. I have marked that on the graph above.
 The workaround was to delete the saved cache files and things loaded fine
 (See marked Normal Startup).

 These machines are m1.xlarge EC2 instances. And this issue happened on
 both nodes upgraded. This did not happen during exercise of upgrade from
 1.1.6 to 1.2.2 using the same snapshot.

 Should I raise a JIRA?

 -Arya






Re: Really have to repair ?

2013-04-05 Thread Edward Capriolo
There are a series of edge cases that dictate the need for repair. The
largest cases are 1) lost deletes 2) random disk corruptions

In our use case we only delete entire row keys, and if a row key comes
back it is not actually a problem because our software will find it and
delete it again. In those places we dodge running repair, believe it or not.

Edward


On Fri, Apr 5, 2013 at 11:22 AM, Jean-Armel Luce jaluc...@gmail.com wrote:

 Hi Cyril,

 According to the documentation (
 http://wiki.apache.org/cassandra/Operations#Frequency_of_nodetool_repair),
 I understand that it is not necessary to repair every node before
 gc_grace_seconds expires if you are sure that you never miss running a repair
 each time a node is down for longer than gc_grace_seconds.

 *IF* your operations team is sufficiently on the ball, you can get by
 without repair as long as you do not have hardware failure -- in that case,
 HintedHandoff is adequate to repair successful updates that some replicas
 have missed

 Am I wrong ? Thoughts ?




 2013/4/4 cscetbon@orange.com

 Hi,

 I know that deleted rows can reappear if node repair is not run on
 every node before *gc_grace_seconds* seconds. However do we really need
 to obey this rule if we run node repair on node that are down for more
 than *max_hint_window_in_ms* milliseconds ?

 Thanks
  --
 Cyril SCETBON

 _

 Ce message et ses pieces jointes peuvent contenir des informations 
 confidentielles ou privilegiees et ne doivent donc
 pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu 
 ce message par erreur, veuillez le signaler
 a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
 electroniques etant susceptibles d'alteration,
 France Telecom - Orange decline toute responsabilite si ce message a ete 
 altere, deforme ou falsifie. Merci.

 This message and its attachments may contain confidential or privileged 
 information that may be protected by law;
 they should not be distributed, used or copied without authorisation.
 If you have received this email in error, please notify the sender and 
 delete this message and its attachments.
 As emails may be altered, France Telecom - Orange is not liable for messages 
 that have been modified, changed or falsified.
 Thank you.





Counter batches query

2013-04-05 Thread Matt K
Hi,

I have an application that does batch (counter) writes to multiple CFs. The
application itself is multi-threaded and I'm using C* 1.2.2 and the Astyanax
driver. Could someone share insights on:

1) When I see the cluster write throughput graph in OpsCenter, the number
is not reflective of the actual number of writes. For example: if I issue a
single batch write (internally having 5 mutations), are the OpsCenter/JMX
cluster/node writes supposed to indicate 1 or 5? (I would assume 5.)

2) I read that from C* 1.2.x there are atomic counter batches, which can
cause a 30% performance hit - wondering if this is applicable to existing
thrift-based clients like Astyanax/Hector and if so, what is the way to turn it
off? Any server-side settings too?

Thanks!


Re: schema disagrement exception

2013-04-05 Thread zg
Thanks a lot, and now I have solved this problem.


2013/3/28 aaron morton aa...@thelastpickle.com

 Your cluster is angry
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement

 If your are just starting I suggest blasting it away and restarting.

 Hope that helps

 -
 Aaron Morton
 Freelance Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 26/03/2013, at 8:57 PM, zg zhang.i...@gmail.com wrote:

 Hi,
 I just tried to set up a 2-node cluster. It seems to work, but when I use the
 CLI to create a keyspace I get a SchemaDisagreementException. Does
 anyone know how to solve it?

 Thanks






Failed to connect to '127.0.0.1:7199'

2013-04-05 Thread zg
Hi everyone,
I have a 3-node cluster. On one node, when I use nodetool, I get the error
"Failed to connect to '127.0.0.1:7199': Connection timed out"; the CLI and
cqlsh work fine on that node. The other two nodes are fine when I use the
nodetool command. So what's the problem with that node?