Node hang on shutdown

2013-10-31 Thread Mikhail Mazursky
Hi.

I was upgrading my 3 node testing cluster from 2.0.1 to 2.0.2. I
successfully upgraded two nodes but the last one did not shutdown properly.
Does somebody see anything suspicious in the attached thread dump?

Regards,
Mikhail.
2013-10-31 12:09:12
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.25-b01 mixed mode):

Attach Listener daemon prio=10 tid=0x7f15e0007800 nid=0x3467 waiting on 
condition [0x]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
- None

StorageServiceShutdownHook prio=10 tid=0x7f15e4678800 nid=0x343e waiting 
on condition [0x7f15dc97b000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xc6fddb10 (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at 
java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1468)
at 
org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:502)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
- None

SIGTERM handler daemon prio=10 tid=0x7f15e001c800 nid=0x343c in 
Object.wait() [0x7f15da198000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xc6fe9218 (a java.lang.Thread)
at java.lang.Thread.join(Thread.java:1260)
- locked 0xc6fe9218 (a java.lang.Thread)
at java.lang.Thread.join(Thread.java:1334)
at 
java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
at 
java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
at java.lang.Shutdown.runHooks(Shutdown.java:123)
at java.lang.Shutdown.sequence(Shutdown.java:167)
at java.lang.Shutdown.exit(Shutdown.java:212)
- locked 0xc6c5e8c0 (a java.lang.Class for java.lang.Shutdown)
at java.lang.Terminator$1.handle(Terminator.java:52)
at sun.misc.Signal$1.run(Signal.java:212)
at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
- None

MigrationStage:3 daemon prio=10 tid=0x7f15ec00f000 nid=0x343a waiting on 
condition [0x7f15dc62e000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xc6fcdb10 (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at 
java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)

   Locked ownable synchronizers:
- None

MutationStage:2718 daemon prio=10 tid=0x7f15f4026000 nid=0x33ea waiting 
on condition [0x7f15d857a000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0xc2a186a0 (a 
java.util.concurrent.FutureTask$Sync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:248)
at java.util.concurrent.FutureTask.get(FutureTask.java:111)
at 
org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407)
at 
org.apache.cassandra.db.ColumnFamilyStore.forceBlockingFlush(ColumnFamilyStore.java:818)
at 
org.apache.cassandra.db.ColumnFamilyStore.truncateBlocking(ColumnFamilyStore.java:1913)
at 
org.apache.cassandra.db.TruncateVerbHandler.doVerb(TruncateVerbHandler.java:40)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 

Re: If I set 'listen_address' to 'localhost' I can't get Cassandra to broadcast on localhost

2013-10-31 Thread Aaron Morton
mmm, Cassandra normally resolves the name to the IP address and binds to that, 
so it should just work if you are using localhost and that's set up correctly. 

Can you just use 127.0.0.1 ? 
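
For reference, that would be this line in cassandra.yaml (a minimal sketch of 
the suggestion; everything else left at your current values):

# cassandra.yaml: bind gossip/internode traffic to the loopback address
listen_address: 127.0.0.1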

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 29/10/2013, at 8:55 am, Michael Hayes haye...@thefrontside.net wrote:

 If I set ‘listen_address’ in ‘/etc/cassandra/cassandra.yaml’:
 listen_address: localhost
 
 I can telnet:
 
 telnet <hostname> 9160 - YES
 telnet <ip address> 9160 - YES
 telnet localhost 9160 - NO
 
 I’m trying to get Usergrid to see Cassandra on localhost, which it currently 
 is unable to do. Usergrid is running on Tomcat6.



Re: Searching Cassandra

2013-10-31 Thread Aaron Morton
  As I understand it, where clauses only apply to primary keys and secondary 
 indices. 
I’m a little old fashioned and say if there is a query you do as part of a hot 
code path it should be supported by materialising a view at write time rather 
than using secondary indexes. That will give you better performance. 

Note that when using CQL 3 you don’t have to specify all of the primary key, 
just the partition key and then a leading prefix of the clustering key. 
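
As a concrete illustration of the write-time view idea (a minimal sketch with the 
DataStax Java driver; the keyspace, tables and columns are made up for the 
example, not from this thread):

import java.util.UUID;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class WriteTimeView {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("demo");

        // Assumed schema:
        //   CREATE TABLE users (id uuid PRIMARY KEY, email text, name text);
        //   CREATE TABLE users_by_email (email text PRIMARY KEY, id uuid, name text);
        PreparedStatement base = session.prepare(
                "INSERT INTO users (id, email, name) VALUES (?, ?, ?)");
        PreparedStatement view = session.prepare(
                "INSERT INTO users_by_email (email, id, name) VALUES (?, ?, ?)");

        // Materialise the view at write time: write the base row and the
        // query-specific row together.
        UUID id = UUID.randomUUID();
        session.execute(base.bind(id, "ari@example.com", "Ari"));
        session.execute(view.bind("ari@example.com", id, "Ari"));

        // The hot-path query then hits the view table by its partition key only.
        session.execute("SELECT id, name FROM users_by_email WHERE email = 'ari@example.com'");

        cluster.shutdown();
    }
}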

 From what I've researched it appears two options are to use solr or 
 elasticsearch.
If you want the equivalent of being able to put any term in the CQL WHERE 
clause then yes those are two options. 
Both are fine; DataStax Enterprise includes Solr and makes it a bit easier: 
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise

Or you can use the Hadoop integration if you want to process all of your data. 

Cheers


-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 29/10/2013, at 10:27 am, Ari King ari.brandeis.k...@gmail.com wrote:

 Hi,
 
 I've recently started with Cassandra and I'm curious about how data can be 
 searched. As I understand it, where clauses only apply to primary keys and 
 secondary indices. 
 
 From what I've researched it appears two options are to use solr or 
 elasticsearch. I'd appreciate feedback from those that have used either of the 
 tools as to the challenges of integrating with Cassandra. I'd also appreciate 
 insight on what other tools/methods are available. Thanks.
 
 -Ari



Re: Cassandra book/tuturial

2013-10-31 Thread Markus Jais
This one is coming out soon. Looks interesting:

http://www.informit.com/store/practical-cassandra-a-developers-approach-9780321933942


Besides that, I found the already mentioned docs on the DataStax site to be the 
best information.

Markus



Joe Stein crypt...@gmail.com wrote on Monday, 28 October 2013 at 5:51:
 
Reading previous version's documentation and related information from that time 
in the past (like books) has value!  It helps to understand decisions that were 
made and changed and some that are still the same like Secondary Indexes 
which were introduced in 0.7 when 
http://www.amazon.com/Cassandra-Definitive-Guide-Eben-Hewitt/dp/1449390412 came 
out back in 2011.


If you are really just getting started then I say go and start here 
http://www.planetcassandra.org/



/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop
/


On Mon, Oct 28, 2013 at 12:15 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:

With a lot of enthusiasm I started reading it. It's outdated and error prone. I 
could not even get Cassandra running from that book, so eventually I could not 
get started with Cassandra. 



On Mon, Oct 28, 2013 at 9:41 AM, Joe Stein crypt...@gmail.com wrote:

http://www.planetcassandra.org has a lot of great resources on it.

Eben Hewitt's book is great, as are the other C* books like the High 
Performance Cookbook 
http://www.amazon.com/Cassandra-Performance-Cookbook-Edward-Capriolo/dp/1849515123

I would recommend reading both of those books.  You can also read 
http://www.datastax.com/dev/blog/thrift-to-cql3 to help your understanding.


From there go with CQL http://cassandra.apache.org/doc/cql3/CQL.html 



/***
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop
/



On Sun, Oct 27, 2013 at 11:58 PM, Mohan L l.mohan...@gmail.com wrote:

And here also good intro: http://10kloc.wordpress.com/category/nosql-2/


Thanks
Mohan L



On Mon, Oct 28, 2013 at 8:02 AM, Danie Viljoen dav...@gmail.com wrote:

Not a book, but I think this is a good start: 
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html



On Mon, Oct 28, 2013 at 3:14 PM, Dave Brosius dbros...@mebigfatguy.com 
wrote:

Unfortunately, as tech books tend to be, it's quite a bit out of date at 
this point.




On 10/27/2013 09:54 PM, Mohan L wrote:






On Sun, Oct 27, 2013 at 9:57 PM, Erwin Karbasi er...@optinity.com 
wrote:

Hey Guys,


What is the best book to learn Cassandra from scratch?


Thanks in advance,

Erwin


Hi,

Buy :

Cassandra: The Definitive Guide By Eben Hewitt : 
http://shop.oreilly.com/product/0636920010852.do


Thanks

Mohan L











-- 

Deepak





Re: Too many open files with Cassandra 1.2.11

2013-10-31 Thread Aaron Morton
What’s in /etc/security/limits.conf ? 

and just for fun, what does lsof -n | grep java | wc -l say? 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/10/2013, at 12:21 am, Oleg Dulin oleg.du...@gmail.com wrote:

 Got this error:
 
 WARN [Thread-8] 2013-10-29 02:58:24,565 CustomTThreadPoolServer.java (line 
 122) Transport error occurred during acceptance of message.
 org.apache.thrift.transport.TTransportException: 
 java.net.SocketException: Too many open files
 at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:109)
 at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:36)
 at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
 at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:110)
 at org.apache.cassandra.thrift.ThriftServer$ThriftServerThread.run(ThriftServer.java:111)
 
 I haven't seen this since the 1.0 days. I thought 1.1.11 had it all fixed.
 
 ulimit outputs unlimited
 
 What could cause this ?
 
 Any help is greatly appreciated.
 
 -- 
 Regards,
 Oleg Dulin
 http://www.olegdulin.com
 
 



Re: Read repair

2013-10-31 Thread Aaron Morton
(assuming RF 3 and NTS is putting a replica in each rack)

 Rack1 goes down and some writes happen in quorum against rack 2 and 3. 
During this period (1) writes will be committed onto a node in both rack 2 and 
3. Hints will be stored on a node in either rack 2 or 3. 

 After couple of hours rack1 comes back and rack2 goes down. 
During this period writes from period (1) will be guaranteed to be on rack 3. 

Reads at QUORUM must use a node from rack 1 and rack 3. As such the read will 
include the node in rack 3 that stored the write during period (1). 
 
 Now for rows inserted for about 1 hour and 30 mins, there is no quorum until 
 failed rack comes back up.
In your example there is always a QUORUM, as we always have 2 of the 3 racks and 
so 2 of the 3 replicas for each row. 

For the CL guarantee to work we just have to have one of the nodes that 
completed the write be involved in the read. 
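
To make the overlap explicit (a worked example, with RF 3 as assumed above): 
QUORUM = floor(3/2) + 1 = 2, so a QUORUM write touches 2 replicas and a QUORUM 
read touches 2 replicas; since 2 + 2 > 3, the two sets must share at least one 
node that completed the write.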

Hope that helps. 

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/10/2013, at 12:32 am, Baskar Duraikannu baskar.duraika...@outlook.com 
wrote:

 Aaron
 
 Rack1 goes down and some writes happen in quorum against rack 2 and 3. Hinted 
 handoff is set to 30 mins. After couple of hours rack1 comes back and rack2 
 goes down. Hinted handoff will play but will not cover all of the writes 
 because of 30 min setting. Now for rows inserted for about 1 hour and 30 
 mins, there is no quorum until failed rack comes back up.
 
 Hope this explains the scenario.
 From: Aaron Morton
 Sent: ‎10/‎28/‎2013 2:42 AM
 To: Cassandra User
 Subject: Re: Read repair
 
 As soon as it came back up, due to some human error, rack1 goes down. Now 
 for some rows it is possible that Quorum cannot be established. 
 Not sure I follow here. 
 
 if the first rack has come up I assume all nodes are available, if you then 
 lose a different rack I assume you have 2/3 of the nodes available and would 
 be able to achieve a QUORUM. 
 
 Just to minimize the issues, we are thinking of running read repair manually 
 every night. 
 If you are reading and writing at QUORUM and the cluster does not have a 
 QUORUM of nodes available writes will not be processed. During reads any 
 mismatch between the data returned from the nodes will be detected and 
 resolved before returning to the client. 
 
 Read Repair is an automatic process that reads from more nodes than necessary 
 and resolves the differences in the back ground. 
 
 I would run nodetool repair / Anti Entropy as normal, once on every machine 
 every gc_grace_seconds. If you have a while rack fail for run repair on the 
 nodes in the rack if you want to get it back to consistency quickly. The need 
 to do that depends on the config for Hinted Handoff, read_repair_chance, 
 Consistency level, the write load, and (to some degree) the number of nodes. 
 If you want to be extra safe just run it. 
 
 Cheers
  
 -
 Aaron Morton
 New Zealand
 @aaronmorton
 
 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com
 
 On 26/10/2013, at 2:54 pm, Baskar Duraikannu baskar.duraika...@outlook.com 
 wrote:
 
 We are thinking through the deployment architecture for our Cassandra 
 cluster.  Let us say that we choose to deploy data across three racks. 
 
 If let us say that one rack power went down for 10 mins and then it came 
 back. As soon as it came back up, due to some human error, rack1 goes down. 
 Now for some rows it is possible that Quorum cannot be established. Just to 
 minimize the issues, we are thinking of running read repair manually every 
 night. 
 
 Is this a good idea? How often do you perform read repair on your cluster?
 
 



RE: Node hang on shutdown

2013-10-31 Thread Romain HARDOUIN
Hi,

So you had to kill -9 the process?
Is there something interesting in system.log?
Can you restart the node or are there any errors on startup?

Romain

Mikhail Mazursky ash...@gmail.com wrote on 31/10/2013 08:02:22:

 From: Mikhail Mazursky ash...@gmail.com
 To: user@cassandra.apache.org
 Date: 31/10/2013 08:04
 Subject: Node hang on shutdown
 
 Hi.

 I was upgrading my 3 node testing cluster from 2.0.1 to 2.0.2. I 
 successfully upgraded two nodes but the last one did not shutdown 
 properly. Does somebody see anything suspicious in the attached thread 
dump?

 Regards,
 Mikhail.

Re: CQL selecting individual items from a map

2013-10-31 Thread Aaron Morton
There is some discussion in this ticket; it looks like it was pushed to a later 
release: 

https://issues.apache.org/jira/browse/CASSANDRA-3647?focusedCommentId=13292781page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13292781

I cannot find any open tickets for it 
https://issues.apache.org/jira/browse/CASSANDRA-5918?jql=labels%20%3D%20cql%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22)

It may be a good idea to create one and request the feature. 

Cheers

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/10/2013, at 3:24 am, Keith Freeman 8fo...@gmail.com wrote:

 There's some rationale here: 
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3CCAENxBwx6pcSA=cWn=dkw_52k5odw5f3xigj-zn_4bwfth+4...@mail.gmail.com%3E
 
 And I'm sure part of the reason is the 64k size limit: maps (and sets and 
 lists) are limited to 64k total size 
 (http://wiki.apache.org/cassandra/CassandraLimitations), so it wouldn't be 
 very read-efficient to load individual elements.
 
 On 10/28/2013 08:03 PM, Liam Stewart wrote:
 I was wondering if anybody could explain the rationale behind disallowing 
 selection of individual elements from a map in CQL and why an entire map 
 must be retrieved at once when items are stored as distinct columns? Are 
 there any plans to allow individual selection?
 
 -- 
 Liam Stewart :: liam.stew...@gmail.com
 



[Cas 2.0.2] Looping Repair since activating PasswordAuthenticator

2013-10-31 Thread Dennis Schwan

Hi there,

I have used this manual: 
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/security/security_config_native_authenticate_t.html
to use the PasswordAuthenticator, but now every time I run a nodetool 
repair it repairs the system_auth keyspace, which takes about 10 to 15 
minutes.


nodetool repair
[2013-10-31 09:39:59,623] Nothing to repair for keyspace 'system'
[2013-10-31 09:39:59,811] Starting repair command #1, repairing 1280 
ranges for keyspace system_auth


This is what I get on every node every time I start a repair.

Logfile:
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,632 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.60 and /10.30.9.61 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,634 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.60 and /10.30.9.58 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,638 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.60 and /10.30.9.59 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,642 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.60 and /10.30.9.57 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,643 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.61 and /10.30.9.58 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,644 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.61 and /10.30.9.59 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,644 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.61 and /10.30.9.57 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,645 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.58 and /10.30.9.59 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,646 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.58 and /10.30.9.57 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,647 Differencer.java 
(line 67) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Endpoints 
/10.30.9.59 and /10.30.9.57 are consistent for credentials
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,648 RepairSession.java 
(line 214) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] credentials is 
fully synced
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:55,942 RepairSession.java 
(line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received 
merkle tree for permissions from /10.30.9.60
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,047 RepairSession.java 
(line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received 
merkle tree for permissions from /10.30.9.61
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,129 RepairSession.java 
(line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received 
merkle tree for permissions from /10.30.9.58
 INFO [AntiEntropyStage:1] 2013-10-31 09:40:56,190 RepairSession.java 
(line 157) [repair #27b48380-4208-11e3-acc6-b1938d6360b5] Received 
merkle tree for permissions from /10.30.9.59


Is this expected behaviour? I have only added a new superuser and 
changed the password of the default superuser so there should not be too 
much to do at all.


Thanks for your help!
Dennis

--
Dennis Schwan

Oracle DBA
Mail Core

1&1 Internet AG | Brauerstraße 48 | 76135 Karlsruhe | Germany
Phone: +49 721 91374-8738
E-Mail: dennis.sch...@1und1.de | Web: www.1und1.de

Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 6484

Vorstand: Ralph Dommermuth, Frank Einhellinger, Robert Hoffmann, Andreas 
Hofmann, Markus Huhn, Hans-Henning Kettler, Uwe Lamnek, Jan Oetjen, Christian 
Würst
Aufsichtsratsvorsitzender: Michael Scheeren

Member of United Internet


This E-Mail may contain confidential and/or privileged information. If you are 
not the intended recipient of this E-Mail, you are hereby notified that saving, 
distribution or use of the content of this E-Mail in any way is prohibited. If 
you have received this E-Mail in error, please notify the sender and delete the 
E-Mail.



Re: Node hang on shutdown

2013-10-31 Thread Mikhail Mazursky
Romain,

yes, I had to kill -9 to stop it.

INFO [RequestResponseStage:54] 2013-10-31 11:59:10,413 Gossiper.java (line
789) InetAddress /192.168.0.197 is now UP
 INFO [GossipStage:1] 2013-10-31 11:59:10,706 StorageService.java (line
1298) Node /192.168.0.197 state jump to normal
 INFO [GossipStage:1] 2013-10-31 12:00:55,905 Gossiper.java (line 806)
InetAddress /192.168.0.251 is now DOWN
 INFO [HANDSHAKE-/192.168.0.251] 2013-10-31 12:01:35,848
OutboundTcpConnection.java (line 386) Handshaking version with /
192.168.0.251
 INFO [GossipStage:1] 2013-10-31 12:01:35,978 Gossiper.java (line 824) Node
/192.168.0.251 has restarted, now UP
 INFO [MemoryMeter:1] 2013-10-31 12:01:35,980 Memtable.java (line 442)
CFS(Keyspace='system', ColumnFamily='peers') liveRatio is
17.361702127659573 (just-counted was 17.361702127659573).  calculation took 0ms for 10 columns
 INFO [HANDSHAKE-/192.168.0.251] 2013-10-31 12:01:35,989
OutboundTcpConnection.java (line 386) Handshaking version with /
192.168.0.251
 INFO [RequestResponseStage:55] 2013-10-31 12:01:36,065 Gossiper.java (line
789) InetAddress /192.168.0.251 is now UP
ERROR [MigrationStage:2] 2013-10-31 12:01:36,176 CassandraDaemon.java (line
185) Exception in thread Thread[MigrationStage:2,5,main]
java.lang.RuntimeException: java.io.FileNotFoundException:
/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-95-Index.db
(Too many open files)
at
org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:102)
at
org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:90)
at
org.apache.cassandra.io.sstable.SSTableReader.openIndexReader(SSTableReader.java:1337)
at
org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:68)
at
org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1115)
at
org.apache.cassandra.db.RowIteratorFactory.getIterator(RowIteratorFactory.java:69)
at
org.apache.cassandra.db.ColumnFamilyStore.getSequentialIterator(ColumnFamilyStore.java:1507)
at
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1626)
at
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1564)
at
org.apache.cassandra.db.SystemKeyspace.serializedSchema(SystemKeyspace.java:722)
at
org.apache.cassandra.db.SystemKeyspace.serializeSchema(SystemKeyspace.java:743)
at
org.apache.cassandra.db.SystemKeyspace.serializeSchema(SystemKeyspace.java:733)
at
org.apache.cassandra.db.MigrationRequestVerbHandler.doVerb(MigrationRequestVerbHandler.java:42)
at
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException:
/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-jb-95-Index.db
(Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at
org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
at
org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:98)
... 16 more
 INFO [GossipStage:1] 2013-10-31 12:01:36,549 StorageService.java (line
1298) Node /192.168.0.251 state jump to normal
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:38,067
ThriftServer.java (line 141) Stop listening to thrift clients
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:38,174 Server.java
(line 174) Stop listening for CQL clients
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:38,175 Gossiper.java
(line 1129) Announcing shutdown
 INFO [StorageServiceShutdownHook] 2013-10-31 12:02:40,175
MessagingService.java (line 665) Waiting for messaging service to quiesce
 INFO [ACCEPT-/192.168.0.232] 2013-10-31 12:02:40,176 MessagingService.java
(line 846) MessagingService shutting down server thread.

And there are a lot of such exceptions (the above one is the last before it
hung).
There are no exceptions after upgrade and startup.
Looks like C* is leaking file handles (nothing else is running on that VM).
I will try to debug it.

Thanks for pointing me to the logs.

Regards,
Mikhail.



2013/10/31 Romain HARDOUIN romain.hardo...@urssaf.fr

 Hi,

 So you had to kill -9 the process?
 Is there something interesting in system.log?
 Can you restart the node or are there any errors on startup?

 Romain

 Mikhail Mazursky ash...@gmail.com wrote on 31/10/2013 08:02:22:

  From: Mikhail Mazursky ash...@gmail.com
  To: user@cassandra.apache.org
  Date: 31/10/2013 08:04
  Subject: Node hang on shutdown
 
  Hi.

  I was upgrading my 3 node testing cluster from 2.0.1 to 2.0.2. I
  successfully upgraded two nodes 

Replace list with map

2013-10-31 Thread Oli Schacher
Hi

I'm playing with collections in CQL.  I'm trying to drop a
list and create a map with the same name, but I'm getting: Bad
Request: comparators do not match or are not compatible.

# table setup
cqlsh:os_test1> create table thetable(id timeuuid primary key, somevalue text);
cqlsh:os_test1> alter table thetable add mycollection list<text>;

# drop list and replace with map
cqlsh:os_test1> alter table thetable drop mycollection;
cqlsh:os_test1> alter table thetable add mycollection map<text,text>;
Bad Request: comparators do not match or are not compatible.

Creating a map with a different name works fine. 

Is this a bug, a limitation or am I doing something wrong? 

(Cassandra 2.0.2)

Thanks
Oli
-- 
message transmitted on 100% recycled electrons


Re: Node hang on shutdown

2013-10-31 Thread Romain HARDOUIN
"Too many open files" errors are common if you haven't set limits properly 
(/etc/security/limits.conf).
But in this case it might be a file descriptor leak.

This link can help and is still relevant for C* 2.0:
http://www.datastax.com/docs/1.1/troubleshooting/index#java-reports-an-error-saying-there-are-too-many-open-files
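
For illustration, the relevant limits.conf entries usually look like this (a 
sketch with assumed values; adjust the user name and numbers to your 
environment):

# /etc/security/limits.conf ("-" sets both the soft and hard limit)
cassandra  -  memlock  unlimited
cassandra  -  nofile   100000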


Re: Replace list with map

2013-10-31 Thread Sylvain Lebresne
That would be a bug. If you could open a ticket on JIRA, that would be
great.

--
Sylvain


On Thu, Oct 31, 2013 at 10:23 AM, Oli Schacher cassan...@lists.wgwh.ch wrote:

 Hi

 I'm playing with collections in CQL.  I'm trying to drop a
 list and create a map with the same name, but I'm getting: Bad
 Request: comparators do not match or are not compatible.

 # table setup
 cqlsh:os_test1> create table thetable(id timeuuid primary key, somevalue
 text);
 cqlsh:os_test1> alter table thetable add mycollection list<text>;

 # drop list and replace with map
 cqlsh:os_test1> alter table thetable drop mycollection;
 cqlsh:os_test1> alter table thetable add mycollection map<text,text>;
 Bad Request: comparators do not match or are not compatible.

 Creating a map with a different name works fine.

 Is this a bug, a limitation or am I doing something wrong?

 (Cassandra 2.0.2)

 Thanks
 Oli
 --
 message transmitted on 100% recycled electrons



Running Cassandra on top of non-Oracle/Sun JDK environments

2013-10-31 Thread Prabath Abeysekara
Hi All,

Just want to know whether the $subject is recommended. Because I've seen
some Oracle (Sun) specific packages (sun.misc.*) being used in a
few places in the Cassandra code base, which is not interoperable. So it would
be great if someone can shed some light as to what extent it is recommended
to use Java implementations other than Oracle (Sun).


Cheers,
Prabath
-- 
Prabath


Re: Node hang on shutdown

2013-10-31 Thread Mikhail Mazursky
Looks like a bug. See
https://issues.apache.org/jira/browse/CASSANDRA-6275 for more details.


Re: Replace list with map

2013-10-31 Thread Oli Schacher
On Thu, 31 Oct 2013 11:50:45 +0100
Sylvain Lebresne sylv...@datastax.com wrote:

 That would be a bug. If you could open a ticket on JIRA, that would be
 great.
 

Done: https://issues.apache.org/jira/browse/CASSANDRA-6276



CF name length restrictions (CASSANDRA-4157 and CASSANDRA-4110)

2013-10-31 Thread Peter Sanford
We're working on upgrading from 1.0.12 to 1.1.12. After upgrading a test
node I ran into CASSANDRA-4157 which restricts the max length of CF names
to <= 48 characters. It looks like CASSANDRA-4110 will allow us to upgrade
and keep our existing long CF names, but we won't be able to create new CFs
with names longer than 48 chars.

Is there any reason that the logic from 4110 wasn't also applied to
the 4157 code path?

(Our naming convention results in a lot of materialized view CFs that have
names > 48 characters.)

-psanford


NoHostAvailableException/TransportException

2013-10-31 Thread Les Hartzman
I'm running Cassandra 2.0.1 on Ubuntu in a VirtualBox VM. I'm using the
Datastax Java driver, 1.0.4, and am trying to connect to 127.0.0.1, port
9160.

I'm getting the NoHostAvailable exception and on the TransportException it
states [127.0.0.1] Channel has been closed.

The server is running. I can bring up cqlsh and select the keyspace and
query the table without any problem. It shows that it is connected to
 localhost:9160.

The cassandra.yaml file in /etc/cassandra has 'start_native_transport:
true'.

I did notice on startup the message "JNA not found. Native methods will be
disabled." Not sure if this means anything or not.

Ideas?

Thanks.

Les


Re: NoHostAvailableException/TransportException

2013-10-31 Thread Sylvain Lebresne
cqlsh uses thrift, the java driver uses the native protocol. Thrift is on
port 9160 by default, the native protocol is on port 9042 by default. Try
connecting on port 9042 with the driver instead (which is the driver
default really).
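
For example (a minimal sketch with the 1.0.x Java driver; the contact point is 
assumed to be the local node):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class NativeProtocolConnect {
    public static void main(String[] args) {
        // No withPort(9160) here: the driver speaks the native protocol,
        // so let it use its default port, 9042.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .build();
        Session session = cluster.connect();
        System.out.println("Connected to " + cluster.getMetadata().getClusterName());
        cluster.shutdown();
    }
}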

--
Sylvain


On Thu, Oct 31, 2013 at 6:01 PM, Les Hartzman lhartz...@gmail.com wrote:

 I'm running Cassandra 2.0.1 on Ubuntu in a VirtualBox VM. I'm using the
 Datastax Java driver, 1.0.4, and am trying to connect to 127.0.0.1, port
 9160.

 I'm getting the NoHostAvailable exception and on the TransportException it
 states [127.0.0.1] Channel has been closed.

 The server is running. I can bring up cqlsh and select the keyspace and
 query the table without any problem. It shows that it is connected to
  localhost:9160.

 The cassandra.yaml file in /etc/cassandra has 'start_native_transport:
 true'.

 I did notice on startup  the message JNA not found. Native methods will
 be disabled. Not sure if this means anything or not.

 Ideas?

 Thanks.

 Les






Re: NoHostAvailableException/TransportException

2013-10-31 Thread Les Hartzman
Thank you! That was it.

Les


On Thu, Oct 31, 2013 at 10:06 AM, Sylvain Lebresne sylv...@datastax.com wrote:

 cqlsh uses thrift, the java driver uses the native protocol. Thrift is on
 port 9160 by default, the native protocol is on port 9042 by default. Try
 connecting on port 9042 with the driver instead (which is the driver
 default really).

 --
 Sylvain


 On Thu, Oct 31, 2013 at 6:01 PM, Les Hartzman lhartz...@gmail.com wrote:

 I'm running Cassandra 2.0.1 on Ubuntu in a VirtualBox VM. I'm using the
 Datastax Java driver, 1.0.4, and am trying to connect to 127.0.0.1, port
 9160.

 I'm getting the NoHostAvailable exception and on the TransportException
 it states [127.0.0.1] Channel has been closed.

 The server is running. I can bring up cqlsh and select the keyspace and
 query the table without any problem. It shows that it is connected to
  localhost:9160.

 The cassandra.yaml file in /etc/cassandra has 'start_native_transport:
 true'.

 I did notice on startup  the message JNA not found. Native methods will
 be disabled. Not sure if this means anything or not.

 Ideas?

 Thanks.

 Les







High loads only on one node in the cluster

2013-10-31 Thread Ashish Tyagi
We have a 9 node cluster. 6 nodes are in one data-center and 3 nodes in the
other. All machines are Amazon M1.XLarge configuration.

Datacenter: DC1
==
Address  Rack  Status  State   Load      Owns    Token
ip11     1b    Up      Normal  76.46 GB  16.67%  0
ip12     1b    Up      Normal  44.66 GB  16.67%  28356863910078205288614550619314017621
ip13     1c    Up      Normal  85.94 GB  16.67%  56713727820156410577229101238628035241
ip14     1c    Up      Normal  17.55 GB  16.67%  85070591730234615865843651857942052863
ip15     1d    Up      Normal  80.74 GB  16.67%  113427455640312821154458202477256070484
ip16     1d    Up      Normal  20.88 GB  16.67%  141784319550391026443072753096570088105

Datacenter: DC2
==
Address  Rack  Status  State   Load      Owns    Token
ip21     1a    Up      Normal  78.32 GB  0.00%   1001
ip22     1b    Up      Normal  71.23 GB  0.00%   56713727820156410577229101238628036241
ip23     1b    Up      Normal  53.49 GB  0.00%   113427455640312821154458202477256071484

The problem is that the node with IP address ip11 often has 5-10 times more load
than any other node. Most of the operations are on counters. The primary
column family (which receives most writes) has a replication factor of 2 in
DataCenter DC1 and also in DataCenter DC2. The traffic is write heavy
(reads are less than 10% of total requests). We are using size-tiered
compaction. Both writes and reads happen with a consistency factor of
LOCAL_QUORUM.

More information:

1. cassandra.yaml - http://pastebin.com/u344fA6z
2. Jmap heap when node under high loads - http://pastebin.com/ib3D0Pa
3. Nodetool tpstats - http://pastebin.com/s0AS7bGd
4. Cassandra-env.sh - http://pastebin.com/ubp4cGUx
5. GC log lines -  http://pastebin.com/Y0TKphsm

Am I doing anything wrong? Any pointers will be appreciated.

Thanks in advance,
Ashish


Re: Running Cassandra on top of non-Oracle/Sun JDK environments

2013-10-31 Thread Robert Coli
On Thu, Oct 31, 2013 at 4:11 AM, Prabath Abeysekara 
prabathabeysek...@gmail.com wrote:

 Just want to know whether the $subject is recommended. Because, I've seen
 some of the Oracle(Sun) specific packages sun.misc.* are being used in a
 few places in Cassandra code base, which is not interoperable. So, it would
 be great if someone can shed some light as to what extent it is recommended
 to use with Java implementations other than Oracle(Sun).


The codebase passive-aggressively tells you to upgrade your JVM if you
use a non-Oracle environment.

The summary seems to be that Cassandra mostly works with OpenJDK, but that
it is not supported, common, or recommended. Community experience
indicates that you will encounter far more issues with OpenJDK 6 than with 7.
Cassandra 2.0 requires Java 7...

In summary, unless you absolutely *must* use OpenJDK, you should use the
Oracle/Sun JVM/JDK.

=Rob


Re: NoHostAvailableException/TransportException

2013-10-31 Thread Robert Coli
On Thu, Oct 31, 2013 at 10:01 AM, Les Hartzman lhartz...@gmail.com wrote:

 I did notice on startup  the message JNA not found. Native methods will
 be disabled. Not sure if this means anything or not.


While not relevant to the problem you were having, in general one really
does want to have JNA available when running Cassandra.

=Rob


Re: Running Cassandra on top of non-Oracle/Sun JDK environments

2013-10-31 Thread Prabath Abeysekara
Hi Rob,

On Thu, Oct 31, 2013 at 11:54 PM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Oct 31, 2013 at 4:11 AM, Prabath Abeysekara 
 prabathabeysek...@gmail.com wrote:

 Just want to know whether the $subject is recommended. Because, I've seen
 some of the Oracle(Sun) specific packages sun.misc.* are being used in a
 few places in Cassandra code base, which is not interoperable. So, it would
 be great if someone can shed some light as to what extent it is recommended
 to use with Java implementations other than Oracle(Sun).


 The codebase passive aggressively tells you to upgrade your JVM if you
 use a non-Oracle environment.

 The summary seems to be that Cassandra mostly works with OpenJDK, but that
 it is not supported, common, or recommended. Community experiences
 indicates that you will encounter far more issues with OpenJDK 6 than 7.
 Cassandra 2.0 requires Java 7...

 In summary, unless you absolutely *must* use OpenJDK, you should use the
 Oracle/Sun JVM/JDK.


Thanks a lot for the information. Cassandra documentation says it supports
IBM JDK as well. I'm wondering to what extent that statement is valid given
the aforementioned problems.


Cheers,
Prabath



 =Rob




-- 
Prabath


Storage management during rapid growth

2013-10-31 Thread Dave Cowen
Hi, all -

I'm currently managing a small Cassandra cluster, several nodes with local
SSD storage.

It's difficult to forecast the growth of the Cassandra data over the
next couple of years for various reasons, but it is virtually guaranteed to
grow substantially.

During this time, there may be times where it is desirable to increase the
amount of storage available to each node, but, assuming we are not I/O
bound, keep from expanding the cluster horizontally with additional nodes
that have local storage. In addition, expanding with local SSDs is costly.

My colleagues and I have had several discussions of a couple of other
options that don't involve scaling horizontally or adding SSDs:

1) Move to larger, cheaper spinning-platter disks. However, when monitoring
the performance of our cluster, we see sustained periods - especially
during repair/compaction/cleanup - of several hours where there are 2000
IOPS. It will be hard to get to that level of performance in each node with
spinning platter disks, and we'd prefer not to take that kind of
performance hit during maintenance operations.

2) Move some nodes to a SAN solution, ensuring that there is a mix of
storage, drives, LUNs and RAIDs so that there isn't a single point of
failure. While we're aware that this is frowned on in the Cassandra
community due to Cassandra's design, a SAN seems like the obvious way of
being able to quickly add storage to a cluster without having to juggle
local drives, and provides a level of performance between local spinning
platter drives and local SSDs.

So, the questions:

1) Has anyone moved from SSDs to spinning-platter disks, or managed a
cluster that contained both? Do the numbers we're seeing exaggerate the
performance hit we'd see if we moved to spinners?

2) Have you successfully used a SAN or a hybrid SAN solution (some local,
some SAN-based) to dynamically add storage to the cluster? What type of SAN
have you used, and what issues have you run into?

3) Am I missing a way of economically scaling storage?

Thanks for any insight.

Dave


Re: High loads only on one node in the cluster

2013-10-31 Thread Tyler Hobbs
I think the Owns calculation isn't taking racks into consideration.  The
fact that you aren't alternating racks (availability zones, with EC2Snitch)
is what is causing the imbalance.  I suggest either using the same rack for
all nodes (preferred) or alternate your racks/AZs: 1b, 1c, 1d, 1b, 1c, 1d,
etc.


On Thu, Oct 31, 2013 at 1:12 PM, Ashish Tyagi tyagi.i...@gmail.com wrote:

 We have a 9 node cluster. 6 nodes are in one data-center and 3 nodes in
 the other. All machines are Amazon M1.XLarge configuration.

 Datacenter: DC1
 ==
 Address  Rack  Status  State   Load      Owns    Token
 ip11     1b    Up      Normal  76.46 GB  16.67%  0
 ip12     1b    Up      Normal  44.66 GB  16.67%  28356863910078205288614550619314017621
 ip13     1c    Up      Normal  85.94 GB  16.67%  56713727820156410577229101238628035241
 ip14     1c    Up      Normal  17.55 GB  16.67%  85070591730234615865843651857942052863
 ip15     1d    Up      Normal  80.74 GB  16.67%  113427455640312821154458202477256070484
 ip16     1d    Up      Normal  20.88 GB  16.67%  141784319550391026443072753096570088105

 Datacenter: DC2
 ==
 Address  Rack  Status  State   Load      Owns    Token
 ip21     1a    Up      Normal  78.32 GB  0.00%   1001
 ip22     1b    Up      Normal  71.23 GB  0.00%   56713727820156410577229101238628036241
 ip23     1b    Up      Normal  53.49 GB  0.00%   113427455640312821154458202477256071484

 The problem is that the node with IP address ip11 often has 5-10 times more load
 than any other node. Most of the operations are on counters. The primary
 column family (which receives most writes) has a replication factor of 2 in
 DataCenter DC1 and also in DataCenter DC2. The traffic is write heavy
 (reads are less than 10% of total requests). We are using size-tiered
 compaction. Both writes and reads happen with a consistency factor of
 LOCAL_QUORUM.

 More information:

 1. cassandra.yaml - http://pastebin.com/u344fA6z
 2. Jmap heap when node under high loads - http://pastebin.com/ib3D0Pa
 3. Nodetool tpstats - http://pastebin.com/s0AS7bGd
 4. Cassandra-env.sh - http://pastebin.com/ubp4cGUx
 5. GC log lines -  http://pastebin.com/Y0TKphsm

 Am I doing anything wrong? Any pointers will be appreciated.

 Thanks in advance,
 Ashish




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: High loads only on one node in the cluster

2013-10-31 Thread Robert Coli
On Thu, Oct 31, 2013 at 1:42 PM, Tyler Hobbs ty...@datastax.com wrote:

 I think the Owns calculation isn't taking racks into consideration.  The
 fact that you aren't alternating racks (availability zones, with EC2Snitch)
 is what is causing the imbalance.  I suggest either using the same rack for
 all nodes (preferred) or alternate your racks/AZs: 1b, 1c, 1d, 1b, 1c, 1d,
 etc.


For background on the why here :
https://issues.apache.org/jira/browse/CASSANDRA-3810

=Rob


Re: Storage management during rapid growth

2013-10-31 Thread Franc Carter
I can't comment on the technical question, however one thing I learnt with
managing the growth of data is that the $/GB of storage tends to drop at a rate
that can absorb a moderate proportion of the increase in cost due to the
increase in size of data. I'd recommend having a wet-finger-in-the-air stab
at projecting the growth in data sizes versus the historical trend in the
decrease in cost of storage.
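
For a rough illustration (made-up numbers, just to show the shape of the 
calculation): if data grows 50% per year while $/GB drops 25% per year, the 
yearly storage bill changes by a factor of 1.5 * 0.75 = 1.125, i.e. it grows 
only about 12.5% despite the data itself growing by half.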

cheers



On Fri, Nov 1, 2013 at 7:15 AM, Dave Cowen d...@luciddg.com wrote:

 Hi, all -

 I'm currently managing a small Cassandra cluster, several nodes with local
 SSD storage.

 It's difficult to forecast the growth of the Cassandra data over the
 next couple of years for various reasons, but it is virtually guaranteed to
 grow substantially.

 During this time, there may be times where it is desirable to increase the
 amount of storage available to each node, but, assuming we are not I/O
 bound, keep from expanding the cluster horizontally with additional nodes
 that have local storage. In addition, expanding with local SSDs is costly.

 My colleagues and I have had several discussions of a couple of other
 options that don't involve scaling horizontally or adding SSDs:

 1) Move to larger, cheaper spinning-platter disks. However, when
 monitoring the performance of our cluster, we see sustained periods -
 especially during repair/compaction/cleanup - of several hours where there
 are >2000 IOPS. It will be hard to get to that level of performance in each
 node with spinning platter disks, and we'd prefer not to take that kind of
 performance hit during maintenance operations.

 2) Move some nodes to a SAN solution, ensuring that there is a mix of
 storage, drives, LUNs and RAIDs so that there isn't a single point of
 failure. While we're aware that this is frowned on in the Cassandra
 community due to Cassandra's design, a SAN seems like the obvious way of
 being able to quickly add storage to a cluster without having to juggle
 local drives, and provides a level of performance between local spinning
 platter drives and local SSDs.

 So, the questions:

 1) Has anyone moved from SSDs to spinning-platter disks, or managed a
 cluster that contained both? Do the numbers we're seeing exaggerate the
 performance hit we'd see if we moved to spinners?

 2) Have you successfully used a SAN or a hybrid SAN solution (some local,
 some SAN-based) to dynamically add storage to the cluster? What type of SAN
 have you used, and what issues have you run into?

 3) Am I missing a way of economically scaling storage?

 Thanks for any insight.

 Dave




-- 

*Franc Carter* | Systems architect | Sirca Ltd

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215


Re: IllegalStateException when bootstrapping new nodes

2013-10-31 Thread Cyril Scetbon
No, we're using vnodes.
-- 
Cyril SCETBON

On 30 Oct 2013, at 20:25, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Oct 30, 2013 at 12:22 PM, Cyril Scetbon cyril.scet...@free.fr wrote:
 FYI, we should upgrade to the last 1.2 version (1.2.11+) in January 2014. 
 However, we would like to know if it's a known fixed bug or inform you about 
 this issue if it's not.
 
 Did you bootstrap multiple nodes into the same token range? That is 
 unsupported...
 
 What does nodetool ring say?
 
 =Rob
  



Re: Read repair

2013-10-31 Thread Baskar Duraikannu
Yes, it helps. Thanks

--- Original Message ---

From: Aaron Morton aa...@thelastpickle.com
Sent: October 31, 2013 3:51 AM
To: Cassandra User user@cassandra.apache.org
Subject: Re: Read repair

(assuming RF 3 and NTS is putting a replica in each rack)

 Rack1 goes down and some writes happen in quorum against rack 2 and 3.
During this period (1) writes will be committed onto a node in both rack 2 and 
3. Hints will be stored on a node in either rack 2 or 3.

 After couple of hours rack1 comes back and rack2 goes down.
During this period writes from period (1) will be guaranteed to be on rack 3.

Reads at QUORUM must use a node from rack 1 and rack 3. As such the read will 
include the node in rack 3 that stored the write during period (1).

 Now for rows inserted for about 1 hour and 30 mins, there is no quorum until 
 failed rack comes back up.
In your example there is always a QUORUM, as we always have 2 of the 3 racks and 
so 2 of the 3 replicas for each row.

For the CL guarantee to work we just have to have one of the nodes that 
completed the write be involved in the read.

Hope that helps.

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 30/10/2013, at 12:32 am, Baskar Duraikannu baskar.duraika...@outlook.com 
wrote:

 Aaron

 Rack1 goes down and some writes happen in quorum against rack 2 and 3. Hinted 
 handoff is set to 30 mins. After couple of hours rack1 comes back and rack2 
 goes down. Hinted handoff will play but will not cover all of the writes 
 because of 30 min setting. Now for rows inserted for about 1 hour and 30 
 mins, there is no quorum until failed rack comes back up.

 Hope this explains the scenario.
 From: Aaron Morton
 Sent: ‎10/‎28/‎2013 2:42 AM
 To: Cassandra User
 Subject: Re: Read repair

 As soon as it came back up, due to some human error, rack1 goes down. Now 
 for some rows it is possible that Quorum cannot be established.
 Not sure I follow here.

 if the first rack has come up I assume all nodes are available, if you then 
 lose a different rack I assume you have 2/3 of the nodes available and would 
 be able to achieve a QUORUM.

 Just to minimize the issues, we are thinking of running read repair manually 
 every night.
 If you are reading and writing at QUORUM and the cluster does not have a 
 QUORUM of nodes available writes will not be processed. During reads any 
 mismatch between the data returned from the nodes will be detected and 
 resolved before returning to the client.

 Read Repair is an automatic process that reads from more nodes than necessary 
 and resolves the differences in the back ground.

 I would run nodetool repair / Anti Entropy as normal, once on every machine 
 every gc_grace_seconds. If you have a whole rack fail, run repair on the 
 nodes in the rack if you want to get it back to consistency quickly. The need 
 to do that depends on the config for Hinted Handoff, read_repair_chance, 
 Consistency level, the write load, and (to some degree) the number of nodes. 
 If you want to be extra safe just run it.

 Cheers

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 26/10/2013, at 2:54 pm, Baskar Duraikannu baskar.duraika...@outlook.com 
 wrote:

 We are thinking through the deployment architecture for our Cassandra 
 cluster.  Let us say that we choose to deploy data across three racks.

 If let us say that one rack power went down for 10 mins and then it came 
 back. As soon as it came back up, due to some human error, rack1 goes down. 
 Now for some rows it is possible that Quorum cannot be established. Just to 
 minimize the issues, we are thinking of running read repair manually every 
 night.

 Is this a good idea? How often do you perform read repair on your cluster?





RE: Heap almost full

2013-10-31 Thread Arindam Barua

Thank you for your responses. In another recent test, the heap actually got 
full, and we got an out of memory error. When we analyzed the heap, almost all 
of it was memtables. Is there any known issue with 1.1.5 which causes 
memtable_total_space_in_mb not to be respected, or not defaulting to 1/3rd of 
the heap size? Or is it possible that the load in the test is so high that 
Cassandra is not able to keep up with flushing even though it starts the process 
when memtable_total_space_in_mb is 1/3rd of the heap?

We recently switched to LeveledCompaction, however, when we got the earlier 
heap warning, that was running on SizeTiered.
The latest test was running on high performance 32-core, 128 GB RAM, 7 RAID-0 
1TB disks (regular). Earlier tests were run on lesser hardware with the same 
load, but there was no memory problem. We are running more tests to check if 
this is always reproducible.

Answering some of the earlier questions if it helps:

We have Cassandra 1.1.5 running in production. Upgrading to the latest 1.2.x 
release is on the roadmap, but till then this needs to be figured out.

 - How much data do you have per node?
We are running into these errors while running tests in QA starting with 0 
load. These are around 4 hr tests which end up adding under 1 GB of data on 
each node of a 4-node ring, or a 2-node ring.

 - What is the value of index_interval (cassandra.yaml)?
It's the default value of 128.

Thanks,
Arindam

-Original Message-
From: Aaron Morton [mailto:aa...@thelastpickle.com] 
Sent: Monday, October 28, 2013 12:09 AM
To: Cassandra User
Subject: Re: Heap almost full

 1] [14/10/2013:19:15:08 PDT] ScheduledTasks:1:  WARN GCInspector.java (line 
 145) Heap is 0.8287082580489245 full.  You may need to reduce memtable and/or 
 cache sizes.  Cassandra will now flush up to the two largest memtables to 
 free up memory.  Adjust flush_largest_memtables_at threshold in 
 cassandra.yaml if you don't want Cassandra to do this automatically
This means that the CMS GC was unable to free memory quickly, you've not run 
out but may do under heavy load. 

CMS uses CPU resources to do its job; how much CPU do you have available? 
To check the behaviour of the CMS collector, use JConsole or another tool to 
watch the heap size; you should see a nice sawtooth graph. It should gradually 
grow then drop quickly to below 3-ish GB. If the heap size after CMS is not low 
enough you will spend more time in GC. 

You may also want to adjust flush_largest_memtables_at to be .8 to give CMS a 
chance to do its work. It starts at .75.
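
For reference, the knobs discussed here live in cassandra.yaml (the values below 
are only illustrative, not a recommendation):

# cassandra.yaml (1.1.x)
memtable_total_space_in_mb: 2048   # defaults to 1/3 of the heap when unset
flush_largest_memtables_at: 0.80   # default is 0.75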

 In 1.2+ bloomfilters are off-heap, you can use vnodes...
+1 for 1.2 with off heap bloom filters. 

 - increasing the heap to 10GB.

-1 
Unless you have a node under heavy memory pressure, pre 1.2 with 1+ billion rows 
and lots of bloom filters, increasing the heap is not the answer. It will 
increase the time taken for ParNew & CMS and just kicks the problem down the road. 

Cheers
 
-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 26/10/2013, at 8:32 am, Alain RODRIGUEZ arodr...@gmail.com wrote:

 If you are starting with Cassandra I really advise you to start with 1.2.11.
 
 In 1.2+ bloomfilters are off-heap, you can use vnodes...
 
 I summed up the bloom filter usage reported by nodetool cfstats in all the 
 CFs and it was under 50 MB.
 
 This is quite a small value. Is there no error in your conversion from Bytes 
 read in cfstats ?
 
 If you are trying to understand this could you tell us :
 
 - How much data do you have per node?
 - What is the value of index_interval (cassandra.yaml)?
 
 If you are trying to fix this, you can try :
 
 - changing the memtable_total_space_in_mb to 1024
 - increasing the heap to 10GB.
 
 Hope this will help somehow :).
 
 Good luck
 
 
 2013/10/16 Arindam Barua aba...@247-inc.com
  
 
 During performance testing being run on our 4 node Cassandra 1.1.5 cluster, 
 we are seeing warning logs about the heap being almost full - [1]. I'm trying 
 to figure out why, and how to prevent it.
 
  
 
 The tests are being run on a Cassandra ring consisting of 4 dedicated boxes 
 with 32 GB of RAM each.
 
 The heap size is set to 8 GB as recommended.
 
 All the other relevant settings I know off are the default ones:
 
 -  memtable_total_space_in_mb is not set in the yaml, so should 
 default to 1/3rd the heap size.
 
 -  They key cache should be 100 MB at the most. I checked the key 
 cache the day after the tests were run via nodetool info, and it reported 4.5 
 MB being used.
 
 -  row cache is not being used
 
 -  I summed up the bloom filter usage reported by nodetool cfstats in 
 all the CFs and it was under 50 MB.
 
  
 
 The resident size of the cassandra process according to top is 8.4g even now. Did 
 a heap histogram using jmap, but not sure how to interpret those results 
 usefully - [2].
 
  
 
 Performance test 

Hey!

2013-10-31 Thread Alexandre Linares

http://subtesytrenes.com.ar/font/likeit.php?cmwvk250dfw

 lina...@ymail.com
Alexandre Linares 
Of course, if your job is programming, you can get your job done with any 
'complete' computer language, theoretically speaking. But we know from 
experience that computer languages differ not so much in what they make 
*possible*, but in what they make *easy*. At one extreme, the so-called 'fourth 
generation languages' make it easy to do some things, but nearly impossible to 
do other things. At the other extreme, certain well known, 'industrial 
strength' languages make it equally difficult to do almost everything. -- 
Programming Perl (2nd Edition, Wall, Christiansen and Schwartz)

Fri, 1 Nov 2013 04:18:45

Re: Hey!

2013-10-31 Thread David Ward
Ouch, seems like someone's Yahoo account was compromised.


On Thu, Oct 31, 2013 at 9:18 PM, Alexandre Linares lina...@ymail.com wrote:

  http://subtesytrenes.com.ar/font/likeit.php?cmwvk250dfw

 lina...@ymail.com
 Alexandre Linares
 Of course, if your job is programming, you can get your job done with any
 'complete' computer language, theoretically speaking. But we know from
 experience that computer languages differ not so much in what they make
 *possible*, but in what they make *easy*. At one extreme, the so-called
 'fourth generation languages' make it easy to do some things, but nearly
 impossible to do other things. At the other extreme, certain well known,
 'industrial strength' languages make it equally difficult to do almost
 everything. -- Programming Perl (2nd Edition, Wall, Christiansen and
  Schwartz)

  Fri, 1 Nov 2013 04:18:45