Re: Grails Cassandra plugin

2010-03-12 Thread Ran Tavory
great, I'm happy you found Hector useful :)
btw, in hector 0.5.0-8 I added some interesting performance JMX counters, so
it may be worth updating yours from 0.5.0-6 to -8 when you have time.

On Fri, Mar 12, 2010 at 11:55 PM, Ned Wolpert ned.wolp...@imemories.com wrote:

 Document updated


 On Fri, Mar 12, 2010 at 2:50 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Great!

 You should also link it from
 http://wiki.apache.org/cassandra/ClientExamples (click Login at the
 top to create an account.)

 On Fri, Mar 12, 2010 at 3:57 PM, Ned Wolpert ned.wolp...@imemories.com
 wrote:
  Folks-
 
   I put together a quick n' dirty grails plugin for Cassandra, wrapped with
  Hector. It's available at http://github.com/wolpert/grails-cassandra in its
  initial state. I wouldn't call it 'production-ready' yet. :-)
 
   We're using Cassandra at work and I wanted an easy way to access Cassandra
  from a grails application, but couldn't find anything. I have some plans on
  where I want it to go, but I'm open to suggestions. I'll submit the code
  to grails plugins once I get a bit further along with it. It's pretty basic
  at this point.
 
  --
  Virtually, Ned Wolpert
  Settle thy studies, Faustus, and begin...   --Marlowe
 




 --
 Virtually, Ned Wolpert

 Settle thy studies, Faustus, and begin...   --Marlowe



Re: MapReduce in Cassandra 0.6

2010-02-27 Thread Ran Tavory
fwiw, I read the instructions at contrib/word_count/README and it has about 3
manual steps, so using an embedded cassandra instance may simplify this into a
single step and let the program do all the setup and teardown it requires.
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/
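For illustration, a rough sketch of the glue this amounts to, reusing the
InProcessCassandraServer class posted later in this archive (its package is
assumed from that post); the runWordCountJob() call is only a placeholder for
the real contrib/word_count driver:

import com.outbrain.data.cassandra.service.InProcessCassandraServer;

// Sketch only: start an embedded node, run the example, tear it down.
public class EmbeddedWordCountRunner {
  public static void main(String[] args) throws Exception {
    InProcessCassandraServer cassandra = new InProcessCassandraServer();
    cassandra.init();
    Thread t = new Thread(cassandra);
    t.setDaemon(true);
    t.start();                      // thrift now listens on localhost:9160
    try {
      // insert the sample rows the README creates by hand, then run the job:
      // runWordCountJob("localhost", 9160);   // placeholder, not a real class
    } finally {
      cassandra.stop();             // the program handles its own teardown
    }
  }
}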

On Sun, Feb 28, 2010 at 2:52 AM, Jonathan Ellis jbel...@gmail.com wrote:

 There's an example in contrib/word_count, as mentioned in NEWS.  Basic
 hadoop knowledge is assumed. :)

 Johan has been making fixes to the 0.6 branch that are not in beta2,
 so you will probably want to get that from svn.

 I've added to CHANGES in the 0.6 branch too, thanks for the heads up.

 -Jonathan

 On Sat, Feb 27, 2010 at 4:34 PM, Scott Delap sde...@riotgames.com wrote:
  I've seen rumblings that 0.6 supports map reduce.  I don't see anything
 specific about this in the changelog for beta2.  I looked at the dev and
 user mailing lists and found some older stuff about node based data
 retrieval. I also didn't see anything in JIRA explicitly about it.  Is there
 anywhere I can find further information?
 
  Scott
 
 



Hector - a Java Cassandra client

2010-02-23 Thread Ran Tavory
I've written a Java library for Cassandra that I've been using internally. I'd
love to get your feedback and hope you find it useful.
Blog post: http://prettyprint.me/2010/02/23/hector-a-java-cassandra-client/
Source: http://github.com/rantav/hector

High level features:
 o A high-level object-oriented interface to cassandra. In short: just some
niceties around thrift.
 o Failover support. If a client is connected to one host in the ring and
this host goes down, the client will automatically and transparently search
for other available hosts to perform the operation before giving up. You may
choose to FAIL_FAST (no retry, just fail if there are errors, nothing
smart), ON_FAIL_TRY_ONE_NEXT_AVAILABLE (try one more host before giving up)
or ON_FAIL_TRY_ALL_AVAILABLE (try all available hosts before giving up).
 o Connection pooling. Needless to say, it's a must.
 o JMX support. Hector exposes JMX for many runtime metrics, such as number
of available connections, idle connections, error statistics etc.
 o Support for the Command design pattern to allow clients to concentrate on
their business logic and let hector take care of the required plumbing.
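To make the failover and Command-pattern points concrete, here is a minimal,
hypothetical sketch of the idea: the caller supplies only the business logic
and a policy, and the wrapper does the host-selection and retry plumbing. The
names below (ThriftCommand, CommandRunner) are illustrative, not Hector's
actual API - see the source link above for the real classes - and the
thrift-generated client package is the 0.5-era one.

import java.util.List;
import org.apache.cassandra.service.Cassandra; // thrift-generated client

// Illustrative only -- not Hector's real API.
enum FailoverPolicy { FAIL_FAST, ON_FAIL_TRY_ONE_NEXT_AVAILABLE, ON_FAIL_TRY_ALL_AVAILABLE }

interface ThriftCommand<T> {
  T execute(Cassandra.Client client) throws Exception;  // business logic only
}

class CommandRunner {
  private final List<Cassandra.Client> clients;  // one connected client per ring host
  private final FailoverPolicy policy;

  CommandRunner(List<Cassandra.Client> clients, FailoverPolicy policy) {
    this.clients = clients;   // assumed non-empty
    this.policy = policy;
  }

  <T> T run(ThriftCommand<T> cmd) throws Exception {
    int attempts = policy == FailoverPolicy.FAIL_FAST ? 1
        : policy == FailoverPolicy.ON_FAIL_TRY_ONE_NEXT_AVAILABLE ? Math.min(2, clients.size())
        : clients.size();
    Exception last = null;
    for (int i = 0; i < attempts; i++) {
      try {
        return cmd.execute(clients.get(i));  // the next host is tried transparently on failure
      } catch (Exception e) {
        last = e;
      }
    }
    throw last;
  }
}

A caller then wraps each read or write in a ThriftCommand and never deals with
host selection or retries directly.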


Re: Hector - a Java Cassandra client

2010-02-23 Thread Ran Tavory
it supports supercolumns, yes, although I personally have only used regular
columns so far (you can see the unit tests here:
http://github.com/rantav/hector/blob/master/src/test/java/me/prettyprint/cassandra/service/KeyspaceTest.java,
search for "super")

On Tue, Feb 23, 2010 at 4:25 PM, Richard Grossman rich...@bee.tv wrote:

 Hi Ran,

 Does it support operations on super columns?
 Thanks

 On Tue, Feb 23, 2010 at 4:13 PM, Ran Tavory ran...@gmail.com wrote:

 I've written a java library for cassandra I've been using internally,
 would love to get your feedback and hope you find it useful.
 Blog post:
 http://prettyprint.me/2010/02/23/hector-a-java-cassandra-client/
 Source: http://github.com/rantav/hector

 High level features:
  o A high-level object-oriented interface to cassandra. In short: just
 some niceties around thrift.
  o Failover support. If a client is connected to one host in the ring and
 this host goes down, the client will automatically and transparently search
 for other available hosts to perform the operation before giving up. You may
 choose to FAIL_FAST (no retry, just fail if there are errors, nothing
 smart), ON_FAIL_TRY_ONE_NEXT_AVAILABLE (try one more host before giving up)
 or ON_FAIL_TRY_ALL_AVAILABLE (try all available hosts before giving up).
  o Connection pooling. Needless to say, it's a must.
  o JMX support. Hector exposes JMX for many runtime metrics, such as
 number of available connections, idle connections, error statistics etc.
  o Support for the Command design pattern to allow clients to concentrate
 on their business logic and let hector take care of the required plumbing.





Re: StackOverflowError on high load

2010-02-21 Thread Ran Tavory
This sort of explains it, yes, but what solution can I use?
I do see that OPP writes go faster than RP writes, so it makes sense that when
using the OPP there's a higher chance that a host will fall behind with
compaction and eventually crash. It's not a nice property, but hopefully
there are mitigations for this.
So my question is - what are the mitigations? What should I tell my admin to
do in order to prevent this? Telling him to increase the directory size 2x
isn't going to cut it, as the directory just keeps growing and is not
bounded...
I'm also not clear on whether CASSANDRA-804 is going to be a real fix.
Thanks
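In the meantime, since the stack trace in the original post shows Cassandra
itself calling File.getUsableSpace() while hunting for a data file location,
one stopgap is to alert before free space drops below the size of the data
already on the volume - the rough 2x rule of thumb from this thread. A small
sketch: the path comes from the df output quoted below, the check assumes a
volume dedicated to Cassandra data, and none of this is a Cassandra setting:

import java.io.File;

// Rough headroom alert, not a fix: warn when the data volume holds more than
// it has free, i.e. when a full compaction could no longer fit.
public class CompactionHeadroomCheck {
  public static void main(String[] args) {
    File dataDir = new File("/outbrain/cassandra/data");
    long free = dataDir.getUsableSpace();
    long used = dataDir.getTotalSpace() - free;  // ~ live + not-yet-compacted data on a dedicated volume
    System.out.printf("data volume: used=%dG free=%dG%n", used >> 30, free >> 30);
    if (free < used) {
      System.err.println("WARNING: free space is below used space; compaction may run out of room");
    }
  }
}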

On Sat, Feb 20, 2010 at 9:36 PM, Jonathan Ellis jbel...@gmail.com wrote:

 if OPP is configured w/ imbalanced ranges (or less balanced than RP)
 then that would explain it.

 OPP is actually slightly faster in terms of raw speed.

 On Sat, Feb 20, 2010 at 2:31 PM, Ran Tavory ran...@gmail.com wrote:
  interestingly, I ran the same load but this time with a random
 partitioner
  and, although from time to time test2 was a little behind with its
  compaction task, it did not crash and was able to eventually close the
 gaps
  that were opened.
  Does this make sense? Is there a reason why random partitioner is less
  likely to be faulty in this scenario? The scenario is of about 1300
  writes/sec of small amounts of data to a single CF on a cluster with two
  nodes and no replication. With the order-preserving-partitioner after a
 few
  hours of load the compaction pool is behind on one of the hosts and
  eventually this host crashes, but with the random partitioner it doesn't
  crash.
  thanks
 
  On Sat, Feb 20, 2010 at 6:27 AM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  looks like test1 started gc storming, so test2 treats it as dead and
  starts doing hinted handoff for it, which increases test2's load, even
  though test1 is not completely dead yet.
 
  On Thu, Feb 18, 2010 at 1:16 AM, Ran Tavory ran...@gmail.com wrote:
   I found another interesting graph, attached.
   I looked at the write-count and write-latency of the CF I'm writing to
   and I
   see a few interesting things:
   1. the host test2 crashed at 18:00
   2. At 16:00, after a few hours of load both hosts dropped their
   write-count.
   test1 (which did not crash) started slowing down first and then test2
   slowed.
   3. At 16:00 I start seeing high write-latency on test2 only. This
 takes
   about 2h until finally at 18:00 it crashes.
   Does this help?
  
   On Thu, Feb 18, 2010 at 7:44 AM, Ran Tavory ran...@gmail.com wrote:
  
   I ran the process again and after a few hours the same node crashed
 the
   same way. Now I can tell for sure this is indeed what Jonathan
 proposed
   -
   the data directory needs to be 2x of what it is, but it looks like a
   design
   problem, how large to I need to tell my admin to set it then?
   Here's what I see when the server crashes:
   $ df -h /outbrain/cassandra/data/
   FilesystemSize  Used Avail Use% Mounted on
   /dev/mapper/cassandra-data
  97G   46G   47G  50% /outbrain/cassandra/data
   The directory is 97G and when the host crashes it's at 50% use.
   I'm also monitoring various JMX counters and I see that
 COMPACTION-POOL
   PendingTasks grows for a while on this host (not on the other host,
   btw,
   which is fine, just this host) and then flats for 3 hours. After 3
   hours of
   flat it crashes. I'm attaching the graph.
   When I restart cassandra on this host (not changed file allocation
   size,
   just restart) it does manage to compact the data files pretty fast,
 so
   after
   a minute I get 12% use, so I wonder what made it crash before that
   doesn't
   now? (could be the load that's not running now)
   $ df -h /outbrain/cassandra/data/
   FilesystemSize  Used Avail Use% Mounted on
   /dev/mapper/cassandra-data
  97G   11G   82G  12% /outbrain/cassandra/data
   The question is what size does the data directory need to be? It's
 not
   2x
   the size of the data I expect to have (I only have 11G of real data
   after
   compaction and the dir is 97G, so it should have been enough). If
 it's
   2x of
   something dynamic that keeps growing and isn't bound then it'll just
   grow infinitely, right? What's the bound?
   Alternatively, what jmx counter thresholds are the best indicators
 for
   the
   crash that's about to happen?
   Thanks
  
   On Wed, Feb 17, 2010 at 9:00 PM, Tatu Saloranta 
 tsalora...@gmail.com
   wrote:
  
   On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory ran...@gmail.com
 wrote:
If it's the data directory, then I have a pretty big one. Maybe
 it's
something else
$ df -h /outbrain/cassandra/data/
FilesystemSize  Used Avail Use% Mounted on
/dev/mapper/cassandra-data
   97G   11G   82G  12%
 /outbrain/cassandra/data
  
   Perhaps a temporary file? JVM defaults to /tmp, which may be on a
   smaller (root) partition?
  
   -+ Tatu +-
  
  
  
 
 



Re: StackOverflowError on high load

2010-02-20 Thread Ran Tavory
interestingly, I ran the same load but this time with a random partitioner
and, although from time to time test2 was a little behind with its
compaction task, it did not crash and was able to eventually close the gaps
that were opened.
Does this make sense? Is there a reason why the random partitioner is less
likely to run into this problem? The scenario is about 1300
writes/sec of small amounts of data to a single CF on a cluster with two
nodes and no replication. With the order-preserving-partitioner after a few
hours of load the compaction pool is behind on one of the hosts and
eventually this host crashes, but with the random partitioner it doesn't
crash.
thanks

On Sat, Feb 20, 2010 at 6:27 AM, Jonathan Ellis jbel...@gmail.com wrote:

 looks like test1 started gc storming, so test2 treats it as dead and
 starts doing hinted handoff for it, which increases test2's load, even
 though test1 is not completely dead yet.

 On Thu, Feb 18, 2010 at 1:16 AM, Ran Tavory ran...@gmail.com wrote:
  I found another interesting graph, attached.
  I looked at the write-count and write-latency of the CF I'm writing to
 and I
  see a few interesting things:
  1. the host test2 crashed at 18:00
  2. At 16:00, after a few hours of load both hosts dropped their
 write-count.
  test1 (which did not crash) started slowing down first and then test2
  slowed.
  3. At 16:00 I start seeing high write-latency on test2 only. This takes
  about 2h until finally at 18:00 it crashes.
  Does this help?
 
  On Thu, Feb 18, 2010 at 7:44 AM, Ran Tavory ran...@gmail.com wrote:
 
  I ran the process again and after a few hours the same node crashed the
  same way. Now I can tell for sure this is indeed what Jonathan proposed
 -
  the data directory needs to be 2x of what it is, but it looks like a
 design
  problem, how large to I need to tell my admin to set it then?
  Here's what I see when the server crashes:
  $ df -h /outbrain/cassandra/data/
  FilesystemSize  Used Avail Use% Mounted on
  /dev/mapper/cassandra-data
 97G   46G   47G  50% /outbrain/cassandra/data
  The directory is 97G and when the host crashes it's at 50% use.
  I'm also monitoring various JMX counters and I see that COMPACTION-POOL
  PendingTasks grows for a while on this host (not on the other host, btw,
  which is fine, just this host) and then flats for 3 hours. After 3 hours
 of
  flat it crashes. I'm attaching the graph.
  When I restart cassandra on this host (not changed file allocation size,
  just restart) it does manage to compact the data files pretty fast, so
 after
  a minute I get 12% use, so I wonder what made it crash before that
 doesn't
  now? (could be the load that's not running now)
  $ df -h /outbrain/cassandra/data/
  FilesystemSize  Used Avail Use% Mounted on
  /dev/mapper/cassandra-data
 97G   11G   82G  12% /outbrain/cassandra/data
  The question is what size does the data directory need to be? It's not
 2x
  the size of the data I expect to have (I only have 11G of real data
 after
  compaction and the dir is 97G, so it should have been enough). If it's
 2x of
  something dynamic that keeps growing and isn't bound then it'll just
  grow infinitely, right? What's the bound?
  Alternatively, what jmx counter thresholds are the best indicators for
 the
  crash that's about to happen?
  Thanks
 
  On Wed, Feb 17, 2010 at 9:00 PM, Tatu Saloranta tsalora...@gmail.com
  wrote:
 
  On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory ran...@gmail.com wrote:
   If it's the data directory, then I have a pretty big one. Maybe it's
   something else
   $ df -h /outbrain/cassandra/data/
   FilesystemSize  Used Avail Use% Mounted on
   /dev/mapper/cassandra-data
  97G   11G   82G  12% /outbrain/cassandra/data
 
  Perhaps a temporary file? JVM defaults to /tmp, which may be on a
  smaller (root) partition?
 
  -+ Tatu +-
 
 
 



StackOverflowError on high load

2010-02-17 Thread Ran Tavory
I'm running some high-load writes on a pair of cassandra hosts using an
OrderPreservingPartitioner and ran into the following error, after which one
of the hosts killed itself.
Has anyone seen it and can advise?
(cassandra v0.5.0)

ERROR [HINTED-HANDOFF-POOL:1] 2010-02-17 04:50:09,602 CassandraDaemon.java
(line 71) Fatal exception in thread Thread[HINTED-HANDOFF-POOL:1,5,main]
java.lang.StackOverflowError
at sun.nio.cs.UTF_8$Encoder.encodeArrayLoop(UTF_8.java:341)
at sun.nio.cs.UTF_8$Encoder.encodeLoop(UTF_8.java:447)
at java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:544)
at
java.lang.StringCoding$StringEncoder.encode(StringCoding.java:240)
at java.lang.StringCoding.encode(StringCoding.java:272)
at java.lang.String.getBytes(String.java:947)
at java.io.UnixFileSystem.getSpace(Native Method)
at java.io.File.getUsableSpace(File.java:1660)
at
org.apache.cassandra.config.DatabaseDescriptor.getDataFileLocationForTable(DatabaseDescriptor.java:891)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:876)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
...
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
at
org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
 INFO [ROW-MUTATION-STAGE:28] 2010-02-17 04:50:53,230 ColumnFamilyStore.java
(line 393) DocumentMapping has reached its threshold; switching in a fresh
Memtable
 INFO [ROW-MUTATION-STAGE:28] 2010-02-17 04:50:53,230 ColumnFamilyStore.java
(line 1035) Enqueuing flush of Memtable(DocumentMapping)@122980220
 INFO [FLUSH-SORTER-POOL:1] 2010-02-17 04:50:53,230 Memtable.java (line 183)
Sorting Memtable(DocumentMapping)@122980220
 INFO [FLUSH-WRITER-POOL:1] 2010-02-17 04:50:53,386 Memtable.java (line 192)
Writing Memtable(DocumentMapping)@122980220
ERROR [FLUSH-WRITER-POOL:1] 2010-02-17 04:50:54,010
DebuggableThreadPoolExecutor.java (line 162) Error in executor futuretask
java.util.concurrent.ExecutionException: java.lang.RuntimeException:
java.io.IOException: No space left on device
at
java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at
org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:154)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: java.io.IOException: No space left on
device
at
org.apache.cassandra.db.ColumnFamilyStore$3$1.run(ColumnFamilyStore.java:1060)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
... 2 more
Caused by: java.io.IOException: No space left on device
at java.io.FileOutputStream.write(Native Method)
at java.io.DataOutputStream.writeInt(DataOutputStream.java:180)
at
org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilter.java:158)
at
org.apache.cassandra.utils.BloomFilterSerializer.serialize(BloomFilter.java:153)
at
org.apache.cassandra.io.SSTableWriter.closeAndOpenReader(SSTableWriter.java:123)
at
org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:207)
at
org.apache.cassandra.db.ColumnFamilyStore$3$1.run(ColumnFamilyStore.java:1056)
... 6 more


Re: StackOverflowError on high load

2010-02-17 Thread Ran Tavory
Are we talking about the CommitLogDirectory that needs to be up to 2x?
So it needs to be 2x of what? Did I miss this in the config file somewhere?


On Wed, Feb 17, 2010 at 3:52 PM, Jonathan Ellis jbel...@gmail.com wrote:

 you temporarily need up to 2x your current space used to perform
 compactions.  disk too full is almost certainly actually the
 problem.

 created https://issues.apache.org/jira/browse/CASSANDRA-804 to fix this.

 On Wed, Feb 17, 2010 at 5:59 AM, Ran Tavory ran...@gmail.com wrote:
  no, that's not it, disk isn't full.
  After restarting the server I can write again. Still, however, this error
 is
  troubling...
 
  On Wed, Feb 17, 2010 at 12:24 PM, ruslan usifov ruslan.usi...@gmail.com
 
  wrote:
 
  I think that you have not enough room for your data. run df -h to see
 that
  one of your discs is full
 
  2010/2/17 Ran Tavory ran...@gmail.com
 
  I'm running some high load writes on a pair of cassandra hosts using an
  OrderPresenrvingPartitioner and ran into the following error after
 which one
  of the hosts killed itself.
  Has anyone seen it and can advice?
  (cassandra v0.5.0)
  ERROR [HINTED-HANDOFF-POOL:1] 2010-02-17 04:50:09,602
  CassandraDaemon.java (line 71) Fatal exception in thread
  Thread[HINTED-HANDOFF-POOL:1,5,main]
  java.lang.StackOverflowError
  at sun.nio.cs.UTF_8$Encoder.encodeArrayLoop(UTF_8.java:341)
  at sun.nio.cs.UTF_8$Encoder.encodeLoop(UTF_8.java:447)
  at
  java.nio.charset.CharsetEncoder.encode(CharsetEncoder.java:544)
  at
  java.lang.StringCoding$StringEncoder.encode(StringCoding.java:240)
  at java.lang.StringCoding.encode(StringCoding.java:272)
  at java.lang.String.getBytes(String.java:947)
  at java.io.UnixFileSystem.getSpace(Native Method)
  at java.io.File.getUsableSpace(File.java:1660)
  at
 
 org.apache.cassandra.config.DatabaseDescriptor.getDataFileLocationForTable(DatabaseDescriptor.java:891)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:876)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  ...
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
  at
 
 org.apache.cassandra.db.ColumnFamilyStore.doFileCompaction(ColumnFamilyStore.java:884)
   INFO [ROW-MUTATION-STAGE:28] 2010-02-17 04:50:53,230
  ColumnFamilyStore.java (line 393) DocumentMapping has reached its
 threshold;
  switching in a fresh Memtable
   INFO [ROW-MUTATION-STAGE:28] 2010-02-17 04:50:53,230
  ColumnFamilyStore.java (line 1035) Enqueuing flush of
  Memtable(DocumentMapping)@122980220
   INFO [FLUSH-SORTER-POOL:1] 2010-02-17 04:50:53,230 Memtable.java (line
  183) Sorting Memtable(DocumentMapping)@122980220
   INFO [FLUSH-WRITER-POOL:1] 2010-02-17 04:50:53,386 Memtable.java (line
  192) Writing Memtable(DocumentMapping)@122980220
  ERROR [FLUSH-WRITER-POOL:1] 2010-02-17 04:50:54,010
  DebuggableThreadPoolExecutor.java (line 162) Error in executor
 futuretask
  java.util.concurrent.ExecutionException: java.lang.RuntimeException:
  java.io.IOException: No space left on device
  at
  java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
  at java.util.concurrent.FutureTask.get(FutureTask.java:83)
  at
 
 org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor.afterExecute(DebuggableThreadPoolExecutor.java:154)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:888)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
  at java.lang.Thread.run(Thread.java:619)
  Caused by: java.lang.RuntimeException: java.io.IOException: No space
 left
  on device
  at
 
 org.apache.cassandra.db.ColumnFamilyStore$3$1.run(ColumnFamilyStore.java:1060)
  at
  java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
  at
  java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
  at
 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
  ... 2 more

Re: StackOverflowError on high load

2010-02-17 Thread Ran Tavory
I ran the process again and after a few hours the same node crashed the same
way. Now I can tell for sure this is indeed what Jonathan proposed - the
data directory needs to be 2x of what it is, but it looks like a design
problem: how large do I need to tell my admin to set it, then?

Here's what I see when the server crashes:

$ df -h /outbrain/cassandra/data/
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/cassandra-data
   97G   46G   47G  50% /outbrain/cassandra/data

The directory is 97G and when the host crashes it's at 50% use.
I'm also monitoring various JMX counters and I see that COMPACTION-POOL
PendingTasks grows for a while on this host (not on the other host, btw,
which is fine, just this host) and then flattens out for 3 hours. After 3
hours of being flat it crashes. I'm attaching the graph.

When I restart cassandra on this host (no change to file allocation size,
just a restart) it does manage to compact the data files pretty fast, so after
a minute I get 12% use. So I wonder: what made it crash before that doesn't
now? (could be the load that's not running now)
$ df -h /outbrain/cassandra/data/
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/cassandra-data
   97G   11G   82G  12% /outbrain/cassandra/data

The question is what size does the data directory need to be? It's not 2x
the size of the data I expect to have (I only have 11G of real data after
compaction and the dir is 97G, so it should have been enough). If it's 2x of
something dynamic that keeps growing and isn't bounded, then it'll just
grow infinitely, right? What's the bound?
Alternatively, what jmx counter thresholds are the best indicators for the
crash that's about to happen?

Thanks
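As for JMX thresholds, the counter to watch is the one already being graphed
here. A minimal polling sketch using the standard javax.management API - the
JMX port (8080) and the exact ObjectName string are assumptions about a
0.5/0.6-era node and should be double-checked in jconsole:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Polls the compaction backlog of one node over JMX.
public class CompactionBacklogProbe {
  public static void main(String[] args) throws Exception {
    JMXServiceURL url = new JMXServiceURL(
        "service:jmx:rmi:///jndi/rmi://test2.nydc1.outbrain.com:8080/jmxrmi");
    JMXConnector connector = JMXConnectorFactory.connect(url);
    try {
      MBeanServerConnection mbs = connector.getMBeanServerConnection();
      // ObjectName assumed; verify the exact name in jconsole on your node.
      ObjectName pool = new ObjectName("org.apache.cassandra.concurrent:type=COMPACTION-POOL");
      Object pending = mbs.getAttribute(pool, "PendingTasks");
      System.out.println("COMPACTION-POOL PendingTasks = " + pending);
    } finally {
      connector.close();
    }
  }
}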


On Wed, Feb 17, 2010 at 9:00 PM, Tatu Saloranta tsalora...@gmail.com wrote:

 On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory ran...@gmail.com wrote:
  If it's the data directory, then I have a pretty big one. Maybe it's
  something else
  $ df -h /outbrain/cassandra/data/
  FilesystemSize  Used Avail Use% Mounted on
  /dev/mapper/cassandra-data
 97G   11G   82G  12% /outbrain/cassandra/data

 Perhaps a temporary file? JVM defaults to /tmp, which may be on a
 smaller (root) partition?

 -+ Tatu +-

attachment: Zenoss_ test2.nydc1.outbrain.com.png

Re: How to unit test my code calling Cassandra with Thift

2010-02-13 Thread Ran Tavory
I've committed to trunk all the required code and posted about it, hope you
find it useful
http://prettyprint.me/2010/02/14/running-cassandra-as-an-embedded-service/


On Sun, Jan 24, 2010 at 12:20 PM, Richard Grossman richie...@gmail.com wrote:

 Great Ran,

 I think I've missed the .setDaemon to keep the server alive.
 Thanks

 Richard

 On Sun, Jan 24, 2010 at 12:02 PM, Ran Tavory ran...@gmail.com wrote:

 Here's the code I've just written over the weekend and started using in
 test:


 package com.outbrain.data.cassandra.service;

 import java.io.File;
 import java.io.FileOutputStream;
 import java.io.IOException;
 import java.io.InputStream;
 import java.io.OutputStream;

 import org.apache.cassandra.config.DatabaseDescriptor;
 import org.apache.cassandra.service.CassandraDaemon;
 import org.apache.cassandra.utils.FileUtils;
 import org.apache.thrift.transport.TTransportException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;

 /**
  * An in-memory cassandra storage service that listens to the thrift
 interface.
  * Useful for unit testing,
  *
  * @author Ran Tavory (r...@outbain.com)
  *
  */
 public class InProcessCassandraServer implements Runnable {

   private static final Logger log =
 LoggerFactory.getLogger(InProcessCassandraServer.class);

   CassandraDaemon cassandraDaemon;

   public void init() {
 try {
   prepare();
 } catch (IOException e) {
   log.error("Cannot prepare cassandra.", e);
 }
 try {
   cassandraDaemon = new CassandraDaemon();
   cassandraDaemon.init(null);
 } catch (TTransportException e) {
   log.error("TTransportException", e);
 } catch (IOException e) {
   log.error("IOException", e);
 }
   }

   @Override
   public void run() {
 cassandraDaemon.start();
   }

   public void stop() {
 cassandraDaemon.stop();
 rmdir("tmp");
   }


   /**
* Creates all files and directories needed
* @throws IOException
*/
   private void prepare() throws IOException {
 // delete tmp dir first
 rmdir("tmp");
 // make a tmp dir and copy storage-conf.xml and log4j.properties to it
 copy("/cassandra/storage-conf.xml", "tmp");
 copy("/cassandra/log4j.properties", "tmp");
 System.setProperty("storage-config", "tmp");

 // make cassandra directories.
 for (String s: DatabaseDescriptor.getAllDataFileLocations()) {
   mkdir(s);
 }
 mkdir(DatabaseDescriptor.getBootstrapFileLocation());
 mkdir(DatabaseDescriptor.getLogFileLocation());
   }

   /**
* Copies a resource from within the jar to a directory.
*
* @param resourceName
* @param directory
* @throws IOException
*/
   private void copy(String resource, String directory) throws IOException
 {
 mkdir(directory);
 InputStream is = getClass().getResourceAsStream(resource);
 String fileName = resource.substring(resource.lastIndexOf("/") + 1);
 File file = new File(directory + System.getProperty("file.separator")
 + fileName);
 OutputStream out = new FileOutputStream(file);
 byte buf[] = new byte[1024];
 int len;
 while ((len = is.read(buf)) > 0) {
   out.write(buf, 0, len);
 }
 out.close();
 is.close();
   }

   /**
* Creates a directory
* @param dir
* @throws IOException
*/
   private void mkdir(String dir) throws IOException {
 FileUtils.createDirectory(dir);
   }

   /**
* Removes a directory from file system
* @param dir
*/
   private void rmdir(String dir) {
 FileUtils.deleteDir(new File(dir));
   }
 }


 And in the test class:

 public class XxxTest {

   private static InProcessCassandraServer cassandra;

   @BeforeClass
   public static void setup() throws TTransportException, IOException,
 InterruptedException {
 cassandra = new InProcessCassandraServer();
 cassandra.init();
 Thread t = new Thread(cassandra);
 t.setDaemon(true);
 t.start();
   }

   @AfterClass
   public static void shutdown() {
 cassandra.stop();
   }
 ... test
 }

 Now you can connect to localhost:9160.

 Assumptions:
 The code assumes you have two files in your classpath:
 /cassandra/storage-conf.xml and /cassandra/log4j.properties. This is convenient
 if you use maven, just throw them at /src/test/resources/cassandra/
 If you don't work with maven or would like to configure the configuration
 files differently it should be fairly easy, just change the prepare()
 method.



 On Sun, Jan 24, 2010 at 10:54 AM, Richard Grossman 
 richie...@gmail.com wrote:

 So Is there anybody ? Unit testing is important people ...
 Thanks


 On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman 
 richie...@gmail.com wrote:

 Here is the code I use
 class startServer implements Runnable {

 @Override
 public void run() {
 try {
 CassandraDaemon cassandraDaemon = new CassandraDaemon();
 cassandraDaemon.init(null);
 cassandraDaemon.start();
 } catch (TTransportException e

Re: How to unit test my code calling Cassandra with Thift

2010-01-24 Thread Ran Tavory
Here's the code I've just written over the weekend and started using in
test:


package com.outbrain.data.cassandra.service;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.cassandra.config.DatabaseDescriptor;
import org.apache.cassandra.service.CassandraDaemon;
import org.apache.cassandra.utils.FileUtils;
import org.apache.thrift.transport.TTransportException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * An in-memory cassandra storage service that listens to the thrift
 * interface. Useful for unit testing.
 *
 * @author Ran Tavory (r...@outbain.com)
 *
 */
public class InProcessCassandraServer implements Runnable {

  private static final Logger log =
      LoggerFactory.getLogger(InProcessCassandraServer.class);

  CassandraDaemon cassandraDaemon;

  public void init() {
    try {
      prepare();
    } catch (IOException e) {
      log.error("Cannot prepare cassandra.", e);
    }
    try {
      cassandraDaemon = new CassandraDaemon();
      cassandraDaemon.init(null);
    } catch (TTransportException e) {
      log.error("TTransportException", e);
    } catch (IOException e) {
      log.error("IOException", e);
    }
  }

  @Override
  public void run() {
    cassandraDaemon.start();
  }

  public void stop() {
    cassandraDaemon.stop();
    rmdir("tmp");
  }


  /**
   * Creates all files and directories needed
   * @throws IOException
   */
  private void prepare() throws IOException {
    // delete tmp dir first
    rmdir("tmp");
    // make a tmp dir and copy storage-conf.xml and log4j.properties to it
    copy("/cassandra/storage-conf.xml", "tmp");
    copy("/cassandra/log4j.properties", "tmp");
    System.setProperty("storage-config", "tmp");

    // make cassandra directories.
    for (String s : DatabaseDescriptor.getAllDataFileLocations()) {
      mkdir(s);
    }
    mkdir(DatabaseDescriptor.getBootstrapFileLocation());
    mkdir(DatabaseDescriptor.getLogFileLocation());
  }

  /**
   * Copies a resource from within the jar to a directory.
   *
   * @param resource
   * @param directory
   * @throws IOException
   */
  private void copy(String resource, String directory) throws IOException {
    mkdir(directory);
    InputStream is = getClass().getResourceAsStream(resource);
    String fileName = resource.substring(resource.lastIndexOf("/") + 1);
    File file = new File(directory + System.getProperty("file.separator") +
        fileName);
    OutputStream out = new FileOutputStream(file);
    byte buf[] = new byte[1024];
    int len;
    while ((len = is.read(buf)) > 0) {
      out.write(buf, 0, len);
    }
    out.close();
    is.close();
  }

  /**
   * Creates a directory
   * @param dir
   * @throws IOException
   */
  private void mkdir(String dir) throws IOException {
    FileUtils.createDirectory(dir);
  }

  /**
   * Removes a directory from file system
   * @param dir
   */
  private void rmdir(String dir) {
    FileUtils.deleteDir(new File(dir));
  }
}


And in the test class:

public class XxxTest {

  private static InProcessCassandraServer cassandra;

  @BeforeClass
  public static void setup() throws TTransportException, IOException,
      InterruptedException {
    cassandra = new InProcessCassandraServer();
    cassandra.init();
    Thread t = new Thread(cassandra);
    t.setDaemon(true);
    t.start();
  }

  @AfterClass
  public static void shutdown() {
    cassandra.stop();
  }

  // ... tests
}

Now you can connect to localhost:9160.
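For reference, a bare-bones way a test can then talk to the embedded server
over thrift. The generated-class package (org.apache.cassandra.service) and
the unframed TSocket transport match the 0.5-era defaults, so adjust both for
your version:

import org.apache.cassandra.service.Cassandra;  // thrift-generated; package moved in later versions
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Minimal thrift connection to the embedded server started in setup().
public final class EmbeddedClient {
  public static Cassandra.Client connect() throws Exception {
    TTransport transport = new TSocket("localhost", 9160);
    transport.open();
    return new Cassandra.Client(new TBinaryProtocol(transport));
  }
}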

Assumptions:
The code assumes you have two files in your classpath:
/cassandra/storage-conf.xml and /cassandra/log4j.properties. This is convenient
if you use maven: just put them in /src/test/resources/cassandra/.
If you don't work with maven, or would like to locate the configuration
files differently, it should be fairly easy - just change the prepare()
method.



On Sun, Jan 24, 2010 at 10:54 AM, Richard Grossman richie...@gmail.com wrote:

 So Is there anybody ? Unit testing is important people ...
 Thanks


 On Thu, Jan 21, 2010 at 12:09 PM, Richard Grossman richie...@gmail.com wrote:

 Here is the code I use
 class startServer implements Runnable {

 @Override
 public void run() {
 try {
 CassandraDaemon cassandraDaemon = new CassandraDaemon();
 cassandraDaemon.init(null);
 cassandraDaemon.start();
 } catch (TTransportException e) {
 // TODO Auto-generated catch block
 e.printStackTrace();
 } catch (IOException e) {
 // TODO Auto-generated catch block
 e.printStackTrace();
 }
 }
 }

 Thread thread = new Thread(new startServer());
 thread.start();

 the code to test here



 On Thu, Jan 21, 2010 at 12:08 PM, Richard Grossman 
  richie...@gmail.com wrote:

 Yes I've seen this and also check it but if I start the server then it
 block the current thread I can

Re: Cassandra guarantees reads and writes to be atomic within a single ColumnFamily.

2010-01-13 Thread Ran Tavory
Thanks, so maybe to rephrase:

Cassandra guarantees reads and writes to be atomic within a single row.

But this isn't saying much... so maybe just take it off...


On Thu, Jan 14, 2010 at 12:40 AM, Jonathan Ellis jbel...@gmail.com wrote:

 It's correct, if understood correctly.  We should probably just remove
 it since it's confusing as written.

 What it means is, if a write for a given row is acked, eventually,
 _all_ the data updated _in that row_ will be available for reads.  So
 no, it's not atomic at the batch_mutate level but at the
 listColumnOrSuperColumn level.

 -Jonathan

 On Mon, Jan 11, 2010 at 3:01 PM, Ran Tavory ran...@gmail.com wrote:
  The front page http://incubator.apache.org/cassandra/ states that
 Cassandra
  guarantees reads and writes to be atomic within a single ColumnFamily.
  What exactly does that mean, and where can I learn more about this?
  It sounds like it means that batch_insert() and batch_mutate() for two
  different rows but in the same CF is atomic. Is this correct?
 
 



Cassandra guarantees reads and writes to be atomic within a single ColumnFamily.

2010-01-11 Thread Ran Tavory
The front page http://incubator.apache.org/cassandra/ states that Cassandra
guarantees reads and writes to be atomic within a single ColumnFamily.
What exactly does that mean, and where can I learn more about this?
It sounds like it means that batch_insert() and batch_mutate() for two
different rows but in the same CF is atomic. Is this correct?
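To illustrate the distinction drawn in the reply above: the guarantee covers
all columns written to one row in one call, not two rows in one batch, even
within the same ColumnFamily. The RowWriter helper below is hypothetical
shorthand for a thrift batch_insert/batch_mutate call:

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// RowWriter.writeColumns(key, cols) stands in for a single thrift batch call.
interface RowWriter {
  void writeColumns(String key, Map<String, String> columns) throws Exception;
}

class AtomicityExample {
  static void demo(RowWriter writer) throws Exception {
    // One write, one row: once this is acked, eventually ALL of these columns
    // of "alice" are readable -- that is the per-row guarantee.
    Map<String, String> alice = new HashMap<String, String>();
    alice.put("email", "alice@example.com");
    alice.put("city", "Tel Aviv");
    writer.writeColumns("alice", alice);

    // Two rows, same CF: no joint guarantee. They may become readable at
    // different times, and a failure can leave only one of them written.
    writer.writeColumns("bob", Collections.singletonMap("status", "active"));
    writer.writeColumns("carol", Collections.singletonMap("status", "active"));
  }
}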


Re: MultiThread Client problem with thrift

2009-12-22 Thread Ran Tavory
Would connection pooling work for you?
This Java client http://code.google.com/p/cassandra-java-client/ has
connection pooling.
I haven't put the client under stress yet so I can't vouch for it, but this
may be a good solution for you.

On Tue, Dec 22, 2009 at 2:22 PM, Richard Grossman richie...@gmail.com wrote:

 I agree it solves my problem, but it may create a bigger one.
 The problem is that I can't manage to prevent opening a lot of connections.


 On Tue, Dec 22, 2009 at 1:51 PM, Jaakko rosvopaalli...@gmail.com wrote:

 Hi,

 I don't know the particulars of java implementation, but if it works
 the same way as Unix native socket API, then I would not recommend
 setting linger to zero.

 SO_LINGER option with zero value will cause TCP connection to be
 aborted immediately as soon as the socket is closed. That is, (1)
 remaining data in the send buffer will be discarded, (2) no proper
 disconnect handshake and (3) receiving end will get TCP reset.

 Sure this will avoid TIME_WAIT state, but TIME_WAIT is our friend and
 is there to avoid packets from old connection being delivered to new
 incarnation of the connection. Instead of avoiding the state, the
 application should be changed so that TIME_WAIT will not be a problem.
 How many open files you can see when the exception happens? Might be
 that you're out of file descriptors.

 -Jaakko


 On Tue, Dec 22, 2009 at 8:17 PM, Richard Grossman richie...@gmail.com
 wrote:
  Hi
  To all is interesting I've found a solution seems not recommended but
  working.
  When opening a Socket set this:
 tSocket.getSocket().setReuseAddress(true);
 tSocket.getSocket().setSoLinger(true, 0);
  it's prevent to have a lot of connection TIME_WAIT state but not
  recommended.
 





Re: MultiThread Client problem with thrift

2009-12-22 Thread Ran Tavory
I don't have a 0.5.0-beta2 version, no. It's not too difficult to add it,
but I haven't done so myself, I'm using 0.4.2

On Tue, Dec 22, 2009 at 2:42 PM, Richard Grossman richie...@gmail.com wrote:

 Yes of course but do you have updated to cassandra 0.5.0-beta2 ?


 On Tue, Dec 22, 2009 at 2:30 PM, Ran Tavory ran...@gmail.com wrote:

 Would connection pooling work for you?
 This Java client http://code.google.com/p/cassandra-java-client/ has
 connection pooling.
 I haven't put the client under stress yet so I can't testify, but this may
 be a good solution for you


 On Tue, Dec 22, 2009 at 2:22 PM, Richard Grossman richie...@gmail.com wrote:

 I agree it's solve my problem but can give a bigger one.
 The problem is I can't succeed to prevent opening a lot of connection


  On Tue, Dec 22, 2009 at 1:51 PM, Jaakko rosvopaalli...@gmail.com wrote:

 Hi,

 I don't know the particulars of java implementation, but if it works
 the same way as Unix native socket API, then I would not recommend
 setting linger to zero.

 SO_LINGER option with zero value will cause TCP connection to be
 aborted immediately as soon as the socket is closed. That is, (1)
 remaining data in the send buffer will be discarded, (2) no proper
 disconnect handshake and (3) receiving end will get TCP reset.

 Sure this will avoid TIME_WAIT state, but TIME_WAIT is our friend and
 is there to avoid packets from old connection being delivered to new
 incarnation of the connection. Instead of avoiding the state, the
 application should be changed so that TIME_WAIT will not be a problem.
 How many open files you can see when the exception happens? Might be
 that you're out of file descriptors.

 -Jaakko


 On Tue, Dec 22, 2009 at 8:17 PM, Richard Grossman richie...@gmail.com
 wrote:
  Hi
  To all is interesting I've found a solution seems not recommended but
  working.
  When opening a Socket set this:
 tSocket.getSocket().setReuseAddress(true);
 tSocket.getSocket().setSoLinger(true, 0);
  it's prevent to have a lot of connection TIME_WAIT state but not
  recommended.
 







Re: MultiThread Client problem with thrift

2009-12-22 Thread Ran Tavory
Not an expert in this field, but I think what you want is to use a connection
pool and NOT close the connections - reuse them. Only idle connections are
released after, say, 1 sec. Also, with a connection pool it's easy
to throttle the application: you can tell the pool to block when all 50
connections (or however many you allow) are in use.
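A minimal sketch of such a bounded pool using only java.util.concurrent;
ConnectionFactory here is a stand-in for whatever opens your TSocket and
thrift client, and the size is just the 50 mentioned above:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// borrow() blocks when all connections are out -- the throttling behavior
// described above -- and release() hands the same socket back for reuse,
// so there is no constant open/close cycle piling up TIME_WAIT sockets.
public class SimpleConnectionPool<C> {

  public interface ConnectionFactory<C> { C create() throws Exception; }

  private final BlockingQueue<C> idle;

  public SimpleConnectionPool(int size, ConnectionFactory<C> factory) throws Exception {
    idle = new ArrayBlockingQueue<C>(size);
    for (int i = 0; i < size; i++) {
      idle.add(factory.create());   // open all connections up front
    }
  }

  public C borrow() throws InterruptedException {
    return idle.take();             // blocks until a connection is free
  }

  public void release(C connection) {
    idle.offer(connection);         // return for reuse; never closed per request
  }
}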

On Tue, Dec 22, 2009 at 4:01 PM, Richard Grossman richie...@gmail.com wrote:

 So I can't use it.

 But I've made my own connection pool. That doesn't fix anything, because the
 problem is lower than Java. In fact the socket is closed and Java
 considers it closed, but the system keeps the socket in the TIME_WAIT state.
 Then the port used is actually still in use.

 So my question is: are there people who manage to open multiple
 connections and get rid of the TIME_WAIT? No matter in which language, PHP or
 Python etc...

 Thanks

 On Tue, Dec 22, 2009 at 2:55 PM, Ran Tavory ran...@gmail.com wrote:

 I don't have a 0.5.0-beta2 version, no. It's not too difficult to add it,
 but I haven't done so myself, I'm using 0.4.2


  On Tue, Dec 22, 2009 at 2:42 PM, Richard Grossman richie...@gmail.com wrote:

 Yes of course but do you have updated to cassandra 0.5.0-beta2 ?


 On Tue, Dec 22, 2009 at 2:30 PM, Ran Tavory ran...@gmail.com wrote:

 Would connection pooling work for you?
 This Java client http://code.google.com/p/cassandra-java-client/ has
 connection pooling.
 I haven't put the client under stress yet so I can't testify, but this
 may be a good solution for you


 On Tue, Dec 22, 2009 at 2:22 PM, Richard Grossman 
  richie...@gmail.com wrote:

 I agree it's solve my problem but can give a bigger one.
 The problem is I can't succeed to prevent opening a lot of connection


  On Tue, Dec 22, 2009 at 1:51 PM, Jaakko rosvopaalli...@gmail.com wrote:

 Hi,

 I don't know the particulars of java implementation, but if it works
 the same way as Unix native socket API, then I would not recommend
 setting linger to zero.

 SO_LINGER option with zero value will cause TCP connection to be
 aborted immediately as soon as the socket is closed. That is, (1)
 remaining data in the send buffer will be discarded, (2) no proper
 disconnect handshake and (3) receiving end will get TCP reset.

 Sure this will avoid TIME_WAIT state, but TIME_WAIT is our friend and
 is there to avoid packets from old connection being delivered to new
 incarnation of the connection. Instead of avoiding the state, the
 application should be changed so that TIME_WAIT will not be a problem.
 How many open files you can see when the exception happens? Might be
 that you're out of file descriptors.

 -Jaakko


 On Tue, Dec 22, 2009 at 8:17 PM, Richard Grossman 
 richie...@gmail.com wrote:
  Hi
  To all is interesting I've found a solution seems not recommended
 but
  working.
  When opening a Socket set this:
 tSocket.getSocket().setReuseAddress(true);
 tSocket.getSocket().setSoLinger(true, 0);
  it's prevent to have a lot of connection TIME_WAIT state but not
  recommended.
 









Images store in Cassandra

2009-12-12 Thread Ran Tavory
As we're designing our systems for a move from mysql to Cassandra we're
considering moving our file storage to Cassandra as well. Is this wise?

We're currently using mogilefs to store media items (images) of average size
of 30Mb (400k images, and growing). Cassandra looks like a performance
improvement over mogilefs (saves a roundtrip, no SQL in the middle), but I was
wondering whether the fact that cassandra stores byte arrays should
encourage us to store images in it. Is Cassandra a good fit?
Has anyone had any similar experience, or can anyone share guidelines?
To phrase the question in more general terms: what's Cassandra's sweet spot
in terms of value size per column or total row size?
Thanks


Re: vector clocks?

2009-12-07 Thread Ran Tavory
So, currently, is Cassandra using the client-provided timestamps for
conflict resolution? (Or something else?)
Do clients have insight into conflicts Cassandra cannot resolve (assuming it
tries to resolve them)?
What are the semantics of eventual consistency in Cassandra's case, then?
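For what it's worth, the conflict rule for regular columns is that the
highest client-supplied timestamp wins. A toy illustration; TimestampedClient
is hypothetical shorthand for the thrift insert/get calls, where insert takes
the timestamp as an explicit argument:

// Last-write-wins on client-supplied timestamps.
interface TimestampedClient {
  void insert(String key, String column, String value, long timestamp) throws Exception;
  String get(String key, String column) throws Exception;
}

class LastWriteWinsDemo {
  static void demo(TimestampedClient client) throws Exception {
    long t1 = System.currentTimeMillis();
    client.insert("user42", "email", "old@example.com", t1);

    // A write with a later timestamp wins, regardless of the order in which
    // the replicas happen to receive the two writes.
    client.insert("user42", "email", "new@example.com", t1 + 1);

    // Once the cluster converges, this returns "new@example.com". Equal
    // timestamps are tie-broken deterministically but arbitrarily, so
    // reasonably synchronized client clocks matter.
    System.out.println(client.get("user42", "email"));
  }
}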

On Mon, Dec 7, 2009 at 12:34 AM, Kelvin Kakugawa kakug...@gmail.com wrote:

 Cassandra, right now, doesn't use vector clocks internally.  However,
 an implementation is being worked on, here:
 https://issues.apache.org/jira/browse/CASSANDRA-580

 -Kelvin

 On Sun, Dec 6, 2009 at 10:41 AM, Ran Tavory ran...@gmail.com wrote:
  As a Cassandra newbe, after having read the Dynamo paper, I was wondering
 -
  how does Cassandra use timestamps?
  - Does it use them internally to resolve conflicts?
  - Does it expose vector clocks to clients when internal conflict
 resolution
  fails?
  Thanks
 



vector clocks?

2009-12-06 Thread Ran Tavory
As a Cassandra newbie, after having read the Dynamo paper, I was wondering:
how does Cassandra use timestamps?
- Does it use them internally to resolve conflicts?
- Does it expose vector clocks to clients when internal conflict resolution
fails?
Thanks