Re: Testing row cache feature in trunk: write should put record in cache

2010-02-19 Thread Jonathan Ellis
The whole point of the rowcache is to avoid the serialization overhead,
though.  If we just wanted the serialized form cached, we would let
the OS block cache handle that without adding an extra layer.  (0.6
uses mmap'd I/O by default on 64-bit JVMs, so this is very efficient.)
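
(A minimal sketch of the distinction, with hypothetical classes rather than
Cassandra's actual code: the row cache holds live, deserialized objects, so a
hit costs one hash lookup with no I/O and no deserialization; a cache of
serialized bytes would only duplicate what the OS block cache already
provides.)

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    class RowCacheSketch<K, V> {
        // Values are fully deserialized row objects, not byte[].
        private final Map<K, V> rows = new ConcurrentHashMap<K, V>();

        V get(K key) { return rows.get(key); }

        void put(K key, V deserializedRow) { rows.put(key, deserializedRow); }
    }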

On Fri, Feb 19, 2010 at 3:29 AM, Weijun Li weiju...@gmail.com wrote:
 The memory overhead issue is not directly related to GC: by the time the JVM
 ran out of memory, the GC had already been very busy for quite a while. In
 my case the JVM consumed all of the 6GB when the row cache size hit 1.4
 million rows.

 I haven't started testing the row cache feature yet, but I think data
 compression would be useful for reducing memory consumption, because in my
 experience disk I/O is always the bottleneck for Cassandra while its CPU
 usage is usually low. In addition, compression should dramatically reduce
 the number of Java objects (correct me if I'm wrong), especially if we need
 to cache most of the data to achieve decent read latency.

 If ColumnFamily is serializable, it shouldn't be that hard to implement a
 compression feature controlled by an option (again :-) in storage-conf.xml.
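
(A minimal sketch of the compression idea, assuming the cached value is the
row's serialized byte form; note Jonathan's point above that serializing and
deserializing on every hit is exactly the overhead the row cache is meant to
avoid, so this trades CPU for memory. Decompression would go through
java.util.zip.InflaterInputStream.)

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.zip.DeflaterOutputStream;

    public class RowCompressor {
        // Compress a serialized row before caching it: fewer bytes resident,
        // and one byte[] per row instead of a large object graph.
        static byte[] compress(byte[] serializedRow) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DeflaterOutputStream dos = new DeflaterOutputStream(bos);
            dos.write(serializedRow);
            dos.close();  // finishes the deflate stream
            return bos.toByteArray();
        }
    }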

 When I get to that point you can instruct me to implement this feature along
 with the row-cache write-through. Our goal is straightforward: low read
 latency in a high-volume web application with a write/read ratio of about
 1:1.

 -Weijun

 -Original Message-
 From: Jonathan Ellis [mailto:jbel...@gmail.com]
 Sent: Thursday, February 18, 2010 12:04 PM
 To: cassandra-user@incubator.apache.org
 Subject: Re: Testing row cache feature in trunk: write should put record in cache

 Did you force a GC from jconsole to make sure you weren't just
 measuring uncollected garbage?

 On Wed, Feb 17, 2010 at 2:51 PM, Weijun Li weiju...@gmail.com wrote:
 OK, I'll work on the change later, because there's another problem to solve
 first: the cache overhead is so big that 1.4 million records (1KB each)
 consumed all of the JVM's 6GB (I guess 4GB were consumed by the row cache).
 I'm thinking that ConcurrentHashMap is not a good choice for an LRU cache,
 and that the row cache needs to store compressed key data to reduce memory
 usage. I'll do more investigation on this and let you know.

 -Weijun

 On Tue, Feb 16, 2010 at 9:22 PM, Jonathan Ellis jbel...@gmail.com wrote:

 ... tell you what, if you write the option-processing part in
 DatabaseDescriptor I will do the actual cache part. :)

 On Tue, Feb 16, 2010 at 11:07 PM, Jonathan Ellis jbel...@gmail.com
 wrote:
  https://issues.apache.org/jira/secure/CreateIssue!default.jspa, but
  this is pretty low priority for me.
 
  On Tue, Feb 16, 2010 at 8:37 PM, Weijun Li weiju...@gmail.com wrote:
   Just tried to make a quick change to enable it, but it didn't work out :-(
 
       ColumnFamily cachedRow = cfs.getRawCachedRow(mutation.key());

       // What I modified
       if (cachedRow == null)
       {
           cfs.cacheRow(mutation.key());
           cachedRow = cfs.getRawCachedRow(mutation.key());
       }

       if (cachedRow != null)
           cachedRow.addAll(columnFamily);
 
   How can I open a ticket for you to make the change (enable row cache
   write-through with an option)?
 
  Thanks,
  -Weijun
 
  On Tue, Feb 16, 2010 at 5:20 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
 
  On Tue, Feb 16, 2010 at 7:17 PM, Jonathan Ellis jbel...@gmail.com
  wrote:
   On Tue, Feb 16, 2010 at 7:11 PM, Weijun Li weiju...@gmail.com
   wrote:
    Just started to play with the row cache feature in trunk: it seems to be
    working fine so far, except that for the RowsCached parameter you need to
    specify a number of rows rather than a percentage (e.g., 20% doesn't
    work).
  
    20% works, but it's 20% of the rows at server startup.  So on a fresh
    start that is zero.

    Maybe we should just get rid of the % feature...
 
   (Actually, it shouldn't be hard to update this on flush, if you want to
   open a ticket.)
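
(For readers following the RowsCached exchange above, a sketch of the two
interpretations being discussed; the helper and its names are hypothetical,
not the actual DatabaseDescriptor code.)

    class RowsCachedOption {
        // "20%" means 20% of the rows the server knows about at startup
        // (zero on a fresh start); a plain number is an absolute row count.
        // Re-running this on flush would keep the percentage meaningful.
        static long rowsToCache(String rowsCached, long currentRowCount) {
            if (rowsCached.endsWith("%")) {
                double pct = Double.parseDouble(
                        rowsCached.substring(0, rowsCached.length() - 1));
                return (long) (currentRowCount * pct / 100.0);
            }
            return Long.parseLong(rowsCached);
        }
    }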
 
 
 
Re: Cassandra News Page

2010-02-19 Thread Ian Holsman
Hi Sal,
We'll be moving off the incubator site shortly; we'll address that when we
go to cassandra.apache.org.

Regards,
Ian
On Feb 18, 2010, at 4:06 PM, Sal Fuentes wrote:

 This is just a thought, but I think some type of *latest news* page would be
 nice to have on the main site (http://incubator.apache.org/cassandra/), even
 if it's a bit outdated. Not sure if this has been previously considered.
 
 -- 
 Salvador Fuentes Jr.

--
Ian Holsman
i...@holsman.net





Re: Testing row cache feature in trunk: write should put record in cache

2010-02-19 Thread Weijun Li
I see. How much is the overhead of Java serialization? Does it slow down the
system a lot? It seems to be a tradeoff between CPU usage and memory.

As for mmap in 0.6: do you mmap the sstable data file even if it is a lot
larger than the available memory (e.g., the data file is over 100GB while
you have only 8GB of RAM)? How efficient is mmap in that case? And is mmap
already checked into the 0.6 branch?

-Weijun

On Fri, Feb 19, 2010 at 4:56 AM, Jonathan Ellis jbel...@gmail.com wrote:

 The whole point of the rowcache is to avoid the serialization overhead,
 though.  If we just wanted the serialized form cached, we would let
 the OS block cache handle that without adding an extra layer.  (0.6
 uses mmap'd I/O by default on 64-bit JVMs, so this is very efficient.)
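
(For a rough feel for the cost: a micro-benchmark sketch. Cassandra uses its
own serializers rather than java.io object serialization, so this only
illustrates the kind of overhead being discussed, not Cassandra's actual
numbers.)

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    public class SerializationCost {
        // Average nanoseconds for one serialize + deserialize round trip.
        static long roundTripNanos(Serializable row, int iterations)
                throws Exception {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                ObjectOutputStream oos = new ObjectOutputStream(bos);
                oos.writeObject(row);
                oos.close();
                ObjectInputStream in = new ObjectInputStream(
                        new ByteArrayInputStream(bos.toByteArray()));
                in.readObject();
            }
            return (System.nanoTime() - start) / iterations;
        }
    }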


Unbalanced read latency among nodes in a cluster

2010-02-19 Thread Weijun Li
I set up two Cassandra clusters with 2 nodes each. Both use the random
partitioner. It's strange that in each cluster, one node has much shorter
read latency than the other.

This is the info for one of the clusters:

Node A: read count 77302, data file 41GB, read latency 58180, I/O saturation 100%
Node B: read count 488753, data file 26GB, read latency 5822, I/O saturation 35%

I first started node A, then ran B to join the cluster. Both machines have
exactly the same hardware and OS. The test client randomly picks a node to
write to, and this worked fine for the other cluster.

Address   Status   Load       Range                                       Ring
                              169400792707028208569145873749456918214
10.xxx    Up       38.39 GB   103633195217832666843316719920043079797    |--|
10.xxx    Up       24.22 GB   169400792707028208569145873749456918214    |--|

In both clusters, whichever node took more reads (the one with the larger
data file) shows much worse read latency.

What algorithm does Cassandra use to split the token range when a new node
joins? What could cause this unbalanced read latency, and how can I fix it?
How can I make sure all nodes get evenly distributed data and traffic?

-Weijun


Re: Testing row cache feature in trunk: write should put record in cache

2010-02-19 Thread Jonathan Ellis
mmap is designed to handle that case, yes.  It is already in the 0.6 branch.

On Fri, Feb 19, 2010 at 2:44 PM, Weijun Li weiju...@gmail.com wrote:
 I see. How much is the overhead of Java serialization? Does it slow down
 the system a lot? It seems to be a tradeoff between CPU usage and memory.

 As for mmap in 0.6: do you mmap the sstable data file even if it is a lot
 larger than the available memory (e.g., the data file is over 100GB while
 you have only 8GB of RAM)? How efficient is mmap in that case? And is mmap
 already checked into the 0.6 branch?

 -Weijun
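
(The pattern that makes this work, as a sketch in plain NIO rather than
Cassandra's actual code: mmap reserves address space, not RAM, and the OS
pages 4KB chunks in and out on demand, which is why a 100GB file can be
mapped on an 8GB machine and why it wants a 64-bit JVM. A single
MappedByteBuffer is limited to 2GB, so a large data file is mapped in
segments.)

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MmapSketch {
        // Map a file of arbitrary size as read-only segments of at most 2GB.
        static MappedByteBuffer[] mapFile(String path) throws Exception {
            FileChannel ch = new RandomAccessFile(path, "r").getChannel();
            long size = ch.size();
            long segment = Integer.MAX_VALUE;
            int count = (int) ((size + segment - 1) / segment);
            MappedByteBuffer[] buffers = new MappedByteBuffer[count];
            for (int i = 0; i < count; i++) {
                long offset = i * segment;
                buffers[i] = ch.map(FileChannel.MapMode.READ_ONLY, offset,
                                    Math.min(segment, size - offset));
            }
            return buffers;
        }
    }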


Re: Unbalanced read latency among nodes in a cluster

2010-02-19 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/Operations
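
(The short version of the relevant recipe from that page: with
RandomPartitioner the token space is [0, 2^127), so an evenly loaded N-node
ring gives node i the initial token i * 2^127 / N; the Operations page
describes how to move existing nodes to new tokens. Illustrative code, not
Cassandra's:)

    import java.math.BigInteger;

    public class TokenMath {
        // Ideal InitialToken for node i of an n-node RandomPartitioner ring.
        static BigInteger initialToken(int i, int n) {
            return BigInteger.valueOf(2).pow(127)
                             .multiply(BigInteger.valueOf(i))
                             .divide(BigInteger.valueOf(n));
        }
    }

For a 2-node cluster this puts one node at 0 and the other at 2^126; the
ring output above shows a much more lopsided split (roughly 61%/39%).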

On Fri, Feb 19, 2010 at 3:03 PM, Weijun Li weiju...@gmail.com wrote:
 I set up two Cassandra clusters with 2 nodes each. Both use the random
 partitioner. It's strange that in each cluster, one node has much shorter
 read latency than the other.



Re: cassandra freezes

2010-02-19 Thread Jonathan Ellis
Are you using the old deb package?  Because that had broken GC settings.

On Fri, Feb 19, 2010 at 10:40 PM, Santal Li santal...@gmail.com wrote:
 I'm hitting almost the same thing. When I run write benchmarks, sometimes
 one Cassandra node will freeze, the other nodes will consider it down, and
 it comes back up 30+ seconds later. I am using 5 nodes, each with 8GB of
 memory for the Java heap.

 From my investigation it is caused by GC: I started JConsole to monitor
 heap usage, and each time a GC happened, heap usage dropped from 6GB to
 1GB; checking the Cassandra log, I found the freezes happened at exactly
 the same times.

 So I think that with a huge heap (>2GB), some GC strategy other than the
 default one provided by the Cassandra launch script may be needed. Has
 anyone else met this situation, and can you provide some guidance?


 Thanks
 -Santal

 2010/2/17 Tatu Saloranta tsalora...@gmail.com

 On Tue, Feb 16, 2010 at 6:25 AM, Boris Shulman shulm...@gmail.com wrote:
  Hello, I'm running some benchmarks on 2 cassandra nodes, each running on
  an 8-core machine with 16GB RAM, 10GB for the Java heap. I've noticed
  that during benchmarks with numerous writes Cassandra just freezes for
  several minutes (in those benchmarks I'm writing batches of 10 columns
  with 1KB of data each for every key in a single CF). Usually after
  performing 50K writes I get a TimeOutException and Cassandra just
  freezes. What configuration changes can I make in order to prevent this?
  Is it possible that my setup just can't handle the load? How can I
  calculate the number of Cassandra nodes needed for a desired load?

 One thing that can cause seeming lockups is the garbage collector, so
 enabling GC debug output would be helpful to see GC activity. Some
 collectors (CMS specifically) can stop the system for a very long time,
 up to minutes. This is not necessarily the root cause, but it is easy to
 rule out.
 Beyond this, getting a stack trace during the lockup would make sense.
 That can pinpoint what the threads are doing, or what they are blocked on
 in case there is a deadlock or heavy contention on some shared resource.

 -+ Tatu +-
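
(For anyone wanting to follow this suggestion, the stock HotSpot flags for
GC debug output look like the following, added to the JVM options in the
launch script; the log path is illustrative.)

    JVM_OPTS="$JVM_OPTS \
            -verbose:gc \
            -XX:+PrintGCDetails \
            -XX:+PrintGCTimeStamps \
            -XX:+PrintGCApplicationStoppedTime \
            -Xloggc:/var/log/cassandra/gc.log"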




PHP/TBinaryProtocolAccelerated I64 timestamp (microtime) issue

2010-02-19 Thread Michael Pearson
Hi, is anyone else using the thrift/php TBinaryProtocolAccelerated
(thrift_protocol.so) extension?  It doesn't look to be sending timestamps
correctly (it casts them to a signed int32); there is no such issue with
TBinaryProtocol.

e.g.:

generating a 13-digit cassandra_Column->timestamp via microtime()

insert() via TBinaryProtocol:

  cassandra> get Keyspace1.Standard1['PandraTest_Dim3']
  => (column=column1, value=TEST DATA, timestamp=1266635621757)

... which is fine

insert() via TBinaryProtocolAccelerated/thrift_protocol_write_binary:

  cassandra> get Keyspace1.Standard1['PandraTest_Dim33']
  => (column=column1, value=TEST DATA, timestamp=-379667704)

... using different keys there, as the 'new' timestamp written with
thrift_protocol_write_binary is negative and therefore never seen.
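
(That negative value is what a plain 64-to-32-bit truncation produces. The
arithmetic, shown here in Java but identical in any language: a 13-digit
millisecond timestamp does not fit in 32 bits, so only the low 32 bits
survive and the sign flips.)

    public class TimestampTruncation {
        public static void main(String[] args) {
            long ts = 1266635621757L;      // the good timestamp above
            int truncated = (int) ts;      // low 32 bits only: -379730563
            System.out.println(truncated); // negative, like the bad insert
        }
    }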


Keen to see if anyone else has experienced this behaviour; I can confirm
the most recent thrift has been pulled from svn and installed.

thanks

-michael


Re: cassandra freezes

2010-02-19 Thread Santal Li
The GC options are as below:

JVM_OPTS= \
-ea \
-Xms2G \
-Xmx8G \
-XX:SurvivorRatio=8 \
-XX:TargetSurvivorRatio=90 \
-XX:+AggressiveOpts \
-XX:+UseParNewGC \
-XX:+UseConcMarkSweepGC \
-XX:+CMSParallelRemarkEnabled \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:SurvivorRatio=128 \
-XX:MaxTenuringThreshold=0 \
-Dcom.sun.management.jmxremote.port=8080 \
-Dcom.sun.management.jmxremote.ssl=false \
-Dcom.sun.management.jmxremote.authenticate=false


Regards
-Santal


2010/2/20 Jonathan Ellis jbel...@gmail.com

 Are you using the old deb package?  Because that had broken GC settings.



Re: StackOverflowError on high load

2010-02-19 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-804 should have fixed
this in trunk / 0.6.  or at least log more about what is going on so
we can fix it better. :)

On Thu, Feb 18, 2010 at 12:44 AM, Ran Tavory ran...@gmail.com wrote:
 I ran the process again and after a few hours the same node crashed in the
 same way. Now I can tell for sure this is indeed what Jonathan proposed:
 the data directory needs to be 2x its current size. But that looks like a
 design problem; how large do I need to tell my admin to make it, then?
 Here's what I see when the server crashes:
 $ df -h /outbrain/cassandra/data/
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/mapper/cassandra-data
                        97G   46G   47G  50% /outbrain/cassandra/data
 The directory is 97G, and when the host crashes it's at 50% use.
 I'm also monitoring various JMX counters, and I see that COMPACTION-POOL
 PendingTasks grows for a while on this host (not on the other host, btw,
 which is fine, just this one) and then stays flat for 3 hours. After 3
 hours of flat it crashes. I'm attaching the graph.
 When I restart Cassandra on this host (no change to the file allocation
 size, just a restart) it does manage to compact the data files pretty
 fast, so after a minute I get to 12% use. So I wonder what made it crash
 before that doesn't now? (It could be the load, which isn't running now.)
 $ df -h /outbrain/cassandra/data/
 Filesystem            Size  Used Avail Use% Mounted on
 /dev/mapper/cassandra-data
                        97G   11G   82G  12% /outbrain/cassandra/data
 The question is: what size does the data directory need to be? It's not 2x
 the size of the data I expect to have (I only have 11G of real data after
 compaction and the dir is 97G, so that should have been enough). If it's 2x
 of something dynamic that keeps growing and isn't bounded, then it'll just
 grow infinitely, right? What's the bound?
 Alternatively, what JMX counter thresholds are the best indicators of the
 crash that's about to happen?
 Thanks
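
(A back-of-the-envelope check of the 2x rule against the df numbers above,
illustrative arithmetic only: compaction must finish writing the new
combined SSTable before the old ones can be deleted, so old and new copies
coexist at the peak.)

    public class HeadroomCheck {
        public static void main(String[] args) {
            long usedGB = 46, freeGB = 47;  // df figures at crash time
            long scratchGB = usedGB;        // worst case: compacting everything
                                            // needs a full-size new copy
            System.out.println(freeGB >= scratchGB);  // true, but only just;
            // writes arriving during compaction can push it over the edge,
            // which is consistent with crashing at ~50% disk use
        }
    }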

 On Wed, Feb 17, 2010 at 9:00 PM, Tatu Saloranta tsalora...@gmail.com
 wrote:

 On Wed, Feb 17, 2010 at 6:40 AM, Ran Tavory ran...@gmail.com wrote:
  If it's the data directory, then I have a pretty big one. Maybe it's
  something else
  $ df -h /outbrain/cassandra/data/
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/mapper/cassandra-data
                         97G   11G   82G  12% /outbrain/cassandra/data

 Perhaps a temporary file? JVM defaults to /tmp, which may be on a
 smaller (root) partition?

 -+ Tatu +-