upgrade from 0.7.6 to 0.8.4

2011-08-15 Thread Jonathan Colby
Hi - sorry if this was asked before but I couldn't find any answers about it.

Is the upgrade path from 0.7.6 to 0.8.4 possible via a simple rolling restart?

Are nodes with these different versions compatible - i.e., can one node be 
upgraded in order to see if we run into any problems before upgrading the 
others?




Re: Expiring Columns

2011-08-15 Thread Edward Capriolo
On Mon, Aug 15, 2011 at 6:51 PM, Stephen McKamey wrote:

> I'm curious about Expiring Columns. Say I create a Column Family where
> *all* of the Columns are set to be expiring columns. When a row's entire set
> of columns has expired, will an empty row still be returned in range
> queries? Or will it just be nicely compacted away?
>
>

[default@edstest] use abc;
Authenticated to keyspace: abc
[default@abc] create column family abc;
7d4d8c30-c7bf-11e0--242d50cf1fbc
Waiting for schema agreement...
... schemas agree across the cluster
[default@abc] set abc[ascii('a_ttl')][ascii('x')]=ascii('y') with ttl=50;
Value inserted.
[default@abc] list abc;
Using default limit of 100
---
RowKey: 615f74746c
=> (column=78, value=y, timestamp=1313468660115000, ttl=50)

1 Row Returned.


*** getting soda while the TTL expires ***
[default@abc] list abc;
Using default limit of 100
---
RowKey: 615f74746c

1 Row Returned.

http://wiki.apache.org/cassandra/FAQ#range_ghosts
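In client code you can simply skip those ghosts, i.e. rows that come back with an
empty column list. A minimal sketch (Thrift-era Java; the class and method names
are illustrative):

import java.util.ArrayList;
import java.util.List;
import org.apache.cassandra.thrift.KeySlice;

public class GhostFilter {
    // Range ghosts are KeySlices whose column list is empty; drop them.
    public static List<KeySlice> dropGhosts(List<KeySlice> slices) {
        List<KeySlice> live = new ArrayList<KeySlice>(slices.size());
        for (KeySlice slice : slices) {
            if (slice.getColumns() != null && !slice.getColumns().isEmpty()) {
                live.add(slice);
            }
        }
        return live;
    }
}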


Re: CQL query using 'OR' in WHERE clause

2011-08-15 Thread Jonathan Ellis
Disjunctions are not yet supported and probably will not be until after 1.0.
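The usual workaround is to run one query per disjunct and merge the results
client-side. A sketch against the raw 0.8 Thrift API (Audit_Log and the
'.logType' values come from your example; the rest, including the use of '*'
instead of your column list, is illustrative). It works here because the two
predicates are mutually exclusive per row, so no deduplication is needed:

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;
import org.apache.cassandra.thrift.*;

public class OrWorkaround {
    static CqlResult query(Cassandra.Client client, String cql) throws Exception {
        ByteBuffer q = ByteBuffer.wrap(cql.getBytes(Charset.forName("UTF-8")));
        return client.execute_cql_query(q, Compression.NONE);
    }

    // One SELECT per '.logType' value; concatenate the row lists.
    static List<CqlRow> loginsAndBadLogins(Cassandra.Client client) throws Exception {
        List<CqlRow> merged = new ArrayList<CqlRow>();
        for (String type : new String[] { "login", "badLogin" }) {
            CqlResult result = query(client,
                "SELECT * FROM Audit_Log WHERE '.logType' = '" + type + "'");
            merged.addAll(result.getRows());
        }
        return merged;
    }
}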

On Mon, Aug 15, 2011 at 6:45 PM, Deeter, Derek wrote:
> Hi,
>
> We are using CQL to obtain data from Cassandra 0.8.1 using Hector and
> getting an error when using ‘OR’ on a secondary index.  I get the same error
> when using CQL 1.0.3.  All the items in the WHERE clause are secondary
> indices and they are all UTF8Type validation.  The query works when leaving
> out everything from ‘OR’ onwards.   Example:
>
> cqlsh> SELECT '.id' , '.ipAddress' , '.userProduct' , '.offeringId' ,
> '.appId', 'timeStamp', '.logType', '.tzOffset'  , '.id' , 'member' ,
> 'mfaEnrolled' , 'sessionId' , 'startPage' , 'timeStamp' FROM Audit_Log USING
> CONSISTENCY ONE WHERE '.bcId' =  '01112' AND '.userProduct' =  'IB' AND
> 'timeStamp' >=  131218200 AND '.logType' = 'login' OR '.logType' =
> 'badLogin';
>
> Bad Request: line 1:336 mismatched input 'OR' expecting EOF
>
> I also tried to use the ‘IN’ keyword to no avail:
>
> cqlsh> SELECT '.id' , '.ipAddress' , '.userProduct' , '.offeringId' ,
> '.appId', 'timeStamp', '.logType', '.tzOffset'  , '.id' , 'member' ,
> 'mfaEnrolled' , 'sessionId' , 'startPage' , 'timeStamp' FROM Audit_Log USING
> CONSISTENCY ONE WHERE '.bcId' =  '01112' AND '.userProduct' =  'IB' AND
> 'timeStamp' >=  131218200 AND '.logType' IN ( 'login', 'badLogin');
>
> Bad Request: line 1:326 mismatched input 'IN' expecting set null
>
> I also tried simplifying the query WHERE clause to only “WHERE '.logType' =
> 'login' OR '.logType' = 'badLogin';”  but get the same ‘mismatched input’
> error.  Is there any way to set up a query on a set of values such as the
> above?  Or do I have the syntax wrong?
>
>     Thanks in advance,
>
>     -Derek
>
> Derek Deeter
> Software Engineer, Sr
>
> o: 818-597-5932  |  m: 661-645-7842  |  f: 818-878-7555
>
> This email may contain confidential and privileged material for the sole use
> of the intended recipient. Any review or distribution by others is strictly
> prohibited. If you are not the intended recipient, please contact the sender
> and delete all copies.



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: performance problems on new cluster

2011-08-15 Thread aaron morton
Just checking: do you have read_repair_chance set to something? The second 
request is going to all replicas, which should only happen with CL ONE if read 
repair is running for the request. 

The exceptions are happening during read repair, which runs async to the 
main read request. It's occurring after we have detected a digest mismatch, 
when the process is trying to reconcile the full data responses from the 
replicas. The AssertionError is happening because the replica sent a digest 
response. The NPE is probably happening because the response did not include a 
row; how / why the response is not marked as a digest is a mystery. 

This may be related to the main problem. If not, don't forget to come back to it.

In your first log with the timeout something is not right…
> DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,187 StorageProxy.java (line 546) 
> reading data from /dc1host3
> DEBUG [pool-2-thread-14] 2011-08-15 05:26:35,191 StorageProxy.java (line 593) 
> Read timeout: java.util.concurrent.TimeoutException: Operation timed out - 
> received only 1 responses from /dc1host3,  .
The "reading…" log messages are written before the inter-node messages are sent. 
For this CL ONE read only node dc1host3 is involved and it has been asked 
for the data response. Makes sense if read repair is not running for the 
request. 

*But* the timeout error says we got a response from dc1host3. One way I can 
see that happening is dc1host3 returning a digest instead of a data response 
(see o.a.c.service.ReadCallback.response(Message)). Which kind of matches what 
we saw above. 

We need some more extensive logging and probably a trip to 
https://issues.apache.org/jira/browse/CASSANDRA

It would be good to see full DEBUG logs from both dc1host1 and dc1host3 if you 
can reproduce the fault like the first one. Turn off read repair to make 
things a little simpler. If that's too much we need DEBUG for StorageProxy, 
ReadCallback and ReadVerbHandler.
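Something like this in the CLI should turn read repair off for the CF while you
test (assuming the 0.7/0.8 cli attribute name; keyspace and CF names are from
your logs):

[default@ks1] update column family cf1 with read_repair_chance=0.0;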

Can you update the email thread with the ticket? 

Thanks
A

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 15/08/2011, at 7:34 PM, Anton Winter wrote:

>> OK, node latency is fine and you are using some pretty low
>> consistency. You said NTS with RF 2, is that RF 2 for each DC ?
> 
> Correct, I'm using RF 2 for each DC.
> 
> 
> 
> I was able to reproduce the cli timeouts on the non replica nodes.
> 
> The debug log output from dc1host1 (non replica node):
> 
> DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,183 StorageProxy.java (line 518) 
> Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', key='userid1', 
> column_parent='QueryPath(columnFamilyName='cf1', 
> superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
> columnName='null')', start='', finish='', reversed=false, count=100)/ONE
> DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,187 StorageProxy.java (line 546) 
> reading data from /dc1host3
> DEBUG [pool-2-thread-14] 2011-08-15 05:26:35,191 StorageProxy.java (line 593) 
> Read timeout: java.util.concurrent.TimeoutException: Operation timed out - 
> received only 1 responses from /dc1host3,  .
> 
> 
> If the query is run again on the same node (dc1host1) 0 rows are returned and 
> the following DEBUG messages are logged:
> 
> 
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java (line 518) 
> Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', key='userid1', 
> column_parent='QueryPath(columnFamilyName='cf1', 
> superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
> columnName='null')', start='', finish='', reversed=false, count=100)/ONE
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java (line 546) 
> reading data from /dc1host3
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java (line 562) 
> reading digest from /dc1host2
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java (line 562) 
> reading digest from /dc2host3
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java (line 562) 
> reading digest from /dc2host2
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java (line 562) 
> reading digest from /dc3host2
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java (line 562) 
> reading digest from /dc3host3
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java (line 562) 
> reading digest from /dc4host3
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java (line 562) 
> reading digest from /dc4host2
> DEBUG [pool-2-thread-14] 2011-08-15 05:32:06,022 StorageProxy.java (line 588) 
> Read: 508 ms.
> ERROR [ReadRepairStage:2112] 2011-08-15 05:32:06,404 
> AbstractCassandraDaemon.java (line 133) Fatal exception in thread 
> Thread[ReadRepairStage:2112,5,main]
> java.lang.AssertionError
>at 
> org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:73)
>at 
> org.apache.cassandra.service

Re: Max heap not sticking?

2011-08-15 Thread Ian Danforth
False alarm: a typo further down in cassandra-env.sh was removing all the
opts.
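For anyone else who hits this, the failure mode is a later assignment that drops
$JVM_OPTS; an illustrative fragment (the second option is made up):

JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"   # heap set here...
JVM_OPTS="-XX:+PrintGCDetails"              # ...and silently wiped here (missing $JVM_OPTS)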

Ian

On Mon, Aug 15, 2011 at 6:08 PM, Ian Danforth  wrote:

>  All,
>
>   When I connect to a node through jconsole it's telling me that my max
> heap is only 1.7gb. (eg http://screencast.com/t/7DP8ovdUv) However I
> believe I have properly specified that it should be 4GB in
> cassandra-env.sh. Total memory is 7.5GB
>
>   I see greatly increased GC activity as the heap approaches 1.7GB.
>
>  Can anyone suggest why the heap might not use all the memory as
> specified?
>
>  Ian
>
>
>  From cassandra-env.sh
>
>  MAX_HEAP_SIZE="4096M"
> HEAP_NEWSIZE="400M"
>
>  JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
> JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
> JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"
>
>


Max heap not sticking?

2011-08-15 Thread Ian Danforth
All,

 When I connect to a node through jconsole it's telling me that my max heap
is only 1.7gb. (eg http://screencast.com/t/7DP8ovdUv) However I believe I
have properly specified that it should be 4GB in cassandra-env.sh. Total
memory is 7.5GB

 I see greatly increased GC activity as the heap approaches 1.7GB.

Can anyone suggest why the heap might not use all the memory as specified?

Ian


>From cassandra-env.sh

MAX_HEAP_SIZE="4096M"
HEAP_NEWSIZE="400M"

JVM_OPTS="$JVM_OPTS -Xms${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmx${MAX_HEAP_SIZE}"
JVM_OPTS="$JVM_OPTS -Xmn${HEAP_NEWSIZE}"


Cassandra for numerical data set

2011-08-15 Thread Yi Yang
Dear all,

I want to report my use case and have a discussion with you all.

I'm currently working on my second Cassandra project.   I have ended up with a 
somewhat unique use case: storing a traditional, relational data set in Cassandra. 
It's a dataset of int and float numbers, with no strings or other data, and the 
column names are much longer than the values themselves.   Besides, the row key is 
an MD5-based version-3 UUID of some other data.

1)
I did some workarounds to save disk space, however it still takes 
approximately 12-15x more disk space than MySQL.   I looked into the Cassandra 
SSTable internals, did some optimizing by selecting a better data serializer, and 
also hashed each column name down to one byte.   That left the current database 
with ~6x overhead on disk space compared with MySQL, which I think might 
be acceptable.

I'm currently interested in CASSANDRA-674 and will also test CASSANDRA-47 in 
the coming days.   I'll keep you updated on my testing.   But I'm keen to 
hear your ideas on saving disk space.
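For reference, the column-name hashing mentioned above can be as simple as a
fixed dictionary mapping each long name to a stable one-byte id. A sketch (all
names illustrative):

import java.util.HashMap;
import java.util.Map;

public class ColumnNameCodec {
    private final Map<String, byte[]> nameToId = new HashMap<String, byte[]>();
    private final Map<Byte, String> idToName = new HashMap<Byte, String>();

    // The dictionary must be fixed up front: one byte allows at most 256 names.
    public ColumnNameCodec(String... names) {
        if (names.length > 256) throw new IllegalArgumentException("one byte only");
        for (int i = 0; i < names.length; i++) {
            byte id = (byte) i;
            nameToId.put(names[i], new byte[] { id });
            idToName.put(Byte.valueOf(id), names[i]);
        }
    }

    public byte[] encode(String name) { return nameToId.get(name); }
    public String decode(byte id)     { return idToName.get(Byte.valueOf(id)); }
}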

2)
I'm doing batch writes to the database (pulling data from multiple sources 
and putting them together).   I'd like to know if there are better methods to 
improve write efficiency, since it's at about the same speed as MySQL when 
writing sequentially.   It seems the commitlog requires a huge amount of 
disk I/O compared with what my test machine can afford.

3)
In my case, each row is read randomly with the same probability.   I have around 
0.5M rows in total.   Can you provide some practical advice on optimizing the 
row cache and key cache?   I can use up to 8 GB of memory on the test machines.

Thanks for your help.


Best,

Steve




CQL query using 'OR' in WHERE clause

2011-08-15 Thread Deeter, Derek
Hi,

We are using CQL to obtain data from Cassandra 0.8.1 using Hector and
getting an error when using 'OR' on a secondary index.  I get the same
error when using CQL 1.0.3.  All the items in the WHERE clause are
secondary indices and they are all UTF8Type validation.  The query works
when leaving out everything from 'OR' onwards.   Example:

cqlsh> SELECT '.id' , '.ipAddress' , '.userProduct' , '.offeringId' ,
'.appId', 'timeStamp', '.logType', '.tzOffset'  , '.id' , 'member' ,
'mfaEnrolled' , 'sessionId' , 'startPage' , 'timeStamp' FROM Audit_Log
USING CONSISTENCY ONE WHERE '.bcId' =  '01112' AND '.userProduct' =
'IB' AND 'timeStamp' >=  131218200 AND '.logType' = 'login' OR
'.logType' = 'badLogin';
Bad Request: line 1:336 mismatched input 'OR' expecting EOF

I also tried to use the 'IN' keyword to no avail:

cqlsh> SELECT '.id' , '.ipAddress' , '.userProduct' , '.offeringId' ,
'.appId', 'timeStamp', '.logType', '.tzOffset'  , '.id' , 'member' ,
'mfaEnrolled' , 'sessionId' , 'startPage' , 'timeStamp' FROM Audit_Log
USING CONSISTENCY ONE WHERE '.bcId' =  '01112' AND '.userProduct' =
'IB' AND 'timeStamp' >=  131218200 AND '.logType' IN ( 'login',
'badLogin');
Bad Request: line 1:326 mismatched input 'IN' expecting set null

I also tried simplifying the query WHERE clause to only "WHERE
'.logType' = 'login' OR '.logType' = 'badLogin';"  but get the same
'mismatched input' error.  Is there any way to set up a query on a set
of values such as the above?  Or do I have the syntax wrong?

Thanks in advance,
-Derek


Derek Deeter
Software Engineer, Sr
  
o: 818-597-5932  |  m: 661-645-7842  |  f: 818-878-7555
This email may contain confidential and privileged material for the sole
use of the intended recipient. Any review or distribution by others is
strictly prohibited. If you are not the intended recipient, please
contact the sender and delete all copies.




Re: Expiring Columns

2011-08-15 Thread aaron morton
I believe (though I have not tested it) that you would still see the range ghosts 
talked about here http://wiki.apache.org/cassandra/FAQ#range_ghosts until compaction 
has removed all the columns, and then the row itself once all the columns are gone. 
Expired columns are purged during compaction once their TTL runs out. 

Consider the range query in two parts. First get me row keys between here and 
there. Then get me the columns that match this SlicePredicate. 

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 16/08/2011, at 10:51 AM, Stephen McKamey wrote:

> I'm curious about Expiring Columns. Say I create a Column Family where *all* 
> of the Columns are set to be expiring columns. When a row's entire set of 
> columns has expired, will an empty row still be returned in range queries? 
> Or will it just be nicely compacted away?
> 



Expiring Columns

2011-08-15 Thread Stephen McKamey
I'm curious about Expiring Columns. Say I create a Column Family where *all*
of the Columns are set to be expiring columns. When a row's entire set of
columns has expired, will an empty row still be returned in range
queries? Or will it just be nicely compacted away?


Re: Scalability question

2011-08-15 Thread Teijo Holzer

Hi,

we have come across this as well. We continuously run rolling repairs 
followed by major compactions, followed by a GC (or node restart), to get rid 
of all these SSTable files. Combined with aggressive TTLs on most inserts, the 
cluster stays nice and lean.
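In practice each cycle looks roughly like this (a sketch; host name and timing
are illustrative, run one node at a time):

nodetool -h node1 repair     # rolling repair
nodetool -h node1 compact    # major compaction
# then restart node1 (or trigger a full JVM GC via JMX) so the
# obsolete SSTable files actually get deleted from disk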


You don't want your working set to grow indefinitely.

Cheers,

T.


On 16/08/11 08:08, Philippe wrote:

Forgot to mention that stopping & restarting the server brought the data
directory down to 283GB in less than 1 minute.

Philippe
2011/8/15 Philippe <watche...@gmail.com>

It's another reason to avoid major / manual compactions which create a
single big SSTable. Minor compactions keep things in buckets, which
means newer SSTables can be compacted without needing to read the bigger
older tables.

I've never run a major/manual compaction on this ring.
In my case running repair on a "big" keyspace results in SSTables piling
up. My problematic node just filled up 483GB (yes, GB) of SSTables. Here
are the biggest
ls -laSrh
(...)

-rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:13
PUBLIC_MONTHLY_20-g-4581-Data.db

-rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:52
PUBLIC_MONTHLY_20-g-4641-Data.db

-rw-r--r-- 1 cassandra cassandra  2.8G 2011-08-15 14:39
PUBLIC_MONTHLY_20-tmp-g-4878-Data.db

-rw-r--r-- 1 cassandra cassandra  2.9G 2011-08-15 15:00
PUBLIC_MONTHLY_20-g-4656-Data.db

-rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 14:17
PUBLIC_MONTHLY_20-g-4599-Data.db

-rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 15:11
PUBLIC_MONTHLY_20-g-4675-Data.db

-rw-r--r-- 3 cassandra cassandra  3.1G 2011-08-13 10:34
PUBLIC_MONTHLY_18-g-3861-Data.db

-rw-r--r-- 1 cassandra cassandra  3.2G 2011-08-15 14:41
PUBLIC_MONTHLY_20-tmp-g-4884-Data.db

-rw-r--r-- 1 cassandra cassandra  3.6G 2011-08-15 14:44
PUBLIC_MONTHLY_20-tmp-g-4894-Data.db

-rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:56
PUBLIC_MONTHLY_20-tmp-g-4934-Data.db

-rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:46
PUBLIC_MONTHLY_20-tmp-g-4905-Data.db

-rw-r--r-- 1 cassandra cassandra  4.0G 2011-08-15 14:57
PUBLIC_MONTHLY_20-tmp-g-4935-Data.db

-rw-r--r-- 3 cassandra cassandra  5.9G 2011-08-13 12:53
PUBLIC_MONTHLY_19-g-4219-Data.db

-rw-r--r-- 3 cassandra cassandra  6.0G 2011-08-13 13:57
PUBLIC_MONTHLY_20-g-4538-Data.db

-rw-r--r-- 3 cassandra cassandra   12G 2011-08-13 09:27
PUBLIC_MONTHLY_20-g-4501-Data.db


On the other nodes the same directory is around 69GB. Why are there so
few large files there and so many big ones on the repairing node?
  -rw-r--r-- 1 cassandra cassandra 434M 2011-08-15 16:02
PUBLIC_MONTHLY_17-g-3525-Data.db
-rw-r--r-- 1 cassandra cassandra 456M 2011-08-15 15:50
PUBLIC_MONTHLY_19-g-4253-Data.db
-rw-r--r-- 1 cassandra cassandra 485M 2011-08-15 14:30
PUBLIC_MONTHLY_20-g-5280-Data.db
-rw-r--r-- 1 cassandra cassandra 572M 2011-08-15 15:15
PUBLIC_MONTHLY_18-g-3774-Data.db
-rw-r--r-- 2 cassandra cassandra 664M 2011-08-09 15:39
PUBLIC_MONTHLY_20-g-4893-Index.db
-rw-r--r-- 2 cassandra cassandra 811M 2011-08-11 21:27
PUBLIC_MONTHLY_16-g-2597-Data.db
-rw-r--r-- 2 cassandra cassandra 915M 2011-08-13 04:00
PUBLIC_MONTHLY_18-g-3695-Data.db
-rw-r--r-- 1 cassandra cassandra 925M 2011-08-15 03:39
PUBLIC_MONTHLY_17-g-3454-Data.db
-rw-r--r-- 1 cassandra cassandra 1.3G 2011-08-15 13:46
PUBLIC_MONTHLY_19-g-4199-Data.db
-rw-r--r-- 2 cassandra cassandra 1.5G 2011-08-10 15:37
PUBLIC_MONTHLY_17-g-3218-Data.db
-rw-r--r-- 1 cassandra cassandra 1.9G 2011-08-15 14:35
PUBLIC_MONTHLY_20-g-5281-Data.db
-rw-r--r-- 2 cassandra cassandra 2.1G 2011-08-10 16:33
PUBLIC_MONTHLY_19-g-3946-Data.db
-rw-r--r-- 2 cassandra cassandra 3.1G 2011-08-10 22:23
PUBLIC_MONTHLY_18-g-3509-Data.db
-rw-r--r-- 2 cassandra cassandra 4.0G 2011-08-10 18:18
PUBLIC_MONTHLY_20-g-5024-Data.db
-rw--- 2 cassandra cassandra 5.1G 2011-08-09 15:23
PUBLIC_MONTHLY_19-g-3847-Data.db
-rw-r--r-- 2 cassandra cassandra 9.6G 2011-08-09 15:39
PUBLIC_MONTHLY_20-g-4893-Data.db

This whole compaction thing is getting me worried : how are sites in
production dealing with SSTables becoming larger and larger and thus taking
longer and longer to compact ? Adding nodes every couple of weeks ?

Philippe






Re: Cassandra in Multiple Datacenters Active - Standby configuration

2011-08-15 Thread Jeremiah Jordan
Assign the tokens like they are two separate rings, just make sure you 
don't have any duplicate tokens.

http://wiki.apache.org/cassandra/Operations#Token_selection

The two datacenters are treated as separate rings; LOCAL_QUORUM will 
only delay the client as long as it takes to write the data to the local 
nodes.  The nodes in the other datacenter will get the writes asynchronously.
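For the token math, a sketch (RandomPartitioner, two DCs of three nodes; the
+dc offset is what avoids duplicate tokens across the two rings):

import java.math.BigInteger;

public class TokenGen {
    public static void main(String[] args) {
        BigInteger range = BigInteger.valueOf(2).pow(127); // RandomPartitioner token space
        int nodesPerDc = 3;
        for (int dc = 0; dc < 2; dc++) {
            for (int i = 0; i < nodesPerDc; i++) {
                // Evenly spaced within each DC; DC2's tokens are offset by 1.
                BigInteger token = range.multiply(BigInteger.valueOf(i))
                        .divide(BigInteger.valueOf(nodesPerDc))
                        .add(BigInteger.valueOf(dc));
                System.out.printf("DC%d node%d: initial_token=%s%n", dc + 1, i + 1, token);
            }
        }
    }
}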


On 08/15/2011 03:39 PM, Oleg Tsvinev wrote:

Hi all,

I have a question that the documentation has no clear answer for. I have
the following requirements:

1. Synchronously store data in datacenter DC1 on 2+ nodes
2. Asynchronously replicate the same data to DC2 and store it on 2+
nodes to act as a hot standby

Now, I have configured keyspaces with o.a.c.l.NetworkTopologyStrategy
with strategy_options=[{DC1:2, DC2:2}] and use LOCAL_QUORUM
consistency level, following documentation here:
http://www.datastax.com/docs/0.8/operations/datacenter

Now, how do I assign initial tokens? If I have, say, 6 nodes total, 3
in DC1 and 3 in DC2, should I create a ring as if all 6 nodes shared the
total 2^128 space equally?
Now say node N1:DC2 has key K and is in remote datacenter (for an app
in DC1). Wouldn't Cassandra always forward K to the DC2 node N1 thus
turning asynchronous writes into synchronous ones? Performance impact
will be huge as the latency between DC1 and DC2 is significant.

I hope there's an answer and I'm just missing something. My case falls
under Disaster Recovery in
http://www.datastax.com/docs/0.8/operations/datacenter but I don't see
how Cassandra will support my use case.

I appreciate any help on this.

Thank you,
   Oleg


Cassandra in Multiple Datacenters Active - Standby configuration

2011-08-15 Thread Oleg Tsvinev
Hi all,

I have a question that the documentation has no clear answer for. I have
the following requirements:

1. Synchronously store data in datacenter DC1 on 2+ nodes
2. Asynchronously replicate the same data to DC2 and store it on 2+
nodes to act as a hot standby

Now, I have configured keyspaces with o.a.c.l.NetworkTopologyStrategy
with strategy_options=[{DC1:2, DC2:2}] and use LOCAL_QUORUM
consistency level, following documentation here:
http://www.datastax.com/docs/0.8/operations/datacenter

Now, how do I assign initial tokens? If I have, say, 6 nodes total, 3
in DC1 and 3 in DC2, should I create a ring as if all 6 nodes shared the
total 2^128 space equally?
Now say node N1:DC2 has key K and is in remote datacenter (for an app
in DC1). Wouldn't Cassandra always forward K to the DC2 node N1 thus
turning asynchronous writes into synchronous ones? Performance impact
will be huge as the latency between DC1 and DC2 is significant.

I hope there's an answer and I'm just missing something. My case falls
under Disaster Recovery in
http://www.datastax.com/docs/0.8/operations/datacenter but I don't see
how Cassandra will support my use case.

I appreciate any help on this.

Thank you,
  Oleg


Re: Scalability question

2011-08-15 Thread Jonathan Ellis
This is more an artifact of repair's problems than compaction per se.
We're addressing these in
https://issues.apache.org/jira/browse/CASSANDRA-2816 and
https://issues.apache.org/jira/browse/CASSANDRA-2280.

On Mon, Aug 15, 2011 at 3:06 PM, Philippe  wrote:
>> It's another reason to avoid major / manual compactions which create a
>> single big SSTable. Minor compactions keep things in buckets, which means
>> newer SSTables can be compacted without needing to read the bigger older tables.
>
> I've never run a major/manual compaction on this ring.
> In my case running repair on a "big" keyspace results in SSTables piling up.
> My problematic node just filled up 483GB (yes, GB) of SSTables. Here are
> the biggest
> ls -laSrh
> (...)
>
> -rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:13
> PUBLIC_MONTHLY_20-g-4581-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:52
> PUBLIC_MONTHLY_20-g-4641-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  2.8G 2011-08-15 14:39
> PUBLIC_MONTHLY_20-tmp-g-4878-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  2.9G 2011-08-15 15:00
> PUBLIC_MONTHLY_20-g-4656-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 14:17
> PUBLIC_MONTHLY_20-g-4599-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 15:11
> PUBLIC_MONTHLY_20-g-4675-Data.db
>
> -rw-r--r-- 3 cassandra cassandra  3.1G 2011-08-13 10:34
> PUBLIC_MONTHLY_18-g-3861-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.2G 2011-08-15 14:41
> PUBLIC_MONTHLY_20-tmp-g-4884-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.6G 2011-08-15 14:44
> PUBLIC_MONTHLY_20-tmp-g-4894-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:56
> PUBLIC_MONTHLY_20-tmp-g-4934-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:46
> PUBLIC_MONTHLY_20-tmp-g-4905-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  4.0G 2011-08-15 14:57
> PUBLIC_MONTHLY_20-tmp-g-4935-Data.db
>
> -rw-r--r-- 3 cassandra cassandra  5.9G 2011-08-13 12:53
> PUBLIC_MONTHLY_19-g-4219-Data.db
>
> -rw-r--r-- 3 cassandra cassandra  6.0G 2011-08-13 13:57
> PUBLIC_MONTHLY_20-g-4538-Data.db
>
> -rw-r--r-- 3 cassandra cassandra   12G 2011-08-13 09:27
> PUBLIC_MONTHLY_20-g-4501-Data.db
>
> On the other nodes the same directory is around 69GB. Why are there so few
> large files there and so many big ones on the repairing node?
>  -rw-r--r-- 1 cassandra cassandra 434M 2011-08-15 16:02
> PUBLIC_MONTHLY_17-g-3525-Data.db
> -rw-r--r-- 1 cassandra cassandra 456M 2011-08-15 15:50
> PUBLIC_MONTHLY_19-g-4253-Data.db
> -rw-r--r-- 1 cassandra cassandra 485M 2011-08-15 14:30
> PUBLIC_MONTHLY_20-g-5280-Data.db
> -rw-r--r-- 1 cassandra cassandra 572M 2011-08-15 15:15
> PUBLIC_MONTHLY_18-g-3774-Data.db
> -rw-r--r-- 2 cassandra cassandra 664M 2011-08-09 15:39
> PUBLIC_MONTHLY_20-g-4893-Index.db
> -rw-r--r-- 2 cassandra cassandra 811M 2011-08-11 21:27
> PUBLIC_MONTHLY_16-g-2597-Data.db
> -rw-r--r-- 2 cassandra cassandra 915M 2011-08-13 04:00
> PUBLIC_MONTHLY_18-g-3695-Data.db
> -rw-r--r-- 1 cassandra cassandra 925M 2011-08-15 03:39
> PUBLIC_MONTHLY_17-g-3454-Data.db
> -rw-r--r-- 1 cassandra cassandra 1.3G 2011-08-15 13:46
> PUBLIC_MONTHLY_19-g-4199-Data.db
> -rw-r--r-- 2 cassandra cassandra 1.5G 2011-08-10 15:37
> PUBLIC_MONTHLY_17-g-3218-Data.db
> -rw-r--r-- 1 cassandra cassandra 1.9G 2011-08-15 14:35
> PUBLIC_MONTHLY_20-g-5281-Data.db
> -rw-r--r-- 2 cassandra cassandra 2.1G 2011-08-10 16:33
> PUBLIC_MONTHLY_19-g-3946-Data.db
> -rw-r--r-- 2 cassandra cassandra 3.1G 2011-08-10 22:23
> PUBLIC_MONTHLY_18-g-3509-Data.db
> -rw-r--r-- 2 cassandra cassandra 4.0G 2011-08-10 18:18
> PUBLIC_MONTHLY_20-g-5024-Data.db
> -rw--- 2 cassandra cassandra 5.1G 2011-08-09 15:23
> PUBLIC_MONTHLY_19-g-3847-Data.db
> -rw-r--r-- 2 cassandra cassandra 9.6G 2011-08-09 15:39
> PUBLIC_MONTHLY_20-g-4893-Data.db
> This whole compaction thing is getting me worried : how are sites in
> production dealing with SSTables becoming larger and larger and thus taking
> longer and longer to compact ? Adding nodes every couple of weeks ?
> Philippe



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: CompositeType

2011-08-15 Thread Benoit Perroud
You should take a look at https://github.com/edanuff/CassandraIndexedCollections

This is a rather good starting point for Composites.
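If it helps, the wire format CompositeType (0.8) uses for a column name is, per
component: a 2-byte big-endian length, the raw bytes, then one end-of-component
byte (0 for an exact value). A hedged sketch of encoding one by hand (names are
illustrative; double-check against the CompositeType source):

import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

public class CompositeName {
    public static byte[] encode(byte[]... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] c : components) {
            out.write((c.length >> 8) & 0xFF); // component length, high byte
            out.write(c.length & 0xFF);        // component length, low byte
            out.write(c, 0, c.length);         // component value
            out.write(0);                      // end-of-component marker
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Charset utf8 = Charset.forName("UTF-8");
        byte[] name = encode("user42".getBytes(utf8), "email".getBytes(utf8));
        System.out.println(name.length + " bytes"); // 17: (2+6+1) + (2+5+1)
    }
}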

2011/8/15 Stephen Pope :
>  Hey, is there any documentation or examples of how to use the CompositeType? 
> I can't find anything about it on the wiki or the datastax docs.
>
>  Cheers,
>  Steve
>


Re: Scalability question

2011-08-15 Thread Philippe
Forgot to mention that stopping & restarting the server brought the data
directory down to 283GB in less than 1 minute.

Philippe
2011/8/15 Philippe 

> It's another reason to avoid major / manual compactions which create a
>> single big SSTable. Minor compactions keep things in buckets, which means
>> newer SSTables can be compacted without needing to read the bigger older tables.
>>
> I've never run a major/manual compaction on this ring.
> In my case running repair on a "big" keyspace results in SSTables piling
> up. My problematic node just filled up 483GB (yes, GB) of SSTables. Here
> are the biggest
> ls -laSrh
> (...)
>
> -rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:13
> PUBLIC_MONTHLY_20-g-4581-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:52
> PUBLIC_MONTHLY_20-g-4641-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  2.8G 2011-08-15 14:39
> PUBLIC_MONTHLY_20-tmp-g-4878-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  2.9G 2011-08-15 15:00
> PUBLIC_MONTHLY_20-g-4656-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 14:17
> PUBLIC_MONTHLY_20-g-4599-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 15:11
> PUBLIC_MONTHLY_20-g-4675-Data.db
>
> -rw-r--r-- 3 cassandra cassandra  3.1G 2011-08-13 10:34
> PUBLIC_MONTHLY_18-g-3861-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.2G 2011-08-15 14:41
> PUBLIC_MONTHLY_20-tmp-g-4884-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.6G 2011-08-15 14:44
> PUBLIC_MONTHLY_20-tmp-g-4894-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:56
> PUBLIC_MONTHLY_20-tmp-g-4934-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:46
> PUBLIC_MONTHLY_20-tmp-g-4905-Data.db
>
> -rw-r--r-- 1 cassandra cassandra  4.0G 2011-08-15 14:57
> PUBLIC_MONTHLY_20-tmp-g-4935-Data.db
>
> -rw-r--r-- 3 cassandra cassandra  5.9G 2011-08-13 12:53
> PUBLIC_MONTHLY_19-g-4219-Data.db
>
> -rw-r--r-- 3 cassandra cassandra  6.0G 2011-08-13 13:57
> PUBLIC_MONTHLY_20-g-4538-Data.db
>
> -rw-r--r-- 3 cassandra cassandra   12G 2011-08-13 09:27
> PUBLIC_MONTHLY_20-g-4501-Data.db
>
> On the other nodes the same directory is around 69GB. Why are there so
> few large files there and so many big ones on the repairing node?
>  -rw-r--r-- 1 cassandra cassandra 434M 2011-08-15 16:02
> PUBLIC_MONTHLY_17-g-3525-Data.db
> -rw-r--r-- 1 cassandra cassandra 456M 2011-08-15 15:50
> PUBLIC_MONTHLY_19-g-4253-Data.db
> -rw-r--r-- 1 cassandra cassandra 485M 2011-08-15 14:30
> PUBLIC_MONTHLY_20-g-5280-Data.db
> -rw-r--r-- 1 cassandra cassandra 572M 2011-08-15 15:15
> PUBLIC_MONTHLY_18-g-3774-Data.db
> -rw-r--r-- 2 cassandra cassandra 664M 2011-08-09 15:39
> PUBLIC_MONTHLY_20-g-4893-Index.db
> -rw-r--r-- 2 cassandra cassandra 811M 2011-08-11 21:27
> PUBLIC_MONTHLY_16-g-2597-Data.db
> -rw-r--r-- 2 cassandra cassandra 915M 2011-08-13 04:00
> PUBLIC_MONTHLY_18-g-3695-Data.db
> -rw-r--r-- 1 cassandra cassandra 925M 2011-08-15 03:39
> PUBLIC_MONTHLY_17-g-3454-Data.db
> -rw-r--r-- 1 cassandra cassandra 1.3G 2011-08-15 13:46
> PUBLIC_MONTHLY_19-g-4199-Data.db
> -rw-r--r-- 2 cassandra cassandra 1.5G 2011-08-10 15:37
> PUBLIC_MONTHLY_17-g-3218-Data.db
> -rw-r--r-- 1 cassandra cassandra 1.9G 2011-08-15 14:35
> PUBLIC_MONTHLY_20-g-5281-Data.db
> -rw-r--r-- 2 cassandra cassandra 2.1G 2011-08-10 16:33
> PUBLIC_MONTHLY_19-g-3946-Data.db
> -rw-r--r-- 2 cassandra cassandra 3.1G 2011-08-10 22:23
> PUBLIC_MONTHLY_18-g-3509-Data.db
> -rw-r--r-- 2 cassandra cassandra 4.0G 2011-08-10 18:18
> PUBLIC_MONTHLY_20-g-5024-Data.db
> -rw--- 2 cassandra cassandra 5.1G 2011-08-09 15:23
> PUBLIC_MONTHLY_19-g-3847-Data.db
> -rw-r--r-- 2 cassandra cassandra 9.6G 2011-08-09 15:39
> PUBLIC_MONTHLY_20-g-4893-Data.db
>
> This whole compaction thing is getting me worried : how are sites in
> production dealing with SSTables becoming larger and larger and thus taking
> longer and longer to compact ? Adding nodes every couple of weeks ?
>
> Philippe
>


Re: Scalability question

2011-08-15 Thread Philippe
>
> It's another reason to avoid major / manual compactions which create a
> single big SSTable. Minor compactions keep things in buckets, which means
> newer SSTables can be compacted without needing to read the bigger older tables.
>
I've never run a major/manual compaction on this ring.
In my case running repair on a "big" keyspace results in SSTables piling up.
My problematic node just filled up 483GB (yes, GB) of SSTables. Here are
the biggest
ls -laSrh
(...)

-rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:13
PUBLIC_MONTHLY_20-g-4581-Data.db

-rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:52
PUBLIC_MONTHLY_20-g-4641-Data.db

-rw-r--r-- 1 cassandra cassandra  2.8G 2011-08-15 14:39
PUBLIC_MONTHLY_20-tmp-g-4878-Data.db

-rw-r--r-- 1 cassandra cassandra  2.9G 2011-08-15 15:00
PUBLIC_MONTHLY_20-g-4656-Data.db

-rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 14:17
PUBLIC_MONTHLY_20-g-4599-Data.db

-rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 15:11
PUBLIC_MONTHLY_20-g-4675-Data.db

-rw-r--r-- 3 cassandra cassandra  3.1G 2011-08-13 10:34
PUBLIC_MONTHLY_18-g-3861-Data.db

-rw-r--r-- 1 cassandra cassandra  3.2G 2011-08-15 14:41
PUBLIC_MONTHLY_20-tmp-g-4884-Data.db

-rw-r--r-- 1 cassandra cassandra  3.6G 2011-08-15 14:44
PUBLIC_MONTHLY_20-tmp-g-4894-Data.db

-rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:56
PUBLIC_MONTHLY_20-tmp-g-4934-Data.db

-rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:46
PUBLIC_MONTHLY_20-tmp-g-4905-Data.db

-rw-r--r-- 1 cassandra cassandra  4.0G 2011-08-15 14:57
PUBLIC_MONTHLY_20-tmp-g-4935-Data.db

-rw-r--r-- 3 cassandra cassandra  5.9G 2011-08-13 12:53
PUBLIC_MONTHLY_19-g-4219-Data.db

-rw-r--r-- 3 cassandra cassandra  6.0G 2011-08-13 13:57
PUBLIC_MONTHLY_20-g-4538-Data.db

-rw-r--r-- 3 cassandra cassandra   12G 2011-08-13 09:27
PUBLIC_MONTHLY_20-g-4501-Data.db

On the other nodes the same directory is around 69GB. Why are there so
few large files there and so many big ones on the repairing node?
 -rw-r--r-- 1 cassandra cassandra 434M 2011-08-15 16:02
PUBLIC_MONTHLY_17-g-3525-Data.db
-rw-r--r-- 1 cassandra cassandra 456M 2011-08-15 15:50
PUBLIC_MONTHLY_19-g-4253-Data.db
-rw-r--r-- 1 cassandra cassandra 485M 2011-08-15 14:30
PUBLIC_MONTHLY_20-g-5280-Data.db
-rw-r--r-- 1 cassandra cassandra 572M 2011-08-15 15:15
PUBLIC_MONTHLY_18-g-3774-Data.db
-rw-r--r-- 2 cassandra cassandra 664M 2011-08-09 15:39
PUBLIC_MONTHLY_20-g-4893-Index.db
-rw-r--r-- 2 cassandra cassandra 811M 2011-08-11 21:27
PUBLIC_MONTHLY_16-g-2597-Data.db
-rw-r--r-- 2 cassandra cassandra 915M 2011-08-13 04:00
PUBLIC_MONTHLY_18-g-3695-Data.db
-rw-r--r-- 1 cassandra cassandra 925M 2011-08-15 03:39
PUBLIC_MONTHLY_17-g-3454-Data.db
-rw-r--r-- 1 cassandra cassandra 1.3G 2011-08-15 13:46
PUBLIC_MONTHLY_19-g-4199-Data.db
-rw-r--r-- 2 cassandra cassandra 1.5G 2011-08-10 15:37
PUBLIC_MONTHLY_17-g-3218-Data.db
-rw-r--r-- 1 cassandra cassandra 1.9G 2011-08-15 14:35
PUBLIC_MONTHLY_20-g-5281-Data.db
-rw-r--r-- 2 cassandra cassandra 2.1G 2011-08-10 16:33
PUBLIC_MONTHLY_19-g-3946-Data.db
-rw-r--r-- 2 cassandra cassandra 3.1G 2011-08-10 22:23
PUBLIC_MONTHLY_18-g-3509-Data.db
-rw-r--r-- 2 cassandra cassandra 4.0G 2011-08-10 18:18
PUBLIC_MONTHLY_20-g-5024-Data.db
-rw--- 2 cassandra cassandra 5.1G 2011-08-09 15:23
PUBLIC_MONTHLY_19-g-3847-Data.db
-rw-r--r-- 2 cassandra cassandra 9.6G 2011-08-09 15:39
PUBLIC_MONTHLY_20-g-4893-Data.db

This whole compaction thing is getting me worried : how are sites in
production dealing with SSTables becoming larger and larger and thus taking
longer and longer to compact ? Adding nodes every couple of weeks ?

Philippe


Re: Solandra multiple schemas

2011-08-15 Thread Ashley Martens
Multiple cores it is. Thanks.


Re: Solandra distributed search

2011-08-15 Thread Jake Luciani
Solandra manages the "shard" parameters for you. You don't need to specify
anything.

On Mon, Aug 15, 2011 at 3:00 PM, Jeremiah Jordan <
jeremiah.jor...@morningstar.com> wrote:

> When using Solandra, do I need to use the Solr sharding syntax in my
> queries? I don't think I do because Cassandra is handling the "sharding",
> not Solr, but just want to make sure.  The Solandra wiki references the
> distributed search limitations, which talks about the shard syntax further
> down the page.
> From what I see with how it is implemented I should just be able to pick a
> random Solandra node and do my query, since they are all backed by the same
> Cassandra data store. Correct?
>
> Thanks!
> -Jeremiah
>



-- 
http://twitter.com/tjake


Solandra distributed search

2011-08-15 Thread Jeremiah Jordan
When using Solandra, do I need to use the Solr sharding syntax in my 
queries? I don't think I do because Cassandra is handling the 
"sharding", not Solr, but just want to make sure.  The Solandra wiki 
references the distributed search limitations, which talks about the 
shard syntax further down the page.
From what I see with how it is implemented I should just be able to 
pick a random Solandra node and do my query, since they are all backed 
by the same Cassandra data store. Correct?


Thanks!
-Jeremiah


Re: Solandra multiple schemas

2011-08-15 Thread Jake Luciani
You want the solandra data stored under two keyspaces? Or you just want two
different logical indexes.

The former requires changing the keyspace name located in
solandra.properties but you can only access one per process.

The latter would involve creating two different solr cores at different
endpoints.

-Jake

On Mon, Aug 15, 2011 at 1:56 PM, Ashley Martens  wrote:

> Does Solandra support multiple schemas? For example I have staging and test
> data in two different keyspaces in Cassandra and want that echoed in
> Solandra. Possible?
>



-- 
http://twitter.com/tjake


Re: CassandraUnit

2011-08-15 Thread Jonathan Ellis
Thanks, Jérémy!

2011/8/15 Jérémy SEVELLEC :
> Hi all,
> I have published some documentation on it:
> https://github.com/jsevellec/cassandra-unit/wiki
> regards
> Jérémy
> Le 10 août 2011 23:20, Jérémy SEVELLEC  a écrit :
>>
>> Hi everyone,
>> Let me present CassandraUnit, a test framework for developing applications
>> with a Cassandra backend in TDD style.
>> It allows you to embed Cassandra and load data from an XML DataSet into your JUnit tests.
>> CassandraUnit is built on top of Hector and is licensed as LGPL v3.0 on
>> github.
>> Here is the post I made to present it :
>> http://www.unchticafe.fr/2011/08/cassandraunit-java-test-framework-to.html
>> This may interest you...
>>
>> Regards
>> --
>> Jérémy
>
>
>
> --
> Jérémy
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: CassandraUnit

2011-08-15 Thread Jérémy SEVELLEC
Hi all,

I have published some documentation on it:

https://github.com/jsevellec/cassandra-unit/wiki

regards

Jérémy

Le 10 août 2011 23:20, Jérémy SEVELLEC  a écrit :

> Hi everyone,
>
> Let me present CassandraUnit, a test framework for developing applications
> with a Cassandra backend in TDD style.
>
> It allows you to embed Cassandra and load data from an XML DataSet into your JUnit tests.
>
> CassandraUnit is built on top of Hector and is licensed as LGPL v3.0 on
> github.
>
> Here is the post I made to present it :
> http://www.unchticafe.fr/2011/08/cassandraunit-java-test-framework-to.html
>
> This may interest you...
>
>
> Regards
>
> --
> Jérémy
>
>


-- 
Jérémy


Solandra multiple schemas

2011-08-15 Thread Ashley Martens
Does Solandra support multiple schemas? For example I have staging and test
data in two different keyspaces in Cassandra and want that echoed in
Solandra. Possible?


CompositeType

2011-08-15 Thread Stephen Pope
 Hey, is there any documentation or examples of how to use the CompositeType? I 
can't find anything about it on the wiki or the datastax docs.

 Cheers,
 Steve


Re: Planet Cassandra is now live

2011-08-15 Thread Konstantin Naryshkin
Thanks. I did not see a link to it when I was sending my message.

- Original Message -
From: "Zhu Han" 
To: user@cassandra.apache.org
Sent: Saturday, August 13, 2011 12:11:37 AM
Subject: Re: Planet Cassandra is now live

On Sat, Aug 13, 2011 at 4:35 AM, Konstantin Naryshkin <konstant...@a-bb.net> wrote: 


Would you consider adding an RSS feed to the site for the benefit of those who 
like to use feed readers to keep track of unread posts and what not? 


Here it is: http://planetcassandra.org/aggregator/rss 

- Original Message - 
From: "Lynn Bender" <line...@gmail.com> 
To: user@cassandra.apache.org 
Sent: Friday, August 12, 2011 2:18:45 PM 
Subject: Planet Cassandra is now live 

http://planetcassandra.org/ 

Help us improve the site by suggesting Cassandra-related news and blogs. 

If you have any suggestions for the site whatsoever, feel free to send me a 
note directly. 

-- 
Lynn Bender 
Events and Outreach 
DataStax.com 
http://www.linkedin.com/in/lynnbender 




Re: Internal error processing get_range_slices

2011-08-15 Thread Jonathan Ellis
The count you specify is the worst case, so if you can't even allocate
a List to handle it, you shouldn't be specifying such a high count.
Better to find that out immediately than when your data set grows in
production.
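For example, page through the rows with a bounded count and restart from the
last key seen; a sketch against the 0.7 Thrift API (PAGE_SIZE and the
processing hook are illustrative):

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.*;

public class RangePager {
    private static final int PAGE_SIZE = 1000;

    static void scan(Cassandra.Client client, ColumnParent parent,
                     SlicePredicate predicate) throws Exception {
        ByteBuffer start = ByteBuffer.wrap(new byte[0]);
        while (true) {
            KeyRange range = new KeyRange(PAGE_SIZE)
                    .setStart_key(start)
                    .setEnd_key(ByteBuffer.wrap(new byte[0]));
            // Rows come back in token order under RandomPartitioner.
            List<KeySlice> page = client.get_range_slices(
                    parent, predicate, range, ConsistencyLevel.QUORUM);
            for (KeySlice slice : page) {
                // process(slice) -- illustrative hook
            }
            if (page.size() < PAGE_SIZE) break;
            // Next page starts at the last key returned; that row is re-fetched once.
            start = ByteBuffer.wrap(page.get(page.size() - 1).getKey());
        }
    }
}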

On Mon, Aug 15, 2011 at 8:15 AM, Patrik Modesto
 wrote:
> On Mon, Aug 15, 2011 at 15:09, Jonathan Ellis  wrote:
>> On Mon, Aug 15, 2011 at 7:13 AM, Patrik Modesto
>>  wrote:
>>> PS: while reading the email before I sent it, I noticed the
>>> keyRange.count =... is it possible that Cassandra is preallocating
>>> some internal data according to the KeyRange.count parameter?
>>
>> That's exactly what it does.
>
> Ok. But is this pre-allocating really needed? Can't Cassandra deduce
> that it doesn't need that much space?
>
> Regards,
> P.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Cassandra Certification

2011-08-15 Thread Edward Capriolo
A friend's friend developed the FreeBSD certification. It is actually a
difficult process: either you need to give thousands of dollars to a place
like Prometric or you need to have people across the world who can
administer the test. You also need to design and keep changing the test
because, sadly, people cheat all the time. Check Google for (ccna test
answers). Some companies even make businesses of helping you cheat/prepare.

It would be cool but the biggest problem is administering the test around
the US/world.


On Monday, August 15, 2011, aaron morton  wrote:
> Depending on where in the world you are, keep an eye / ear out for DataStax
> training http://www.datastax.com/events
> Cheers
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> On 15/08/2011, at 5:56 PM, Joe Stein wrote:
>
> Certification is good when a community gets to the point that proverbial
management cannot easily discern between posers and those that know what
they are talking about.  I hope one day Cassandra and its community grows
to that point but as of now there is enough transparency in my opinion.
> I would no more get a Cassandra certification than I would get one from
Cloudera for Hadoop (no offense) nor even a CISSP (which I could do also).
> I would rather see a certification in "scalable distributed computing
solutions" paramount to what the CSA (Cloud Security Alliance) has done with
security.  Cassandra is the answer in a lot of situations, but not always
the answer.  It is probably one of the best tools in your toolbox.
> As the saying goes => to a man with a hammer every problem is a nail, DON'T
BE THAT GUY.
> My .02121513E9 cents
>
> /*
> Joe Stein
> Chief Architect @medialets 
> http://www.linkedin.com/in/charmalloc
> Twitter: @allthingshadoop 
> */
> On Mon, Aug 15, 2011 at 1:23 AM, samal  wrote:
>>
>> Does it really make sense?
>> If yes, I think Apache Cassandra Project (ASF) should offer Open
Certification. Other entity can offer courses, training materials.
>
>
>
>


Re: Internal error processing get_range_slices

2011-08-15 Thread Patrik Modesto
On Mon, Aug 15, 2011 at 15:09, Jonathan Ellis  wrote:
> On Mon, Aug 15, 2011 at 7:13 AM, Patrik Modesto
>  wrote:
>> PS: while reading the email before I sent it, I noticed the
>> keyRange.count =... is it possible that Cassandra is preallocating
>> some internal data according to the KeyRange.count parameter?
>
> That's exactly what it does.

Ok. But is this pre-allocating really needed? Can't Cassandra deduce
that it doesn't need that much space?

Regards,
P.


Re: Merged counter shard with a count != 0

2011-08-15 Thread Jonathan Ellis
Can you create a bug report on https://issues.apache.org/jira/browse/CASSANDRA ?

On Mon, Aug 15, 2011 at 2:24 AM, Philippe  wrote:
>> Did you try what it says to do first? "You need to restart this node
>> with -Dcassandra.renew_counter_id=true to fix."
>
> Yes I did and it still logged that error upon restarting.
> I'm loath to remove the SSTable as every single repair I run on any node
> is streaming data because of out of sync nodes.
> P
>>
>> On Sun, Aug 14, 2011 at 12:28 PM, Philippe  wrote:
>> > Hi I'm getting the following at startup on one of the nodes on my 3 node
>> > cluster with RF=3.
>> > I have 6 keyspaces each with 10 column families that contain
>> > supercolumns
>> > that contain only counter columns.
>> > Looking
>> >
>> > at http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters
>> > I see that I am supposed to "remove all data for that column family".
>> > Does looking at the previous line for the same thread tell me which
>> > column
>> > family this is happening to ?
>> > How do I "remove the data" on that node ?
>> > Thanks
>> > ERROR [CompactionExecutor:6] 2011-08-14 19:02:55,117
>> > AbstractCassandraDaemon.java (line 134) Fatal exception in thread
>> > Thread[CompactionExecutor:6,1,main]
>> > java.lang.RuntimeException: Merged counter shard with a count != 0
>> > (likely
>> > due to #2968). You need to restart this node with
>> > -Dcassandra.renew_counter_id=true to fix.
>> >         at
>> >
>> > org.apache.cassandra.db.context.CounterContext.removeOldShards(CounterContext.java:633)
>> >         at
>> >
>> > org.apache.cassandra.db.CounterColumn.removeOldShards(CounterColumn.java:237)
>> >         at
>> >
>> > org.apache.cassandra.db.CounterColumn.removeOldShards(CounterColumn.java:273)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:67)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:60)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:75)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:140)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionIterator.getReduced(CompactionIterator.java:123)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionIterator.getReduced(CompactionIterator.java:43)
>> >         at
>> >
>> > org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
>> >         at
>> >
>> > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
>> >         at
>> >
>> > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
>> >         at
>> >
>> > org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
>> >         at
>> >
>> > org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:569)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:506)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:141)
>> >         at
>> >
>> > org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:107)
>> >         at
>> > java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>> >         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>> >         at
>> >
>> > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >         at
>> >
>> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >         at java.lang.Thread.run(Thread.java:662)
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of DataStax, the source for professional Cassandra support
>> http://www.datastax.com
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Internal error processing get_range_slices

2011-08-15 Thread Jonathan Ellis
On Mon, Aug 15, 2011 at 7:13 AM, Patrik Modesto
 wrote:
> PS: while reading the email before I sent it, I noticed the
> keyRange.count =... is it possible that Cassandra is preallocating
> some internal data according to the KeyRange.count parameter?

That's exactly what it does.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Internal error processing get_range_slices

2011-08-15 Thread Patrik Modesto
Hi,

on our dev cluster of 4 cassandra nodes 0.7.8 I'm suddenly getting:

ERROR 13:40:50,848 Internal error processing get_range_slices
java.lang.OutOfMemoryError: Java heap space
at java.util.ArrayList.<init>(ArrayList.java:112)
at 
org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:480)
at 
org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:486)
at 
org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:2868)
at 
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:206)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

I run get_range_slices() on all keys with 3 columns named in the Thrift request.

columnParent.column_family = CATEGORIES_CATEGORY;

keyRange.start_key = "";
keyRange.end_key   = "";
keyRange.__isset.start_key = true;
keyRange.__isset.end_key   = true;
keyRange.count = std::numeric_limits<int32_t>::max();

slicePredicate.column_names.push_back(CATEGORIES_CATEGORY_ID);
slicePredicate.column_names.push_back(CATEGORIES_CATEGORY_NAME);
slicePredicate.column_names.push_back(CATEGORIES_CATEGORY_PARENT);
slicePredicate.__isset.column_names = true;

 std::vector<oacassandra::KeySlice> rangeSlices;
 cassandraWrapper->get_range_slices(rangeSlices, columnParent,
slicePredicate, keyRange, oacassandra::ConsistencyLevel::QUORUM);

There are just 102 rows, each with 6 columns. The maximum row size is
3 379 391 B and the mean row size is 407 756 B. Suddenly Cassandra needs
9GB of heap space to fulfill this get_range_slices. There is no cache
enabled.

What could be the problem here?

Regards,
Patrik

PS: while reading the email before I sent it, I noticed the
keyRange.count =... is it possible that Cassandra is preallocating
some internal data according to the KeyRange.count parameter?


Re: Cassandra Certification

2011-08-15 Thread aaron morton
Depending on where in the world you are, keep an eye / ear out for DataStax 
training http://www.datastax.com/events

Cheers

-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 15/08/2011, at 5:56 PM, Joe Stein wrote:

> Certification is good when a community gets to the point that proverbial 
> management cannot easily discern between posers and those that know what they 
> are talking about.  I hope one day Cassandra and its community grows to that 
> point but as of now there is enough transparency in my opinion.
> 
> I would no more get a Cassandra certification than I would get one from 
> Cloudera for Hadoop (no offense) nor even a CISSP (which I could do also).
> 
> I would rather see a certification in "scalable distributed computing 
> solutions" paramount to what the CSA (Cloud Security Alliance) has done with 
> security.  Cassandra is the answer in a lot of situations, but not always the 
> answer.  It is probably one of the best tools in your toolbox.  
> 
> As the saying goes => to a man with a hammer every problem is a nail, DON'T BE 
> THAT GUY.
> 
> My .02121513E9 cents
> 
> /*
> Joe Stein
> Chief Architect @medialets
> http://www.linkedin.com/in/charmalloc
> Twitter: @allthingshadoop
> */
> On Mon, Aug 15, 2011 at 1:23 AM, samal  wrote:
> Does it really make sense?
> If yes, I think Apache Cassandra Project (ASF) should offer Open 
> Certification. Other entity can offer courses, training materials.  
> 
> 
> 



Re: performance problems on new cluster

2011-08-15 Thread Anton Winter

OK, node latency is fine and you are using some pretty low
consistency. You said NTS with RF 2, is that RF 2 for each DC ?


Correct, I'm using RF 2 for each DC.



I was able to reproduce the cli timeouts on the non replica nodes.

The debug log output from dc1host1 (non replica node):

DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,183 StorageProxy.java 
(line 518) Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, 
count=100)/ONE
DEBUG [pool-2-thread-14] 2011-08-15 05:26:15,187 StorageProxy.java 
(line 546) reading data from /dc1host3
DEBUG [pool-2-thread-14] 2011-08-15 05:26:35,191 StorageProxy.java 
(line 593) Read timeout: java.util.concurrent.TimeoutException: 
Operation timed out - received only 1 responses from /dc1host3,  .



If the query is run again on the same node (dc1host1) 0 rows are 
returned and the following DEBUG messages are logged:



DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java 
(line 518) Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, 
count=100)/ONE
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java 
(line 546) reading data from /dc1host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,513 StorageProxy.java 
(line 562) reading digest from /dc1host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc2host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc2host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc3host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc3host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc4host3
DEBUG [pool-2-thread-14] 2011-08-15 05:32:05,514 StorageProxy.java 
(line 562) reading digest from /dc4host2
DEBUG [pool-2-thread-14] 2011-08-15 05:32:06,022 StorageProxy.java 
(line 588) Read: 508 ms.
ERROR [ReadRepairStage:2112] 2011-08-15 05:32:06,404 
AbstractCassandraDaemon.java (line 133) Fatal exception in thread 
Thread[ReadRepairStage:2112,5,main]

java.lang.AssertionError
at 
org.apache.cassandra.service.RowRepairResolver.resolve(RowRepairResolver.java:73)
at 
org.apache.cassandra.service.AsyncRepairCallback$1.runMayThrow(AsyncRepairCallback.java:54)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)

at java.lang.Thread.run(Thread.java:662)


Subsequent queries afterwards show "reading data from /dc1host2", 
however the result returned remains 0 rows.



If I run the same query on a replica I get the correct result returned 
but with 2 exceptions as follows:



DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,792 StorageProxy.java (line 
518) Command/ConsistencyLevel is SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, 
count=100)/ONE
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
541) reading data locally
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
562) reading digest from /dc1host3
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc3host2/dc3host2
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,793 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc3host3/dc3host3
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc2host2/dc2host2
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dns.entry.for.dc2host3/dc2host3
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dc4host2/dc4host2
DEBUG [pool-2-thread-5] 2011-08-15 05:45:49,794 StorageProxy.java (line 
562) reading digest from dc4host3/dc4host3
DEBUG [ReadStage:20102] 2011-08-15 05:45:49,793 StorageProxy.java (line 
690) LocalReadRunnable reading SliceFromReadCommand(table='ks1', 
key='userid1', column_parent='QueryPath(columnFamilyName='cf1', 
superColumnName='java.nio.HeapByteBuffer[pos=64 lim=67 cap=109]', 
columnName='null')', start='', finish='', reversed=false, count=100)
DEBUG [pool-2-thread-5] 2011-08-

Re: Merged counter shard with a count != 0

2011-08-15 Thread Philippe
>
> It looks like the error was thrown during a minor compaction. There should
> be a log line from the CompactionManager beforehand that says "Compacting…"
> and lists the SSTables it is going to compact. Check that it's from the same
> thread, i.e. [CompactionExecutor:6] in the example below.
>
Ok.
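
For anyone else chasing this, here is roughly how I pull that line out of
system.log, as a quick Python sketch; it assumes the stock log layout and
only keys on the thread name and message text (path and script name below
are just examples):

import sys

def last_compacting_line(log_path, thread="CompactionExecutor:6"):
    """Return the most recent 'Compacting' line logged by the given
    thread before its 'Merged counter shard' error, if any."""
    marker = "[%s]" % thread
    compacting = None
    with open(log_path) as log:
        for line in log:
            if marker not in line:
                continue
            if "Compacting" in line:
                compacting = line.rstrip()
            elif "Merged counter shard" in line:
                break  # stop at the error; 'compacting' holds the culprit
    return compacting

if __name__ == "__main__":
    # e.g. python find_compacting.py /var/log/cassandra/system.log
    print(last_compacting_line(sys.argv[1]))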


> With the node stopped, delete/move the SSTable files from the data
> directory for the keyspace. They will have the same ColumnFamily-g-XXX.*
> naming.
>
Ah, that's what I had missed. I should have looked in the directory. It's
obvious, sorry!
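
For the record, with the node stopped, something along these lines moves one
column family's files aside; this is just a sketch assuming the usual
<data_dir>/<keyspace>/<ColumnFamily>-g-NNN.* layout, with the paths and CF
name below purely as examples:

import glob
import os
import shutil

def move_cf_sstables(data_dir, keyspace, cf, dest):
    """Move every SSTable component (Data, Index, Filter, ...) for one
    column family out of the data directory. The node must be stopped."""
    if not os.path.isdir(dest):
        os.makedirs(dest)
    paths = glob.glob(os.path.join(data_dir, keyspace, "%s-g-*" % cf))
    for path in paths:
        shutil.move(path, dest)
    return paths

# Hypothetical invocation:
# move_cf_sstables("/var/lib/cassandra/data", "ks1", "Counters1",
#                  "/var/lib/cassandra/quarantine")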


> By the way, you may want to do some stress testing with 60 column families
> to make sure things behave as expected.
>
Yes, I've been running all our traffic through the new Cassandra cluster and
then replaying it on our legacy infrastructure. Been getting a lot of ops
experience by doing that!



>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 15 Aug 2011, at 05:28, Philippe wrote:
>
> Hi, I'm getting the following at startup on one of the nodes of my 3-node
> cluster with RF=3.
> I have 6 keyspaces, each with 10 column families that contain supercolumns
> holding only counter columns.
>
> Looking at
> http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters
> I see that I am supposed to "remove all data for that column family".
> Does looking at the previous line for the same thread tell me which column
> family this is happening to?
> How do I "remove the data" on that node?
>
> Thanks
>
> ERROR [CompactionExecutor:6] 2011-08-14 19:02:55,117
> AbstractCassandraDaemon.java (line 134) Fatal exception in thread
> Thread[CompactionExecutor:6,1,main]
> java.lang.RuntimeException: Merged counter shard with a count != 0 (likely
> due to #2968). You need to restart this node with
> -Dcassandra.renew_counter_id=true to fix.
> at org.apache.cassandra.db.context.CounterContext.removeOldShards(CounterContext.java:633)
> at org.apache.cassandra.db.CounterColumn.removeOldShards(CounterColumn.java:237)
> at org.apache.cassandra.db.CounterColumn.removeOldShards(CounterColumn.java:273)
> at org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:67)
> at org.apache.cassandra.db.compaction.PrecompactedRow.removeDeletedAndOldShards(PrecompactedRow.java:60)
> at org.apache.cassandra.db.compaction.PrecompactedRow.<init>(PrecompactedRow.java:75)
> at org.apache.cassandra.db.compaction.CompactionController.getCompactedRow(CompactionController.java:140)
> at org.apache.cassandra.db.compaction.CompactionIterator.getReduced(CompactionIterator.java:123)
> at org.apache.cassandra.db.compaction.CompactionIterator.getReduced(CompactionIterator.java:43)
> at org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:74)
> at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
> at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
> at org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
> at org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
> at org.apache.cassandra.db.compaction.CompactionManager.doCompactionWithoutSizeEstimation(CompactionManager.java:569)
> at org.apache.cassandra.db.compaction.CompactionManager.doCompaction(CompactionManager.java:506)
> at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:141)
> at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:107)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
>
>
>


Re: Merged counter shard with a count != 0

2011-08-15 Thread Philippe
>
> Did you try what it says to do first? "You need to restart this node
> with -Dcassandra.renew_counter_id=true to fix."
>
Yes, I did, and it still logged that error upon restarting.
I'm loath to remove the SSTable, as every single repair I run on any node
streams data because of out-of-sync nodes.

P

>
> On Sun, Aug 14, 2011 at 12:28 PM, Philippe  wrote:
> > Hi, I'm getting the following at startup on one of the nodes of my 3-node
> > cluster with RF=3.
> > I have 6 keyspaces, each with 10 column families that contain supercolumns
> > holding only counter columns.
> > Looking at
> > http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters
> > I see that I am supposed to "remove all data for that column family".
> > Does looking at the previous line for the same thread tell me which column
> > family this is happening to?
> > How do I "remove the data" on that node?
> > Thanks
> > ERROR [CompactionExecutor:6] 2011-08-14 19:02:55,117
> > AbstractCassandraDaemon.java (line 134) Fatal exception in thread
> > Thread[CompactionExecutor:6,1,main]
> > java.lang.RuntimeException: Merged counter shard with a count != 0 (likely
> > due to #2968). You need to restart this node with
> > -Dcassandra.renew_counter_id=true to fix.
> > [stack trace snipped; identical to the one quoted in full above]
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>