Re: Counter question

2012-03-29 Thread Shimi Kiviti
Like everything else in Cassandra, if you need full consistency you need to
make sure that you have the right combination of (write consistency level)
+ (read consistency level):

if
W = write consistency level
R = read consistency level
N = replication factor
then
W + R > N
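(To put numbers on that for the setup described below, where N = 2: reads and
writes at ONE give 1 + 1 = 2, which is not greater than 2, so a read can be
served by the replica that has not yet applied the increment, while QUORUM for
both reads and writes gives 2 + 2 > 2 and closes that window.)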

Shimi

On Thu, Mar 29, 2012 at 10:09 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 Hi!
 Asking again, as I didn't get responses :)

 I have a ring with 3 nodes and replication factor of 2.
 I have counter cf with the following definition:

 CREATE COLUMN FAMILY tk_counters
 with comparator = 'UTF8Type'
 and default_validation_class = 'CounterColumnType'
 and key_validation_class = 'CompositeType(UTF8Type,UUIDType)'
 and replicate_on_write = true;

 In my code (Java, Hector), I increment a counter and then read it.
 Is it possible that the value read will be the value before the increment?
 If yes, how can I ensure it does not happen? All my reads and writes are
 done with consistency level ONE.
 If this is a consistency issue, can I do only the actions on the tk_counters
 column family with a higher consistency level?
 What does replicate_on_write mean? I thought this should help, but maybe
 even if it replicates after the write, my read happens before replication
 finishes and it returns the value from a not-yet-updated node.

 My increment code is:
 Mutator<Composite> mutator =
 HFactory.createMutator(keyspace,
 CompositeSerializer.get());
 mutator.incrementCounter(key, "tk_counters", columnName, inc);
 mutator.execute();

 My read counter code is:
 CounterQuery<Composite, String> query =
 createCounterColumnQuery(keyspace,
 CompositeSerializer.get(), StringSerializer.get());
 query.setColumnFamily("tk_counters");
 query.setKey(key);
 query.setName(columnName);
 QueryResult<HCounterColumn<String>> r = query.execute();
 return r.get().getValue();

 Thanks,
 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





Re: Counter question

2012-03-29 Thread Shimi Kiviti
You set the consistency with every request.
Usually a client library will let you set a default one for all write/read
requests.
I don't know if Hector lets you set a default consistency level per CF.
Take a look at the Hector docs or ask on the Hector mailing list.
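For what it's worth, the Hector builds I have looked at do seem to allow
per-CF overrides via the ConfigurableConsistencyLevel policy passed to
HFactory.createKeyspace(). A minimal sketch, assuming the class and setter
names below match your Hector version (verify before relying on it):

import java.util.HashMap;
import java.util.Map;

import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.HConsistencyLevel;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;

public class TkCountersConsistency {

    // Builds a Keyspace whose tk_counters operations use QUORUM while
    // everything else stays at ONE. Class and method names are assumptions
    // based on the Hector API as I remember it.
    public static Keyspace keyspaceWithCounterQuorum(Cluster cluster, String keyspaceName) {
        ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
        policy.setDefaultReadConsistencyLevel(HConsistencyLevel.ONE);
        policy.setDefaultWriteConsistencyLevel(HConsistencyLevel.ONE);

        Map<String, HConsistencyLevel> perCf = new HashMap<String, HConsistencyLevel>();
        perCf.put("tk_counters", HConsistencyLevel.QUORUM);
        policy.setReadCfConsistencyLevels(perCf);
        policy.setWriteCfConsistencyLevels(perCf);

        return HFactory.createKeyspace(keyspaceName, cluster, policy);
    }
}

With a policy like this, only tk_counters pays the QUORUM latency; the other
column families keep reading and writing at ONE.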

Shimi

On Thu, Mar 29, 2012 at 11:47 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 Can this be set on a per-CF basis?
 Only this CF needs a higher consistency level.
 Thanks,
 Tamar

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Thu, Mar 29, 2012 at 10:44 AM, Shimi Kiviti shim...@gmail.com wrote:

 Like everything else in Cassandra, if you need full consistency you need
 to make sure that you have the right combination of (write consistency
 level) + (read consistency level):

 if
 W = write consistency level
 R = read consistency level
 N = replication factor
 then
 W + R > N

 Shimi


 On Thu, Mar 29, 2012 at 10:09 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 Hi!
 Asking again, as I didn't get responses :)

 I have a ring with 3 nodes and replication factor of 2.
 I have counter cf with the following definition:

 CREATE COLUMN FAMILY tk_counters
 with comparator = 'UTF8Type'
 and default_validation_class = 'CounterColumnType'
 and key_validation_class = 'CompositeType(UTF8Type,UUIDType)'
 and replicate_on_write = true;

 In my code (Java, Hector), I increment a counter and then read it.
 Is it possible that the value read will be the value before the increment?
 If yes, how can I ensure it does not happen? All my reads and writes are
 done with consistency level ONE.
 If this is a consistency issue, can I do only the actions on the tk_counters
 column family with a higher consistency level?
 What does replicate_on_write mean? I thought this should help, but maybe
 even if it replicates after the write, my read happens before replication
 finishes and it returns the value from a not-yet-updated node.

 My increment code is:
 Mutator<Composite> mutator =
 HFactory.createMutator(keyspace,
 CompositeSerializer.get());
 mutator.incrementCounter(key, "tk_counters", columnName, inc);
 mutator.execute();

 My read counter code is:
 CounterQuery<Composite, String> query =
 createCounterColumnQuery(keyspace,
 CompositeSerializer.get(), StringSerializer.get());
 query.setColumnFamily("tk_counters");
 query.setKey(key);
 query.setName(columnName);
 QueryResult<HCounterColumn<String>> r = query.execute();
 return r.get().getValue();

 Thanks,
 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956







Re: Row iteration over indexed clause

2012-03-13 Thread Shimi Kiviti
Yes. Use get_indexed_slices (http://wiki.apache.org/cassandra/API).
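A rough sketch of paging through an index scan at the Thrift level
(Cassandra 0.7-era API; the generated class, field and method names below
are from memory of the Thrift bindings, so treat them as assumptions and
check them against your cassandra.thrift):

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.IndexClause;
import org.apache.cassandra.thrift.IndexExpression;
import org.apache.cassandra.thrift.IndexOperator;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

public class IndexedSlicesPager {

    // Fetches rows matching "column == value" in chunks of pageSize keys.
    public static void scan(Cassandra.Client client, String columnFamily,
                            ByteBuffer column, ByteBuffer value) throws Exception {
        final int pageSize = 100;
        ColumnParent parent = new ColumnParent(columnFamily);

        // All columns of each row, capped at 1000 per row for this sketch.
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(new SliceRange(
                ByteBuffer.allocate(0), ByteBuffer.allocate(0), false, 1000));

        IndexClause clause = new IndexClause();
        clause.addToExpressions(new IndexExpression(column, IndexOperator.EQ, value));
        clause.setCount(pageSize);
        clause.setStart_key(ByteBuffer.allocate(0)); // empty start key = beginning

        while (true) {
            List<KeySlice> page = client.get_indexed_slices(
                    parent, clause, predicate, ConsistencyLevel.ONE);
            for (KeySlice row : page) {
                // process the row here (skip the first row of every page after
                // the first, since it repeats the previous page's last key)
            }
            if (page.size() < pageSize) {
                break; // a short page means we reached the end
            }
            clause.setStart_key(page.get(page.size() - 1).key);
        }
    }
}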
On Tue, Mar 13, 2012 at 2:12 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Hi,
 Is it possible to iterate and fetch in chunks using thrift API by querying
 using secondary indexes?

 -Vivek



Re: Composite column docs

2012-01-06 Thread Shimi Kiviti
On Thu, Jan 5, 2012 at 9:13 PM, aaron morton aa...@thelastpickle.comwrote:

 What client are you using ?

I am writing a client.


 For example pycassa has some sweet documentation
 http://pycassa.github.com/pycassa/assorted/composite_types.html

It is sweet documentation but it doesn't help me. I need lower-level
documentation.


 Cheers

 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 6/01/2012, at 12:48 AM, Shimi Kiviti wrote:

 Is there a doc for using composite columns with Thrift?
 Is
 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/marshal/CompositeType.java
 the only doc?
 Does the client need to add the length to the get / get_slice... queries,
 or is it taken care of on the server side?

 Shimi





Composite column docs

2012-01-05 Thread Shimi Kiviti
Is there a doc for using composite columns with Thrift?
Is
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/marshal/CompositeType.java
the only doc?
Does the client need to add the length to the get / get_slice... queries,
or is it taken care of on the server side?
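My understanding from reading CompositeType.java is that yes, with raw Thrift
the client does the framing itself: each component is written as a two-byte
big-endian length, the raw component bytes, and then one end-of-component byte
(0 for a normal value; -1/1 are only used to widen slice bounds). A small
sketch under that assumption:

import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

public class CompositeEncoder {

    // Encodes UTF-8 string components in the CompositeType layout:
    // [2-byte big-endian length][component bytes][1-byte end-of-component]
    public static byte[] encode(String... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String component : components) {
            byte[] bytes = component.getBytes(Charset.forName("UTF-8"));
            out.write((bytes.length >> 8) & 0xFF); // length, high byte
            out.write(bytes.length & 0xFF);        // length, low byte
            out.write(bytes, 0, bytes.length);     // the component itself
            out.write(0);                          // end-of-component marker
        }
        return out.toByteArray();
    }
}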

Shimi


Re: CassandraDaemon deactivate doesn't shutdown Cassandra

2011-10-15 Thread Shimi Kiviti
The problem doesn't exist after the column family is truncated or
if durable_writes=true

Shimi

On Tue, Oct 11, 2011 at 9:30 PM, Shimi Kiviti shim...@gmail.com wrote:

 I am running an embedded Cassandra (0.8.7), and
 calling CassandraDaemon.deactivate() after I write rows (at least one)
 doesn't shut down Cassandra.
 If I run only reads, it does shut down even without
 calling CassandraDaemon.deactivate().

 Anyone have any idea what can cause this problem?

 Shimi



CassandraDaemon deactivate doesn't shutdown Cassandra

2011-10-11 Thread Shimi Kiviti
I am running an embedded Cassandra (0.8.7), and
calling CassandraDaemon.deactivate() after I write rows (at least one)
doesn't shut down Cassandra.
If I run only reads, it does shut down even without
calling CassandraDaemon.deactivate().

Anyone have any idea what can cause this problem?

Shimi


Re: Cassandra Capistrano recipes

2011-07-06 Thread shimi
Modify your Capistrano script to install an init script. If you use Debian
or Red Hat you can copy or modify these:
https://github.com/Shimi/cassandra/blob/trunk/debian/init
https://github.com/Shimi/cassandra/blob/trunk/redhat/cassandra

and setup Capistrano to call /etc/init.d/cassandra stop/start/restart

Shimi

On Thu, Jul 7, 2011 at 4:27 AM, R Headley headle...@yahoo.com wrote:

 Hi

 I'm using Capistrano with Cassandra and was wondering if anyone has a
 recipe(s) for, in particular, starting Cassandra as a daemon.  Running the
 'bin/cassandra' shell script (without the '-f' switch) doesn't quite work, as
 this only runs Cassandra in the background; logging out will kill it.

 Thanks, Richard



Re: Read time get worse during dynamic snitch reset

2011-05-11 Thread shimi
I finally found some time to get back to this issue.
I turned on the DEBUG log on the StorageProxy and it shows that all of these
requests are read from the other datacenter.

Shimi

On Tue, Apr 12, 2011 at 2:31 PM, aaron morton aa...@thelastpickle.comwrote:

 Something feels odd.

 From Peters nice write up of the dynamic snitch
 http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html The
 RackInferringSnitch (and the PropertyFileSnitch) derive from the
 AbstractNetworkTopologySnitch and should...
 
 In the case of the NetworkTopologyStrategy, it inherits the
 implementation in AbstractNetworkTopologySnitch which sorts by
 AbstractNetworkTopologySnitch.compareEndPoints(), which:

 (1) Always prefers itself to any other node. So myself is always
 closest, no matter what.
 (2) Else, always prefers a node in the same rack, to a node in a different
 rack.
 (3) Else, always prefers a node in the same dc, to a node in a different
 dc.
 http://www.mail-archive.com/user@cassandra.apache.org/msg12092.html

 AFAIK the (data) request should be going to the local DC even after the
 DynamicSnitch has reset the scores. Because the underlying
 RackInferringSnitch should prefer local nodes.

 Just for fun check rack and dc assignments are what you thought using the
 operations on o.a.c.db.EndpointSnitchInfo bean in JConsole. Pass in the ip
 address for the nodes in each dc. If possible can you provide some info on
 the ip's in each dc?

 Aaron

 On 12 Apr 2011, at 18:24, shimi wrote:

 On Tue, Apr 12, 2011 at 12:26 AM, aaron morton aa...@thelastpickle.comwrote:

 The reset interval clears the latency tracked for each node so a bad node
 will be read from again. The scores for each node are then updated every
 100ms (default) using the last 100 responses from a node.

 How long does the bad performance last for?

 Only a few seconds, but there are a lot of read requests during this
 time


 What CL are you reading at ? At Quorum with RF 4 the read request will be
 sent to 3 nodes, ordered by proximity and wellness according to the dynamic
 snitch. (for background recent discussion on dynamic snitch
 http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)

 I am reading with CL of ONE,  read_repair_chance=0.33, RackInferringSnitch
 and keys_cached = rows_cached = 0


 You can take a look at the weights and timings used by the DynamicSnitch
 in JConsole under o.a.c.db.DynamicSnitchEndpoint . Also at DEBUG log level
 you will be able to see which nodes the request is sent to.

 Everything looks OK. The weights are around 3 for the nodes in the same
 data center and around 5 for the others. I will turn on the DEBUG level to
 see if I can find more info.


 My guess is the DynamicSnitch is doing the right thing and the slow down
 is a node with a problem getting back into the list of nodes used for your
 read. It's then moved down the list as it's bad performance is noticed.

 Looking the DynamicSnitch MBean I don't see any problems with any of the
 nodes. My guess is that during the reset time there are reads that are sent
 to the other data center.


 Hope that helps
 Aaron


 Shimi



 On 12 Apr 2011, at 01:28, shimi wrote:

 I finally upgraded 0.6.x to 0.7.4.  The nodes are running with the new
 version for several days across 2 data centers.
 I noticed that the read time on some of the nodes increases by 50-60x every
 ten minutes.
 There was no indication in the logs of anything that happened at the same
 time. The only thing that I know is running every 10 minutes is
 the dynamic snitch reset.
 So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes and now I
 have the problem once in every 20 minutes.

 I am running all nodes with:
 replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
 strategy_options:
   DC1: 2
   DC2: 2
 replication_factor: 4

 (DC1 and DC2 are taken from the IPs)
 Is anyone familiar with this kind of behavior?

 Shimi







Re: Combining all CFs into one big one

2011-05-01 Thread shimi
On Sun, May 1, 2011 at 9:48 PM, Jake Luciani jak...@gmail.com wrote:

 If you have N column families you need N * memtable size of RAM to support
 this.  If that's not an option you can merge them into one as you suggest
 but then you will have much larger SSTables, slower compactions, etc.



 I don't necessarily agree with Tyler that the OS cache will be less
 effective... But I do agree that if the sizes of sstables are too large for
 you then more hardware is the solution...


If you merge CFs that are hardly accessed with one that is accessed
frequently, then when you read the SSTable you load data that is hardly
accessed into the OS cache.

Another thing you should be aware of is that if you need to run any of
the nodetool CF tasks, and you really need it for a specific CF, running it
on that specific CF is better and faster.

Shimi




 On Sun, May 1, 2011 at 1:24 PM, Tyler Hobbs ty...@datastax.com wrote:

 When you have a high number of CFs, it's a good idea to consider merging
 CFs with highly correlated access patterns and similar structure into one.
 It is *not* a good idea to merge all of your CFs into one (unless they all
 happen to meet this criteria). Here's why:

 Besides big compactions and long repairs that you can't break down into
 smaller pieces, the main problem here is that your caching will become much
 less efficient. The OS buffer cache will be less effective because rows from
 all of the CFs will be interspersed in the SSTables. You will no longer be
 able to tune the key or row cache to only cache frequently accessed data.
 Both of these will tend to cause a serious increase in latency for your hot
 data.

 Shouldn't these kinds of problems be solved by Cassandra?

 They are mainly solved by Cassandra's general solution to any performance
 problem: the addition of more nodes. There are tickets open to improve
 compaction strategies, put bounds on SSTable sizes, etc; for example,
 https://issues.apache.org/jira/browse/CASSANDRA-1608 , but the addition
 of more nodes is a reliable solution to problems of this nature.

 On Sun, May 1, 2011 at 7:28 AM, David Boxenhorn da...@taotown.comwrote:

 Shouldn't these kinds of problems be solved by Cassandra? Isn't there a
 maximum SSTable size?

 On Sun, May 1, 2011 at 3:24 PM, shimi shim...@gmail.com wrote:

 Big SSTables, long compactions; for a major compaction you will need to
 have free disk space equal to the size of all the SSTables (which you should
 have anyway).

 Shimi


 On Sun, May 1, 2011 at 2:03 PM, David Boxenhorn da...@taotown.comwrote:

 I'm having problems administering my cluster because I have too many
 CFs (~40).

 I'm thinking of combining them all into one big CF. I would prefix the
 current CF name to the keys, repeat the CF name in a column, and index the
 column (so I can loop over all rows, which I have to do sometimes, for some
 CFs).

 Can anyone think of any disadvantages to this approach?






 --
 Tyler Hobbs
 Software Engineer, DataStax http://datastax.com/
 Maintainer of the pycassa http://github.com/pycassa/pycassa Cassandra
 Python client library




 --
 http://twitter.com/tjake



Re: Tombstones and memtable_operations

2011-04-19 Thread shimi
You can use memtable_flush_after_mins instead of the cron

Shimi

2011/4/19 Héctor Izquierdo Seliva izquie...@strands.com


 On Wed, 20-04-2011 at 08:16 +1200, aaron morton wrote:
  I think there may be an issue here: we are counting the number of columns
 in the operation. When deleting an entire row we do not have a column count.
 
  Can you let us know what version you are using and how you are doing the
 delete ?
 
  Thanks
  Aaron
 

 I'm using 0.7.4. I have a file with all the row keys I have to delete
 (around 100 million) and I just go through the file and issue deletes
 through pelops.

 Should I manually issue flushes with a cron every x time?

  On 20 Apr 2011, at 04:21, Héctor Izquierdo Seliva wrote:
 
   Ok, I've read about gc grace seconds, but I'm not sure I understand it
   fully. Until gc grace seconds have passed, and there is a compaction,
   the tombstones live in memory? I have to delete 100 million rows and my
   insert rate is very low, so I don't have a lot of compactions. What
   should I do in this case? Lower the major compaction threshold and
   memtable_operations to some very low number?
  
   Thanks
  
   On Tue, 19-04-2011 at 17:36 +0200, Héctor Izquierdo Seliva wrote:
   Hi everyone. I've configured in one of my column families
   memtable_operations = 0.02 and started deleting keys. I have already
   deleted 54k, but there hasn't been any flush of the memtable. Memory
   keeps piling up and eventually nodes start to do stop-the-world GCs.
 Is
   this the way this is supposed to work or have I done something wrong?
  
   Thanks!
  
  
  
 





Re: Cassandra 0.7.4 Bug?

2011-04-17 Thread shimi
I had the same thing.
Node restart should solve it.

Shimi


On Sun, Apr 17, 2011 at 4:25 PM, Dikang Gu dikan...@gmail.com wrote:

 +1.

 I also met this problem several days before, and I haven't got a solution
 yet...


 On Sun, Apr 17, 2011 at 9:17 PM, csharpplusproject 
 csharpplusproj...@gmail.com wrote:

  Often, I see the following behavior:

 (1) Cassandra works, all nodes are up etc

 (2) a 'move' operation is being run on one of the nodes

 (3) following this 'move' operation, even after a couple of hours / days
 where it is obvious the operation has ended, the node which had 'moved'
 remains with a status of *?*

 perhaps it's a bug?


 ___

 shalom@host:/opt/cassandra/apache-cassandra-0.7.4$ bin/nodetool -host
 192.168.0.5 ring
 Address Status State   LoadOwns
 Token

 127605887595351923798765477786913079296
 192.168.0.253   Up Normal  88.66 MB25.00%
 0
   192.168.0.4 Up Normal  558.2 MB50.00%
 85070591730234615865843651857942052863
   192.168.0.5 Up Normal  71.03 MB16.67%
 113427455640312821154458202477256070485
   192.168.0.6 Up Normal  44.71 MB8.33%
 127605887595351923798765477786913079296

 shalom@host:/opt/cassandra/apache-cassandra-0.7.4$ bin/nodetool -host
 192.168.0.4 move 92535295865117307932921825928971026432

 shalom@host:/opt/cassandra/apache-cassandra-0.7.4$ bin/nodetool -host
 192.168.0.5 ring
 Address Status State   LoadOwns
 Token

 127605887595351923798765477786913079296
 192.168.0.253   Up Normal  171.17 MB   25.00%
 0
 192.168.0.4 *?*  Normal  212.11 MB   54.39%
 92535295865117307932921825928971026432
 192.168.0.5 Up Normal  263.91 MB   12.28%
 113427455640312821154458202477256070485
 192.168.0.6 Up Normal  26.21 MB8.33%
 127605887595351923798765477786913079296




 --
 Dikang Gu

 0086 - 18611140205




Re: Read time get worse during dynamic snitch reset

2011-04-12 Thread shimi
On Tue, Apr 12, 2011 at 12:26 AM, aaron morton aa...@thelastpickle.comwrote:

 The reset interval clears the latency tracked for each node so a bad node
 will be read from again. The scores for each node are then updated every
 100ms (default) using the last 100 responses from a node.

 How long does the bad performance last for?

Only a few seconds, but there are a lot of read requests during this time


 What CL are you reading at ? At Quorum with RF 4 the read request will be
 sent to 3 nodes, ordered by proximity and wellness according to the dynamic
 snitch. (for background recent discussion on dynamic snitch
 http://www.mail-archive.com/user@cassandra.apache.org/msg12089.html)

I am reading with CL of ONE,  read_repair_chance=0.33, RackInferringSnitch
and keys_cached = rows_cached = 0


 You can take a look at the weights and timings used by the DynamicSnitch in
 JConsole under o.a.c.db.DynamicSnitchEndpoint . Also at DEBUG log level you
 will be able to see which nodes the request is sent to.

Everything looks OK. The weights are around 3 for the nodes in the same data
center and around 5 for the others. I will turn on the DEBUG level to see if
I can find more info.


 My guess is the DynamicSnitch is doing the right thing and the slow down is
 a node with a problem getting back into the list of nodes used for your
 read. It's then moved down the list as it's bad performance is noticed.

Looking the DynamicSnitch MBean I don't see any problems with any of the
nodes. My guess is that during the reset time there are reads that are sent
to the other data center.


 Hope that helps
 Aaron


Shimi



 On 12 Apr 2011, at 01:28, shimi wrote:

 I finally upgraded 0.6.x to 0.7.4.  The nodes are running with the new
 version for several days across 2 data centers.
 I noticed that the read time on some of the nodes increases by 50-60x every
 ten minutes.
 There was no indication in the logs of anything that happened at the same
 time. The only thing that I know is running every 10 minutes is
 the dynamic snitch reset.
 So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes and now I
 have the problem once in every 20 minutes.

 I am running all nodes with:
 replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
 strategy_options:
   DC1: 2
   DC2: 2
 replication_factor: 4

 (DC1 and DC2 are taken from the IPs)
 Is anyone familiar with this kind of behavior?

 Shimi





Read time get worse during dynamic snitch reset

2011-04-11 Thread shimi
I finally upgraded 0.6.x to 0.7.4.  The nodes are running with the new
version for several days across 2 data centers.
I noticed that the read time on some of the nodes increases by 50-60x every
ten minutes.
There was no indication in the logs of anything that happened at the same
time. The only thing that I know is running every 10 minutes is
the dynamic snitch reset.
So I changed dynamic_snitch_reset_interval_in_ms to 20 minutes and now I
have the problem once in every 20 minutes.

I am running all nodes with:
replica_placement_strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
strategy_options:
  DC1: 2
  DC2: 2
replication_factor: 4

(DC1 and DC2 are taken from the IPs)
Is anyone familiar with this kind of behavior?

Shimi


Re: nodetool cleanup - results in more disk use?

2011-04-04 Thread shimi
The bigger the file, the longer it will take for it to be part of a
compaction again.
Compacting a bucket of large files takes longer than compacting a bucket of
small files.

Shimi

On Mon, Apr 4, 2011 at 3:58 PM, aaron morton aa...@thelastpickle.comwrote:

 mmm, interesting. My theory was

 t0 - major compaction runs, there is now one sstable
 t1 - x new sstables have been created
 t2 - minor compaction runs and determines there are two buckets, one with
 the x new sstables and one with the single big file. The bucket of many
 files is compacted into one, the bucket of one file is ignored.

 I can see that it takes longer for the big file to be involved in
 compaction again, and when it finally was it would take more time. But that
 minor compactions of new SSTables would still happen at the same rate,
 especially if they are created at the same rate as previously.

 Am I missing something or am I just reading the docs wrong ?

 Cheers
 Aaron


 On 4 Apr 2011, at 22:20, Jonathan Colby wrote:

 hi Aaron -

 The Datastax documentation brought to light the fact that over time, major
 compactions  will be performed on bigger and bigger SSTables.   They
 actually recommend against performing too many major compactions.  Which is
 why I am wary to trigger too many major compactions ...

 http://www.datastax.com/docs/0.7/operations/scheduled_tasks
 Performing Major Compaction
 (http://www.datastax.com/docs/0.7/operations/scheduled_tasks#performing-major-compaction)

 A major compaction process merges all SSTables for all column families in a
 keyspace – not just similar sized ones, as in minor compaction. Note that
 this may create extremely large SStables that result in long intervals
 before the next minor compaction (and a resulting increase in CPU usage for
 each minor compaction).

 Though a major compaction ultimately frees disk space used by accumulated
 SSTables, during runtime it can temporarily double disk space usage. It is
 best to run major compactions, if at all, at times of low demand on the
 cluster.






 On Apr 4, 2011, at 1:57 PM, aaron morton wrote:

 cleanup reads each SSTable on disk and writes a new file that contains the
 same data with the exception of rows that are no longer in a token range the
 node is a replica for. It's not compacting the files into fewer files or
 purging tombstones. But it is re-writing all the data for the CF.

 Part of the process will trigger GC if needed to free up disk space from
 SSTables no longer needed.

 AFAIK having fewer bigger files will not cause longer minor compactions.
 Compaction thresholds are applied per bucket of files that share a similar
 size, there is normally more smaller files and fewer larger files.

 Aaron

 On 2 Apr 2011, at 01:45, Jonathan Colby wrote:

 I discovered that a Garbage collection cleans up the unused old SSTables.
   But I still wonder whether cleanup really does a full compaction.  This
 would be undesirable if so.



 On Apr 1, 2011, at 4:08 PM, Jonathan Colby wrote:


 I ran node cleanup on a node in my cluster and discovered the disk usage
 went from 3.3 GB to 5.4 GB.  Why is this?


 I thought cleanup just removed hinted handoff information.   I read that
 *during* cleanup extra disk space will be used similar to a compaction.  But
 I was expecting the disk usage to go back down when it finished.


 I hope cleanup doesn't trigger a major compaction.  I'd rather not run
 major compactions because it means future minor compactions will take longer
 and use more CPU and disk.










index file contains a different key or row size

2011-04-04 Thread shimi
It makes sense to me that compaction should solve this as well, since
compaction creates new index files.
Am I missing something here?

WARN [CompactionExecutor:1] 2011-04-04 14:50:54,105 CompactionManager.java
(line 602) Row scrubbed successfully but index file contains a different key
or row size; consider rebuilding the index as described in
http://www.mail-archive.com/user@cassandra.apache.org/msg03325.html

Shimi


Re: urgent

2011-04-03 Thread shimi
How did you solve it?

On Sun, Apr 3, 2011 at 7:32 PM, Anurag Gujral anurag.guj...@gmail.comwrote:

 Now it is using all three disks. I want to understand why the recommended
 approach is to use
 one single large volume/directory and not multiple ones; can you please
 explain in detail?
 I am using SSDs; using three small ones is cheaper than using one large
 one.
 Please Suggest
 Thanks
 Anurag


 On Sun, Apr 3, 2011 at 7:31 AM, aaron morton aa...@thelastpickle.comwrote:

 Is this still a problem ? Are you getting errors on the server ?

 It should be choosing the directory with the most space.

 btw, the recommended approach is to use a single large volume/directory
 for the data.

 Aaron

 On 2 Apr 2011, at 01:56, Anurag Gujral wrote:

  Hi All,
    I have set up a Cassandra cluster with three data directories,
  but Cassandra is using only one of them and that disk is out of space.
  Why is Cassandra not using all three data directories?
 
  Plz Suggest.
 
  Thanks
  Anurag





Re: Exceptions on 0.7.0

2011-02-22 Thread shimi
I didn't solve it.
Since it is a test cluster I deleted all the data. I copied some sstables
from my production cluster and tried again; this time I didn't have this
problem.
I am planning on removing everything from this test cluster. I will start all
over again with 0.6.x, then I will load it with tens of GB of data (not an
sstable copy) and test the upgrade again.

I made the mistake of not backing up the data files before I upgraded.

Shimi

On Tue, Feb 22, 2011 at 2:24 PM, David Boxenhorn da...@lookin2.com wrote:

 Shimi,

 I am getting the same error that you report here. What did you do to solve
 it?

 David


 On Thu, Feb 10, 2011 at 2:54 PM, shimi shim...@gmail.com wrote:

 I upgraded the version on all the nodes but I still get the exceptions.
 I ran cleanup on one of the nodes but I don't think there is any cleanup
 going on.

 Another weird thing that I see is:
 INFO [CompactionExecutor:1] 2011-02-10 12:08:21,353
 CompactionIterator.java (line 135) Compacting large row
 333531353730363835363237353338383836383035363036393135323132383
 73630323034313a446f20322e384c20656e67696e657320686176652061646a75737461626c65206c696674657273
 (725849473109 bytes) incrementally

 In my production version the largest row is 10259. It shouldn't be
 different in this case.

 The first exception is being thrown on 3 nodes during compaction.
 The second exception (Internal error processing get_range_slices) is being
 thrown all the time by a fourth node. I disabled gossip and any client
 traffic to it and I still get the exceptions.
 Is it possible to boot a node with gossip disabled?

 Shimi

 On Thu, Feb 10, 2011 at 11:11 AM, aaron morton 
 aa...@thelastpickle.comwrote:

 I should be able to repair, install the new version and kick off nodetool
 repair .

 If you are uncertain search for cassandra-1992 on the list, there has
 been some discussion. You can also wait till some peeps in the states wake
 up if you want to be extra sure.

  The number is the number of columns the iterator is going to return from
 the row. I'm guessing that because this is happening during compaction, it
 asked for the maximum possible number of columns.

 Aaron



 On 10 Feb 2011, at 21:37, shimi wrote:

 On 10 Feb 2011, at 13:42, Dan Hendry wrote:

  Out of curiosity, do you really have on the order of 1,986,622,313
 elements (I believe elements=keys) in the cf?

 Dan

 No. I was too puzzled by the numbers


 On Thu, Feb 10, 2011 at 10:30 AM, aaron morton aa...@thelastpickle.com
  wrote:

 Shimi,
 You may be seeing the result of CASSANDRA-1992, are you able to test
 with the most recent 0.7 build ?
 https://hudson.apache.org/hudson/job/Cassandra-0.7/


 Aaron

 I will. I hope the data was not corrupted.



 On Thu, Feb 10, 2011 at 10:30 AM, aaron morton 
 aa...@thelastpickle.comwrote:

 Shimi,
 You may be seeing the result of CASSANDRA-1992, are you able to test
 with the most recent 0.7 build ?
 https://hudson.apache.org/hudson/job/Cassandra-0.7/


 Aaron

 On 10 Feb 2011, at 13:42, Dan Hendry wrote:

 Out of curiosity, do you really have on the order of 1,986,622,313
 elements (I believe elements=keys) in the cf?

 Dan

  *From:* shimi [mailto:shim...@gmail.com]
 *Sent:* February-09-11 15:06
 *To:* user@cassandra.apache.org
 *Subject:* Exceptions on 0.7.0

 I have a 4 node test cluster were I test the port to 0.7.0 from 0.6.X
 On 3 out of the 4 nodes I get exceptions in the log.
 I am using RP.
 Changes that I did:
 1. changed the replication factor from 3 to 4
 2. configured the nodes to use Dynamic Snitch
 3. RR of 0.33

 I run repair on 2 nodes  before I noticed the errors. One of them is
 having the first error and the other the second.
 I restart the nodes but I still get the exceptions.

 The following Exception I get from 2 nodes:
  WARN [CompactionExecutor:1] 2011-02-09 19:50:51,281 BloomFilter.java
 (line 84) Cannot provide an optimal Bloom
 Filter for 1986622313 elements (1/4 buckets per element).
 ERROR [CompactionExecutor:1] 2011-02-09 19:51:10,190
 AbstractCassandraDaemon.java (line 91) Fatal exception in
 thread Thread[CompactionExecutor:1,1,main]
 java.io.IOError: java.io.EOFException
 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:105)
 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:34)
 at
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 at
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 at
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 at
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131

EOFException: attempted to skip x bytes

2011-02-21 Thread shimi
)
at
org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
... 19 more

Shimi


Re: Exceptions on 0.7.0

2011-02-10 Thread shimi
On 10 Feb 2011, at 13:42, Dan Hendry wrote:

Out of curiosity, do you really have on the order of 1,986,622,313 elements
(I believe elements=keys) in the cf?

Dan

No. I was too puzzled by the numbers


On Thu, Feb 10, 2011 at 10:30 AM, aaron morton aa...@thelastpickle.com
 wrote:

 Shimi,
 You may be seeing the result of CASSANDRA-1992, are you able to test with
 the most recent 0.7 build ?
 https://hudson.apache.org/hudson/job/Cassandra-0.7/


 Aaron

I will. I hope the data was not corrupted.



On Thu, Feb 10, 2011 at 10:30 AM, aaron morton aa...@thelastpickle.comwrote:

 Shimi,
 You may be seeing the result of CASSANDRA-1992, are you able to test with
 the most recent 0.7 build ?
 https://hudson.apache.org/hudson/job/Cassandra-0.7/


 Aaron

 On 10 Feb 2011, at 13:42, Dan Hendry wrote:

 Out of curiosity, do you really have on the order of 1,986,622,313 elements
 (I believe elements=keys) in the cf?

 Dan

 *From:* shimi [mailto:shim...@gmail.com]
 *Sent:* February-09-11 15:06
 *To:* user@cassandra.apache.org
 *Subject:* Exceptions on 0.7.0

 I have a 4 node test cluster were I test the port to 0.7.0 from 0.6.X
 On 3 out of the 4 nodes I get exceptions in the log.
 I am using RP.
 Changes that I did:
 1. changed the replication factor from 3 to 4
 2. configured the nodes to use Dynamic Snitch
 3. RR of 0.33

 I run repair on 2 nodes  before I noticed the errors. One of them is having
 the first error and the other the second.
 I restart the nodes but I still get the exceptions.

 The following Exception I get from 2 nodes:
  WARN [CompactionExecutor:1] 2011-02-09 19:50:51,281 BloomFilter.java (line
 84) Cannot provide an optimal Bloom
 Filter for 1986622313 elements (1/4 buckets per element).
 ERROR [CompactionExecutor:1] 2011-02-09 19:51:10,190
 AbstractCassandraDaemon.java (line 91) Fatal exception in
 thread Thread[CompactionExecutor:1,1,main]
 java.io.IOError: java.io.EOFException
 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:105)
 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:34)
 at
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 at
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 at
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 at
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 at
 com.google.common.collect.Iterators$7.computeNext(Iterators.java:604)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 at
 org.apache.cassandra.db.ColumnIndexer.serializeInternal(ColumnIndexer.java:76)
 at
 org.apache.cassandra.db.ColumnIndexer.serialize(ColumnIndexer.java:50)
 at
 org.apache.cassandra.io.LazilyCompactedRow.<init>(LazilyCompactedRow.java:88)
 at
 org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:136)
 at
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107)
 at
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42)
 at
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 at
 org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
 at
 org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
 at
 org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
 at
 org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
 at
 org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.EOFException
 at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
 at
 org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:280)
 at
 org.apache.cassandra.db.ColumnSerializer.deserialize

Re: Exceptions on 0.7.0

2011-02-10 Thread shimi
I upgraded the version on all the nodes but I still get the exceptions.
I ran cleanup on one of the nodes but I don't think there is any cleanup
going on.

Another weird thing that I see is:
INFO [CompactionExecutor:1] 2011-02-10 12:08:21,353 CompactionIterator.java
(line 135) Compacting large row
333531353730363835363237353338383836383035363036393135323132383
73630323034313a446f20322e384c20656e67696e657320686176652061646a75737461626c65206c696674657273
(725849473109 bytes) incrementally

In my production version the largest row is 10259. It shouldn't be different
in this case.

The first exception is being thrown on 3 nodes during compaction.
The second exception (Internal error processing get_range_slices) is being
thrown all the time by a fourth node. I disabled gossip and any client
traffic to it and I still get the exceptions.
Is it possible to boot a node with gossip disabled?

Shimi

On Thu, Feb 10, 2011 at 11:11 AM, aaron morton aa...@thelastpickle.comwrote:

 I should be able to repair, install the new version and kick off nodetool
 repair .

 If you are uncertain search for cassandra-1992 on the list, there has been
 some discussion. You can also wait till some peeps in the states wake up if
 you want to be extra sure.

  The number is the number of columns the iterator is going to return from
 the row. I'm guessing that because this is happening during compaction, it
 asked for the maximum possible number of columns.

 Aaron



 On 10 Feb 2011, at 21:37, shimi wrote:

 On 10 Feb 2011, at 13:42, Dan Hendry wrote:

  Out of curiosity, do you really have on the order of 1,986,622,313
 elements (I believe elements=keys) in the cf?

 Dan

 No. I was too puzzled by the numbers


 On Thu, Feb 10, 2011 at 10:30 AM, aaron morton aa...@thelastpickle.com
  wrote:

 Shimi,
 You may be seeing the result of CASSANDRA-1992, are you able to test with
 the most recent 0.7 build ?
 https://hudson.apache.org/hudson/job/Cassandra-0.7/


 Aaron

 I will. I hope the data was not corrupted.



 On Thu, Feb 10, 2011 at 10:30 AM, aaron morton aa...@thelastpickle.comwrote:

 Shimi,
 You may be seeing the result of CASSANDRA-1992, are you able to test with
 the most recent 0.7 build ?
 https://hudson.apache.org/hudson/job/Cassandra-0.7/


 Aaron

 On 10 Feb 2011, at 13:42, Dan Hendry wrote:

 Out of curiosity, do you really have on the order of 1,986,622,313
 elements (I believe elements=keys) in the cf?

 Dan

  *From:* shimi [mailto:shim...@gmail.com]
 *Sent:* February-09-11 15:06
 *To:* user@cassandra.apache.org
 *Subject:* Exceptions on 0.7.0

 I have a 4 node test cluster were I test the port to 0.7.0 from 0.6.X
 On 3 out of the 4 nodes I get exceptions in the log.
 I am using RP.
 Changes that I did:
 1. changed the replication factor from 3 to 4
 2. configured the nodes to use Dynamic Snitch
 3. RR of 0.33

 I run repair on 2 nodes  before I noticed the errors. One of them is
 having the first error and the other the second.
 I restart the nodes but I still get the exceptions.

 The following Exception I get from 2 nodes:
  WARN [CompactionExecutor:1] 2011-02-09 19:50:51,281 BloomFilter.java
 (line 84) Cannot provide an optimal Bloom
 Filter for 1986622313 elements (1/4 buckets per element).
 ERROR [CompactionExecutor:1] 2011-02-09 19:51:10,190
 AbstractCassandraDaemon.java (line 91) Fatal exception in
 thread Thread[CompactionExecutor:1,1,main]
 java.io.IOError: java.io.EOFException
 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:105)
 at
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:34)
 at
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 at
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 at
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 at
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 at
 com.google.common.collect.Iterators$7.computeNext(Iterators.java:604)
 at
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 at
 org.apache.cassandra.db.ColumnIndexer.serializeInternal(ColumnIndexer.java:76)
 at
 org.apache.cassandra.db.ColumnIndexer.serialize(ColumnIndexer.java:50)
 at
  org.apache.cassandra.io.LazilyCompactedRow.<init>(LazilyCompactedRow.java:88)
 at
 org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:136)
 at
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107

Exceptions on 0.7.0

2011-02-09 Thread shimi
(CollatingIterator.java:217)
at
org.apache.cassandra.db.RowIteratorFactory$3.getReduced(RowIteratorFactory.java:136)
at
org.apache.cassandra.db.RowIteratorFactory$3.getReduced(RowIteratorFactory.java:106)
at
org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
at
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1294)
at
org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:438)
at
org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:473)
at
org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:2868)
at
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
at
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.EOFException
at java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
at
org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:280)
at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
at
org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
at
org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:78)
... 21 more

any idea what went wrong?
Shimi


Re: Do you have a site in production environment with Cassandra? What client do you use?

2011-01-15 Thread shimi
Same here, Hector with Java.

Shimi

On Fri, Jan 14, 2011 at 9:13 PM, Dan Kuebrich dan.kuebr...@gmail.comwrote:

 We've done hundreds of gigs in and out of cassandra 0.6.8 with pycassa 0.3.
  Working on upgrading to 0.7 and pycassa 1.03.

 I don't know if we're using it wrong, but the constraint that the connection
 object is tied to a particular keyspace isn't that awesome--we have a number
 of keyspaces used simultaneously. Haven't looked into it yet.


 On Fri, Jan 14, 2011 at 1:52 PM, Mike Wynholds m...@carbonfive.comwrote:

 We have one in production with Ruby / fauna Cassandra gem and Cassandra
 0.6.x.  The project is live but is stuck in a sort of private beta, so it
 hasn't really been run through any load scenarios.

 ..mike..

 --
 Michael Wynholds | Carbon Five | 310.821.7125 x13 | m...@carbonfive.com



 On Fri, Jan 14, 2011 at 9:24 AM, Ertio Lew ertio...@gmail.com wrote:

 Hey,

 If you have a site in production environment or considering so, what
 is the client that you use to interact with Cassandra. I know that
 there are several clients available out there according to the
 language you use but I would love to know what clients are being used
 widely in production environments and are best to work with(support
 most required features for performance).

 Also preferably tell about the technology stack for your applications.

 Any suggestions, comments appreciated ?

 Thanks
 Ertio






Re: Reclaim deleted rows space

2011-01-06 Thread shimi
Am I missing something here? It is already possible to trigger major
compaction on a specific CF.

On Thu, Jan 6, 2011 at 4:50 AM, Tyler Hobbs ty...@riptano.com wrote:

 Although it's not exactly the ability to list specific SSTables, the
 ability to only compact specific CFs will be in upcoming releases:

 https://issues.apache.org/jira/browse/CASSANDRA-1812

 - Tyler


 On Wed, Jan 5, 2011 at 7:46 PM, Edward Capriolo edlinuxg...@gmail.comwrote:

 On Wed, Jan 5, 2011 at 4:31 PM, Jonathan Ellis jbel...@gmail.com wrote:
  Pretty sure there's logic in there that says don't bother compacting
  a single sstable.
 
  On Wed, Jan 5, 2011 at 2:26 PM, shimi shim...@gmail.com wrote:
  How does minor compaction is triggered? Is it triggered Only when a new
  SStable is added?
 
  I was wondering if triggering a compaction
 with minimumCompactionThreshold
  set to 1 would be useful. If this can happen I assume it will do
 compaction
  on files with similar size and remove deleted rows on the rest.
  Shimi
  On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller 
 peter.schul...@infidyne.com
  wrote:
 
   I don't have a problem with disk space. I have a problem with the
 data
   size.
 
  [snip]
 
   Bottom line is that I want to reduce the number of requests that
 goes to
   disk. Since there is enough data that is no longer valid I can do it
 by
   reclaiming the space. The only way to do it is by running Major
   compaction.
   I can wait and let Cassandra do it for me but then the data size
 will
   get
   even bigger and the response time will be worst. I can do it
 manually
   but I
   prefer it to happen in the background with less impact on the system
 
  Ok - that makes perfect sense then. Sorry for misunderstanding :)
 
  So essentially, for workloads that are teetering on the edge of cache
  warmness and is subject to significant overwrites or removals, it may
  be beneficial to perform much more aggressive background compaction
  even though it might waste lots of CPU, to keep the in-memory working
  set down.
 
  There was talk (I think in the compaction redesign ticket) about
  potentially improving the use of bloom filters such that obsolete data
  in sstables could be eliminated from the read set without
  necessitating actual compaction; that might help address cases like
  these too.
 
  I don't think there's a pre-existing silver bullet in a current
  release; you probably have to live with the need for
  greater-than-theoretically-optimal memory requirements to keep the
  working set in memory.
 
  --
  / Peter Schuller
 
 
 
 
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder of Riptano, the source for professional Cassandra support
  http://riptano.com
 

 I was wondering if it made sense to have a JMX operation that can
 compact a list of tables by file name. This opens it up for power
 users to have more options than compacting the entire keyspace.





Re: Reclaim deleted rows space

2011-01-06 Thread shimi
On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Pretty sure there's logic in there that says don't bother compacting
 a single sstable.

No. You can do it.
Based on the log I have a feeling that it triggers an infinite compaction
loop.



 On Wed, Jan 5, 2011 at 2:26 PM, shimi shim...@gmail.com wrote:
  How does minor compaction is triggered? Is it triggered Only when a new
  SStable is added?
 
  I was wondering if triggering a compaction
 with minimumCompactionThreshold
  set to 1 would be useful. If this can happen I assume it will do
 compaction
  on files with similar size and remove deleted rows on the rest.
  Shimi
  On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller 
 peter.schul...@infidyne.com
  wrote:
 
   I don't have a problem with disk space. I have a problem with the data
   size.
 
  [snip]
 
   Bottom line is that I want to reduce the number of requests that goes
 to
   disk. Since there is enough data that is no longer valid I can do it
 by
   reclaiming the space. The only way to do it is by running Major
   compaction.
   I can wait and let Cassandra do it for me but then the data size will
   get
   even bigger and the response time will be worst. I can do it manually
   but I
   prefer it to happen in the background with less impact on the system
 
  Ok - that makes perfect sense then. Sorry for misunderstanding :)
 
  So essentially, for workloads that are teetering on the edge of cache
  warmness and is subject to significant overwrites or removals, it may
  be beneficial to perform much more aggressive background compaction
  even though it might waste lots of CPU, to keep the in-memory working
  set down.
 
  There was talk (I think in the compaction redesign ticket) about
  potentially improving the use of bloom filters such that obsolete data
  in sstables could be eliminated from the read set without
  necessitating actual compaction; that might help address cases like
  these too.
 
  I don't think there's a pre-existing silver bullet in a current
  release; you probably have to live with the need for
  greater-than-theoretically-optimal memory requirements to keep the
  working set in memory.
 
  --
  / Peter Schuller
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com



Re: Reclaim deleted rows space

2011-01-06 Thread shimi
According to the code it makes sense.
submitMinorIfNeeded() calls doCompaction(), which calls
submitMinorIfNeeded().
With minimumCompactionThreshold = 1, submitMinorIfNeeded() will always run
compaction.
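A simplified sketch of the loop being described, with hypothetical method
bodies standing in for the real Cassandra code (this only illustrates the
shape of the recursion, it is not the actual source):

public class CompactionLoopSketch {

    static final int minimumCompactionThreshold = 1;

    // With a threshold of 1, any bucket with at least one sstable qualifies...
    static void submitMinorIfNeeded() {
        if (sstablesInBucket() >= minimumCompactionThreshold) {
            doCompaction();
        }
    }

    // ...and every compaction writes a new sstable and re-checks, so the two
    // methods keep calling each other and the loop never terminates.
    static void doCompaction() {
        rewriteSstables();
        submitMinorIfNeeded();
    }

    static int sstablesInBucket() { return 1; }   // hypothetical stand-in
    static void rewriteSstables() { }             // hypothetical stand-in

    public static void main(String[] args) {
        submitMinorIfNeeded(); // in this sketch, recurses without end
    }
}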

Shimi

On Thu, Jan 6, 2011 at 10:26 AM, shimi shim...@gmail.com wrote:



 On Wed, Jan 5, 2011 at 11:31 PM, Jonathan Ellis jbel...@gmail.com wrote:

 Pretty sure there's logic in there that says don't bother compacting
 a single sstable.

 No. You can do it.
 Based on the log I have a feeling that it triggers an infinite compaction
 loop.



  On Wed, Jan 5, 2011 at 2:26 PM, shimi shim...@gmail.com wrote:
  How does minor compaction is triggered? Is it triggered Only when a new
  SStable is added?
 
  I was wondering if triggering a compaction
 with minimumCompactionThreshold
  set to 1 would be useful. If this can happen I assume it will do
 compaction
  on files with similar size and remove deleted rows on the rest.
  Shimi
  On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller 
 peter.schul...@infidyne.com
  wrote:
 
   I don't have a problem with disk space. I have a problem with the
 data
   size.
 
  [snip]
 
   Bottom line is that I want to reduce the number of requests that goes
 to
   disk. Since there is enough data that is no longer valid I can do it
 by
   reclaiming the space. The only way to do it is by running Major
   compaction.
   I can wait and let Cassandra do it for me but then the data size will
   get
   even bigger and the response time will be worst. I can do it manually
   but I
   prefer it to happen in the background with less impact on the system
 
  Ok - that makes perfect sense then. Sorry for misunderstanding :)
 
  So essentially, for workloads that are teetering on the edge of cache
  warmness and is subject to significant overwrites or removals, it may
  be beneficial to perform much more aggressive background compaction
  even though it might waste lots of CPU, to keep the in-memory working
  set down.
 
  There was talk (I think in the compaction redesign ticket) about
  potentially improving the use of bloom filters such that obsolete data
  in sstables could be eliminated from the read set without
  necessitating actual compaction; that might help address cases like
  these too.
 
  I don't think there's a pre-existing silver bullet in a current
  release; you probably have to live with the need for
  greater-than-theoretically-optimal memory requirements to keep the
  working set in memory.
 
  --
  / Peter Schuller
 
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com





Re: maven cassandra plugin

2011-01-06 Thread shimi
I use Capistrano for installs, upgrades, start, stop and restart.
I use it for other projects as well.
It is very useful for automated tasks that need to run on multiple machines.

Shimi

On 2011 1 6 21:38, B. Todd Burruss bburr...@real.com wrote:

has anyone created a maven plugin, like cargo for tomcat, for automating
starting/stopping a cassandra instance?


Re: Reclaim deleted rows space

2011-01-05 Thread shimi
How is minor compaction triggered? Is it triggered only when a new
SSTable is added?

I was wondering if triggering a compaction with minimumCompactionThreshold
set to 1 would be useful. If this can happen, I assume it will do compaction
on files with similar size and remove deleted rows from the rest.

Shimi

On Tue, Jan 4, 2011 at 9:56 PM, Peter Schuller
peter.schul...@infidyne.comwrote:

  I don't have a problem with disk space. I have a problem with the data
  size.

 [snip]

  Bottom line is that I want to reduce the number of requests that goes to
  disk. Since there is enough data that is no longer valid I can do it by
  reclaiming the space. The only way to do it is by running Major
 compaction.
  I can wait and let Cassandra do it for me but then the data size will get
  even bigger and the response time will be worst. I can do it manually but
 I
  prefer it to happen in the background with less impact on the system

 Ok - that makes perfect sense then. Sorry for misunderstanding :)

 So essentially, for workloads that are teetering on the edge of cache
 warmness and is subject to significant overwrites or removals, it may
 be beneficial to perform much more aggressive background compaction
 even though it might waste lots of CPU, to keep the in-memory working
 set down.

 There was talk (I think in the compaction redesign ticket) about
 potentially improving the use of bloom filters such that obsolete data
 in sstables could be eliminated from the read set without
 necessitating actual compaction; that might help address cases like
 these too.

 I don't think there's a pre-existing silver bullet in a current
 release; you probably have to live with the need for
 greater-than-theoretically-optimal memory requirements to keep the
 working set in memory.

 --
 / Peter Schuller



Re: Bootstrapping taking long

2011-01-04 Thread shimi
In my experience, most of the time it takes for a node to join the cluster is
the anticompaction on the other nodes. The streaming part is very fast.
Check the other nodes' logs to see if any node is doing anticompaction.
I don't remember how much data I had in the cluster when I needed to
add/remove nodes. I do remember that it took a few hours.

The node will join the ring only when it will finish the bootstrap.

Shimi


On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory ran...@gmail.com wrote:

 I asked the same question on the IRC but no luck there, everyone's asleep
 ;)...

 Using 0.6.6 I'm adding a new node to the cluster.
 It starts out fine but then gets stuck on the bootstrapping state for too
 long. More than an hour and still counting.

 $ bin/nodetool -p 9004 -h localhost streams
 Mode: Bootstrapping
 Not sending any streams.
 Not receiving any streams.


 It seemed to have streamed data from other nodes and indeed the load is
 non-zero but I'm not clear what's keeping it right now from finishing.

 $ bin/nodetool -p 9004 -h localhost info
 51042355038140769519506191114765231716
 Load : 22.49 GB
 Generation No: 1294133781
 Uptime (seconds) : 1795
 Heap Memory (MB) : 315.31 / 6117.00


 nodetool ring does not list this new node in the ring, although nodetool
 can happily talk to the new node, it's just not listing itself as a member
 of the ring. This is expected when the node is still bootstrapping, so the
 question is still how long might the bootstrap take and whether it is stuck.

 The data isn't huge so I find it hard to believe that streaming or anti
 compaction are the bottlenecks. I have ~20G on each node and the new node
 already has just about that so it seems that all data had already been
 streamed to it successfully, or at least most of the data... So what is it
 waiting for now? (same question, rephrased... ;)

 I tried:
 1. Restarting the new node. No good. All logs seem normal but at the end
 the node is still in bootstrap mode.
 2. As someone suggested I increased the rpc timeout from 10k to 30k
 (RpcTimeoutInMillis) but that didn't seem to help. I did this only on the
 new node. Should I have done that on all (old) nodes as well? Or maybe only
 on the ones that were supposed to stream data to that node.
 3. Logging level at DEBUG now but nothing interesting going on except
 for occasional messages such as [1] or [2]

 So the question is: what's keeping the new node from finishing the
 bootstrap and how can I check its status?
 Thanks

 [1] DEBUG [Timer-1] 2011-01-04 05:21:24,402 LoadDisseminator.java (line 36)
 Disseminating load info ...
 [2] DEBUG [RMI TCP Connection(22)-192.168.252.88] 2011-01-04 05:12:48,033
 StorageService.java (line 1189) computing ranges for
 28356863910078205288614550619314017621,
 56713727820156410577229101238628035242,
  85070591730234615865843651857942052863,
 113427455640312821154458202477256070484,
 141784319550391026443072753096570088105,
 170141183460469231731687303715884105727

 --
 /Ran




Re: Reclaim deleted rows space

2011-01-04 Thread shimi
I think I didn't make myself clear.
I don't have a problem with disk space. I have a problem with the data
size.
I have a simple CRUD application. Most of the requests are reads, but there
are updates/deletes, and over time the number of deleted rows gets big
enough to make it worth freeing the disk space (a matter of days, not hours).
Since not all of the data fits in RAM (and I have a lot of RAM), the rest
is served from disk. Disk is slow, so I want to reduce the number of requests
that go to disk as much as possible: the more requests hit the disk, the
longer the disk wait time gets and the more time it takes to return a response.

Bottom line is that I want to reduce the number of requests that go to
disk. Since there is enough data that is no longer valid, I can do that by
reclaiming the space. The only way to do it is by running a major compaction.
I can wait and let Cassandra do it for me, but then the data size will get
even bigger and the response time will be worse. I can do it manually, but I
prefer it to happen in the background with less impact on the system.

Shimi


On Tue, Jan 4, 2011 at 2:33 PM, Peter Schuller
peter.schul...@infidyne.comwrote:

  This is what I thought. I was wishing there might be another way to
 reclaim
  the space.

 Be sure you really need this first :) Normally you just let it happen in
 the bg.

  The problem is that the more data you have, the more time it will take
  Cassandra to respond.

 Relative to what though? There are definitely important side-effects
 of having very large data sets, and part of that involves compactions,
 but in a normal steady state type of system you should never be in the
 position to wait for a major compaction to run. Compactions are
 intended to run every now and then in the background; they result in
 variations in disk space within certain bounds, which is expected.

 Certainly the situation can be improved and the current disk space
 utilization situation is not perfect, but the above suggests to me
 that you're trying to do something that is not really intended to be
 done.

  Reclaiming the space of deleted rows in the biggest SSTable requires a major
  compaction. This compaction can be triggered by adding 2x the data (or 4x the
  data in the default configuration) to the system, or by executing it manually
  using JMX.

 You can indeed choose to trigger major compactions by e.g. cron jobs.
 But just be aware that if you're operating under conditions where you
 are close to disk space running out, you have other concerns too -
 such as periodic repair operations also needing disk space.

 Also, suppose you're overwriting lots of data (or replacing by
 deleting and adding other data). It is not necessarily true that you
 need 4x the space relative to what you otherwise do just because of
 the compaction threshold.

 Keep in mind that compactions already need extra space anyway. If
 you're *not* overwriting or adding data, a compaction of a single CF
 is expected to need up to twice the amount of space that it occupies.
 If you're doing more overwrites and deletions though, as you point out
 you will have more dead data at any given point in time. But on the
 other hand, the peak disk space usage during compactions is lower. So
 the actual peak disk space usage (which is what matters since you must
 have this much disk space) is actually helped by the
 deletions/overwrites too.
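 For example, with round illustrative numbers: a 100 GB column family with no
 dead data can need up to another ~100 GB at the peak of a major compaction
 (about 200 GB total), while the same CF carrying 30 GB of obsolete rows
 compacts down to roughly 70 GB of live data, so the peak is closer to
 100 + 70 = 170 GB even though its steady-state footprint is no smaller.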

 Further, suppose you trigger major compactions more often. That means
 each compaction will have a higher relative spike of disk usage
 because less data has had time to be overwritten or removed.

 So in a sense, it's like the disk space demands is being moved between
 the category of dead data retained for longer than necessary and
 peak disk usage during compaction.

 Also keep in mind that the *low* peak of disk space usage is not
 subject to any fragmentation concerns. Depending on the size of your
 data compared to e.g. column names, that disk space usage might be
 significantly lower than what you would get with an in-place updating
 database. There are lots of trade-offs :)

 You say you have to wait for deletions though which sounds like
 you're doing something unusual. Are you doing stuff like deleting lots
 of data in bulk from one CF, only to then write data to *another* CF?
 Such that you're actually having to wait for disk space to be freed to
 make room for data somewhere else?

  For a system that deletes data regularly and needs to serve
  customers all day with response times in milliseconds, this is a
  problem.

 Not in general. I am afraid there may be some misunderstanding here.
 Unless disk space is a problem for you (i.e., you're running out of
 space), there is no need to wait for compactions. And certainly
 whether you can serve traffic 24/7 at low-ms latencies is an important
 consideration, and does become complex when disk I/O is involved, but
 it is not about disk *space*. If you have important performance

Re: Bootstrapping taking long

2011-01-04 Thread shimi
You will have something new to talk about in your talk tomorrow :)

You said that the anticompaction was only on a single node? I think that
your new node should get data from at least two other nodes (depending on
the replication factor). Maybe the problem is not in the new node.
In old versions (I think prior to 0.6.3) there was a case of stuck bootstrap
that required restarting the new node and the nodes which were supposed to
stream data to it. As far as I remember that issue was resolved; I haven't
seen this problem since then.

Shimi

On Tue, Jan 4, 2011 at 3:01 PM, Ran Tavory ran...@gmail.com wrote:

 Running nodetool decommission didn't help. Actually the node refused to
 decommission itself (b/c it wasn't part of the ring). So I simply stopped
 the process, deleted all the data directories and started it again. It
 worked in the sense that the node bootstrapped again, but as before, after it
 had finished moving the data nothing happened for a long time (I'm still
 waiting, but nothing seems to be happening).

 Any hints on how to analyze a stuck bootstrapping node?
 thanks

 On Tue, Jan 4, 2011 at 1:51 PM, Ran Tavory ran...@gmail.com wrote:

 Thanks Shimi, so indeed anticompaction was run on one of the other nodes
 from the same DC, but to my understanding it has already ended. A few hours
 ago...
 I see plenty of log messages such as [1], which ended a couple of hours ago,
 and I've seen the new node streaming and accepting the data from the node
 which performed the anticompaction, and so far it was normal, so it seemed
 that the data is in its right place. But now the new node seems sort of stuck.
 None of the other nodes is anticompacting right now or had been
 anticompacting since then.
 The new node's CPU is close to zero and its iostats are almost zero, so I
 can't find another bottleneck that would keep it hanging.

 On the IRC someone suggested I'd maybe retry to join this node,
 e.g. decommission and rejoin it again. I'll try it now...


 [1]
  INFO [COMPACTION-POOL:1] 2011-01-04 04:04:09,721 CompactionManager.java
 (line 338) AntiCompacting
 [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]
  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:18,683 CompactionManager.java
 (line 338) AntiCompacting
 [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3874-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3873-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvImpressions-3876-Data.db')]
  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:19,132 CompactionManager.java
 (line 338) AntiCompacting
 [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-951-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-976-Data.db'),org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvRatings-978-Data.db')]
  INFO [COMPACTION-POOL:1] 2011-01-04 04:34:26,486 CompactionManager.java
 (line 338) AntiCompacting
 [org.apache.cassandra.io.SSTableReader(path='/outbrain/cassandra/data/outbrain_kvdb/KvAds-6449-Data.db')]

 On Tue, Jan 4, 2011 at 12:45 PM, shimi shim...@gmail.com wrote:

 In my experience, most of the time it takes for a node to join the cluster
 is spent on the anticompaction on the other nodes. The streaming part is very
 fast. Check the other nodes' logs to see if any node is doing
 anticompaction.
 I don't remember how much data I had in the cluster when I needed to
 add/remove nodes. I do remember that it took a few hours.

 The node will join the ring only when it finishes the bootstrap.

 Shimi


 On Tue, Jan 4, 2011 at 12:28 PM, Ran Tavory ran...@gmail.com wrote:

 I asked the same question on the IRC but no luck there, everyone's
 asleep ;)...

 Using 0.6.6 I'm adding a new node to the cluster.
 It starts out fine but then gets stuck in the bootstrapping state for
 too long. More than an hour and still counting.

 $ bin/nodetool -p 9004 -h localhost streams
 Mode: Bootstrapping
 Not sending any streams.
 Not receiving any streams.


 It seemed to have streamed data from other nodes and indeed the load is
 non-zero but I'm not clear what's keeping it right now from finishing.

 $ bin/nodetool -p 9004 -h localhost info
 51042355038140769519506191114765231716
 Load : 22.49 GB
 Generation No: 1294133781
 Uptime (seconds) : 1795
 Heap Memory (MB) : 315.31 / 6117.00


 nodetool ring does not list this new node in the ring, although nodetool
 can happily talk to the new node, it's just not listing itself as a member
 of the ring. This is expected when the node is still bootstrapping, so the
 question is still how long might the bootstrap take and whether it is stuck.

 The data isn't huge so I find it hard to believe that streaming or anti
 compaction are the bottlenecks. I have ~20G on each node

Reclaim deleted rows space

2011-01-02 Thread shimi
Let's assume I have:
* a single 100GB SSTable file
* min compaction threshold set to 2

If I delete rows which are located in this file, is the only way to clean
up the deleted rows to insert another 100GB of data or to trigger a
painful major compaction?

Shimi
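
A major compaction can be triggered by hand without writing more data, either with nodetool (the compact command, where available) or over JMX. Below is a minimal Java sketch; the StorageService MBean name and the forceTableCompaction operation are assumptions from memory, so verify them in jconsole against your Cassandra version:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ForceMajorCompaction {
    public static void main(String[] args) throws Exception {
        // Host, port and keyspace name are placeholders.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Assumed ObjectName and operation name -- check jconsole to confirm.
            ObjectName ss = new ObjectName("org.apache.cassandra.db:type=StorageService");
            // Invoke the operation that forces a major compaction of a keyspace.
            mbs.invoke(ss, "forceTableCompaction",
                    new Object[] { "MyKeyspace" },
                    new String[] { "java.lang.String" });
        } finally {
            connector.close();
        }
    }
}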


iterate over all the rows with RP

2010-12-12 Thread shimi
Is the same connection required when iterating over all the rows with the
Random Partitioner, or is it possible to use a different connection for each
iteration?

Shimi


Re: iterate over all the rows with RP

2010-12-12 Thread shimi
So if I use a different connection (Thrift via Hector), will I get the
same results? It makes sense when you use OPP, and I assume it is the same
with RP. I just wanted to make sure this is the case and that there is no
state being kept.

Shimi

On Sun, Dec 12, 2010 at 8:14 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  Is the same connection required when iterating over all the rows with the
  Random Partitioner, or is it possible to use a different connection for
  each iteration?

 In general, the choice of RPC connection (I assume you mean the
 underlying thrift connection) does not affect the semantics of the RPC
 calls.

 --
 / Peter Schuller



FatClient Gossip error and some other problems

2010-09-20 Thread shimi
)
at java.util.TimerThread.run(Timer.java:462)
 INFO [GMFD:1] 2010-09-20 13:56:43,251 Gossiper.java (line 586) Node
/X.X.X.X is now part of the cluster

Does anyone have any idea how can I cleanup the problematic node?
Does anyone have any idea how can I get rid of the Gossip error?

Shimi


Re: FatClient Gossip error and some other problems

2010-09-20 Thread shimi
I was patient (although it is hard when you have millions of requests which
are not served in time). I was waiting for a long time. There was nothing in
the logs or in JMX.

Shimi

On Mon, Sep 20, 2010 at 6:12 PM, Gary Dusbabek gdusba...@gmail.com wrote:

 On Mon, Sep 20, 2010 at 09:51, shimi shim...@gmail.com wrote:
  I have a cluster with 6 nodes on 2 datacenters (3 on each datacenter).
  I replaced all of the servers in the cluster (0.6.4) with new ones
 (0.6.5).
  My old cluster was unbalanced since I was using Random Partitioner and I
  bootstrapped all the nodes without specifying their tokens.
 
  Since I wanted the cluster to be balanced, I first added all the new
  nodes one after the other (with the right tokens this time) and then I
  ran decommission on all the old ones, one after the other.
  One of the decommissioned nodes began throwing too many open files errors
  while it was decommissioning, taking other nodes down with it. After the
  second try I decided to stop it and run removetoken on its token from one
  of the other nodes. After that everything went well, except that in the
  end one of the nodes looked unbalanced.
 
  I decided to run repair on the cluster. What I got is totally unbalanced
  nodes with way too much data compared to what there is supposed to be:
  each node had 2x-4x more data.
  I ran cleanup and all of them, except the one which was unbalanced to
  begin with, got back to the size they were supposed to be.
  Now whenever I try to run cleanup on this node I get:
 
   INFO [COMPACTION-POOL:1] 2010-09-20 12:04:23,069 CompactionManager.java
  (line 339) AntiCompacting ...
   INFO [GC inspection] 2010-09-20 12:05:37,600 GCInspector.java (line 129)
 GC
  for ConcurrentMarkSweep: 1525 ms, 13641032 reclaimed leaving 767863520
 used;
  max is 6552551424
   INFO [GC inspection] 2010-09-20 12:05:37,601 GCInspector.java (line 150)
  Pool Name                    Active   Pending
   INFO [GC inspection] 2010-09-20 12:05:37,605 GCInspector.java (line 156)
  STREAM-STAGE  0 0
   INFO [GC inspection] 2010-09-20 12:05:37,605 GCInspector.java (line 156)
  RESPONSE-STAGE0 0
   INFO [GC inspection] 2010-09-20 12:05:37,606 GCInspector.java (line 156)
  ROW-READ-STAGE8   717
   INFO [GC inspection] 2010-09-20 12:05:37,607 GCInspector.java (line 156)
  LB-OPERATIONS 0 0
   INFO [GC inspection] 2010-09-20 12:05:37,607 GCInspector.java (line 156)
  MISCELLANEOUS-POOL0 0
   INFO [GC inspection] 2010-09-20 12:05:37,607 GCInspector.java (line 156)
  GMFD  0 2
   INFO [GC inspection] 2010-09-20 12:05:37,608 GCInspector.java (line 156)
  CONSISTENCY-MANAGER   0 1
   INFO [GC inspection] 2010-09-20 12:05:37,608 GCInspector.java (line 156)
  LB-TARGET 0 0
   INFO [GC inspection] 2010-09-20 12:05:37,609 GCInspector.java (line 156)
  ROW-MUTATION-STAGE0 0
   INFO [GC inspection] 2010-09-20 12:05:37,610 GCInspector.java (line 156)
  MESSAGE-STREAMING-POOL0 0
   INFO [GC inspection] 2010-09-20 12:05:37,610 GCInspector.java (line 156)
  LOAD-BALANCER-STAGE   0 0
   INFO [GC inspection] 2010-09-20 12:05:37,611 GCInspector.java (line 156)
  FLUSH-SORTER-POOL 0 0
   INFO [GC inspection] 2010-09-20 12:05:37,612 GCInspector.java (line 156)
  MEMTABLE-POST-FLUSHER 0 0
   INFO [GC inspection] 2010-09-20 12:05:37,612 GCInspector.java (line 156)
  AE-SERVICE-STAGE  0 0
   INFO [GC inspection] 2010-09-20 12:05:37,613 GCInspector.java (line 156)
  FLUSH-WRITER-POOL 0 0
   INFO [GC inspection] 2010-09-20 12:05:37,613 GCInspector.java (line 156)
  HINTED-HANDOFF-POOL   0 0
   INFO [GC inspection] 2010-09-20 12:05:37,616 GCInspector.java (line 161)
  CompactionManager   n/a 0
   INFO [SSTABLE-CLEANUP-TIMER] 2010-09-20 12:05:40,402
  SSTableDeletingReference.java (line 104) Deleted ...
   INFO [SSTABLE-CLEANUP-TIMER] 2010-09-20 12:05:40,727
  SSTableDeletingReference.java (line 104) Deleted ...
   INFO [SSTABLE-CLEANUP-TIMER] 2010-09-20 12:05:40,730
  SSTableDeletingReference.java (line 104) Deleted ...
   INFO [SSTABLE-CLEANUP-TIMER] 2010-09-20 12:05:40,735
  SSTableDeletingReference.java (line 104) Deleted ...
 
  and after that I saw an increase in the node response time and the number
  of ROW-READ-STAGE pending tasks. Since there was no indication that
  something was wrong or that the node was doing anything (logs, nodetool
  and JMX), the only thing that I could do was to restart the server.
 
  I don't know if this is related but every hour I see this error (I think
 it
  is the IP of the machine that I couldn't decommission properly):
 
   INFO [Timer-0] 2010-09-20 13:56:11,406 Gossiper.java (line 402

Re: Bootstrap question

2010-07-18 Thread shimi
If I have problems with a never-ending bootstrap I do the following. I try
each step; if it doesn't help, I try the next. It might not be the right thing
to do, but it worked for me.

1. Restart the bootstrapping node
2. If I see streaming 0/ I restart the node and all the streaming nodes
3. Restart all the nodes
4. If there is data in the bootstrapping node I delete it before I restart.

Good luck
Shimi

On Sun, Jul 18, 2010 at 12:21 AM, Anthony Molinaro 
antho...@alumni.caltech.edu wrote:

 So still waiting for any sort of answer on this one.  The cluster still
 refuses to do anything when I bring up new nodes.  I shut down all the
 new nodes and am waiting.  I'm guessing that maybe the old nodes have
 some state which needs to get cleared out?  Is there anything I can do
 at this point?  Are there alternate strategies for bootstrapping I can
 try?  (For instance can I just scp all the sstables to all the new
 nodes and do a repair, would that actually work?).

 Anyone seen this sort of issue?  All this is with 0.6.3 so I assume
 eventually others will see this issue.

 -Anthony

 On Thu, Jul 15, 2010 at 10:45:08PM -0700, Anthony Molinaro wrote:
  Okay, so things were pretty messed up.  I shut down all the new nodes,
  then the old nodes started doing the half the ring is down garbage which
  pretty much requires a full restart of everything.  So I had to shut
  everything down, then bring the seed back, then the rest of the nodes,
  so they finally all agreed on the ring again.
 
  Then I started one of the new nodes, and have been watching the logs; so
  far it has been 2 hours since the Bootstrapping message appeared in the new
  node's log and nothing has happened.  No anticompaction messages anywhere;
  there's one node compacting, but it's on the other end of the ring, so
  nowhere near that new node.  I'm wondering if it will ever get data at
  this point.
 
  Is there something else I should try?  The only thing I can think of
  is deleting the system directory on the new node, and restarting, so
  I'll try that and see if it does anything.
 
  -Anthony
 
  On Thu, Jul 15, 2010 at 03:43:49PM -0500, Jonathan Ellis wrote:
   On Thu, Jul 15, 2010 at 3:28 PM, Anthony Molinaro
   antho...@alumni.caltech.edu wrote:
Is the fact that 2 new nodes are in the range messing it up?
  
   Probably.
  
 And if so
how do I recover (I'm thinking, shutdown new nodes 2,3,4,5, the
 bringing
up nodes 2,4, waiting for them to finish, then bringing up 3,5?).
  
   Yes.
  
   You might have to restart the old nodes too to clear out the confusion.
  
   --
   Jonathan Ellis
   Project Chair, Apache Cassandra
   co-founder of Riptano, the source for professional Cassandra support
   http://riptano.com
 
  --
  
  Anthony Molinaro   antho...@alumni.caltech.edu

 --
 
 Anthony Molinaro   antho...@alumni.caltech.edu



get_range_slices returns the same rows

2010-07-14 Thread shimi
I wrote code that iterates over all the rows using get_range_slices.
For the first call I use a KeyRange from "" to "" (empty start and end keys).
For all the others I use a KeyRange from the last key that I got in the
previous iteration to "".
I always get the same rows that I got in the previous iteration. I tried
changing the batch size but I still get the same results.
I tried it both on a single node and on a cluster.
I use RP with version 0.6.3 and Hector.

Does anyone know how this can be done?

Shimi
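
One common cause of this symptom is that the start key of a get_range_slices page is inclusive: each new page has to start from the last key returned by the previous page, and that first repeated row has to be skipped. Below is a minimal paging sketch in Java; it uses the newer Hector RangeSlicesQuery API with String keys, so the class and method names are assumptions relative to the 0.6-era Hector API in this thread and may need adapting:

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.RangeSlicesQuery;

public class RowIterator {
    // Iterate over all rows of a CF under RandomPartitioner by paging on the last key seen.
    public static void iterateAll(Keyspace keyspace, String columnFamily, int pageSize) {
        StringSerializer ser = StringSerializer.get();
        String startKey = "";
        while (true) {
            RangeSlicesQuery<String, String, String> query =
                    HFactory.createRangeSlicesQuery(keyspace, ser, ser, ser);
            query.setColumnFamily(columnFamily);
            query.setKeys(startKey, "");          // empty end key = up to the end of the ring
            query.setRange("", "", false, 10);    // first 10 columns of each row
            query.setRowCount(pageSize);
            QueryResult<OrderedRows<String, String, String>> result = query.execute();
            OrderedRows<String, String, String> rows = result.get();

            for (Row<String, String, String> row : rows.getList()) {
                // Skip the first row of every page after the first one: it is the
                // last row of the previous page, returned again because the
                // start key is inclusive.
                if (!startKey.isEmpty() && row.getKey().equals(startKey)) {
                    continue;
                }
                process(row);
            }
            if (rows.getCount() < pageSize) {
                break;                            // last page reached
            }
            startKey = rows.peekLast().getKey();  // resume from the last key seen
        }
    }

    private static void process(Row<String, String, String> row) {
        System.out.println(row.getKey());
    }
}

The loop ends when a page comes back with fewer rows than requested, which works as long as the page size is at least 2.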