Re: Heavy writes ok for single node, but failed for cluster

2011-04-28 Thread Sheng Chen
, 2011 at 10:32 AM, Sheng Chen chensheng2...@gmail.com wrote: I succeeded in inserting 1 billion records into a single-node Cassandra, bin/stress -d cas01 -o insert -n 10 -c 5 -S 34 -C5 -t 20 Inserts finished in about 14 hours at a speed of 20k/sec. But when I added another node, tests

Re: Heavy writes ok for single node, but failed for cluster

2011-04-28 Thread Sheng Chen
, Sheng Chen chensheng2...@gmail.com wrote: Thank you for your advice. RF=2 is a good workaround. I was using 0.7.4 and have updated to the latest 0.7 branch, which includes the 2554 patch, but it doesn't help. I still get lots of UnavailableExceptions after the following logs, INFO

Heavy writes ok for single node, but failed for cluster

2011-04-27 Thread Sheng Chen
I succeeded in inserting 1 billion records into a single-node Cassandra, bin/stress -d cas01 -o insert -n 10 -c 5 -S 34 -C5 -t 20 Inserts finished in about 14 hours at a speed of 20k/sec. But when I added another node, tests always failed with an UnavailableException within an hour. bin/stress -d
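For readers reproducing this: when testing a cluster rather than a single node, the stress tool should be pointed at all nodes so the client spreads connections. A sketch of the multi-node form of the command above (host names cas02/cas03 are placeholders, and the truncated flag values from the message are left out rather than guessed):

```
# Sketch only: give -d a comma-separated list of nodes so the java
# stress tool balances clients across the cluster instead of one node.
bin/stress -d cas01,cas02,cas03 -o insert -t 20
```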

Re: Test idea on cassandra

2011-04-06 Thread Sheng Chen
Stress tools in the contrib directory use multiple threads/processes. 2011/4/7 Mengchen Yu yum...@umail.iu.edu I'm trying to simulate a multi-user scenario. The reason why I want to use MPJ is to create different processes that act like individual users. Does anyone have an idea how to do this cleanly?
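On simulating multiple users without MPJ: plain client-side threads are usually enough, since each thread can hold its own connection and act as an independent user. A minimal, hypothetical sketch (the `work` function is a stand-in for a per-user insert loop; nothing here is from the stress tool itself):

```python
import threading

def run_clients(n_clients, work):
    # One thread per simulated user; each receives its client id so it
    # can open its own connection and run its own workload.
    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```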

Re: Compaction threshold does not save with nodetool

2011-04-06 Thread Sheng Chen
the cli and the ‘update column family X with min_compaction_threshold=Y and max_compaction_threshold=X’ command. Dan *From:* Sheng Chen [mailto:chensheng2...@gmail.com] *Sent:* April-06-11 1:42 *To:* user@cassandra.apache.org *Subject:* Compaction threshold does not save with nodetool
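For anyone finding this thread later: the persistent way to change the thresholds is through cassandra-cli, as Dan describes, rather than nodetool (whose change is JMX-only). A sketch of the statement, with the column family name and values as placeholders:

```
update column family Standard1 with min_compaction_threshold=4 and max_compaction_threshold=32;
```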

Re: Stress tests failed with secondary index

2011-04-06 Thread Sheng Chen
happens with secondary indexes. Consider things like - reducing the throughput - reducing the number of clients - ensuring the clients are connecting to all nodes in the cluster. You will probably find some logs about dropped messages on some nodes. Aaron On 6 Apr 2011, at 20:39, Sheng Chen

Compaction threshold does not save with nodetool

2011-04-05 Thread Sheng Chen
Cassandra 0.7.4 # nodetool -h localhost getcompactionthreshold Keyspace1 Standard1 min=4 max=32 # nodetool -h localhost setcompactionthreshold Keyspace1 Standard1 0 0 # nodetool -h localhost getcompactionthreshold Keyspace1 Standard1 min=0 max=0 Now the thresholds have changed on the JMX panel,

Re: Endless minor compactions after heavy inserts

2011-04-03 Thread Sheng Chen
Apr 2011, at 12:45, Sheng Chen wrote: Thank you very much. The major compaction will merge everything into one big file., which would be very large. Is there any way to control the number or size of files created by major compaction? Or, is there a recommended number or size of files

Re: Endless minor compactions after heavy inserts

2011-04-01 Thread Sheng Chen
for the commit log and a stripe set for the data. Hope that helps. Aaron On 1 Apr 2011, at 14:52, Sheng Chen wrote: I've got a single node of cassandra 0.7.4, and I used the java stress tool to insert about 100 million records. The inserts took about 6 hours (45k inserts/sec

Re: newbie question: how do I know the total number of rows of a cf?

2011-03-31 Thread Sheng Chen
I just found an estimateKeys() method on the ColumnFamilyStoreMBean. Is there any documentation on how it works? Sheng 2011/3/28 Sheng Chen chensheng2...@gmail.com Hi all, I want to know how many records I am holding in Cassandra, just like count(*) in SQL. What can I do? Thank you. Sheng
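Absent a server-side count(*), the two usual options are the JMX estimate mentioned above or an exact client-side range scan. A hypothetical helper for the scan approach (the iterator would come from a paged range query in whatever client library is in use; none of these names are from the thread):

```python
def count_rows(row_iter):
    # Count rows yielded by a full range scan. Exact, unlike the JMX
    # estimate, but costs O(n) round trips over the whole column family.
    return sum(1 for _ in row_iter)
```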

Endless minor compactions after heavy inserts

2011-03-31 Thread Sheng Chen
I've got a single node of cassandra 0.7.4, and I used the java stress tool to insert about 100 million records. The inserts took about 6 hours (45k inserts/sec) but the following minor compactions last for 2 days and the pending compaction jobs are still increasing. From jconsole I can read the

Compaction doubles disk space

2011-03-29 Thread Sheng Chen
I use the 'nodetool compact' command to start a compaction. I can understand that extra disk space is required during the compaction, but after the compaction the extra space is not released. Before compaction: SSTable count: 10 space used (live): 19G space used (total): 21G After compaction:

Re: Compaction doubles disk space

2011-03-29 Thread Sheng Chen
From a previous thread on the same topic, I used a forced GC and the extra space was released. What about my second question? 2011/3/29 Sheng Chen chensheng2...@gmail.com I use the 'nodetool compact' command to start a compaction. I can understand that extra disk space is required during

Re: Compaction doubles disk space

2011-03-29 Thread Sheng Chen
Yes. I think at least we can remove the tombstones for each sstable first, and then do the merge. 2011/3/29 Karl Hiramoto k...@hiramoto.org Would it be possible to improve the current compaction disk space issue by compacting only a few SSTables at a time, then immediately deleting the old
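The merge-with-tombstones behaviour under discussion can be illustrated with a toy model (this is not Cassandra code; SSTables are modelled as dicts and tombstones as None values):

```python
def purge(sstable):
    # Drop tombstone entries, modelled here as value None.
    return {k: v for k, v in sstable.items() if v is not None}

def compact(sstables):
    # Merge oldest-first so later writes (including tombstones) win,
    # then drop the tombstones from the merged result. Purging a single
    # SSTable before the merge would shrink the inputs, but a tombstone
    # must survive until it has shadowed older copies in other SSTables.
    merged = {}
    for t in sstables:
        merged.update(t)
    return purge(merged)
```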

Re: stress.py bug?

2011-03-22 Thread Sheng Chen
I am just wondering: why do the stress test tools (Python, Java) need more threads? Is the bottleneck of a single thread in the client, or in the server? Thanks. Sean 2011/3/22 Ryan King r...@twitter.com On Mon, Mar 21, 2011 at 4:02 AM, pob peterob...@gmail.com wrote: Hi, I'm inserting data