virtual nodes + map reduce = too many mappers

2013-02-15 Thread cem
suggestion to improve the performance? It seems like I need to lower the number of virtual nodes. Best Regards, Cem

Re: virtual nodes + map reduce = too many mappers

2013-02-17 Thread cem
because in the typical range query you need to set start and end token. But in the virtual nodes I realized that tokens are not continuous. Best Regards, Cem On Sun, Feb 17, 2013 at 2:47 AM, Edward Capriolo edlinuxg...@gmail.com wrote: Split size does not have to equal block size. http

Re: cassandra performance

2013-03-24 Thread cem
Hi, Could you provide some other details about your schema design and queries? It is very hard to tell anything without them. Regards, Cem On Sun, Mar 24, 2013 at 12:40 PM, dong.yajun dongt...@gmail.com wrote: Hello, I'd suggest you take a look at the difference between NoSQL and RDBMS. Best

Re: Vnodes - HUNDRED of MapReduce jobs

2013-03-28 Thread cem
Hi Alicia, the Cassandra input format creates as many mappers as there are vnodes. It is a known issue. You need to lower the number of vnodes :( I have a simple solution for that and am ready to write a patch. Should I create a ticket for it? I don't know the procedure for that. Regards, Cem On Thu
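To make the scale of the issue above concrete, here is a minimal sketch in Python. It only does the arithmetic implied by "as many mappers as vnodes" and shows one hypothetical way of bundling token ranges by owning node. It is not the patch mentioned in the post, and the node count, the 256 vnodes per node, and the function names are illustrative assumptions.

```python
from collections import defaultdict


def count_mappers(nodes: int, vnodes_per_node: int) -> int:
    """One input split (and so one mapper) per token range, as the post describes."""
    return nodes * vnodes_per_node


def group_ranges_by_owner(ranges):
    """Hypothetical coalescing step: bundle (start, end, owner) ranges per owning node."""
    grouped = defaultdict(list)
    for start, end, owner in ranges:
        grouped[owner].append((start, end))
    return dict(grouped)


# Illustrative values only: 5 nodes with 256 vnodes each (both are assumptions).
print(count_mappers(nodes=5, vnodes_per_node=256))

# Toy ring with interleaved ownership, which is why vnode token ranges
# owned by one node are not contiguous.
ranges = [(0, 10, "node1"), (10, 20, "node2"), (20, 30, "node1"), (30, 40, "node2")]
bundles = group_ranges_by_owner(ranges)
print(len(ranges), "ranges ->", len(bundles), "bundles")
```

Bundling by owner keeps each bundle local to a single node, but each bundle still has to be read as several separate range queries because its ranges are not adjacent.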

TTL 3 hours + GC grace 0

2012-03-11 Thread cem
records after a down node comes back. So I assumed that transferring expired records will not cause any problem. Do you have any idea? Thank you! Regards, Cem.

Re: TTL 3 hours + GC grace 0

2012-03-12 Thread cem
Thank you for the swift response. Cem. On Sun, Mar 11, 2012 at 11:03 PM, Peter Schuller peter.schul...@infidyne.com wrote: I am using TTL 3 hours and GC grace 0 for a CF. I have a normal CF that has records with TTL 3 hours and I don't send any delete request. I just wonder if using GC

Re: All host pools Marked Down

2012-05-29 Thread cem
Since all hosts seem to be down, Hector will not retry. At least one node in the cluster must be up. Make sure that you have a proper connection from your webapps to your cluster. Cem. On Tue, May 29, 2012 at 1:46 PM, Shubham Srivastava shubham.srivast...@makemytrip.com wrote

Re: All host pools Marked Down

2012-05-29 Thread cem
with telnet. Cem. On Tue, May 29, 2012 at 3:06 PM, Shubham Srivastava shubham.srivast...@makemytrip.com wrote: My webapp connects to the LoadBalancer IP which has the actual nodes in its pool. If there is by any chance a connection break then will hector not retry to re-establish

Cassandra 1.2 TTL histogram problem

2013-05-21 Thread cem
and the droppableRatio is 0.9. Cassandra skips all sstables which are already expired. This line was introduced by https://issues.apache.org/jira/browse/CASSANDRA-4022. Best Regards, Cem

Re: Cassandra 1.2 TTL histogram problem

2013-05-21 Thread cem
for each key and send a single write request. Cem On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita mor.y...@gmail.com wrote: Why does Cassandra single table compaction skip the keys that are in the other sstables? Because we don't want to resurrect deleted columns. Say, sstable A has

Re: Cassandra 1.2 TTL histogram problem

2013-05-22 Thread cem
to estimate remainingKeys like that? Best Regards, Cem On Wed, May 22, 2013 at 5:58 PM, Yuki Morishita mor.y...@gmail.com wrote: Can method calculate non-overlapping keys as overlapping? Yes. And randomized keys don't matter here since sstables are sorted by token calculated from key

data clean up problem

2013-05-28 Thread cem
would you solve it in another way? Thanks in advance! Cem

Re: data clean up problem

2013-05-28 Thread cem
Thanks for the answer, but it is already set to 0 since I don't do any deletes. Cem On Tue, May 28, 2013 at 9:03 PM, Edward Capriolo edlinuxg...@gmail.com wrote: You need to change the gc_grace time of the column family. It defaults to 10 days. By default the tombstones will not go away for 10

Re: data clean up problem

2013-05-28 Thread cem
for each partition and drop when you know that all records are expired. I have 5 nodes. Cem. On Tue, May 28, 2013 at 9:37 PM, Hiller, Dean dean.hil...@nrel.gov wrote: Also, how many nodes are you running? From: cem cayiro...@gmail.com Reply-To: user

Re: data clean up problem

2013-05-29 Thread cem
Thanks for the answers! Cem On Wed, May 29, 2013 at 1:26 AM, Robert Coli rc...@eventbrite.com wrote: On Tue, May 28, 2013 at 2:38 PM, Bryan Talbot btal...@aeriagames.com wrote: I think what you're asking for (efficient removal of TTL'd write-once data) is already in the works

Re: Is there anyone who implemented time range partitions with column families?

2013-05-29 Thread cem
Thank you very much for the fast answer. Does playORM use different column families for each partition in Cassandra? Cem On Wed, May 29, 2013 at 5:30 PM, Jeremy Powell jeremym.pow...@gmail.com wrote: Cem, yes, you can do this with C*, though you have to handle the logic yourself (other

Dropped mutation messages

2013-06-18 Thread cem
. http://www.datastax.com/docs/1.2/cluster_architecture/cluster_planning Do I need to enable anything to take advantage of 1.2? Do you have any other advice? What would be the best way to investigate this? Thanks in advance! Best Regards, Cem.

Compression ratio

2013-07-12 Thread cem
Hi All, Can anyone explain the compression ratio? Is it compressed/original or original/compressed? Or something else? Thanks a lot. Best Regards, Cem

Re: Compression ratio

2013-07-12 Thread cem
Thank you very much! On Fri, Jul 12, 2013 at 5:59 PM, Yuki Morishita mor.y...@gmail.com wrote: it's compressed/original. https://github.com/apache/cassandra/blob/cassandra-1.1.11/src/java/org/apache/cassandra/io/sstable/SSTableMetadata.java#L124 On Fri, Jul 12, 2013 at 10:02 AM, cem
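As a quick worked example of the answer above (the ratio is compressed/original), here is a small Python sketch. The function name and the byte counts are made up for illustration and are not taken from Cassandra's code.

```python
def compression_ratio(compressed_bytes: int, original_bytes: int) -> float:
    """Ratio as described in the reply above: compressed size divided by original size."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return compressed_bytes / original_bytes


# Hypothetical sizes: 100 MB of data that compresses down to 25 MB.
ratio = compression_ratio(25 * 1024 * 1024, 100 * 1024 * 1024)
print(ratio)
```

Under this definition a smaller number means better compression, and a value close to 1.0 means compression is saving very little space.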

Re: maximum storage per node

2013-07-25 Thread cem
Between 500GB and 1TB is recommended. But it also depends on your hardware, traffic characteristics and requirements. Can you give some details on that? Best Regards, Cem On Thu, Jul 25, 2013 at 5:35 PM, Pruner, Anne (Anne) pru...@avaya.com wrote: Does anyone have opinions on the maximum amount

Re: maximum storage per node

2013-07-25 Thread cem
You will suffer from long compactions if you are planning to get rid of old records by TTL. Best Regards, Cem. On Thu, Jul 25, 2013 at 5:51 PM, Kanwar Sangha kan...@mavenir.com wrote: Issues with large data nodes would be – · Nodetool repair will be impossible

Re: maximum storage per node

2013-07-26 Thread cem
will store is relatively small. Why didn't you partition your data according to time instead of using your own compactor? Cem On Fri, Jul 26, 2013 at 3:50 AM, sankalp kohli kohlisank...@gmail.com wrote: Try putting multiple instances per machine with each instance mapped to its own disk. This might

Re: Periodical deletes and compaction strategy

2013-08-26 Thread cem
Hi Alain, I solved the same issue by implementing a client that manages time range partitions. Each time range partition is a CF. Cem. On Mon, Aug 26, 2013 at 11:34 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi, Any guidance on this topic would be appreciated :). 2013/8/23 Alain
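The approach described above (one CF per time range, managed by the client) could be sketched roughly as follows. This is only an illustration in Python, not Cem's actual client: the one-hour partition width, the "events" prefix, the naming scheme and the 3-hour TTL are all assumptions.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
PARTITION_WIDTH = timedelta(hours=1)  # assumed partition size


def partition_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its time-range partition."""
    buckets = (ts - EPOCH) // PARTITION_WIDTH
    return EPOCH + buckets * PARTITION_WIDTH


def partition_cf_name(ts: datetime, prefix: str = "events") -> str:
    """Hypothetical naming scheme: one column family per partition, e.g. events_2013082611."""
    return f"{prefix}_{partition_start(ts):%Y%m%d%H}"


def is_droppable(start: datetime, now: datetime, ttl: timedelta) -> bool:
    """A partition can be dropped once its newest possible record has outlived the TTL."""
    return start + PARTITION_WIDTH + ttl <= now


ts = datetime(2013, 8, 26, 11, 34, tzinfo=timezone.utc)
print(partition_cf_name(ts))
print(is_droppable(partition_start(ts),
                   now=datetime(2013, 8, 26, 15, 0, tzinfo=timezone.utc),
                   ttl=timedelta(hours=3)))
```

Dropping a whole CF removes expired data without waiting for compaction, at the cost of the client-side complexity and per-CF overhead raised elsewhere in these threads.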

How to contribute to C*?

2013-09-04 Thread cem
to know how to contribute to the C* code base? Should I open a ticket and assign it to myself? Can someone point me to the process? Thank you very much! Best Regards, Cem

Re: How to contribute to C*?

2013-09-04 Thread cem
Thanks Rob! On Thu, Sep 5, 2013 at 12:17 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Sep 4, 2013 at 2:48 PM, cem cayiro...@gmail.com wrote: I would prefer to have it on server side since it introduces too much complexity on client side and CF overheads. I would like to know how

Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread cem
Have a look at index_interval. Cem. On Wed, Oct 2, 2013 at 2:25 PM, srmore comom...@gmail.com wrote: The version of Cassandra I am using is 1.0.11, we are migrating to 1.2.X though. We had tuned bloom filters (0.1) and AFAIK making it lower than this won't matter. Thanks ! On Tue, Oct

Re: Cassandra Heap Size for data more than 1 TB

2013-10-02 Thread cem
I think 512 is fine. Could you tell me more about your traffic characteristics? Cem On Wed, Oct 2, 2013 at 4:32 PM, srmore comom...@gmail.com wrote: I changed my index_interval from 128 to 512; does it make sense to increase it more than this? On Wed, Oct 2, 2013 at 9:30

2 nodes cassandra cluster raid10 or JBOD

2013-12-10 Thread cem
with a 2-node cluster since there is a higher chance of losing 50% of our cluster compared to a larger cluster. I may prefer to have stronger nodes if I have a limited number of nodes. What do you think about that? Is there anyone who has a 2-node cluster? Best Regards, Cem

Re: 2 nodes cassandra cluster raid10 or JBOD

2013-12-12 Thread cem
New Zealand @aaronmorton Co-Founder Principal Consultant Apache Cassandra Consulting http://www.thelastpickle.com On 11/12/2013, at 9:33 pm, Veysel Taşçıoğlu veysel.tascio...@gmail.com wrote: Hi, What about using JBOD and replication factor 2? Regards. On 11 Dec 2013 02:03, cem

Re: disk space issue

2014-10-01 Thread cem
thanks for the answers! Cem On Wed, Oct 1, 2014 at 2:38 PM, Ken Hancock ken.hanc...@schange.com wrote: https://github.com/hancockks/cassandra-compact-cf On Tue, Sep 30, 2014 at 5:49 PM, cem cayiro...@gmail.com wrote: Hi All, I have a 7

Re: Linear scalability problems

2013-04-04 Thread Cem Cayiroglu
What was the RF before adding nodes? Sent from my iPhone On 04 Apr 2013, at 15:12, Anand Somani meatfor...@gmail.com wrote: We are using a single process with multiple threads, will look at client side delays. Thanks On Wed, Apr 3, 2013 at 9:30 AM, Tyler Hobbs ty...@datastax.com wrote:

Re: Deleting Row Key

2013-10-05 Thread Cem Cayiroglu
It will be deleted after a compaction. Sent from my iPhone On 05 Oct 2013, at 07:29, Sebastian Schmidt isib...@gmail.com wrote: Hi, per default, the key of a row is not deleted, if all columns were deleted. I tried to figure out why, but I didn't find an answer, except that it is

Re: Hbase vs Cassandra

2015-05-31 Thread Cem Cayiroglu
for that. There is a nice integration with Flume and Kite. High availability didn't matter for us. 10 secs down is fine for our use cases. HBase started to support eventually consistent reads. Cem Sent from my iPhone On May 30, 2015, at 4:24 PM, Brady Gentile br...@datastax.com wrote: Hey Ajay, Here