Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

2012-12-10 Thread Richard Low
Quorum reads of many keys, so we'd likely hit every virtual range with our queries, even if num_tokens was 256. Thanks, Eric -- Tyler Hobbs DataStax http://datastax.com/ -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Virtual Nodes, lots of physical nodes and potentially increasing outage count?

2012-12-11 Thread Richard Low
will be possible for others. In a vnode world if any two nodes are down, then the intersection of vnode token ranges they have are unavailable. I think it is two sides of the same coin. On Mon, Dec 10, 2012 at 7:41 AM, Richard Low r...@acunu.com wrote: Hi Tyler, You're right, the math does assume
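
The point about two nodes being down can be explored with a rough simulation. This is only a sketch under stated assumptions: tokens are assigned at random, replicas are the next distinct nodes clockwise on the ring (a simplified placement, not necessarily what any given cluster uses), RF is 3 and reads are at quorum. All function names are made up for illustration.

```python
import random
from itertools import combinations

def build_ring(num_nodes, num_tokens, seed=0):
    """Give each node num_tokens random tokens; return sorted (token, node) pairs."""
    rng = random.Random(seed)
    ring = [(rng.random(), node)
            for node in range(num_nodes) for _ in range(num_tokens)]
    ring.sort()
    return ring

def replica_sets(ring, rf=3):
    """For each range, pick the next rf distinct nodes clockwise (assumed placement)."""
    sets = set()
    n = len(ring)
    for i in range(n):
        replicas = []
        j = i
        while len(replicas) < rf:  # requires at least rf distinct nodes
            node = ring[j % n][1]
            if node not in replicas:
                replicas.append(node)
            j += 1
        sets.add(frozenset(replicas))
    return sets

def fraction_of_pairs_breaking_quorum(ring, rf=3):
    """Share of two-node failures that leave some range without a quorum."""
    sets = replica_sets(ring, rf)
    nodes = sorted({node for _, node in ring})
    quorum = rf // 2 + 1
    pairs = list(combinations(nodes, 2))
    bad = sum(1 for a, b in pairs
              if any(len(s - {a, b}) < quorum for s in sets))
    return bad / len(pairs)

if __name__ == "__main__":
    for tokens in (1, 256):
        ring = build_ring(num_nodes=12, num_tokens=tokens)
        print(tokens, fraction_of_pairs_breaking_quorum(ring))
```

With one token per node only pairs that are close together on the ring break quorum for some range; with many tokens per node far more pairs do, but each such pair affects a smaller slice of the data, which is the trade-off the thread discusses.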

Re: Vnode migration path

2012-12-11 Thread Richard Low
-- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Why Secondary indexes is so slowly by my test?

2012-12-11 Thread Richard Low
, what's the result? Can Secondary indexes be used in production? I hope it's my mistake in doing this test. Can anyone give some tips about it? Thanks in advance. fancy -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Problems with shuffle

2013-04-15 Thread Richard Low
On 14 April 2013 00:56, Rustam Aliyev rustam.li...@code.az wrote: Just a followup on this issue. Due to the cost of shuffle, we decided not to do it. Recently, we added new node and ended up in not well balanced cluster: Datacenter: datacenter1 === Status=Up/Down |/

Re: Misc Performance Questions

2011-06-08 Thread Richard Low
:  Will there be a problem having multiple keyspaces on a cluster all with different replication factors, from 1-3? No. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Misc Performance Questions

2011-06-08 Thread Richard Low
. It will also help buffer caching to separate them - the small SSTables are more likely to remain in cache. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Retrieving a column from a fat row vs retrieving a single row

2011-06-09 Thread Richard Low
Remember also that partitioning is done by rows, not columns. So large rows are stored on a single host. This means they can't be load balanced and also all requests to that row will hit one host. Having separate rows will allow load balancing of I/Os. -- Richard Low Acunu | http

Re: Retrieving a column from a fat row vs retrieving a single row

2011-06-09 Thread Richard Low
2011/6/9 Héctor Izquierdo Seliva izquie...@strands.com: Yeah, but if I have RF=3 then there are three nodes that can answer the request right? Yes, if you're happy to read ConsistencyLevel.ONE.

Re: issue with querying SuperColumn

2011-06-21 Thread Richard Low
You have key validation class UTF8Type for the standard CF, but BytesType for the super. This is why the key is 1 for standard, but printed as 31 for super, which is the hex ASCII code for the character 1. In your Java code, use "1".getBytes() as your key and it should work. Richard. -- Richard Low Acunu
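
The relationship between the key 1 and the printed value 31 can be checked in a couple of lines. This is a Python analogue of the Java call mentioned above, shown only to illustrate the encoding; the variable names are illustrative.

```python
# The text key "1" and the hex string 31 are the same single byte.
key = "1"
raw = key.encode("utf-8")        # Python analogue of Java's "1".getBytes()

print(raw)        # b'1'
print(raw.hex())  # 31

# Going the other way: the hex shown for a BytesType key decodes back to the text.
assert bytes.fromhex("31").decode("utf-8") == "1"
```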

Re: deduct token values for BOP

2011-07-06 Thread Richard Low
string that starts with a-d with only characters a-z afterwards will go to N1. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: deduct token values for BOP

2011-07-07 Thread Richard Low
On Thu, Jul 7, 2011 at 3:39 PM, A J s5a...@gmail.com wrote: Thanks. The above works. But when I try to use the binary values rather than the hex values, it does not work. i.e. instead of using 64ff, I use 01100100. Instead of 6Dff, I use 01101101. When using the binary values, everything
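
The hex tokens quoted in this thread can be inspected with a small sketch. It assumes that token strings are interpreted as hexadecimal-encoded bytes and that a key belongs to the first node whose token is greater than or equal to it in byte order; the two-node ring and the helper names are hypothetical.

```python
def token_bytes(hex_token):
    """Decode a token written as a hex string into the raw bytes it stands for."""
    return bytes.fromhex(hex_token)

def owner(key, tokens):
    """Return the first node whose token is >= key in byte order, else wrap around."""
    for name, hex_token in tokens:
        if key <= token_bytes(hex_token):
            return name
    return tokens[0][0]

# Hypothetical ring built from the hex values quoted in the thread.
TOKENS = [("N1", "64ff"), ("N2", "6Dff")]

if __name__ == "__main__":
    print(token_bytes("64ff"))       # the byte for 'd' followed by 0xff
    print(format(0x64, "08b"))       # bit pattern of that first byte
    # If the digits 01100100 are read as hex, they are four unrelated bytes.
    print(token_bytes("01100100"))
    for key in (b"apple", b"dog", b"egg", b"zebra"):
        print(key, owner(key, TOKENS))
```

Under these assumptions, a string of binary digits passed where hex is expected produces a very different token from the intended one, which would be consistent with the behaviour described in the question.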

Pre-CassandraSF Happy Hour on Sunday

2011-07-08 Thread Richard Low
-happyhour.eventbrite.com/ Hope you can join us! -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: hw requirements

2011-08-29 Thread Richard Low
with 1-8 TB of storage, but there are cases where bigger or smaller makes sense. Don't overspec your nodes - you'll be better off with more smaller nodes. You can use SSDs if you need the random read rate, and SATA drives are fine too. -- Richard Low Acunu | http://www.acunu.com | @acunu On Mon

Re: 15 seconds to increment 17k keys?

2011-09-01 Thread Richard Low
per second is about right, although you can probably do some tuning to improve this. I've also found that the pycassa client uses significant amounts of CPU, so be careful you are not CPU bound on the client. -- Richard Low Acunu | http://www.acunu.com | @acunu On Thu, Sep 1, 2011 at 2:31 AM

Re: 15 seconds to increment 17k keys?

2011-09-02 Thread Richard Low
(with the exception of reads with low consistency levels and read_repair_chance 1.0). Note also that there is just one read per counter increment, not a read per replica. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Cassandra 0.8 Counters Inverted Index?

2011-10-03 Thread Richard Low
. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: GC for ParNew on 0.8.6

2011-10-07 Thread Richard Low
else upgraded and not run into this? What do you mean by the cluster has been weird since the upgrade? Have you noticed slow-downs? Any other messages in the logs that have appeared since the upgrade? Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Doubts related to composite type column names/values

2011-12-20 Thread Richard Low
the type for each column, so they can be different. There is extra storage overhead for this and care must be taken to ensure all column names remain comparable. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: Hector counter question

2012-03-20 Thread Richard Low
.  But reading and incrementing is unsafe. Richard. -- Richard Low Acunu | http://www.acunu.com | @acunu

Re: cassandra-shuffle time to completion and required disk space

2013-05-01 Thread Richard Low
Hi John, - Each machine needed enough free diskspace to potentially hold the entire cluster's sstables on disk I wrote a possible explanation for why Cassandra is trying to use too much space on your ticket: https://issues.apache.org/jira/browse/CASSANDRA-5525 if you could provide the

Re: Why so many vnodes?

2013-06-10 Thread Richard Low
Hi Theo, The number (let's call it T and the number of nodes N) 256 was chosen to give good load balancing for random token assignments for most cluster sizes. For small T, a random choice of initial tokens will in most cases give a poor distribution of data. The larger T is, the closer to
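
The effect of T on load balance can be illustrated with a quick simulation. This is a sketch, not a model of any real cluster: tokens are uniform random points on a unit ring, each token owns the range back to the previous token, and the function names, node count and trial count are arbitrary choices made for this example.

```python
import random

def max_over_mean_ownership(num_nodes, tokens_per_node, rng):
    """Largest node's share of the ring divided by the mean share (1.0 is perfect)."""
    ring = sorted((rng.random(), node)
                  for node in range(num_nodes) for _ in range(tokens_per_node))
    owned = [0.0] * num_nodes
    prev = ring[-1][0] - 1.0  # the first range wraps around from the last token
    for token, node in ring:
        owned[node] += token - prev
        prev = token
    return max(owned) * num_nodes  # the mean share is 1 / num_nodes

def average_imbalance(num_nodes, tokens_per_node, trials=20, seed=1):
    """Average the max/mean ratio over several random token assignments."""
    rng = random.Random(seed)
    total = sum(max_over_mean_ownership(num_nodes, tokens_per_node, rng)
                for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for t in (1, 16, 256):
        print(t, round(average_imbalance(10, t), 3))
```

Running it for a few values of T shows the ratio moving towards 1.0 as T grows, which matches the reasoning given for choosing a large number of tokens per node.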

Re: Why so many vnodes?

2013-06-11 Thread Richard Low
On 11 June 2013 09:54, Theo Hultberg t...@iconara.net wrote: But in the paragraph just before Richard said that finding the node that owns a token becomes slower on large clusters with lots of token ranges, so increasing it further seems contradictory. I do mean increase for larger clusters,

Re: [Cassandra] Expanding a Cassandra cluster

2013-06-18 Thread Richard Low
On 10 June 2013 22:00, Emalayan Vairavanathan svemala...@yahoo.com wrote: b) Will Cassandra automatically take care of removing obsolete keys in future ? In a future version Cassandra should automatically clean up for you:

Re: Cassandra with vnode and ByteOrderedPartition

2013-07-03 Thread Richard Low
On 3 July 2013 21:04, Sávio Teles savio.te...@lupa.inf.ufg.br wrote: We're using ByteOrderedPartition to programmatically choose the machine on which an object will be inserted. How can I use ByteOrderedPartition with vnode on Cassandra 1.2? Don't. Managing tokens with ByteOrderedPartitioner

Re: Cassandra with vnode and ByteOrderedPartition

2013-07-03 Thread Richard Low
On 3 July 2013 22:18, Sávio Teles savio.te...@lupa.inf.ufg.br wrote: We were able to implement ByteOrderedPartition on Cassandra 1.1 and insert an object in a specific machine. However, with Cassandra 1.2 and VNodes we can't implement VNode with ByteOrderedPartitioner to insert an object

Re: [deletion in the future]

2013-07-20 Thread Richard Low
On 19 July 2013 23:31, Alexis Rodríguez arodrig...@inconcertcc.com wrote: Hi guys, I've read here [1] that you can make a deletion mutation for the future. That mechanism operates as a schedule for deletions according to the stackoverflow post. But, I've been having problems to make it work

Re: [deletion in the future]

2013-07-20 Thread Richard Low
On 20 July 2013 15:16, Alexis Rodríguez arodrig...@inconcertcc.com wrote: That's exactly what is happening with my row, but not what I was trying to do. It seems that I misunderstood the stackoverflow post. I was trying to schedule a delete for an entire row, is using ttl for columns the only

Re: Cassandra and RAIDs

2013-07-24 Thread Richard Low
On 24 July 2013 15:36, Jan Algermissen jan.algermis...@nordsc.com wrote: is it recommended to set up Cassandra using 'RAID-ed' disks for per-node reliability or do people usually just rely on having the multiple nodes anyway - why bother with replicated disks? It's not necessary, due to

Re: Installing Debian package from ASF repo

2013-07-29 Thread Richard Low
On 29 July 2013 12:00, Pavel Kirienko pavel.kirienko.l...@gmail.com wrote: Hi, I failed to install the Debian package of Cassandra 1.2.7 from ASF repository because of 404 error. APT said: http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_1.2.7_all.deb 404 Not

Re: nodetool cfstats write count ?

2013-07-29 Thread Richard Low
On 29 July 2013 14:43, Langston, Jim jim.langs...@compuware.com wrote: Running nodetool and looking at the cfstats output, for the counters such as write count and read count, do those numbers reflect any replication ? For instance, if write count shows 3000 and the replication factor is

Re: Reducing the number of vnodes

2013-08-05 Thread Richard Low
On 5 August 2013 12:30, Christopher Wirt chris.w...@struq.com wrote: I’m thinking about reducing the number of vnodes per server. We have 3 DC setup – one with 9 nodes, two with 3 nodes each. Each node has 256 vnodes. We’ve found that repair operations are beginning

Re: Counters and replication

2013-08-05 Thread Richard Low
On 5 August 2013 20:04, Christopher Wirt chris.w...@struq.com wrote: Hello, Question about counters, replication and the ReplicateOnWriteStage. I’ve recently turned on a new CF which uses a counter column. We have a three DC setup running Cassandra 1.2.4

Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?

2013-08-06 Thread Richard Low
On 6 August 2013 08:40, Aaron Morton aa...@thelastpickle.com wrote: The reason for me looking at virtual nodes is because of terrible experiences we had with 0.8 repairs and as per documentation (and logically) the virtual nodes seem like they will help repairs be smoother. Is this true?

Re: clarification of token() in CQL3

2013-08-06 Thread Richard Low
On 6 August 2013 15:12, Keith Freeman 8fo...@gmail.com wrote: I've seen in several places the advice to use queries like this to page through lots of rows: select id from mytable where token(id) > token(last_id) But it's hard to find detailed information about how this works (at least

Re: clarification of token() in CQL3

2013-08-06 Thread Richard Low
On 6 August 2013 16:56, Keith Freeman 8fo...@gmail.com wrote: Your description makes me think that if new rows are added during the paging (i.e. between one select with token()'s and another), they might show up in the query results, right? (because the hash of the new row keys might fall

Re: cassandra 1.2.5- virtual nodes (num_token) pros/cons?

2013-08-13 Thread Richard Low
On 13 August 2013 10:15, Alain RODRIGUEZ arodr...@gmail.com wrote: Streaming from all the physical nodes in the cluster should make repair faster, for the same reason it makes bootstrap faster. Shouldn't it? Virtual nodes don't speed up either very much. Repair and bootstrap will be

Re: Vnodes, adding a node ?

2013-08-14 Thread Richard Low
On 14 August 2013 20:02, Andrew Cobley a.e.cob...@dundee.ac.uk wrote: I have small test cluster of 2 nodes. I ran a stress test on it and with nodetool status received the following: /usr/local/bin/apache-cassandra-2.0.0-rc1/log $ ../bin/nodetool status Datacenter: datacenter1

Re: token(), limit and wide rows

2013-08-17 Thread Richard Low
You can do it by using two types of query. One using token as you suggest, the other by fixing the partition key and walking through the other parts of the composite primary key. For example, consider the table: create table paging (a text, b text, c text, primary key (a, b)); I inserted ('1',
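
The two-query approach described above can be sketched with an in-memory stand-in for the table. Nothing here talks to Cassandra: `token` is an MD5-based placeholder for the CQL token() function, the two helper functions only mimic the shape of the CQL queries named in their docstrings, and the sample rows are invented for the example.

```python
import hashlib

def token(a):
    """Placeholder for CQL token(): an MD5-derived integer (assumption)."""
    return int.from_bytes(hashlib.md5(a.encode("utf-8")).digest(), "big")

def by_token(rows, after_a, limit):
    """Mimics: select * from paging where token(a) > token(?) limit ?"""
    ordered = sorted(rows, key=lambda r: (token(r[0]), r[1]))
    if after_a is not None:
        ordered = [r for r in ordered if token(r[0]) > token(after_a)]
    return ordered[:limit]

def within_partition(rows, a, after_b, limit):
    """Mimics: select * from paging where a = ? and b > ? limit ?"""
    matching = sorted(r for r in rows if r[0] == a and r[1] > after_b)
    return matching[:limit]

def page_all(rows, page_size):
    """Alternate between the two query types until every row has been seen."""
    seen = []
    page = by_token(rows, None, page_size)  # first page has no lower bound
    while page:
        seen.extend(page)
        last_a, last_b = page[-1][0], page[-1][1]
        # First finish the partition that the last row belongs to...
        page = within_partition(rows, last_a, last_b, page_size)
        if not page:
            # ...then move on to partitions with a larger token.
            page = by_token(rows, last_a, page_size)
    return seen

# Invented sample data: three partitions with four clustered rows each.
ROWS = [(a, b, a + b) for a in ("1", "2", "3") for b in ("a", "b", "c", "d")]

if __name__ == "__main__":
    for row in page_all(ROWS, page_size=5):
        print(row)
```

The design point is that a page from the token query may stop in the middle of a partition, so the partition-restricted query is needed to pick up the remaining rows before the token query advances.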

Re: How many seed nodes should I use?

2013-08-29 Thread Richard Low
On 29 August 2013 01:55, Ike Walker ike.wal...@flite.com wrote: What is the best practice for how many seed nodes to have in a Cassandra cluster? I remember reading a recommendation of 2 seeds per datacenter in Datastax documentation for 0.7, but I'm interested to know what other people are

Re: successful use of shuffle?

2013-09-02 Thread Richard Low
On 30 August 2013 18:42, Jeremiah D Jordan jeremiah.jor...@gmail.com wrote: You need to introduce the new vnode enabled nodes in a new DC. Or you will have similar issues to https://issues.apache.org/jira/browse/CASSANDRA-5525 Add vnode DC:

Re: w00tw00t.at.ISC.SANS.DFind not found

2013-09-08 Thread Richard Low
On 8 September 2013 02:55, Tim Dunphy bluethu...@gmail.com wrote: Hey all, I'm seeing this exception in my cassandra logs: Exception during http request mx4j.tools.adaptor.http.HttpException: file mx4j/tools/adaptor/http/xsl/w00tw00t.at.ISC.SANS.DFind:) not found at

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 02:06, Jayadev Jayaraman jdisal...@gmail.com wrote: We use vnodes with num_tokens = 256 (256 tokens per node). After loading some data with sstableloader, we find that the cluster is heavily imbalanced: How did you select the tokens? Is this a brand new cluster

Re: Row size in cfstats vs cfhistograms

2013-09-19 Thread Richard Low
On 19 September 2013 10:31, Rene Kochen rene.koc...@schange.com wrote: I use Cassandra 1.0.11 If I do cfstats for a particular column family, I see a Compacted row maximum size of 43388628 However, when I do a cfhistograms I do not see such a big row in the Row Size column. The biggest row

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
. Thanks, Suruchi On Sep 19, 2013, at 3:46, Richard Low rich...@wentnet.com wrote: On 19 September 2013 02:06, Jayadev Jayaraman jdisal...@gmail.com wrote: We use vnodes with num_tokens = 256 (256 tokens per node). After loading some data with sstableloader, we find that the cluster

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
The only thing you need to guarantee is that Cassandra doesn't start with num_tokens=1 (the default in 1.2.x) or, if it does, that you wipe all the data before starting it with higher num_tokens. On 19 September 2013 19:07, Robert Coli rc...@eventbrite.com wrote: On Thu, Sep 19, 2013 at 10:59

Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-19 Thread Richard Low
On 19 September 2013 20:36, Suruchi Deodhar suruchi.deod...@generalsentiment.com wrote: Thanks for your replies. I wiped out my data from the cluster and also cleared the commitlog before restarting it with num_tokens=256. I then uploaded data using sstableloader. However, I am still not

Re: nodetool cfhistograms refresh

2013-10-01 Thread Richard Low
On 1 October 2013 16:21, Rene Kochen rene.koc...@schange.com wrote: Quick question. I am using Cassandra 1.0.11 When is nodetool cfhistograms output reset? I know that data is collected during read requests. But I am wondering if it is data since the beginning (start of Cassandra) or if it

Re: Denial of Service Issue

2013-10-11 Thread Richard Low
On 11 October 2013 14:03, thorsten.s...@t-systems.com wrote: I found the issue below concerning inactive client connections (see Cassandra Security, http://jkb.netii.net/index.php/pub/sinosqldb/cassandra-security). We are using Cassandra 1.2.4 and the Cassandra JDBC driver as client. Is

Re: Vnodes and replication

2014-04-08 Thread Richard Low
On 8 April 2014 09:29, vck veesee...@gmail.com wrote: After reading through the vnodes and partitioning described in the datastax documentation, I am still confused about how rows are partitioned/replicated. With vnodes, I know that each Node on the ring now supports many token ranges per

Re: How safe is nodetool move in 1.2 ?

2014-04-16 Thread Richard Low
On 16 April 2014 05:08, Jonathan Lacefield jlacefi...@datastax.com wrote: Assuming you have enough nodes not undergoing move to meet your CL requirements, then yes, your cluster will still accept reads and writes. However, it's always good to test this before doing it in production to ensure