Re: how to do a get_range_slices where all keys start with same string

2011-01-12 Thread Stephen Connolly
or set the end key to com.googlf On 12 January 2011 02:49, Aaron Morton aa...@thelastpickle.com wrote: If you were using OPP and get_range_slices then set the start_key to be com.google and the end_key to be . Get in slices of say 1,000 (use the last key read as the next start_key) and when
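A minimal sketch of the two ideas in this thread, assuming an order-preserving partitioner and plain string keys. The upper bound for a prefix scan is the prefix with its last character incremented (com.google becomes com.googlf), and paging reuses the last key read as the next start key. The names prefix_end_key, fetch_range and keys_with_prefix are hypothetical; fetch_range is an in-memory stand-in for a range query with inclusive bounds, not a Cassandra client call.

```python
from bisect import bisect_left


def prefix_end_key(prefix):
    """Return the prefix with its last character incremented by one."""
    if not prefix:
        raise ValueError("prefix must be non-empty")
    last = ord(prefix[-1])
    if last >= 0x10FFFF:
        raise ValueError("last character cannot be incremented")
    return prefix[:-1] + chr(last + 1)


def fetch_range(sorted_keys, start_key, end_key, count):
    """Hypothetical stand-in for a range query: keys in [start_key, end_key]."""
    page = []
    for key in sorted_keys[bisect_left(sorted_keys, start_key):]:
        if key > end_key or len(page) >= count:
            break
        page.append(key)
    return page


def keys_with_prefix(sorted_keys, prefix, page_size=1000):
    """Collect every key starting with prefix, one page at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    end_key = prefix_end_key(prefix)
    start_key = prefix
    result = []
    while True:
        page = fetch_range(sorted_keys, start_key, end_key, page_size)
        # The start bound is inclusive, so later pages repeat the last key read.
        fresh = [k for k in page if not result or k > result[-1]]
        # The end bound is inclusive too, so drop the end key itself if present.
        result.extend(k for k in fresh if k.startswith(prefix))
        if len(page) < page_size or not fresh:
            return result
        start_key = page[-1]
```

Because both bounds are treated as inclusive here, the sketch filters on the prefix again on the client side rather than trusting the end key alone.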

Re: Reclaim deleted rows space

2011-01-12 Thread David Boxenhorn
I think that if SSTs are partitioned within the node using RP, so that each partition is small and can be compacted independently of all other partitions, you can implement an algorithm that will spread out the work of compaction over time so that it never takes a node out of commission, as it

Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On 12 January 2011 05:28, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Whatever I do, it happens :( On Wed, Jan 12, 2011 at 1:53 AM, Arijit Mukherjee ariji...@gmail.com wrote: I think this happens for RTF. Some of the mails in the post are RTF, and the reply button creates an RTF reply -

Re: Why my posts are marked as spam?

2011-01-12 Thread David Boxenhorn
What's wrong with topposting? This email is non-plain and topposted... On Wed, Jan 12, 2011 at 4:32 PM, zGreenfelder zgreenfel...@gmail.com wrote: On 12 January 2011 05:28, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: Whatever I do, it happens :( On Wed, Jan 12, 2011 at 1:53 AM, Arijit

Re: Why my posts are marked as spam?

2011-01-12 Thread Sven Johansson
On Wed, Jan 12, 2011 at 3:46 PM, David Boxenhorn da...@lookin2.com wrote: What's wrong with topposting? A: Because it's counterintuitive to the way we read. Q: Why is top-posting bad? ...and because it disregards context and makes a thread harder to follow. -- Sven Johansson Twitter:

Usage Pattern : unique value of a key.

2011-01-12 Thread Benoit Perroud
Hi ML, I wonder if someone has already experimented with some kind of unique index on a column family key. Let's go for a short example: the key is the username. What happens if 2 users want to sign up at the same time with the same username? So has someone already addressed this pattern in Cassandra

Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On Wed, Jan 12, 2011 at 9:46 AM, David Boxenhorn da...@lookin2.com wrote: What's wrong with topposting? This email is non-plain and topposted... I suspect your origin domain (lookin2.com) gets tagged less often by spam assassin (or whatever the moral equivalent being used for this list may

Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 16:46 +0200, David Boxenhorn wrote: What's wrong with topposting? This email is non-plain and topposted... Because a little piece of me dies every time you do. -- Eric Evans eev...@rackspace.com

Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 08:39 -0800, Oleg Tsvinev wrote: And I be damned if I spam. Time to tweak some filters, eh? Maybe so. We don't have any control over that though I'm afraid. Can you submit a ticket to INFRA? https://issues.apache.org/jira/browse/INFRA On Wed, Jan 12, 2011 at 8:17 AM,

Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
Which component? Mail Archives or Mail (qmail)? On Wed, Jan 12, 2011 at 9:06 AM, Eric Evans eev...@rackspace.com wrote: On Wed, 2011-01-12 at 08:39 -0800, Oleg Tsvinev wrote: And I be damned if I spam. Time to tweak some filters, eh? Maybe so. We don't have any control over that though

Re: Why my posts are marked as spam?

2011-01-12 Thread Eric Evans
On Wed, 2011-01-12 at 09:09 -0800, Oleg Tsvinev wrote: Which component? Mail Archives or Mail (qmail)? Mail would be my guess. -- Eric Evans eev...@rackspace.com

Re: Why my posts are marked as spam?

2011-01-12 Thread zGreenfelder
On Wed, Jan 12, 2011 at 11:39 AM, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: I'm sending it from my GMail account. I'm opening a new topic, which rules out top-posting. The message had mixed fonts in it, that might be a problem. Here's what I'm getting from GMail while sending the message in

Re: Why my posts are marked as spam?

2011-01-12 Thread Oleg Tsvinev
Created: https://issues.apache.org/jira/browse/INFRA-3356 On Wed, Jan 12, 2011 at 9:25 AM, zGreenfelder zgreenfel...@gmail.com wrote: On Wed, Jan 12, 2011 at 11:39 AM, Oleg Tsvinev oleg.tsvi...@gmail.com wrote: I'm sending it from my GMail account. I'm opening a new topic, which rules out

Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Jairam Chandar
Hi folks, We have a Cassandra 0.6.6 cluster running in production. We want to run Hadoop (version 0.20.2) jobs over this cluster in order to generate reports. I modified the word_count example in the contrib folder of the cassandra distribution. While the program is running fine for small

best way to do a count

2011-01-12 Thread Michael Fortin
I was working on a schema that looks something like this: HitFamily [UUID 1] ['user-agent'] = '…' HitFamily [UUID 1] ['referer'] = '…' HitFamily [UUID 1] ['client_id'] = Long … HitCountFamily [client_id as Long] [Current Date as Long] = UUID1 What I'd like to do is count the columns between a

Re: best way to do a count

2011-01-12 Thread Aaron Morton
There is a get_count() API function http://wiki.apache.org/cassandra/API, it's going to count the columns in a row or row+super column. This function is available in me.prettyprint.cassandra.service.KeyspaceService. There are distributed counters submitted to the

Re: unsubscribe

2011-01-12 Thread Robert Coli
On Tue, Jan 11, 2011 at 10:29 PM, Nichole Kulobone nkulob...@hotmail.com wrote: http://wiki.apache.org/cassandra/FAQ#unsubscribe =Rob

Should nodetool ring give equal load ?

2011-01-12 Thread mck
I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner. When I run nodetool ring it reports Address Status State Load Owns Token

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:00 PM, mck m...@apache.org wrote: I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner. When I run nodetool ring it reports Address Status State Load Owns Token

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread Aaron Morton
What's happening in the cassandra server logs when you get these errors? Reading through the hadoop 0.6.6 code it looks like it creates a thrift client with an infinite timeout. So it may be an internode timeout, which is set in storage-conf.xml. Aaron On 13 Jan, 2011, at 07:40 AM, Jairam Chandar

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 18:40 +, Jairam Chandar wrote: Caused by: TimedOutException() What is the exception in the cassandra logs? ~mck -- Don't use Outlook. Outlook is really just a security hole with a small e-mail client attached to it. Brian Trosko | www.semb.wever.org | www.sesat.no |

Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck
You're using an ordered partitioner and your nodes are evenly spread around the ring, but your data probably isn't evenly distributed. This load number seems equal to `du -hs data_file_directories` and since I've got N == RF shouldn't the data size always be the same on every node? ~mck --

Re: Timeout Errors while running Hadoop over Cassandra

2011-01-12 Thread mck
On Wed, 2011-01-12 at 23:04 +0100, mck wrote: Caused by: TimedOutException() What is the exception in the cassandra logs? Or tried increasing rpc_timeout_in_ms? ~mck -- When there is no enemy within, the enemies outside can't hurt you. African proverb | www.semb.wever.org | www.sesat.no

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:08 PM, mck m...@apache.org wrote: You're using an ordered partitioner and your nodes are evenly spread around the ring, but your data probably isn't evenly distributed. This load number seems equal to `du -hs data_file_directories` and since I've got N == RF

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Brandon Williams
On Wed, Jan 12, 2011 at 4:08 PM, mck m...@apache.org wrote: You're using an ordered partitioner and your nodes are evenly spread around the ring, but your data probably isn't evenly distributed. This load number seems equal to `du -hs data_file_directories` and since I've got N == RF

Re: Node Inconsistency

2011-01-12 Thread Peter Schuller
We will follow your suggestion and we will run Node Repair tool more often in the future. However, what happens to data inserted/deleted after Node Repair tool runs (i.e., between Node Repair and Major Compaction). It is handled as you would expect; deletions are propagated across the cluster

Re: Advice wanted on modeling

2011-01-12 Thread Peter Schuller
The application will have a large number of records, with the records consisting of a fixed part and a number (n) of periodic parts. * The fixed part is updated occasionally. * The periodic parts are never updated, but a new one is added every 5 to 10 minutes. Only the last n periodic parts

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
So you mean the coordinator node is just responsible for routing the request. Where will the request be routed? Does the coordinator node route the request to the first replica to insert the data? Whether -Original Message- From: sc...@scode.org [mailto:sc...@scode.org] On Behalf

about the write consistency

2011-01-12 Thread raoyixuan (Shandy)
If I have 20 nodes and the replication factor is 3, do all the nodes eventually hold a replica, or are there just 3 replicas? Huawei Technologies Co., Ltd. Phone: 28358610 Mobile: 13425182943 Email: raoyix...@huawei.com Address: Huawei Base, Bantian, Longgang District, Shenzhen Postal code: 518129 Huawei

Re: about the write consistency

2011-01-12 Thread Brandon Williams
2011/1/12 raoyixuan (Shandy) raoyix...@huawei.com If I have 20 nodes and the replication factor is 3, do all the nodes eventually hold a replica, or are there just 3 replicas? 3. -Brandon
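To illustrate the answer above, here is a rough sketch of replica selection on a token ring. It assumes the simplest placement rule: the node owning the key's token, then the next nodes clockwise, until the replication factor is reached. The function replicas_for and the integer tokens are hypothetical stand-ins, not Cassandra code.

```python
from bisect import bisect_left


def replicas_for(node_tokens, key_token, replication_factor):
    """Return the tokens of the nodes that would hold a key (simple placement)."""
    ring = sorted(node_tokens)
    # The first node whose token is >= the key's token owns it; wrap at the end.
    first = bisect_left(ring, key_token) % len(ring)
    copies = min(replication_factor, len(ring))
    return [ring[(first + i) % len(ring)] for i in range(copies)]


# 20 nodes with evenly spaced, made-up tokens and a replication factor of 3.
ring = list(range(0, 200, 10))
holders = replicas_for(ring, 42, 3)
```

Whatever the cluster size, the list never holds more than replication_factor nodes, so in the 20-node example each key lives on 3 nodes and the other 17 hold no copy of it.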

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
I mean whether both the coordinator node and the replica nodes keep the inserted data, or just the replica nodes keep the inserted data and the coordinator node just routes the inserted data to the replicas. Do you get me? -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent:

Re: about the insert data

2011-01-12 Thread Tyler Hobbs
The coordinator node routes the request in parallel to all of the replicas and waits for responses. One of those replicas might happen to be the coordinator itself. Only replicas read/write data they are responsible for, not the coordinator (unless the coordinator is also a replica for that
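The routing described above can be sketched roughly as follows, under the assumption that the replica list for the key is already known. The function route_write and the dict-of-dicts stores are hypothetical, and the loop is sequential for simplicity whereas the reply describes the requests going out in parallel.

```python
def route_write(coordinator, replicas, stores, key, value):
    """Forward a write to every replica and report which copies were local."""
    local, remote = [], []
    for node in replicas:
        # Only replicas store the data; each one keeps its own copy.
        stores.setdefault(node, {})[key] = value
        (local if node == coordinator else remote).append(node)
    return local, remote


stores = {}
# n1 coordinates but is not a replica for this key, so it stores nothing.
outcome_a = route_write("n1", ["n4", "n5", "n6"], stores, "k", "v")
# n5 coordinates and also happens to be one of the replicas.
outcome_b = route_write("n5", ["n4", "n5", "n6"], stores, "k2", "v2")
```

In the first call the coordinator ends up with no data at all; in the second it stores a copy only because it is also one of the replicas.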

Re: Should nodetool ring give equal load ?

2011-01-12 Thread mck
On Wed, 2011-01-12 at 14:21 -0800, Ryan King wrote: What consistency level did you use to write the data? R=1,W=1 (reads happen a long time afterwards). ~mck -- It is now quite lawful for a Catholic woman to avoid pregnancy by a resort to mathematics, though she is still forbidden to resort

RE: about the insert data

2011-01-12 Thread raoyixuan (Shandy)
Thanks, I totally get it. From: Tyler Hobbs [mailto:ty...@riptano.com] Sent: Thursday, January 13, 2011 2:19 PM To: user@cassandra.apache.org Subject: Re: about the insert data The coordinator node routes the request in parallel to all of the replicas and waits for responses. One of those

about the data directory

2011-01-12 Thread raoyixuan (Shandy)
I have 4 nodes, then I create one keyspace (such as FOO) with replication factor = 1 and insert some data. Why can I see the directory /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica. Huawei Technologies Co., Ltd. Phone: 28358610 Mobile:

RE: about the data directory

2011-01-12 Thread Viktor Jevdokimov
I have 4 nodes, then I create one keyspace (such as FOO) with replication factor = 1 and insert some data. Why can I see the directory /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica. So why have you installed 4 nodes, not 1? They're for your data to be

RE: Advice wanted on modeling

2011-01-12 Thread Steven Mac
Date: Thu, 13 Jan 2011 01:29:33 +0100 Subject: Re: Advice wanted on modeling From: peter.schul...@infidyne.com To: user@cassandra.apache.org The application will have a large number of records, with the records consisting of a fixed part and a number (n) of periodic parts. * The fixed

Old data not indexed

2011-01-12 Thread Tan Yeh Zheng
I tried to run the example on http://www.riptano.com/blog/whats-new-cassandra-07-secondary-indexes programmatically. After I index the column state, I tried to get_indexed_slices (where state = 'UT') but it returned an empty list. But if I index first, then query, it'll return the correct result.

Re: Usage Pattern : "unique" value of a key.

2011-01-12 Thread Oleg Anastasyev
Benoit Perroud benoit at noisette.ch writes: My idea to solve such a use case is to have both threads write the username, but with a column like lock-RANDOM VALUE, and then read the row, and find out if the first lock column appearing belongs to the thread. If this is the case, it can continue