Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Not sure if anyone has seen this before but it's really killing me right now. Perhaps that was too long of a description of the issue so here's a more succinct question -- How do I remove nodes associated with a cluster that contain no data and have no reason to be associated with the cluster

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
I don't think that's what I'm after here since the unwanted nodes were originally assimilated into the cluster with the same initial_token values as other nodes that were already in the cluster (that have, and still do have, useful data). I know this is an awkward situation so I'll try to depict

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Stephen, this is a great idea but unfortunately doesn't work for us either as we cannot store the data in an unencrypted form. Kind regards Matthias On 10/12/2011 07:42 PM, Stephen Connolly wrote: could you prefix the data with 3-4 bytes of a linear hash of the unencrypted data? it

Re: Existing column(s) not readable

2011-10-13 Thread Thomas Richter
Hi Aaron, I guess I found it :-). I added logging for the used IndexInfo to SSTableNamesIterator.readIndexedColumns and got negative index positions for the missing columns. This is the reason why the columns are not loaded from sstable. So I had a look at ColumnIndexer.serializeInternal and

Re: Cassandra as session store under heavy load

2011-10-13 Thread Maciej Miklas
durable_writes sounds great - thank you! I really do not need commit log here. Another question: is it possible to configure the lifetime of tombstones? Regards, Maciej

Re: Storing pre-sorted data

2011-10-13 Thread Zach Richardson
Matthias, This is an interesting problem. I would consider using longs as the column type, where your column names are evenly distributed longs in sort order when you first write your list out. So if you have items A and C with the long column names 1000 and 2000, and then you have to insert
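A minimal sketch of the gapped column-name idea, assuming plain Java with an in-memory TreeMap standing in for one Cassandra row whose column names are longs. GapList, GAP, and the method names are hypothetical and are not part of Cassandra or Hector; the sketch also assumes positive column names so that the subtraction cannot overflow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical helper: keeps list items sorted by evenly spaced long names.
final class GapList {
    // Assumed spacing between names on the initial write.
    static final long GAP = 1000L;

    // Sorted map standing in for a row with long column names.
    private final TreeMap<Long, String> columns = new TreeMap<>();

    // Appends a value one GAP after the current last name.
    void append(String value) {
        long name = columns.isEmpty() ? GAP : columns.lastKey() + GAP;
        columns.put(name, value);
    }

    // Inserts a value halfway between two neighbouring names.
    // Returns false when no unused name is left between them,
    // which is the point where the list would have to be rewritten.
    boolean insertBetween(long left, long right, String value) {
        if (right - left < 2) {
            return false;
        }
        long name = left + (right - left) / 2;
        columns.put(name, value);
        return true;
    }

    // Values in column-name order.
    List<String> values() {
        return new ArrayList<>(columns.values());
    }

    public static void main(String[] args) {
        GapList list = new GapList();
        list.append("A");                       // stored under 1000
        list.append("C");                       // stored under 2000
        list.insertBetween(1000L, 2000L, "B");  // stored under 1500
        System.out.println(list.values());
    }
}
```

Each insert between the same two neighbours halves the remaining gap, so heavily skewed inserts exhaust a gap of 1000 after about ten steps; a wider initial spacing delays the rewrite but does not remove it.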

Re: [Solved] column index offset miscalculation (was: Existing column(s) not readable)

2011-10-13 Thread Sylvain Lebresne
JIRA is not read-only, you should be able to create a ticket at https://issues.apache.org/jira/browse/CASSANDRA, though that probably requires that you create an account. -- Sylvain On Thu, Oct 13, 2011 at 3:20 PM, Thomas Richter t...@tricnet.de wrote: Hi Aaron, the fix does the trick. I

Re: supercolumns vs. prefixing columns of same data type?

2011-10-13 Thread hani elabed
Hi Dean, I don't have an answer to your question, but just in case you haven't seen this screencast by Ed Anuff on Cassandra Indexes, it helped me a lot. http://blip.tv/datastax/indexing-in-cassandra-5495633 Hani On Wed, Oct 12, 2011 at 12:18 PM, Dean Hiller d...@alvazan.com wrote: I

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Mohit Anchlia
Do you have the same seed node specified in cass-analysis-1 as cass-1,2,3? I am thinking that changing the seed node in cass-analysis-2 and following the directions in http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve the problem. Someone please correct me. On Thu, Oct 13, 2011 at

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Zach, thanks for that good idea. Unfortunately, our list needs to be rewritten often because our data is far away from being evenly distributed. However, we could get this under control but there is a more severe problem: Random access is very hard to implement on a structure with

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Nope, there was definitely no intersection of the seed nodes between the two clusters so I'm fairly certain that the second cluster found out about the first through what was in the LocationInfo* system tables. Also, I don't think that procedure will really help because I don't actually want the

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Brandon Williams
You're running into https://issues.apache.org/jira/browse/CASSANDRA-3259 Try upgrading and doing a rolling restart. -Brandon On Thu, Oct 13, 2011 at 9:11 AM, Eric Czech e...@nextbigsound.com wrote: Nope, there was definitely no intersection of the seed nodes between the two clusters so I'm

Re: [Solved] column index offset miscalculation

2011-10-13 Thread Thomas Richter
Thanks for the hint. Ticket created: https://issues.apache.org/jira/browse/CASSANDRA-3358 Best, Thomas On 10/13/2011 03:27 PM, Sylvain Lebresne wrote: JIRA is not read-only, you should be able to create a ticket at https://issues.apache.org/jira/browse/CASSANDRA, though that probably

Re: supercolumns vs. prefixing columns of same data type?

2011-10-13 Thread Dean Hiller
great video, thanks! On Thu, Oct 13, 2011 at 7:45 AM, hani elabed hani.ela...@gmail.com wrote: Hi Dean, I don't have an answer to your question, but just in case you haven't seen this screencast by Ed Anuff on Cassandra Indexes, it helped me a lot.

Re: Hector Problem Basic one

2011-10-13 Thread Patricio Echagüe
Hi, Hector does not retry on a down server. In the unit tests where you have just one server, Hector will pass the exception to the client. Can you tell us please what your test looks like? 2011/10/12 Wangpei (Peter) peter.wang...@huawei.com I only saw this error message when all Cassandra

RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
I upgraded to cassandra 0.8.7, and the problem persists. Scott From: Brandon Williams [dri...@gmail.com] Sent: Monday, October 10, 2011 12:28 PM To: user@cassandra.apache.org Subject: Re: MapReduce with two ethernet cards On Mon, Oct 10, 2011 at 11:47 AM,

Re: Efficiency of hector's setRowCount

2011-10-13 Thread Patricio Echagüe
Hi Don. No it will not. IndexedSlicesQuery will read just the number of rows specified by RowCount and will go to the DB to get the new page when needed. SetRowCount is doing indexClause.setCount(rowCount); On Mon, Oct 10, 2011 at 3:52 PM, Don Smith dsm...@likewise.com wrote: Hector's
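A sketch of the page-by-page pattern that a row count combined with a start key makes possible. This is not Hector code: it is a plain-Java simulation over a TreeMap, and PagedScan, fetchPage, and scanAll are hypothetical names. It assumes the start key is inclusive, so every page after the first begins with the last row of the previous page and that row is dropped. It also assumes rows come back in key order, which a real cluster only does with an order-preserving partitioner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical simulation of paging with a start key and a row count.
final class PagedScan {

    // Stand-in for one query: at most rowCount keys, starting at startKey (inclusive).
    static List<String> fetchPage(NavigableMap<String, String> rows, String startKey, int rowCount) {
        List<String> page = new ArrayList<>();
        for (String key : rows.tailMap(startKey, true).keySet()) {
            if (page.size() == rowCount) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    // Walks all rows one page at a time and returns every key exactly once.
    static List<String> scanAll(NavigableMap<String, String> rows, int rowCount) {
        if (rowCount < 2) {
            // With a page size of 1 the inclusive start key would never advance.
            throw new IllegalArgumentException("rowCount must be at least 2");
        }
        List<String> seen = new ArrayList<>();
        if (rows.isEmpty()) {
            return seen;
        }
        String startKey = rows.firstKey();
        boolean firstPage = true;
        while (true) {
            List<String> page = fetchPage(rows, startKey, rowCount);
            // Skip the first row of later pages: it repeats the previous page's last row.
            seen.addAll(page.subList(firstPage ? 0 : 1, page.size()));
            if (page.size() < rowCount) {
                break; // a short page means there is nothing left to read
            }
            startKey = page.get(page.size() - 1);
            firstPage = false;
        }
        return seen;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> rows = new TreeMap<>();
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            rows.put(key, "value-" + key);
        }
        System.out.println(scanAll(rows, 3));
    }
}
```

Only one page of keys is held per request, which is the property the reply above describes; the cost is one repeated row per page.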

Re: Efficiency of hector's setRowCount (and setStartKey!)

2011-10-13 Thread Don Smith
It's actually setStartKey that's the important method call (in combination with setRowCount). So I should have been clearer. The following code performs as expected, as far as returning the expected data in the expected order. I believe that the use of IndexedSlicesQuery's setStartKey will

Re: Efficiency of hector's setRowCount (and setStartKey!)

2011-10-13 Thread Patricio Echagüe
On Thu, Oct 13, 2011 at 9:39 AM, Don Smith dsm...@likewise.com wrote: It's actually setStartKey that's the important method call (in combination with setRowCount). So I should have been clearer. The following code performs as expected, as far as returning the expected data in the

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Zach, thanks for your additional input. You are absolutely right: The long namespace should be big enough. We are going to insert up to 2^32 values into the list. We only need support for get(index), insert(index) and remove(index) while get and insert will be used very often. Remove is

Re: MapReduce with two ethernet cards

2011-10-13 Thread Brandon Williams
What is your rpc_address set to? If it's 0.0.0.0 (bind everything) then that's not going to work if listen_address is blocked. -Brandon On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines scott.fi...@nisc.coop wrote: I upgraded to cassandra 0.8.7, and the problem persists. Scott

RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
The listen address on all machines is set to the 10.1.1.* addresses, while the thrift rpc address is set to the 172.28.* addresses. From: Brandon Williams [dri...@gmail.com] Sent: Thursday, October 13, 2011 12:28 PM To: user@cassandra.apache.org Subject: Re:

RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
When I look at the source for ColumnFamilyInputFormat, it appears that it does a call to client.describe_ring; when you do the equivalent call with nodetool, you get the 10.1.1.* addresses. This seems to indicate to me that I should open up the firewall and attempt to contact those IPs

Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
in theory, however they have less than 32 bits of entropy from which they can do that, leaving them with at least 32 more bits of combinations to try... that's 4 billion or so... must be a big dictionary - Stephen --- Sent from my Android phone, so random spelling mistakes, random nonsense words

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Stephen, we are hashing the first 8 bytes (8 US-ASCII characters) of text that has been written by humans. Wouldn't it be easy for the attacker to do a dictionary attack on this text, especially if he knows the language of the text? Kind regards Matthias On 10/13/2011 08:20 PM, Stephen

Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
Then just use a soundex function on the first word in the text... that will shrink it sufficiently and give nice buckets in near sequential order (http://en.wikipedia.org/wiki/Soundex) On 13 October 2011 21:21, Matthias Pfau p...@l3s.de wrote: Hi Stephen, we are hashing the first 8 bytes (8
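A sketch of what a Soundex code on the first word might look like as a coarse sort prefix, assuming American Soundex and ASCII text. SoundexPrefix, soundex, and prefixFor are hypothetical names, and how such a prefix would be stored next to the encrypted value is not specified in the thread.

```java
// Hypothetical helper: American Soundex of the first word, used as a bucket prefix.
final class SoundexPrefix {
    // Soundex digit for each letter a..z; '0' marks vowels and other uncoded letters.
    private static final String CODES = "01230120022455012623010202";

    // Returns the four-character Soundex code (one letter plus three digits).
    static String soundex(String word) {
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (char raw : word.toCharArray()) {
            char c = Character.toLowerCase(raw);
            if (c < 'a' || c > 'z') {
                continue; // ignore anything that is not an ASCII letter
            }
            char code = CODES.charAt(c - 'a');
            if (out.length() == 0) {
                out.append(Character.toUpperCase(c)); // keep the first letter as-is
                last = code;
            } else if (c != 'h' && c != 'w') {
                // h and w are skipped without separating equal codes on either side.
                if (code != '0' && code != last) {
                    out.append(code);
                }
                last = code;
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    // Soundex code of the first whitespace-separated word of the text.
    static String prefixFor(String text) {
        String firstWord = text.trim().split("\\s+")[0];
        return soundex(firstWord);
    }

    public static void main(String[] args) {
        System.out.println(prefixFor("Robert wrote this message"));
    }
}
```

The prefix keeps the first letter in the clear and groups similar-sounding words into one bucket, so ordering within a bucket is lost and the prefix itself reveals something about the plaintext.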

Re: Cassandra as session store under heavy load

2011-10-13 Thread Jonathan Ellis
Or upgrade to 1.0 and use leveled compaction (http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra) On Thu, Oct 13, 2011 at 4:28 PM, aaron morton aa...@thelastpickle.com wrote: They only have a minimum time, gc_grace_seconds for deletes. If you want to be really watch disk

Restore snapshots suggestion

2011-10-13 Thread Daning
If I need to restore snapshots from all nodes, but I can only shut down one node at a time since it is production, is there a way I can stop data syncing between nodes temporarily? I don't want the existing data to overwrite the snapshot. I found this undocumented parameter

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Thanks Brandon! Out of curiosity, would making schema changes through a thrift interface (via hector) be any different? In other words, would using hector instead of the cli make schema changes possible without upgrading? On Thu, Oct 13, 2011 at 8:22 AM, Brandon Williams dri...@gmail.com wrote: