Re: Cassandra on iSCSI?

2011-01-21 Thread Mick Semb Wever
Of course with a SAN you'd want RF=1 since it's replicating internally. Isn't this the same case for raid-5 as well? And we want RF=2 if we need to keep reading while doing rolling restarts? ~mck -- “Anyone who lives within their means suffers from a lack of imagination.” - Oscar Wilde
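
The RF question above comes down to simple replica arithmetic. A minimal sketch, assuming the usual quorum rule of floor(RF / 2) + 1; the class and method names are hypothetical and not part of Cassandra:

```java
/** Illustrative replica arithmetic; names are hypothetical, not a Cassandra API. */
final class ReplicaMath {
    /** Replicas that must respond for a QUORUM operation: floor(RF / 2) + 1. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /**
     * Replicas that may be down while an operation needing
     * `required` responses can still succeed.
     */
    static int tolerableFailures(int replicationFactor, int required) {
        return Math.max(0, replicationFactor - required);
    }
}
```

With RF=1 a read at consistency level ONE tolerates no replica being down, while RF=2 tolerates one, which is why a rolling restart needs at least RF=2 to keep serving reads for every key.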

Re: Cassandra on iSCSI?

2011-01-21 Thread Mick Semb Wever
[OT] They're quoting roughly the same price for both (claiming that the extra cost goes into having for each node a separate disk cabinet to run local raid-5). You might not need raid-5 for local attached storage. Yes we did ask. But raid-5 is the

Re: Use Cassandra to store 2 million records of persons

2011-01-21 Thread Dave Gardner
Our experience of Cassandra+Hadoop is good. We have a 16 node Cassandra cluster storing 110m users plus a 5 node Hadoop cluster. We can scan through all rows in about 2.5 hours. Dave On Thursday, 20 January 2011, David G. Boney dbon...@semanticartifacts.com wrote: I don't think the below

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Daniel Josefsson
No, what I'm thinking of is having two clusters (0.6 and 0.7) running on different ports so they can't find each other. Or isn't that configurable? Then, when I have the two clusters, I could upgrade all of the clients to run against the new cluster, and finally upgrade the rest of the Cassandra

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Dave Gardner
What about executing writes against both clusters during the changeover? Interested in this topic because we're currently thinking about the same thing - how to upgrade to 0.7 without any interruption. Dave On 21 January 2011 09:20, Daniel Josefsson jid...@gmail.com wrote: No, what I'm
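
A rough sketch of the dual-write idea, assuming each cluster's client is hidden behind a small interface. `ClusterWriter` and `DualWriter` are hypothetical names, not part of Hector or the Thrift API, and the policy of treating the old cluster as the source of truth is an assumption:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal write interface; not part of Hector or Thrift. */
interface ClusterWriter {
    void write(String key, String column, String value) throws Exception;
}

/** Sends every write to the old cluster first, then mirrors it to the new one. */
final class DualWriter implements ClusterWriter {
    private final ClusterWriter oldCluster;
    private final ClusterWriter newCluster;
    // Keys whose mirrored write failed and should be replayed later.
    private final List<String> failedMirrorKeys = new ArrayList<>();

    DualWriter(ClusterWriter oldCluster, ClusterWriter newCluster) {
        this.oldCluster = oldCluster;
        this.newCluster = newCluster;
    }

    @Override
    public void write(String key, String column, String value) throws Exception {
        // Assumption: the old cluster stays authoritative, so its failures propagate.
        oldCluster.write(key, column, value);
        try {
            newCluster.write(key, column, value);
        } catch (Exception e) {
            // The new cluster is best-effort during the changeover.
            failedMirrorKeys.add(key);
        }
    }

    List<String> failedMirrorKeys() {
        return failedMirrorKeys;
    }
}
```

Keys collected in `failedMirrorKeys()` would need to be replayed against the new cluster before clients are switched over.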

client threads locked up - JIRA ISSUE 1594

2011-01-21 Thread Arijit Mukherjee
Hi All I'm facing the same issue as this one mentioned here - https://issues.apache.org/jira/browse/CASSANDRA-1594 Is there any solution or work-around for this? Regards Arijit -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be.

Re: Does Major Compaction work on dropped CFs? Doesn't seem so.

2011-01-21 Thread Maxim Potekhin
Thanks Jonathan, sorry for confusing terms. In any event -- does one have to whack the obsoleted data manually on all nodes?? Regards Maxim On 1/20/2011 10:51 PM, Jonathan Ellis wrote: obsolete sstables are not the same thing as tombstones. On Thu, Jan 20, 2011 at 8:11 PM,

the java client problem

2011-01-21 Thread raoyixuan (Shandy)
I ran the code below with the Hector client: package com.riptano.cassandra.hector.example; import me.prettyprint.cassandra.serializers.StringSerializer; import me.prettyprint.hector.api.Cluster; import me.prettyprint.hector.api.Keyspace; import me.prettyprint.hector.api.beans.HColumn; import

Re: the java client problem

2011-01-21 Thread Ashish
You are missing the column family in your keyspace. If you are using the default schema definitions shipped with Cassandra, be sure to load the schema via JMX. thanks ashish 2011/1/21 raoyixuan (Shandy) raoyix...@huawei.com I exec the code as below by hector client: package

RE: the java client problem

2011-01-21 Thread raoyixuan (Shandy)
Which schema is it? From: Ashish [mailto:paliwalash...@gmail.com] Sent: Friday, January 21, 2011 7:57 PM To: user@cassandra.apache.org Subject: Re: the java client problem you are missing the column family in your keyspace. If you are using the default definitions of schema shipped with

Re: the java client problem

2011-01-21 Thread Ashish
Check cassandra-install-dir/conf/cassandra.yaml, start Cassandra, connect via jconsole, find MBeans -> org.apache.cassandra.db -> StorageService (http://wiki.apache.org/cassandra/StorageService) -> Operations -> loadSchemaFromYAML, load the schema and then try the example again. HTH ashish 2011/1/21

Multiple indexes - how does Cassandra handle these internally?

2011-01-21 Thread buddhasystem
Greetings -- if I use multiple secondary indexes in the query, what will Cassandra do? Some examples say it will index on first EQ and then loop on others. Does it ever do a proper index product to avoid inner loops? Thanks Maxim -- View this message in context:

the problem of elasticity

2011-01-21 Thread 魏金仙
I've no idea why it doesn't work well. We are testing the elasticity of Cassandra 0.6.6. We chose orderPreservingPartitioner and set replicationFactor to 2. We started from a 6-server cluster (nodes A, B, C, D, E, F), which is load balanced; every node holds roughly 12GB. We then added node G between A and B.

java.lang.AssertionError in MessagingService.receive during heavy write.

2011-01-21 Thread Michael Haspra
Hi all, I get the following error when I have cassandra running on 2 nodes (I don't get it when I start only one node). The startup on both nodes seems to be fine (e.g. no error messages). Then I set up a keyspace and insert some data on one node, that also works. I start to insert data on both

Re: Multiple indexes - how does Cassandra handle these internally?

2011-01-21 Thread Timo Nentwig
On Jan 21, 2011, at 13:55, buddhasystem wrote: if I use multiple secondary indexes in the query, what will Cassandra do? Some examples say it will index on first EQ and then loop on others. Does it ever do a proper index product to avoid inner loops? Just asked the same question on the

Re: Multiple indexes - how does Cassandra handle these internally?

2011-01-21 Thread Maxim Potekhin
But Timo, this is even more mysterious! If both conditions are met, at least something must be returned in the second query. Have you tried this in CLI? That would allow you to at least alleviate client concerns. On 1/21/2011 10:38 AM, Timo Nentwig wrote: On Jan 21, 2011, at 13:55, buddhasystem

Re: Multiple indexes - how does Cassandra handle these internally?

2011-01-21 Thread Timo Nentwig
On Jan 21, 2011, at 16:46, Maxim Potekhin wrote: But Timo, this is even more mysterious! If both conditions are met, at least something must be returned in the second query. Have you tried this in CLI? That would allow you to at least alleviate client concerns. I did this on the CLI only so

Re: Embedded Cassandra server startup question

2011-01-21 Thread Anand Somani
It is a little slow, but not to the point where it concerns me (I only have a few tests for now), and it keeps things very clean, so no surprise effects. On Thu, Jan 20, 2011 at 6:33 PM, Roshan Dawrani roshandawr...@gmail.com wrote: On Fri, Jan 21, 2011 at 5:14 AM, Anand Somani meatfor...@gmail.com wrote:

Re: Multiple indexes - how does Cassandra handle these internally?

2011-01-21 Thread Maxim Potekhin
Well it does sound like a bug in Cassandra. Indexes MUST commute. I really need this functionality, it's a show stopper for me... On 1/21/2011 10:56 AM, Timo Nentwig wrote: On Jan 21, 2011, at 16:46, Maxim Potekhin wrote: But Timo, this is even more mysterious! If both conditions are met, at

Re: client threads locked up - JIRA ISSUE 1594

2011-01-21 Thread Nate McCall
What versions of Cassandra and Hector? The versions mentioned on this ticket are both several releases behind. On Fri, Jan 21, 2011 at 3:53 AM, Arijit Mukherjee ariji...@gmail.com wrote: Hi All I'm facing the same issue as this one mentioned here -

Re: UnserializableColumnFamilyException: Couldn't find cfId

2011-01-21 Thread Ching-Cheng Chen
We had a similar exception before, and the root cause was as Aaron mentioned. You will encounter this exception if your code creates a CF on the fly and data is inserted into a node which hasn't synced the schema yet. You will have to call describe_schema_version() to ensure all nodes have
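
A minimal sketch of waiting for schema agreement before writing to a freshly created CF. The `fetchVersions` supplier is a hypothetical stand-in for whatever client call returns the schema versions; the map shape (schema version to list of hosts) and the `UNREACHABLE` key are assumptions to check against the client in use:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative helper; not part of Cassandra or Hector. */
final class SchemaAgreement {
    /** Assumed key under which hosts that did not answer are grouped. */
    static final String UNREACHABLE = "UNREACHABLE";

    /** True when every reachable node reports the same schema version. */
    static boolean isAgreed(Map<String, List<String>> versionsToHosts) {
        long liveVersions = versionsToHosts.keySet().stream()
                .filter(version -> !UNREACHABLE.equals(version))
                .count();
        return liveVersions == 1;
    }

    /** Polls the supplier until the versions agree or the attempts run out. */
    static boolean waitForAgreement(Supplier<Map<String, List<String>>> fetchVersions,
                                    int maxAttempts,
                                    long sleepMillis) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isAgreed(fetchVersions.get())) {
                return true;
            }
            Thread.sleep(sleepMillis);
        }
        return false;
    }
}
```

Calling `waitForAgreement` right after creating the CF, and inserting only once it returns true, would avoid writing to a node that has not yet learned the new cfId.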

The authorize method of IAuthority

2011-01-21 Thread indika kumara
Hi All, Shouldn't the existing method be changed to the following? public boolean authorize(AuthenticatedUser user, List<Object> resource, Permission permission); // checks the authority for a given user for a given resource for a given permission The existing method: public EnumSet<Permission>
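
One way to see how the two shapes relate: if the existing method returns the set of permissions a user holds on a resource, the proposed boolean check can be derived from it. The sketch below uses hypothetical stand-in types (a `String` user, a two-value `Permission` enum, a `SetAuthority` interface) rather than Cassandra's real classes:

```java
import java.util.EnumSet;
import java.util.List;

/** Illustrative only; these types are stand-ins, not Cassandra's IAuthority. */
final class AuthorizeSketch {
    /** Hypothetical permission values. */
    enum Permission { READ, WRITE }

    /** Assumed shape: returns every permission the user holds on the resource. */
    interface SetAuthority {
        EnumSet<Permission> authorize(String user, List<Object> resource);
    }

    /** The proposed boolean check, derived from the set-returning method. */
    static boolean isAuthorized(SetAuthority authority,
                                String user,
                                List<Object> resource,
                                Permission permission) {
        return authority.authorize(user, resource).contains(permission);
    }
}
```

Returning the whole set lets a caller test several permissions with one lookup, whereas the boolean form needs one call per permission.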

Re: Cassandra on iSCSI?

2011-01-21 Thread Jonathan Ellis
On Fri, Jan 21, 2011 at 2:19 AM, Mick Semb Wever m...@apache.org wrote: Of course with a SAN you'd want RF=1 since it's replicating internally. Isn't this the same case for raid-5 as well? No, because the replication is (mainly) to protect you from machine failures; if the SAN is a SPOF then

Re: The authorize method of IAuthority

2011-01-21 Thread Eric Evans
On Fri, 2011-01-21 at 22:45 +0600, indika kumara wrote: Shouldn't the existing method be changed to the following? public boolean authorize(AuthenticatedUser user, List<Object> resource, Permission permission); // checks the authority for a given user for a given resource for a given

Re: Does Major Compaction work on dropped CFs? Doesn't seem so.

2011-01-21 Thread Jonathan Ellis
No, that's explained in the link Aaron gave. On Fri, Jan 21, 2011 at 5:20 AM, Maxim Potekhin potek...@bnl.gov wrote: Thanks Jonathan, sorry for confusing terms. In any event -- does one have to whack the obsoleted data manually on all nodes?? Regards Maxim On 1/20/2011 10:51 PM,

Re: java.lang.AssertionError in MessagingService.receive during heavy write.

2011-01-21 Thread Jonathan Ellis
What version? On Fri, Jan 21, 2011 at 9:20 AM, Michael Haspra mhas...@gmail.com wrote: Hi all, I get the following error when I have cassandra running on 2 nodes (I don't get it when I start only one node). The startup on both nodes seems to be fine (e.g. no error messages). Then I set up a

Re: java.lang.AssertionError in MessagingService.receive during heavy write.

2011-01-21 Thread Michael Haspra
Oh sorry: The version is 0.7.0-beta3-SNAPSHOT 2011/1/21 Jonathan Ellis jbel...@gmail.com What version? On Fri, Jan 21, 2011 at 9:20 AM, Michael Haspra mhas...@gmail.com wrote: Hi all, I get the following error when I have cassandra running on 2 nodes (I don't get it when I start

Re: Distributed counters

2011-01-21 Thread Rustam Aliyev
Hi Kelvin, Thanks for sharing! That's exactly what I was looking for. Good luck with the migration. Regards, Rustam. On 20/01/2011 17:40, Kelvin Kakugawa wrote: Hi Rustam, All of our large production clusters are still on 0.6.6. However, we have an 0.7 branch, here:

GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Joseph Stein
I hear that a bunch of folks have GeoIndexing built on top of Cassandra and running in production. Are any of them open sourced (Twitter? SimpleGeo? Bueller?) or planning on it? /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Ryan King
Not open source, but here's a preso on how simplegeo do it: http://www.slideshare.net/mmalone/scaling-gis-data-in-nonrelational-data-stores Note: we do it very differently here at Twitter (but aren't at liberty to discuss in detail). I say this just to point out that there are several valid

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Jake Luciani
One possible open source approach would be to use the Solr 1.4 spatial plugin[1] along with Solandra[2]. What kind of spatial searches are you looking for? Basic bounding box/radius? [1] https://github.com/outoftime/solr-spatial-light [2] https://github.com/tjake/lucandra On Fri, Jan 21, 2011

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Mike Malone
A more recent preso I gave about the SimpleGeo architecture is up at http://strangeloop2010.com/system/talks/presentations/000/014/495/Malone-DimensionalDataDHT.pdf Mike On Fri, Jan 21, 2011 at 10:02 AM, Joseph Stein crypt...@gmail.com wrote: I hear that a bunch of folks have GeoIndexing built

Re: Cassandra on iSCSI?

2011-01-21 Thread Edward Capriolo
On Fri, Jan 21, 2011 at 12:07 PM, Jonathan Ellis jbel...@gmail.com wrote: On Fri, Jan 21, 2011 at 2:19 AM, Mick Semb Wever m...@apache.org wrote: Of course with a SAN you'd want RF=1 since it's replicating internally. Isn't this the same case for raid-5 as well? No, because the replication

Re: java.lang.AssertionError in MessagingService.receive during heavy write.

2011-01-21 Thread Jonathan Ellis
I don't see an assert in current 0.7 MessagingService that looks like a candidate for that. So it's probably fixed. Since apparently you're comfortable running snapshot builds, I'd upgrade to the latest 0.7 branch. At least then you'd be running into new bugs and not two month old ones. On

Re: Cassandra on iSCSI?

2011-01-21 Thread Anthony John
Sort of - do not agree!! This is the Shared nothing V/s Shared Disk debate. There are many mainstream RDBMS products that pretend to do horizontal scalability with Shared Disks. They have the kinds of problems that Cassandra is specifically architected to avoid! The original question here has 2

Re: Does Major Compaction work on dropped CFs? Doesn't seem so.

2011-01-21 Thread Peter Schuller
What's strange anyhow is that the GC period for these cfs expired some days ago. I thought that a compaction would take care of these tombstones. I used nodetool to compact. I think the confusion here is that GC when mentioned in terms of sstable removal refers to the JVM garbage collection.

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Aaron Morton
Yup, you can use diff ports and you can give them different cluster names and different seed lists. After you upgrade the second cluster partition the data should repair across, either via RR or the HHs that were stored while the first partition was down. Easiest thing would be to run node

Re: the problem of elasticity

2011-01-21 Thread Peter Schuller
In general, if you think the data is not distributed correctly, run nodetool repair on the node. http://wiki.apache.org/cassandra/Operations#Repairing_missing_or_inconsistent_data And before expecting the old node to throw away its data, 'nodetool cleanup' is required (but don't do this while

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Dave Viner
I agree. I am running a 0.6 cluster and would like to upgrade to 0.7. But, I can not simply stop my existing nodes. I need a way to load a new cluster - either on the same machines or new machines - with the existing data. I think my overall preference would be to upgrade the cluster to 0.7

Re: java.lang.AssertionError in MessagingService.receive during heavy write.

2011-01-21 Thread Michael Haspra
It seems that this error was caused by an extension sending wrong messages around, so that Message.getMessageType() would return null since the verb was not known to cassandra. Unfortunately I couldn't tell from the error. But upgrading would be a good idea anyway... 2011/1/21 Jonathan Ellis

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Joseph Stein
On Fri, Jan 21, 2011 at 1:49 PM, Mike Malone m...@simplegeo.com wrote: A more recent preso I gave about the SimpleGeo architecture is up at http://strangeloop2010.com/system/talks/presentations/000/014/495/Malone-DimensionalDataDHT.pdf Mike On Fri, Jan 21, 2011 at 10:02 AM, Joseph Stein

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Anthony Molinaro
Dual writes would require you to have both a 0.6 and 0.7 client in the same code base unless you have some sort of intermediate file or queue or something. Since 0.6 and 0.7 use the same names in their thrift files this won't work, thus my suggestion of adding a second service to the 0.6 and 0.7

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Ryan King
On Fri, Jan 21, 2011 at 12:24 PM, Joseph Stein crypt...@gmail.com wrote: Thanks Ryan, Jake and Mike for the quick responses. I will mull through this weekend between engineering things from scratch or going the Solr/Solandra route as Jake points out is an option (and the effort/time related

Re: Upgrading from 0.6 to 0.7.0

2011-01-21 Thread Stephen Connolly
the maven shade plugin might be able to help somewhat... if I get some spare cycles I'll have a look at knocking up a thrift proxy that either makes 0.7 appear as 0.6 or vice versa - Stephen --- Sent from my Android phone, so random spelling mistakes, random nonsense words and other nonsense are

Re: Question re: the use of multiple ColumnFamilies

2011-01-21 Thread Robert Coli
On 1/18/11, Andy Burgess andy.burg...@rbsworldpay.com wrote: Sorry for the delayed reply, but thanks very much - this pointed me at the exact problem. I found that the queue size here was equal to the number of configured DataFileDirectories, so a good test was to lie to Cassandra and claim

Re: Question re: the use of multiple ColumnFamilies

2011-01-21 Thread Peter Schuller
A number of people have experienced lose from using multiple DataFileDirectories, and to my knowledge no one has experienced win from doing so. I presume that's disk space reasons. Do you have an actual use case for this functionality in which you experience win? I understood his use case

Re: The authorize method of IAuthority

2011-01-21 Thread indika kumara
Thanks Eric for the clarification. On Fri, Jan 21, 2011 at 11:11 PM, Eric Evans eev...@rackspace.com wrote: On Fri, 2011-01-21 at 22:45 +0600, indika kumara wrote: Shouldn't the existing method be changed to the following? public boolean authorize(AuthenticatedUser user, List<Object>