Re: migrating from 0.6 to 0.8, java.io.IOError: ... cannot extend file to required size

2011-08-11 Thread Zhu Han
On Wed, Aug 10, 2011 at 5:24 PM, aaron morton aa...@thelastpickle.com wrote: I remember seeing this once before upgrading a system from 0.6 to 0.7 on an Ubuntu EC2 (non-DataStax build) with EBS disks. I did the same thing and just assumed it was an EBS or 0.6 bug. From memory after the upgrade

AW: IndexSliceQuery issue - ReadStage piling up (looks like deadlock/infinite loop or similar)

2011-08-11 Thread Roland Gude
Yes, I can reproduce this behavior. If I issue a query like this (on 0.7.8 with the patch for CASSANDRA-2964 applied): [default@demo] get users where birth_date = 1968 and state = 'UT'; with an index on birth_date but no index on state, I do not get results (actually I get '0 rows') even though there are

Re: SOLVED: Problem upgrading to 0.8.3 - replication_factor is an option for SimpleStrategy, not NetworkTopologyStrategy

2011-08-11 Thread Martin Lansler
Hi Eldad / All, On Wed, Aug 10, 2011 at 8:32 AM, Eldad Yamin elda...@gmail.com wrote: Can you please explain how you upgraded? Something like step-by-step. Thanks! I took the liberty of replying to the group as it would be interesting to hear how other folks out there are doing it... I'm

Re: Need help in CF design

2011-08-11 Thread Benoit Perroud
You can apply this query really simply using cassandra and secondary indexes. You will have a CF TABLE, where row keys are your PK. Just to be sure of my understanding, your SQL query will either return 1 row or no row, right ? 3) SliceQuery returns a range of columns for a given key, it

Re: Tuning a column family for archival

2011-08-11 Thread Edward Capriolo
On Thu, Aug 11, 2011 at 12:07 AM, aaron morton aa...@thelastpickle.com wrote: There's not much to do other than turn off the caches (which you have done) and leave it alone. If you want to poke around perhaps look at the compaction settings (from CLI help): - max_compaction_threshold: The

Best practices when deploying upgrading a cassandra cluster

2011-08-11 Thread Martin Lansler
(Note: This is a repost from another thread which did not have a relevant subject, sorry for the spamming) Hi Eldad / All, On Wed, Aug 10, 2011 at 8:32 AM, Eldad Yamin elda...@gmail.com wrote: Can you please explain how you upgraded? Something like step-by-step. Thanks! I took the liberty

Re: SOLVED: Problem upgrading to 0.8.3 - replication_factor is an option for SimpleStrategy, not NetworkTopologyStrategy

2011-08-11 Thread Martin Lansler
Please continue the discussion in the new thread "Best practices when deploying upgrading a cassandra cluster". -Martin On Thu, Aug 11, 2011 at 12:43 PM, Martin Lansler martin.lans...@gmail.com wrote: Hi Eldad / All, On Wed, Aug 10, 2011 at 8:32 AM, Eldad Yamin elda...@gmail.com wrote: Can you please

Re: Ec2Snitch

2011-08-11 Thread Viliam Holub
Yup, works perfectly now. Thanks, V. On 10 Aug (Wednesday) at 20:45:36 -0500 2011, Brandon Williams wrote: You probably have other nodes that are NOT using the snitch yet, so they haven't populated DC/RACK info yet. The exceptions will stop when all snitches have been changed. On Wed, Aug

performance problems on new cluster

2011-08-11 Thread Anton Winter
Hi, I have recently been migrating to a small 12 node Cassandra cluster spanning across 4 DC's and have been encountering various issues with what I suspect to be a performance tuning issue with my data set. I've learnt a few lessons along the way but I'm at a bit of a roadblock now where I

Running cassandra on a Blades + SAN

2011-08-11 Thread David McNelis
Hey folks, I was wondering if anyone has a cassandra cluster running on a server setup using blades, with a SAN appliance as the data file storage medium. I would expect that there would be performance let-downs if the SAN was connected with anything other than a fiber channel, but are there

Re: Running cassandra on a Blades + SAN

2011-08-11 Thread Jonathan Ellis
The SAN model defeats some of the point of using Cassandra, but a hybrid of commitlog-local would be better than everything on san. http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-on-iSCSI-td5945217.html is a good thread for background. On Thu, Aug 11, 2011 at 8:56 AM,

Re: Solandra/Cassandra ring issue

2011-08-11 Thread Ashley Martens
No shared seeds. Downright freaky.

Re: Solandra/Cassandra ring issue

2011-08-11 Thread Jake Luciani
Seriously, if you change the cluster name in cassandra.yaml they won't join. On Thu, Aug 11, 2011 at 12:31 PM, Ashley Martens amart...@ngmoco.com wrote: No shared seeds. Downright freaky. -- http://twitter.com/tjake

[RELEASE] Apache Cassandra 0.8.4 released

2011-08-11 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 0.8.4. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here:

Re: Cassandra on Java SE 7?

2011-08-11 Thread Edward Capriolo
On Thu, Aug 11, 2011 at 11:32 AM, Martin Lansler martin.lans...@gmail.com wrote: Hi, Has anybody run Cassandra on Java SE 7? Any issues/caveats? I'm on a Mac so I have to wait a while :-( -Martin Now mind you I am running Cassandra 0.7, Tomcat 7.0, and Hive 0.7, so I love anything once

Re: ColumnFamilyOutputFormat problem

2011-08-11 Thread Jian Fang
53 seconds included the map phase to read and process the input file. The records were updated at the end of the reduce phase. I checked the sales ranks in the update file against the sales ranks in Cassandra; they are different, and thus the records were not actually updated. I remember I run

Re: Tuning a column family for archival

2011-08-11 Thread Jason Baker
On Thu, Aug 11, 2011 at 6:14 AM, Edward Capriolo edlinuxg...@gmail.com wrote: In many regards Cassandra automatically does the correct thing. Other than the costs of the bloom filters for the table size being in RAM, if you never read or write to those sstables and you are not reusing the row

Re: Tuning a column family for archival

2011-08-11 Thread Jonathan Ellis
No. He's saying that one of the points of mmapping the data files is that the OS is free to keep only the files that are actually used in the page cache. Since this data is backed by an actual file, swap is not involved. On Thu, Aug 11, 2011 at 12:59 PM, Jason Baker ja...@apture.com wrote: On Thu,

Re: migrating from 0.6 to 0.8, java.io.IOError: ... cannot extend file to required size

2011-08-11 Thread aaron morton
I did a test upgrade first and ran the scrub as part of that process to make sure everything was working. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 11 Aug 2011, at 21:47, Zhu Han wrote: On Wed, Aug 10, 2011 at 5:24

Re: Need help in CF design

2011-08-11 Thread aaron morton
1) Is it possible to design to get equivalent results for the above query (using CQL or Hector) with Cassandra? If this is a common query in your app it's a good idea to design the data model to support the request. It seems safe to assume the PK in your example is non-unique; I'll call it the FKID

Re: [RELEASE] Apache Cassandra 0.8.4 released

2011-08-11 Thread Ian Danforth
Would you be so kind as to announce the rpm release as well? On Thu, Aug 11, 2011 at 10:52 AM, Sylvain Lebresne sylv...@datastax.com wrote: The Cassandra team is pleased to announce the release of Apache Cassandra version 0.8.4. Cassandra is a highly scalable second-generation distributed

Re: Best practices when deploying upgrading a cassandra cluster

2011-08-11 Thread aaron morton
In a non dev system it's a lot easier to use the packages http://wiki.apache.org/cassandra/DebianPackaging http://www.datastax.com/docs/0.8/install/packaged_releases Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 12 Aug 2011, at

Re: tpstats confusion

2011-08-11 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-2889 On Thu, Aug 11, 2011 at 2:55 PM, Ian Danforth idanfo...@numenta.com wrote: I don't think so, perhaps more succinctly, why would ROW actions only be performed on a single node? Ian On Wed, Aug 10, 2011 at 8:12 PM, Jonathan Ellis

Re: performance problems on new cluster

2011-08-11 Thread aaron morton
Is there a reason you are using the trunk and not one of the tagged releases? Official releases are a lot more stable than the trunk. 1) thrift timeouts, general degraded response times: For reads or writes? What sort of queries are you running? Check the local latency on each node using

Re: tpstats confusion

2011-08-11 Thread Ian Danforth
I'm writing at QUORUM though and (pardon me for being dense) it would seem that the first replica shouldn't always be on the same server if I'm using RP. I very much appreciate your time, I'm sure there is something basic that just hasn't clicked. Ian On Thu, Aug 11, 2011 at 2:56 PM, Jonathan

RE: sstableloader throws storage_port error

2011-08-11 Thread Tom Davidson
I am trying to use sstableloader and I do not want to access Cassandra on the same node. I have edited my cassandra.yaml with appropriate values for the listen_address and rpc_address but I keep getting the error below. The cassandra-cli tool, nodetool, etc. work fine when trying to connect to

Re: [RELEASE] Apache Cassandra 0.8.4 released

2011-08-11 Thread Nate McCall
RPMs for 0.8.4 are available via http://rpm.datastax.com/ Note - the Apache Cassandra team has nothing to do with the above RPMs. We (DataStax) maintain this repository independently and try to deploy them as close to the official releases as possible. On Thu, Aug 11, 2011 at 4:14 PM, Ian

Re: performance problems on new cluster

2011-08-11 Thread Anton Winter
Is there a reason you are using the trunk and not one of the tagged releases? Official releases are a lot more stable than the trunk. Yes, as we are using a combination of EC2 and colo servers we need to use broadcast_address from CASSANDRA-2491. The patch that is associated with

Re: sstableloader throws storage_port error

2011-08-11 Thread Jonathan Ellis
"Unable to bind to address nadevsan04/10.168.121.57:7000" means something else is using that address/port. netstat can tell you what process that is, if you're not sure. On Thu, Aug 11, 2011 at 4:24 PM, Tom Davidson tdavid...@covario.com wrote: I am trying to sstableloader and I do not want to
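
As a rough complement to the netstat suggestion, the sketch below checks from Python whether an address/port pair can currently be bound. It is only an illustration: the helper name can_bind is hypothetical, it does not identify the process holding the port (netstat still does that), and the address and port in the usage note are simply the ones quoted in the error message.

```python
import socket


def can_bind(host: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to (host, port) right now.

    Hypothetical helper for illustration; a False result means the bind
    failed, typically because another process already holds the port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True
```

For example, can_bind("10.168.121.57", 7000) run on the affected node would be expected to return False while another process holds the storage port.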

Re: column metadata and sstable

2011-08-11 Thread Yi Yang
Thanks Aaron, This is the same as I thought. But I'm wondering if there are some triggers that can hash the column name in order to save disk space. Or do you think it's better to have this feature? Best, Steve On Aug 6, 2011, at 7:06 PM, aaron morton wrote: AFAIK it just makes it easier

Re: tpstats confusion

2011-08-11 Thread aaron morton
I've not checked the code but from memory when the nodes are ordered in proximity to the coordinator the local node is always first if it's in the replica set. So with RF=3 and N=3 the closest node is always the local one. Cheers - Aaron Morton Freelance Cassandra Developer
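
The ordering described above can be illustrated with a toy sketch. This is not Cassandra's snitch code, only a model of the behaviour as recalled in the message; the function name order_by_proximity and the node names are hypothetical.

```python
def order_by_proximity(coordinator, replicas):
    """Return replicas with the coordinator first when it is one of them.

    Hypothetical stand-in for a snitch's proximity sort: sorted() is stable,
    so every other replica keeps its original relative order.
    """
    return sorted(replicas, key=lambda node: node != coordinator)


# With RF=3 on a 3-node ring, every node holds a replica of every row,
# so whichever node coordinates the request sorts itself to the front.
ring = ["node1", "node2", "node3"]
for coordinator in ring:
    ordered = order_by_proximity(coordinator, ring)
    print(coordinator, "->", ordered)
```

Under this model, a client that always connects to the same node would always see that node listed as the closest replica, whatever the partitioner does with the row keys.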

Re: performance problems on new cluster

2011-08-11 Thread aaron morton
iostat doesn't show a request queue bottleneck. The timeouts we are seeing are for reads. The latency on the nodes I have temporarily used for reads is around 2-45ms. The next token in the ring at an alternate DC is showing ~4ms with everything else around 0.05ms. tpstats doesn't show

Re: Client traffic encryption best practices....

2011-08-11 Thread Vijay
https://issues.apache.org/jira/browse/THRIFT-106 seems to be the right way to go, but the Cassandra server needs to support it too, which we might want to add. Regards, /VJ On Thu, Aug 11, 2011 at 2:54 PM, Chris Marino ch...@vcider.com wrote: Hello, is there any consensus on how to secure