Re: JNA C library errors on OSX

2011-04-27 Thread John Lennard
Thanks for that. If it is a known issue I will leave it be as everything is working fine without it. Have no intention of deploying on OSX, was more that seeing the error puzzled me somewhat. Cheers John On 26/04/2011, at 2:05 AM, Jonathan Ellis wrote: Pretty sure this is b/c OS X

Expanding single node to 2 node cluster

2011-04-27 Thread maneela a
Hi, I had a 2 node cassandra cluster with replication factor 2 and OrderPreservingPartitioner but we did not provide InitialToken in the configuration files. One of the nodes was affected in the recent AWS EBS outage and had been partitioned from the cluster. However, I continued to allow all

Re: Manual Conflict Resolution in Cassandra

2011-04-27 Thread Oleg Anastasyev
David Strauss david at davidstrauss.net writes: You can actually already perform manual conflict resolution in Cassandra by naming your columns so that they don't squash each other in Cassandra's internal replication. Then, ensure the code that accesses Cassandra reads all columns with data
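A rough sketch of the column-naming idea described above, using a plain Python dict to stand in for a Cassandra row. All names here (`write_version`, `read_versions`, `resolve`, the `field:suffix` naming scheme) are hypothetical and only illustrate the pattern; nothing below is a Cassandra or client-library API.

```python
import uuid
from typing import Callable, Dict, List


def write_version(row: Dict[str, str], field: str, value: str) -> str:
    """Store the value under a unique column name so concurrent writes
    to the same logical field never overwrite each other."""
    column_name = f"{field}:{uuid.uuid4().hex}"  # assumed naming scheme
    row[column_name] = value
    return column_name


def read_versions(row: Dict[str, str], field: str) -> Dict[str, str]:
    """Return every stored version of the logical field."""
    prefix = field + ":"
    return {name: value for name, value in row.items() if name.startswith(prefix)}


def resolve(row: Dict[str, str], field: str,
            pick: Callable[[List[str]], str]) -> str:
    """Let application code choose the winner, then collapse the
    competing columns into a single new version."""
    versions = read_versions(row, field)
    winner = pick(list(versions.values()))
    for name in versions:
        del row[name]
    write_version(row, field, winner)
    return winner
```

The reader sees all conflicting values and applies whatever merge rule the application wants (passed in as `pick`), instead of relying on last-write-wins for a single column.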

Heavy writes ok for single node, but failed for cluster

2011-04-27 Thread Sheng Chen
I succeeded in inserting 1 billion records into a single node cassandra, bin/stress -d cas01 -o insert -n 10 -c 5 -S 34 -C5 -t 20 Inserts finished in about 14 hours at a speed of 20k/sec. But when I added another node, tests always failed with UnavailableException in an hour. bin/stress -d

Re: Heavy writes ok for single node, but failed for cluster

2011-04-27 Thread Sylvain Lebresne
On Wed, Apr 27, 2011 at 10:32 AM, Sheng Chen chensheng2...@gmail.com wrote: I succeeded in inserting 1 billion records into a single node cassandra, bin/stress -d cas01 -o insert -n 10 -c 5 -S 34 -C5 -t 20 Inserts finished in about 14 hours at a speed of 20k/sec. But when I added another

read failed?

2011-04-27 Thread pob
Hello, I'm experiencing this problem: with c-cli: get messagesContent['558a512f30a46f55e75e63f2f816f7435283269f92070618ba9213c0bfac730f']; Returned 33 results. within pycassa code: server_list=['SERVER:9160',], prefill=False, pool_size=15, max_overflow=10, max_retries=-1, timeout=5,

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-27 Thread Pierre-Yves Ritschard
Thanks Jonathan, Should I repackage myself or do you think updated Debian packages will be made available shortly ? Regards, - pyr On mar., 2011-04-26 at 11:47 -0500, Jonathan Ellis wrote: https://issues.apache.org/jira/browse/CASSANDRA-2549 is open to fix this

Re: encryption_options 0.8

2011-04-27 Thread Sasha Dolgy
IBM WebSphere applies a hardcoded XOR. Each character is XOR'd with the character ‘_’, and the resulting string is encoded in base64. This is not cryptography, it is just enough encoding so that a casual glance at the file will not reveal the password. I'm sure there are many different options.
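For illustration, the scheme described above (XOR with ‘_’, then base64) can be sketched in a few lines of Python. The function names are made up for this example, and working on UTF-8 bytes rather than characters is an assumption; this is not taken from WebSphere itself.

```python
import base64

XOR_KEY = ord("_")  # the fixed character named in the message above


def obfuscate(password: str) -> str:
    """XOR every byte with '_' and base64-encode the result."""
    xored = bytes(b ^ XOR_KEY for b in password.encode("utf-8"))
    return base64.b64encode(xored).decode("ascii")


def deobfuscate(encoded: str) -> str:
    """Reverse the encoding: base64-decode, then XOR with '_' again."""
    xored = base64.b64decode(encoded)
    return bytes(b ^ XOR_KEY for b in xored).decode("utf-8")
```

Because XOR with a fixed, known key is its own inverse, anyone who can read the file can recover the password, which is the point made above: it hides the value from a casual glance and nothing more.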

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
While I haven't configured it for multi-region yet, Sasha is exactly right on how Amazon's DNS works (returning private vs. public IP depending on if the machine is local to the region or not). For extra fun, now that Route53 exists you can (somewhat trivially) map and dynamically maintain all

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
It's great advice, but I'm still torn. I've never done multi-region work before, and I'd prefer to wait for 0.8 with built-in inter-node security, but I'm otherwise ready to roll (and need to roll) cassandra out sooner than that. Given how well my system held up with a total single AZ failure,

Re: advice for EC2 deployment

2011-04-27 Thread Sasha Dolgy
Hi William, The default behavior of Ec2Snitch is outlined below: http://svn.apache.org/repos/asf/cassandra/trunk/src/java/org/apache/cassandra/locator/Ec2Snitch.java // Split us-east-1a or asia-1a into us-east/1a and asia/1a. String azone = new String(b, "UTF-8"); String[]
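The split described in that code comment can be sketched as follows. This is a Python illustration of the comment's behavior only, not the Ec2Snitch source; the function name is hypothetical, and splitting on the last hyphen is an assumption that matches the two examples given.

```python
from typing import Tuple


def split_availability_zone(azone: str) -> Tuple[str, str]:
    """Split an availability zone name on its last hyphen,
    e.g. 'us-east-1a' into ('us-east', '1a')."""
    region, _, zone = azone.rpartition("-")
    if not region or not zone:
        raise ValueError(f"unexpected availability zone name: {azone!r}")
    return region, zone
```

Usage: `split_availability_zone("us-east-1a")` and `split_availability_zone("asia-1a")` give the two halves named in the comment.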

Re: advice for EC2 deployment

2011-04-27 Thread Sasha Dolgy
if you migrate the instance, does Route53 automatically re-map all the information to the new ec2 instance? another issue is that cassandra only maintains the IP of the other nodes, and not the hostname (assumed based on output of the nodetool ring) ... which means, if you migrate the instance

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
Thanks Sasha. Fortunately/unfortunately I did realize the default current behavior of the Ec2Snitch, but my application isn't multi-region capable (yet), so I need to get intra-region redundancy. And having a SingleRegionEc2Snitch that did DC=ec2zone and RACK=??? would be much better for me

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
I don't think of it as migrating an instance, it's more of a destroy/start with EC2. But, I still think it would be very useful to spin up a set of instances with known hostnames (cassandra1, 2, 3... N) and be able to quickly SSH to them by doing ssh ec2u...@cassandra1.random.ec2.mydomain.com .

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
Oh, and Route53 doesn't do anything automatically, but there is an API to manage the DNS. It's up to you to run a task on instance boot/terminate, or a cron job if you want to do this trick (for now, seems like a solid future feature of Route53). Though, I hear geographically aware Route53 is

Re: advice for EC2 deployment

2011-04-27 Thread Sasha Dolgy
so can you not simply leverage a strategy that replicates data between racks and at some point in the future when you move to multi-dc upgrade the replication strategy to maintain the current replication and add in some replication between DC's ... ? i'll go re-read your posts to see if you've

nodes reference by hostname and not IP

2011-04-27 Thread Sasha Dolgy
Hi , Silly question maybe ... but came to me in the Ec2 thread. Is there a design reason why cassandra stores nodes as IP addresses and not hostnames? -- Sasha Dolgy sasha.do...@gmail.com

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-27 Thread Stephen Connolly
Similar issue with the RPMs from riptano On 27 April 2011 11:01, Pierre-Yves Ritschard p...@milestonelab.com wrote: Thanks Jonathan, Should I repackage myself or do you think updated Debian packages will be made available shortly ? Regards, - pyr On mar., 2011-04-26 at 11:47 -0500,

Re: advice for EC2 deployment

2011-04-27 Thread William Oberman
I think you're right about changing NetworkTopologyStrategy, but the timing isn't working in my favor at this point. I wonder how bad that will really be On Wed, Apr 27, 2011 at 9:35 AM, Sasha Dolgy sdo...@gmail.com wrote: so can you not simply leverage a strategy that replicates data

suggestion: sstable2json to ignore TTL

2011-04-27 Thread Timo Nentwig
Hi! What about a simple option for sstable2json to not print out expiration TTL+LocalDeletionTime (maybe even ignore isMarkedForDelete)? I want to move old data from a live cluster (with TTL) to an archive cluster (-data does not expire there). BTW is there a smarter way to do this? Actually

Thrift client thread is locked. (TSocket is initialized with _timeout)

2011-04-27 Thread 박용욱
Hello. I have a problem with thrift client socket. Server (0.7.4) - 6 nodes cluster - reboot 1 node(EC2 instance) suddenly. Client (hector-core-0.7.0-22, libthrift-0.5) - hector's cassandraThriftSocketTimeout option is set to 3ms and *It initiated TSocket with same

Re: suggestion: sstable2json to ignore TTL

2011-04-27 Thread Edward Capriolo
On Wed, Apr 27, 2011 at 9:40 AM, Timo Nentwig timo.nent...@toptarif.de wrote: Hi! What about a simple option for sstable2json to not print out expiration TTL+LocalDeletionTime (maybe even ignore isMarkedForDelete)? I want to move old data from a live cluster (with TTL) to an archive cluster

Seeking Cassandra in production speaker volunteers! (free beer on offer)

2011-04-27 Thread Dave Gardner
Hi all, Influenced by the upcoming Redis in production meetup in London, I'm on the lookout for volunteers to speak at a Cassandra in production meetup (again, in London). You will get the satisfaction of becoming Internet famous, plus I will personally buy you a beer. Links:

Re: suggestion: sstable2json to ignore TTL

2011-04-27 Thread Timo Nentwig
On Apr 27, 2011, at 15:58, Edward Capriolo wrote: Hacking a separate copy of SSTable2json is trivial. Just look for the section of the code that writes the data and change what it writes. If I did. The method's private... you can make it a knob --nottl then it could be included in Cassandra

JDBC Driver issue in 0.8beta1

2011-04-27 Thread David McNelis
I have a feeling that I'm likely doing something dumb. I have the following code compiling without any issues: String url = null; try { Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver"); url = "jdbc:cassandra:username/password@localhost:9160/keyspace"; Connection conn

Re: suggestion: sstable2json to ignore TTL

2011-04-27 Thread Edward Capriolo
On Wed, Apr 27, 2011 at 10:16 AM, Timo Nentwig timo.nent...@toptarif.de wrote: On Apr 27, 2011, at 15:58, Edward Capriolo wrote: Hacking a separate copy of SSTable2json is trivial. Just look for the section of the code that writes the data and change what it writes. If I did. The method's

Re: suggestion: sstable2json to ignore TTL

2011-04-27 Thread Timo Nentwig
On Apr 27, 2011, at 16:52, Edward Capriolo wrote: The method being private is not a deal-breaker. While not good software engineering practice you can copy and paste the code and rename the class SSTable2MyJson or whatever. Sure I can do this but I'd like to have it just available in the

Re: suggestion: sstable2json to ignore TTL

2011-04-27 Thread Edward Capriolo
On Wed, Apr 27, 2011 at 10:59 AM, Timo Nentwig timo.nent...@toptarif.de wrote: On Apr 27, 2011, at 16:52, Edward Capriolo wrote: The method being private is not a deal-breaker. While not good software engineering practice you can copy and paste the code and rename the class SSTable2MyJson or

[RELEASE] Apache Cassandra 0.7.5 released

2011-04-27 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 0.7.5. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here:

Re: nodes reference by hostname and not IP

2011-04-27 Thread Milind Parikh
Most likely because in the wild, you can't assume a reliable DNS. Just as an aside... This question comes up often in the context of managing Cassandra clusters, especially in elastic situations. Most CMDBs assume a static name (host names/static IPs) for nodes. However this often proves to be

Re: [RELEASE] Apache Cassandra 0.8.0 beta1

2011-04-27 Thread Nate McCall
Indeed. This has been fixed and redeployed. Thanks Stephen. On Wed, Apr 27, 2011 at 8:38 AM, Stephen Connolly stephen.alan.conno...@gmail.com wrote: Similar issue with the RPMs from riptano On 27 April 2011 11:01, Pierre-Yves Ritschard p...@milestonelab.com wrote: Thanks Jonathan, Should I

Re: memtablePostFlusher blocking writes?

2011-04-27 Thread Jonathan Ellis
MPF is indeed pretty lightweight, but since its job is to mark the commitlog replay position after a flush -- which has to be done in flush order to preserve correctness in failure scenarios -- you'll see the pending op count go up when you have multiple flushes happening. This is expected. Your

Re: encryption_options 0.8

2011-04-27 Thread David Strauss
On Wed, 2011-04-27 at 12:56 +0200, Sasha Dolgy wrote: IBM WebSphere applies a hardcoded XOR. Each character is XOR'd with the character ‘_’, and the resulting string is encoded in base64. This is not cryptography, it is just enough encoding so that a casual glance at the file will not reveal the

Re: Apt repositories

2011-04-27 Thread David Strauss
On Tue, 2011-04-26 at 19:03 -0500, Eric Evans wrote: There is one for each version now (06x, 07x, and 08x). The unstable suite continues to point to latest-and-greatest. The wiki has been updated. Where, exactly, is this on the wiki? I had been using the CloudConfig page [1], which still only

Re: Apt repositories

2011-04-27 Thread Jonathan Ellis
On Wed, Apr 27, 2011 at 1:46 PM, David Strauss da...@davidstrauss.net wrote: On Tue, 2011-04-26 at 19:03 -0500, Eric Evans wrote: There is one for each version now (06x, 07x, and 08x). The unstable suite continues to point to latest-and-greatest.  The wiki has been updated. Where, exactly,

Pygmalion - a github project for pig + cassandra

2011-04-27 Thread Jeremy Hanna
Hi all, A little while back, I started a project called pygmalion for example scripts and UDFs for people using Pig with Cassandra. Currently there are a few handy UDFs in there like: FromCassandraBag: a way to convert from what Cassandra returns (key:chararray, columns:bag {column:tuple

Re: Apt repositories

2011-04-27 Thread Jeremy Hanna
Thanks Eric! On Apr 26, 2011, at 7:03 PM, Eric Evans wrote: On Sat, 2011-04-23 at 16:49 -0700, David Strauss wrote: I just noticed that, following the Cassandra 0.8 beta release, the Apt repository is encouraging servers in my clusters to upgrade. Beta releases should probably be on

Re: Pygmalion - a github project for pig + cassandra

2011-04-27 Thread Jonathan Ellis
Nice! On Wed, Apr 27, 2011 at 1:57 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: Hi all, A little while back, I started a project called pygmalion for example scripts and UDFs for people using Pig with Cassandra.  Currently there are a few handy UDFs in there like: FromCassandraBag:

0.7.4: Replication assertion error after removetoken, removetoken force and a restart

2011-04-27 Thread Alexis Lê-Quôc
Hi, I've been getting the following lately, every few seconds. 2011-04-27T20:21:18.299885+00:00 10.202.61.193 [MiscStage: 97] Error in ThreadPoolExecutor 2011-04-27T20:21:18.299885+00:00 10.202.61.193 java.lang.AssertionError 2011-04-27T20:21:18.300038+00:00 10.202.61.193 10.202.61.193 at

Re: Compacting single file forever

2011-04-27 Thread Jonathan Ellis
https://issues.apache.org/jira/browse/CASSANDRA-2575 On Thu, Apr 21, 2011 at 11:56 PM, Jonathan Ellis jbel...@gmail.com wrote: I suggest as a workaround making the forceUserDefinedCompaction method ignore disk space estimates and attempt the requested compaction even if it guesses it will not

Re: JDBC Driver issue in 0.8beta1

2011-04-27 Thread Jonathan Ellis
What's the stacktrace? On Wed, Apr 27, 2011 at 9:45 AM, David McNelis dmcne...@agentisenergy.com wrote: I have a feeling that I'm likely doing something dumb.  I have  the following code compiling without any issues: String url = null; try {     

Re: Cassandra node throws NPE on startup

2011-04-27 Thread Aaron Morton
What approach did you take to restarting the cluster? It looks like the keyspace name was changed and the log replay tried to write to the old one. Aaron On 28/04/2011, at 12:03 AM, Subscriber subscri...@zfabrik.de wrote: Hi again, some more remarks. I renamed the commitlog directory

Re: JDBC Driver issue in 0.8beta1

2011-04-27 Thread David McNelis
Attached: 21 [main] INFO org.apache.cassandra.cql.jdbc.Connection - Connected to localhost:9160 Exception in thread "main" org.apache.cassandra.cql.jdbc.DriverResolverException: Required field 'replication_factor' was not found in serialized data! Struct: KsDef(name:system,

Re: JDBC Driver issue in 0.8beta1

2011-04-27 Thread Jonathan Ellis
That looks to me like it's using the thrift definitions from the 0.7 jar, rather than the 0.8. Are you sure the old Cassandra jar is no longer on your classpath? On Wed, Apr 27, 2011 at 4:29 PM, David McNelis dmcne...@agentisenergy.com wrote: Attached: 21 [main] INFO

Re: OOM on heavy write load

2011-04-27 Thread Aaron Morton
I'm a bit confused by the two different cases you described, so cannot comment specifically on your case. In general if Cassandra is slowing down take a look at the thread pool stats, using nodetool tpstats to see where it is backing up, and take a look at the logs to check for excessive GC. If

Re: JDBC Driver issue in 0.8beta1

2011-04-27 Thread David McNelis
That was my issue. As suspected, falls into the "I must be doing something dumb" category. Thank you, Jonathan. On Wed, Apr 27, 2011 at 4:32 PM, Jonathan Ellis jbel...@gmail.com wrote: That looks to me like it's using the thrift definitions from the 0.7 jar, rather than the 0.8. Are you sure

Re: Expanding single node to 2 node cluster

2011-04-27 Thread Aaron Morton
You could try... - delete / move the system data directory - set the initial_token for each node to what they were before - restart and recreate the schema - run repair and then clean It would have been a good idea to drain the nodes, this would checkpoint the logs and clear them. If you do

Re: nodes reference by hostname and not IP

2011-04-27 Thread Aaron Morton
It stores them, but they are not as important as the token. I.e. you can shut down the node and bring it back on another IP and gossip will sort it out. Aaron On 28/04/2011, at 4:52 AM, Milind Parikh milindpar...@gmail.com wrote: Most likely because in the wild, you can't assume a reliable

Re: memtablePostFlusher blocking writes?

2011-04-27 Thread Terje Marthinussen
It is a good question what the problem is here. I don't think it is the pending mutations and flushes, the real problem is what causes them, and it is not me! There was maybe a misleading comment in my original mail. It is not the hinted handoffs sent from this node that is the problem, but the

Re: memtablePostFlusher blocking writes?

2011-04-27 Thread Jonathan Ellis
On Wed, Apr 27, 2011 at 5:23 PM, Terje Marthinussen tmarthinus...@gmail.com wrote: I have two issues here. - The massive amount of mutation caused by the hints playback I'm not sure how one node playing back hints could cause this. The intent of the code in HintedHandoffManager is to send a

Re: Apt repositories

2011-04-27 Thread David Strauss
On Tue, 2011-04-26 at 19:03 -0500, Eric Evans wrote: There was already a repo for cassandra-0.6 (called 06x), it just fell through the cracks with the last release. There is one for each version now (06x, 07x, and 08x). The unstable suite continues to point to latest-and-greatest. The wiki

Dropping a built in secondary index on a CF

2011-04-27 Thread Roshan Dawrani
Hi, Can someone please tell me how I can drop a built in secondary index on a column family attribute? I don't see any direct command to do that in the CLI help. -- Roshan Blog: http://roshandawrani.wordpress.com/ Twitter: @roshandawrani http://twitter.com/roshandawrani Skype: roshandawrani

Re: Dropping a built in secondary index on a CF

2011-04-27 Thread Xaero S
Hi, You just need to use the update column family command on the cassandra-cli and specify the columns and their metadata. To get the metadata of the columns in the CF, you can do describe keyspace keyspacename. Keep in mind that, in your update CF command, the other columns that must continue to