Re: VM dimensions for running Cassandra and Hadoop

2013-07-31 Thread Jonathan Haddad
Having just enough RAM to hold the JVM's heap generally isn't a good idea unless you're not planning on doing much with the machine. Any memory not allocated to a process will generally be put to good use serving as page cache. See here: http://en.wikipedia.org/wiki/Page_cache Jon On Tue, Jul

Re: VM dimensions for running Cassandra and Hadoop

2013-07-31 Thread Jan Algermissen
Jon, On 31.07.2013, at 08:15, Jonathan Haddad j...@jonhaddad.com wrote: Having just enough RAM to hold the JVM's heap generally isn't a good idea unless you're not planning on doing much with the machine. Yes, I agree. Two questions though: - Do you think that using a JVM heap of, for

Re: Heap stuck at 98% while restarting node

2013-07-31 Thread aaron morton
Config : 3 nodes running 1.1.3 (x2) and one node running 1.1.12. No caches, lots of column families. memtable_total_space_in_mb set to 4096MB Reduce this to 2048 so it will flush to disk more frequently and avoid growing the heap so much. From what you've said I would be looking at the size

Re: sstable size change

2013-07-31 Thread aaron morton
Can you put that in a ticket ? https://issues.apache.org/jira/browse/CASSANDRA Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 30/07/2013, at 12:52 AM, Keith Wright kwri...@nanigans.com wrote: Version 1.2.4. Original

Re: key cache hit rate and BF false positive

2013-07-31 Thread aaron morton
This looks suspicious SSTables in each level: [1, 3, 101/100, 1022/1000, 10587/1, 1750] It says there are 6 levels in the levelled DB, which may explain why the number of SSTables per read is so high. It also says some of the levels have more files than they should, check nodetool

Re: nodetool cfstats write count ?

2013-07-31 Thread aaron morton
I don't think these are exposed by any nodetool commands though, but you can use any JMX client to read them. They are sort of shown by nodetool proxyhistograms though not in aggregate. Cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton

Re: TimeoutException and keyspace exceptions

2013-07-31 Thread aaron morton
[30/07/2013 00:04:50] Read Exception (Thread: 5007; Host: 10.191.54.26): TimedOutException() It means the cluster is overloaded or suffered a failure while processing a request. A well-behaved client should retry the request so long as counters are not being used. Your request went to a
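
The retry advice above can be sketched roughly as follows. This is only an illustration, not any particular client's API: `TimedOutError`, `execute_with_retry`, and its parameters are hypothetical names, and the backoff values are arbitrary. The point is that a timed-out request is retried only when repeating it is safe, which is not the case for counter updates.

```python
import time


class TimedOutError(Exception):
    """Hypothetical stand-in for the timeout exception raised by a client library."""


def execute_with_retry(request, is_idempotent, max_attempts=3,
                       base_delay=0.1, sleep=time.sleep):
    """Call request() and retry on timeout, but only for idempotent requests.

    Counter increments are not idempotent: a timed-out increment may still
    have been applied, so retrying could count it twice. Such requests get
    is_idempotent=False and the timeout is re-raised immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TimedOutError:
            if not is_idempotent or attempt == max_attempts:
                raise
            # Exponential backoff gives an overloaded cluster time to recover.
            sleep(base_delay * 2 ** (attempt - 1))
```

A caller would wrap each read or plain write in `execute_with_retry(..., is_idempotent=True)` and pass `is_idempotent=False` for counter updates, handling the timeout at a higher level instead.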

Re: two problems about opscenter 3.2

2013-07-31 Thread aaron morton
You'll get better Ops Centre support on the DS site http://www.datastax.com/support-forums/ cheers - Aaron Morton Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 30/07/2013, at 9:23 PM, Alain RODRIGUEZ arodr...@gmail.com wrote: I also see

Re: CUSTOM index

2013-07-31 Thread Alain RODRIGUEZ
Hi, I am also interested in the use cases where custom indexes are useful. Alain 2013/7/31 baskar.duraikannu...@gmail.com Hello, Both Cassandra CLI and CQLSH have an option to specify a custom index. Can you point to an example custom index implementation, if there is one? Thanks Baskar

Re: AssertionError during ALTER TYPE in 1.2.5

2013-07-31 Thread Sergey Leschenko
Hi, On Mon, Jul 29, 2013 at 11:23 AM, aaron morton aa...@thelastpickle.com wrote: The error is because the underlying CF is not defined using a composite type for the comparator. CREATE TABLE RRD ( key text, column1 blob, value blob, PRIMARY KEY (key, column1) ) WITH COMPACT STORAGE

Re: two problems about opscenter 3.2

2013-07-31 Thread Alain RODRIGUEZ
Here is the pointer to the topic on the DS support forum. http://www.datastax.com/support-forums/topic/some-32-bugs-reported-in-the-c-user-ml 2013/7/31 aaron morton aa...@thelastpickle.com You'll get better Ops Centre support on the DS site http://www.datastax.com/support-forums/ cheers

CQL and undefined columns

2013-07-31 Thread Jon Ribbens
I thought that part of the point of Cassandra was that, unlike a standard relational database, each row does not have to have the same set of columns. I don't understand how this squares with CQL. If I want to have a table (column family?) with a few fixed columns that are relevant to every row, I

Re: CQL and undefined columns

2013-07-31 Thread Alain RODRIGUEZ
I like to point to this article from Sylvain, which is really well written. http://www.datastax.com/dev/blog/thrift-to-cql3 It explains a lot of things and is really interesting for Cassandra users pre-CQL3. Actually, old dynamic columns were defined this way : CREATE TABLE test ( key

Re: CQL and undefined columns

2013-07-31 Thread Alain RODRIGUEZ
Oops, sorry about double post. Alain 2013/7/31 Alain RODRIGUEZ arodr...@gmail.com I like to point to this article from Sylvain, which is really well written. http://www.datastax.com/dev/blog/thrift-to-cql3 It explains a lot of things and is really interesting for Cassandra users

Re: sstable size change

2013-07-31 Thread Keith Wright
Created https://issues.apache.org/jira/browse/CASSANDRA-5834 From: aaron morton aa...@thelastpickle.com Reply-To: user@cassandra.apache.org Date: Wednesday, July 31, 2013 5:11

Re: key cache hit rate and BF false positive

2013-07-31 Thread Keith Wright
Thank you for the response. Compactionstats does not indicate that we are running behind (see below). FYI - since making the change from default 5 to 256 MB I have been seeing increased GC pauses (one node got locked in GC spirals last night and had to be restarted) so I actually decreased the

dependencies for cassandra's pig integration?

2013-07-31 Thread William Oberman
I'm using AWS's EMR (Hadoop as a service), and one step copies some data from EMR to my Cassandra cluster. I used to patch EMR with Pig 0.11, but now AWS officially supports 0.11, so I thought I'd give it a try. I was having issues. The AWS forum on it is here:

Re: VM dimensions for running Cassandra and Hadoop

2013-07-31 Thread Shahab Yunus
Hi Jan, One question...you say - I must make sure the disks are directly attached, to prevent problems when multiple nodes flush the commit log at the same time What do you mean by that? Thanks, Shahab On Wed, Jul 31, 2013 at 3:10 AM, Jan Algermissen jan.algermis...@nordsc.com wrote:

Re: VM dimensions for running Cassandra and Hadoop

2013-07-31 Thread Jan Algermissen
Hi Shahab, On 31.07.2013, at 15:59, Shahab Yunus shahab.yu...@gmail.com wrote: Hi Jan, One question...you say - I must make sure the disks are directly attached, to prevent problems when multiple nodes flush the commit log at the same time I read that using Cassandra with SANs can

Hadoop - using SlicePredicate with wide rows

2013-07-31 Thread Adam Masters
Hi all, I need to limit a MapReduce job to only scan a specific range of columns. The CF being processed is a wide row, so I've set the 'widerow' property in ConfigHelper.setInputColumnFamily() to true. However, in the word_count example on github, the following comment exists: // this will

Re: CQL and undefined columns

2013-07-31 Thread Jon Ribbens
On Wed, Jul 31, 2013 at 02:21:52PM +0200, Alain RODRIGUEZ wrote: I like to point to this article from Sylvain, which is really well written. http://www.datastax.com/dev/blog/thrift-to-cql3 Ah, thank you, it looks like a combination of multi-column PRIMARY KEY and use of collections may

Re: deleting columns with CAS (2.0.0-beta2)

2013-07-31 Thread Kalpana Suraesh
Actually, it was pointed out to me that there's a comment in org.apache.cassandra.service.StorageProxy#cas() that says: // finish the paxos round w/ the desired updates // TODO turn null updates into delete? Commit proposal = Commit.newProposal(key, ballot,

Re: CQL and undefined columns

2013-07-31 Thread Edward Capriolo
You should also profile what your data looks like on disk before picking a format. It may not be as efficient to use one form or the other due to extra disk overhead. On Wed, Jul 31, 2013 at 1:32 PM, Jon Ribbens jon-cassan...@unequivocal.co.uk wrote: On Wed, Jul 31, 2013 at 02:21:52PM +0200,

Re: CQL and undefined columns

2013-07-31 Thread Jonathan Haddad
It's advised you do not use compact storage, as it's primarily for backwards compatibility. The first of these options is COMPACT STORAGE. This option is mainly targeted towards backward compatibility with some table definitions created before CQL3. But it also provides a slightly more compact

Paxos in 1.2

2013-07-31 Thread Bill Hastings
What is Paxos used for? Only CAS related operations?

Re: Paxos in 1.2

2013-07-31 Thread Robert Coli
On Wed, Jul 31, 2013 at 3:50 PM, Bill Hastings bllhasti...@gmail.com wrote: What is Paxos used for? Only CAS related operations? I believe so, yes. https://issues.apache.org/jira/browse/CASSANDRA-5062 =Rob

Bulk Loader - OutOfMemoryError: Java heap space

2013-07-31 Thread Ben Gambley
Hi All We are using Cassandra 1.2.4 and are seeing the following error loading a pretty small amount of data. Exception in thread main java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.utils.obs.OpenBitSet.init(OpenBitSet.java:76) at

Re: two problems about opscenter 3.2

2013-07-31 Thread yue . zhang
Thanks Alain. I don't know why I am not permitted to create a topic on the DataStax forum. From: Alain RODRIGUEZ [mailto:arodr...@gmail.com] Sent: July 31, 2013 18:11 To: user@cassandra.apache.org Subject: Re: two problems about opscenter 3.2 Here is the pointer to the topic on the DS support forum.