Re: Recommended hardware

2013-09-24 Thread Michał Michalski

Hi Tim,

Not sure if you've seen this, but I'd start from DataStax's documentation:

http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/architecture/architecturePlanningAbout_c.html?pagename=docsversion=1.2file=cluster_architecture/cluster_planning

Taking a look at the mailing list's archive might be useful too.

M.

On 23.09.2013 18:17, Tim Dunphy wrote:

Hello,

I am running Cassandra 2.0 on a 2 GB memory / 10 GB HD node in a virtual cloud 
environment. It's supporting a PHP application running on the same node.

Mostly this instance runs smoothly, but it runs low on memory. Depending on how 
much the site is used, the VM will swap out, sometimes excessively.

I realize this setup may not be enough to support a cassandra instance.

I was wondering if there were any recommended hardware specs someone could 
point me to for both physical and virtual (cloud) type environments.

Thank you,
Tim
Sent from my iPhone





Re: Recommended hardware

2013-09-24 Thread Jan Algermissen
Tim,

On 23.09.2013, at 18:17, Tim Dunphy bluethu...@gmail.com wrote:

 Hello,
 
 I am running Cassandra 2.0 on a 2gb memory 10 gb HD in a virtual cloud 
 environment. It's supporting a php application running on the same node.

I have played with C* (1.2 and 2.0) in a low-RAM environment over the last month. 
The major insight I gained is that it is not really possible to get C* to 
protect itself against incoming writes. Apparently it will just keep sucking in 
writes until it dies if it cannot flush the memtables fast enough.
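(Purely as an illustration of the kind of client-side throttling that can keep such a node alive - a minimal sketch using the DataStax java-driver and Guava's RateLimiter; the keyspace, table and rate below are made-up assumptions, not anything from Tim's setup:)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.RateLimiter;

public class ThrottledWriter {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("myks");        // hypothetical keyspace
        RateLimiter limiter = RateLimiter.create(200.0);  // cap at ~200 writes/s (made-up number)

        for (int i = 0; i < 100000; i++) {
            limiter.acquire();                            // block until a permit is available
            session.execute("INSERT INTO events (id, payload) VALUES (" + i + ", 'x')"); // hypothetical table
        }
        cluster.shutdown();
    }
}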
 
You could share with us the write/read behavior of your use case (how much, how 
often) and whether you use SSDs or spinning disks. But aside from the 
specifics, I'd say that

- You need at least 4 GB RAM
- If you do not have SSDs, you need two disks: one for OS and commitlog, one 
for the data
- Give C* its own 3+ nodes if you want to really investigate C* behavior (e.g. 
nodes talking to each other, replication, CAS)
- Check whether your VMs have the storage directly attached (unlikely) or 
whether they share it with other VMs (which isn't optimal)

HTH,
Jan

 
 Mostly this instance runs smoothly but runs low on memory. Depending on how 
 much the site is used, the VM will swap out sometimes excessively.
 
 I realize this setup may not be enough to support a cassandra instance.
 
 I was wondering if there were any recommended hardware specs someone could 
 point me to for both physical and virtual (cloud) type environments.
 
 Thank you,
 Tim
 Sent from my iPhone



Re: Recommended hardware

2013-09-24 Thread Franc Carter
Far from an expert opinion, however one configuration I have seen talked about
is 3 x m1.xlarge in AWS.

I have tested 4 x m1.xlarge and 4 x m1.large. The m1.xlarge was fine for
our tests (we were hitting it pretty hard), the m1.large
was erratic - from that I took away that you either need to give Cassandra
sufficient resources or know how to tune it properly (I don't)

cheers


On Tue, Sep 24, 2013 at 2:17 AM, Tim Dunphy bluethu...@gmail.com wrote:

 Hello,

 I am running Cassandra 2.0 on a 2gb memory 10 gb HD in a virtual cloud
 environment. It's supporting a php application running on the same node.

 Mostly this instance runs smoothly but runs low on memory. Depending on
 how much the site is used, the VM will swap out sometimes excessively.

 I realize this setup may not be enough to support a cassandra instance.

 I was wondering if there were any recommended hardware specs someone
 could point me to for both physical and virtual (cloud) type environments.

 Thank you,
 Tim
 Sent from my iPhone




-- 

Franc Carter | Systems architect | Sirca Ltd

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 8355 2514

Level 4, 55 Harrington St, The Rocks NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215


Re: Recommended hardware

2013-09-24 Thread Tim Dunphy
Very useful.. thank you!


Hi Tim,

 Not sure if you've seen this, but I'd start from DataStax's documentation:


 http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/architecture/architecturePlanningAbout_c.html?pagename=docsversion=1.2file=cluster_architecture/cluster_planning

 Taking a look at the mailinglist's archive might be useful too.

 M.




On Tue, Sep 24, 2013 at 2:08 AM, Michał Michalski mich...@opera.com wrote:

 Hi Tim,

 Not sure if you've seen this, but I'd start from DataStax's documentation:

 http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/architecture/architecturePlanningAbout_c.html?pagename=docsversion=1.2file=cluster_architecture/cluster_planning

 Taking a look at the mailinglist's archive might be useful too.

 M.

 On 23.09.2013 18:17, Tim Dunphy wrote:

  Hello,

 I am running Cassandra 2.0 on a 2gb memory 10 gb HD in a virtual cloud
 environment. It's supporting a php application running on the same node.

 Mostly this instance runs smoothly but runs low on memory. Depending on
 how much the site is used, the VM will swap out sometimes excessively.

 I realize this setup may not be enough to support a cassandra instance.

 I was wondering if there were any recommended hardware specs someone
 could point me to for both physical and virtual (cloud) type environments.

 Thank you,
 Tim
 Sent from my iPhone





-- 
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B


RE: cass 1.2.8 - 1.2.9

2013-09-24 Thread Christopher Wirt
Yes. Sorry. It was me being a fool. I didn't update the rackdc.properties
file on the new version.

 

From: Robert Coli [mailto:rc...@eventbrite.com] 
Sent: 24 September 2013 01:52
To: user@cassandra.apache.org
Subject: Re: cass 1.2.8 - 1.2.9

 

On Wed, Sep 11, 2013 at 7:42 AM, Christopher Wirt chris.w...@struq.com
wrote:

I tried upgrading one server in a three node DC.

[...] 

Had to rollback sharpish as this was a live system.

 

Did you ever determine what happened here? What is your RF and CL?

 

=Rob



Re: Counters way off.

2013-09-24 Thread LeoNerd
On Mon, 23 Sep 2013 21:39:50 +
Stephanie Jackson sjack...@keek.com wrote:

 How can I figure out why there's such a huge difference in results on
 one node and not on the other?

Tiny question - are you running two (or more) nodes on the same
physical machine, by using different bind IP addresses? I'm running a
cluster like that and that appears to be the cause of some
counter-related upsets.

-- 
Paul LeoNerd Evans

leon...@leonerd.org.uk
ICQ# 4135350   |  Registered Linux# 179460
http://www.leonerd.org.uk/




Re: Bad Request: Invalid null value for clustering key part

2013-09-24 Thread Sylvain Lebresne
Oh. That would be a COPY thing then. I'm not extremely familiar with the cqlsh
code (which COPY is part of), but looking at the online help for it, it
seems to have a 'NULL' option that lets you define which string is used
to represent nulls, and by default that is the empty string. So you
could try something like:
  COPY ... TO ... WITH NULL='null'
(assuming that if you do have nulls you use the string 'null' to encode them
in your csv, and that you are sure nothing that's not supposed to be null
will be represented by the string 'null').

--
Sylvain


On Tue, Sep 24, 2013 at 9:41 AM, Petter von Dolwitz (Hem) 
petter.von.dolw...@gmail.com wrote:

 Hi Sylvain,

 I was not describing the problem correctly. I'm sorry for this. This is
 the situation:

 1. I'm populating the DB with the java-driver (INSERT INTO...). Some
 fields that are part of the primary key are *empty strings*. This works fine.
 2. I'm trying to populate the DB from a CSV (COPY ... FROM) using cqlsh.
 Some fields that are part of the primary key are *empty strings*. This
 scenario gives me the Bad Request: Invalid null value for clustering key
 part {field_name} message. Seems like empty strings are treated as NULL
 when using the COPY ... FROM command?

 This can obviously be me not knowing how to encode an empty string in a
 CSV file. A simplified row from the CSV file can look like below:

 field1_value,,,field4_value,field5_value

 where field1 through field4 are part of the primary key.

 Thanks for your time,
 Petter




 2013/9/23 Sylvain Lebresne sylv...@datastax.com


 Is it not permitted to have null values in a field that is part of a primary
 key?


 It's not.


 This seems to be ok when using the java-driver


 Are you sure? Because that would qualify as a bug (in the sense that it's
 not supported by C*, so there is no reason why this would work with any
 driver). If you have some java-driver code that shows it is possible, I'd be
 happy to have a look.

 --
 Sylvain





Unable to export data from Cassandra to MySql

2013-09-24 Thread Santosh Shet
Hi,
We are facing a problem while exporting data from Cassandra to a MySQL database. 
For example,

1. I have created a keyspace called test_ks and a column family called test_cf in 
Cassandra using CQL.
I have inserted some dummy data into test_cf and the corresponding files are 
created in the folder /var/lib/cassandra/data/test_ks/test_cf as shown below.
[root@balnsand01 test_cf]# pwd

/var/lib/cassandra/data/test_ks/test_cf

[root@balnsand01 test_cf]# ls -lrt

total 24

-rw-r--r-- 1 root root 4340 Sep 19 03:39 test_ks-test_cf-hf-1-Statistics.db

-rw-r--r-- 1 root root 22 Sep 19 03:39 test_ks-test_cf-hf-1-Index.db

-rw-r--r-- 1 root root 16 Sep 19 03:39 test_ks-test_cf-hf-1-Filter.db

-rw-r--r-- 1 root root 89 Sep 19 03:39 test_ks-test_cf-hf-1-Data.db

-rw-r--r-- 1 root root 46 Sep 19 03:39 test_ks-test_cf-hf-1-CompressionInfo.db
2. When I try to export this data into MySQL using the command below, I see the 
data is getting exported in a non-readable format, which is not correct. Below is 
the command I am using to export:
./dse sqoop export --connect jdbc:mysql://127.0.0.1:3306/testdb --username 
testuser --password mysql123 --export-dir 
/var/lib/cassandra/data/test_ks/test_cf --table table_name --columns 
'col1,col2' --input-fields-terminated-by '\t';
where --export-dir /var/lib/cassandra/data/test_ks/test_cf is the path where the 
data files get created.
Could you please guide me on what exactly I am doing wrong here?
Thanks in advance,
Santosh


Santosh Shet
Software Engineer | VistaOne Solutions
Direct India : +91 80 30273829 | Mobile India : +91 8105720582
Skype : santushet



Re: Frequent Full GC that take 30s

2013-09-24 Thread André Cruz
On Sep 24, 2013, at 5:05 AM, 谢良 xieli...@xiaomi.com wrote:

 it looks to me like MaxTenuringThreshold is too small; do you have any 
 chance to try with a bigger one, like 4 or 8 or something else?

MaxTenuringThreshold=1 seems a bit odd, yes. But it is the Cassandra default, 
maybe there is a reason for this? Perhaps under normal usage all objects that 
survive 1 tenure should be moved to the old generation?

Anyway, I will also try higher values.

Thanks,
André





Re: Frequent Full GC that take 30s

2013-09-24 Thread André Cruz
On Sep 24, 2013, at 5:18 AM, Mohit Anchlia mohitanch...@gmail.com wrote:

 Your ParNew size is way too small. Generally 4GB ParNew (-Xmn) works out best 
 for 16GB heap

I was afraid that a 4GB ParNew would cause Young GCs to take too long. I'm 
going to test higher ParNew values.

Thanks,
André




is this correct, thrift unportable to CQL3…

2013-09-24 Thread Hiller, Dean
Many applications in Thrift use the wide row with composite column names. As an 
example, let's say golf score, and we end up with golf score : pk like so

null : pk56
null : pk45
89 : pk90
89: pk87
90: pk101
95: pk17

Notice that there are some who do not have a golf score (zero would not quite 
make sense and would be interpreted as a golf score). I am hearing from this 
post, if they are correct, that this is not portable to CQL3??? Is this true?
http://stackoverflow.com/questions/18963248/how-can-i-have-null-column-value-for-a-composite-key-column-in-cql3

(This sounds like a major deficiency to me, as the wide row can now only be used 
where actual values exist.) Is it possible to port this pattern to CQL3?

Thanks,
Dean




Migration LCS from 1.2.X to 2.0.x exception

2013-09-24 Thread Desimpel, Ignace
Tested on WINDOWS: On startup of the 2.0.0 version from 1.2.x files I get an 
error as listed below.

This is due to the code in LeveledManifest::mutateLevel. The method already 
has a comment saying that it is scary ...
On Windows, one cannot use File::rename if the target file name already 
exists.
Also, even on Linux, I'm not sure if a rename would actually 
'overwrite/implicit-delete' the content of the target file.

Anyway, adding code (below) before the FileUtils.renameWithConfirm should work 
in both cases (maybe even rename the fromFile to be able to recover...)
File oTo = new File(filename);
if ( oTo.exists() ) oTo.delete();
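
(For readers following along: the two lines above amount to a delete-then-rename. A self-contained sketch of that idea, purely illustrative and not the actual Cassandra patch, could look like this:)

import java.io.File;

// Illustrative helper only: on Windows, File.renameTo() fails when the target
// already exists, so remove the target before renaming.
final class RenameHelper {
    static void renameReplacing(File from, File to) {
        if (to.exists() && !to.delete())
            throw new RuntimeException("Could not delete existing target " + to);
        if (!from.renameTo(to))
            throw new RuntimeException("Failed to rename " + from + " to " + to);
    }
}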


java.lang.RuntimeException: Failed to rename 
.xxx\Function-ic-10-Statistics.db-tmp to 
.xxx\Function-ic-10-Statistics.db
at 
org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:136) 
~[main/:na]
at 
org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:125) 
~[main/:na]
at 
org.apache.cassandra.db.compaction.LeveledManifest.mutateLevel(LeveledManifest.java:601)
 ~[main/:na]
at 
org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:103)
 ~[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247) 
~[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443) 
~[main/:na]

Regards,

Ignace Desimpel


Re: Migration LCS from 1.2.X to 2.0.x exception

2013-09-24 Thread Nate McCall
What version of 1.2.x?

Unfortunately, you must go through 1.2.9 first. See
https://github.com/apache/cassandra/blob/cassandra-2.0.0/NEWS.txt#L19-L24


On Tue, Sep 24, 2013 at 8:57 AM, Desimpel, Ignace 
ignace.desim...@nuance.com wrote:

  Tested on WINDOWS : On startup of the 2.0.0 version from 1.2.x files I
 get an error as listed below.


 This is due to the code in LeveledManifest:: mutateLevel. The method
 already has a comment saying that it is scary …

 On windows, one cannot use the File::rename if the target file name
 already exists. 

 Also, even on Linux, I’m not sure if a rename would actually
 ‘overwrite/implicit-delete’ the content of the target file.


 Anyway, adding code (below) before the FileUtils.renameWithConfirm should
 work in both cases (maybe even rename the fromFile to be able to recover…)
 

 File oTo = new File(filename);

 if ( oTo.exists() ) oTo.delete();


 java.lang.RuntimeException: Failed to rename
 …..xxx\Function-ic-10-Statistics.db-tmp to
 …..xxx\Function-ic-10-Statistics.db

 at
 org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:136)
 ~[main/:na]

 at
 org.apache.cassandra.io.util.FileUtils.renameWithConfirm(FileUtils.java:125)
 ~[main/:na]

 at
 org.apache.cassandra.db.compaction.LeveledManifest.mutateLevel(LeveledManifest.java:601)
 ~[main/:na]

 at
 org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(LegacyLeveledManifest.java:103)
 ~[main/:na]

 at
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:247)
 ~[main/:na]

 at
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:443)
 ~[main/:na]


 Regards,


 Ignace Desimpel



Re: is this correct, thrift unportable to CQL3…

2013-09-24 Thread Vikas Goyal
Thanks Sylvain,

However, we tried omitting the value but it didn't work :(

So our code is like below, where we bind 3 values if colname is not
null, else 2 values:

if (key != null) {
    PreparedStatement statement = session.prepare(
            "INSERT INTO keys.StringIndice (id, colname, colvalue) VALUES (?, ?, ?)");
    BoundStatement boundStatement = new BoundStatement(statement);
    session.execute(boundStatement.bind(
            StandardConverters.convertFromBytes(String.class, rowKey),
            key, ByteBuffer.wrap(value)));
} else {
    PreparedStatement statement = session.prepare(
            "INSERT INTO " + keys + "." + table + " (id, colvalue) VALUES (?, ?)");
    BoundStatement boundStatement = new BoundStatement(statement);
    session.execute(boundStatement.bind(
            StandardConverters.convertFromBytes(String.class, rowKey),
            ByteBuffer.wrap(value)));
}

And when I did that, I got this exception:

Exception:Missing PRIMARY KEY part colname since colvalue is set

And just FYI. Our Column Family definition is below:

CREATE TABLE keys.StringIndice (
    id text,
    colname text,
    colvalue blob,
    PRIMARY KEY (id, colname, colvalue)
) WITH COMPACT STORAGE;

Thanks again,
Vikas Goyal



On Tue, Sep 24, 2013 at 7:02 PM, Sylvain Lebresne sylv...@datastax.comwrote:

 Short answer: no, this is not correct.

 Longer answer: what you call null is actually an empty value (which is
 *not* the same thing, unless you consider an empty string to be the same thing
 as a null string). As it happens, C* always accepts an empty value as a valid
 value for any type, and that's true of both Thrift and CQL3. What is true is
 that CQL3 discourages the use of empty values for types for which it doesn't
 particularly make sense (integers typically) by not having a particularly
 easy-to-use syntax to input them. But that's supported nonetheless. If you
 use a prepared statement for instance (where you send values already
 serialized), nothing will prevent you from sending an empty value. Even if
 you don't want to use a prepared statement, CQL3 has conversion functions (
 http://cassandra.apache.org/doc/cql3/CQL.html#blobFun) that allow you to do
 it (for instance, blobAsInt(0x) will be an empty int value).

 --
 Sylvain



 On Tue, Sep 24, 2013 at 2:36 PM, Hiller, Dean dean.hil...@nrel.govwrote:

 Many applications in thrift use the wide row with composite column name
 and as an example, let's say golf score for instance and we end up with
 golf score : pk like so

 null : pk56
 null : pk45
 89 : pk90
 89: pk87
 90: pk101
 95: pk17

 Notice that there are some who do not have a golf score(zero would not
 quite make sense and would be interpreted as a golf score).  I am hearing
 from this post if they are correct that this is not portable to CQL3???  Is
 this true?
 http://stackoverflow.com/questions/18963248/how-can-i-have-null-column-value-for-a-composite-key-column-in-cql3

 (This sounds like a major deficit to me as the wide row now can only be
 used where actual values exist?).  Is it possible to port this pattern
 to CQL3?

 Thanks,
 Dean






Re: is this correct, thrift unportable to CQL3…

2013-09-24 Thread Sylvain Lebresne
 However, we tried omitting the value but it didn't work :(


Right, because not providing a value is akin to having a null value (in the
CQL3 sense of the term, which is different from what Dean asked about), and
null values are not allowed for primary key columns.
You could, however, insert an *empty* value if you wanted, which in your case
is just inserting an empty string, since colname is a string. Thrift doesn't
allow any more or less in that case.
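
(Purely as an illustration of the above, here is what an empty-string bind can look like with the java-driver; the table and column names are borrowed from Vikas' earlier mail and are assumptions, not a verified schema:)

import java.nio.ByteBuffer;
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

class EmptyClusteringValueExample {
    // Bind an *empty string* (not null) for the clustering column 'colname'.
    // Null is rejected for primary key parts, but an empty value is accepted.
    static void insertWithEmptyColname(Session session, String rowKey, byte[] value) {
        PreparedStatement ps = session.prepare(
                "INSERT INTO keys.StringIndice (id, colname, colvalue) VALUES (?, ?, ?)");
        BoundStatement bs = new BoundStatement(ps);
        session.execute(bs.bind(rowKey, "", ByteBuffer.wrap(value)));
    }
}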

--
Sylvain




 So our code is like below where we are using 3 values if colname is not
 null..else 2 values..

 if (key != null) {
 PreparedStatement statement = session.prepare(INSERT INTO
 keys.StringIndice (id, colname, colvalue) VALUES (?, ?, ?));
 BoundStatement boundStatement = new
 BoundStatement(statement);

 session.execute(boundStatement.bind(StandardConverters.convertFromBytes(String.class,
 rowKey), key, ByteBuffer.wrap(value)));
 } else {
 PreparedStatement statement = session.prepare(INSERT INTO
  + keys + . + table + (id, colvalue) VALUES (?, ?));
 BoundStatement boundStatement = new
 BoundStatement(statement);

 session.execute(boundStatement.bind(StandardConverters.convertFromBytes(String.class,
 rowKey), ByteBuffer.wrap(value)));
   }

 And, I did that and getting this exception:

 Exception:Missing PRIMARY KEY part colname since colvalue is set

 And just FYI. Our Column Family definition is below:

 CREATE TABLE keys.StringIndice (id text,
 colname text,
  colvalue blob,
 PRIMARY KEY (id,colname, colvalue)) WITH COMPACT STORAGE)

 Thanks again,
 Vikas Goyal



 On Tue, Sep 24, 2013 at 7:02 PM, Sylvain Lebresne sylv...@datastax.comwrote:

 Short answer: not, this is not correct.

 Longer answer: what you call null is actually an empty value (which is
 *not* the same thing, unless you consider an empty string is the same thing
 than a null string). As it happens, C* always an empty value as a valid
 value for any type and that's true of both thrift and CQL3. What is true is
 that CQL3 discourage the use of empty values for type for which it doesn't
 particularly make sense (integers typically) by not having a particular
 easy to use syntax to input them. But that's supported nonetheless. If you
 use a prepared statement for instance (where you send values already
 serialized), nothing will prevent you from sending an empty value. Even if
 you don't want to use a prepared statement, CQL3 has conversion functions (
 http://cassandra.apache.org/doc/cql3/CQL.html#blobFun) that allows to do
 it (for instance, blobAsInt(0x) will be an empty int value).

 --
 Sylvain



 On Tue, Sep 24, 2013 at 2:36 PM, Hiller, Dean dean.hil...@nrel.govwrote:

 Many applications in thrift use the wide row with composite column name
 and as an example, let's say golf score for instance and we end up with
 golf score : pk like so

 null : pk56
 null : pk45
 89 : pk90
 89: pk87
 90: pk101
 95: pk17

 Notice that there are some who do not have a golf score(zero would not
 quite make sense and would be interpreted as a golf score).  I am hearing
 from this post if they are correct that this is not portable to CQL3???  Is
 this true?
 http://stackoverflow.com/questions/18963248/how-can-i-have-null-column-value-for-a-composite-key-column-in-cql3

 (This sounds like a major deficit to me as the wide row now can only be
 used where actual values exist?).  Is it possible to port this pattern
 to CQL3?

 Thanks,
 Dean







Re: Memtable flush blocking writes

2013-09-24 Thread Ken Hancock
This is on Cassandra 1.2.9, though packaged into DSE, which I suspect may
come into play here. I didn't really get to the bottom of it other than to
up the queue to 32, which is about the number of CFs I have. After that,
mutation drops disappeared and the FlushWriter blocks went away.
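
(In case it helps anyone watching for this: the FlushWriter numbers Ken mentions can also be polled over JMX, not just via nodetool tpstats. A rough sketch follows; the MBean and attribute names are what 1.2-era internal thread pools are expected to expose, but treat them as assumptions and verify in a JMX console:)

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class FlushWriterCheck {
    public static void main(String[] args) throws Exception {
        // Default Cassandra JMX port is 7199 (assumes no JMX auth is configured).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();

        // Assumed MBean name for the FlushWriter pool on 1.2.x; verify with a JMX console.
        ObjectName flushWriter =
                new ObjectName("org.apache.cassandra.internal:type=FlushWriter");
        System.out.println("PendingTasks: " + mbs.getAttribute(flushWriter, "PendingTasks"));
        System.out.println("CurrentlyBlockedTasks: "
                + mbs.getAttribute(flushWriter, "CurrentlyBlockedTasks"));

        connector.close();
    }
}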


On Mon, Sep 23, 2013 at 6:03 PM, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Aug 23, 2013 at 10:35 AM, Ken Hancock ken.hanc...@schange.comwrote:

 I appear to have a problem illustrated by
 https://issues.apache.org/jira/browse/CASSANDRA-1955. At low data
 rates, I'm seeing mutation messages dropped because writers are
 blocked as I get a storm of memtables being flushed. OpsCenter
 memtables seem to also contribute to this:

 ...

 Now I can increase memtable_flush_queue_size, but it seems based on
 the above that in order to solve the problem, I need to set this to
 count(CF). What's the downside of this approach? It seems a backwards
 solution to the real problem...


 What version of Cassandra? Did you ever get to the bottom of this?

 =Rob




-- 
Ken Hancock | System Architect, Advanced Advertising
SeaChange International
50 Nagog Park
Acton, Massachusetts 01720
ken.hanc...@schange.com | www.schange.com |
NASDAQ:SEAC (http://www.schange.com/en-US/Company/InvestorRelations.aspx)

Office: +1 (978) 889-3329 | Google Talk: ken.hanc...@schange.com |
Skype: hancockks | Yahoo IM: hancockks
LinkedIn: http://www.linkedin.com/in/kenhancock

http://www.schange.com/ | This e-mail and any attachments may contain
information which is SeaChange International confidential. The information
enclosed is intended only for the addressees herein and may not be copied
or forwarded without permission from SeaChange International.


Re: is this correct, thrift unportable to CQL3…

2013-09-24 Thread Vikas Goyal
Ok. Great. It works for String and Decimal/Float but not for the integer data
types, i.e., if I am passing "" to the composite key column when it is either text
or float, it works:

session.execute(boundStatement.bind(rowkey, "", ByteBuffer.wrap(value)));

But it is not working with bigint, int or varint, and I am getting the following exception:

Exception:Invalid type for value 1 of CQL type varint, expecting class
java.math.BigInteger but class java.lang.String provided

... I am exploring more, though.
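
(If the empty-value route is what you are after for the numeric types, one option that follows from Sylvain's earlier mail is to pass the empty bytes through a blob conversion function instead of binding a String. A rough, unverified sketch, assuming a hypothetical table with a varint clustering column and the varint flavour of the conversion functions he mentioned:)

import java.nio.ByteBuffer;
import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

class EmptyVarintInsert {
    // The driver's type check rejects a String for a varint column, so let the
    // server build the empty varint from an empty blob literal (0x) instead.
    static void insert(Session session, String rowKey, byte[] value) {
        PreparedStatement ps = session.prepare(
                "INSERT INTO keys.IntIndice (id, colname, colvalue) "   // hypothetical table
                + "VALUES (?, blobAsVarint(0x), ?)");
        BoundStatement bs = new BoundStatement(ps);
        session.execute(bs.bind(rowKey, ByteBuffer.wrap(value)));
    }
}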

Thanks a ton,
Vikas Goyal


On Tue, Sep 24, 2013 at 9:05 PM, Sylvain Lebresne sylv...@datastax.comwrote:


 However,we tried missing the value but it didn't work :(


 Right, because not providing a value is akin to having a null value (in
 the CQL3 sense of the term, which is different from what Dean asked about)
 and null values are not allowed for primary key columns.
 You could however insert an *empty* value if you wanted, which in you case
 is just inserting an empty string since colname is a string. Thrift doesn't
 allow more or less in that case.

 --
 Sylvain




 So our code is like below where we are using 3 values if colname is not
 null..else 2 values..

 if (key != null) {
 PreparedStatement statement = session.prepare(INSERT
 INTO keys.StringIndice (id, colname, colvalue) VALUES (?, ?, ?));
 BoundStatement boundStatement = new
 BoundStatement(statement);

 session.execute(boundStatement.bind(StandardConverters.convertFromBytes(String.class,
 rowKey), key, ByteBuffer.wrap(value)));
 } else {
 PreparedStatement statement = session.prepare(INSERT
 INTO  + keys + . + table + (id, colvalue) VALUES (?, ?));
 BoundStatement boundStatement = new
 BoundStatement(statement);

 session.execute(boundStatement.bind(StandardConverters.convertFromBytes(String.class,
 rowKey), ByteBuffer.wrap(value)));
   }

 And, I did that and getting this exception:

 Exception:Missing PRIMARY KEY part colname since colvalue is set

 And just FYI. Our Column Family definition is below:

 CREATE TABLE keys.StringIndice (id text,
 colname text,
  colvalue blob,
 PRIMARY KEY (id,colname, colvalue)) WITH COMPACT STORAGE)

 Thanks again,
 Vikas Goyal



 On Tue, Sep 24, 2013 at 7:02 PM, Sylvain Lebresne 
 sylv...@datastax.comwrote:

 Short answer: not, this is not correct.

 Longer answer: what you call null is actually an empty value (which is
 *not* the same thing, unless you consider an empty string is the same thing
 than a null string). As it happens, C* always an empty value as a valid
 value for any type and that's true of both thrift and CQL3. What is true is
 that CQL3 discourage the use of empty values for type for which it doesn't
 particularly make sense (integers typically) by not having a particular
 easy to use syntax to input them. But that's supported nonetheless. If you
 use a prepared statement for instance (where you send values already
 serialized), nothing will prevent you from sending an empty value. Even if
 you don't want to use a prepared statement, CQL3 has conversion functions (
 http://cassandra.apache.org/doc/cql3/CQL.html#blobFun) that allows to
 do it (for instance, blobAsInt(0x) will be an empty int value).

 --
 Sylvain



 On Tue, Sep 24, 2013 at 2:36 PM, Hiller, Dean dean.hil...@nrel.govwrote:

 Many applications in thrift use the wide row with composite column name
 and as an example, let's say golf score for instance and we end up with
 golf score : pk like so

 null : pk56
 null : pk45
 89 : pk90
 89: pk87
 90: pk101
 95: pk17

 Notice that there are some who do not have a golf score(zero would not
 quite make sense and would be interpreted as a golf score).  I am hearing
 from this post if they are correct that this is not portable to CQL3???  Is
 this true?
 http://stackoverflow.com/questions/18963248/how-can-i-have-null-column-value-for-a-composite-key-column-in-cql3

 (This sounds like a major deficit to me as the wide row now can only be
 used where actual values exist?).  Is it possible to port this pattern
 to CQL3?

 Thanks,
 Dean








Re: C* 2.0 reduce_cache_sizes_at ?

2013-09-24 Thread Robert Coli
On Sun, Sep 8, 2013 at 4:00 AM, Andrew Cobley a.e.cob...@dundee.ac.ukwrote:

 reduce_cache_sizes_at: 0
 reduce_cache_capacity_to: 0

 ...

 I'm assuming the blog must be talking about C* prior to version 2.0
 because these settings do not appear in 2.0's  .yaml file.

 ...

 Why where they removed and what's the alternative ?


https://issues.apache.org/jira/browse/CASSANDRA-3534

Seems like when these thresholds are reached we are trying to reduce the
keycache and Memtable sizes, but in the trunk memtable is moved off-heap
hence reducing that will not actually help.


https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commit;h=e79d9fbf84a35021cafa21d428e08fdd9bee584e

They were removed because changes in the nature of the cache made them
mostly useless. There is therefore no alternative.

=Rob


Query about class org.apache.cassandra.io.sstable.SSTableSimpleWriter

2013-09-24 Thread Jayadev Jayaraman
Let's say I've initialized an SSTableSimpleWriter instance and a new
column with a TTL set:

SSTableSimpleWriter writer = new SSTableSimpleWriter( ... /* params here */);
Column column;

What is the difference between calling writer.addColumn() on the column's
name and value, and writer.addExpiringColumn() on the column and its TTL?
Does the former still result in the column expiring, in Cassandra 1.2.x?
Or does it not?
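
(For what it's worth, a rough sketch of the distinction as I understand the 1.2-era API; the addExpiringColumn signature shown is an assumption to verify against AbstractSSTableSimpleWriter in your source tree, and the writer construction is left out as above:)

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import org.apache.cassandra.io.sstable.SSTableSimpleWriter;

class TtlColumnSketch {
    // 'writer' is assumed to be the SSTableSimpleWriter already constructed above.
    static void writeRow(SSTableSimpleWriter writer) throws Exception {
        long timestampMicros = System.currentTimeMillis() * 1000;
        writer.newRow(ByteBuffer.wrap("rowkey".getBytes(StandardCharsets.UTF_8)));

        // addColumn() emits a regular column: it carries no TTL and never expires,
        // regardless of any TTL you intended for it.
        writer.addColumn(ByteBuffer.wrap("col1".getBytes(StandardCharsets.UTF_8)),
                         ByteBuffer.wrap("val1".getBytes(StandardCharsets.UTF_8)),
                         timestampMicros);

        // addExpiringColumn() emits an expiring column carrying the TTL, so it is
        // the one to use if the column should actually expire.
        int ttlSeconds = 3600;
        long expiresAtMillis = System.currentTimeMillis() + ttlSeconds * 1000L;
        writer.addExpiringColumn(ByteBuffer.wrap("col2".getBytes(StandardCharsets.UTF_8)),
                                 ByteBuffer.wrap("val2".getBytes(StandardCharsets.UTF_8)),
                                 timestampMicros, ttlSeconds, expiresAtMillis);
    }
}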


Best version to upgrade from 1.1.10 to 1.2.X

2013-09-24 Thread Paulo Motta
Hello folks,

What is the best version to upgrade from C* 1.1.10 to 1.2.X? Any
suggestions?

Thanks,

Paulo


Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-09-24 Thread Robert Coli
On Tue, Sep 24, 2013 at 1:17 PM, Paulo Motta pauloricard...@gmail.comwrote:

 What is the best version to upgrade from C* 1.1.10 to 1.2.X? Any
 suggestions?


Not sure what you're asking, but go to at-least-1.2.9. Current is 1.2.10,
so use that.

=Rob


Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-09-24 Thread Robert Coli
On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta pauloricard...@gmail.comwrote:

 Doesn't the probability of something going wrong increase as the gap
 between the versions increases? So, using this reasoning, upgrading from
 1.1.10 to 1.2.6 would have less chance of something going wrong than from
 1.1.10 to 1.2.9 or 1.2.10.


Sorta, but sorta not.

https://github.com/apache/cassandra/blob/trunk/NEWS.txt

Is the canonical source of concerns on upgrade. There are a few cases where
upgrading to the root of X.Y.Z creates issues that do not exist if you
upgrade to the head of that line. AFAIK there have been no cases where
upgrading to the head of a line (where that line is mature, like 1.2.10)
has created problems which would have been avoided by upgrading to the
root first.


 I'm hoping this reasoning is wrong and I can update directly from 1.1.10
 to 1.2.10. :-)


That's what I plan to do when we move to 1.2.X, FWIW.

=Rob


Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-09-24 Thread Paulo Motta
Cool, sounds fair enough. Thanks for the help, Rob!

If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share
any tips on issues you've encountered that are not yet documented.

Cheers,

Paulo


2013/9/24 Robert Coli rc...@eventbrite.com

 On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta pauloricard...@gmail.comwrote:

 Doesn't the probability of something going wrong increases as the gap
 between the versions increase? So, using this reasoning, upgrading from
 1.1.10 to 1.2.6 would have less chance of something going wrong then from
 1.1.10 to 1.2.9 or 1.2.10.


 Sorta, but sorta not.

 https://github.com/apache/cassandra/blob/trunk/NEWS.txt

 Is the canonical source of concerns on upgrade. There are a few cases
 where upgrading to the root of X.Y.Z creates issues that do not exist if
 you upgrade to the head of that line. AFAIK there have been no cases
 where upgrading to the head of a line (where that line is mature, like
 1.2.10) has created problems which would have been avoided by upgrading to
 the root first.


 I'm hoping this reasoning is wrong and I can update directly from 1.1.10
 to 1.2.10. :-)


 That's what I plan to do when we move to 1.2.X, FWIW.

 =Rob




-- 
Paulo Ricardo

-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com


Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-09-24 Thread Robert Coli
On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta pauloricard...@gmail.comwrote:

 If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share
 any tips on issues you're encountered that are not yet documented.


Exceptions like the below relate to the change in hinted handoff format and
can be safely ignored as long as you follow the instruction from NEWS.txt
to repair your cluster after upgrade :


ERROR [MutationStage:54] 2013-08-15 14:45:02,248 CassandraDaemon.java (line
192) Exception in thread Thread[MutationStage:54,5,main]
java.lang.AssertionError: Missing host ID for 10.93.12.12
at
org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:583)
at
org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:552)
at
org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1658)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:724)


=Rob


Re: Cassandra 1.2.9 cluster with vnodes is heavily unbalanced.

2013-09-24 Thread Suruchi Deodhar
As an update to this thread, we conducted several tests with
Cassandra-1.2.9, varying parameters such as partitioner
(Murmur3Partitioner/RandomPartitioner), using NetworkTopologyStrategy (with
Ec2Snitch) / SimpleStrategy (with SimpleSnitch) across 2 Availability Zones
and 1 AZ. We also tested the configurations separately with vnodes and
without vnodes.

Every time before each test, we wiped the cassandra cluster data and
commitlog folders and restarted with an empty cassandra db. However, in all
the cases using 1.2.9 we continued to see very heavy imbalance across the
nodes as reported in this thread.

We then tested the same exports with cassandra 1.2.5 version that we had
been testing previously (without vnodes across 2 AZs) and the data was
balanced across the nodes of the cluster. The output from bin/nodetool
status is attached.

Was there some change from 1.2.5 to 1.2.9 that could be responsible for the
imbalance or is there some parameter setting that we may have completely
missed in our configuration wrt 1.2.9? Has anyone else experienced such an
imbalance issue?

Also, we were contemplating using vnodes with NetworkTopologyStrategy
(we want to replicate data across 2 AZs).
We came across the following links that mention that vnodes with
NetworkTopologyStrategy may create hotspots, and the issue is marked as Open.
Does that mean using vnodes with NetworkTopologyStrategy is a bad idea?

[ https://issues.apache.org/jira/browse/CASSANDRA-4658 ,
https://issues.apache.org/jira/browse/CASSANDRA-3810 ,
https://issues.apache.org/jira/browse/CASSANDRA-4123 ] .

Thanks again for all your replies.

Suruchi





On Fri, Sep 20, 2013 at 7:04 PM, Robert Coli rc...@eventbrite.com wrote:

 On Fri, Sep 20, 2013 at 3:42 PM, Suruchi Deodhar 
 suruchi.deod...@generalsentiment.com wrote:

 Using the nodes in the same availability zone(us-east-1b), we still get a
 highly imbalanced cluster. The nodetool status and ring output is attached.
 Even after running repairs, the cluster does not seem to balance.


 If your cluster doesn't experience exceptions when loading and/or store a
 lot of hints, repair is almost certainly just wasting your and your CPU's
 time.

 =Rob

Datacenter: us-east
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns   Host ID                               Token                 Rack
UN  10.151.86.146   362.52 MB  4.2%   bcc08da4-58d8-4777-a838-253a997874db  -9223372036854775808  1a
UN  10.87.83.107    359.71 MB  4.2%   7af64cbc-c225-49a4-8750-8c4229042b91  -8454757700450211158  1b
UN  10.238.137.250  360.64 MB  4.2%   dad2d2fb-784f-48c8-bca8-890bfa5181a2  -7686143364045646508  1a
UN  10.120.249.140  360.16 MB  4.2%   28826628-110b-4b46-8dec-8ad9b99a7459  -6917529027641081858  1b
UN  10.137.7.90     360.23 MB  4.2%   c3379759-7c27-4ce7-87d1-eb63d69f6a30  -6148914691236517208  1a
UN  10.87.90.42     365.93 MB  4.2%   cc23d1d5-e172-4d76-983b-0f4a7c410873  -5380300354831952558  1b
UN  10.238.133.174  364.67 MB  4.2%   7950da0a-84f3-49c0-ad4f-ad4860192320  -4611686018427387908  1a
UN  10.93.31.44     363.22 MB  4.2%   17016473-18c4-47be-942e-250216a6c0c4  -3843071682022823258  1b
UN  10.238.170.159  361.34 MB  4.2%   9c959ba6-888d-434f-8d60-3f8b5e87f933  -3074457345618258608  1a
UN  10.93.91.139    361.39 MB  4.2%   d872c494-a9e5-452b-8789-8c30a27854f6  -2305843009213693958  1b
UN  10.137.20.183   363.02 MB  4.2%   ce03bda4-587e-455c-8195-40115876793d  -1537228672809129308  1a
UN  10.87.75.147    362 MB     4.2%   ef65f745-81cb-444a-8af9-53c860cfcca3  -768614336404564658   1b
UN  10.136.11.40    365.43 MB  4.2%   4d2cb28f-43d9-4e74-90d1-274344405ee8  -8                    1a
UN  10.123.95.248   365.24 MB  4.2%   f2d611d9-b275-4b79-9567-dbe44b0dc158  768614336404564642    1b
UN  10.151.49.88    363.49 MB  4.2%   f1a02203-3064-4a63-9ace-dcc3dd9a3928  1537228672809129292   1a
UN  10.93.77.166    364.3 MB   4.2%   acd22aa4-c0d2-42b5-85bd-023877f2787a  2305843009213693942   1b
UN  10.138.2.20     362.56 MB  4.2%   58abf545-bb4e-4a5c-81d1-7af0b2726704  3074457345618258592   1a
UN  10.90.246.128   362.63 MB  4.2%   f654845c-0524-4279-abd9-e83b3458a2c8  3843071682022823242   1b
UN  10.138.10.9     362.88 MB  4.2%   c6b086ef-4b75-4192-881a-a2dc85e8daa0  4611686018427387892   1a
UN  10.93.5.157     360.4 MB   4.2%   46a8cd4b-83dc-49d4-8e42-7cd55bebd69e  5380300354831952542   1b
UN  10.236.138.169  360.05 MB  4.2%   0c909de3-1c26-413b-baeb-a1d46d56e1bd  6148914691236517192   1a
UN  10.92.231.170   362.99 MB  4.2%   

Re: Best version to upgrade from 1.1.10 to 1.2.X

2013-09-24 Thread Charles Brophy
Hi Paulo,

I just completed a migration from 1.1.10 to 1.2.10 and it was surprisingly
painless.

The course of action that I took:
1) describe cluster - make sure all nodes are on the same schema
2) shut off all maintenance tasks; i.e. make sure no scheduled repair is
going to kick off in the middle of what you're doing
3) snapshot - maybe not necessary but it's so quick it makes no sense to
skip this step
4) drain the nodes - I shut down the entire cluster rather than chance any
incompatible gossip concerns that might come from a rolling upgrade. I have
the luxury of controlling both the providers and consumers of our data, so
this wasn't so disruptive for us.
5) Upgrade the nodes, turn them on one-by-one, monitor the logs for funny
business.
6) nodetool upgradesstables
7) Turn various maintenance tasks back on, etc.

The worst part was managing the yaml/config changes between the versions.
It wasn't horrible, but the diff was noisier than a more incremental
upgrade typically is. A few things I recall that were special:
1) Since you have an existing cluster, you'll probably need to set the
default partitioner back to RandomPartitioner in cassandra.yaml. I believe
that is outlined in NEWS.
2) I set the initial tokens to be the same as what the nodes held
previously.
3) The timeout is now divided into more atomic settings and you get to
decide how (or if) to configure each from the default appropriately.

tl;dr: I did a standard upgrade and paid careful attention to the NEWS.txt
upgrade notices. I did a full cluster restart and NOT a rolling upgrade. It
went without a hitch.

Charles






On Tue, Sep 24, 2013 at 2:33 PM, Paulo Motta pauloricard...@gmail.comwrote:

 Cool, sounds fair enough. Thanks for the help, Rob!

 If anyone has upgraded from 1.1.X to 1.2.X, please feel invited to share
 any tips on issues you're encountered that are not yet documented.

 Cheers,

 Paulo


 2013/9/24 Robert Coli rc...@eventbrite.com

 On Tue, Sep 24, 2013 at 1:41 PM, Paulo Motta pauloricard...@gmail.comwrote:

 Doesn't the probability of something going wrong increases as the gap
 between the versions increase? So, using this reasoning, upgrading from
 1.1.10 to 1.2.6 would have less chance of something going wrong then from
 1.1.10 to 1.2.9 or 1.2.10.


 Sorta, but sorta not.

 https://github.com/apache/cassandra/blob/trunk/NEWS.txt

 Is the canonical source of concerns on upgrade. There are a few cases
 where upgrading to the root of X.Y.Z creates issues that do not exist if
 you upgrade to the head of that line. AFAIK there have been no cases
 where upgrading to the head of a line (where that line is mature, like
 1.2.10) has created problems which would have been avoided by upgrading to
 the root first.


 I'm hoping this reasoning is wrong and I can update directly from 1.1.10
 to 1.2.10. :-)


 That's what I plan to do when we move to 1.2.X, FWIW.

 =Rob




 --
 Paulo Ricardo

 --
 European Master in Distributed Computing
 Royal Institute of Technology - KTH
 Instituto Superior Técnico - IST
 http://paulormg.com



1.2.10 - 2.0.1 migration issue

2013-09-24 Thread Christopher Wirt
Hi,

 

Just had a go at upgrading a node to the latest stable c* 2 release and
think I ran into some issues with manifest migration.

 

On initial startup I hit this error as it starts to load the first of my
CFs.

 

INFO [main] 2013-09-24 22:56:01,018 LegacyLeveledManifest.java (line 89)
Migrating manifest for struqrealtime/impressionstorev2

INFO [main] 2013-09-24 22:56:01,019 LegacyLeveledManifest.java (line 119)
Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration

ERROR [main] 2013-09-24 22:56:01,030 CassandraDaemon.java (line 459)
Exception encountered during startup

FSWriteError in
/disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablem
etamigration/impressionstorev2.json

at
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:83)

at
org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(
LegacyLeveledManifest.java:138)

at
org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(Le
gacyLeveledManifest.java:91)

at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)

at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:4
42)

at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)

Caused by: java.nio.file.NoSuchFileException:
/disk1/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablem
etamigration/impressionstorev2.json -
/disk1/cassandra/data/struqrealtime/impressionstorev2/impressionstorev2.json

at
sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)

at
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)

at
sun.nio.fs.UnixFileSystemProvider.createLink(UnixFileSystemProvider.java:474
)

at java.nio.file.Files.createLink(Files.java:1037)

at
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:79)

... 5 more
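
(For anyone reading the trace: as the frames show, FileUtils.createHardLink delegates to java.nio.file.Files.createLink, which throws NoSuchFileException when the source file is missing or the link's parent directory has not been created yet. A tiny standalone sketch of that behaviour, with made-up paths loosely mirroring the snapshot layout above:)

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class HardLinkProbe {
    public static void main(String[] args) throws IOException {
        // Hypothetical paths; not the actual Cassandra data directories.
        Path link = Paths.get("/tmp/snapshots/pre-sstablemetamigration/manifest.json");
        Path existing = Paths.get("/tmp/data/manifest.json");
        try {
            // Throws NoSuchFileException if 'existing' is absent or the parent
            // directory of 'link' does not exist yet.
            Files.createLink(link, existing);
        } catch (NoSuchFileException e) {
            System.err.println("Hard link failed for: " + e.getFile());
        }
    }
}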

 

I had already successfully run a test migration on our dev server. The only real
difference I can see is the number of data directories defined and the
amount of data being held.

 

I've run upgradesstables under 1.2.10. I have always been using vnodes and
CQL3. I recently moved to using LZ4 instead of Snappy.

 

I tried to start up again and it gave me a slightly different error:

 

INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 89)
Migrating manifest for struqrealtime/impressionstorev2

INFO [main] 2013-09-24 22:58:28,218 LegacyLeveledManifest.java (line 119)
Snapshotting struqrealtime, impressionstorev2 to pre-sstablemetamigration

ERROR [main] 2013-09-24 22:58:28,222 CassandraDaemon.java (line 459)
Exception encountered during startup

java.lang.RuntimeException: Tried to create duplicate hard link to
/disk3/cassandra/data/struqrealtime/impressionstorev2/snapshots/pre-sstablem
etamigration/struqrealtime-impressionstorev2-ic-1030-TOC.txt

at
org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:71)

at
org.apache.cassandra.db.compaction.LegacyLeveledManifest.snapshotWithoutCFS(
LegacyLeveledManifest.java:129)

at
org.apache.cassandra.db.compaction.LegacyLeveledManifest.migrateManifests(Le
gacyLeveledManifest.java:91)

at
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:246)

at
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:4
42)

at
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)

 

I will have a go at recreating this tomorrow.

 

Any insight or guesses at what the issue might be are always welcome.

 

Thanks,

Chris