Re: Does or will Cassandra support OpenJDK ?

2012-05-15 Thread Jeff Williams
I've used java-package under Debian (http://wiki.debian.org/JavaPackage) which 
turns your download from Oracle into a .deb. This may work on Ubuntu as well.

On May 14, 2012, at 11:19 PM, aaron morton wrote:

 To get the latest Sun Java 6 JRE on an Ubuntu machine using apt-get, I've used 
 the instructions here: https://help.ubuntu.com/community/Java#JRE_only
 
 I've also used OpenJDK for Java 6 on Ubuntu without issue. You will want to 
 edit cassandra-env.sh to enable the jamm memory meter though: just comment 
 out the if statement and leave the JVM_OPTS… line uncommented.
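 The edit described above might look roughly like this in conf/cassandra-env.sh (a sketch, not the verbatim file: the exact guard condition and jamm jar version vary by release):

```shell
# Sketch of conf/cassandra-env.sh with the OpenJDK guard commented out so the
# jamm javaagent is always enabled (jar version varies by release):
#if [ "$JVM_VENDOR" != "OpenJDK" ]
#then
    JVM_OPTS="$JVM_OPTS -javaagent:$CASSANDRA_HOME/lib/jamm-0.2.5.jar"
#fi
```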
 
 Cheers
 
 
 -
 Aaron Morton
 Freelance Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 On 15/05/2012, at 2:33 AM, Jeremiah Jordan wrote:
 
 OpenJDK is Java 1.7.  Once Cassandra supports Java 1.7 it would most likely 
 work on OpenJDK, as OpenJDK 1.7 really is the same thing as Oracle JDK 
 1.7 minus some licensed components.
 
 -Jeremiah
 
 On May 11, 2012, at 10:02 PM, ramesh wrote:
 
 I've had problems downloading the Sun (Oracle) JDK and found this thread 
 where an Oracle official is insisting, or rather forcing, Linux users to 
 move to OpenJDK. Here is the thread:
 
 https://forums.oracle.com/forums/thread.jspa?threadID=2365607
 
 I need this because I run Cassandra.
 Just curious to know whether I will be able to avoid the pain of using the 
 Sun JDK for production Cassandra in the future?
 
 regards
 Ramesh
 
 



Re: How do I add a custom comparator class to a cassandra cluster ?

2012-05-15 Thread Brandon Williams
On Tue, May 15, 2012 at 12:53 AM, Ertio Lew ertio...@gmail.com wrote:
 @Brandon : I just created a JIRA issue to request this type of comparator
 along with Cassandra.

 It is about a UTF8 comparator that provides case-insensitive ordering of
 columns.
 See the issue here: https://issues.apache.org/jira/browse/CASSANDRA-4245

Everything I said before still stands, as far as I can tell.

-Brandon


Re: live ratio counting

2012-05-15 Thread Radim Kolar


Try reducing memtable_total_space_in_mb config setting. If the problem 
is incorrect memory metering that should help.
it does not help much because the difference between the correct 
calculation and cassandra's assumed one is far too high. It would require 
me to shrink memtables to about 10% of their correct size, leading to too 
many compactions.




i have 3 workload types running in batch: delete only, insert only, and 
heavy update (lots of overwrites)
Are you saying you do a lot of deletes, followed by a lot of inserts 
and then updates, all for the same CF?
no. the most common workload type is insert only. from time to time there 
are batch jobs doing lots of overwrites in memtables, and occasionally 
cleanup jobs doing only deletes. This breaks the liveRatio calculation too, 
because cassandra assumes not only that the average column size stored in a 
memtable is constant but also that the overwrite ratio in the memtable is 
constant. If you overwrite too much, cassandra starts to make very tiny 
sstables; if you delete too much, there is a risk of OOM.




yes. The record is about 120, but it is rare. 80 should be good enough. 
The default of 10 (if not using jamm) is way too low.
Can you provide some information on what is stored in the CF and what 
sort of workload it sees? It would be interesting to understand why the 
real memory usage is 120 times the serialised size.

super column family:

  and column_metadata = [
    {column_name : 'crc32',
     validation_class : LongType},
    {column_name : 'id',
     validation_class : LongType},
    {column_name : 'name',
     validation_class : AsciiType},
    {column_name : 'size',
     validation_class : LongType}];

but this is not important; the problem is that you do not calculate the live 
ratio frequently enough. if the workload changes, the ratio looks like:


 INFO [MemoryMeter:1] 2012-05-12 21:11:51,649 Memtable.java (line 186) 
CFS(Keyspace='dedup', ColumnFamily='resultcache') liveRatio is 64.0 
(just-counted was 4.633391051722882).  calculation took 111ms for 4465 
columns


why not recalculate it every 5 or 10 minutes? the calculation takes just a few 
seconds.


Re: live ratio counting

2012-05-15 Thread Radim Kolar

here is part of the log. actually the record is 419.
ponto:(admin)log/cassandra> grep "to maximum of 64" system.log.1
 WARN [MemoryMeter:1] 2012-02-03 00:00:19,444 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 64.9096047648211
 WARN [MemoryMeter:1] 2012-02-08 00:00:17,379 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 68.81016452376322
 WARN [MemoryMeter:1] 2012-02-08 00:00:32,358 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 88.49747308025415
 WARN [MemoryMeter:1] 2012-02-09 00:00:08,448 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 76.2444888765154
 WARN [MemoryMeter:1] 2012-02-10 18:18:52,677 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 142.22477982642255
 WARN [MemoryMeter:1] 2012-02-20 00:00:53,753 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 88.19832386767173
 WARN [MemoryMeter:1] 2012-03-02 10:41:00,232 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 419.9607495592804
 WARN [MemoryMeter:1] 2012-03-07 14:13:15,141 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of Infinity
 WARN [MemoryMeter:1] 2012-03-08 00:01:12,766 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 94.20215772717702
 WARN [MemoryMeter:1] 2012-03-09 00:00:38,633 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 98.54003447121715
 WARN [MemoryMeter:1] 2012-03-11 00:00:13,243 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 193.14262214179965
 WARN [MemoryMeter:1] 2012-03-14 00:00:26,709 Memtable.java (line 181) 
setting live ratio to maximum of 64 instead of 103.88360138951437
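The clamping visible in these warnings can be sketched as follows (a rough illustration, not Cassandra's actual code: liveRatio is the measured on-heap size of the memtable divided by its serialized size, and 1.0.x clamps the just-counted value to at most 64):

```java
class LiveRatioSketch {
    // Rough sketch of the liveRatio computation and the cap seen in the log:
    // ratio = deep heap bytes / serialized bytes, clamped into [1.0, 64.0].
    static double cappedLiveRatio(long deepHeapBytes, long serializedBytes) {
        double ratio = deepHeapBytes / (double) serializedBytes;
        return Math.min(64.0, Math.max(1.0, ratio));
    }

    public static void main(String[] args) {
        // A just-counted ratio of ~419 (as in the log above) is clamped to 64,
        // so real heap usage is underestimated by a factor of about 6.5.
        System.out.println(cappedLiveRatio(419_000, 1_000)); // prints 64.0
    }
}
```

The clamp is why a workload whose true ratio swings between 4 and 419 cannot be tracked by a single capped value.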




Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread Abhijit Chanda
Tamar,

Can you please illustrate a little bit with some sample code? It would be
highly appreciated.

Thanks,

On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 I don't think this is possible; the best you can do is a prefix match, if your
 order is alphabetical. For example, I have a CF with comparator UTF8Type,
 and then I can do a slice query and bring all columns that start with the
 prefix and end with the prefix where you replace the last char with the
 next one in order (i.e. aaa -> aab).

 Hope that helps.
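 The "bump the last character" trick described above can be sketched like this (a hypothetical helper, assuming the last character of the prefix can simply be incremented):

```java
class PrefixRange {
    // Compute the exclusive upper bound for a prefix slice by replacing the
    // last character with its successor, e.g. "aaa" -> "aab".
    static String prefixEnd(String prefix) {
        int last = prefix.length() - 1;
        char bumped = (char) (prefix.charAt(last) + 1);
        return prefix.substring(0, last) + bumped;
    }

    public static void main(String[] args) {
        // Slicing from "aaa" (inclusive) to prefixEnd("aaa") (exclusive)
        // returns every column name starting with "aaa".
        System.out.println(prefixEnd("aaa")); // prints aab
    }
}
```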

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda abhijit.chan...@gmail.com
  wrote:

 I don't know the exact value of a column, but I want to do partial
 matching to find all available values that match.
 I want to do the same kind of operation that the LIKE operator in SQL does.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





-- 
Abhijit Chanda
Software Developer
VeHere Interactive Pvt. Ltd.
+91-974395

Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread selam
Mapreduce jobs may solve your problem  for batch processing

On Tue, May 15, 2012 at 12:49 PM, Abhijit Chanda
abhijit.chan...@gmail.comwrote:

 Tamar,

 Can you please illustrate little bit with some sample code. It highly
 appreciable.

 Thanks,


 On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 I don't think this is possible, the best you can do is prefix, if your
 order is alphabetical. For example I have a CF with comparator UTF8Type,
 and then I can do slice query and bring all columns that start with the
 prefix, and end with the prefix where you replace the last char with the
 next one in order (i.e. aaa-aab).

 Hope that helps.

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 I don't know the exact value on a column, but I want to do a partial
 matching to know all available values that matches.
 I want to do similar kind of operation that LIKE operator in SQL do.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395




-- 
Saygılar  İyi Çalışmalar
Timu EREN ( a.k.a selam )

Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread samal
You cannot fetch by a partial column value.
You can only fetch by value if the column has a secondary index, and even
then the exact column value needs to match.

as Tamar suggested, you can put the value in the column name, with a UTF8
comparator.

{
  'name_abhijit'  : 'abhijit',
  'name_abhishek' : 'abhishek',
  'name_atul'     : 'atul'
}

here you can do a slice query on the column name and get the desired result.

/samal
On Tue, May 15, 2012 at 3:29 PM, selam selam...@gmail.com wrote:

 Mapreduce jobs may solve your problem  for batch processing


 On Tue, May 15, 2012 at 12:49 PM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 Tamar,

 Can you please illustrate little bit with some sample code. It highly
 appreciable.

 Thanks,


 On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 I don't think this is possible, the best you can do is prefix, if your
 order is alphabetical. For example I have a CF with comparator UTF8Type,
 and then I can do slice query and bring all columns that start with the
 prefix, and end with the prefix where you replace the last char with
 the next one in order (i.e. aaa-aab).

 Hope that helps.

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 I don't know the exact value on a column, but I want to do a partial
 matching to know all available values that matches.
 I want to do similar kind of operation that LIKE operator in SQL do.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395




 --
 Saygılar  İyi Çalışmalar
 Timu EREN ( a.k.a selam )


Re: Composite Column

2012-05-15 Thread samal
I have not used CC, but yes, you can.
What is below is not a composite column, though; it is just a column with a
JSON hash as its value. A column value can be anything you like.
Data inside the value is not indexed.

On Tue, May 15, 2012 at 9:27 AM, Abhijit Chanda
abhijit.chan...@gmail.comwrote:

 Is it possible to create this data model with the help of composite columns?

 User_Keys_By_Last_Name = {
   Engineering : {anderson, 1 : ac1263, anderson, 2 : 724f02, ... },
   Sales : { adams, 1 : b32704, alden, 1 : 1553bd, ... },
 }

 I am using Astyanax. Please suggest...
 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395
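The nested model above maps naturally onto composite column names; here is a plain-Java illustration of the ordering semantics (hypothetical names, not the Astyanax API): the department becomes the row key, and each (last_name, seq) pair becomes a composite column name whose value is the user key.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

class UserKeysByLastName {
    // Each row key is a department; each column name is a (last_name, seq)
    // composite and its value is the user key, mirroring the sort order of
    // a CompositeType(UTF8Type, IntegerType) comparator.
    static TreeMap<String[], String> engineeringRow() {
        Comparator<String[]> byComposite =
            Comparator.<String[], String>comparing(c -> c[0])
                      .thenComparing(c -> Integer.parseInt(c[1]));
        TreeMap<String[], String> row = new TreeMap<>(byComposite);
        row.put(new String[] {"anderson", "2"}, "724f02");
        row.put(new String[] {"anderson", "1"}, "ac1263");
        return row;
    }

    public static void main(String[] args) {
        // Columns come back sorted by (last_name, seq), so all "anderson"
        // entries form one contiguous, sliceable range.
        for (Map.Entry<String[], String> e : engineeringRow().entrySet())
            System.out.println(e.getKey()[0] + "," + e.getKey()[1]
                               + " : " + e.getValue());
    }
}
```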




Re: Composite Column

2012-05-15 Thread samal
It is just a column with a JSON value.

On Tue, May 15, 2012 at 4:00 PM, samal samalgo...@gmail.com wrote:

 I have not used CC, but yes, you can.
 What is below is not a composite column, though; it is just a column with a
 JSON hash as its value. A column value can be anything you like.
 Data inside the value is not indexed.


 On Tue, May 15, 2012 at 9:27 AM, Abhijit Chanda abhijit.chan...@gmail.com
  wrote:

 Is it possible to create this data model with the help of composite
 column.

 User_Keys_By_Last_Name = {
   Engineering : {anderson, 1 : ac1263, anderson, 2 : 724f02, ... },
   Sales : { adams, 1 : b32704, alden, 1 : 1553bd, ... },
 }

 I am using Astyanax. Please suggest...
 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread Tamar Fraenkel
Do you still need the sample code? I use Hector; here is an example.
*This is the Column Family definition:*
(I have a composite, but if you like you can have only the UTF8Type.)

CREATE COLUMN FAMILY title_indx
with comparator = 'CompositeType(UTF8Type,UUIDType)'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'LongType';

*The Query:*
SliceQuery<Long, Composite, String> query =
HFactory.createSliceQuery(CassandraHectorConn.getKeyspace(),
LongSerializer.get(),
CompositeSerializer.get(),
StringSerializer.get());
query.setColumnFamily("title_indx");
query.setKey(...);

Composite start = new Composite();
start.add(prefix);
char c = lowerCasePrefix.charAt(lastCharIndex);
String prefixEnd = prefix.substring(0, lastCharIndex) + ++c;
Composite end = new Composite();
end.add(prefixEnd);

ColumnSliceIterator<Long, Composite, String> iterator =
  new ColumnSliceIterator<Long, Composite, String>(
   query, start, end, false);
while (iterator.hasNext()) {
...
}

Cheers,

*Tamar Fraenkel *
Senior Software Engineer, TOK Media

[image: Inline image 1]

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Tue, May 15, 2012 at 1:19 PM, samal samalgo...@gmail.com wrote:

 You cannot extract via relative column value.
 It can only extract via value if it has secondary index but exact column
 value need to match.

 as tamar suggested you can put value as column name , UTF8 comparator.

 {
 'name_abhijit'='abhijit'
 'name_abhishek'='abhiskek'
 'name_atul'='atul'
 }

 here you can do slice query on column name and get desired result.

 /samal

 On Tue, May 15, 2012 at 3:29 PM, selam selam...@gmail.com wrote:

 Mapreduce jobs may solve your problem  for batch processing


 On Tue, May 15, 2012 at 12:49 PM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 Tamar,

 Can you please illustrate little bit with some sample code. It highly
 appreciable.

 Thanks,


 On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel ta...@tok-media.comwrote:

 I don't think this is possible, the best you can do is prefix, if your
 order is alphabetical. For example I have a CF with comparator UTF8Type,
 and then I can do slice query and bring all columns that start with the
 prefix, and end with the prefix where you replace the last char with
 the next one in order (i.e. aaa-aab).

 Hope that helps.

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 I don't know the exact value on a column, but I want to do a partial
 matching to know all available values that matches.
 I want to do similar kind of operation that LIKE operator in SQL do.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395




 --
 Saygılar  İyi Çalışmalar
 Timu EREN ( a.k.a selam )




Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread Abhijit Chanda
Thanks so much guys, especially Tamar; thank you so much man.

Regards,
Abhijit

On Tue, May 15, 2012 at 4:26 PM, Tamar Fraenkel ta...@tok-media.com wrote:

 Do you still need the sample code? I use Hector, well here is an example:
 *This is the Column Family definition:*
 (I have a composite, but if you like you can have only the UTF8Type).

 CREATE COLUMN FAMILY title_indx
 with comparator = 'CompositeType(UTF8Type,UUIDType)'
 and default_validation_class = 'UTF8Type'
 and key_validation_class = 'LongType';

 *The Query:*
 SliceQuery<Long, Composite, String> query =
 HFactory.createSliceQuery(CassandraHectorConn.getKeyspace(),
 LongSerializer.get(),
 CompositeSerializer.get(),
 StringSerializer.get());
 query.setColumnFamily("title_indx");
 query.setKey(...);

 Composite start = new Composite();
 start.add(prefix);
 char c = lowerCasePrefix.charAt(lastCharIndex);
 String prefixEnd = prefix.substring(0, lastCharIndex) + ++c;
 Composite end = new Composite();
 end.add(prefixEnd);

 ColumnSliceIterator<Long, Composite, String> iterator =
   new ColumnSliceIterator<Long, Composite, String>(
    query, start, end, false);
 while (iterator.hasNext()) {
 ...
 }

 Cheers,

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 1:19 PM, samal samalgo...@gmail.com wrote:

 You cannot extract via relative column value.
 It can only extract via value if it has secondary index but exact column
 value need to match.

 as tamar suggested you can put value as column name , UTF8 comparator.

 {
 'name_abhijit'='abhijit'
 'name_abhishek'='abhiskek'
 'name_atul'='atul'
 }

 here you can do slice query on column name and get desired result.

 /samal

 On Tue, May 15, 2012 at 3:29 PM, selam selam...@gmail.com wrote:

 Mapreduce jobs may solve your problem  for batch processing


 On Tue, May 15, 2012 at 12:49 PM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 Tamar,

 Can you please illustrate little bit with some sample code. It highly
 appreciable.

 Thanks,


 On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel 
 ta...@tok-media.comwrote:

 I don't think this is possible, the best you can do is prefix, if your
 order is alphabetical. For example I have a CF with comparator UTF8Type,
 and then I can do slice query and bring all columns that start with the
 prefix, and end with the prefix where you replace the last char with
 the next one in order (i.e. aaa-aab).

 Hope that helps.

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 I don't know the exact value on a column, but I want to do a partial
 matching to know all available values that matches.
 I want to do similar kind of operation that LIKE operator in SQL do.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395




 --
 Saygılar  İyi Çalışmalar
 Timu EREN ( a.k.a selam )






-- 
Abhijit Chanda
Software Developer
VeHere Interactive Pvt. Ltd.
+91-974395

Couldn't find cfId

2012-05-15 Thread Daning Wang
We got the exception UnserializableColumnFamilyException: Couldn't find
cfId=1075 in the log of one node. describe cluster shows all the nodes on the
same schema version. How do we fix this problem? I ran repair but it does not
seem to have worked; I haven't tried scrub yet.

We are on v1.0.3

ERROR [HintedHandoff:1631] 2012-05-15 07:13:07,877
AbstractCassandraDaemon.java (line 139) Fatal exception in thread
Thread[HintedHandoff:1631,1,main]
java.lang.RuntimeException:
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find
cfId=1075
at
org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:689)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.cassandra.db.UnserializableColumnFamilyException:
Couldn't find cfId=1075
at
org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:129)
at
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:401)
at
org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:409)
at
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpointInternal(HintedHandOffManager.java:344)
at
org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:248)
at
org.apache.cassandra.db.HintedHandOffManager.access$200(HintedHandOffManager.java:84)
at
org.apache.cassandra.db.HintedHandOffManager$3.runMayThrow(HintedHandOffManager.java:418)

Thanks,

Daning


Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread Tamar Fraenkel
Actually woman ;-)

*Tamar Fraenkel *
Senior Software Engineer, TOK Media

[image: Inline image 1]

ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





On Tue, May 15, 2012 at 3:45 PM, Abhijit Chanda
abhijit.chan...@gmail.comwrote:

 Thanks so much guys, especially Tamar; thank you so much man.

 Regards,
 Abhijit


 On Tue, May 15, 2012 at 4:26 PM, Tamar Fraenkel ta...@tok-media.comwrote:

 Do you still need the sample code? I use Hector, well here is an example:
 *This is the Column Family definition:*
 (I have a composite, but if you like you can have only the UTF8Type).

 CREATE COLUMN FAMILY title_indx
 with comparator = 'CompositeType(UTF8Type,UUIDType)'
 and default_validation_class = 'UTF8Type'
 and key_validation_class = 'LongType';

 *The Query:*
 SliceQuery<Long, Composite, String> query =
 HFactory.createSliceQuery(CassandraHectorConn.getKeyspace(),
 LongSerializer.get(),
 CompositeSerializer.get(),
 StringSerializer.get());
 query.setColumnFamily("title_indx");
 query.setKey(...);

 Composite start = new Composite();
 start.add(prefix);
 char c = lowerCasePrefix.charAt(lastCharIndex);
 String prefixEnd = prefix.substring(0, lastCharIndex) + ++c;
 Composite end = new Composite();
 end.add(prefixEnd);

 ColumnSliceIterator<Long, Composite, String> iterator =
   new ColumnSliceIterator<Long, Composite, String>(
    query, start, end, false);
 while (iterator.hasNext()) {
 ...
 }

 Cheers,

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 1:19 PM, samal samalgo...@gmail.com wrote:

 You cannot extract via relative column value.
 It can only extract via value if it has secondary index but exact column
 value need to match.

 as tamar suggested you can put value as column name , UTF8 comparator.

 {
 'name_abhijit'='abhijit'
 'name_abhishek'='abhiskek'
 'name_atul'='atul'
 }

 here you can do slice query on column name and get desired result.

 /samal

 On Tue, May 15, 2012 at 3:29 PM, selam selam...@gmail.com wrote:

 Mapreduce jobs may solve your problem  for batch processing


 On Tue, May 15, 2012 at 12:49 PM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 Tamar,

 Can you please illustrate little bit with some sample code. It highly
 appreciable.

 Thanks,


 On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel 
 ta...@tok-media.comwrote:

 I don't think this is possible, the best you can do is prefix, if
 your order is alphabetical. For example I have a CF with
 comparator UTF8Type, and then I can do slice query and bring all columns
 that start with the prefix, and end with the prefix where you replace the
 last char with the next one in order (i.e. aaa-aab).

 Hope that helps.

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media

 [image: Inline image 1]

 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 I don't know the exact value on a column, but I want to do a partial
 matching to know all available values that matches.
 I want to do similar kind of operation that LIKE operator in SQL do.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395




 --
 Saygılar  İyi Çalışmalar
 Timu EREN ( a.k.a selam )






 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395



Tuning cassandra (compactions overall)

2012-05-15 Thread Alain RODRIGUEZ
Hi,

I'm using a 2 node cluster in production ( 2 EC2 c1.medium, CL.ONE, RF
= 2, using RP)

1 - I get this kind of message quite often (say, every 30 seconds):

WARN [ScheduledTasks:1] 2012-05-15 15:44:53,083 GCInspector.java (line
145) Heap is 0.8081418550931491 full.  You may need to reduce memtable
and/or cache sizes.  Cassandra will now flush up to the two largest
memtables to free up memory.  Adjust flush_largest_memtables_at
threshold in cassandra.yaml if you don't want Cassandra to do this
automatically
 WARN [ScheduledTasks:1] 2012-05-15 15:44:53,084 StorageService.java
(line 2645) Flushing CFS(Keyspace='xxx', ColumnFamily='yyy') to
relieve memory pressure

Is that a problem?

2 - I have shared two screenshots: the cluster performance (via OpsCenter) and
the hardware metrics (via AWS).

http://img337.imageshack.us/img337/6812/performance.png
http://img256.imageshack.us/img256/9644/aws.png

What do you think of these metrics? Are frequent compactions normal?
What about having 60-70% CPU load for 600 reads+writes/sec with this
hardware? Is there a way to optimize my cluster?

Here are the main points of my cassandra.yaml:

flush_largest_memtables_at: 0.75
reduce_cache_sizes_at: 0.85
reduce_cache_capacity_to: 0.6
concurrent_reads: 32
concurrent_writes: 32
commitlog_total_space_in_mb: 4096
rpc_server_type: sync (I am going to switch to hsha, because we are
using ubuntu)
#concurrent_compactors: 1 (commented, so I use default)
multithreaded_compaction: false
compaction_throughput_mb_per_sec: 16
rpc_timeout_in_ms: 1

other tuning options (like many of the ones above) are left at their defaults.
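One knob relevant to warning (1) that is not in the list above is the global memtable cap; a hedged sketch (illustrative values only, and in 1.0.x the cap defaults to roughly a third of the heap if unset):

```yaml
# Illustrative values only: cap total memtable space so flushes are triggered
# well before the heap reaches flush_largest_memtables_at (0.75 here).
memtable_total_space_in_mb: 1024
flush_largest_memtables_at: 0.75
```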

Any advice or comment would be appreciated :).

Alain


cassandra upgrade to 1.1 - migration problem

2012-05-15 Thread Casey Deccio
I recently upgraded from cassandra 1.0.10 to 1.1.  Everything worked fine
in one environment, but after I upgraded in another, I can't find my
keyspace.  When I run, e.g., cassandra-cli with 'use KeySpace;' it tells me
that the keyspace doesn't exist.  In the log I see this:

ERROR [MigrationStage:1] 2012-05-15 11:39:48,216
AbstractCassandraDaemon.java (line 134) Exception in thread
Thread[MigrationStage:1,5,main]java.lang.AssertionError
at
org.apache.cassandra.db.DefsTable.updateKeyspace(DefsTable.java:441)
at
org.apache.cassandra.db.DefsTable.mergeKeyspaces(DefsTable.java:339)
at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:269)
at
org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
at
org.apache.cassandra.service.MigrationManager$MigrationTask.runMayThrow(MigrationManager.java:416)
at
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)

I can see that the data I would expect still seems to be in the new place
(/var/lib/cassandra/data/App/ColFamily/App-DomainName-*) on all nodes.

What am I missing?

Thanks,
Casey


Re: cassandra upgrade to 1.1 - migration problem

2012-05-15 Thread Casey Deccio
Here's something new in the logs:

ERROR 12:21:09,418 Exception in thread Thread[SSTableBatchOpen:2,5,main]
java.lang.RuntimeException: Cannot open
/var/lib/cassandra/data/system/Versions/system-Versions-hc-35 because
partitioner does not match org.apache.cassandra.dht.ByteOrderedPartitioner
at
org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:164)
at
org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:224)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Casey

On Tue, May 15, 2012 at 12:08 PM, Casey Deccio ca...@deccio.net wrote:

 I recently upgraded from cassandra 1.0.10 to 1.1.  Everything worked fine
 in one environment, but after I upgraded in another, I can't find my
 keyspace.  When I run, e.g., cassandra-cli with 'use KeySpace;' It tells me
 that the keyspace doesn't exist.  In the log I see this:

 ERROR [MigrationStage:1] 2012-05-15 11:39:48,216
 AbstractCassandraDaemon.java (line 134) Exception in thread
 Thread[MigrationStage:1,5,main]java.lang.AssertionError
 at
 org.apache.cassandra.db.DefsTable.updateKeyspace(DefsTable.java:441)
 at
 org.apache.cassandra.db.DefsTable.mergeKeyspaces(DefsTable.java:339)
 at
 org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:269)
 at
 org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
 at
 org.apache.cassandra.service.MigrationManager$MigrationTask.runMayThrow(MigrationManager.java:416)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at java.util.concurrent.FutureTask.run(FutureTask.java:166)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)

 I can see that the data I would expect still seems to be in the new place
 (/var/lib/cassandra/data/App/ColFamily/App-DomainName-*) on all nodes.

 What am I missing?

 Thanks,
 Casey



no snappyjava in java.library.path (JDK 1.7 issue?)

2012-05-15 Thread Stephen McKamey
I'm getting an org.xerial.snappy.SnappyError when creating my first column
family after blowing away my Cassandra installation and trying to run the
latest release. I'm undoubtedly making some silly mistake but cannot seem
to find it. I even commented out my sstable_compression=SnappyCompressor
settings.

InvalidRequestException(why:SnappyCompressor.create() threw an error:
org.xerial.snappy.SnappyError [FAILED_TO_LOAD_NATIVE_LIBRARY] null)
at
org.apache.cassandra.thrift.Cassandra$system_add_column_family_result.read(Cassandra.java:27683)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at
org.apache.cassandra.thrift.Cassandra$Client.recv_system_add_column_family(Cassandra.java:1193)
at
org.apache.cassandra.thrift.Cassandra$Client.system_add_column_family(Cassandra.java:1180)
at
org.scale7.cassandra.pelops.ColumnFamilyManager$2.execute(ColumnFamilyManager.java:64)
at
org.scale7.cassandra.pelops.ColumnFamilyManager$2.execute(ColumnFamilyManager.java:61)
at
org.scale7.cassandra.pelops.ManagerOperand.tryOperation(ManagerOperand.java:131)
at
org.scale7.cassandra.pelops.ColumnFamilyManager.addColumnFamily(ColumnFamilyManager.java:67)


Worth noting is I'm on Mac OS X 10.7.4 and I recently upgraded to the
latest JDK (really hoping this isn't the issue):

java version 1.7.0_04
Java(TM) SE Runtime Environment (build 1.7.0_04-b21)
Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)


When Cassandra starts up with blown-away lib/log dirs, I can see Snappy in
the classpath, but it says Native methods will be disabled despite my
having JNA. I've got the latest JNA installed at /usr/share/java/jna.jar
and symlinked at /opt/apache-cassandra-1.1.0/lib/jna.jar.

 INFO 12:18:59,430 Logging initialized
 INFO 12:18:59,434 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
VM/1.7.0_04
 
 INFO 12:18:59,435 Classpath:
bin/../conf:bin/../build/classes/main:bin/../build/classes/thrift:bin/../lib/antlr-3.2.jar:bin/../lib/apache-cassandra-1.1.0.jar:bin/../lib/apache-cassandra-clientutil-1.1.0.jar:bin/../lib/apache-cassandra-thrift-1.1.0.jar:bin/../lib/avro-1.4.0-fixes.jar:bin/../lib/avro-1.4.0-sources-fixes.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/compress-lzf-0.8.4.jar:bin/../lib/concurrentlinkedhashmap-lru-1.2.jar:bin/../lib/guava-r08.jar:bin/../lib/high-scale-lib-1.1.2.jar:bin/../lib/jackson-core-asl-1.9.2.jar:bin/../lib/jackson-mapper-asl-1.9.2.jar:bin/../lib/jamm-0.2.5.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/jna.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-0.7.0.jar:bin/../lib/log4j-1.2.16.jar:bin/../lib/metrics-core-2.0.3.jar:bin/../lib/platform.jar:bin/../lib/servlet-api-2.5-20081211.jar:bin/../lib/slf4j-api-1.6.1.jar:bin/../lib/slf4j-log4j12-1.6.1.jar:bin/../lib/snakeyaml-1.6.jar:bin/../lib/snappy-java-1.0.4.1.jar:bin/../lib/snaptree-0.1.jar:bin/../lib/jamm-0.2.5.jar
 INFO 12:18:59,710 Unable to link C library. Native methods will be
disabled.
 
 INFO 12:19:00,529 Cassandra version: 1.1.0


Re: cassandra upgrade to 1.1 - migration problem

2012-05-15 Thread Casey Deccio
cassandra.yaml on all nodes had ByteOrderedPartitioner with both the
previous version and upgraded version.

That being said, when I first started up cassandra after upgrading (with
the updated .yaml, including ByteOrderedPartitioner), all nodes in the ring
appeared to be up.  But the load they carried was minimal (KB, as opposed
to GB in the previous version), and the keyspace didn't exist.  When I then
attempted to restart the daemon on each node to see if it would help,
starting up failed on each with the partitioner error.

Casey

On Tue, May 15, 2012 at 12:59 PM, Oleg Dulin oleg.du...@liquidanalytics.com
 wrote:

 Did you check cassandra.yaml to make sure partitioner there matches what
 was in your old cluster ?

 Regards,
 Oleg Dulin
 Please note my new office #: 732-917-0159

 On May 15, 2012, at 3:22 PM, Casey Deccio wrote:

 Here's something new in the logs:

 ERROR 12:21:09,418 Exception in thread Thread[SSTableBatchOpen:2,5,main]
 java.lang.RuntimeException: Cannot open
 /var/lib/cassandra/data/system/Versions/system-Versions-hc-35 because
 partitioner does not match org.apache.cassandra.dht.ByteOrderedPartitioner
 at
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:164)
 at
 org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:224)
 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)

 Casey

 On Tue, May 15, 2012 at 12:08 PM, Casey Deccio ca...@deccio.net wrote:

 I recently upgraded from cassandra 1.0.10 to 1.1.  Everything worked fine
 in one environment, but after I upgraded in another, I can't find my
 keyspace.  When I run, e.g., cassandra-cli with 'use KeySpace;', it tells
 me that the keyspace doesn't exist.  In the log I see this:

 ERROR [MigrationStage:1] 2012-05-15 11:39:48,216
 AbstractCassandraDaemon.java (line 134) Exception in thread
 Thread[MigrationStage:1,5,main]java.lang.AssertionError
 at
 org.apache.cassandra.db.DefsTable.updateKeyspace(DefsTable.java:441)
 at
 org.apache.cassandra.db.DefsTable.mergeKeyspaces(DefsTable.java:339)
 at
 org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:269)
 at
 org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
 at
 org.apache.cassandra.service.MigrationManager$MigrationTask.runMayThrow(MigrationManager.java:416)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at
 java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
 at
 java.util.concurrent.FutureTask.run(FutureTask.java:166)at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)

 I can see that the data I would expect still seems to be in the new place
 (/var/lib/cassandra/data/App/ColFamily/App-DomainName-*) on all nodes.

 What am I missing?

 Thanks,
 Casey






Re: no snappyjava in java.library.path (JDK 1.7 issue?)

2012-05-15 Thread Stephen McKamey
Reverting to JDK 1.6 appears to fix the issue. Is JDK 1.7 not yet supported
by Cassandra?

java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04-415-11M3635)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-415, mixed mode)


On Tue, May 15, 2012 at 12:55 PM, Stephen McKamey step...@mckamey.comwrote:

 Worth noting is I'm on Mac OS X 10.7.4 and I recently upgraded to the
 latest JDK (really hoping this isn't the issue):

 java version "1.7.0_04"
 Java(TM) SE Runtime Environment (build 1.7.0_04-b21)
 Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)




Snapshot failing on JSON files in 1.1.0

2012-05-15 Thread Bryan Fernandez
Greetings,

We recently upgraded from 1.0.8 to 1.1.0. Everything has been running fine
with the exception of snapshots. When attempting to snapshot any of the
nodes in our six node cluster we are seeing the following error.

[root@cassandra-n6 blotter]# /opt/apache-cassandra-1.1.0/bin/nodetool -h
10.20.50.58 snapshot
Requested snapshot for: all keyspaces
Exception in thread "main" java.io.IOError: java.io.IOException: Unable to
create hard link from
/var/lib/cassandra/data/blotter/twitter_users/twitter_users.json to
/var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
(errno 17)
at
org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1454)
at
org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1483)
at org.apache.cassandra.db.Table.snapshot(Table.java:205)
at
org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1793)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
at
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
at
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
at
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
at
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
at
javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
at
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
at
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
at
javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
at sun.rmi.transport.Transport$1.run(Transport.java:159)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
at
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Unable to create hard link from
/var/lib/cassandra/data/blotter/twitter_users/twitter_users.json to
/var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
(errno 17)
at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:163)
at
org.apache.cassandra.db.Directories.snapshotLeveledManifest(Directories.java:343)
at
org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1450)
... 33 more


However, an ls shows that both of these JSON files exist on the filesystem
(although with slightly different sizes).

[root@cassandra-n6 blotter]# ls -al
/var/lib/cassandra/data/blotter/twitter_users/twitter_users.json
-rw-r--r-- 1 root root 38786 May 15 20:51
/var/lib/cassandra/data/blotter/twitter_users/twitter_users.json

[root@cassandra-n6 blotter]# ls -al
/var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
-rw-r--r-- 1 root root 38778 May 15 20:50
/var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json


We are using Leveled Compaction on the twitter_users CF, which I assume is
creating the JSON files.

[root@cassandra-n6 blotter]# ls -al
/var/lib/cassandra/data/blotter/twitter_users/*.json
-rw-r--r-- 1 root root 38779 May 15 20:51
/var/lib/cassandra/data/blotter/twitter_users/twitter_users.json
-rw-r--r-- 1 root root 38779 May 15 20:51
/var/lib/cassandra/data/blotter/twitter_users/twitter_users-old.json
-rw-r--r-- 1 root root  1040 May 15 20:51
/var/lib/cassandra/data/blotter/twitter_users/twitter_users.twitter_user_attributes_screenname_idx.json
-rw-r--r-- 1 root root  1046 May 15 20:50

Re: need some clarification on recommended memory size

2012-05-15 Thread Tyler Hobbs
On Tue, May 15, 2012 at 3:19 PM, Yiming Sun yiming@gmail.com wrote:

 Hello,

 I was reading the Apache Cassandra 1.0 Documentation PDF dated May 10,
 2012, and had some questions on what the recommended memory size is.

 Below is the snippet from the PDF.  Bullet 1 suggests to have 16-32GB of
 RAM, yet Bullet 2 suggests to limit Java heap size to no more than 8GB.  My
 understanding is that Cassandra is implemented purely in Java, so all
 memory it sees and uses is the JVM Heap.


The main way that additional RAM helps is through the OS page cache, which
will store hot portions of SSTables in memory. Additionally, Cassandra can
now do off-heap caching.



  So can someone help me understand the discrepancy between 16-32GB of RAM
 and 8GB of heap?  Thanks.

 == snippet ==
 Memory
 The more memory a Cassandra node has, the better the read performance.
 More RAM allows for larger cache sizes and reduces disk I/O for reads.
 More RAM also allows memory tables (memtables) to hold more recently
 written data. Larger memtables lead to fewer SSTables being flushed to
 disk and fewer files to scan during a read. The ideal amount of RAM
 depends on the anticipated size of your hot data.

 • For dedicated hardware, a minimum of 8GB of RAM is needed. DataStax
 recommends 16GB - 32GB.

 • Java heap space should be set to a maximum of 8GB or half of your total
 RAM, whichever is lower. (A greater
 heap size has more intense garbage collection periods.)

 • For a virtual environment use a minimum of 4GB, such as Amazon EC2 Large
 instances. For production clusters
 with a healthy amount of traffic, 8GB is more common.
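The heap-sizing bullet above reduces to a one-line rule. A minimal sketch for illustration only — HeapSize is a made-up name, not Cassandra code (Cassandra's actual default is computed by conf/cassandra-env.sh):

```java
// Illustrative sketch of the quoted heap rule: max heap is the smaller
// of 8 GB and half of total RAM. Not part of Cassandra itself.
public class HeapSize {
    static long maxHeapBytes(long totalRamBytes) {
        long eightGb = 8L << 30; // 8 GiB in bytes
        return Math.min(eightGb, totalRamBytes / 2);
    }

    public static void main(String[] args) {
        // A 32 GiB box gets the 8 GiB cap; a 4 GiB box gets half its RAM.
        System.out.println(maxHeapBytes(32L << 30) >> 30); // 8
        System.out.println(maxHeapBytes(4L << 30) >> 30);  // 2
    }
}
```

The rest of the RAM is not wasted: as noted above, it goes to the OS page cache and off-heap caches.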




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: Snapshot failing on JSON files in 1.1.0

2012-05-15 Thread Brandon Williams
Probably https://issues.apache.org/jira/browse/CASSANDRA-4230

On Tue, May 15, 2012 at 4:08 PM, Bryan Fernandez bfernande...@gmail.com wrote:
 Greetings,

 We recently upgraded from 1.0.8 to 1.1.0. Everything has been running fine
 with the exception of snapshots. When attempting to snapshot any of the
 nodes in our six node cluster we are seeing the following error.

 [root@cassandra-n6 blotter]# /opt/apache-cassandra-1.1.0/bin/nodetool -h
 10.20.50.58 snapshot
 Requested snapshot for: all keyspaces
 Exception in thread "main" java.io.IOError: java.io.IOException: Unable to
 create hard link from
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json to
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
 (errno 17)
 at
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1454)
 at
 org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1483)
 at org.apache.cassandra.db.Table.snapshot(Table.java:205)
 at
 org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1793)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
 at
 com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
 at
 com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
 at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
 at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
 at
 com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
 at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427)
 at
 javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
 at
 javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
 at
 javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
 at
 javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303)
 at sun.rmi.transport.Transport$1.run(Transport.java:159)
 at java.security.AccessController.doPrivileged(Native Method)
 at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
 at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
 at
 sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.IOException: Unable to create hard link from
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json to
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
 (errno 17)
 at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:163)
 at
 org.apache.cassandra.db.Directories.snapshotLeveledManifest(Directories.java:343)
 at
 org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1450)
 ... 33 more


 However, an ls shows that both of these JSON files exist on the filesystem
 (although with slightly different sizes).

 [root@cassandra-n6 blotter]# ls -al
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json
 -rw-r--r-- 1 root root 38786 May 15 20:51
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json

 [root@cassandra-n6 blotter]# ls -al
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json
 -rw-r--r-- 1 root root 38778 May 15 20:50
 /var/lib/cassandra/data/blotter/twitter_users/snapshots/1337115022389/twitter_users.json


 We are using Leveled Compaction on the twitter_users CF, which I assume is
 creating the JSON files.

 [root@cassandra-n6 blotter]# ls -al
 /var/lib/cassandra/data/blotter/twitter_users/*.json
 -rw-r--r-- 1 root root 38779 May 15 20:51
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users.json
 -rw-r--r-- 1 root root 38779 May 15 20:51
 /var/lib/cassandra/data/blotter/twitter_users/twitter_users-old.json
 

Re: need some clarification on recommended memory size

2012-05-15 Thread Yiming Sun
Thanks Tyler... so my understanding is, even if Cassandra doesn't do
off-heap caching, having large enough memory minimizes the chance of the
Java heap being swapped to disk.  Is that correct?

-- Y.

On Tue, May 15, 2012 at 6:26 PM, Tyler Hobbs ty...@datastax.com wrote:

 On Tue, May 15, 2012 at 3:19 PM, Yiming Sun yiming@gmail.com wrote:

 Hello,

 I was reading the Apache Cassandra 1.0 Documentation PDF dated May 10,
 2012, and had some questions on what the recommended memory size is.

 Below is the snippet from the PDF.  Bullet 1 suggests to have 16-32GB of
 RAM, yet Bullet 2 suggests to limit Java heap size to no more than 8GB.  My
 understanding is that Cassandra is implemented purely in Java, so all
 memory it sees and uses is the JVM Heap.


 The main way that additional RAM helps is through the OS page cache, which
 will store hot portions of SSTables in memory. Additionally, Cassandra can
 now do off-heap caching.



   So can someone help me understand the discrepancy between 16-32GB of
 RAM and 8GB of heap?  Thanks.

 == snippet ==
 Memory
  The more memory a Cassandra node has, the better the read performance.
  More RAM allows for larger cache sizes and reduces disk I/O for reads.
  More RAM also allows memory tables (memtables) to hold more recently
  written data. Larger memtables lead to fewer SSTables being flushed to
  disk and fewer files to scan during a read. The ideal amount of RAM
  depends on the anticipated size of your hot data.

  • For dedicated hardware, a minimum of 8GB of RAM is needed.
  DataStax recommends 16GB - 32GB.

 • Java heap space should be set to a maximum of 8GB or half of your total
 RAM, whichever is lower. (A greater
 heap size has more intense garbage collection periods.)

 • For a virtual environment use a minimum of 4GB, such as Amazon EC2
 Large instances. For production clusters
 with a healthy amount of traffic, 8GB is more common.




 --
 Tyler Hobbs
 DataStax http://datastax.com/




Re: no snappyjava in java.library.path (JDK 1.7 issue?)

2012-05-15 Thread Roshni Rajagopal
Hi Stephen,

Cassandra's wiki says "Cassandra requires the most stable version of Java 1.6
you can deploy."

http://wiki.apache.org/cassandra/GettingStarted

Regards,

Roshni

From: Stephen McKamey step...@mckamey.commailto:step...@mckamey.com
Reply-To: user@cassandra.apache.orgmailto:user@cassandra.apache.org 
user@cassandra.apache.orgmailto:user@cassandra.apache.org
Date: Tue, 15 May 2012 13:40:43 -0700
To: Cassandra user@cassandra.apache.orgmailto:user@cassandra.apache.org
Subject: Re: no snappyjava in java.library.path (JDK 1.7 issue?)

Reverting to JDK 1.6 appears to fix the issue. Is JDK 1.7 not yet supported by 
Cassandra?

java version "1.6.0_31"
Java(TM) SE Runtime Environment (build 1.6.0_31-b04-415-11M3635)
Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-415, mixed mode)

On Tue, May 15, 2012 at 12:55 PM, Stephen McKamey 
step...@mckamey.commailto:step...@mckamey.com wrote:
Worth noting is I'm on Mac OS X 10.7.4 and I recently upgraded to the latest 
JDK (really hoping this isn't the issue):

java version "1.7.0_04"
Java(TM) SE Runtime Environment (build 1.7.0_04-b21)
Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)



Re: cassandra upgrade to 1.1 - migration problem

2012-05-15 Thread Dave Brosius
The replication factor for a keyspace is stored in the 
system.schema_keyspaces column family.


Since you can't view this with the cli as the server won't start, the only
way to look at it that I know of is to use the sstable2json tool on the
*.db file for that column family...

So for instance on my machine I do

./sstable2json 
/var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ia-1-Data.db


and get


{
"7374726573735f6b73": [["durable_writes","true",1968197311980145],
["name","stress_ks",1968197311980145],
["strategy_class","org.apache.cassandra.locator.SimpleStrategy",1968197311980145],
["strategy_options","{\"replication_factor\":\"3\"}",1968197311980145]]
}


It's likely you don't have an entry for replication_factor.

Theoretically I suppose you could embellish the output and use
json2sstable to fix it, but I have no experience here, and would get the
blessings of the DataStax fellas before proceeding.
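As a rough sketch of the kind of edit being suggested (the real workflow is sstable2json, edit the JSON, json2sstable): the strategy_options column holds a small JSON string that needs a replication_factor entry. The class and method names below are illustrative, not part of Cassandra's tooling, and the string handling is deliberately naive:

```java
// Illustrative sketch only: patch a SimpleStrategy strategy_options JSON
// string so it carries a replication_factor entry, mirroring the manual
// edit of sstable2json output described above. Naive string handling,
// not a real JSON parser.
public class StrategyOptions {
    static String ensureReplicationFactor(String optionsJson, int rf) {
        if (optionsJson.contains("replication_factor")) {
            return optionsJson; // already present, leave untouched
        }
        // Strip the surrounding braces and append the missing entry.
        String body = optionsJson.substring(1, optionsJson.length() - 1).trim();
        String entry = "\"replication_factor\":\"" + rf + "\"";
        return body.isEmpty() ? "{" + entry + "}" : "{" + body + "," + entry + "}";
    }

    public static void main(String[] args) {
        System.out.println(ensureReplicationFactor("{}", 3));
        // {"replication_factor":"3"}
    }
}
```

Again, only attempt the json2sstable round-trip with backups in place and the blessing mentioned above.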






On 05/15/2012 07:02 PM, Casey Deccio wrote:
Sorry to reply to my own message (again).  I took a closer look at the
logs and realized that the partitioner errors aren't what caused the
daemon to fail; those errors were in the logs even before I upgraded.
This one seems to be the culprit:


java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:160)
Caused by: java.lang.RuntimeException: 
org.apache.cassandra.config.ConfigurationException: SimpleStrategy 
requires a replication_factor strategy option.

at org.apache.cassandra.db.Table.init(Table.java:275)
at org.apache.cassandra.db.Table.open(Table.java:114)
at org.apache.cassandra.db.Table.open(Table.java:97)
at 
org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:204)
at 
org.apache.cassandra.service.AbstractCassandraDaemon.init(AbstractCassandraDaemon.java:254)

... 5 more
Caused by: org.apache.cassandra.config.ConfigurationException: 
SimpleStrategy requires a replication_factor strategy option.
at 
org.apache.cassandra.locator.SimpleStrategy.validateOptions(SimpleStrategy.java:71)
at 
org.apache.cassandra.locator.AbstractReplicationStrategy.createReplicationStrategy(AbstractReplicationStrategy.java:218)
at 
org.apache.cassandra.db.Table.createReplicationStrategy(Table.java:295)

at org.apache.cassandra.db.Table.init(Table.java:271)
... 9 more
Cannot load daemon

I'm not sure how to check the replication_factor and/or update it 
without using cassandra-cli, which requires the daemon to be running.


Casey




how do remove default_validation_class using cassandra-cli?

2012-05-15 Thread Yuhan Zhang
Hi all,

Is there a way to remove default_validation_class after assigning it to a
column family?

I'd like to have a column family storing both strings and longs. It looks
like it throws an error at me because the String didn't validate.

Thank you.

Yuhan


Using EC2 ephemeral 4disk raid0 cause high iowait trouble

2012-05-15 Thread koji Lin
Hi

Our service already runs cassandra 1.0 on 1x ec2 instances (with ebs), and we
saw lots of discussion about using an ephemeral raid for better and more
consistent performance.

So we want to create a new instance using a raid0 of 4 ephemeral disks, copy
the data from ebs, and finally replace the old instance and reduce some .

we create the xlarge instance with -b '/dev/sdb=ephemeral0' -b
'/dev/sdc=ephemeral1' -b '/dev/sdd=ephemeral2' -b '/dev/sde=ephemeral3',

and use mdadm command like this  mdadm --create /dev/md0 --level=0 -c256
--raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde

After copying the files, we started cassandra (same token as the old
instance it replaced).

We saw that reads were really fast, always keeping 2xxM/sec, but system load
exceeded 40, with high iowait, and lots of clients got timeout results. We
guessed maybe it was a problem with the ec2 instance, so we created another
one with the same settings to replace another machine, but the result was
the same. Then we rolled back to ebs with a single disk; read speed stays at
1xMB/sec but the system behaves well. (Using ebs with 2 disks in raid0 keeps
2xMB/sec with higher iowait than a single disk, but it still works.)

Has anyone met the same problem too? Or did we forget to configure
something?

thank you

koji


Re: how do remove default_validation_class using cassandra-cli?

2012-05-15 Thread Yuhan Zhang
to answer my own question: set default_validation_class = BytesType;

On Tue, May 15, 2012 at 7:09 PM, Yuhan Zhang yzh...@onescreen.com wrote:

 Hi all,

 Is there a way to remove default_validation_class after assigning it to a
 column family?

 I'd like to have a column family storing both strings and longs. It looks
 like it throws an error at me because the String didn't validate.

 Thank you.

 Yuhan




-- 
Yuhan Zhang
Application Developer
OneScreen Inc.
yzh...@onescreen.com eho...@onescreen.com
www.onescreen.com



Re: How can I implement 'LIKE operation in SQL' on values while querying a column family in Cassandra

2012-05-15 Thread Abhijit Chanda
Sorry for the confusion, Tamar. Anyway, thanks.

Regards,
Abhijit

On Tue, May 15, 2012 at 9:36 PM, Tamar Fraenkel ta...@tok-media.com wrote:

 Actually woman ;-)

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 3:45 PM, Abhijit Chanda abhijit.chan...@gmail.com
  wrote:

 Thanks so much Guys, specially Tamar, thank you so much man.

 Regards,
 Abhijit


 On Tue, May 15, 2012 at 4:26 PM, Tamar Fraenkel ta...@tok-media.comwrote:

 Do you still need the sample code? I use Hector, well here is an example:
 *This is the Column Family definition:*
 (I have a composite, but if you like you can have only the UTF8Type).

 CREATE COLUMN FAMILY title_indx
 with comparator = 'CompositeType(UTF8Type,UUIDType)'
 and default_validation_class = 'UTF8Type'
 and key_validation_class = 'LongType';

 *The Query:*
 SliceQuery<Long, Composite, String> query =
 HFactory.createSliceQuery(CassandraHectorConn.getKeyspace(),
 LongSerializer.get(),
 CompositeSerializer.get(),
 StringSerializer.get());
 query.setColumnFamily("title_indx");
 query.setKey(...)

 Composite start = new Composite();
 start.add(prefix);
 int lastCharIndex = prefix.length() - 1;
 char c = prefix.charAt(lastCharIndex);
 String prefixEnd = prefix.substring(0, lastCharIndex) + ++c;
 Composite end = new Composite();
 end.add(prefixEnd);

 ColumnSliceIterator<Long, Composite, String> iterator =
   new ColumnSliceIterator<Long, Composite, String>(
query, start, end, false);
 while (iterator.hasNext()) {
 ...
 }
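The prefix-to-upper-bound trick used in the query above (replace the last character of the prefix with its successor, so "aaa" becomes "aab") can be sketched on its own. PrefixBound is an illustrative name, not a Hector or Cassandra class:

```java
// Illustrative sketch: compute the exclusive upper bound for a prefix
// slice. Every column name starting with the prefix sorts before this
// bound under a UTF8/byte-ordered comparator.
public class PrefixBound {
    static String prefixEnd(String prefix) {
        int last = prefix.length() - 1;
        // Bump the final character to the next code point.
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    public static void main(String[] args) {
        System.out.println(prefixEnd("aaa")); // aab
    }
}
```

Note this simple bump is a sketch only; it does not handle a final character at the maximum code point.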

 Cheers,

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 1:19 PM, samal samalgo...@gmail.com wrote:

 You cannot extract via a partial column value.
 It can only be extracted via value if there is a secondary index, but the
 exact column value needs to match.

 As Tamar suggested, you can put the value as the column name, with a UTF8
 comparator.

 {
 'name_abhijit'='abhijit'
 'name_abhishek'='abhiskek'
 'name_atul'='atul'
 }

 here you can do slice query on column name and get desired result.

 /samal

 On Tue, May 15, 2012 at 3:29 PM, selam selam...@gmail.com wrote:

 Mapreduce jobs may solve your problem  for batch processing


 On Tue, May 15, 2012 at 12:49 PM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 Tamar,

 Can you please illustrate little bit with some sample code. It highly
 appreciable.

 Thanks,


 On Tue, May 15, 2012 at 10:48 AM, Tamar Fraenkel ta...@tok-media.com
  wrote:

 I don't think this is possible, the best you can do is prefix, if
 your order is alphabetical. For example I have a CF with
 comparator UTF8Type, and then I can do slice query and bring all columns
 that start with the prefix, and end with the prefix where you replace 
 the
 last char with the next one in order (i.e. aaa-aab).

 Hope that helps.

 *Tamar Fraenkel *
 Senior Software Engineer, TOK Media


 ta...@tok-media.com
 Tel:   +972 2 6409736
 Mob:  +972 54 8356490
 Fax:   +972 2 5612956





 On Tue, May 15, 2012 at 7:56 AM, Abhijit Chanda 
 abhijit.chan...@gmail.com wrote:

 I don't know the exact value on a column, but I want to do a
 partial matching to know all available values that matches.
 I want to do similar kind of operation that LIKE operator in SQL
 do.
 Any help is highly appreciated.

 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395




 --
 Saygılar  İyi Çalışmalar
 Timu EREN ( a.k.a selam )






 --
 Abhijit Chanda
 Software Developer
 VeHere Interactive Pvt. Ltd.
 +91-974395





-- 
Abhijit Chanda
Software Developer
VeHere Interactive Pvt. Ltd.
+91-974395

Re: cassandra upgrade to 1.1 - migration problem

2012-05-15 Thread Casey Deccio
On Tue, May 15, 2012 at 5:41 PM, Dave Brosius dbros...@mebigfatguy.comwrote:

 The replication factor for a keyspace is stored in the
 system.schema_keyspaces column family.

 Since you can't view this with cli as the server won't start, the only way
 to look at it, that i know of is to use the

 sstable2json tool on the *.db file for that column family...

 So for instance on my machine i do

 ./sstable2json /var/lib/cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ia-1-Data.db

 and get


 {
 "7374726573735f6b73": [["durable_writes","true",1968197311980145],
 ["name","stress_ks",1968197311980145],
 ["strategy_class","org.apache.cassandra.locator.SimpleStrategy",1968197311980145],
 ["strategy_options","{\"replication_factor\":\"3\"}",1968197311980145]]
 }

 It's likely you don't have an entry for replication_factor.


Yep, I checked the system.schema_keyspaces ColumnFamily, and there was no
replication_factor value, as you suspected.  But the dev cluster that
worked after upgrade did have that value, so it started up okay.
Apparently pre-1.1 was less picky about its presence.


 Theoretically I suppose you could embellish the output and use
 json2sstable to fix it, but I have no experience here, and would get the
 blessings of the DataStax fellas before proceeding.


Actually, I went ahead and took a chance because I had already been
completely offline for several hours and wanted to get things back up. I
did what you suggested and added the replication_factor value to the json
returned from sstable2json, then imported it using json2sstable.
Fortunately I had the dev cluster values to use as a basis. I started
things up, and it worked like a champ.  Thanks!

Casey