Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Vivek Mishra
Error: Main method not found in class
org.apache.cassandra.service.CassandraDaemon, please define the main method
as:
   public static void main(String[] args)


Hi,
I am getting this error. Earlier it was working fine for me when I simply
downloaded the tarball and ran the Cassandra server. Recently I did an rpm
package installation of Cassandra, which is working fine. But somehow, when
I try to run it via the originally extracted tar package, I am
getting:

*
xss =  -ea
-javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar
-XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M
-Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
Error: Main method not found in class
org.apache.cassandra.service.CassandraDaemon, please define the main method
as:
   public static void main(String[] args)
*

I tried setting the CASSANDRA_HOME directory, but no luck.

The error is a bit confusing. Any suggestions?

-Vivek


Re: Huge query Cassandra limits

2013-07-17 Thread cesare cugnasco
Hi Rob,
of course, we could issue multiple requests, but then we should consider
the optimal way to split the query into smaller ones. Moreover, we should
choose how many of the sub-queries to run in parallel.

 In our tests, we found there is a significant performance difference
between the various configurations, and we are studying a policy to
optimize it. Our doubt is that, if the need to issue multiple requests is
caused only by a fixable implementation detail, it would make this study
pointless.
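For concreteness, the split-and-parallelize approach can be sketched as below. This is only a sketch: `fetch` is a hypothetical stand-in for whatever multiget call the client library exposes, and the batch size and worker count are exactly the knobs whose optimum is being studied.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(keys, batch_size):
    """Split a large key list into smaller batches."""
    for i in range(0, len(keys), batch_size):
        yield keys[i:i + batch_size]

def parallel_multiget(fetch, keys, batch_size=64, workers=4):
    """Issue one sub-query per batch and run up to `workers` of them in
    parallel. `fetch` stands in for the client's multiget call and is
    expected to return a dict of row results for one batch of keys."""
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(fetch, chunked(keys, batch_size)):
            results.update(partial)
    return results
```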

Has anyone done a similar analysis?


2013/7/16 Robert Coli rc...@eventbrite.com


 On Tue, Jul 16, 2013 at 4:46 AM, cesare cugnasco 
 cesare.cugna...@gmail.com wrote:

 We  are working on porting some life science applications to Cassandra,
 but we have to deal with its limits managing huge queries. Our queries are
 usually multiget_slice ones: many rows with many columns each.


 You are not getting much win by increasing request size in Cassandra,
 and you expose yourself to lose such as you have experienced.

 Is there some reason you cannot just issue multiple requests?

 =Rob



Re: manually removing sstable

2013-07-17 Thread Michał Michalski

Hi Aaron,

 * Tombstones will only be purged if all fragments of a row are in the 
SStable(s) being compacted.


According to my knowledge it's not necessarily true. In a specific case 
this patch comes into play:


https://issues.apache.org/jira/browse/CASSANDRA-4671

We can, however, purge tombstones if we know that the non-compacted 
sstables don't have any info that is older than the tombstones we're 
about to purge (since then we know that the tombstones we'll consider 
can't delete data in the non-compacted sstables).


M.

On 12.07.2013 10:25, aaron morton wrote:

That sounds sane to me. Couple of caveats:

* Remember that Expiring Columns turn into Tombstones and can only be purged 
after TTL and gc_grace.
* Tombstones will only be purged if all fragments of a row are in the 
SStable(s) being compacted.

Cheers

-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 11/07/2013, at 10:17 PM, Theo Hultberg t...@iconara.net wrote:


a colleague of mine came up with an alternative solution that also seems to 
work, and I'd just like your opinion on if it's sound.

we run find to list all the old sstables, and then use cmdline-jmxclient to run the 
forceUserDefinedCompaction function on each of them. this is roughly what we do 
(but with find and xargs to orchestrate it):

   java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 
org.apache.cassandra.db:type=CompactionManager 
forceUserDefinedCompaction=the_keyspace,db_file_name

the downside is that c* needs to read the file and do disk io, but the upside 
is that it doesn't require a restart. c* does a little more work, but we can 
schedule that during off-peak hours. another upside is that it feels like we're 
pretty safe from screwups, we won't accidentally remove an sstable with live 
data, the worst case is that we ask c* to compact an sstable with live data and 
end up with an identical sstable.

if anyone else wants to do the same thing, this is the full cron command:

0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type f -name 
'*-Data.db' -mtime +8 -printf 
'forceUserDefinedCompaction=the_keyspace_name,\%P\n' | xargs -t 
--no-run-if-empty java -jar /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - 
localhost:7199 org.apache.cassandra.db:type=CompactionManager

just change the keyspace name and the path to the data directory.
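the same file selection can be sketched in python, which may be easier to adapt. the keyspace name, directory layout, and eight-day cutoff mirror the cron line above and are assumptions, not requirements:

```python
import os
import time

def compaction_args(data_dir, keyspace, max_age_days=8):
    """Mirror the find/printf pipeline above: list *-Data.db files older
    than max_age_days and format each as a forceUserDefinedCompaction
    argument for cmdline-jmxclient."""
    cutoff = time.time() - max_age_days * 86400
    args = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if (name.endswith("-Data.db") and os.path.isfile(path)
                and os.path.getmtime(path) < cutoff):
            args.append("forceUserDefinedCompaction=%s,%s" % (keyspace, name))
    return args
```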

T#


On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg t...@iconara.net wrote:
thanks a lot. I can confirm that it solved our problem too.

looks like the C* 2.0 feature is perfect for us.

T#


On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson krum...@gmail.com wrote:
yep that works, you need to remove all components of the sstable though, not 
just -Data.db

and, in 2.0 there is this:
https://issues.apache.org/jira/browse/CASSANDRA-5228

/Marcus


On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg t...@iconara.net wrote:
Hi,

I think I remember reading that if you have sstables that you know contain only 
data whose ttl has expired, it's safe to remove them manually by stopping 
c*, removing the *-Data.db files and then starting up c* again. is this correct?

we have a cluster where everything is written with a ttl, and sometimes c* 
needs to compact over 100 gb of sstables where we know everything has expired, and 
we'd rather just manually get rid of those.

T#

Re: IllegalArgumentException on query with AbstractCompositeType

2013-07-17 Thread aaron morton
 This would seem to conflict with the advice to only use secondary indexes on 
 fields with low cardinality, not high cardinality. I guess low cardinality is 
 good, as long as it isn't /too/ low? 
My concern is seeing people in the wild create secondary indexes with low 
cardinality that generate huge rows. 

Also consider how selective the indexes are; for background see Create 
Highly-Selective Indexes 
http://msdn.microsoft.com/en-nz/library/ms172984(v=sql.100).aspx

If you index 100 rows with a low cardinality, say there are only 10 unique 
values, then you have 10 index rows with 10 entries each. Using "Selectivity is 
the ratio of qualifying rows to total rows" from the article, that is a 1:10 
ratio. If you have 50 unique values, you have 50 index rows with 2 entries each, 
so the ratio is 1:50. The second is more selective and more useful.
 
Indexing 20 million rows that all have foo == bar is not very useful. 
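Assuming uniformly distributed values, the arithmetic above reduces to:

```python
def index_selectivity(total_rows, unique_values):
    """Selectivity = qualifying rows : total rows. With uniformly
    distributed values, each index row holds total_rows / unique_values
    entries, so a query for one value qualifies that many rows."""
    qualifying_rows = total_rows / unique_values
    return qualifying_rows / total_rows  # simplifies to 1 / unique_values

# 100 rows, 10 unique values -> 0.1  (a 1:10 ratio)
# 100 rows, 50 unique values -> 0.02 (a 1:50 ratio, more selective)
```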

Cheers

-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 15/07/2013, at 10:52 AM, Tristan Seligmann mithra...@mithrandi.net wrote:

 On Mon, Jul 15, 2013 at 12:26 AM, aaron morton aa...@thelastpickle.com 
 wrote:
 Aaron Morton can confirm but I think one problem could be that to create an 
 index on a field with small number of possible values is not good.
 Yes.
 In cassandra each value in the index becomes a single row in the internal 
 secondary index CF. You will end up with a huge row for all the values with 
 false. 
 
 And in general, if you want a queue you should use a queue. 
 
 This would seem to conflict with the advice to only use secondary indexes on 
 fields with low cardinality, not high cardinality. I guess low cardinality is 
 good, as long as it isn't /too/ low? 
 -- 
 mithrandi, i Ainil en-Balandor, a faer Ambar



Re: Pig load data with cassandrastorage and slice filter param

2013-07-17 Thread aaron morton
Not sure I understand the question. What was the command that failed?

Cheers


-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 16/07/2013, at 11:03 PM, Miguel Angel Martin junquera 
mianmarjun.mailingl...@gmail.com wrote:

 hi all
 
 I'm trying to load data from cassandra with the slice params option, but there 
 is not much info about how to use it. I found only a quick reference in the 
 readme.txt in the cassandra project: .../examples/pig
 
 ...
 Slices on columns can also be specified:
 grunt> rows = LOAD 
 'cassandra://MyKeyspace/MyColumnFamily?slice_start=C2&slice_end=C4&limit=1&reversed=true'
  USING CassandraStorage();
 Binary values for slice_start and slice_end can be escaped such as '\u0255'
 ...
 
 
 I want to filter the initial load data by day or a range of dates, and I only 
 found this info about cassandra and pig:
 
   • http://rubyscale.com/blog/2011/03/06/basic-time-series-with-cassandra/
   • http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
 
 
 I'm going to try a test with dummy data with a Composite column family 
 like anuniqueIDGenerate:timestamp, for example, or 
 anuniqueIDGenerate:stringdate, where the date is a string with format 
 YYYY-MM-dd, for example.
 
 Another option is to use a Supercolumn family per day, for example, and try to 
 use slices with this feature.
 
 
 Or another option is to create a custom Cassandra load function, but perhaps 
 it's more complex than I need.
 
 I would appreciate any help or an example of how I should define the cassandra 
 data, and a Pig load example with slices.
 
 Thanks in advance and kind regards
 
 
 



Re: AbstractCassandraDaemon.java (line 134) Exception in thread

2013-07-17 Thread aaron morton
Double check that the stack size is set to 100K; see 
https://github.com/apache/cassandra/blob/cassandra-1.1/conf/cassandra-env.sh#L187

Cheers

-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 6:26 AM, Julio Quierati julio.quier...@gmail.com wrote:

 Hello,
 
 At least 2 times a day I'm getting hundreds of log entries related to the 
 exception below; it seems like a network bottleneck. Does anyone know how to 
 solve this, or has anyone encountered this problem?
 
 Using
   C* version 1.1.4
and
 Java version 1.7.0_21
 Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
 Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
 AWS m1.xlarge
 
 ERROR [Thrift:967696] 2013-07-16 12:36:12,514 AbstractCassandraDaemon.java 
 (line 134) Exception in thread Thread[Thrift:967696,5,main]
 java.lang.StackOverflowError
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:150)
   at java.net.SocketInputStream.read(SocketInputStream.java:121)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
   at 
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
   at 
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
   at 
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
 ERROR [Thrift:966717] 2013-07-16 12:36:17,152 AbstractCassandraDaemon.java 
 (line 134) Exception in thread Thread[Thrift:966717,5,main]
 java.lang.StackOverflowError
   at java.net.SocketInputStream.socketRead0(Native Method)
   at java.net.SocketInputStream.read(SocketInputStream.java:150)
   at java.net.SocketInputStream.read(SocketInputStream.java:121)
   at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
   at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
   at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
   at 
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
   at 
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
   at 
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
   at 
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:722)
  INFO [OptionalTasks:1] 2013-07-16 12:38:07,847 MeteredFlusher.java (line 62) 
 flushing high-traffic column family CFS(Keyspace='zm', 
 ColumnFamily='deviceConnectionLog') (estimated 200441950 bytes)
  INFO [OptionalTasks:1] 2013-07-16 12:38:07,848 ColumnFamilyStore.java (line 
 659) Enqueuing flush of 
 Memtable-deviceConnectionLog@64568169(20293642/200441950 serialized/live 
 bytes, 114490 ops)
  INFO [FlushWriter:4306] 2013-07-16 12:38:07,848 Memtable.java (line 264) 
 Writing Memtable-deviceConnectionLog@64568169(20293642/200441950 
 serialized/live bytes, 114490 ops)
  INFO [FlushWriter:4306] 2013-07-16 12:38:09,154 Memtable.java (line 305) 
 Completed flushing 
 /mnt/cassandra/data/zm/deviceConnectionLog/zm-deviceConnectionLog-he-5083-Data.db
  (5747695 bytes) for commitlog position 
 ReplayPosition(segmentId=1282719983369784, 

Re: Deletion use more space.

2013-07-17 Thread aaron morton
You are seeing this: http://wiki.apache.org/cassandra/FAQ#range_ghosts

Lots of client APIs and CQL 3 hide this from you now. 

Cheers
 
-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 1:52 PM, 杨辉强 huiqiangy...@yunrang.com wrote:

 Thanks, but Michael's answer confuses me more. 
 
 I used list cf; in cassandra-cli. It seems lots of rows have been deleted, but 
 the keys still exist.
 
 After the deletion, why does the key still exist? It seems useless.
 
 RowKey: 3030303031306365633862356437636365303861303433343137656531306435
 ---
 RowKey: 3030303031316333616336366531613636373735396363323037396331613230
 ---
 RowKey: 3030303031316333616336366531613637303964616364363630663865313433
 ---
 RowKey: 303030303132393461363730323932356363313330323862633064626335
 ---
 RowKey: 3030303031323934613637303239323566303733303638373138366334323436
 ---
 RowKey: 3030303031333838333139303930633664643364613331316664363134656639
 ---
 RowKey: 303030303133626534363930363061393837666230363439316336333230
 ---
 RowKey: 303030303133636565373537646561633463393263363832653130363733
 ---
 RowKey: 303030303134363234326136396637646465623537323761633233353065
 
 
 - Original Message -
 From: Michael Theroux mthero...@yahoo.com
 To: user@cassandra.apache.org
 Sent: Tuesday, July 16, 2013, 10:23:32 PM
 Subject: Re: Deletion use more space.
 
 The only time information is removed from the filesystem is during 
 compaction.  Compaction can remove tombstones after gc_grace_seconds, which, 
 could result in reanimation of deleted data if the tombstone was never 
 properly replicated to other replicas.  Repair will make sure tombstones are 
 consistent amongst replicas.  However, tombstones can not be removed if the 
 data the tombstone is deleting is in another SSTable and has not yet been 
 removed. 
 
 Hope this helps,
 -Mike
 
 
 On Jul 16, 2013, at 10:04 AM, Andrew Bialecki wrote:
 
 I don't think setting gc_grace_seconds to an hour is going to do what you'd 
 expect. After gc_grace_seconds, if you haven't run a repair within that 
 hour, the data you deleted will seem to have been undeleted.
 
 Someone correct me if I'm wrong, but in order to completely delete 
 data and regain the space it takes up, you need to delete it, which 
 creates tombstones, and then run a repair on that column family within 
 gc_grace_seconds. After that the data is actually gone and the space 
 reclaimed.
 
 
 On Tue, Jul 16, 2013 at 6:20 AM, 杨辉强 huiqiangy...@yunrang.com wrote:
 Thank you!
 It should be update column family ScheduleInfoCF with gc_grace = 3600;
 Faint.
 
 - Original Message -
 From: 杨辉强 huiqiangy...@yunrang.com
 To: user@cassandra.apache.org
 Sent: Tuesday, July 16, 2013, 6:15:12 PM
 Subject: Re: Deletion use more space.
 
 Hi,
  I used the following command to update gc_grace_seconds. It reports an error! Why?
 
 [default@WebSearch] update column family ScheduleInfoCF with 
 gc_grace_seconds = 3600;
 java.lang.IllegalArgumentException: No enum const class 
 org.apache.cassandra.cli.CliClient$ColumnFamilyArgument.GC_GRACE_SECONDS
 
 
 - Original Message -
 From: Michał Michalski mich...@opera.com
 To: user@cassandra.apache.org
 Sent: Tuesday, July 16, 2013, 5:51:49 PM
 Subject: Re: Deletion use more space.
 
 Deletion does not really remove data; it adds tombstones (deletion
 markers). They'll later be merged with the existing data during
 compaction and, in the end (see: gc_grace_seconds), removed, but until
 then they take up some space.
 
 http://wiki.apache.org/cassandra/DistributedDeletes
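A toy model of this lifecycle (a delete writes a marker; only a compaction after gc_grace_seconds reclaims the space). The dict-based table and the function names are invented purely for illustration:

```python
GC_GRACE_SECONDS = 3600  # illustrative value, not a recommendation

def delete(table, key, now):
    """A delete writes a tombstone marker instead of freeing space,
    so disk usage grows slightly until compaction runs."""
    table[key] = ("tombstone", now)

def compact(table, now, gc_grace=GC_GRACE_SECONDS):
    """Compaction drops a tombstone only once gc_grace has elapsed."""
    return {k: v for k, v in table.items()
            if not (v[0] == "tombstone" and now - v[1] >= gc_grace)}
```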
 
 M.
 
 On 16.07.2013 11:46, 杨辉强 wrote:
 Hi, all:
   I use cassandra 1.2.4, I have a 4-node ring, and I use the byte order 
 partitioner.
   I had inserted about 200G of data into the ring over the previous days.
 
   Today I wrote a program to scan the ring and, at the same time, delete 
 the items that were scanned.
   To my surprise, cassandra used more disk space.
 
  Can anybody tell me why? Thanks.
 
 



Re: Intresting issue with getting Order By to work...

2013-07-17 Thread aaron morton
 The use of Order By requires Primary Key which appears to be only supported 
 by by using CQL and not Cassandra-cli. 
Order By in CQL is also supported on the thrift interface. 

When using thrift, the order you get the columns back in is the order the 
Comparator puts them in. If you want them reversed, the thrift API supports 
that. 

 I read that thrift clients will not work with CQL created tables due to extra 
 things created by the CQL. If so how can I create Primary Keys and be 
 supported by thrift based clients??
No.
Do not access CQL tables with the thrift API. 

 Seems like Cassandra-cli should support creation of compound primary keys or
It does. 
See help on the CompositeType

 Also CQL tables are not visible via cli.so I can not see details on what was 
 created by CQL and the cqlsh script has errors according to the latest Python 
 windows program I tried.
They are visible for read access. 

 I will post to Datastax the same question 
Please ask questions to one group at a time so people do not waste their time 
providing answers you already have. 

Cheers


-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 3:44 PM, Tony Anecito adanec...@yahoo.com wrote:

 Hi All,
 
 Well I got most everything working I wanted using Cassandra then discovered I 
 needed to use an Order By. I am using Cassandra 1.2.5.
 The use of Order By requires a Primary Key, which appears to be only supported 
 by using CQL and not Cassandra-cli. So I dropped my table created using the 
 CLI, used CQL, and was able to create a table. But when I went to insert 
 data that worked fine on the cli-created table, I now get an exception:
 Error while inserting 
 com.datastax.driver.core.exceptions.InvalidQueryException: Unknown identifier 
 type.
 
 I read that thrift clients will not work with CQL created tables due to extra 
 things created by the CQL. If so how can I create Primary Keys and be 
 supported by thrift based clients??
 
 I will post the same question to Datastax, but I am trying to understand how to 
 resolve cli vs CQL issues like this. It seems like Cassandra-cli should support 
 creation of compound primary keys, or CQL should create tables readable by 
 thrift-based clients. Is there some meta column info people should add?
 Also, CQL tables are not visible via the cli, so I cannot see details on what 
 was created by CQL, and the cqlsh script has errors according to the latest 
 Python windows program I tried.
 
 Thanks,
 -Tony
 
 



Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Vivek Mishra
Any suggestions?
I am kind of stuck with this; otherwise I need to delete the rpm installation
to get it working.

-Vivek


On Wed, Jul 17, 2013 at 12:17 PM, Vivek Mishra mishra.v...@gmail.comwrote:

 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 

 Hi,
 I am getting this error. Earlier it was working fine for me, when i simply
 downloaded the tarball installation and ran cassandra server. Recently i
 did rpm package installation of Cassandra and which is working fine. But
 somehow when i try to run it via originally extracted tar package. i am
 getting:

 *
 xss =  -ea
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M
 -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 *

 I tried setting CASSANDRA_HOME directory, but no luck.

 Error is bit confusing, Any suggestions???

 -Vivek



Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread aaron morton
Something is messed up in your install. Can you try scrubbing the install and 
restarting?

Cheers

-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 6:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Error: Main method not found in class 
 org.apache.cassandra.service.CassandraDaemon, please define the main method 
 as:
public static void main(String[] args)
 
 
 Hi,
 I am getting this error. Earlier it was working fine for me, when i simply 
 downloaded the tarball installation and ran cassandra server. Recently i did 
 rpm package installation of Cassandra and which is working fine. But somehow 
 when i try to run it via originally extracted tar package. i am getting:
 
 *
 xss =  -ea 
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar 
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M 
 -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class 
 org.apache.cassandra.service.CassandraDaemon, please define the main method 
 as:
public static void main(String[] args)
 *
 
 I tried setting CASSANDRA_HOME directory, but no luck.
 
 Error is bit confusing, Any suggestions???
 
 -Vivek



Re: Huge query Cassandra limits

2013-07-17 Thread aaron morton
  In ours tests,  we found there's a significant performance difference 
 between various  configurations and we are studying a policy to optimize it. 
 The doubt is that, if the needing of issuing multiple requests is caused only 
 by a fixable implementation detail, would make pointless do this study.
If you provide your numbers, we can see if you are getting the expected results. 

There are some limiting factors. Using the thrift API, the max message size is 
15 MB. And each row you ask for becomes (roughly) RF tasks in the thread pools 
on the replicas. When you ask for 1000 rows at RF 3, it creates (roughly) 3,000 
tasks on the replicas. If you have other clients trying to do reads at the same 
time, this can delay their reads. 

Like everything in computing, more is not always better. Run some tests to try 
multi gets with different sizes and see where improvements in the overall 
throughput begin to decline. 
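That sweep can be sketched as follows; `run_multiget` is a hypothetical stand-in for the client's multiget call, and the candidate sizes are whatever range you want to compare:

```python
import time

def sweep_batch_sizes(run_multiget, keys, candidate_sizes):
    """Time the same workload at several multiget sizes to find where
    overall throughput stops improving. `run_multiget` stands in for
    the client's multiget call on one batch of keys."""
    timings = {}
    for size in candidate_sizes:
        start = time.perf_counter()
        for i in range(0, len(keys), size):
            run_multiget(keys[i:i + size])
        timings[size] = time.perf_counter() - start
    return timings
```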

Also consider using a newer client with token aware balancing and async 
networking. Again though, if you try to read everything at once you are going 
to have a bad day.

Cheers
  
-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 8:24 PM, cesare cugnasco cesare.cugna...@gmail.com wrote:

 Hi Rob,
 of course, we could issue multiple requests, but then we should  consider 
 which is the optimal way to split the query in smaller ones. Moreover, we 
 should choose how many of sub-query run in parallel.
  In ours tests,  we found there's a significant performance difference 
 between various  configurations and we are studying a policy to optimize it. 
 The doubt is that, if the needing of issuing multiple requests is caused only 
 by a fixable implementation detail, would make pointless do this study.
 
 Does anyone made similar analysis?
 
 
 2013/7/16 Robert Coli rc...@eventbrite.com
 
 On Tue, Jul 16, 2013 at 4:46 AM, cesare cugnasco cesare.cugna...@gmail.com 
 wrote:
 We  are working on porting some life science applications to Cassandra, but 
 we have to deal with its limits managing huge queries. Our queries are 
 usually multiget_slice ones: many rows with many columns each.
 
 You are not getting much win by increasing request size in Cassandra, and 
 you expose yourself to lose such as you have experienced.
 
 Is there some reason you cannot just issue multiple requests?
 
 =Rob 
 



Re: manually removing sstable

2013-07-17 Thread aaron morton
 https://issues.apache.org/jira/browse/CASSANDRA-4671
Thanks for the tip. 

A

-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 17/07/2013, at 8:47 PM, Michał Michalski mich...@opera.com wrote:

 Hi Aaron,
 
  * Tombstones will only be purged if all fragments of a row are in the 
  SStable(s) being compacted.
 
 According to my knowledge it's not necessarily true. In a specific case this 
 patch comes into play:
 
 https://issues.apache.org/jira/browse/CASSANDRA-4671
 
 We could however purge tombstone if we know that the non-compacted sstables 
 doesn't have any info that is older than the tombstones we're about to purge 
 (since then we know that the tombstones we'll consider can't delete data in 
 non compacted sstables).
 
 M.
 
 On 12.07.2013 10:25, aaron morton wrote:
 That sounds sane to me. Couple of caveats:
 
 * Remember that Expiring Columns turn into Tombstones and can only be purged 
 after TTL and gc_grace.
 * Tombstones will only be purged if all fragments of a row are in the 
 SStable(s) being compacted.
 
 Cheers
 
 -
 Aaron Morton
 Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 11/07/2013, at 10:17 PM, Theo Hultberg t...@iconara.net wrote:
 
 a colleague of mine came up with an alternative solution that also seems to 
 work, and I'd just like your opinion on if it's sound.
 
 we run find to list all old sstables, and then use cmdline-jmxclient to run 
 the forceUserDefinedCompaction function on each of them, this is roughly 
 what we do (but with find and xargs to orchestrate it)
 
   java -jar cmdline-jmxclient-0.10.3.jar - localhost:7199 
 org.apache.cassandra.db:type=CompactionManager 
 forceUserDefinedCompaction=the_keyspace,db_file_name
 
 the downside is that c* needs to read the file and do disk io, but the 
 upside is that it doesn't require a restart. c* does a little more work, 
 but we can schedule that during off-peak hours. another upside is that it 
 feels like we're pretty safe from screwups, we won't accidentally remove an 
 sstable with live data, the worst case is that we ask c* to compact an 
 sstable with live data and end up with an identical sstable.
 
 if anyone else wants to do the same thing, this is the full cron command:
 
 0 4 * * * find /path/to/cassandra/data/the_keyspace_name -maxdepth 1 -type 
 f -name '*-Data.db' -mtime +8 -printf 
 forceUserDefinedCompaction=the_keyspace_name,\%P\n | xargs -t 
 --no-run-if-empty java -jar 
 /usr/local/share/java/cmdline-jmxclient-0.10.3.jar - localhost:7199 
 org.apache.cassandra.db:type=CompactionManager
 
 just change the keyspace name and the path to the data directory.
 
 T#
 
 
 On Thu, Jul 11, 2013 at 7:09 AM, Theo Hultberg t...@iconara.net wrote:
 thanks a lot. I can confirm that it solved our problem too.
 
 looks like the C* 2.0 feature is perfect for us.
 
 T#
 
 
 On Wed, Jul 10, 2013 at 7:28 PM, Marcus Eriksson krum...@gmail.com wrote:
 yep that works, you need to remove all components of the sstable though, 
 not just -Data.db
 
 and, in 2.0 there is this:
 https://issues.apache.org/jira/browse/CASSANDRA-5228
 
 /Marcus
 
 
 On Wed, Jul 10, 2013 at 2:09 PM, Theo Hultberg t...@iconara.net wrote:
 Hi,
 
 I think I remember reading that if you have sstables that you know contain 
 only data that whose ttl has expired, it's safe to remove them manually by 
 stopping c*, removing the *-Data.db files and then starting up c* again. is 
 this correct?
 
 we have a cluster where everything is written with a ttl, and sometimes c* 
 needs to compact over a 100 gb of sstables where we know ever has expired, 
 and we'd rather just manually get rid of those.
 
 T#
 
 
 
 
 
 



Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Vivek Mishra
@aaron
Thanks for your reply. I did have a look at the rpm-installed files:
1. /etc/alternatives/cassandra contains configuration files only,
and the .sh files are installed within the /usr/bin folder.

Even if I try to run from the extracted tarball folder as

/home/impadmin/apache-cassandra-1.2.4/bin/cassandra -f

it gives me the same error.

/home/impadmin/apache-cassandra-1.2.4/bin/cassandra -v

gives me 1.1.12, though it should give me 1.2.4.

-Vivek


On Wed, Jul 17, 2013 at 3:37 PM, aaron morton aa...@thelastpickle.comwrote:

 Something is messed up in your install.  Can you try scrubbing the install
 and restarting ?

 Cheers

 -
 Aaron Morton
 Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 17/07/2013, at 6:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 

 Hi,
 I am getting this error. Earlier it was working fine for me, when i simply
 downloaded the tarball installation and ran cassandra server. Recently i
 did rpm package installation of Cassandra and which is working fine. But
 somehow when i try to run it via originally extracted tar package. i am
 getting:

 *
 xss =  -ea
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M
 -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 *

 I tried setting CASSANDRA_HOME directory, but no luck.

 Error is bit confusing, Any suggestions???

 -Vivek





Re: Intresting issue with getting Order By to work...

2013-07-17 Thread Tony Anecito
Thanks for the answers.
 
The reason why I ask is it is stated the composite keys are not the same as 
Primary Key. I found no examples for thrift where it specifcally said the 
composite key is a primary key required by order by. All the examples where the 
words primary key were used were with CQL examples and I am seeing postings 
where people had issues with Order By but no answers like what you said.
 
If there were better documentation for Cassandra, with working examples and 
explanations about the differences between CQL and the CLI, I would not need to ask 
questions on the user groups. I have also spotted major issues and tried to 
help all users understand them.
 
-Tony

From: aaron morton aa...@thelastpickle.com
To: Cassandra User user@cassandra.apache.org 
Sent: Wednesday, July 17, 2013 4:06 AM
Subject: Re: Intresting issue with getting Order By to work...


 The use of Order By requires a Primary Key which appears to be only supported 
 by using CQL and not Cassandra-cli. 
Order By in CQL is also supported over the thrift interface. 

When using thrift the order you get the columns back is the order the 
Comparator puts them in. If you want them reversed the thrift API supports 
that. 

 I read that thrift clients will not work with CQL created tables due to extra 
 things created by the CQL. If so how can I create Primary Keys and be 
 supported by thrift based clients??
No.
Do not access CQL tables with the thrift API. 

 Seems like Cassandra-cli should support creation of compound primary keys or
It does. 
See help on the CompositeType
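For illustration, here is a hedged sketch of the CompositeType wire format thrift
clients use when building composite column names: each component is encoded as a
2-byte big-endian length, the component bytes, and a one-byte end-of-component
marker. This is a simplified encoder written in Python for clarity; real thrift
clients (e.g. Hector or pycassa) provide composite builders for you.

```python
import struct

def composite(*components):
    """Encode components in the CompositeType layout:
    2-byte big-endian length, component bytes, end-of-component byte (0)."""
    out = b""
    for c in components:
        data = c.encode() if isinstance(c, str) else c
        out += struct.pack(">H", len(data)) + data + b"\x00"
    return out

# A two-component composite column name, e.g. (year, user):
print(composite("2013", "tony").hex())
```

The end-of-component byte is 0 for ordinary names; slice queries use other
values there to express inclusive/exclusive bounds.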

 Also CQL tables are not visible via the cli, so I can not see details on what was 
 created by CQL, and the cqlsh script has errors according to the latest Python 
 windows program I tried.
They are visible for read access. 

 I will post to Datastax the same question 
Please ask questions to one group at a time so people do not waste their time 
providing answers you already have. 

Cheers


-
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com/

On 17/07/2013, at 3:44 PM, Tony Anecito adanec...@yahoo.com wrote:

 Hi All,
 
 Well I got most everything working I wanted using Cassandra then discovered I 
 needed to use an Order By. I am using Cassandra 1.2.5.
  The use of Order By requires a Primary Key, which appears to be only supported 
  by using CQL and not Cassandra-cli. So I dropped my table created using the 
  CLI and used CQL and was able to create a table. But when I went to insert 
  data that worked fine on the cli-created table I now get an exception:
 Error while inserting 
 com.datastax.driver.core.exceptions.InvalidQueryException: Unknown identifier 
 type.
 
 I read that thrift clients will not work with CQL created tables due to extra 
 things created by the CQL. If so how can I create Primary Keys and be 
 supported by thrift based clients??
 
 I will post to Datastax the same question but trying to understand how to 
 resolve cli vs CQL issue like this. Seems like Cassandra-cli should support 
 creation of compound primary keys or CQL should create tables readable by 
 thrift based clients. Is there some meta column info people should add?
  Also CQL tables are not visible via the cli, so I can not see details on what was 
  created by CQL, and the cqlsh script has errors according to the latest Python 
  windows program I tried.
 
 Thanks,
 -Tony
 
 

Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Vivek Mishra
Finally,
I had to delete all rpm-installed files to get this working; the folders are:
/usr/share/cassandra
/etc/alternatives/cassandra
/usr/bin/cassandra
/usr/bin/cassandra.in.sh
/usr/bin/cassandra-cli

Still don't understand why it's giving me such a weird error:

Error: Main method not found in class
org.apache.cassandra.service.CassandraDaemon, please define the main method
as:
   public static void main(String[] args)
***

This is not informative at all and does not even Help!

-Vivek


On Wed, Jul 17, 2013 at 3:49 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 @aaron
 Thanks for your reply. I did have a look at the rpm-installed files.
 1.  /etc/alternatives/cassandra contains configuration files only,
 and the .sh files are installed within the /usr/bin folder.

 Even if I try to run from the extracted tarball folder as

 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -f

 same error.

 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -v

 gives me 1.1.12 though it should give me 1.2.4


 -Vivek
 it gives me same error.


 On Wed, Jul 17, 2013 at 3:37 PM, aaron morton aa...@thelastpickle.comwrote:

 Something is messed up in your install.  Can you try scrubbing the
 install and restarting ?

 Cheers

-
 Aaron Morton
 Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 17/07/2013, at 6:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 

 Hi,
 I am getting this error. Earlier it was working fine for me, when i
 simply downloaded the tarball installation and ran cassandra server.
 Recently i did rpm package installation of Cassandra and which is working
 fine. But somehow when i try to run it via originally extracted tar
 package. i am getting:

 *
 xss =  -ea
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M
 -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 *

 I tried setting CASSANDRA_HOME directory, but no luck.

 Error is bit confusing, Any suggestions???

 -Vivek






Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Brian O'Neill
Vivek,

The location of CassandraDaemon changed between versions.  (from
org.apache.cassandra.thrift to org.apache.cassandra.service)

It is likely that the start scripts are picking up the old version on the
classpath, which results in the main method not being found.

Do you have CASSANDRA_HOME set?  I believe the start scripts will use that.
 Perhaps you have that set and pointed to the older 1.1.X version?

-brian
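
As a quick way to spot this, here is a hedged Python sketch (the paths and jar
versions below are hypothetical, not taken from Vivek's machine) that scans a
colon-separated classpath for apache-cassandra jars and reports their versions;
an old 1.1.x jar appearing before the 1.2.4 one would explain the error:

```python
import re

def find_cassandra_jars(classpath):
    """Return (path, version) for every apache-cassandra jar on a
    ':'-separated classpath, in classpath order (the JVM loads
    CassandraDaemon from whichever jar appears first)."""
    jars = []
    for entry in classpath.split(":"):
        m = re.search(r"apache-cassandra-(\d+(?:\.\d+)+)\.jar$", entry)
        if m:
            jars.append((entry, m.group(1)))
    return jars

# Hypothetical classpath mixing an rpm-installed 1.1.12 jar with the tarball's 1.2.4 jar:
cp = ("/usr/share/cassandra/apache-cassandra-1.1.12.jar:"
      "/home/impadmin/apache-cassandra-1.2.4/lib/apache-cassandra-1.2.4.jar")
for path, version in find_cassandra_jars(cp):
    print(path, version)
```

Feed it the CLASSPATH echoed from the start script; if more than one version
shows up, the first one listed wins.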


On Wed, Jul 17, 2013 at 8:31 AM, Vivek Mishra mishra.v...@gmail.com wrote:

 Finally,
 i have to delete all rpm installed files to get this working, folders are:
 /usr/share/cassandra
 /etc/alternatives/cassandra
 /usr/bin/cassandra
 /usr/bin/cassandra.in.sh
 /usr/bin/cassandra-cli

 Still don't understand why it's giving me such weird error:
 
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 ***

 This is not informative at all and does not even Help!

 -Vivek


 On Wed, Jul 17, 2013 at 3:49 PM, Vivek Mishra mishra.v...@gmail.comwrote:

 @aaron
 Thanks for your reply. I did have a look rpm installed files
 1.  /etc/alternatives/cassandra, it contains configuration files only.
 and .sh files are installed within /usr/bin folder.

 Even if i try to run from extracted tar ball folder as

 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -f

 same error.

 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -v

 gives me 1.1.12 though it should give me 1.2.4


 -Vivek
 it gives me same error.


 On Wed, Jul 17, 2013 at 3:37 PM, aaron morton aa...@thelastpickle.comwrote:

 Something is messed up in your install.  Can you try scrubbing the
 install and restarting ?

 Cheers

-
 Aaron Morton
 Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 17/07/2013, at 6:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 

 Hi,
 I am getting this error. Earlier it was working fine for me, when i
 simply downloaded the tarball installation and ran cassandra server.
 Recently i did rpm package installation of Cassandra and which is working
 fine. But somehow when i try to run it via originally extracted tar
 package. i am getting:

 *
 xss =  -ea
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M
 -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 *

 I tried setting CASSANDRA_HOME directory, but no luck.

 Error is bit confusing, Any suggestions???

 -Vivek







-- 
Brian ONeill
Chief Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Brian O'Neill
Vivek,

You could try echoing the CLASSPATH to double check.  Drop an echo into the
launch_service function in the cassandra shell script.  (~line 121)

Let us know the output.

-brian

---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com


This information transmitted in this email message is for the intended
recipient only and may contain confidential and/or privileged material. If
you received this email in error and are not the intended recipient, or the
person responsible to deliver it to the intended recipient, please contact
the sender at the email above and delete this email and any attachments and
destroy any copies thereof. Any review, retransmission, dissemination,
copying or other use of, or taking any action in reliance upon, this
information by persons or entities other than the intended recipient is
strictly prohibited.
 


From:  Vivek Mishra mishra.v...@gmail.com
Reply-To:  user@cassandra.apache.org
Date:  Wednesday, July 17, 2013 10:24 AM
To:  user@cassandra.apache.org
Subject:  Re: Main method not found in class
org.apache.cassandra.service.CassandraDaemon

Hi Brian,
Thanks for your response.
I think I did change CASSANDRA_HOME to point to the new directory.

-Vivek


On Wed, Jul 17, 2013 at 7:03 PM, Brian O'Neill b...@alumni.brown.edu
wrote:
 Vivek,
 
 The location of CassandraDaemon changed between versions.  (from
 org.apache.cassandra.thrift to org.apache.cassandra.service)
 
 It is likely that the start scripts are picking up the old version on the
 classpath, which results in the main method not being found.
 
 Do you have CASSANDRA_HOME set?  I believe the start scripts will use that.
 Perhaps you have that set and pointed to the older 1.1.X version?
 
 -brian
 
 
 On Wed, Jul 17, 2013 at 8:31 AM, Vivek Mishra mishra.v...@gmail.com wrote:
 Finally,
 i have to delete all rpm installed files to get this working, folders are:
 /usr/share/cassandra
 /etc/alternatives/cassandra
 /usr/bin/cassandra
 /usr/bin/cassandra.in.sh http://cassandra.in.sh
 /usr/bin/cassandra-cli
 
 Still don't understand why it's giving me such weird error:
 
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 ***
 
 This is not informative at all and does not even Help!
 
 -Vivek
 
 
 On Wed, Jul 17, 2013 at 3:49 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 @aaron
 Thanks for your reply. I did have a look rpm installed files
 1.  /etc/alternatives/cassandra, it contains configuration files only.
 and .sh files are installed within /usr/bin folder.
 
 Even if i try to run from extracted tar ball folder as
 
 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -f
 
 same error.  
 
 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -v
 
 gives me 1.1.12 though it should give me 1.2.4
 
 
 -Vivek
 it gives me same error.
 
 
 On Wed, Jul 17, 2013 at 3:37 PM, aaron morton aa...@thelastpickle.com
 wrote:
 Something is messed up in your install.  Can you try scrubbing the install
 and restarting ?
 
 Cheers
 
 -
 Aaron Morton
 Cassandra Consultant
 New Zealand
 
 @aaronmorton
 http://www.thelastpickle.com
 
 On 17/07/2013, at 6:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:
 
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main
 method as:
public static void main(String[] args)
 
 
 Hi,
 I am getting this error. Earlier it was working fine for me, when i simply
 downloaded the tarball installation and ran cassandra server. Recently i
 did rpm package installation of Cassandra and which is working fine. But
 somehow when i try to run it via originally extracted tar package. i am
 getting:
 
 *
 xss =  -ea 
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.
 jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M
 -Xmx1024M -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main
 method as:
public static void main(String[] args)
 *
 
 I tried setting CASSANDRA_HOME directory, but no luck.
 
 Error is bit confusing, Any suggestions???
 
 -Vivek
 
 
 
 
 
 
 -- 
 Brian ONeill
 Chief Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42





Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Vivek Mishra
Hi Brian,
Thanks for your response.
I think I did change CASSANDRA_HOME to point to the new directory.

-Vivek


On Wed, Jul 17, 2013 at 7:03 PM, Brian O'Neill b...@alumni.brown.eduwrote:

 Vivek,

 The location of CassandraDaemon changed between versions.  (from
 org.apache.cassandra.thrift to org.apache.cassandra.service)

 It is likely that the start scripts are picking up the old version on the
 classpath, which results in the main method not being found.

 Do you have CASSANDRA_HOME set?  I believe the start scripts will use
 that.  Perhaps you have that set and pointed to the older 1.1.X version?

 -brian


 On Wed, Jul 17, 2013 at 8:31 AM, Vivek Mishra mishra.v...@gmail.comwrote:

 Finally,
 i have to delete all rpm installed files to get this working, folders are:
 /usr/share/cassandra
 /etc/alternatives/cassandra
 /usr/bin/cassandra
 /usr/bin/cassandra.in.sh
 /usr/bin/cassandra-cli

 Still don't understand why it's giving me such weird error:
 
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 ***

 This is not informative at all and does not even Help!

 -Vivek


 On Wed, Jul 17, 2013 at 3:49 PM, Vivek Mishra mishra.v...@gmail.comwrote:

 @aaron
 Thanks for your reply. I did have a look rpm installed files
 1.  /etc/alternatives/cassandra, it contains configuration files only.
 and .sh files are installed within /usr/bin folder.

 Even if i try to run from extracted tar ball folder as

 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -f

 same error.

 /home/impadmin/apache-cassandra-1.2.4/bin/cassandra -v

 gives me 1.1.12 though it should give me 1.2.4


 -Vivek
 it gives me same error.


 On Wed, Jul 17, 2013 at 3:37 PM, aaron morton 
 aa...@thelastpickle.comwrote:

 Something is messed up in your install.  Can you try scrubbing the
 install and restarting ?

 Cheers

-
 Aaron Morton
 Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 17/07/2013, at 6:47 PM, Vivek Mishra mishra.v...@gmail.com wrote:

 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 

 Hi,
 I am getting this error. Earlier it was working fine for me, when i
 simply downloaded the tarball installation and ran cassandra server.
 Recently i did rpm package installation of Cassandra and which is working
 fine. But somehow when i try to run it via originally extracted tar
 package. i am getting:

 *
 xss =  -ea
 -javaagent:/home/impadmin/software/apache-cassandra-1.2.4//lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms1024M -Xmx1024M
 -Xmn256M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
 Error: Main method not found in class
 org.apache.cassandra.service.CassandraDaemon, please define the main method
 as:
public static void main(String[] args)
 *

 I tried setting CASSANDRA_HOME directory, but no luck.

 Error is bit confusing, Any suggestions???

 -Vivek







 --
 Brian ONeill
 Chief Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42



unsubscribe

2013-07-17 Thread crigano
unsubscribe


Re: AbstractCassandraDaemon.java (line 134) Exception in thread

2013-07-17 Thread Julio Quierati
Hi Aaron,

In my cassandra-env.sh, the default -Xss160k had been changed to -Xss100k.

if [ `uname` = Linux ] ; then
# reduce the per-thread stack size to minimize the impact of Thrift
# thread-per-client.  (Best practice is for client connections to
# be pooled anyway.) Only do so on Linux where it is known to be
# supported.
if startswith $JVM_VERSION '1.7.'
then
JVM_OPTS=$JVM_OPTS -Xss160k
else
JVM_OPTS=$JVM_OPTS -Xss128k
fi
fi

Cassandra errored after restart:

The stack size specified is too small, Specify at least 160k
Cannot create Java VM
Service exit with a return value of 1


https://issues.apache.org/jira/browse/CASSANDRA-4275
I set 256k.

A new problem: after restart, nodetool ring is showing old nodes that were
removed a month ago; these nodes do not exist anymore:


nodetool ring
Note: Ownership information does not include topology, please specify a
keyspace.
Address DC  RackStatus State   LoadOwns
   Token

   134691819273918816419698471612174905938
172.16.1.48 datacenter1 rack1   Down   Normal  ?
45.82%  42514072861213193928274919594412671125
172.16.4.212datacenter1 rack1   Up Normal  21.59 GB
 4.18%   49621227543684200553854819754232853074
172.16.3.221datacenter1 rack1   Down   Normal  ?
29.17%  99246027852756676471313599864858497886
172.16.5.57 datacenter1 rack1   Down   Normal  ?
17.50%  129020446491903175361975561488312102412
172.16.3.93 datacenter1 rack1   Up Normal  21.96 GB
 3.33%   134691819273918816419698471612174905938


I have 2 nodes up; with these errors the system is down now:
172.16.3.93  datacenter1 rack1   Up Normal  21.96 GB
 3.33%   134691819273918816419698471612174905938
172.16.4.212datacenter1 rack1   Up Normal  21.59 GB
 4.18%   49621227543684200553854819754232853074



--
Julio Quierati
User Linux #492973

OpenPGP Key: 0xC9A064FA578E0D60
8D70 B111 ECE9 D3E9 E661 305B C9A0 64FA 578E 0D60


2013/7/17 aaron morton aa...@thelastpickle.com

 Double check the stack size is to set 100K see
 https://github.com/apache/cassandra/blob/cassandra-1.1/conf/cassandra-env.sh#L187

 Cheers

 -
 Aaron Morton
 Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 17/07/2013, at 6:26 AM, Julio Quierati julio.quier...@gmail.com
 wrote:

  Hello,
 
  At least 2 times a day I'm getting hundreds of log entries related to the
 exception below; it seems like a network bottleneck. Does anyone know how to
 solve this, or has anyone encountered this problem?
 
  Using
C* version 1.1.4
 and
  Java version 1.7.0_21
  Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
  Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
  AWS m1.xlarge
 
  ERROR [Thrift:967696] 2013-07-16 12:36:12,514
 AbstractCassandraDaemon.java (line 134) Exception in thread
 Thread[Thrift:967696,5,main]
  java.lang.StackOverflowError
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at
 org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at
 org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
at
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
  ERROR [Thrift:966717] 2013-07-16 12:36:17,152
 AbstractCassandraDaemon.java (line 134) Exception in thread
 Thread[Thrift:966717,5,main]
  java.lang.StackOverflowError
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at 

Re: Minimum CPU and RAM for Cassandra and Hadoop Cluster

2013-07-17 Thread Martin Arrowsmith
Even if we have one machine, will splitting it up into 2 nodes via a VM
make a difference ?
Can it simulate 2 nodes of half the computing power ? Also, yes, this will
just be a test playground
and not in production.

Thank you,

Martin


On Mon, Jul 15, 2013 at 1:56 PM, Nate McCall zznat...@gmail.com wrote:

 Good point. Just to be clear - my suggestions all assume this is a
 testing/playground/get a feel setup. This is a bad idea for
 performance testing (not to mention anywhere near production).

 On Mon, Jul 15, 2013 at 3:02 PM, Tim Wintle timwin...@gmail.com wrote:
  I might be missing something, but if it is all on one machine then why
 use
  Cassandra or hadoop?
 
  Sent from my phone
 
  On 13 Jul 2013 01:16, Martin Arrowsmith arrowsmith.mar...@gmail.com
  wrote:
 
  Dear Cassandra experts,
 
  I have an HP Proliant ML350 G8 server, and I want to put virtual
  servers on it. I would like to put the maximum number of nodes
  for a Cassandra + Hadoop cluster. I was wondering: what is the
  minimum RAM and memory per node that I need to run Cassandra + Hadoop
  before the performance decrease is no longer worth the extra nodes?
 
  Also, what is the suggested typical number of CPU cores / Node ? Would
  it make sense to have 1 core / node ? Less than that ?
 
  Any insight is appreciated! Thanks very much for your time!
 
  Martin



Re: AbstractCassandraDaemon.java (line 134) Exception in thread

2013-07-17 Thread Julio Quierati
System up again xD

nodetool removetoken 129020446491903175361975561488312102412
nodetool removetoken force
nodetool removetoken 42514072861213193928274919594412671125
nodetool removetoken force
nodetool removetoken 99246027852756676471313599864858497886
nodetool removetoken force




--
Julio Quierati
User Linux #492973

OpenPGP Key: 0xC9A064FA578E0D60
8D70 B111 ECE9 D3E9 E661 305B C9A0 64FA 578E 0D60


2013/7/17 Julio Quierati julio.quier...@gmail.com

 Hi Aaron,

 My cassandra-env is by setting -Xss160k changed to -Xss100k.

 if [ `uname` = Linux ] ; then
 # reduce the per-thread stack size to minimize the impact of Thrift
 # thread-per-client.  (Best practice is for client connections to
 # be pooled anyway.) Only do so on Linux where it is known to be
 # supported.
 if startswith $JVM_VERSION '1.7.'
 then
 JVM_OPTS=$JVM_OPTS -Xss160k
 else
 JVM_OPTS=$JVM_OPTS -Xss128k
 fi
 fi

 cassandra error after restart.

 The stack size specified is too small, Specify at least 160k
 Cannot create Java VM
 Service exit with a return value of 1


 https://issues.apache.org/jira/browse/CASSANDRA-4275
 I set 256k.

 A new problem, after restart, the nodetool ring, this viewing old nodes
 removed a 1 months ago, these nodesnot exist anymore :


 nodetool ring
 Note: Ownership information does not include topology, please specify a
 keyspace.
 Address DC  RackStatus State   Load
  OwnsToken

  134691819273918816419698471612174905938
 172.16.1.48 datacenter1 rack1   Down   Normal  ?
 45.82%  42514072861213193928274919594412671125
 172.16.4.212datacenter1 rack1   Up Normal  21.59 GB
  4.18%   49621227543684200553854819754232853074
 172.16.3.221datacenter1 rack1   Down   Normal  ?
 29.17%  99246027852756676471313599864858497886
 172.16.5.57 datacenter1 rack1   Down   Normal  ?
 17.50%  129020446491903175361975561488312102412
 172.16.3.93 datacenter1 rack1   Up Normal  21.96 GB
  3.33%   134691819273918816419698471612174905938


 I have 2 nodes, now errors, system is down now :
 172.16.3.93  datacenter1 rack1   Up Normal  21.96 GB
  3.33%   134691819273918816419698471612174905938
 172.16.4.212datacenter1 rack1   Up Normal  21.59 GB
  4.18%   49621227543684200553854819754232853074



 --
 Julio Quierati
 User Linux #492973

 OpenPGP Key: 0xC9A064FA578E0D60
 8D70 B111 ECE9 D3E9 E661 305B C9A0 64FA 578E 0D60


 2013/7/17 aaron morton aa...@thelastpickle.com

 Double check the stack size is to set 100K see
 https://github.com/apache/cassandra/blob/cassandra-1.1/conf/cassandra-env.sh#L187

 Cheers

 -
 Aaron Morton
 Cassandra Consultant
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 17/07/2013, at 6:26 AM, Julio Quierati julio.quier...@gmail.com
 wrote:

  Hello,
 
  At least 2 times a day I'm having hundreds of log entry related to
 exception below, the network bottleneck seems, anyone know how to solve, or
 encountered this problem
 
  Using
C* version 1.1.4
 and
  Java version 1.7.0_21
  Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
  Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
  AWS m1.xlarge
 
  ERROR [Thrift:967696] 2013-07-16 12:36:12,514
 AbstractCassandraDaemon.java (line 134) Exception in thread
 Thread[Thrift:967696,5,main]
  java.lang.StackOverflowError
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
at
 org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at
 org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
 org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at
 org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at
 org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
 org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at
 org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at
 org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at
 org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:22)
at
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
 

Re: unsubscribe

2013-07-17 Thread Robert Coli
http://wiki.apache.org/cassandra/FAQ#unsubscribe


On Wed, Jul 17, 2013 at 8:35 AM, crig...@cox.net wrote:

 unsubscribe



Re: manually removing sstable

2013-07-17 Thread Robert Coli
On Wed, Jul 17, 2013 at 1:47 AM, Michał Michalski mich...@opera.com wrote:

 According to my knowledge it's not necessarily true. In a specific case
 this patch comes into play:

 https://issues.apache.org/jira/browse/CASSANDRA-4671

 We could however purge tombstone if we know that the non-compacted
 sstables doesn't have any info that is older than the tombstones we're
 about to purge (since then we know that the tombstones we'll consider can't
 delete data in non compacted sstables).


There's also https://issues.apache.org/jira/browse/CASSANDRA-1074

However, as we have bloom filters into each one of these SSTables, minor
compaction can relatively inexpensively check for existence of this key in
SSTable files not involved in the current minor compaction, and thereby
delete the key, assuming all bloom filters return negative. If the filter
returns positive, a major compaction would of course still be required.

IIRC there may have been some minor caveats to this, but fwiw!
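
The purge condition discussed above can be sketched as follows. This is a
simplified model for illustration, not Cassandra's actual implementation; the
timestamps are arbitrary numbers:

```python
def tombstone_droppable(tombstone_ts, gc_grace_expired, other_sstable_min_ts):
    """A tombstone can be purged during compaction only if gc_grace has
    expired AND no sstable outside the compaction could still hold data
    the tombstone shadows, i.e. every other sstable's minimum cell
    timestamp is newer than the tombstone."""
    if not gc_grace_expired:
        return False
    return all(min_ts > tombstone_ts for min_ts in other_sstable_min_ts)

# Tombstone written at t=100. An uncompacted sstable whose oldest cell is
# t=90 might still contain shadowed data, so the tombstone must be kept:
print(tombstone_droppable(100, True, [90, 150]))   # False
# If every other sstable's data is newer than the tombstone, it can go:
print(tombstone_droppable(100, True, [120, 150]))  # True
```

The bloom-filter refinement from CASSANDRA-1074 would replace the per-sstable
timestamp check with a cheap "does this sstable possibly contain the key"
test, at the cost of occasional false positives.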

=Rob


Re: Compaction: pending tasks are stuck

2013-07-17 Thread Steffen Rusitschka

Hi Aaron, list,

Thanks for the input. But... We figured out what the reason is:

https://issues.apache.org/jira/browse/CASSANDRA-5677

we've eliminated the two major spots where we did deletes and things 
have calmed down. Timeouts still happen and we're eagerly awaiting 1.2.7.


Cheers
Steffen


Am 09.07.2013 00:58, schrieb aaron morton:

It started with Cassandra 1.1.10 and on Friday we've migrated to 1.2.6
and now we have the issue that flushes are no longer executed
automatically - we have to do them every hour by hand (nodetool flush)
in order to keep the nodes on an acceptable utilizatzion
level. Without manual flush the CPU goes mad after a couple of hours
on each instance.

It may be that because 1.2 manages it's memory better it has to flush to
disk less frequently in your use case. We flush to disk based on memory
usage and log file growth.

Can you explain what happens when the CPU goes mad ?


Is there a way to find out which tasks are pending? Is there a way to
cancel them?

This looks like an error in the way the count is being handled; if there
were pending tasks they would be running.
There was a recent issue like this that was fixed in 1.2.5
https://issues.apache.org/jira/browse/CASSANDRA-5554?attachmentOrder=asc

If this is happening reliably can you create a new ticket and include
the version and the compaction strategy.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 8/07/2013, at 9:21 PM, Steffen Rusitschka r...@megazebra.com
mailto:r...@megazebra.com wrote:


The heap size is 3GB and the GC has to run only every 30 minutes...
I'm pretty sure increasing heap won't help.

Am 08.07.2013 11:15, schrieb Radim Kolar:



Without manual flush the CPU goes mad after a couple of hours on each
instance.

increase heap size



--
Steffen Rusitschka
CTO
MegaZebra GmbH
Steinsdorfstraße 2
81538 München
Phone +49 89 80929577


r...@megazebra.com mailto:r...@megazebra.com
Challenge me at www.megazebra.com

MegaZebra GmbH
Geschäftsführer: Henning Kosmack, Christian Meister, Steffen Rusitschka
Sitz der Gesellschaft: München, HRB 177947





--
Steffen Rusitschka
CTO
MegaZebra GmbH
Steinsdorfstraße 2
81538 München
Phone +49 89 80929577


r...@megazebra.com
Challenge me at www.megazebra.com

MegaZebra GmbH
Geschäftsführer: Henning Kosmack, Christian Meister, Steffen Rusitschka
Sitz der Gesellschaft: München, HRB 177947


is there a key to sstable index file?

2013-07-17 Thread S Ahmed
Since SSTables are immutable, and they are ordered, does this mean that there
is an index of the key ranges that each SSTable holds, and that the value could be
in 1 or more sstables that have to be scanned, with the latest one then chosen?

e.g. Say I write a value abc to CF1.  This gets stored in a sstable.

Then I write def to CF1, this gets stored in another sstable eventually.

Now when I go to fetch the value, it has to scan 2 sstables and then figure
out which is the latest entry, correct?

So is there an index of keys to sstables, and can there be 1 or more
sstables per key?

(This is assuming compaction hasn't occurred yet).


Re: Alternate major compaction

2013-07-17 Thread Radim Kolar

On 16.7.2013 20:45, Robert Coli wrote:
On Fri, Jul 12, 2013 at 2:28 AM, Radim Kolar h...@filez.com wrote:


with very little work (less than 10 KB of code) it is possible
to have an online sstable splitter and export this functionality
over JMX.


Are you volunteering? If so, I'd totally dig using the patch! :D

I do not work for free.


RE: is there a key to sstable index file?

2013-07-17 Thread Kanwar Sangha
Yes. Multiple SSTables can have the same key, and only after compaction are the
keys merged to reflect the latest value.

From: S Ahmed [mailto:sahmed1...@gmail.com]
Sent: 17 July 2013 15:54
To: cassandra-u...@incubator.apache.org
Subject: is there a key to sstable index file?

Since SSTables are immutable, and they are ordered, does this mean that there is
an index of key ranges that each SSTable holds, and that the value could be in one
or more sstables that have to be scanned, with the latest one then chosen?

e.g. Say I write a value "abc" to CF1. This gets stored in an sstable.

Then I write "def" to CF1; this gets stored in another sstable eventually.

Now when I go to fetch the value, it has to scan 2 sstables and then figure out
which is the latest entry, correct?

So is there an index of keys to sstables, and can there be 1 or more sstables
per key?

(This is assuming compaction hasn't occurred yet).
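The merge-on-read behaviour described in this thread can be sketched roughly as
follows. This is an illustrative model only, not Cassandra's actual read path;
the class and method names here are invented for the example:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MergeSketch {
    // One column value plus the timestamp it was written with, as it would
    // appear in a single SSTable.
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Merge the cells for a single row key read from several SSTables:
    // for each column name, the cell with the highest timestamp wins.
    static Map<String, Cell> merge(List<Map<String, Cell>> sstables) {
        Map<String, Cell> result = new HashMap<>();
        for (Map<String, Cell> table : sstables) {
            for (Map.Entry<String, Cell> e : table.entrySet()) {
                Cell current = result.get(e.getKey());
                if (current == null || e.getValue().timestamp > current.timestamp) {
                    result.put(e.getKey(), e.getValue());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "abc" written first, "def" written later, to the same column.
        Map<String, Cell> first = Map.of("col", new Cell("abc", 100L));
        Map<String, Cell> second = Map.of("col", new Cell("def", 200L));
        // The read consults both tables and keeps the newest cell.
        System.out.println(merge(List.of(first, second)).get("col").value); // prints def
    }
}
```

The point of the sketch is that nothing in the older sstable is rewritten at
read time; the reconciliation happens per read until compaction merges the
files for real.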


InvalidRequestException(why:Not enough bytes to read value of component 0)

2013-07-17 Thread Rahul Gupta
Getting error while trying to persist data in a Column Family having a 
CompositeType comparator

Using Cassandra ver 1.1.9.7
Hector Core ver 1.1.5 API ( which uses Thrift 1.1.10)

Created Column Family using cassandra-cli:

create column family event_counts
with comparator = 'CompositeType(DateType,UTF8Type)'
and key_validation_class = 'UUIDType'
and default_validation_class = 'CounterColumnType';

Persistence Code(sumLoad.java):

import java.sql.Date;
import java.util.UUID;
import java.util.logging.Level;
import java.util.logging.Logger;

import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.DateSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.serializers.UUIDSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.beans.HCounterColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class sumLoad {

    private static final Logger LOGGER = Logger.getLogger(sumLoad.class.getName());

    final static Cluster cluster =
            HFactory.getOrCreateCluster("Dev", new CassandraHostConfigurator("100.10.0.6:9160"));
    final static Keyspace keyspace = HFactory.createKeyspace("Events", cluster);

    private boolean storeCounts(String vKey, String counterCF, Date dateStr,
                                String vStr, long value)
    {
        try
        {
            // key_validation_class is 'UUIDType', so the row key should be
            // serialized as a UUID rather than a raw string.
            Mutator<UUID> m1 = HFactory.createMutator(keyspace, UUIDSerializer.get());

            Composite ts = new Composite();
            ts.addComponent(dateStr, DateSerializer.get());
            ts.addComponent(vStr, StringSerializer.get());

            // Pass the Composite itself with a CompositeSerializer. Calling
            // ts.toString() here sends a plain string where the comparator
            // expects CompositeType bytes -- the likely cause of
            // "Not enough bytes to read value of component 0".
            HCounterColumn<Composite> hColumn_ts =
                    HFactory.createCounterColumn(ts, value, CompositeSerializer.get());

            m1.insertCounter(UUID.fromString(vKey), counterCF, hColumn_ts);
            m1.execute();
            return true;
        }
        catch (Exception ex)
        {
            LOGGER.log(Level.WARNING, "Unable to store record", ex);
        }
        return false;
    }

    public static void main(String[] args) {
        Date vDate = new Date(0);
        sumLoad SumLoad = new sumLoad();
        SumLoad.storeCounts("b9874e3e-4a0e-4e60-ae23-c3f1e575af93", "event_counts",
                vDate, "StoreThisString", 673);
    }
}

Error:

[main] INFO me.prettyprint.cassandra.service.JmxMonitor - Registering JMX 
me.prettyprint.cassandra.service_Dev:ServiceType=hector,MonitorType=hector
Unable to store record
me.prettyprint.hector.api.exceptions.HInvalidRequestException: 
InvalidRequestException(why:Not enough bytes to read value of component 0)


Rahul Gupta
This e-mail and the information, including any attachments, it contains are 
intended to be a confidential communication only to the person or entity to 
whom it is addressed and may contain information that is privileged. If the 
reader of this message is not the intended recipient, you are hereby notified 
that any dissemination, distribution or copying of this communication is 
strictly prohibited. If you have received this communication in error, please 
immediately notify the sender and destroy the original message.





Re: is there a key to sstable index file?

2013-07-17 Thread Robert Coli
On Wed, Jul 17, 2013 at 1:53 PM, S Ahmed sahmed1...@gmail.com wrote:

 So is there an index of keys to sstables, and can there be 1 or more
 sstables per key?


There are bloom filters, which answer the question "is my row key
definitely not in this SSTable?"

There is also the Key Cache, which is a list of SSTables a given row key is
known to be in, and at what offset.

=Rob
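For illustration, a toy version of the "definitely not in this SSTable" check
might look like the following. This is a from-scratch sketch, not Cassandra's
own bloom filter implementation; the sizing and hash scheme are invented for
the example:

```java
import java.util.BitSet;

public class BloomSketch {
    // A toy Bloom filter: it can say "definitely absent" or "maybe present",
    // and never gives a false negative. Cassandra keeps one per SSTable so a
    // read can skip tables that cannot contain the requested row key.
    private final BitSet bits;
    private final int size;
    private final int hashes;

    BloomSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Derive the i-th bit position for a key (invented hash scheme).
    private int index(String key, int i) {
        int h = key.hashCode() * 31 + i * 0x9E3779B9;
        return Math.floorMod(h, size);
    }

    void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    // false => key is definitely not in this SSTable; true => it may be.
    boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1024, 3);
        filter.add("row-abc");
        System.out.println(filter.mightContain("row-abc")); // prints true
        // A key that was never added will usually (not always) report false.
        System.out.println("missing key: " + filter.mightContain("row-xyz"));
    }
}
```

A "true" answer still requires consulting the sstable's index; only "false" lets
the read skip the file outright, which is why false positives cost I/O but
never correctness.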


Re: AbstractCassandraDaemon.java (line 134) Exception in thread

2013-07-17 Thread Dave Brosius
What is your -Xss set to? If it's below 256k, set it to 256k, and see if you
still have the issues.

- Original Message -
From: "Julio Quierati" julio.quier...@gmail.com

sstable size ?

2013-07-17 Thread Langston, Jim
Hi all,

Is there a way to get an SSTable to a smaller size ? By this I mean that I
currently have an SSTable that is nearly 1.2G, so that subsequent SSTables
when they compact are trying to grow to that size. The result is that when
the min_compaction_threshold reaches its value and a compaction is needed,
the compaction is taking a long time as the file grows (it is currently at 52MB 
and
takes ~22s to compact).

I'm not sure how the SSTable initially grew to its current size of 1.2G, since 
the
servers have been up for a couple of years. I hadn't noticed until I just 
upgraded to 1.2.6,
but now I see it affects everything.


Jim


Re: sstable size ?

2013-07-17 Thread Colin Blower
Take a look at the very recent thread called 'Alternate major
compaction'. There are some ideas in there about splitting up a large
SSTable.

http://www.mail-archive.com/user@cassandra.apache.org/msg30956.html


On 07/17/2013 04:17 PM, Langston, Jim wrote:
 Hi all,

 Is there a way to get an SSTable to a smaller size ? By this I mean
 that I 
 currently have an SSTable that is nearly 1.2G, so that subsequent SSTables
 when they compact are trying to grow to that size. The result is that
 when 
 the min_compaction_threshold reaches its value and a compaction is needed, 
 the compaction is taking a long time as the file grows (it is
 currently at 52MB and
 takes ~22s to compact).

 I'm not sure how the SSTable initially grew to its current size of
 1.2G, since the
 servers have been up for a couple of years. I hadn't noticed until I
 just upgraded to 1.2.6,
 but now I see it affects everything. 


 Jim


-- 
*Colin Blower*
/Software Engineer/
Barracuda Networks Inc.
+1 408-342-5576 (o)


Fp chance for column level bloom filter

2013-07-17 Thread Takenori Sato
Hi,

I thought memory consumption of the column-level bloom filter would become a
big concern when a row becomes very wide, like more than tens of millions of
columns.

But I read in the source (1.0.7) that the fp chance for the column-level bloom
filter is hard-coded as 0.160, which is very high. So it seems not.

Is this correct?

Thanks,
Takenori
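The trade-off behind that hard-coded value can be checked with the standard
Bloom filter sizing formula, bits per element = -ln(p) / (ln 2)^2. The sketch
below is mine, not Cassandra code; the 10M-column figures in the comments are
back-of-the-envelope estimates from that formula:

```java
public class BloomSizing {
    // Standard Bloom filter sizing: bits per element = -ln(p) / (ln 2)^2,
    // where p is the target false-positive chance.
    static double bitsPerElement(double p) {
        return -Math.log(p) / (Math.log(2) * Math.log(2));
    }

    public static void main(String[] args) {
        // p = 0.160 needs roughly 3.8 bits per column, so a row with 10M
        // columns costs on the order of 5 MB of filter; p = 0.01 would need
        // about 9.6 bits per column, roughly 12 MB for the same row.
        System.out.println("p=0.16 -> " + bitsPerElement(0.16) + " bits/element");
        System.out.println("p=0.01 -> " + bitsPerElement(0.01) + " bits/element");
    }
}
```

So the high fp chance roughly halves, or better, the per-column filter memory
compared to typical row-level settings, which is consistent with the original
poster's reading that wide rows were the design concern.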


Re: sstable size ?

2013-07-17 Thread Langston, Jim
Thanks, this does look like what I'm experiencing. Can someone
post a walkthrough? The README and the sstablesplit script
don't seem to cover its use in any detail.

Jim

From: Colin Blower cblo...@barracuda.com
Reply-To: user@cassandra.apache.org
Date: Wed, 17 Jul 2013 16:49:59 -0700
To: user@cassandra.apache.org
Subject: Re: sstable size ?
Subject: Re: sstable size ?

Take a look at the very recent thread called 'Alternate major compaction'. 
There are some ideas in there about splitting up a large SSTable.

http://www.mail-archive.com/user@cassandra.apache.org/msg30956.html


On 07/17/2013 04:17 PM, Langston, Jim wrote:
Hi all,

Is there a way to get an SSTable to a smaller size ? By this I mean that I
currently have an SSTable that is nearly 1.2G, so that subsequent SSTables
when they compact are trying to grow to that size. The result is that when
the min_compaction_threshold reaches its value and a compaction is needed,
the compaction is taking a long time as the file grows (it is currently at 52MB 
and
takes ~22s to compact).

I'm not sure how the SSTable initially grew to its current size of 1.2G, since 
the
servers have been up for a couple of years. I hadn't noticed until I just 
upgraded to 1.2.6,
but now I see it affects everything.


Jim


--
Colin Blower
Software Engineer
Barracuda Networks Inc.
+1 408-342-5576 (o)
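In the absence of an official walkthrough, a typical run looks roughly like the
sketch below. The paths and sstable names are placeholders, and the flags may
differ between Cassandra versions, so confirm with `sstablesplit --help` first:

```shell
# Placeholders throughout: adjust keyspace/CF/file names for your cluster.
# 1. sstablesplit operates on data files directly, so stop the node first.
nodetool drain && sudo service cassandra stop

# 2. Split the oversized sstable into ~50 MB chunks (-s is the target size
#    in MB; exact options vary by version, check sstablesplit --help).
sstablesplit -s 50 /var/lib/cassandra/data/MyKeyspace/MyCF/*-Data.db

# 3. Restart the node; it opens the split sstables in place of the old one.
sudo service cassandra start
```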


Re: alter column family ?

2013-07-17 Thread Langston, Jim
As a follow up – I did upgrade the cluster to 1.2.6 and that
did take care of the issue. The upgrade went very smoothly,
the longest part was being thorough on the configuration
files, but I was able to able to quickly update the schema's
after restarting the cluster.


Jim

From: Langston, Jim jim.langs...@compuware.com
Reply-To: user@cassandra.apache.org
Date: Thu, 11 Jul 2013 18:25:42 +
To: user@cassandra.apache.org
Subject: Re: alter column family ?
Subject: Re: alter column family ?

Was just looking at a bug with uppercase names, could that be the error?

And, yes, definitely saved off the original system keyspaces.

I'm tailing the logs when running the cassandra-cli, but I do not
see anything in the logs ..

Jim

From: Robert Coli rc...@eventbrite.com
Reply-To: user@cassandra.apache.org
Date: Thu, 11 Jul 2013 11:07:55 -0700
To: user@cassandra.apache.org
Subject: Re: alter column family ?

On Thu, Jul 11, 2013 at 11:00 AM, Langston, Jim
jim.langs...@compuware.com wrote:
I went through the whole sequence again and now have gotten to the point of
being able to try and pull in the schema, but now getting this error from the 
one
node I'm executing on.
[default@unknown] create keyspace OTracker
9209ec36-3b3f-3e24-9dfb-8a45a5b29a2a
Waiting for schema agreement...
... schemas agree across the cluster
NotFoundException()

This is pretty unusual.

All the nodes see each other and are available, all only contain a system
schema, none have a OTracker schema

If you look in the logs for schema related stuff when you try to create 
OTracker, what do you see?

Do you see the above UUID schema version in the logs?

At this point I am unable to suggest anything other than upgrading to the head
of the 1.1 line and trying to create your keyspace there. There should be no chance of
old state being implicated in your now-stuck schema, so it seems likely that
the problem has re-occurred due to the version of Cassandra you are running.

Sorry I am unable to be of more assistance and that my advice appears to have 
resulted in your cluster being in worse condition than when you started. I 
probably mentioned but will do so again that if you have the old system 
keyspace directories, you can stop cassandra on all nodes and then revert to 
them.

=Rob