Re: C* 2.1-rc2 gets unstable after a 'DROP KEYSPACE' command ?

2014-07-17 Thread Benedict Elliott Smith
Also https://issues.apache.org/jira/browse/CASSANDRA-7437 and
https://issues.apache.org/jira/browse/CASSANDRA-7465 for rc3, although the
CounterCacheKey assertion looks like an independent (though comparatively
benign) bug I will file a ticket for.

Can you try this against rc3 to see if the problem persists? You may see
the last exception, but it shouldn't affect the stability of the cluster.
If either of the other exceptions persist, please file a ticket.


On Thu, Jul 17, 2014 at 1:41 AM, Tyler Hobbs ty...@datastax.com wrote:

 This looks like https://issues.apache.org/jira/browse/CASSANDRA-6959, but
 that was fixed for 2.1.0-rc1.

 Is there any chance you can put together a script to reproduce the issue?


 On Thu, Jul 10, 2014 at 8:51 AM, Pavel Kogan pavel.ko...@cortica.com
 wrote:

 It seems that a memtable tries to flush itself to an SSTable of a keyspace
 that no longer exists. I don't know why this happens, but running nodetool
 flush before the drop should probably prevent the issue.

 Pavel


 On Thu, Jul 10, 2014 at 4:09 AM, Fabrice Larcher 
 fabrice.larc...@level5.fr wrote:

 ​Hello,

 I am using the 'development' version 2.1-rc2.

 With one node (=localhost), I get timeouts trying to connect to C* after
 running a 'DROP KEYSPACE' command. I have the following error messages in
 system.log:

 INFO  [SharedPool-Worker-3] 2014-07-09 16:29:36,578
 MigrationManager.java:319 - Drop Keyspace 'test_main'
 (...)
 ERROR [MemtableFlushWriter:6] 2014-07-09 16:29:37,178
 CassandraDaemon.java:166 - Exception in thread
 Thread[MemtableFlushWriter:6,5,main]
 java.lang.RuntimeException: Last written key
 DecoratedKey(91e7f660-076f-11e4-a36d-28d2444c0b1b,
  52446dde90244ca49789b41671e4ca7c) >= current key
 DecoratedKey(91e7f660-076f-11e4-a36d-28d2444c0b1b,
 52446dde90244ca49789b41671e4ca7c) writing into
 ./../data/data/test_main/user-911d5360076f11e4812d3d4ba97474ac/test_main-user.user_account-tmp-ka-1-Data.db
 at
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:172)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:215)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:351)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:314)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
 ~[guava-16.0.jar:na]
 at
 org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1054)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 ~[na:1.7.0_55]
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_55]

  Then, I cannot connect to the cluster anymore from my app (Java Driver
  2.1-SNAPSHOT) and get the following in the application logs:

 com.datastax.driver.core.exceptions.NoHostAvailableException: All
 host(s) tried for query failed (tried: /127.0.0.1:9042
 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
 at
 com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
 at
 com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:258)
 at
 com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:174)
 at
 com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
 at
 com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:36)
 (...)
 Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException:
 All host(s) tried for query failed (tried: /127.0.0.1:9042
 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
 at
 com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
 at
 com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:745)

 I can still connect through CQLSH but if I run (again) a DROP KEYSPACE
 command from CQLSH, I get the following error :
 errors={}, last_host=127.0.0.1

  Now, on a 2-node cluster I also have a similar issue, but the error's
  stacktrace is different ...

Re: trouble showing cluster scalability for read performance

2014-07-17 Thread Duncan Sands

Hi Diane,

On 17/07/14 06:19, Diane Griffith wrote:

We have been struggling to prove out linear read performance with our cassandra
configuration, i.e. that it is horizontally scaling.  Wondering if anyone has any
suggestions for what minimal configuration and approach to use to demonstrate
this.

We were trying to go for a simple set up, so on the keyspace and/or column
families we went with the following settings thinking it was the minimal to
prove scaling:

replication_factor set to 1,


a RF of 1 means that any particular bit of data exists on exactly one node.  So 
if you are testing read speed by reading the same data item again and again as 
fast as you can, then all the reads will be coming from the same one node, the 
one that has that data item on it.  In this situation adding more nodes won't 
help.  Maybe this isn't exactly how you are testing read speed, but perhaps you 
are doing something analogous?  I suggest you explain how you are measuring read 
speed exactly.
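
As an aside, the replication factor is just part of the keyspace definition, so it
is cheap to vary between test runs.  A minimal sketch (the keyspace name is
illustrative):

CREATE KEYSPACE perf_test
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};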


Ciao, Duncan.


SimpleStrategy,
default consistency level,
default compaction strategy (size tiered),
but compacted down to 1 sstable per cf on each node (versus using leveled
compaction for read performance)

*Read Performance Results:*
1 client thread - 2 nodes > 1 node was seen but we couldn't show increased
performance adding more nodes, i.e. 4 nodes !> 2 nodes
2 client threads - 2 nodes > 1 node still held true but again we couldn't show
increased performance adding more nodes, i.e. 4 nodes !> 2 nodes
10 client threads - this time 2 nodes < 1 node on performance numbers.  2 nodes
suffered a larger reduction in throughput than 1 node was showing.

Where are we going wrong?

How have others shown horizontal scaling for reads?

Thanks,
Diane




Re: TTransportException (java.net.SocketException: Broken pipe)

2014-07-17 Thread Benedict Elliott Smith
Are you still seeing the same exceptions about too many open files?




On Thu, Jul 17, 2014 at 6:28 AM, Bhaskar Singhal bhaskarsing...@yahoo.com
wrote:

 Even after changing ulimits and moving to the recommended production
 settings, we are still seeing the same issue.

 root@lnx148-76:~# cat /proc/17663/limits
 Limit                     Soft Limit   Hard Limit   Units
 Max cpu time              unlimited    unlimited    seconds
 Max file size             unlimited    unlimited    bytes
 Max data size             unlimited    unlimited    bytes
 Max stack size            8388608      unlimited    bytes
 Max core file size        0            unlimited    bytes
 Max resident set          unlimited    unlimited    bytes
 Max processes             256502       256502       processes
 Max open files            4096         4096         files
 Max locked memory         65536        65536        bytes
 Max address space         unlimited    unlimited    bytes
 Max file locks            unlimited    unlimited    locks
 Max pending signals       256502       256502       signals
 Max msgqueue size         819200       819200       bytes
 Max nice priority         0            0
 Max realtime priority     0            0
 Max realtime timeout      unlimited    unlimited    us


 Regards,
 Bhaskar


   On Thursday, 10 July 2014 12:09 AM, Robert Coli rc...@eventbrite.com
 wrote:


 On Tue, Jul 8, 2014 at 10:17 AM, Bhaskar Singhal bhaskarsing...@yahoo.com
  wrote:

 But I am wondering why does Cassandra need to keep 3000+ commit log
 segment files open?


 Because you are writing faster than you can flush to disk.

 =Rob






Re: TTransportException (java.net.SocketException: Broken pipe)

2014-07-17 Thread Bhaskar Singhal
Yes, I am.
lsof lists around 9000 open file handles, and there were around 3000 commitlog
segments.



On Thursday, 17 July 2014 1:24 PM, Benedict Elliott Smith 
belliottsm...@datastax.com wrote:
 


Are you still seeing the same exceptions about too many open files?





On Thu, Jul 17, 2014 at 6:28 AM, Bhaskar Singhal bhaskarsing...@yahoo.com 
wrote:

Even after changing ulimits and moving to the recommended production settings, 
we are still seeing the same issue.


root@lnx148-76:~# cat /proc/17663/limits
Limit                     Soft Limit   Hard Limit   Units
Max cpu time              unlimited    unlimited    seconds
Max file size             unlimited    unlimited    bytes
Max data size             unlimited    unlimited    bytes
Max stack size            8388608      unlimited    bytes
Max core file size        0            unlimited    bytes
Max resident set          unlimited    unlimited    bytes
Max processes             256502       256502       processes
Max open files            4096         4096         files
Max locked memory         65536        65536        bytes
Max address space         unlimited    unlimited    bytes
Max file locks            unlimited    unlimited    locks
Max pending signals       256502       256502       signals
Max msgqueue size         819200       819200       bytes
Max nice priority         0            0
Max realtime priority     0            0
Max realtime timeout      unlimited    unlimited    us




Regards,
Bhaskar




On Thursday, 10 July 2014 12:09 AM, Robert Coli rc...@eventbrite.com wrote:
 


On Tue, Jul 8, 2014 at 10:17 AM, Bhaskar Singhal bhaskarsing...@yahoo.com 
wrote:

But I am wondering why does Cassandra need to keep 3000+ commit log segment 
files open?


Because you are writing faster than you can flush to disk.


=Rob
 



Re: MemtablePostFlusher and FlushWriter

2014-07-17 Thread Kais Ahmed
Thanks christian,

I'll check on my side.

Have you an idea about FlushWriter 'All time blocked'

Thanks,


2014-07-16 16:23 GMT+02:00 horschi hors...@gmail.com:

 Hi Ahmed,

 this exception is caused by you creating rows with a key-length of more
 than 64kb. Your key is 394920 bytes long it seems.

 Keys and column-names are limited to 64kb. Only values may be larger.

 I cannot say for sure if this is the cause of your high
 MemtablePostFlusher pending count, but I would say it is possible.

 kind regards,
 Christian

 PS: I still use good old thrift lingo.






 On Wed, Jul 16, 2014 at 3:14 PM, Kais Ahmed k...@neteck-fr.com wrote:

 Hi chris, christan,

 Thanks for reply, i'm not using DSE.

 I have in the log files, this error that appear two times.

 ERROR [FlushWriter:3456] 2014-07-01 18:25:33,607 CassandraDaemon.java
 (line 196) Exception in thread Thread[FlushWriter:3456,5,main]
 java.lang.AssertionError: 394920
 at
 org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
 at
 org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
 at
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:365)
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:318)
 at
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)


 It's the same error than this link
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3cbay169-w52699dd7a1c0007783f8d8a8...@phx.gbl%3E
 ,
 with the same configuration 2 nodes RF 2 with SimpleStrategy.

 Hope this help.

 Thanks,



 2014-07-16 1:49 GMT+02:00 Chris Lohfink clohf...@blackbirdit.com:

 The MemtablePostFlusher is also used for flushing non-cf backed (solr)
 indexes.  Are you using DSE and solr by chance?

 Chris

 On Jul 15, 2014, at 5:01 PM, horschi hors...@gmail.com wrote:

 I have seen this behaviour when commitlog files got deleted (or
 permissions were set to read only).

 MemtablePostFlusher is the stage that marks the Commitlog as flushed.
 When they fail it usually means there is something wrong with the commitlog
 files.

 Check your logfiles for any commitlog related errors.

 regards,
 Christian


 On Tue, Jul 15, 2014 at 7:03 PM, Kais Ahmed k...@neteck-fr.com wrote:

 Hi all,

 I have a small cluster (2 nodes, RF 2) running with C* 2.0.6 on I2
 Extra Large (AWS) instances with SSD disks;
 nodetool tpstats shows many MemtablePostFlusher pending and
 FlushWriter 'All time blocked'.

 The two nodes have the default configuration. All CF use size-tiered
 compaction strategy.

 There are 10 times more reads than writes (1300 reads/s and 150
 writes/s).


  ubuntu@node1:~$ nodetool tpstats
  Pool Name                Active   Pending   Completed   Blocked   All time blocked
  MemtablePostFlusher           1      1158      159590         0                  0
  FlushWriter                   0         0       11568         0               1031

 ubuntu@node1:~$ nodetool compactionstats
 pending tasks: 90
 Active compaction remaining time :n/a


  ubuntu@node2:~$ nodetool tpstats
  Pool Name                Active   Pending   Completed   Blocked   All time blocked
  MemtablePostFlusher           1      1020       50987         0                  0
  FlushWriter                   0         0        6672         0                948


 ubuntu@node2:~$ nodetool compactionstats
 pending tasks: 89
 Active compaction remaining time :n/a

 I think there is something wrong, thank you for your help.








How to prevent writing to a Keyspace?

2014-07-17 Thread Lu, Boying
Hi, All,

I need to make a Cassandra keyspace read-only.
Does anyone know how to do that?

Thanks

Boying



Re: How to prevent writing to a Keyspace?

2014-07-17 Thread Vivek Mishra
Think about managing it via authorization and authentication support
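
For example, with authentication and authorization enabled (PasswordAuthenticator
and CassandraAuthorizer in cassandra.yaml), something along these lines should
leave a given user able to read but not write the keyspace (keyspace and user
names are illustrative):

GRANT SELECT ON KEYSPACE my_keyspace TO app_user;
REVOKE MODIFY ON KEYSPACE my_keyspace FROM app_user;

Note this restricts only that user; superusers can still write to the keyspace.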


On Thu, Jul 17, 2014 at 4:00 PM, Lu, Boying boying...@emc.com wrote:

 Hi, All,



 I need to make a Cassandra keyspace to be read-only.

 Does anyone know how to do that?



 Thanks



 Boying





Issue after loading data using ssttable loader

2014-07-17 Thread mahesh rajamani
Hi,

I have an issue in my environment running Cassandra 2.0.5. It is built
with 9 nodes, with 3 nodes in each datacenter. After loading the data, I am
able to do a token range lookup or a list in cassandra-cli, but when I do get
x[rowkey], the system hangs. A similar query in CQL shows the same
behavior.

The source environment has 3 nodes, configured as 3 datacenters with 1 node
each. I did an export from the source environment and imported it into the new
environment with 9 nodes. The other difference is that the source is configured
with 256 vnodes and the destination environment with 32 vnodes.

Below is the exception I see in Cassandra:
ERROR [ReadStage:103] 2014-07-16 21:23:55,648 CassandraDaemon.java (line 192) Exception in thread Thread[ReadStage:103,5,main]
java.lang.AssertionError: Added column does not sort as the first column
    at org.apache.cassandra.db.ArrayBackedSortedColumns.addColumn(ArrayBackedSortedColumns.java:115)
    at org.apache.cassandra.db.ColumnFamily.addColumn(ColumnFamily.java:116)
    at org.apache.cassandra.db.ColumnFamily.addIfRelevant(ColumnFamily.java:110)
    at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:205)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1560)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1379)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:327)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1396)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1931)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)



-- 
Regards,
Mahesh Rajamani


Re: MemtablePostFlusher and FlushWriter

2014-07-17 Thread horschi
Hi Ahmed,

for that you should increase the flush queue size setting
(memtable_flush_queue_size) in your cassandra.yaml

kind regards,
Christian



On Thu, Jul 17, 2014 at 10:54 AM, Kais Ahmed k...@neteck-fr.com wrote:

 Thanks christian,

 I'll check on my side.

 Have you an idea about FlushWriter 'All time blocked'

 Thanks,


 2014-07-16 16:23 GMT+02:00 horschi hors...@gmail.com:

 Hi Ahmed,

 this exception is caused by you creating rows with a key-length of more
 than 64kb. Your key is 394920 bytes long it seems.

 Keys and column-names are limited to 64kb. Only values may be larger.

 I cannot say for sure if this is the cause of your high
 MemtablePostFlusher pending count, but I would say it is possible.

 kind regards,
 Christian

 PS: I still use good old thrift lingo.






 On Wed, Jul 16, 2014 at 3:14 PM, Kais Ahmed k...@neteck-fr.com wrote:

 Hi chris, christan,

 Thanks for reply, i'm not using DSE.

 I have in the log files, this error that appear two times.

 ERROR [FlushWriter:3456] 2014-07-01 18:25:33,607 CassandraDaemon.java
 (line 196) Exception in thread Thread[FlushWriter:3456,5,main]
 java.lang.AssertionError: 394920
 at
 org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:342)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:201)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:188)
 at
 org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:133)
 at
 org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:202)
 at
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:187)
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:365)
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:318)
 at
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)


 It's the same error than this link
 http://mail-archives.apache.org/mod_mbox/cassandra-user/201305.mbox/%3cbay169-w52699dd7a1c0007783f8d8a8...@phx.gbl%3E
 ,
 with the same configuration 2 nodes RF 2 with SimpleStrategy.

 Hope this help.

 Thanks,



 2014-07-16 1:49 GMT+02:00 Chris Lohfink clohf...@blackbirdit.com:

 The MemtablePostFlusher is also used for flushing non-cf backed (solr)
 indexes.  Are you using DSE and solr by chance?

 Chris

 On Jul 15, 2014, at 5:01 PM, horschi hors...@gmail.com wrote:

 I have seen this behavour when Commitlog files got deleted (or
 permissions were set to read only).

 MemtablePostFlusher is the stage that marks the Commitlog as flushed.
 When they fail it usually means there is something wrong with the commitlog
 files.

 Check your logfiles for any commitlog related errors.

 regards,
 Christian


 On Tue, Jul 15, 2014 at 7:03 PM, Kais Ahmed k...@neteck-fr.com wrote:

 Hi all,

 I have a small cluster (2 nodes RF 2)  running with C* 2.0.6 on I2
 Extra Large (AWS) with SSD disk,
 the nodetool tpstats shows many MemtablePostFlusher pending and
 FlushWriter All time blocked.

 The two nodes have the default configuration. All CF use size-tiered
 compaction strategy.

 There are 10 times more reads than writes (1300 reads/s and 150
 writes/s).


  ubuntu@node1:~$ nodetool tpstats
  Pool Name                Active   Pending   Completed   Blocked   All time blocked
  MemtablePostFlusher           1      1158      159590         0                  0
  FlushWriter                   0         0       11568         0               1031

 ubuntu@node1:~$ nodetool compactionstats
 pending tasks: 90
 Active compaction remaining time :n/a


  ubuntu@node2:~$ nodetool tpstats
  Pool Name                Active   Pending   Completed   Blocked   All time blocked
  MemtablePostFlusher           1      1020       50987         0                  0
  FlushWriter                   0         0        6672         0                948


 ubuntu@node2:~$ nodetool compactionstats
 pending tasks: 89
 Active compaction remaining time :n/a

 I think there is something wrong, thank you for your help.









Re: possible to have TTL on individual collection values?

2014-07-17 Thread Ben Bromhead
Create a table with a set as one of the columns using cqlsh, populate with a 
few records.

Connect using the cassandra-cli, run list on your table/cf and you'll see how 
the sets work. 
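
A minimal sketch of that first step (table and column names are illustrative):

CREATE TABLE collection_demo (
    id text PRIMARY KEY,
    emails set<text>
);

INSERT INTO collection_demo (id, emails) VALUES ('u1', {'a@example.com', 'b@example.com'});

In cassandra-cli, list collection_demo; then shows each set element stored as its
own internal column (cell), which is why per-element TTLs are possible.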


Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359


On 13/07/2014, at 11:19 AM, Kevin Burton bur...@spinn3r.com wrote:

 
 
 
 On Sat, Jul 12, 2014 at 6:05 PM, Keith Wright kwri...@nanigans.com wrote:
 Yes each item in the set can have a different TTL so long as they are 
 upserted with commands having differing TTLs. 
 
 
 Ah… ok. So you can just insert them with unique UPDATE/INSERT commands with 
 different USING TTLs and it will work.  That makes sense.
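
  For example, against the illustrative table sketched earlier in this thread,
  each added element carries the TTL of the statement that wrote it:

  UPDATE collection_demo USING TTL 86400 SET emails = emails + {'a@example.com'} WHERE id = 'u1';
  UPDATE collection_demo USING TTL 3600  SET emails = emails + {'b@example.com'} WHERE id = 'u1';

  The first element expires after a day and the second after an hour, because the
  TTL applies to the individual cells each statement writes.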
 You should read about how collections/maps work in CQL3 in terms of their 
 CQL2 structure.
 
 Definitely.  I tried but the documentation is all over the map.  This is one 
 of the problems with Cassandra IMO.  It's evolving so fast that it's 
 difficult to find the correct documentation.
  
 -- 
 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 



Re: trouble showing cluster scalability for read performance

2014-07-17 Thread Diane Griffith
Duncan,

Thanks for that feedback.  I'll give a bit more info and then ask some more
questions.

*Our Goal*:  Not to produce the fastest read but show horizontal scaling.

*Test procedure*:
* Inserted 54M rows where one third of that represents a unique key, 18M
keys.  End result given our schema is the 54M rows becomes 72M rows in the
column family as the control query load to use.
* have a client that queries 100k records in configurable batches, set to
1k.  And then it does 100 reps of queries.  It doesn't do the same keys for
each rep, it uses an offset and then it increases the keys to query.
* We can adjust the hit rate, i.e. how many of the keys will be found but
have been focused on 100% hit rate
* we run the query where multiple clients can be spawned to do the same
query cycle 100k keys but the offset is not different so each client will
query the same keys.
* We thought we should manually compact the tables down to 1 sstable on a
given node for consistent results across different cluster sizes
* We had set replication factor to 1 originally to not complicate things or
impact initial write times even.  We would assess rf later was our thought.
 Since we changed the keys getting queried it would have to hit additional
nodes to get row data but for just 1 client thread (to get simplest path to
show horizontal scaling, had a slight decrease of performance when going to
4 nodes from 2 nodes)

Things seen off of given procedure and set up:


   1. 1 client thread:  2 nodes do better than 1 node on the query test.
But 4 nodes did not do better than 2.
   2. 2 client threads: 2 nodes were still doing better than 1 node
   3. 10 client threads: the times drastically suffered and 2 nodes were
   doing 1/2 the speed of 1 node, whereas with 1 to 2 threads 2 nodes had
   performed better than 1 node.  There was a huge decrease in performance on
   2 nodes and just a mild decrease on 1 node.

Note: 50+ threads was also drastically falling apart.

*Observations*:

   - compacting each node to 1 table did not seem to help as running 10
   client threads on exploded sstables and 2 nodes was 2x better than the last
   2 node 10 client test but still decreased performance from 1 to 2 threads
   query against compacted tables
   - I would see upwards to 10 read requests pending at times while 8 to 10
   were processing when I did nodetool tpstats.
   - having key cache on or disabled did not seem to impact things
   noticeably with our current configuration

.

*Questions:*

   1. can multiple threads read the same sstable at the same time?  Does
   compacting down to 1 sstable (to get a given row into one sstable) add any
   benefit or actually hurt like limited testing has indicated currently?
   2. given the above testing process, does it still make sense to adjust
   replication factor appropriately for cluster size (i.e. 1 for 1 node
   cluster, 2 for 2 node cluster, 3 for n size cluster).  We assumed it was
   just the ability for threads to connect into a coordinator that would help
   but sounds like it can still block


I'm going to try a limited test with changing the replication factor.  But if
anyone has any input on whether compacting to 1 sstable helps or hurts a simple
scalability test, on how (if at all) Cassandra blocks when reading sstables, and
on whether higher replication factors do indeed help produce reliable results,
it would be appreciated.  I know part of our charter was to keep it simple to
produce the scalability proof, but it does sound like the replication factor is
hurting us if the delay between clients for the same keys is not long enough,
given that we are not doing different offsets for each client thread.
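
The change itself is just a schema statement followed by a repair on each node so
existing data gets copied to the new replicas; a sketch with an illustrative
keyspace name:

ALTER KEYSPACE perf_test
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};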

Thanks,
Diane

On Thu, Jul 17, 2014 at 3:53 AM, Duncan Sands duncan.sa...@gmail.com
wrote:

 Hi Diane,


 On 17/07/14 06:19, Diane Griffith wrote:

 We have been struggling proving out linear read performance with our
 cassandra
 configuration, that it is horizontally scaling.  Wondering if anyone has
 any
 suggestions for what minimal configuration and approach to use to
 demonstrate this.

 We were trying to go for a simple set up, so on the keyspace and/or column
 families we went with the following settings thinking it was the minimal
 to
 prove scaling:

 replication_factor set to 1,


 a RF of 1 means that any particular bit of data exists on exactly one
 node.  So if you are testing read speed by reading the same data item again
 and again as fast as you can, then all the reads will be coming from the
 same one node, the one that has that data item on it.  In this situation
 adding more nodes won't help.  Maybe this isn't exactly how you are testing
 read speed, but perhaps you are doing something analogous?  I suggest you
 explain how you are measuring read speed exactly.

 Ciao, Duncan.

  SimpleStrategy,
 default consistency level,
 default compaction strategy (size tiered),
 but compacted down to 1 sstable per cf on each node (versus using leveled
 compaction for read performance)

 

Re: trouble showing cluster scalability for read performance

2014-07-17 Thread Jack Krupansky
It sounds as if you are actually testing “vertical scalability” (load on a 
single node) rather than Cassandra’s sweet spot of “horizontal scalability” 
(add more nodes to handle higher load.) Maybe you could clarify your intentions 
and specific use case.

Also, it sounds like you are trying to focus on large queries, but Cassandra’s 
sweet spot is lots of smaller queries. With larger queries you can end up 
measuring things like the capabilities of your hardware, cpu cores, memory, I/O 
bandwidth, network latency, JVM configuration, etc. rather than measuring 
Cassandra per se. So, again, maybe you could clarify your intended use case.

It might be that you need to add more “vertical scale” (bigger box, more cores, 
more memory, beefier I/O and networking) to handle large queries, or maybe 
simple, Cassandra-style “horizontal scaling” (adding nodes) will be sufficient. 
Sure, you can tune Cassandra for single-node performance, but that seems like a
lot of extra work, to me, compared to adding more cheap nodes.

-- Jack Krupansky

From: Diane Griffith 
Sent: Thursday, July 17, 2014 9:31 AM
To: user 
Subject: Re: trouble showing cluster scalability for read performance

Duncan,  

Thanks for that feedback.  I'll give a bit more info and then ask some more 
questions. 

Our Goal:  Not to produce the fastest read but show horizontal scaling.

Test procedure:  
* Inserted 54M rows where one third of that represents a unique key, 18M keys.  
End result given our schema is the 54M rows becomes 72M rows in the column 
family as the control query load to use.
* have a client that queries 100k records in configurable batches, set to 1k.  
And then it does 100 reps of queries.  It doesn't do the same keys for each 
rep, it uses an offset and then it increases the keys to query.  
* We can adjust the hit rate, i.e. how many of the keys will be found but have 
been focused on 100% hit rate
* we run the query where multiple clients can be spawned to do the same query 
cycle 100k keys but the offset is not different so each client will query the 
same keys.
* We thought we should manually compact the tables down to 1 sstable on a given 
node for consistent results across different cluster sizes
* We had set replication factor to 1 originally to not complicate things or 
impact initial write times even.  We would assess rf later was our thought.  
Since we changed the keys getting queried it would have to hit additional nodes 
to get row data but for just 1 client thread (to get simplest path to show 
horizontal scaling, had a slight decrease of performance when going to 4 nodes 
from 2 nodes)

Things seen off of given procedure and set up:

  1.. 1 client thread:  2 nodes do better than 1 node on the query test.  But 4 
nodes did not do better than 2.

  2.. 2 client threads: 2 nodes were still doing better than 1 node 
  3.. 10 client threads: the times drastically suffered and 2 nodes were doing 
1/2 the speed of 1 node but before 1 to 2 threads performed better on 2 nodes 
vs 1 node.  There was a huge decrease in performance on 2 nodes and just a mild 
decrease on 1 node. 
Note: 50+ threads was also drastically falling apart.


Observations:
  a.. compacting each node to 1 table did not seem to help as running 10 client 
threads on exploded sstables and 2 nodes was 2x better than the last 2 node 10 
client test but still decreased performance from 1 to 2 threads query against 
compacted tables

  b.. I would see upwards to 10 read requests pending at times while 8 to 10 
were processing when I did nodetool tpstats.

  c.. having key cache on or disabled did not seem to impact things noticeably 
with our current configuration

.

Questions:
  1.. can multiple threads read the same sstable at the same time?  Does 
compacting down to 1 sstable (to get a given row into one sstable) add any 
benefit or actually hurt like limited testing has indicated currently?

  2.. given the above testing process, does it still make sense to adjust 
replication factor appropriately for cluster size (i.e. 1 for 1 node cluster, 2 
for 2 node cluster, 3 for n size cluster).  We assumed it was just the ability 
for threads to connect into a coordinator that would help but sounds like it 
can still block


I'm going to try a limited test with changing replication factor.  But if 
anyone has any input on compacting to 1 sstable benefit or detriment on just 
simple scalability test, how if at all does cassandra block on reading 
sstables, and if higher replication factors do indeed help produce reliable 
results it would be appreciated.  I know part of our charter was keep it simple 
to produce the scalability proof but it does sound like replication factor is 
hurting us if the delay between clients for the same keys is not long enough 
given the fact we are not doing different offsets for each client thread.  

Thanks,
Diane


On Thu, Jul 17, 2014 at 3:53 AM, Duncan Sands duncan.sa...@gmail.com wrote:

  Hi Diane, 


  On 17/07/14 06:19, 

horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
This is a follow on re-post to clarify what we are trying to do, providing
information that was missing or not clear.



Goal:  Verify horizontal scaling for random non duplicating key reads using
the simplest configuration (or minimal configuration) possible.



Background:

A couple years ago we did similar performance testing with Cassandra for
both read and write performance and found excellent (essentially linear)
horizontal scalability.  That project got put on hold.  We are now moving
forward with an operational system and are having scaling problems.



During the prior testing (3 years ago) we were using a much older version
of Cassandra (0.8 or older), the THRIFT API, and Amazon AWS rather than
OpenStack VMs.  We are now using the latest Cassandra and the CQL
interface.  We did try moving from OpenStack to AWS/EC2 but that did not
materially change our (poor) results.



Test Procedure:

   - Inserted 54 million cells in 18 million rows (so 3 cells per row),
   using randomly generated row keys. That was to be our data control for the
   test.
   - Spawn a client on a different VM to query 100k rows and do that for
   100 reps.  Each row key queried is drawn randomly from the set of existing
   row keys, and then not re-used, so all 10 million row queries use a
   different (valid) row key.  This test is a specific use case of our system
   we are trying to show will scale

Result:

   - 2 nodes performed better than 1 node test but 4 nodes showed decreased
   performance over 2 nodes.  So that did not show horizontal scaling



Notes:

   - We have replication factor set to 1 as we were trying to keep the
   control test simple to prove out horizontal scaling.
   - When we tried to add threading to see if it would help it had
   interesting side behavior which did not prove out horizontal scaling.
   - We are using CQL versus THRIFT API for Cassandra 2.0.6





Does anyone have any feedback on whether either threading or replication factor
is necessary to show horizontal scaling of Cassandra, versus the minimal way of
just continuing to add nodes to help throughput?



Any suggestions of minimal configuration necessary to show scaling of our
query use case 100k requests for random non repeating keys constantly
coming in over a period of time?


Thanks,

Diane


Re: trouble showing cluster scalability for read performance

2014-07-17 Thread Diane Griffith
Definitely not trying to show vertical scaling.  We have a query use case
we are trying to show will scale as we add more nodes should performance
fall below adequate.   But to show the scaling we do the test on a 1 node
cluster, then 2 node cluster, then 4 node cluster with a goal that query
throughput increases when adding more nodes.

Basically we do not want to tune for single node performance and did want
to prove out adding nodes works but for our query use case it hasn't yet.
 Our query size is a valid use case though for our need.

Earlier it may not have been clear but we are not querying the same key
over and over in one thread but continuously querying random non
duplicating keys.  Bringing up the threading was not our main path or
desired goal so I re-posted with clearer intent hopefully of our goal, what
we experienced in the past against THRIFT and an older version of Cassandra
which we have not been able to duplicate via CQL and Cassandra 2.0.6.

So just hoping someone has suggestions of what one must do at a minimum to
prove horizontal scaling or have suggestions of what to look at in our
current datasize/query use case that may be causing us to not achieve
horizontal scaling.

Thanks,
Diane




On Thu, Jul 17, 2014 at 10:03 AM, Jack Krupansky j...@basetechnology.com
wrote:

   It sounds as if you are actually testing “vertical scalability” (load
 on a single node) rather than Cassandra’s sweet spot of “horizontal
 scalability” (add more nodes to handle higher load.) Maybe you could
 clarify your intentions and specific use case.

 Also, it sounds like you are trying to focus on large queries, but
 Cassandra’s sweet spot is lots of smaller queries. With larger queries you
 can end up measuring things like the capabilities of your hardware, cpu
 cores, memory, I/O bandwidth, network latency, JVM configuration, etc.
 rather than measuring Cassandra per se. So, again, maybe you could clarify
 your intended use case.

 It might be that you need to add more “vertical scale” (bigger box, more
 cores, more memory, beefier I/O and networking) to handle large queries, or
 maybe simple, Cassandra-style “horizontal scaling” (adding nodes) will be
 sufficient. Sure, you can tune Cassandra for single-node performance, but
 that seems lot a lot of extra work, to me, compared to adding more cheap
 nodes.

 -- Jack Krupansky

  *From:* Diane Griffith dfgriff...@gmail.com
 *Sent:* Thursday, July 17, 2014 9:31 AM
 *To:* user user@cassandra.apache.org
 *Subject:* Re: trouble showing cluster scalability for read performance

  Duncan,

 Thanks for that feedback.  I'll give a bit more info and then ask some
 more questions.

 *Our Goal*:  Not to produce the fastest read but show horizontal scaling.

  *Test procedure*:
 * Inserted 54M rows where one third of that represents a unique key, 18M
 keys.  End result given our schema is the 54M rows becomes 72M rows in the
 column family as the control query load to use.
 * have a client that queries 100k records in configurable batches, set to
 1k.  And then it does 100 reps of queries.  It doesn't do the same keys for
 each rep, it uses an offset and then it increases the keys to query.
 * We can adjust the hit rate, i.e. how many of the keys will be found but
 have been focused on 100% hit rate
 * we run the query where multiple clients can be spawned to do the same
 query cycle 100k keys but the offset is not different so each client will
 query the same keys.
 * We thought we should manually compact the tables down to 1 sstable on a
 given node for consistent results across different cluster sizes
 * We had set replication factor to 1 originally to not complicate things
 or impact initial write times even.  We would assess rf later was our
 thought.  Since we changed the keys getting queried it would have to hit
 additional nodes to get row data but for just 1 client thread (to get
 simplest path to show horizontal scaling, had a slight decrease of
 performance when going to 4 nodes from 2 nodes)

 Things seen off of given procedure and set up:


1. 1 client thread:  2 nodes do better than 1 node on the query test.
But 4 nodes did not do better than 2.
2. 2 client threads: 2 nodes were still doing better than 1 node
3. 10 client threads: the times drastically suffered and 2 nodes were
doing 1/2 the speed of 1 node but before 1 to 2 threads performed better on
2 nodes vs 1 node.  There was a huge decrease in performance on 2 nodes and
just a mild decrease on 1 node.

 Note: 50+ threads was also drastically falling apart.

 *Observations*:

- compacting each node to 1 table did not seem to help as running 10
client threads on exploded sstables and 2 nodes was 2x better than the last
2 node 10 client test but still decreased performance from 1 to 2 threads
query against compacted tables
- I would see upwards to 10 read requests pending at times while 8 to
10 were processing when I did nodetool tpstats.
- 

Re: trouble showing cluster scalability for read performance

2014-07-17 Thread Timo Ahokas
Hi Diane,

Sounds a bit like the client might be the limiting factor in your test -
not the server. Especially if you're using one single threaded client, you
might not be loading the backend in any significant way. Have you done any
vertical scaling tests (identical client, bigger server)? If the client is
indeed the limiting factor, then adding server capacity probably doesn't
gain you much. What sort of CPU/IO load do you have on the client/server
during your tests?

I might be barking up the wrong tree (we haven't done any load tests yet on
Cassandra), but when we load tested our clustered app, we used 3-10 client
machines (with multithreaded clients) against 3 app server nodes.

I would definitely first try to add more client load (multiple
clients/multithreading and/or client machines) and once you're actually
hitting the server properly, then add more server nodes.

Best regards,
Timo


On 17 July 2014 20:39, Diane Griffith dfgriff...@gmail.com wrote:

 Definitely not trying to show vertical scaling.  We have a query use case
 we are trying to show will scale as we add more nodes should performance
 fall below adequate.   But to show the scaling we do the test on a 1 node
 cluster, then 2 node cluster, then 4 node cluster with a goal that query
 throughput increases when adding more nodes.

 Basically we do not want to tune for single node performance and did want
 to prove out adding nodes works but for our query use case it hasn't yet.
  Our query size is a valid use case though for our need.

 Earlier it may not have been clear but we are not querying the same key
 over and over in one thread but continuously querying random non
 duplicating keys.  Bringing up the threading was not our main path or
 desired goal so I re-posted with clearer intent hopefully of our goal, what
 we experienced in the past against THRIFT and an older version of Cassandra
 which we have not been able to duplicate via CQL and Cassandra 2.0.6.

 So just hoping someone has suggestions of what one must do at a minimum to
 prove horizontal scaling or have suggestions of what to look at in our
 current datasize/query use case that may be causing us to not achieve
 horizontal scaling.

 Thanks,
 Diane




 On Thu, Jul 17, 2014 at 10:03 AM, Jack Krupansky j...@basetechnology.com
 wrote:

   It sounds as if you are actually testing “vertical scalability” (load
 on a single node) rather than Cassandra’s sweet spot of “horizontal
 scalability” (add more nodes to handle higher load.) Maybe you could
 clarify your intentions and specific use case.

 Also, it sounds like you are trying to focus on large queries, but
 Cassandra’s sweet spot is lots of smaller queries. With larger queries you
 can end up measuring things like the capabilities of your hardware, cpu
 cores, memory, I/O bandwidth, network latency, JVM configuration, etc.
 rather than measuring Cassandra per se. So, again, maybe you could clarify
 your intended use case.

 It might be that you need to add more “vertical scale” (bigger box, more
 cores, more memory, beefier I/O and networking) to handle large queries, or
 maybe simple, Cassandra-style “horizontal scaling” (adding nodes) will be
 sufficient. Sure, you can tune Cassandra for single-node performance, but
 that seems lot a lot of extra work, to me, compared to adding more cheap
 nodes.

 -- Jack Krupansky

  *From:* Diane Griffith dfgriff...@gmail.com
 *Sent:* Thursday, July 17, 2014 9:31 AM
 *To:* user user@cassandra.apache.org
 *Subject:* Re: trouble showing cluster scalability for read performance

  Duncan,

 Thanks for that feedback.  I'll give a bit more info and then ask some
 more questions.

 *Our Goal*:  Not to produce the fastest read but show horizontal scaling.

  *Test procedure*:
 * Inserted 54M rows where one third of that represents a unique key, 18M
 keys.  End result given our schema is the 54M rows becomes 72M rows in the
 column family as the control query load to use.
 * have a client that queries 100k records in configurable batches, set to
 1k.  And then it does 100 reps of queries.  It doesn't do the same keys for
 each rep, it uses an offset and then it increases the keys to query.
 * We can adjust the hit rate, i.e. how many of the keys will be found but
 have been focused on 100% hit rate
 * we run the query where multiple clients can be spawned to do the same
 query cycle 100k keys but the offset is not different so each client will
 query the same keys.
 * We thought we should manually compact the tables down to 1 sstable on a
 given node for consistent results across different cluster sizes
 * We had set replication factor to 1 originally to not complicate things
 or impact initial write times even.  We would assess rf later was our
 thought.  Since we changed the keys getting queried it would have to hit
 additional nodes to get row data but for just 1 client thread (to get
 simplest path to show horizontal scaling, had a slight decrease of
 performance when going to 4 nodes 

Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
How many partitions are you spreading those 18 million rows over? That many 
rows in a single partition will not be a sweet spot for Cassandra. It’s not 
exceeding any hard limit (2 billion), but some internal operations may cache 
the partition rather than the logical row.

And all those rows in a single partition would certainly not be a test of 
“horizontal scaling” (adding nodes to handle more data – more token values or 
partitions.)
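
To make the terminology concrete, only the first component of a CQL PRIMARY KEY
(the partition key) determines how data is spread around the ring; clustering
columns only order cells inside a partition.  A sketch of the distinction (names
are illustrative, not your actual schema):

CREATE TABLE lookup (
    row_key   text,      -- partition key: 18M distinct values => 18M partitions across the ring
    cell_name text,      -- clustering column: the ~3 cells per key live inside one partition
    value     blob,
    PRIMARY KEY (row_key, cell_name)
);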

-- Jack Krupansky

From: Diane Griffith 
Sent: Thursday, July 17, 2014 1:33 PM
To: user 
Subject: horizontal query scaling issues follow on

This is a follow on re-post to clarify what we are trying to do, providing 
information that was missing or not clear.



Goal:  Verify horizontal scaling for random non duplicating key reads using the 
simplest configuration (or minimal configuration) possible.



Background:

A couple years ago we did similar performance testing with Cassandra for both 
read and write performance and found excellent (essentially linear) horizontal 
scalability.  That project got put on hold.  We are now moving forward with an 
operational system and are having scaling problems.



During the prior testing (3 years ago) we were using a much older version of 
Cassandra (0.8 or older), the THRIFT API, and Amazon AWS rather than OpenStack 
VMs.  We are now using the latest Cassandra and the CQL interface.  We did try 
moving from OpenStack to AWS/EC2 but that did not materially change our (poor) 
results.



Test Procedure:

  a.. Inserted 54 million cells in 18 million rows (so 3 cells per row), using 
randomly generated row keys. That was to be our data control for the test. 
  b.. Spawn a client on a different VM to query 100k rows and do that for 100 
reps.  Each row key queried is drawn randomly from the set of existing row 
keys, and then not re-used, so all 10 million row queries use a different 
(valid) row key.  This test is a specific use case of our system we are trying 
to show will scale 
Result:

  a.. 2 nodes performed better than 1 node test but 4 nodes showed decreased 
performance over 2 nodes.  So that did not show horizontal scaling 


Notes:

  a.. We have replication factor set to 1 as we were trying to keep the control 
test simple to prove out horizontal scaling.  
  b.. When we tried to add threading to see if it would help it had interesting 
side behavior which did not prove out horizontal scaling. 
  c.. We are using CQL versus THRIFT API for Cassandra 2.0.6 




Does anyone have any feedback that either threading or replication factor is 
necessary to show horizontal scaling of Cassandra versus the minimal way of 
just continue to add nodes to help throughput?



Any suggestions of minimal configuration necessary to show scaling of our query 
use case 100k requests for random non repeating keys constantly coming in over 
a period of time?




Thanks,

Diane


Re: Index creation sometimes fails

2014-07-17 Thread Clint Kelly
Hi Tyler,

Thanks for replying.  This is good to know that I am not going crazy!  :)

I will post a JIRA, along with directions on how to get this to
happen.  The tricky thing, though, is that this doesn't always happen,
and I cannot reproduce it on my laptop or in a VM.

BTW you mean the datastax JIRA, correct?

Best regards,
Clint

On Wed, Jul 16, 2014 at 4:32 PM, Tyler Hobbs ty...@datastax.com wrote:

 On Tue, Jul 15, 2014 at 1:40 PM, Clint Kelly clint.ke...@gmail.com wrote:


 Is there some way to get the driver to block until the schema code has
 propagated everywhere?  My currently solution feels rather janky!


 The driver *should* be blocking until the schema has propagated already.  If
 it's not, that's a bug.  I would check the changelog and JIRA for related
 tickets, and if you don't find anything, open a new ticket with details and
 steps to repro: http://cassandra.apache.org/doc/cql3/CQL.html#batchStmt


 --
 Tyler Hobbs
 DataStax


Re: horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
So do partitions equate to tokens/vnodes?

If so we had configured all cluster nodes/vms with num_tokens: 256 instead
of setting initial_token and assigning ranges.  I am still not getting why, in
Cassandra 2.0, I would assign my own ranges via initial_token; this was
based on the documentation and even this blog item
http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2 that made
it seem right for us to always configure our cluster vms with num_tokens:
256 in the cassandra.yaml file.

Also in all testing, all vms were of equal sizing so one was not more
powerful than another.

I didn't think I was hitting an i/o wall on the client vm (separate vm)
where we command line scripted our query call to the cassandra cluster.
 I can break the client call load across vms which I tried early on.  Happy
to verify that again though.

So given that I was assuming the partitions were such that it wasn't a
problem.  Is that an incorrect assumption and something to dig into more?

Thanks,
Diane


On Thu, Jul 17, 2014 at 3:01 PM, Jack Krupansky j...@basetechnology.com
wrote:

   How many partitions are you spreading those 18 million rows over? That
 many rows in a single partition will not be a sweet spot for Cassandra.
 It’s not exceeding any hard limit (2 billion), but some internal operations
 may cache the partition rather than the logical row.

 And all those rows in a single partition would certainly not be a test of
 “horizontal scaling” (adding nodes to handle more data – more token values
 or partitions.)

 -- Jack Krupansky

  *From:* Diane Griffith dfgriff...@gmail.com
 *Sent:* Thursday, July 17, 2014 1:33 PM
 *To:* user user@cassandra.apache.org
 *Subject:* horizontal query scaling issues follow on


 This is a follow on re-post to clarify what we are trying to do, providing
 information that was missing or not clear.



 Goal:  Verify horizontal scaling for random non duplicating key reads
 using the simplest configuration (or minimal configuration) possible.



 Background:

 A couple years ago we did similar performance testing with Cassandra for
 both read and write performance and found excellent (essentially linear)
 horizontal scalability.  That project got put on hold.  We are now moving
 forward with an operational system and are having scaling problems.



 During the prior testing (3 years ago) we were using a much older version
 of Cassandra (0.8 or older), the THRIFT API, and Amazon AWS rather than
 OpenStack VMs.  We are now using the latest Cassandra and the CQL
 interface.  We did try moving from OpenStack to AWS/EC2 but that did not
 materially change our (poor) results.



 Test Procedure:

- Inserted 54 million cells in 18 million rows (so 3 cells per row),
using randomly generated row keys. That was to be our data control for the
test.
- Spawn a client on a different VM to query 100k rows and do that for
100 reps.  Each row key queried is drawn randomly from the set of existing
row keys, and then not re-used, so all 10 million row queries use a
different (valid) row key.  This test is a specific use case of our system
we are trying to show will scale

 Result:

- 2 nodes performed better than 1 node test but 4 nodes showed
decreased performance over 2 nodes.  So that did not show horizontal 
 scaling



 Notes:

- We have replication factor set to 1 as we were trying to keep the
control test simple to prove out horizontal scaling.
- When we tried to add threading to see if it would help it had
interesting side behavior which did not prove out horizontal scaling.
- We are using CQL versus THRIFT API for Cassandra 2.0.6





 Does anyone have any feedback that either threading or replication factor
 is necessary to show horizontal scaling of Cassandra versus the minimal way
 of just continue to add nodes to help throughput?



 Any suggestions of minimal configuration necessary to show scaling of our
 query use case 100k requests for random non repeating keys constantly
 coming in over a period of time?


 Thanks,

 Diane



Re: How to column slice with CQL + 1.2

2014-07-17 Thread Michael Dykman
The last term in this query is redundant.  Any time column1 = 1, we
may reasonably expect that it is also <= 2, as that's where 1 is found.
If you remove the last term, you eliminate the error and none of the
selection logic.

SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND
column3>4 AND column1<=2;

On Thu, Jul 17, 2014 at 6:23 PM, Mike Heffner m...@librato.com wrote:
 What is the proper way to perform a column slice using CQL with 1.2?

 I have a CF with a primary key X and 3 composite columns (A, B, C). I'd like
 to find records at:

  key=X
  columns > (A=1, B=3, C=4) AND
  columns <= (A=2)

  The Query:

  SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND column3>4 AND
  column1<=2;

 fails with:

 DoGetMeasures: column1 cannot be restricted by both an equal and an inequal
 relation

 This is against Cassandra 1.2.16.

 What is the proper way to perform this query?


 Cheers,

 Mike

 --

   Mike Heffner m...@librato.com
   Librato, Inc.




-- 
 - michael dykman
 - mdyk...@gmail.com

 May the Source be with you.


Re: horizontal query scaling issues follow on

2014-07-17 Thread Robert Coli
On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith dfgriff...@gmail.com
wrote:

 So do partitions equate to tokens/vnodes?


A partition is what used to be called a row.

Each individual token in the token ring can contain a partition, which you
request using the token as the key.

A token range is the space between two tokens.
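
(You can see the key-to-token mapping directly from cqlsh; assuming a table named
mytable with partition key row_key, both names illustrative:

SELECT token(row_key), row_key FROM mytable LIMIT 5;

shows the token each partition key hashes to under the cluster's partitioner.)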


 If so we had configured all cluster nodes/vms with num_tokens: 256 instead
 of setting init_token and assigning ranges.  I am still not getting why in
 Cassandra 2.0, I would assign my own ranges via init_token and this was
 based on the documentation and even this blog item
 http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2 that
 made it seem right for us to always configure our cluster vms with
 num_tokens: 256 in the cassandra.yaml file.


If you are using vnodes and don't want to try to figure out what ideally
random token ranges for them are, you should, generally :

1) start the node with num_tokens set to a value greater than 1
2) once succesffully bootstrapped, dump all node tokens with :

nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,

3) put list from 2) in initial_token list in cassandra.yaml
4) (optional) restart and verify that your node has the tokens you expect

So given that I was assuming the partitions were such that it wasn't a
 problem.  Is that an incorrect assumption and something to dig into more?


How many client threads do you have? Your OP suggested a low number, which
will not have good results in terms of throughput.

=Rob


Re: How to column slice with CQL + 1.2

2014-07-17 Thread Mike Heffner
Michael,

So if I switch to:

SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND column3>4

That doesn't include rows where column1=2, which breaks the original slice
query.

Maybe a better way to put it, I would like:

SELECT * FROM CF WHERE key='X' AND column1>=1 AND column2>=3 AND column3>4
AND column1<=2;

but that is rejected with:

Bad Request: PRIMARY KEY part column2 cannot be restricted (preceding part
column1 is either not restricted or by a non-EQ relation)


Mike



On Thu, Jul 17, 2014 at 6:37 PM, Michael Dykman mdyk...@gmail.com wrote:

 The last term in this query is redundant.  Any time column1 = 1, we
 may reasonably expect that it is also = 2 as that's where 1 is found.
 If you remove the last term, you elimiate the error and non of the
 selection logic.

 SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND
 column34 AND column1=2;

 On Thu, Jul 17, 2014 at 6:23 PM, Mike Heffner m...@librato.com wrote:
  What is the proper way to perform a column slice using CQL with 1.2?
 
  I have a CF with a primary key X and 3 composite columns (A, B, C). I'd
 like
  to find records at:
 
  key=X
  columns > (A=1, B=3, C=4) AND
 columns <= (A=2)
 
  The Query:
 
  SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND column3>4
 AND
  column1<=2;
 
  fails with:
 
  DoGetMeasures: column1 cannot be restricted by both an equal and an
 inequal
  relation
 
  This is against Cassandra 1.2.16.
 
  What is the proper way to perform this query?
 
 
  Cheers,
 
  Mike
 
  --
 
Mike Heffner m...@librato.com
Librato, Inc.
 



 --
  - michael dykman
  - mdyk...@gmail.com

  May the Source be with you.




-- 

  Mike Heffner m...@librato.com
  Librato, Inc.


How to maintain the N-most-recent versions of a value?

2014-07-17 Thread Clint Kelly
Hi everyone,

I am trying to design a schema that will keep the N-most-recent
versions of a value.  Currently my table looks like the following:

CREATE TABLE foo (
  rowkey text,
  family text,
  qualifier text,
  version bigint,
  value blob,
  PRIMARY KEY (rowkey, family, qualifier, version))
WITH CLUSTERING ORDER BY (family ASC, qualifier ASC, version DESC);

Is there any standard design pattern for updating such a layout such
that I keep the N-most-recent (version, value) pairs for every unique
(rowkey, family, qualifier)?  I can't think of any way to do this
without doing a read-modify-write.  The best thing I can think of is
to use TTL to approximate the desired behavior (which will work if I
know how often we are writing new data to the table).  I could also
use LIMIT N in my queries to limit myself to only N items, but that
does not address any of the storage-size issues.
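For concreteness, here is roughly what I mean by those two workarounds
(illustrative names and values, with N = 5):

-- Read path: the clustering order already returns the newest versions
-- first, so LIMIT alone bounds the result to the N most recent.
SELECT version, value FROM foo
WHERE rowkey = 'row1' AND family = 'fam1' AND qualifier = 'qual1'
LIMIT 5;

-- Write path: TTL only approximates "keep N" when the write rate is known
-- (e.g. roughly N writes per day => a TTL of about one day).
INSERT INTO foo (rowkey, family, qualifier, version, value)
VALUES ('row1', 'fam1', 'qual1', 1405641600000, 0x00)
USING TTL 86400;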

In case anyone is curious, this question is related to some work that
I am doing translating a system built on HBase (which provides this
keep the N-most-recent-version-of-a-cell behavior) to Cassandra
while providing the user with as-similar-as-possible an interface.

Best regards,
Clint


Re: horizontal query scaling issues follow on

2014-07-17 Thread Diane Griffith
So I stripped out the information about the number-of-clients experiments.
It is unclear whether I can only show horizontal scaling by also spawning
many client requests all working at once, so I removed that information to
distill what our original attempt at showing horizontal scaling was.

I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying.
 Performance on 2 nodes starts to degrade from 10 clients on.  I saw
similar behavior on 4 nodes but haven't done the official runs on that yet.


When I tried to grab the list of tokens assigned and populate it in the
cassandra.yaml I never got it right.

I basically did the command and it was outputting 256 tokens on each node
and comma separated.  So I tried taking that string and setting that as the
value to initial_token but the node wouldn't start up.

Not sure if I maybe had a carriage return in there and that was the problem.

And if I do that do I need to do more than comment out num_tokens?

Thanks,
Diane




On Thu, Jul 17, 2014 at 6:58 PM, Robert Coli rc...@eventbrite.com wrote:

 On Thu, Jul 17, 2014 at 3:21 PM, Diane Griffith dfgriff...@gmail.com
 wrote:

 So do partitions equate to tokens/vnodes?


 A partition is what used to be called a row.

 Each individual token in the token ring can contain a partition, which you
 request using the token as the key.

 A token range is the space between two tokens.


 If so, we had configured all cluster nodes/VMs with num_tokens: 256
  instead of setting initial_token and assigning ranges.  I am still not
  getting why, in Cassandra 2.0, I would assign my own ranges via
  initial_token; our setup was based on the documentation and even this blog
  item http://www.datastax.com/dev/blog/virtual-nodes-in-cassandra-1-2, which
  made it seem right for us to always configure our cluster VMs with
  num_tokens: 256 in the cassandra.yaml file.


 If you are using vnodes and don't want to try to figure out ideally random
  token ranges for them yourself, you should, generally:

  1) start the node with num_tokens set to a value greater than 1
  2) once successfully bootstrapped, dump all node tokens with:

  nodetool info -T | grep Token | awk '{print $3}' | paste -s -d,

  3) put the list from 2) in the initial_token setting in cassandra.yaml
 4) (optional) restart and verify that your node has the tokens you expect

 So given that I was assuming the partitions were such that it wasn't a
 problem.  Is that an incorrect assumption and something to dig into more?


 How many client threads do you have? Your OP suggested a low number, which
  will not have good results in terms of throughput.

 =Rob




Re: How to column slice with CQL + 1.2

2014-07-17 Thread Tyler Hobbs
For this type of query, you really want the tuple notation introduced in
2.0.6 (https://issues.apache.org/jira/browse/CASSANDRA-4851):

SELECT * FROM CF WHERE key='X' AND (column1, column2, column3) > (1, 3, 4)
AND (column1) <= (2)
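On 1.2.x, where the tuple notation isn't available, one workaround (a sketch,
not the only option) is to split the slice into a few EQ-prefixed queries and
merge the results client-side:

SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND column3>4;
SELECT * FROM CF WHERE key='X' AND column1=1 AND column2>3;
SELECT * FROM CF WHERE key='X' AND column1=2;

Together those cover (column1, column2, column3) > (1, 3, 4) up through
column1 <= 2, at the cost of extra round trips.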


On Thu, Jul 17, 2014 at 6:01 PM, Mike Heffner m...@librato.com wrote:

 Michael,

 So if I switch to:

 SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND column3>4

 That doesn't include rows where column1=2, which breaks the original slice
 query.

 Maybe a better way to put it, I would like:

 SELECT * FROM CF WHERE key='X' AND column1>=1 AND column2>=3 AND column3>4
 AND column1<=2;

 but that is rejected with:

 Bad Request: PRIMARY KEY part column2 cannot be restricted (preceding part
 column1 is either not restricted or by a non-EQ relation)


 Mike



 On Thu, Jul 17, 2014 at 6:37 PM, Michael Dykman mdyk...@gmail.com wrote:

 The last term in this query is redundant.  Any time column1 = 1, we
 may reasonably expect that it is also <= 2 as that's where 1 is found.
 If you remove the last term, you eliminate the error and none of the
 selection logic.

 SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND
 column3>4 AND column1<=2;

 On Thu, Jul 17, 2014 at 6:23 PM, Mike Heffner m...@librato.com wrote:
  What is the proper way to perform a column slice using CQL with 1.2?
 
  I have a CF with a primary key X and 3 composite columns (A, B, C). I'd
 like
  to find records at:
 
  key=X
  columns > (A=1, B=3, C=4) AND
 columns <= (A=2)
 
  The Query:
 
  SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND
 column3>4 AND
  column1<=2;
 
  fails with:
 
  DoGetMeasures: column1 cannot be restricted by both an equal and an
 inequal
  relation
 
  This is against Cassandra 1.2.16.
 
  What is the proper way to perform this query?
 
 
  Cheers,
 
  Mike
 
  --
 
Mike Heffner m...@librato.com
Librato, Inc.
 



 --
  - michael dykman
  - mdyk...@gmail.com

  May the Source be with you.




 --

   Mike Heffner m...@librato.com
   Librato, Inc.




-- 
Tyler Hobbs
DataStax http://datastax.com/


Re: horizontal query scaling issues follow on

2014-07-17 Thread Robert Coli
On Thu, Jul 17, 2014 at 5:16 PM, Diane Griffith dfgriff...@gmail.com
wrote:

 I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying.
  Performance on 2 nodes starts to degrade from 10 clients on.  I saw
 similar behavior on 4 nodes but haven't done the official runs on that yet.



Ok, if you've multi-threaded your client, then you aren't starving for
client thread parallelism, and that rules out another scalability
bottleneck.

As a brief aside, you only lose from vnodes until your cluster is larger
than a certain size, and then only when adding or removing nodes from a
cluster. Perhaps if you are ramping up and scientifically testing smaller
cluster sizes, you should start at first with a single token (and hence a
single range) per node, i.e. pre-vnodes operation?

I basically did the command and it was outputting 256 tokens on each node
 and comma separated.  So I tried taking that string and setting that as the
 value to initial_token but the node wouldn't start up.

 Not sure if I maybe had a carriage return in there and that was the
 problem.


It should take a comma-delimited list of tokens; did the failed node log any
error during startup?


 And if I do that do I need to do more than comment out num_tokens?


No, though you probably should anyway in order to be unambiguous.

=Rob


Re: How to maintain the N-most-recent versions of a value?

2014-07-17 Thread Chris Lohfink
I would say that would work, but since you are already familiar with the HBase 
storage model and are trying to emulate it, you may want to look into the Thrift 
interfaces.  They are a little more similar to the HBase interface (not as 
friendly to use, and you can't use the very useful new client libraries from 
DataStax) and access storage more directly, which is similar to HBase's. You 
keep your column family foo, then just use a composite column to store family, 
qualifier, and version in the column name, with the column's value being the 
value.  The row key is your row key.

---
Chris Lohfink


On Jul 17, 2014, at 6:32 PM, Clint Kelly clint.ke...@gmail.com wrote:

 Hi everyone,
 
 I am trying to design a schema that will keep the N-most-recent
 versions of a value.  Currently my table looks like the following:
 
  CREATE TABLE foo (
 rowkey text,
 family text,
 qualifier text,
 version bigint,
 value blob,
 PRIMARY KEY (rowkey, family, qualifier, version))
  WITH CLUSTERING ORDER BY (family ASC, qualifier ASC, version DESC);
 
 Is there any standard design pattern for updating such a layout such
 that I keep the N-most-recent (version, value) pairs for every unique
 (rowkey, family, qualifier)?  I can't think of any way to do this
 without doing a read-modify-write.  The best thing I can think of is
 to use TTL to approximate the desired behavior (which will work if I
 know how often we are writing new data to the table).  I could also
 use LIMIT N in my queries to limit myself to only N items, but that
 does not address any of the storage-size issues.
 
 In case anyone is curious, this question is related to some work that
 I am doing translating a system built on HBase (which provides this
 keep the N-most-recent-version-of-a-cell behavior) to Cassandra
 while providing the user with as-similar-as-possible an interface.
 
 Best regards,
 Clint



DataType protocol ID error for TIMESTAMPs when upgrading from 1.2.11 to 2.0.9

2014-07-17 Thread Karl Rieb
Hi,

I've been testing an in-place upgrade of a 1.2.11 cluster to 2.0.9.  The
1.2.11 nodes all have a schema defined through CQL with existing data
before I perform the rolling upgrade.  While the upgrade is in progress,
services are continuing to read and write data to the cluster (strictly
using protocol version 1).  I drain each node one at a time, upgrade the
configuration files, upgrade cassandra, then start the node back up.  The
cassandra logs show no errors or exceptions during startup and appear to
join properly with the other nodes in the cluster.

On our service side, everything goes smoothly except for queries against a
few of our tables.  On some of the tables with timestamp columns (not all),
we will get an error from the Datastax java-driver when binding
PreparedStatements or trying to process ResultSets:

com.datastax.driver.core.exceptions.InvalidTypeException: Invalid type for
value 2 of CQL type 'org.apache.cassandra.db.marshal.DateType', expecting
class java.nio.ByteBuffer but class java.util.Date provided
at com.datastax.driver.core.BoundStatement.bind(BoundStatement.java:190)
at
com.datastax.driver.core.DefaultPreparedStatement.bind(DefaultPreparedStatement.java:103)


I traced the code on the driver side, and I see it has to do with bad
DataType information coming back from a table metadata query.  The 2.0.9
nodes will return protocol ID 0 instead of 11 for some timestamp column
definitions.  The protocol ID 0 maps to a custom type, and the 2.0.9
nodes specify org.apache.cassandra.db.marshal.DateType as the custom type
name.  The 1.2.11 nodes, however, continue to send 11 for their protocol
ID, which gets properly mapped to the timestamp data type.

Strangely not all our tables with timestamp columns have this issue.

If I bring up an entirely new 2.0.9 cluster (no existing data), and
provision our schema, then there are no issues.  The proper protocol ID,
11, gets sent for all our tables with timestamp columns.

I have tried doing nodetool upgradesstables and nodetool scrub on the
nodes, but neither fixes the issue.
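In case it helps with diagnosis: the comparison I can make from cqlsh
(illustrative keyspace/table names, and assuming the system schema tables in
both versions expose the stored validator for each column) is:

SELECT column_name, validator
FROM system.schema_columns
WHERE keyspace_name = 'my_ks' AND columnfamily_name = 'my_table';

and then checking whether a 1.2.11 node and an upgraded 2.0.9 node report the
same validator class for the affected timestamp columns.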

Any suggestions on what is going on or how to fix it?


Re: Connection reset by peer error

2014-07-17 Thread Jacob Rhoden
The information about how the servers are connected is important, because we 
have exactly these types of situations in some of our applications (not using 
Cassandra) when firewall administrators/configurators get “creative” about 
“enhancing” security. Other things can cause this type of situation, but in my 
limited experience, I’ve only ever seen it caused by the firewall.

Best regards,
Jacob

On 1 Jul 2014, at 12:55 pm, cass savy casss...@gmail.com wrote:
 The app and Cassandra are connected via firewall. For some reason, 
 connections are still remaining on Cassandra side even after stopping 
 services on app server.
 
 On Mon, Jun 30, 2014 at 3:29 PM, Jacob Rhoden jacob.rho...@me.com wrote:
 How are the two machines connected? Direct cable? Via a hub, router, 
 firewall, wan?
 
 On 1 Jul 2014, at 6:01 am, cass savy casss...@gmail.com wrote:
 We use Datastax Java driver version 1.0.6. The application is running into 
 issues connecting to the 3-node cluster. What is the cause of it? The 
 application is not able to establish a connection at all. I see this error 
 intermittently, a few times every other day.
 
 
 Is the issue related to read/write timeouts? Do I need to increase *timeout* 
 values in the yaml?
 APP logs
 2014-06-27 17:33:47
 
 Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.5-b02 mixed mode):
 
 "RMI TCP Connection(105)-10.198.49.16" - Thread t@247
    java.lang.Thread.State: RUNNABLE
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:150)
 at java.net.SocketInputStream.read(SocketInputStream.java:121)
 at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
 - locked <68d37818> (a java.io.BufferedInputStream)
 at java.io.FilterInputStream.read(FilterInputStream.java:83)
 at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
 at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
 at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:722)
 
 
 
 
 
 All I see in the Cassandra logs:
 ERROR [Native-Transport-Requests:2704] 2014-06-27 16:33:23,339 
 ErrorMessage.java (line 210) Unexpected exception during request
 java.io.IOException: Connection reset by peer
 at sun.nio.ch.FileDispatcher.read0(Native Method)
 at sun.nio.ch.SocketDispatcher.read(Unknown Source)
 at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
 at sun.nio.ch.IOUtil.read(Unknown Source)
 at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
 at 
 org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:59)
 at 
 org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:472)
 at 
 org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:333)
 at 
 org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown 
 Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
 
 
 





Re: horizontal query scaling issues follow on

2014-07-17 Thread Jack Krupansky
Sorry I may have confused the discussion by mentioning tokens – I wasn’t 
intending to refer to vnodes or the num_tokens property, but merely referring 
to the token range of a node and that the partition key hashes to a token value.

The main question is what you use for your primary key and whether you are 
using a small number of partition keys and a large number of clustering 
columns, or whether each row has a unique partition key and no clustering columns.
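To illustrate the two extremes (hypothetical table definitions, not your 
schema):

-- One partition per logical row: partition keys hash all over the ring,
-- so reads naturally spread across nodes as the cluster grows.
CREATE TABLE by_unique_key (
  k  text PRIMARY KEY,
  v1 text,
  v2 text,
  v3 text
);

-- A few partition keys with many clustering rows each: most reads land on
-- the handful of nodes owning those few partitions, whatever the cluster size.
CREATE TABLE by_bucket (
  bucket text,
  k      text,
  v1     text,
  v2     text,
  v3     text,
  PRIMARY KEY (bucket, k)
);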

-- Jack Krupansky

From: Diane Griffith 
Sent: Thursday, July 17, 2014 6:21 PM
To: user 
Subject: Re: horizontal query scaling issues follow on

So do partitions equate to tokens/vnodes? 

If so, we had configured all cluster nodes/VMs with num_tokens: 256 instead of 
setting initial_token and assigning ranges.  I am still not getting why, in 
Cassandra 2.0, I would assign my own ranges via initial_token; our setup was based 
on the documentation and even this blog item that made it seem right for us to 
always configure our cluster VMs with num_tokens: 256 in the cassandra.yaml 
file.  

Also in all testing, all vms were of equal sizing so one was not more powerful 
than another.  

I didn't think I was hitting an I/O wall on the client VM (a separate VM) from 
which we script our query calls to the Cassandra cluster on the command line.  I 
can break the client call load across VMs, which I tried early on.  Happy to 
verify that again though.

So given that, I was assuming the partitions were such that it wasn't a problem. 
 Is that an incorrect assumption and something to dig into more?

Thanks,
Diane



On Thu, Jul 17, 2014 at 3:01 PM, Jack Krupansky j...@basetechnology.com wrote:

  How many partitions are you spreading those 18 million rows over? That many 
rows in a single partition will not be a sweet spot for Cassandra. It’s not 
exceeding any hard limit (2 billion), but some internal operations may cache 
the partition rather than the logical row.

  And all those rows in a single partition would certainly not be a test of 
“horizontal scaling” (adding nodes to handle more data – more token values or 
partitions.)

  -- Jack Krupansky

  From: Diane Griffith 
  Sent: Thursday, July 17, 2014 1:33 PM
  To: user 
  Subject: horizontal query scaling issues follow on

  This is a follow on re-post to clarify what we are trying to do, providing 
information that was missing or not clear.



  Goal:  Verify horizontal scaling for random non duplicating key reads using 
the simplest configuration (or minimal configuration) possible.



  Background:

  A couple years ago we did similar performance testing with Cassandra for both 
read and write performance and found excellent (essentially linear) horizontal 
scalability.  That project got put on hold.  We are now moving forward with an 
operational system and are having scaling problems.



  During the prior testing (3 years ago) we were using a much older version of 
Cassandra (0.8 or older), the THRIFT API, and Amazon AWS rather than OpenStack 
VMs.  We are now using the latest Cassandra and the CQL interface.  We did try 
moving from OpenStack to AWS/EC2 but that did not materially change our (poor) 
results.



  Test Procedure:

 - Inserted 54 million cells in 18 million rows (so 3 cells per row), 
 using randomly generated row keys. That was to be our data control for the 
 test. 
 - Spawn a client on a different VM to query 100k rows and do that for 100 
 reps.  Each row key queried is drawn randomly from the set of existing row 
 keys, and then not re-used, so all 10 million row queries use a different 
 (valid) row key.  This test is a specific use case of our system that we are 
 trying to show will scale. 

  Result:

 - 2 nodes performed better than the 1-node test, but 4 nodes showed decreased 
 performance over 2 nodes.  So that did not show horizontal scaling. 


  Notes:

 - We have the replication factor set to 1, as we were trying to keep the 
 control test simple to prove out horizontal scaling.  
 - When we tried to add threading to see if it would help, it had 
 interesting side behavior which did not prove out horizontal scaling. 
 - We are using CQL rather than the Thrift API, with Cassandra 2.0.6. 




  Does anyone have any feedback on whether threading or a higher replication 
 factor is necessary to show horizontal scaling of Cassandra, versus the minimal 
 approach of just continuing to add nodes to improve throughput?



  Any suggestions on the minimal configuration necessary to show scaling for our 
 query use case: 100k requests for random, non-repeating keys constantly coming in 
 over a period of time?




  Thanks,

  Diane



Re: horizontal query scaling issues follow on

2014-07-17 Thread Jonathan Haddad
The problem with starting without vnodes is that moving to them is a bit
hairy.  In particular, nodetool shuffle has been reported to take an
extremely long time (days, weeks).  I would start with vnodes if you
have any intent of using them.

On Thu, Jul 17, 2014 at 6:03 PM, Robert Coli rc...@eventbrite.com wrote:
 On Thu, Jul 17, 2014 at 5:16 PM, Diane Griffith dfgriff...@gmail.com
 wrote:

 I did tests comparing 1, 2, 10, 20, 50, 100 clients spawned all querying.
 Performance on 2 nodes starts to degrade from 10 clients on.  I saw similar
 behavior on 4 nodes but haven't done the official runs on that yet.


 Ok, if you've multi-threaded your client, then you aren't starving for
  client thread parallelism, and that rules out another scalability
  bottleneck.

  As a brief aside, you only lose from vnodes until your cluster is larger
  than a certain size, and then only when adding or removing nodes from a
  cluster. Perhaps if you are ramping up and scientifically testing smaller
  cluster sizes, you should start at first with a single token (and hence a
  single range) per node, i.e. pre-vnodes operation?

 I basically did the command and it was outputting 256 tokens on each node
 and comma separated.  So I tried taking that string and setting that as the
 value to initial_token but the node wouldn't start up.

 Not sure if I maybe had a carriage return in there and that was the
 problem.


 It should take a comma-delimited list of tokens; did the failed node log any
  error during startup?


 And if I do that do I need to do more than comment out num_tokens?


 No, though you probably should anyway in order to be unambiguous.

 =Rob




-- 
Jon Haddad
http://www.rustyrazorblade.com
skype: rustyrazorblade


Re: C* 2.1-rc2 gets unstable after a 'DROP KEYSPACE' command ?

2014-07-17 Thread Fabrice Larcher
Hello,

I still experience a similar issue after a 'DROP KEYSPACE' command with C*
2.1-rc3. Connection to the node may fail after a 'DROP'.

But I did not see this issue with 2.1-rc1 (it seems to be a regression
introduced with 2.1-rc2).

Fabrice LARCHER


2014-07-17 9:19 GMT+02:00 Benedict Elliott Smith belliottsm...@datastax.com:

 Also https://issues.apache.org/jira/browse/CASSANDRA-7437 and
 https://issues.apache.org/jira/browse/CASSANDRA-7465 for rc3, although
 the CounterCacheKey assertion looks like an independent (though
 comparatively benign) bug I will file a ticket for.

 Can you try this against rc3 to see if the problem persists? You may see
 the last exception, but it shouldn't affect the stability of the cluster.
 If either of the other exceptions persist, please file a ticket.


 On Thu, Jul 17, 2014 at 1:41 AM, Tyler Hobbs ty...@datastax.com wrote:

 This looks like https://issues.apache.org/jira/browse/CASSANDRA-6959,
 but that was fixed for 2.1.0-rc1.

 Is there any chance you can put together a script to reproduce the issue?


 On Thu, Jul 10, 2014 at 8:51 AM, Pavel Kogan pavel.ko...@cortica.com
 wrote:

 It seems that memtable tries to flush itself to SSTable of not existing
 keyspace. I don't know why it is happens, but probably running nodetool
 flush before drop should prevent this issue.

 Pavel


 On Thu, Jul 10, 2014 at 4:09 AM, Fabrice Larcher 
 fabrice.larc...@level5.fr wrote:

 ​Hello,

 I am using the 'development' version 2.1-rc2.

 With one node (=localhost), I get timeouts trying to connect to C*
 after running a 'DROP KEYSPACE' command. I have following error messages in
 system.log :

 INFO  [SharedPool-Worker-3] 2014-07-09 16:29:36,578
 MigrationManager.java:319 - Drop Keyspace 'test_main'
 (...)
 ERROR [MemtableFlushWriter:6] 2014-07-09 16:29:37,178
 CassandraDaemon.java:166 - Exception in thread
 Thread[MemtableFlushWriter:6,5,main]
 java.lang.RuntimeException: Last written key
 DecoratedKey(91e7f660-076f-11e4-a36d-28d2444c0b1b,
 52446dde90244ca49789b41671e4ca7c) = current key
 DecoratedKey(91e7f660-076f-11e4-a36d-28d2444c0b1b,
 52446dde90244ca49789b41671e4ca7c) writing into
 ./../data/data/test_main/user-911d5360076f11e4812d3d4ba97474ac/test_main-user.user_account-tmp-ka-1-Data.db
 at
 org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:172)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:215)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:351)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:314)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
 ~[guava-16.0.jar:na]
 at
 org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1054)
 ~[apache-cassandra-2.1.0-rc2.jar:2.1.0-rc2]
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 ~[na:1.7.0_55]
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 ~[na:1.7.0_55]
 at java.lang.Thread.run(Thread.java:744) ~[na:1.7.0_55]

 Then, I can not connect to the Cluster anymore from my app (Java Driver
 2.1-SNAPSHOT) and got in application logs :

 com.datastax.driver.core.exceptions.NoHostAvailableException: All
 host(s) tried for query failed (tried: /127.0.0.1:9042
 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
 at
 com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
 at
 com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:258)
 at
 com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:174)
 at
 com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
 at
 com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:36)
 (...)
 Caused by:
 com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
 tried for query failed (tried: /127.0.0.1:9042
 (com.datastax.driver.core.exceptions.DriverException: Timeout during read))
 at
 com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
 at
 com.datastax.driver.core.RequestHandler$1.run(RequestHandler.java:175)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 

Re: How to maintain the N-most-recent versions of a value?

2014-07-17 Thread DuyHai Doan
In C* 2.1, the new row cache implementation can keep the N most recent rows
of a partition in memory; it might be of interest for you:
http://www.datastax.com/dev/blog/row-caching-in-cassandra-2-1
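A sketch of how that would be enabled on your table (2.1 syntax, illustrative
row count, and assuming the row cache itself is enabled via
row_cache_size_in_mb in cassandra.yaml):

ALTER TABLE foo
WITH caching = {'keys': 'ALL', 'rows_per_partition': '10'};

With your version DESC clustering order, the cached head of each partition is
then its 10 most recent versions.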


On Fri, Jul 18, 2014 at 3:39 AM, Chris Lohfink clohf...@blackbirdit.com
wrote:

 I would say that would work, but since you are already familiar with the
  HBase storage model and are trying to emulate it, you may want to look into
  the Thrift interfaces.  They are a little more similar to the HBase interface
  (not as friendly to use, and you can't use the very useful new client
  libraries from DataStax) and access storage more directly, which is similar
  to HBase's. You keep your column family foo, then just use a composite column
  to store family, qualifier, and version in the column name, with the column's
  value being the value.  The row key is your row key.

 ---
 Chris Lohfink


 On Jul 17, 2014, at 6:32 PM, Clint Kelly clint.ke...@gmail.com wrote:

  Hi everyone,
 
  I am trying to design a schema that will keep the N-most-recent
  versions of a value.  Currently my table looks like the following:
 
  CREATE TABLE foo (
  rowkey text,
  family text,
  qualifier text,
  version bigint,
  value blob,
  PRIMARY KEY (rowkey, family, qualifier, version))
   WITH CLUSTERING ORDER BY (family ASC, qualifier ASC, version DESC);
 
  Is there any standard design pattern for updating such a layout such
  that I keep the N-most-recent (version, value) pairs for every unique
  (rowkey, family, qualifier)?  I can't think of any way to do this
  without doing a read-modify-write.  The best thing I can think of is
  to use TTL to approximate the desired behavior (which will work if I
  know how often we are writing new data to the table).  I could also
  use LIMIT N in my queries to limit myself to only N items, but that
  does not address any of the storage-size issues.
 
  In case anyone is curious, this question is related to some work that
  I am doing translating a system built on HBase (which provides this
  keep the N-most-recent-version-of-a-cell behavior) to Cassandra
  while providing the user with as-similar-as-possible an interface.
 
  Best regards,
  Clint