Re: counters + replication = awful performance?

2012-11-28 Thread Sylvain Lebresne
Counters replication works differently than that of normal
writes. Namely, a counter update is written to a first replica, then a read
is performed and the result of that is replicated to the other nodes. With
RF=1, since there is only one replica, no read is involved, but in a way it's
a degenerate case. So there are two reasons why RF>1 is much slower than RF=1:
1) it involves a read to replicate, and that read takes time. Especially if
that read hits the disk, it may even dominate the insertion time.
2) the replication to the first replica and the one to the rest of the
replicas are not done in parallel but sequentially. Note that this is only
true for the first replica versus the others. In other words, from RF=2 to
RF=3 you should not see a significant performance degradation.

Note that while there is nothing you can do for 2), you can try to speed up
1) by using row cache for instance (in case you weren't).
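
As a toy sketch of that path (illustrative Java only, not Cassandra's actual
internals), the flow is roughly:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Toy model of the replicate-on-write counter path described above.
    public class CounterWriteSketch {
        static class Replica {
            private final Map<String, Long> counters = new HashMap<String, Long>();
            void applyDelta(String key, long delta) {
                Long cur = counters.get(key);
                counters.put(key, (cur == null ? 0L : cur) + delta);
            }
            long read(String key) {                 // the read that may hit disk
                Long cur = counters.get(key);
                return cur == null ? 0L : cur;
            }
            void overwrite(String key, long total) { counters.put(key, total); }
        }

        static void increment(String key, long delta, List<Replica> replicas) {
            Replica first = replicas.get(0);
            first.applyDelta(key, delta);           // 1. write to the first replica
            if (replicas.size() == 1) {
                return;                             // RF=1: no read, no second hop
            }
            long total = first.read(key);           // 2. read back the current value
            for (Replica other : replicas.subList(1, replicas.size())) {
                other.overwrite(key, total);        // 3. ship the result to the others
            }                                       //    (step 3 can fan out in parallel;
        }                                           //    only the 1 -> 2/3 hop is serial)
    }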

In other words, with counters, it is expected that RF=1 be potentially much
faster than RF>1. That is the way counters work.

And don't get me wrong, I'm not suggesting you should use RF=1 at all. What
I am saying is that the performance you see with RF=2 is the performance of
counters in Cassandra.

--
Sylvain


On Wed, Nov 28, 2012 at 7:34 AM, Sergey Olefir solf.li...@gmail.com wrote:

 I think there might be a misunderstanding as to the nature of the problem.

 Say, I have test set T. And I have two identical servers A and B.
 - I tested that server A (singly) is able to handle load of T.
 - I tested that server B (singly) is able to handle load of T.
 - I then join A and B in the cluster and set replication=2 -- this means
 that each server in effect has to handle full test load individually
 (because there are two servers and replication=2 it means that each server
 effectively has to handle all the data written to the cluster). Under these
 circumstances it is reasonable to assume that cluster A+B shall be able to
 handle load T because each server is able to do so individually.

 HOWEVER, this is not the case. In fact, A+B together are only able to
 handle
 less than 1/3 of T DESPITE the fact that A and B individually are able to
 handle T just fine.

 I think there's something wrong with Cassandra replication (possibly as
 simple as me misconfiguring something) -- it shouldn't be three times
 faster
 to write to two separate nodes in parallel as compared to writing to a 2-node
 Cassandra cluster with replication=2.


 Edward Capriolo wrote
  Say you are doing 100 inserts at rf=1 on two nodes. That is 50 inserts a
  node. If you go to rf=2 that is 100 inserts a node.  If you were at 75%
  capacity on each node you're now at 150%, which is not possible, so things
  bog down.
 
  To figure out what is going on we would need to see tpstats, iostat, and
  top information.
 
  I think you're looking at the performance the wrong way. Starting off at rf
  1 is not the way to understand cassandra performance.
 
  The benefits of scale-out don't happen until you fix your rf and increase
  your node count. I.e. 5 nodes at rf 3 is fast, 10 nodes at rf 3 even
  better.
  On Tuesday, November 27, 2012, Sergey Olefir <solf.lists@> wrote:
  I already do a lot of in-memory aggregation before writing to Cassandra.
 
  The question here is what is wrong with Cassandra (or its configuration)
  that causes huge performance drop when moving from 1-replication to
  2-replication for counters -- and more importantly how to resolve the
  problem. 2x-3x drop when moving from 1-replication to 2-replication on
  two
  nodes is reasonable. 6x is not. Like I said, with this kind of
  performance
  degradation it makes more sense to run two clusters with replication=1
 in
  parallel rather than rely on Cassandra replication.
 
  And yes, Rainbird was the inspiration for what we are trying to do here
  :)
 
 
 
  Edward Capriolo wrote
  Cassandra's counters read on increment. Additionally they are
  distributed,
  so there can be multiple reads on increment. If they are not fast enough
  and
  you have avoided all tuning options add more servers to handle the
 load.
 
  In many cases incrementing the same counter n times can be avoided.
 
  Twitter's rainbird did just that. It avoided multiple counter
 increments
  by
  batching them.
 
  I have done a similar thing using cassandra and Kafka.
 
 
 
 https://github.com/edwardcapriolo/IronCount/blob/master/src/test/java/com/jointhegrid/ironcount/mockingbird/MockingBirdMessageHandler.java
 
 
   On Tuesday, November 27, 2012, Sergey Olefir <solf.lists@> wrote:
  Hi, thanks for your suggestions.
 
  Regarding replicate=2 vs replicate=1 performance: I expected that the
  configurations below would have similar performance:
  - single node, replicate = 1
  - two nodes, replicate = 2 (okay, this probably should be a bit slower
  due
  to additional overhead).
 
  However what I'm seeing is that the second option (replicate=2) is about
  THREE times slower 

Re: counters + replication = awful performance?

2012-11-28 Thread Rob Coli
On Tue, Nov 27, 2012 at 3:21 PM, Edward Capriolo edlinuxg...@gmail.com wrote:
 I misspoke really. It is not dangerous, you just have to understand what it
 means. This jira discusses it.

 https://issues.apache.org/jira/browse/CASSANDRA-3868

Per Sylvain on the referenced ticket :


I don't disagree about the efficiency of the valve, but at what price?
'Bootstrapping a node will make you lose increments (you don't know
which ones, you don't know how many and this even if nothing goes
wrong)' is a pretty bad drawback. That is pretty much why that option
makes me uncomfortable: it does give you better performance, so people
may be tempted to use it. Now if it was only a matter of replicating
writes only through read-repair/repair, then ok, it's pretty dangerous
but it's rather easy to explain/understand the drawback (if you don't
lose a disk, you don't lose increments, and you'd better use CL.ALL or
have read_repair_chance to 1). But the fact that it doesn't work with
bootstrap/move makes me wonder if having the option at all is not
making a disservice to users.


To me anything that can be described as "will make you lose increments
(you don't know which ones, you don't know how many and this even if
nothing goes wrong)" and which therefore doesn't work with
bootstrap/move is correctly described as "dangerous". :D

=Rob

-- 
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb


Re: need some help with row cache

2012-11-28 Thread Bryan Talbot
The row cache itself is global and the size is set with
row_cache_size_in_mb.  It must be enabled per CF using the proper
settings.  CQL3 isn't complete yet in C* 1.1 so if the cache settings
aren't shown there, then you'll probably need to use cassandra-cli.

-Bryan


On Tue, Nov 27, 2012 at 10:41 PM, Wz1975 wz1...@yahoo.com wrote:
 Use cassandra-cli.


 Thanks.
 -Wei

 Sent from my Samsung smartphone on ATT


  Original message 
 Subject: Re: need some help with row cache
 From: Yiming Sun yiming@gmail.com
 To: user@cassandra.apache.org
 CC:


 Also, what command can I use to see the caching setting?  DESC TABLE
 cf doesn't list caching at all.  Thanks.

 -- Y.


 On Wed, Nov 28, 2012 at 12:15 AM, Yiming Sun yiming@gmail.com wrote:

 Hi Bryan,

 Thank you very much for this information.  So in other words, the settings
 such as row_cache_size_in_mb in YAML alone are not enough, and I must also
 specify the caching attribute on a per column family basis?

 -- Y.


 On Tue, Nov 27, 2012 at 11:57 PM, Bryan Talbot btal...@aeriagames.com
 wrote:

 On Tue, Nov 27, 2012 at 8:16 PM, Yiming Sun yiming@gmail.com wrote:
  Hello,
 
  but it is not clear to me where this setting belongs to, because even
  in the
  v1.1.6 conf/cassandra.yaml,  there is no such property, and apparently
  adding this property to the yaml causes a fatal configuration error
  upon
  server startup,
 

 It's a per column family setting that can be applied using the CLI or
 CQL.

 With CQL3 it would be

 ALTER TABLE cf WITH caching = 'rows_only';

 to enable the row cache but no key cache for that CF.

 -Bryan





Re: Other problem in update

2012-11-28 Thread Everton Lima
The problem was that my unit tests were not cleaning up their data
directory and there was some corrupt data in there.
The problem was fixed by deleting the directory manually.

Thanks

2012/11/27 Tupshin Harper tups...@tupshin.com

 Unless I'm misreading the git history, the stack trace you referenced
 isn't from 1.1.2. In particular, the writeHintForMutation method in
 StorageProxy.java wasn't added to the codebase until September 9th (
 https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=commitdiff;h=b38ca2879cf1cbf5de17e1912772b6588eaa7de6),
 and wasn't part of any release until 1.2.0-beta1.

 -Tupshin

 On Tue, Nov 27, 2012 at 7:40 AM, Everton Lima peitin.inu...@gmail.com wrote:

 writeHintForMutation





-- 

Everton Lima Aleixo
Bacharel em Ciencia da Computação
Universidade Federal de Goiás


Data backup and restore

2012-11-28 Thread Adeel Akbar

Dear All,

I have a Cassandra 1.1.4 cluster with 2 nodes. I need to take a backup and
restore it on staging for testing purposes. I have taken a snapshot with the
below mentioned command, but it created a snapshot for every Keyspace's column
family. Is there any other way to take a backup and restore quickly?


/opt/apache-cassandra-1.1.4/bin/nodetool -h localhost snapshot -t 
cassandra_bkup


Snapshot directory:
/var/log/cassandra/data/KeySpace/subfolder/snapshot/cassandra_bkup

--


Thanks & Regards

Adeel Akbar



Re: Upgrade

2012-11-28 Thread Everton Lima
Yes.

java.lang.NullPointerException
at
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:796)
at
org.apache.cassandra.thrift.ThriftSessionManager.currentSession(ThriftSessionManager.java:53)
at
org.apache.cassandra.thrift.CassandraServer.state(CassandraServer.java:88)
at
org.apache.cassandra.thrift.CassandraServer.system_add_keyspace(CassandraServer.java:1345)
at
harpia.ns.storage.cassandra.CassandraHelper.setupKeyspace(CassandraHelper.java:179)
at
harpia.ns.storage.cassandra.CassandraHelper.startInstance(CassandraHelper.java:154)
at
harpia.ns.storage.cassandra.CassandraStorageService.init(CassandraStorageService.java:129)
at
harpia.ns.storage.StorageServiceFactory.createInstance(StorageServiceFactory.java:39)
at
harpia.ns.storage.StorageServiceFactory.createInstanceFor(StorageServiceFactory.java:29)
at harpia.ns.NodeServer.init(NodeServer.java:82)
at
harpia.ns.NodeServerFactory.createNodeServer(NodeServerFactory.java:8)
at harpia.ns.StartNodeServer.run(StartNodeServer.java:56)

-

Someone knows why the set variable disappear?
  initialized = true
in the class StorageService - method initServer(int dalay)
in version 1.1.6, in this method it is set but in the version 1.2.0-beta2
it does not occour. So in my code I can not verify if the node is
initialized.

2012/11/28 aaron morton aa...@thelastpickle.com

 Do you have the error stack ?

 Cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 New Zealand

 @aaronmorton
 http://www.thelastpickle.com

 On 28/11/2012, at 12:28 AM, Everton Lima peitin.inu...@gmail.com wrote:

 Hello people.
 I was using cassandra 1.1.6 and use the Object CassandraServer() to create
 keyspaces by my code.

 But when I update to version 1.2.0-beta2, my code starts to throw
 Exception (NullPointerException) in the method:

 *in version 1.1.6*
 CassandraServer -> state() ->
 {
     SocketAddress remoteSocket =
         SocketSessionManagementService.remoteSocket.get();
     if (remoteSocket == null)
         return clientState.get();

     ClientState cState =
         SocketSessionManagementService.instance.get(remoteSocket);
     if (cState == null)
     {
         cState = new ClientState();
         SocketSessionManagementService.instance.put(remoteSocket, cState);
     }
     return cState;
 }

 *in version 1.2.0*
 CassandraServer -> state() ->
 {
     return ThriftSessionManager.instance.currentSession();
 }
 currentSession() {
     SocketAddress socket = remoteSocket.get();
     assert socket != null;

     ThriftClientState cState = activeSocketSessions.get(socket);
     if (cState == null)
     {
         cState = new ThriftClientState();
         activeSocketSessions.put(socket, cState);
     }
     return cState;
 }


 So, in version 1.1.6, it verifies whether there is a remote connection; if
 not, it tries to get the local one. In version 1.2.0 it tries to get a remote
 connection and applies it to a ThriftClientState, but if there is no remote
 connection (like in 1.1.6) it will throw a NullPointerException at the line:
 ThriftClientState cState = activeSocketSessions.get(socket);

 Is there any way to use CassandraServer in the new version??

 Thanks!

 --

 Everton Lima Aleixo
 Bacharel em Ciencia da Computação
 Universidade Federal de Goiás





-- 

Everton Lima Aleixo
Bacharel em Ciencia da Computação
Universidade Federal de Goiás


Re: need some help with row cache

2012-11-28 Thread Yiming Sun
Thanks guys.  However, after I ran the client code several times (same set
of 5000 entries), 2 of the 6 nodes still show 0 hits on the row cache, even
though each node has 1GB capacity for row cache and the caches are full.
Since I always request the same entries over and over again, shouldn't there
be some hits?


[user@node]$ ./checkinfo.sh
Token: 85070591730234615865843651857942052863
Gossip active: true
Thrift active: true
Load : 587.15 GB
Generation No: 1354074048
Uptime (seconds) : 36957
Heap Memory (MB) : 2027.29 / 3948.00
Data Center  : DC1
Rack : r2
Exceptions   : 0
Key Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests,
NaN recent hit rate, 14400 save period in seconds
Row Cache: size 1072651974 (bytes), capacity 1073741824 (bytes), 0
hits, 2576 requests, NaN recent hit rate, 0 save period in seconds

Token: 141784319550391026443072753096570088105
Gossip active: true
Thrift active: true
Load : 583.21 GB
Generation No: 1354074461
Uptime (seconds) : 36535
Heap Memory (MB) : 828.71 / 3948.00
Data Center  : DC1
Rack : r2
Exceptions   : 0
Key Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests,
NaN recent hit rate, 14400 save period in seconds
Row Cache: size 1072602906 (bytes), capacity 1073741824 (bytes), 0
hits, 3194 requests, NaN recent hit rate, 0 save period in seconds


On Wed, Nov 28, 2012 at 4:26 AM, Bryan Talbot btal...@aeriagames.com wrote:

 The row cache itself is global and the size is set with
 row_cache_size_in_mb.  It must be enabled per CF using the proper
 settings.  CQL3 isn't complete yet in C* 1.1 so if the cache settings
 aren't shown there, then you'll probably need to use cassandra-cli.

 -Bryan


 On Tue, Nov 27, 2012 at 10:41 PM, Wz1975 wz1...@yahoo.com wrote:
  Use cassandracli.
 
 
  Thanks.
  -Wei
 
  Sent from my Samsung smartphone on ATT
 
 
   Original message 
  Subject: Re: need some help with row cache
  From: Yiming Sun yiming@gmail.com
  To: user@cassandra.apache.org
  CC:
 
 
  Also, what command can I used to see the caching setting?  DESC TABLE
  cf doesn't list caching at all.  Thanks.
 
  -- Y.
 
 
  On Wed, Nov 28, 2012 at 12:15 AM, Yiming Sun yiming@gmail.com
 wrote:
 
  Hi Bryan,
 
  Thank you very much for this information.  So in other words, the
 settings
  such as row_cache_size_in_mb in YAML alone are not enough, and I must
 also
  specify the caching attribute on a per column family basis?
 
  -- Y.
 
 
  On Tue, Nov 27, 2012 at 11:57 PM, Bryan Talbot btal...@aeriagames.com
  wrote:
 
  On Tue, Nov 27, 2012 at 8:16 PM, Yiming Sun yiming@gmail.com
 wrote:
   Hello,
  
   but it is not clear to me where this setting belongs to, because even
   in the
   v1.1.6 conf/cassandra.yaml,  there is no such property, and
 apparently
   adding this property to the yaml causes a fatal configuration error
   upon
   server startup,
  
 
  It's a per column family setting that can be applied using the CLI or
  CQL.
 
  With CQL3 it would be
 
  ALTER TABLE cf WITH caching = 'rows_only';
 
  to enable the row cache but no key cache for that CF.
 
  -Bryan
 
 
 



Re: need some help with row cache

2012-11-28 Thread Yiming Sun
Does replica placement play a role in row cache hits?

I happen to notice that the 3 nodes on rack 2 are the ones with no recent
hit rates, even when I specify only one node from rack2 as the host to
Hector.

The cluster uses PropertyFileSnitch, and the nodes alternate between
rac1 and rac2 in a single data center, clockwise on the ring.  This
particular column family uses NetworkTopologyStrategy, with a replication
factor of 2.   So the idea is it can place the replica on the next node in
the ring without having to walk all the way around.   But it seems cache hits
tend to only happen on rack 1?


Address    DC   Rack  Status  State   Load       Effective-Ownership  Token
                                                 141784319550391026443072753096570088105
x.x.x.1    DC1  r1    Up      Normal  587.46 GB  33.33%  0
x.x.x.2    DC1  r2    Up      Normal  591.21 GB  33.33%  28356863910078205288614550619314017621
x.x.x.3    DC1  r1    Up      Normal  594.97 GB  33.33%  56713727820156410577229101238628035242
x.x.x.4    DC1  r2    Up      Normal  587.15 GB  33.33%  85070591730234615865843651857942052863
x.x.x.5    DC1  r1    Up      Normal  590.26 GB  33.33%  113427455640312821154458202477256070484
x.x.x.6    DC1  r2    Up      Normal  583.21 GB  33.33%  141784319550391026443072753096570088105
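
For what it's worth, here is a toy model of how NetworkTopologyStrategy picks
replicas within one DC for the layout above (walk the ring clockwise from the
primary node, preferring racks not yet used). A sketch of the idea only, not
Cassandra's actual code:

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class NtsSketch {
        // Walk clockwise from the primary; take nodes on unseen racks first,
        // then fill any remaining replica slots regardless of rack.
        static List<String> replicasFor(int primary, String[] nodes, String[] racks, int rf) {
            List<String> replicas = new ArrayList<String>();
            Set<String> seenRacks = new HashSet<String>();
            for (int i = 0; i < nodes.length && replicas.size() < rf; i++) {
                int idx = (primary + i) % nodes.length;
                if (seenRacks.add(racks[idx])) replicas.add(nodes[idx]);
            }
            for (int i = 0; i < nodes.length && replicas.size() < rf; i++) {
                int idx = (primary + i) % nodes.length;
                if (!replicas.contains(nodes[idx])) replicas.add(nodes[idx]);
            }
            return replicas;
        }

        public static void main(String[] args) {
            String[] nodes = {"x.x.x.1", "x.x.x.2", "x.x.x.3", "x.x.x.4", "x.x.x.5", "x.x.x.6"};
            String[] racks = {"r1", "r2", "r1", "r2", "r1", "r2"};
            for (int i = 0; i < nodes.length; i++)
                System.out.println(nodes[i] + " -> " + replicasFor(i, nodes, racks, 2));
        }
    }

With alternating racks and RF=2 this always yields two adjacent nodes, one per
rack, which matches the "next node in the ring" behaviour described above.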


On Wed, Nov 28, 2012 at 9:09 AM, Yiming Sun yiming@gmail.com wrote:

 Thanks guys.  However, after I ran the client code several times (same set
 of 5000 entries),  still 2 of the 6 nodes show 0 hits on row cache, despite
 each node has 1GB capacity for row cache and the caches are full.   Since I
 always request the same entries over and over again, shouldn't there be
 some hits?


 [user@node]$ ./checkinfo.sh
 Token: 85070591730234615865843651857942052863
 Gossip active: true
 Thrift active: true
 Load : 587.15 GB
 Generation No: 1354074048
 Uptime (seconds) : 36957
 Heap Memory (MB) : 2027.29 / 3948.00
 Data Center  : DC1
 Rack : r2
 Exceptions   : 0

 Key Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests,
 NaN recent hit rate, 14400 save period in seconds
 Row Cache: size 1072651974 (bytes), capacity 1073741824 (bytes), 0
 hits, 2576 requests, NaN recent hit rate, 0 save period in seconds

 Token: 141784319550391026443072753096570088105
 Gossip active: true
 Thrift active: true
 Load : 583.21 GB
 Generation No: 1354074461
 Uptime (seconds) : 36535
 Heap Memory (MB) : 828.71 / 3948.00
 Data Center  : DC1
 Rack : r2
 Exceptions   : 0

 Key Cache: size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests,
 NaN recent hit rate, 14400 save period in seconds
 Row Cache: size 1072602906 (bytes), capacity 1073741824 (bytes), 0
 hits, 3194 requests, NaN recent hit rate, 0 save period in seconds


 On Wed, Nov 28, 2012 at 4:26 AM, Bryan Talbot btal...@aeriagames.comwrote:

 The row cache itself is global and the size is set with
 row_cache_size_in_mb.  It must be enabled per CF using the proper
 settings.  CQL3 isn't complete yet in C* 1.1 so if the cache settings
 aren't shown there, then you'll probably need to use cassandra-cli.

 -Bryan


 On Tue, Nov 27, 2012 at 10:41 PM, Wz1975 wz1...@yahoo.com wrote:
  Use cassandracli.
 
 
  Thanks.
  -Wei
 
  Sent from my Samsung smartphone on ATT
 
 
   Original message 
  Subject: Re: need some help with row cache
  From: Yiming Sun yiming@gmail.com
  To: user@cassandra.apache.org
  CC:
 
 
  Also, what command can I used to see the caching setting?  DESC TABLE
  cf doesn't list caching at all.  Thanks.
 
  -- Y.
 
 
  On Wed, Nov 28, 2012 at 12:15 AM, Yiming Sun yiming@gmail.com
 wrote:
 
  Hi Bryan,
 
  Thank you very much for this information.  So in other words, the
 settings
  such as row_cache_size_in_mb in YAML alone are not enough, and I must
 also
  specify the caching attribute on a per column family basis?
 
  -- Y.
 
 
  On Tue, Nov 27, 2012 at 11:57 PM, Bryan Talbot btal...@aeriagames.com
 
  wrote:
 
  On Tue, Nov 27, 2012 at 8:16 PM, Yiming Sun yiming@gmail.com
 wrote:
   Hello,
  
   but it is not clear to me where this setting belongs to, because
 even
   in the
   v1.1.6 conf/cassandra.yaml,  there is no such property, and
 apparently
   adding this property to the yaml causes a fatal configuration error
   upon
   server startup,
  
 
  It's a per column family setting that can be applied using the CLI or
  CQL.
 
  With CQL3 it would be
 
  ALTER TABLE cf WITH caching = 'rows_only';
 
  to enable the row cache but no key cache for that CF.
 
  -Bryan
 
 
 





Re: counters + replication = awful performance?

2012-11-28 Thread Edward Capriolo
I may be wrong, but during a bootstrap hints can be silently discarded if
the node they are destined for leaves the ring.

There are a large number of people using counters for 5-minute real-time
statistics. On the back end they use ETL-based reporting to compute the
true stats at an hourly or daily interval.

A user like this might benefit from DANGER counters. They are not looking
for perfection, only better performance, and the counter row keys
themselves roll over in 5 minutes anyway.
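
The pre-aggregation idea mentioned earlier in the thread (Rainbird/IronCount
style) amounts to something like this sketch -- batch increments in memory and
flush one update per counter per window. Names are illustrative, not
IronCount's actual API:

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only: aggregate increments in memory and issue one
    // counter update per key per flush interval, instead of n updates.
    public class CounterBatcher {
        // Stand-in for whatever client call issues the real counter update.
        public interface CounterWriter {
            void add(String counterKey, long delta);
        }

        private final Map<String, Long> pending = new HashMap<String, Long>();

        public synchronized void increment(String counterKey, long delta) {
            Long cur = pending.get(counterKey);
            pending.put(counterKey, (cur == null ? 0L : cur) + delta);
        }

        // Called by a scheduler every flush interval (e.g. a few seconds).
        public synchronized void flush(CounterWriter writer) {
            for (Map.Entry<String, Long> e : pending.entrySet()) {
                writer.add(e.getKey(), e.getValue());
            }
            pending.clear();
        }
    }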

Options like this are also great for winning benchmarks. When some other
NoSQL system (that is not as fast as C*) wants to win a benchmark, they turn
off/on the WAL, or write acks, or something that compromises their ACID/CAP
story for the purpose of winning. We need our own secret awesome-sauce
dangerous options too! jk


On Wed, Nov 28, 2012 at 4:21 AM, Rob Coli rc...@palominodb.com wrote:

 On Tue, Nov 27, 2012 at 3:21 PM, Edward Capriolo edlinuxg...@gmail.com
 wrote:
  I mispoke really. It is not dangerous you just have to understand what it
  means. this jira discusses it.
 
  https://issues.apache.org/jira/browse/CASSANDRA-3868

 Per Sylvain on the referenced ticket :

 
 I don't disagree about the efficiency of the valve, but at what price?
 'Bootstrapping a node will make you lose increments (you don't know
 which ones, you don't know how many and this even if nothing goes
 wrong)' is a pretty bad drawback. That is pretty much why that option
 makes me uncomfortable: it does give you better performance, so people
 may be tempted to use it. Now if it was only a matter of replicating
 writes only through read-repair/repair, then ok, it's pretty dangerous
 but it's rather easy to explain/understand the drawback (if you don't
 lose a disk, you don't lose increments, and you'd better use CL.ALL or
 have read_repair_chance to 1). But the fact that it doesn't work with
 bootstrap/move makes me wonder if having the option at all is not
 making a disservice to users.
 

 To me anything that can be described as will make you lose increments
 (you don't know which ones, you don't know how many and this even if
 nothing goes wrong) and which therefore doesn't work with
 bootstrap/move is correctly described as dangerous. :D

 =Rob

 --
 =Robert Coli
 AIMGTALK - rc...@palominodb.com
 YAHOO - rcoli.palominob
 SKYPE - rcoli_palominodb



Re: outOfMemory error

2012-11-28 Thread Bryan Talbot
Well, asking for 500MB of data at once from a server with such modest
specs is asking for trouble.  Here are my suggestions.

Disable the 1 GB row cache
Consider allocating that memory to the Java heap: Xms2500m Xmx2500m
Don't fetch all the columns at once -- page through them a slice at a time
(see the sketch below)
Increase the memtable to more than 64 MB if you want to write data to
this cluster
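
A minimal sketch of slice paging against the raw Thrift API (keyspace, column
family and row key are hypothetical; error handling omitted):

    import java.nio.ByteBuffer;
    import java.util.List;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TFramedTransport;
    import org.apache.thrift.transport.TSocket;

    public class SlicePager {
        public static void main(String[] args) throws Exception {
            TFramedTransport tr = new TFramedTransport(new TSocket("localhost", 9160));
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(tr));
            tr.open();
            client.set_keyspace("MyKeyspace");

            ByteBuffer key = ByteBuffer.wrap("row1".getBytes("UTF-8"));
            ColumnParent parent = new ColumnParent("MyCF");
            ByteBuffer start = ByteBuffer.wrap(new byte[0]);  // empty = start of row
            ByteBuffer end = ByteBuffer.wrap(new byte[0]);
            final int pageSize = 1000;

            boolean firstPage = true;
            while (true) {
                SlicePredicate pred = new SlicePredicate();
                pred.setSlice_range(new SliceRange(start, end, false, pageSize));
                List<ColumnOrSuperColumn> page =
                    client.get_slice(key, parent, pred, ConsistencyLevel.ONE);
                // the first column of every page after the first is a repeat
                for (int i = firstPage ? 0 : 1; i < page.size(); i++) {
                    // process page.get(i).getColumn() here
                }
                if (page.size() < pageSize) break;            // last page
                start = page.get(page.size() - 1).getColumn().bufferForName();
                firstPage = false;
            }
            tr.close();
        }
    }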

-Bryan



On Wed, Nov 28, 2012 at 5:06 AM, Damien Lejeune d.leje...@pepite.be wrote:
 Hi all,

 I'm currently experiencing an OutOfMemory problem with Cassandra-1.1.6 on
 Windows XP-Pro (32-bit). The server crashes when I try to query it with a
 relatively small amount of data (around 100 rows with 5 columns each; to
 be precise, on my configuration, querying 75 or more rows makes the server
 crash).
 I tried with different libraries (Hector, JDBC, Thrift) and with the
 Cassandra stress tool. All lead to the same OutOfMemory problem.

 My dataset is composed, for each row, of: 1 column in DateType, 4
 columns in DoubleType. I ran a query to fetch the entire dataset (around
 330MB for the raw data + around 200MB for the metadata) and got the log at
 the end of this message.

 I also checked the heap-dump with MAT, which displays these top values:
 Class Name                        Objects      Shallow Heap
 java.nio.HeapByteBuffer           16,253,559   780,170,832
 byte[]                            16,254,013   330,207,640  -- Data ?
 java.util.TreeMap$Entry            8,126,711   260,054,752
 org.apache.cassandra.db.Column     8,116,589   194,798,136  -- Metadata ?

 I tried to change the configuration in Cassandra for the values:
 - row_cache_size_in_mb: tried different value between [0,1000] MB
 - flush_largest_memtables_at: set to 0.1, but tried with 0.75
 - reduce_cache_sizes_at: tried 0.85, 0.6, 0.2 and 0.1
 - reduce_cache_capacity_to: tried 0.6 and 0.15
 - memtable_total_space_in_mb: 64 MB, but also tried to disable it (defaults
 to 1/3 of the heap)
 - Xms1G
 - Xmx1500M
 with no real observable improvements regarding my problem.

 My Cassandra server and client both run on the same machine.

 Here are the characteristics of my system configuration:
 - Cassandra-1.1.6
 - java version 1.6.0_20
  Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
  Java HotSpot(TM) Client VM (build 16.3-b01, mixed mode, sharing)
 - Windows XP-Pro 32 bits with service pack 3
 - CPU double-core, 32 bits @2.26GHz
 - 3.48 GB of RAM

 I'm aware that my system configuration is not an optimized environment to
 make Cassandra run efficiently, but I wonder if you guys know a
 workaround (or have any idea on how) to fix this problem. Part of the answer
 is probably that I do not have enough RAM to run the process, but I also
 wonder if it is 'normal' behaviour for Cassandra to handle this particular
 test case that way.

 Cheers,

 Damien

  Cassandra's LOG ---

 Starting Cassandra Server
  INFO 09:10:27,171 Logging initialized
  INFO 09:10:27,171 JVM vendor/version: Java HotSpot(TM) Client VM/1.6.0_18
  INFO 09:10:27,171 Heap size: 1072103424/1569521664
  INFO 09:10:27,171 Classpath:
 

Re: Java high-level client

2012-11-28 Thread Andrey Ilinykh
+1


On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman
mkjell...@barracuda.com wrote:

 Netflix has a great client

 https://github.com/Netflix/astyanax




Re: Java high-level client

2012-11-28 Thread Wei Zhu
We are using Hector now. What is the major advantage of astyanax over Hector?

Thanks.
-Wei



 From: Andrey Ilinykh ailin...@gmail.com
To: user@cassandra.apache.org 
Sent: Wednesday, November 28, 2012 9:37 AM
Subject: Re: Java high-level client
 

+1



On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman mkjell...@barracuda.com 
wrote:

Netflix has a great client

https://github.com/Netflix/astyanax




How to query secondary indexes

2012-11-28 Thread Oren Karmi
Hi,

According to the documentation on Indexes (
http://www.datastax.com/docs/1.1/ddl/indexes ),
in order to use WHERE on a column which is not part of my key, I must
define a secondary index on it. However, I can only use equality comparison
on it but I wish to use other comparisons methods like greater than.

Let's say I have a room with people, and at every timestamp I measure
the temperature of the room and the number of people. I use the timestamp as
my key and I want to select all timestamps where temperature was over 50
degrees, but I can't seem to be able to do it with a regular query even if I
define that column as a secondary index.
SELECT * FROM MyTable WHERE temp > 50.4571;

My lame workaround is to define a secondary index on NumOfPeopleInRoom and
then for a specific value
SELECT * FROM MyTable WHERE NumOfPeopleInRoom = 7 AND temp > 50.4571;

I'm pretty sure this is not the proper way for me to do this.

How should I attack this? It feels like I'm missing a very basic concept.
I'd appreciate it if your answers include also the option of not changing
my schema.

Thanks!!!


Re: How to query secondary indexes

2012-11-28 Thread Blake Eggleston
You're going to have a problem doing this in a single query because you're
asking Cassandra to select a non-contiguous set of rows. Also, to my
knowledge, you can only use non-equality operators on clustering keys. The
best solution I could come up with would be to define your table like so:

CREATE TABLE room_data (
room_id uuid,
in_room int,
temp float,
time timestamp,
PRIMARY KEY (room_id, in_room, temp));

Then run 2 queries:
SELECT * FROM room_data WHERE in_room > 7;
SELECT * FROM room_data WHERE temp > 50.0;

And do an intersection on the results.
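
The intersection step could be as simple as the following sketch, assuming the
two result sets have been reduced to sets of row keys (names illustrative):

    import java.util.HashSet;
    import java.util.Set;
    import java.util.UUID;

    public class IntersectRooms {
        // Keep only the room_ids that appeared in both query results.
        static Set<UUID> intersect(Set<UUID> crowdedRooms, Set<UUID> hotRooms) {
            Set<UUID> both = new HashSet<UUID>(crowdedRooms);
            both.retainAll(hotRooms);
            return both;
        }
    }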

I should add the disclaimer that I am relatively new to CQL, so there may
be a better way to do this.

Blake


On Wed, Nov 28, 2012 at 10:02 AM, Oren Karmi oka...@gmail.com wrote:

 Hi,

 According to the documentation on Indexes (
 http://www.datastax.com/docs/1.1/ddl/indexes ),
 in order to use WHERE on a column which is not part of my key, I must
 define a secondary index on it. However, I can only use equality comparison
 on it but I wish to use other comparisons methods like greater than.

 Let's say I have a room with people and every timestamp, I measure
 the temperature of the room and number of people. I use the timestamp as my
 key and I want to select all timestamps where temperature was over 50
 degrees but I can't seem to be able to do it with a regular query even if I
 define that column as a secondary index.
  SELECT * FROM MyTable WHERE temp > 50.4571;

 My lame workaround is to define a secondary index on NumOfPeopleInRoom and
 than for a specific value
  SELECT * FROM MyTable WHERE NumOfPeopleInRoom = 7 AND temp > 50.4571;

 I'm pretty sure this is not the proper way for me to do this.

 How should I attack this? It feels like I'm missing a very basic concept.
 I'd appreciate it if your answers include also the option of not changing
 my schema.

 Thanks!!!



Re: counters + replication = awful performance?

2012-11-28 Thread Sergey Olefir
Well, that is sad news then. I don't think I can consider 20k increments
per second for a two-node cluster (with RF=2) reasonable performance (cost
vs. benefit).

I might have to look into other storage solutions or perhaps experiment with
duplicate clusters with RF=1 or replicate_on_write=false.

Although yes, I probably should try that row cache you mentioned -- I saw
that key cache was going unused (so saw no reason to try to enable row
cache), but I think it was on RF=1, it might be different on RF=2.


Sylvain Lebresne-3 wrote
 Counters replication works in different ways than the one of normal
 writes. Namely, a counter update is written to a first replica, then a
 read
 is perform and the result of that is replicated to the other nodes. With
 RF=1, since there is only one replica no read is involved but in a way
 it's
 a degenerate case. So there is two reason why RF2 is much slower than
 RF=1:
 1) it involves a read to replicate and that read takes times. Especially
 if
 that read hits the disk, it may even dominate the insertion time.
 2) the replication to the first replica and the one to the res of the
 replica are not done in parallel but sequentially. Note that this is only
 true for the first replica versus the othere. In other words, from RF=2 to
 RF=3 you should see a significant performance degradation.
 
 Note that while there is nothing you can do for 2), you can try to speed
 up
 1) by using row cache for instance (in case you weren't).
 
 In other words, with counters, it is expected that RF=1 be potentially
 much
 faster than RF1. That is the way counters works.
 
 And don't get me wrong, I'm not suggesting you should use RF=1 at all.
 What
 I am saying is that the performance you see with RF=2 is the performance
 of
 counters in Cassandra.
 
 --
 Sylvain
 
 
  On Wed, Nov 28, 2012 at 7:34 AM, Sergey Olefir <solf.lists@> wrote:
 
 I think there might be a misunderstanding as to the nature of the
 problem.

 Say, I have test set T. And I have two identical servers A and B.
 - I tested that server A (singly) is able to handle load of T.
 - I tested that server B (singly) is able to handle load of T.
 - I then join A and B in the cluster and set replication=2 -- this means
 that each server in effect has to handle full test load individually
 (because there are two servers and replication=2 it means that each
 server
 effectively has to handle all the data written to the cluster). Under
 these
 circumstances it is reasonable to assume that cluster A+B shall be able
 to
 handle load T because each server is able to do so individually.

 HOWEVER, this is not the case. In fact, A+B together are only able to
 handle
 less than 1/3 of T DESPITE the fact that A and B individually are able to
 handle T just fine.

 I think there's something wrong with Cassandra replication (possibly as
 simple as me misconfiguring something) -- it shouldn't be three times
 faster
 to write to two separate nodes in parallel as compared to writing to
 2-node
 Cassandra cluster with replication=2.


 Edward Capriolo wrote
  Say you are doing 100 inserts rf1 on two nodes. That is 50 inserts a
 node.
  If you go to rf2 that is 100 inserts a node.  If you were at 75 %
 capacity
  on each mode your now at 150% which is not possible so things bog down.
 
  To figure out what is going on we would need to see tpstat, iostat ,
 and
  top information.
 
  I think your looking at the performance the wrong way. Starting off at
 rf
  1
  is not the way to understand cassandra performance.
 
  You do not get the benefits of scala out don't happen until you fix
 your
  rf and increment your nodecount. Ie 5 nodes at rf 3 is fast 10 nodes at
 rf
  3 even better.
   On Tuesday, November 27, 2012, Sergey Olefir <solf.lists@> wrote:
  I already do a lot of in-memory aggregation before writing to
 Cassandra.
 
  The question here is what is wrong with Cassandra (or its
 configuration)
  that causes huge performance drop when moving from 1-replication to
  2-replication for counters -- and more importantly how to resolve the
  problem. 2x-3x drop when moving from 1-replication to 2-replication on
  two
  nodes is reasonable. 6x is not. Like I said, with this kind of
  performance
  degradation it makes more sense to run two clusters with replication=1
 in
  parallel rather than rely on Cassandra replication.
 
  And yes, Rainbird was the inspiration for what we are trying to do
 here
  :)
 
 
 
  Edward Capriolo wrote
  Cassandra's counters read on increment. Additionally they are
  distributed
  so that can be multiple reads on increment. If they are not fast
 enough
  and
  you have avoided all tuning options add more servers to handle the
 load.
 
  In many cases incrementing the same counter n times can be avoided.
 
  Twitter's rainbird did just that. It avoided multiple counter
 increments
  by
  batching them.
 
  I have done a similar think using cassandra and Kafka.
 
 
 
 

Re: Java high-level client

2012-11-28 Thread Andrey Ilinykh
First of all, it is backed by Netflix. They have used it in production for a
long time, so it is pretty solid. Also they have a nice tool (Priam) which
makes Cassandra cloud (AWS) friendly. This is important for us.

Andrey
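
For anyone evaluating it, a minimal usage sketch (cluster/keyspace/column
family names are hypothetical, and this is written from memory of the
getting-started docs, so treat the exact API as approximate):

    import com.netflix.astyanax.AstyanaxContext;
    import com.netflix.astyanax.Keyspace;
    import com.netflix.astyanax.MutationBatch;
    import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
    import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
    import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
    import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
    import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
    import com.netflix.astyanax.model.ColumnFamily;
    import com.netflix.astyanax.serializers.StringSerializer;
    import com.netflix.astyanax.thrift.ThriftFamilyFactory;

    public class AstyanaxExample {
        public static void main(String[] args) throws ConnectionException {
            AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
                .forCluster("TestCluster")
                .forKeyspace("ks")
                .withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
                    .setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE))
                .withConnectionPoolConfiguration(
                    new ConnectionPoolConfigurationImpl("pool")
                        .setPort(9160)
                        .setMaxConnsPerHost(3)
                        .setSeeds("127.0.0.1:9160"))
                .withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
                .buildKeyspace(ThriftFamilyFactory.getInstance());
            context.start();
            Keyspace keyspace = context.getEntity();

            ColumnFamily<String, String> cf = new ColumnFamily<String, String>(
                "users", StringSerializer.get(), StringSerializer.get());

            // one round trip for a batch of writes
            MutationBatch m = keyspace.prepareMutationBatch();
            m.withRow(cf, "bob").putColumn("password", "secret", null);
            m.execute();

            String password = keyspace.prepareQuery(cf)
                .getKey("bob").getColumn("password")
                .execute().getResult().getStringValue();
            System.out.println(password);
            context.shutdown();
        }
    }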


On Wed, Nov 28, 2012 at 11:53 AM, Wei Zhu wz1...@yahoo.com wrote:

 We are using Hector now. What is the major advantage of astyanax over
 Hector?

 Thanks.
 -Wei

   --
 *From:* Andrey Ilinykh ailin...@gmail.com
 *To:* user@cassandra.apache.org
 *Sent:* Wednesday, November 28, 2012 9:37 AM

 *Subject:* Re: Java high-level client

 +1


 On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman 
 mkjell...@barracuda.com wrote:

 Netflix has a great client

 https://github.com/Netflix/astyanax






Re: counters + replication = awful performance?

2012-11-28 Thread Edward Capriolo
Just for reference, HBase's counters also do a local read. I am not saying
they work better/worse/faster/slower, but I would not expect any system
that reads on increment to be significantly faster than what Cassandra
does.

Just saying your counter throughput is read bound; this is not unique to
C*'s implementation.



On Wed, Nov 28, 2012 at 2:41 PM, Sergey Olefir solf.li...@gmail.com wrote:

 Well, those are sad news then. I don't think I can consider 20k increments
 per second for a two node cluster (with RF=2) a reasonable performance
 (cost
 vs. benefit).

 I might have to look into other storage solutions or perhaps experiment
 with
 duplicate clusters with RF=1 or replicate_on_write=false.

 Although yes, I probably should try that row cache you mentioned -- I saw
 that key cache was going unused (so saw no reason to try to enable row
 cache), but I think it was on RF=1, it might be different on RF=2.


 Sylvain Lebresne-3 wrote
  Counters replication works in different ways than the one of normal
  writes. Namely, a counter update is written to a first replica, then a
  read
  is perform and the result of that is replicated to the other nodes. With
  RF=1, since there is only one replica no read is involved but in a way
  it's
  a degenerate case. So there is two reason why RF2 is much slower than
  RF=1:
  1) it involves a read to replicate and that read takes times. Especially
  if
  that read hits the disk, it may even dominate the insertion time.
  2) the replication to the first replica and the one to the res of the
  replica are not done in parallel but sequentially. Note that this is only
  true for the first replica versus the othere. In other words, from RF=2
 to
  RF=3 you should see a significant performance degradation.
 
  Note that while there is nothing you can do for 2), you can try to speed
  up
  1) by using row cache for instance (in case you weren't).
 
  In other words, with counters, it is expected that RF=1 be potentially
  much
  faster than RF1. That is the way counters works.
 
  And don't get me wrong, I'm not suggesting you should use RF=1 at all.
  What
  I am saying is that the performance you see with RF=2 is the performance
  of
  counters in Cassandra.
 
  --
  Sylvain
 
 
   On Wed, Nov 28, 2012 at 7:34 AM, Sergey Olefir <solf.lists@> wrote:
 
  I think there might be a misunderstanding as to the nature of the
  problem.
 
  Say, I have test set T. And I have two identical servers A and B.
  - I tested that server A (singly) is able to handle load of T.
  - I tested that server B (singly) is able to handle load of T.
  - I then join A and B in the cluster and set replication=2 -- this means
  that each server in effect has to handle full test load individually
  (because there are two servers and replication=2 it means that each
  server
  effectively has to handle all the data written to the cluster). Under
  these
  circumstances it is reasonable to assume that cluster A+B shall be able
  to
  handle load T because each server is able to do so individually.
 
  HOWEVER, this is not the case. In fact, A+B together are only able to
  handle
  less than 1/3 of T DESPITE the fact that A and B individually are able
 to
  handle T just fine.
 
  I think there's something wrong with Cassandra replication (possibly as
  simple as me misconfiguring something) -- it shouldn't be three times
  faster
  to write to two separate nodes in parallel as compared to writing to
  2-node
  Cassandra cluster with replication=2.
 
 
  Edward Capriolo wrote
   Say you are doing 100 inserts rf1 on two nodes. That is 50 inserts a
  node.
   If you go to rf2 that is 100 inserts a node.  If you were at 75 %
  capacity
   on each mode your now at 150% which is not possible so things bog
 down.
  
   To figure out what is going on we would need to see tpstat, iostat ,
  and
   top information.
  
   I think your looking at the performance the wrong way. Starting off at
  rf
   1
   is not the way to understand cassandra performance.
  
   You do not get the benefits of scala out don't happen until you fix
  your
   rf and increment your nodecount. Ie 5 nodes at rf 3 is fast 10 nodes
 at
  rf
   3 even better.
    On Tuesday, November 27, 2012, Sergey Olefir <solf.lists@> wrote:
   I already do a lot of in-memory aggregation before writing to
  Cassandra.
  
   The question here is what is wrong with Cassandra (or its
  configuration)
   that causes huge performance drop when moving from 1-replication to
   2-replication for counters -- and more importantly how to resolve the
   problem. 2x-3x drop when moving from 1-replication to 2-replication
 on
   two
   nodes is reasonable. 6x is not. Like I said, with this kind of
   performance
   degradation it makes more sense to run two clusters with
 replication=1
  in
   parallel rather than rely on Cassandra replication.
  
   And yes, Rainbird was the inspiration for what we are trying to do
  here
   :)
  
  
  
   

Re: Java high-level client

2012-11-28 Thread Michael Kjellman
Lots of example code, a nice API, and good performance are the first things
that come to mind for why I like Astyanax better than Hector.

From: Andrey Ilinykh ailin...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Wednesday, November 28, 2012 11:49 AM
To: user@cassandra.apache.org, Wei Zhu wz1...@yahoo.com
Subject: Re: Java high-level client
Subject: Re: Java high-level client

First at all, it is backed by Netflix. They used it production for long time, 
so it is pretty solid. Also they have nice tool (Priam) which makes cassandra 
cloud (AWS) friendly. This is important for us.

Andrey


On Wed, Nov 28, 2012 at 11:53 AM, Wei Zhu
wz1...@yahoo.com wrote:
We are using Hector now. What is the major advantage of astyanax over Hector?

Thanks.
-Wei


From: Andrey Ilinykh ailin...@gmail.com
To: user@cassandra.apache.org
Sent: Wednesday, November 28, 2012 9:37 AM

Subject: Re: Java high-level client

+1


On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman
mkjell...@barracuda.com wrote:
Netflix has a great client

https://github.com/Netflix/astyanax









Re: Java high-level client

2012-11-28 Thread Wei Zhu
Astyanax was the son of Hector, who was Cassandra's brother in Greek
mythology.

So the son is doing better than the father :)
-Wei




 From: Michael Kjellman mkjell...@barracuda.com
To: user@cassandra.apache.org user@cassandra.apache.org 
Sent: Wednesday, November 28, 2012 11:51 AM
Subject: Re: Java high-level client
 

Lots of example code, nice api, good performance as the first things that come 
to mind why I like Astyanax better than Hector
From:  Andrey Ilinykh ailin...@gmail.com
Reply-To:  user@cassandra.apache.org user@cassandra.apache.org
Date:  Wednesday, November 28, 2012 11:49 AM
To:  user@cassandra.apache.org user@cassandra.apache.org, Wei Zhu 
wz1...@yahoo.com
Subject:  Re: Java high-level client


First at all, it is backed by Netflix. They used it production for long time, 
so it is pretty solid. Also they have nice tool (Priam) which makes cassandra 
cloud (AWS) friendly. This is important for us. 

Andrey 



On Wed, Nov 28, 2012 at 11:53 AM, Wei Zhu wz1...@yahoo.com wrote:

We are using Hector now. What is the major advantage of astyanax over Hector?


Thanks.
-Wei




From: Andrey Ilinykh ailin...@gmail.com
To: user@cassandra.apache.org 
Sent: Wednesday, November 28, 2012 9:37 AM 

Subject: Re: Java high-level client



+1 



On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman mkjell...@barracuda.com 
wrote:

Netflix has a great client

https://github.com/Netflix/astyanax








Re: Java high-level client

2012-11-28 Thread Edward Capriolo
Astyanax is a Hector fork. You can see many of the Hector authors' comments
still in the Astyanax code. There is some nice stuff in there but (IMHO) I
do not see the fork as necessary. It has split up the community a bit, as
there are now 3 high-level Java clients.

I would advise following Josh's advice
http://www.youtube.com/watch?v=nPG4sK_glls . Go to reddit and select
whatever sexy technology is new and trending :)


On Wed, Nov 28, 2012 at 2:51 PM, Michael Kjellman
mkjell...@barracuda.com wrote:

 Lots of example code, nice api, good performance as the first things that
 come to mind why I like Astyanax better than Hector

 From: Andrey Ilinykh ailin...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Wednesday, November 28, 2012 11:49 AM
 To: user@cassandra.apache.org user@cassandra.apache.org, Wei Zhu 
 wz1...@yahoo.com
 Subject: Re: Java high-level client

 First at all, it is backed by Netflix. They used it production for long
 time, so it is pretty solid. Also they have nice tool (Priam) which makes
 cassandra cloud (AWS) friendly. This is important for us.

 Andrey


 On Wed, Nov 28, 2012 at 11:53 AM, Wei Zhu wz1...@yahoo.com wrote:

 We are using Hector now. What is the major advantage of astyanax over
 Hector?

 Thanks.
 -Wei

 --
 *From:* Andrey Ilinykh ailin...@gmail.com
 *To:* user@cassandra.apache.org
 *Sent:* Wednesday, November 28, 2012 9:37 AM

 *Subject:* Re: Java high-level client

 +1


 On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman 
 mkjell...@barracuda.com wrote:

 Netflix has a great client

 https://github.com/Netflix/astyanax








Re: Java high-level client

2012-11-28 Thread David Schairer
Well, not really.  Astyanax ('astu-wanax' in Mycenaean Greek, 'lord of the
city') had his brains dashed out against the walls of Troy by Neoptolemus,
son of Achilles.  So the suck was universal.

--DRS, possibly the only trained classicist using big cassandra databases :)

On Nov 28, 2012, at 12:19 PM, Wei Zhu wz1...@yahoo.com wrote:

 Astyanax was the son of Hector who was Cassandra's brother in greek mythology.
 So son is doing better than the father:)
 
 -Wei
 
 From: Michael Kjellman mkjell...@barracuda.com
 To: user@cassandra.apache.org user@cassandra.apache.org 
 Sent: Wednesday, November 28, 2012 11:51 AM
 Subject: Re: Java high-level client
 
 Lots of example code, nice api, good performance as the first things that 
 come to mind why I like Astyanax better than Hector
 
 From: Andrey Ilinykh ailin...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Wednesday, November 28, 2012 11:49 AM
 To: user@cassandra.apache.org user@cassandra.apache.org, Wei Zhu 
 wz1...@yahoo.com
 Subject: Re: Java high-level client
 
 First at all, it is backed by Netflix. They used it production for long time, 
 so it is pretty solid. Also they have nice tool (Priam) which makes cassandra 
 cloud (AWS) friendly. This is important for us.
 
 Andrey 
 
 
 On Wed, Nov 28, 2012 at 11:53 AM, Wei Zhu wz1...@yahoo.com wrote:
 We are using Hector now. What is the major advantage of astyanax over Hector?
 
 Thanks.
 -Wei
 
 From: Andrey Ilinykh ailin...@gmail.com
 To: user@cassandra.apache.org 
 Sent: Wednesday, November 28, 2012 9:37 AM
 
 Subject: Re: Java high-level client
 
 +1
 
 
 On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman mkjell...@barracuda.com 
 wrote:
 Netflix has a great client
 
 https://github.com/Netflix/astyanax
 
 
 
 
 
 
 



Re: Java high-level client

2012-11-28 Thread Michael Kjellman
CQL Datastax Java Driver for the win then...
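
A minimal sketch of what that looks like with the early driver API (keyspace
and table are hypothetical; note the driver speaks the new native protocol, so
it needs C* 1.2+ with the native transport enabled):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class DriverExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("example");

            // prepared statements: per the benchmarks discussed elsewhere in
            // this digest, preparing is the fast path
            PreparedStatement insert = session.prepare(
                "INSERT INTO users (user_name, password) VALUES (?, ?)");
            session.execute(insert.bind("bob", "secret"));

            ResultSet rs = session.execute("SELECT user_name FROM users");
            for (Row row : rs) {
                System.out.println(row.getString("user_name"));
            }
            cluster.shutdown();   // renamed close() in later versions
        }
    }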

On Nov 28, 2012, at 12:25 PM, Edward Capriolo
edlinuxg...@gmail.com wrote:

Astyanax is a hector fork. You can see many of the hector' authors comments 
still in the astyanax code. There is some nice stuff in there but (IMHO) I do 
not see the fork as necessary. It has split up the community a bit, as there 
are now 3 high level Java clients.

I would advice follow Josh's advice  http://www.youtube.com/watch?v=nPG4sK_glls 
. Go to reddit and select whatever sexy technology is new and trending :)


On Wed, Nov 28, 2012 at 2:51 PM, Michael Kjellman
mkjell...@barracuda.com wrote:
Lots of example code, nice api, good performance as the first things that come
to mind why I like Astyanax better than Hector

From: Andrey Ilinykh ailin...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Wednesday, November 28, 2012 11:49 AM
To: user@cassandra.apache.org, Wei Zhu
wz1...@yahoo.com
Subject: Re: Java high-level client

First at all, it is backed by Netflix. They used it production for long time, 
so it is pretty solid. Also they have nice tool (Priam) which makes cassandra 
cloud (AWS) friendly. This is important for us.

Andrey


On Wed, Nov 28, 2012 at 11:53 AM, Wei Zhu
wz1...@yahoo.com wrote:
We are using Hector now. What is the major advantage of astyanax over Hector?

Thanks.
-Wei


From: Andrey Ilinykh ailin...@gmail.com
To: user@cassandra.apache.org
Sent: Wednesday, November 28, 2012 9:37 AM

Subject: Re: Java high-level client

+1


On Tue, Nov 27, 2012 at 10:10 AM, Michael Kjellman
mkjell...@barracuda.com wrote:
Netflix has a great client

https://github.com/Netflix/astyanax









Re: Generic questions over Cassandra 1.1/1.2

2012-11-28 Thread Bill de hÓra

 Compact storage is the schemaless of old.

Right. That comes with the downside of picking one :)  It does not seem
that compact storage is the default choice for the future. As well as
interop with the thrift/cli world, I also find it hard to reason about
row caching with CQL-defined tables. I still work through thrift/cli as
a result, which is a pity because CQL has a nice surface.


Bill

On 28/11/12 01:32, Edward Capriolo wrote:

@Bill

Are you saying that now cassandra is less schema less ? :)

Compact storage is the schemaless of old.

On Tuesday, November 27, 2012, Bill de hÓra b...@dehora.net wrote:
  I'm not sure I always
  understand what people mean by schema less
  exactly and I'm curious.
 
  For 'schema less', given this -
 
  {{{
  cqlsh use example;
  cqlsh:example CREATE TABLE users (
  ...  user_name varchar,
  ...  password varchar,
  ...  gender varchar,
  ...  session_token varchar,
  ...  state varchar,
  ...  birth_year bigint,
  ...  PRIMARY KEY (user_name)
  ... );
  }}}
 
  I expect this would not cause an unknown identifier error -
 
  {{{
  INSERT INTO users
  (user_name, password, extra, moar)
  VALUES
  ('bob', 'secret', 'a', 'b');
  }}}
 
  but definitions vary.
 
  Bill
 
  On 26/11/12 09:18, Sylvain Lebresne wrote:
 
  On Mon, Nov 26, 2012 at 8:41 AM, aaron morton
  aa...@thelastpickle.com wrote:
 Is there any noticeable performance difference between thrift
or CQL3?
Off the top of my head it's within 5% (maybe 10%) under stress tests.
  See Eric's talk at the Cassandra SF conference for the exact numbers.
 
  Eric's benchmark results was that normal queries were slightly slower
  but prepared one (and in real life, I see no good reason not to prepare
  statements) were actually slightly faster.
 
CQL 3 requires a schema, however altering the schema is easier. And
  in 1.2 will support concurrent schema modifications.
Thrift API is still schema less.
 
  Sorry to hijack this thread, but I'd be curious (like seriously, I'm not
  trolling) to understand what you mean by CQL 3 requires a schema but
  Thrift API is still schema less. Basically I'm not sure I always
  understand what people mean by schema less exactly and I'm curious.
 
  --
  Sylvain
 
 




Re: How to determine compaction bottlenecks

2012-11-28 Thread Derek Bromenshenkel
aaron morton aaron at thelastpickle.com writes:

   I've been playing around with trying to figure out what is making
   compactions run so slow.
  Is this regular compaction or table upgrades?
  I *think* upgradesstables is single threaded.
  Do you have some compaction log lines that say Compacted to... ? It's
  handy to see the throughput and the number of keys compacted.

   snapshot_before_compaction: false
   in_memory_compaction_limit_in_mb: 256
   multithreaded_compaction: true
   compaction_throughput_mb_per_sec: 128
   compaction_preheat_key_cache: true
  What setting for concurrent_compactors?
  I would also check the logs for GC issues.

  Cheers

  -
  Aaron Morton
  Freelance Cassandra Developer
  New Zealand

  @aaronmorton
  http://www.thelastpickle.com

  On 28/11/2012, at 4:23 AM, Derek Bromenshenkel derek.bromenshenkel at
 gmail.com wrote:

   Setup: C* 1.1.6, 6 nodes (Linux, 64GB RAM, 16 Core CPU, 2x512 SSD),
   RF=3, 1.65TB total used
   Background: Client app is off - no reads/writes happening. Doing some
   cluster maintenance requiring node repairs and upgradesstables.
   I've been playing around with trying to figure out what is making
   compactions run so slow.  Watching syslogs, it seems to average 3-4MB/s.
   That just seems so slow for this setup and the fact there is zero
   external load on the cluster.  As far as I can tell:
   1. Not I/O bound according to iostat data
   2. CPU seems to be idling also
   3. From my understanding, I am using all the correct compaction settings
   for this setup. Here are those below:
   snapshot_before_compaction: false
   in_memory_compaction_limit_in_mb: 256
   multithreaded_compaction: true
   compaction_throughput_mb_per_sec: 128
   compaction_preheat_key_cache: true
   Some other thoughts:
   - I have turned on DEBUG logging for the Throttle class and played with
   the live compaction_throughput_mb_per_sec setting.  I can see it
   performing the throttling if I set the value low (say 4), but anything
   over 8 it is apparently running wide open. [Side note: Although the math
   for the Throttle class adds up, overall the throttling seems to be very
   very conservative.]
   - I accidentally turned on DEBUG for the entire ...compaction.* package
   and that unintentionally created A LOT of I/O from the
   ParallelCompactionIterable class, and the disk/OS handled that just
   fine.
   Perhaps I just don't fully grasp what is going on or have the correct
   expectations.  I am OK with things being slow if the hardware is working
   hard, but that does not seem to be the case.
   Anyone have some insight?
   Thanks

Hi Aaron,

Thank you for taking the time and responding.  I'll try to answer your 
questions.

- reg vs upgrade: Seeing the same speed on regular compaction and upgrades.  
True that most of the frustration comes from the upgrades since there is so 
much 
work to be done.
- GC: looked fine. I've seen pressure before, but only when under very heavy 
client app load.
- concurrent_compactors: not set, so should default to #cores [32; 16 phys * 2 
hyperthread], and I did see 32 CompactionExecutor (I think) threads via JMX.
- examples: yes I have a lot of examples. here are some

Leveled
 INFO [CompactionExecutor:1033] 2012-11-26 01:38:56,800 CompactionTask.java 
(line 221) Compacted to [lcs1].  35,058,450 to 33,408,896 (~95% of original) 
bytes for 127,771 keys at 4.371735MB/s.  Time: 7,288ms.
 INFO [CompactionExecutor:2015] 2012-11-26 03:12:43,800 CompactionTask.java 
(line 221) Compacted to [lcs2].  37,029,581 to 36,747,459 (~99% of original) 
bytes for 135,471 keys at 3.748541MB/s.  Time: 9,349ms.

Size Tiered
 INFO [CompactionExecutor:6242] 2012-11-26 10:46:24,030 CompactionTask.java 
(line 221) Compacted to [abc].  12,804,781,130 to 5,575,340,207 (~43% of 
original) bytes for 84,723,404 keys at 1.382544MB/s.  Time: 3,845,851ms.
 INFO [CompactionExecutor:288] 2012-11-26 00:42:58,629 CompactionTask.java 
(line 
221) Compacted to [def].  116,347,764 to 58,354,237 (~50% of original) bytes 
for 
2,511,375 keys at 0.655612MB/s.  Time: 84,884ms.
 INFO [CompactionExecutor:5113] 2012-11-26 08:33:12,885 CompactionTask.java 
(line 221) Compacted to [ghi].  560,682,371 to 294,965,985 (~52% of original) 
bytes for 220 keys at 3.172669MB/s.  Time: 88,664ms.
 INFO [CompactionExecutor:6124] 2012-11-26 09:36:52,141 CompactionTask.java 
(line 221) Compacted to [jkl].  418,807,103 to 234,394,618 (~55% of original) 
bytes for 3,130,751 keys at 2.807220MB/s.  Time: 79,629ms.

Also, upon reading the messages here/JIRA/etc I decided to disable 
multithreaded_compaction late yesterday.  That helped to the tune of a 3-5x 
improvement. Why multithreaded compaction is so much slower is a question I'm 
willing to set aside for now.  However, I'm still interested in understanding 
why, under zero load in an unthrottled state, the compaction process does not 
consume at least one full CPU core and/or max out the disk I/O.

Thanks again,
Derek
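
For readers following along, the throttling idea amounts to roughly this
sketch (illustrative only; Cassandra's Throttle class differs in detail):
track bytes processed and sleep whenever the observed rate gets ahead of the
target.

    // Rough sketch of byte-rate throttling as applied to compaction.
    public class SimpleThrottle {
        private final long bytesPerSecond;
        private long bytesSinceCheck = 0;
        private long lastCheckNanos = System.nanoTime();

        public SimpleThrottle(long targetMBPerSec) {
            this.bytesPerSecond = targetMBPerSec * 1024L * 1024L;
        }

        // Call after processing `bytes`; sleeps if we are ahead of target.
        public void throttle(long bytes) throws InterruptedException {
            bytesSinceCheck += bytes;
            long elapsed = System.nanoTime() - lastCheckNanos;
            long expected = bytesSinceCheck * 1000000000L / bytesPerSecond;
            if (expected > elapsed) {
                Thread.sleep((expected - elapsed) / 1000000L);
            }
            if (elapsed > 1000000000L) {      // reset the window every second
                bytesSinceCheck = 0;
                lastCheckNanos = System.nanoTime();
            }
        }
    }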



Re: counters + replication = awful performance?

2012-11-28 Thread Rob Coli
On Wed, Nov 28, 2012 at 7:15 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
 I may be wrong but during a bootstrap hints can be silently discarded, if
 the node they are destined for leaves the ring.

Yeah : https://issues.apache.org/jira/browse/CASSANDRA-2434

 A user like this might benefit from DANGER counters. They are not looking
 for perfection, only better performance, and the counter row keys themselves
 roll over in 5 minutes anyway.

Yep, I agree that if you don't care about accurate counting, Cassandra
counters may be for you. Cassandra counters in mongo mode are even
more web scale! The unfortunate thing is that people seem to assume
that software does what it is supposed to do, and probably do not get
a great impression of said software when it doesn't. :D

=Rob

-- 
=Robert Coli
AIMGTALK - rc...@palominodb.com
YAHOO - rcoli.palominob
SKYPE - rcoli_palominodb