Re: write timeout
My group is seeing the same thing and also cannot figure out why it's happening. On Mon, Mar 23, 2015 at 8:36 AM, Anishek Agarwal anis...@gmail.com wrote: Forgot to mention I am using Cassandra 2.0.13 On Mon, Mar 23, 2015 at 5:59 PM, Anishek Agarwal anis...@gmail.com wrote: Hello, I am using a single node server class machine with 16 CPUs and 32GB RAM with a single drive attached to it. My table structure is as below: CREATE TABLE t1 (id bigint, ts timestamp, cat1 set<text>, cat2 set<text>, lat float, lon float, a bigint, primary key (id, ts)); I am trying to insert 300 entries per partition key for 4000 partition keys using 25 threads. Configuration: write_request_timeout_in_ms: 5000, concurrent_writes: 32, heap space: 8GB. Client side timeout is 12 sec using the DataStax Java driver. Consistency level: ONE. With the above configuration I run this 10 times to eventually generate around 300 * 4000 * 10 = 12,000,000 entries. After the first few runs I get a WriteTimeout exception at the client with a "1 replica were required but only 0 acknowledged the write" message. There are no errors in the server log. Why does this error occur, and how do I know the limit to which I should restrict concurrent writes to a single node? Looking at iostat, disk utilization seems to be at 1-3% when running this. Please let me know if anything else is required. Regards, Anishek -- http://about.me/BrianTarbox
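For reference, a minimal sketch of the client setup described above, assuming the DataStax Java driver 2.0 API; the contact point, keyspace, and bound values are illustrative, and this only reproduces the configuration in the thread (12-second client timeout, consistency ONE), not a fix:

    import com.datastax.driver.core.*;

    public class WriteTimeoutExample {
        public static void main(String[] args) {
            // Raise the client-side read timeout to 12s, as described in the thread.
            SocketOptions socketOptions = new SocketOptions().setReadTimeoutMillis(12000);
            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")
                    .withSocketOptions(socketOptions)
                    .build();
            Session session = cluster.connect("test_ks");

            PreparedStatement insert = session.prepare(
                    "INSERT INTO t1 (id, ts, lat, lon, a) VALUES (?, ?, ?, ?, ?)");
            insert.setConsistencyLevel(ConsistencyLevel.ONE);   // writes at CL.ONE

            // 300 rows for one partition key; the real test runs 4000 partitions from 25 threads.
            long partition = 42L;
            for (int i = 0; i < 300; i++) {
                session.execute(insert.bind(partition, new java.util.Date(), 1.0f, 2.0f, (long) i));
            }
            cluster.close();
        }
    }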
Re: High read latency after data volume increased
C* seems to have more than its share of "version X doesn't work, use version Y" type issues. On Thu, Jan 8, 2015 at 2:23 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Jan 8, 2015 at 11:14 AM, Roni Balthazar ronibaltha...@gmail.com wrote: We are using C* 2.1.2 with 2 DCs: 30 nodes in DC1 and 10 nodes in DC2. https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ 2.1.2 in particular is known to have significant issues. You'd be better off running 2.1.1 ... =Rob -- http://about.me/BrianTarbox
multiple threads updating result in TransportException
We're running into a problem where things are fine if our client runs single-threaded but we get TransportException if we use multiple threads. The DataStax driver gets an NIO checkBounds error. Here is a link to a Stack Overflow question we found that describes the problem we're seeing. This question was asked 7 months ago and got no answers. We're running C* 2.0.9 and see the problem on our single node test cluster. Here is the stack trace we see:
at java.nio.Buffer.checkBounds(Buffer.java:559) ~[na:1.7.0_55]
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:143) ~[na:1.7.0_55]
at org.jboss.netty.buffer.HeapChannelBuffer.setBytes(HeapChannelBuffer.java:136) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.buffer.AbstractChannelBuffer.writeBytes(AbstractChannelBuffer.java:472) ~[netty-3.7.0.Final.jar:na]
at com.datastax.driver.core.CBUtil.writeValue(CBUtil.java:272) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.CBUtil.writeValueList(CBUtil.java:297) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Requests$QueryProtocolOptions.encode(Requests.java:223) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Requests$Execute$1.encode(Requests.java:122) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Requests$Execute$1.encode(Requests.java:119) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at com.datastax.driver.core.Message$ProtocolEncoder.encode(Message.java:184) ~[cassandra-driver-core-2.0.0-rc2.jar:na]
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.doEncode(OneToOneEncoder.java:66) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:59) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.channel.Channels.write(Channels.java:704) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.channel.Channels.write(Channels.java:671) ~[netty-3.7.0.Final.jar:na]
at org.jboss.netty.channel.Ab
-- http://about.me/BrianTarbox
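A pattern worth ruling out with this kind of multi-threaded encoding failure: in the DataStax Java driver, Cluster, Session, and PreparedStatement are safe to share across threads, but a BoundStatement is not, and sharing one across writer threads can corrupt the request while it is being encoded. A sketch of the per-thread bind pattern follows; the keyspace, table, and thread counts are hypothetical, and this is only one possible cause of the exception above (the cassandra-driver-core-2.0.0-rc2 jar in the trace is also a pre-release build, so trying a released 2.0.x driver is worth doing as well):

    import com.datastax.driver.core.*;
    import java.util.concurrent.*;

    public class MultiThreadedWrites {
        public static void main(String[] args) throws Exception {
            final Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            final Session session = cluster.connect("test_ks");            // shared: thread-safe
            final PreparedStatement ps = session.prepare(
                    "INSERT INTO events (id, payload) VALUES (?, ?)");     // shared: thread-safe

            ExecutorService pool = Executors.newFixedThreadPool(8);
            for (int t = 0; t < 8; t++) {
                final int thread = t;
                pool.submit(new Runnable() {
                    public void run() {
                        for (int i = 0; i < 1000; i++) {
                            // Each thread binds its own BoundStatement; never share one across threads.
                            BoundStatement bound = ps.bind((long) (thread * 1000 + i), "payload-" + i);
                            session.execute(bound);
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.MINUTES);
            cluster.close();
        }
    }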
read after write inconsistent even on a one node cluster
We're doing development on a single node cluster (and yes, of course we're not really deploying that way), and we're getting inconsistent behavior on reads after writes. We write values to our keyspaces and then immediately read the values back (in our Cucumber tests). About 20% of the time we get the old value. If we wait 1 second and redo the query (within the same java method) we get the new value. This is all happening on a single node... how is this possible? We're using 2.0.9 and the java client. Though it shouldn't matter given a single node cluster, I set the consistency level to ALL with no effect. I've read CASSANDRA-876 which seems spot-on but it was closed as won't-fix... and I don't see what the solution is. Thanks in advance for any help. Brian Tarbox -- http://about.me/BrianTarbox
Re: read after write inconsistent even on a one node cluster
Thanks. Right now it's just for testing, but in general we can't guard against multiple users ending up in the situation where one writes and then one reads. It would be one thing if the read just got old data, but we're seeing it return wrong data... i.e. data that doesn't correspond to any particular version of the object. Brian On Thu, Nov 6, 2014 at 10:30 AM, Eric Stevens migh...@gmail.com wrote: If this is just for doing tests to make sure you get back the data you expect, I would recommend looking at some sort of "eventually" construct in your testing. We use Specs2 as our testing framework, and our write-then-read tests look something like this: someDAO.write(someObject) eventually { someDAO.read(someObject.id) mustEqual someObject } This will retry the read repeatedly over a short duration. Just in case you are trying to do write-then-read outside of tests, you should be aware that it's a Bad Idea™, but your email reads like you already know that =) On Thu Nov 06 2014 at 7:16:25 AM Brian Tarbox briantar...@gmail.com wrote: We're doing development on a single node cluster (and yes, of course we're not really deploying that way), and we're getting inconsistent behavior on reads after writes. We write values to our keyspaces and then immediately read the values back (in our Cucumber tests). About 20% of the time we get the old value. If we wait 1 second and redo the query (within the same java method) we get the new value. This is all happening on a single node... how is this possible? We're using 2.0.9 and the java client. Though it shouldn't matter given a single node cluster, I set the consistency level to ALL with no effect. I've read CASSANDRA-876 which seems spot-on but it was closed as won't-fix... and I don't see what the solution is. Thanks in advance for any help. Brian Tarbox -- http://about.me/BrianTarbox -- http://about.me/BrianTarbox
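The same retry-until-consistent idea as the Specs2 "eventually" block above, sketched in plain Java for test code that cannot use Specs2; the DAO, timeout, and poll interval here are hypothetical:

    // A small plain-Java analogue of the Specs2 "eventually" block, for tests only.
    public abstract class Eventually<T> {
        protected abstract T read();          // the read to retry, e.g. someDAO.read(id)

        public T await(T expected, long timeoutMillis) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMillis;
            T last = null;
            while (System.currentTimeMillis() < deadline) {
                last = read();
                if (expected.equals(last)) {
                    return last;              // the read caught up with the write
                }
                Thread.sleep(50);             // brief pause before retrying
            }
            throw new AssertionError("expected " + expected + " but last read was " + last);
        }
    }

    Usage in a test would look roughly like:
    new Eventually<MyObject>() { protected MyObject read() { return someDAO.read(id); } }.await(someObject, 2000);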
Re: Options for expanding Cassandra cluster on AWS
The last guidance I heard from DataStax was to use m2.2xlarge's on AWS and put data on the ephemeral drive... have they changed this guidance? Brian On Tue, Aug 19, 2014 at 9:41 AM, Oleg Dulin oleg.du...@gmail.com wrote: Distinguished Colleagues: Our current Cassandra cluster on AWS looks like this: 3 nodes in N. Virginia, one per zone. RF=3. Each node is a c3.4xlarge with 2x160G SSDs in RAID-0 (~300 Gig SSD on each node). Works great; I find it the most optimal configuration for a Cassandra node. But the time is coming soon when I need to expand storage capacity. I have the following options in front of me: 1) Add 3 more c3.4xlarge nodes. This keeps the amount of data on each node reasonable, and all repairs and other tasks can complete in a reasonable amount of time. The downside is that c3.4xlarge are pricey. 2) Add provisioned EBS volumes. These days I can get SSD-backed EBS with up to 4000 IOPS provisioned. I can add those volumes to the data_directories list in the yaml, and I expect Cassandra can deal with that JBOD-style. The upside is that it is much cheaper than option #1 above; the downside is that it is a much slower configuration and repairs can take longer. I'd appreciate any input on this topic. Thanks in advance, Oleg -- http://about.me/BrianTarbox
do all nodes actually send the data to the coordinator when doing a read?
We're considering a C* setup with very large columns and I have a question about the details of reads. I understand that a read request gets handled by the coordinator, which sends read requests to a quorum of the nodes holding replicas of the data, and once a quorum of nodes has replied with consistent data it is returned to the client. My understanding is that each of the nodes actually sends the full data being requested to the coordinator (which in the case of very large columns would involve lots of network traffic). Is that right? The alternative (which I don't think is the case but I've been asked to verify) is that the replicas first send metadata to the coordinator, which then asks one replica to send the actual data. Again, I don't think this is the case but was asked to confirm. Thanks. -- http://about.me/BrianTarbox
nodetool repair saying starting and then nothing, and nothing in any of the server logs either
I have a six node cluster in AWS (repl: 3) and recently noticed that repair was hanging. I've run with the -pr switch. I see this output on the nodetool command line (and also in that node's system.log): "Starting repair command #9, repairing 256 ranges for keyspace dev_a" but then no other output. And I see nothing in any of the other nodes' log files. Right now the application using C* is turned off so there is zero activity. I've let it sit in this state for up to 24 hours with nothing more logged. Any suggestions?
Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either
We're running 1.2.13. Any chance that doing a rolling-restart would help? Would running without the -pr improve the odds? Thanks. On Tue, Jul 1, 2014 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com wrote: I have a six node cluster in AWS (repl:3) and recently noticed that repair was hanging. I've run with the -pr switch. It'll do that. What version of Cassandra? =Rob
Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either
Does this output from jstack indicate a problem?

ReadRepairStage:12170 daemon prio=10 tid=0x7f9dcc018800 nid=0x7361 waiting on condition [0x7f9db540c000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for 0x000613e049d8 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

ReadRepairStage:12169 daemon prio=10 tid=0x7f9dd4009000 nid=0x7340 waiting on condition [0x7f9db53cb000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for 0x000613e049d8 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

ReadRepairStage:12168 daemon prio=10 tid=0x7f9dd001d000 nid=0x733f waiting on condition [0x7f9db51a6000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for 0x000613e049d8 (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)

On Tue, Jul 1, 2014 at 2:09 PM, Brian Tarbox tar...@cabotresearch.com wrote: We're running 1.2.13. Any chance that doing a rolling-restart would help? Would running without the -pr improve the odds? Thanks. On Tue, Jul 1, 2014 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com wrote: I have a six node cluster in AWS (repl:3) and recently noticed that repair was hanging. I've run with the -pr switch. It'll do that. What version of Cassandra? =Rob
Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either
Given that an upgrade is (for various internal reasons) not an option at this point...is there anything I can do to get repair working again? I'll also mention that I see this behavior from all nodes. Thanks. On Tue, Jul 1, 2014 at 2:51 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul 1, 2014 at 11:09 AM, Brian Tarbox tar...@cabotresearch.com wrote: We're running 1.2.13. 1.2.17 contains a few streaming fixes which might help. Any chance that doing a rolling-restart would help? Probably not. Would running without the -pr improve the odds? No, that'd make it less likely to succeed. =Rob
Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either
"For what purpose are you running repair?" Because I read that we should! :-) We do delete data from one column family quite regularly... from the other CFs occasionally. We almost never run with less than 100% of our nodes up. In this configuration do we *need* to run repair? Thanks, On Tue, Jul 1, 2014 at 2:57 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul 1, 2014 at 11:54 AM, Brian Tarbox tar...@cabotresearch.com wrote: Given that an upgrade is (for various internal reasons) not an option at this point... is there anything I can do to get repair working again? I'll also mention that I see this behavior from all nodes. I think maybe increasing your phi tolerance for streaming timeouts might help. But basically, no. Repair has historically been quite broken in AWS. It was re-written in 2.0 along with the rest of streaming, and hopefully will soon stabilize and actually work. For what purpose are you running repair? =Rob
running out of diskspace during maintenance tasks
I'm running on AWS m2.2xlarge instances using the ~800 gig ephemeral/attached disk for my data directory. My data size per node is nearing 400 gig. Sometimes during maintenance operations (repairs mostly I think) I run out of disk space as my understanding is that some of these operations require double the space of one's data. Since I can't change the size of attached storage for my instance type my question is can I somehow get these maintenance operations to use other volumes? Failing that, what are my options? Thanks. Brian Tarbox
Re: running out of diskspace during maintenance tasks
We do a repair -pr on each node once a week on a rolling basis. Should we be running cleanup as well? My understanding is that it's only needed after adding/removing nodes? We'd like to avoid adding nodes if possible (which might not be possible). Still curious whether we can get C* to do the maintenance tasks on a separate volume. Thanks. On Wed, Jun 18, 2014 at 12:03 PM, Jeremy Jongsma jer...@barchart.com wrote: One option is to add new nodes, and do a node repair/cleanup on everything. That will at least reduce your per-node data size. On Wed, Jun 18, 2014 at 11:01 AM, Brian Tarbox tar...@cabotresearch.com wrote: I'm running on AWS m2.2xlarge instances using the ~800 gig ephemeral/attached disk for my data directory. My data size per node is nearing 400 gig. Sometimes during maintenance operations (repairs mostly I think) I run out of disk space, as my understanding is that some of these operations require double the space of one's data. Since I can't change the size of attached storage for my instance type, my question is: can I somehow get these maintenance operations to use other volumes? Failing that, what are my options? Thanks. Brian Tarbox
can I kill very old data files in my data folder (I know that sounds crazy but....)
I have a column family that only stores the last 5 days' worth of some data... and yet I have files in the data directory for this CF that are 3 weeks old. They take the form:
keyspace-CFName-ic--Filter.db
keyspace-CFName-ic--Index.db
keyspace-CFName-ic--Data.db
keyspace-CFName-ic--Statistics.db
keyspace-CFName-ic--TOC.txt
keyspace-CFName-ic--Summary.db
I have six bunches of these file groups, each with a different value... and with timestamps from each of the last five days... plus one group from 3 weeks ago... which makes me wonder if that group somehow should have been deleted but was not. The files are tens or hundreds of gigs, so deleting would be good, unless it's really bad! Thanks, Brian Tarbox
Re: can I kill very old data files in my data folder (I know that sounds crazy but....)
Rob, thank you! We are not using TTL; we're manually deleting data more than 5 days old for this CF. We're running 1.2.13 and are using size tiered compaction (this CF is append-only, i.e. zero updates). Sounds like we can get away with doing a (stop, delete old-data-file, restart) process on a rolling basis if I understand you. Thanks, Brian On Wed, Jun 18, 2014 at 2:37 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Jun 18, 2014 at 10:56 AM, Brian Tarbox tar...@cabotresearch.com wrote: I have a column family that only stores the last 5 days' worth of some data... and yet I have files in the data directory for this CF that are 3 weeks old. Are you using TTL? If so: https://issues.apache.org/jira/browse/CASSANDRA-6654 Are you using size tiered or level compaction? I have six bunches of these file groups, each with a different value... and with timestamps from each of the last five days... plus one group from 3 weeks ago... which makes me wonder if that group somehow should have been deleted but was not. The files are tens or hundreds of gigs, so deleting would be good, unless it's really bad! Data files can't be deleted from the data dir with Cassandra running, but it should be fine (if probably technically unsupported) to delete them with Cassandra stopped. In most cases you don't want to do so, because you might un-mask deleted rows or cause unexpected consistency characteristics. In your case, you know that no data in files created 3 weeks ago can possibly have any value, so it is safe to delete them. =Rob
Re: can I kill very old data files in my data folder (I know that sounds crazy but....)
I don't think I have the space to run a major compaction right now (I'm above 50% disk space used already) and compaction can take extra space I think? On Wed, Jun 18, 2014 at 3:24 PM, Robert Coli rc...@eventbrite.com wrote: On Wed, Jun 18, 2014 at 12:05 PM, Brian Tarbox tar...@cabotresearch.com wrote: Thank you! We are not using TTL, we're manually deleting data more than 5 days old for this CF. We're running 1.2.13 and are using size tiered compaction (this cf is append-only i.e.zero updates). Sounds like we can get away with doing a (stop, delete old-data-file, restart) process on a rolling basis if I understand you. Sure, though in your case (because you're using STS and can) I'd probably just run a major compaction. =Rob
Specifying startBefore with iterators with compositeKeys
I have a composite key consisting of: (integer, bytes) and I have rows like: (1,abc), (1,def), (2,abc), (2,def) and I want to find all rows with the integer part = 2. I need to create a startBeyondName using CompositeType.Builder class and am wondering if specifying (2, Bytes.Empty) will sort correctly? I think another way of saying this is: does HeapByteBuffer with pos=,lim=0,cap=0 sort prior to any other possible HeapByteBuffer? Thanks.
getting dropped messages in log even with no one running
I'm getting "messages dropped" log messages in my cluster even when (like right now) there are no clients running against the cluster. 1) Who could be generating the traffic if there are no clients? 2) Is there a way to list active clients... on the off chance that there is a client I don't know about? 3) Why is "messages dropped" an INFO rather than a WARNING? I'm running 1.2.13 on a six node AWS cluster on m2-2xlarge servers. Any help is appreciated. Brian
Re: getting dropped messages in log even with no one running
The problem was that one of my nodes was in some kind of bad hinted-handoff loop. I looked at CASSANDRA-4740 which discusses this, but with no solution that I could see. When I killed the server trying to do the hinted handoffs, the other nodes stopped complaining... as soon as I restarted the node all the other nodes went right back into the dropping-messages state. Help please. Brian On Mon, Mar 24, 2014 at 10:01 AM, Brian Tarbox tar...@cabotresearch.com wrote: I'm getting "messages dropped" log messages in my cluster even when (like right now) there are no clients running against the cluster. 1) Who could be generating the traffic if there are no clients? 2) Is there a way to list active clients... on the off chance that there is a client I don't know about? 3) Why is "messages dropped" an INFO rather than a WARNING? I'm running 1.2.13 on a six node AWS cluster on m2-2xlarge servers. Any help is appreciated. Brian
getting lots of dropped messages/requests/mutations but only on 2 of 6 servers
I have a six node cluster (running m2-2xlarge instances in AWS) with RF=3 and I'm seeing two of the six nodes reporting lots of dropped messages. The six machines are identical (created from same AWS AMI) so this local behavior has me puzzled. BTW this is mostly happening when I'm reading via secondary indexes. I only have 40 gig of data or so on each machine. Any help appreciated.
Re: this seems like a flaw in Node Selection
Does this still apply since we're using 1.2.13? (Should have said that in the original message.) Thank you. On Thu, Mar 20, 2014 at 3:57 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Mar 20, 2014 at 12:31 PM, Brian Tarbox tar...@cabotresearch.com wrote: I've seen this problem with other companies and products: least-loaded as a means of picking servers is almost always liable to death spirals when a server has a failure. Is there any way to configure away from this in C*? Disable the dynamic snitch... via the hidden but still valid [1] configuration directive dynamic_snitch in cassandra.yaml: dynamic_snitch: false https://issues.apache.org/jira/browse/CASSANDRA-3229 Given the various other bad edge cases with the Dynamic Snitch (sending requests to the wrong DC on reset, etc.) this might be worth considering in general... especially with the speculative execution stuff in 2.0... I do note that https://issues.apache.org/jira/browse/CASSANDRA-6465 has a patch in 2.0.5, yay... =Rob
Re: this seems like a flaw in Node Selection
Yes, I was going to say (sorry for the brain-freeze) that this is behavior in Pelops not in C* itself. On Thu, Mar 20, 2014 at 4:15 PM, Tyler Hobbs ty...@datastax.com wrote: Brian, Are you referring to Pelops? The code you mentioned doesn't exist in Cassandra. On Thu, Mar 20, 2014 at 3:07 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Mar 20, 2014 at 1:03 PM, Brian Tarbox tar...@cabotresearch.comwrote: Does this still apply since we're using 1.2.13? (should have said that in the original message) I checked the cassandra-1.2 branch to verify that the dynamic_snitch config file option is still supported there; it is. =Rob -- Tyler Hobbs DataStax http://datastax.com/
in AWS is it worth trying to talk to a server in the same zone as your client?
We're running a C* cluster with 6 servers spread across the four us-east1 zones. We also spread our clients (hundreds of them) across the four zones. Currently we give our clients a connection string listing all six servers and let C* do its thing. This is all working just fine...and we're paying a fair bit in AWS transfer costs. There is a suspicion that this transfer cost is driven by us passing data around between our C* servers and clients. Would there be any value to trying to get a client to talk to one of the C* servers in its own zone? I understand (at least partially!) about coordinator nodes and replication and know that no matter which server is the coordinator for an operation replication may cause bits to get transferred to/from servers in other zones. Having said that...is there a chance that trying to encourage a client to initially contact a server in its own zone would help? Thank you, Brian Tarbox
Re: in AWS is it worth trying to talk to a server in the same zone as your client?
We're definitely using all private IPs. I guess my question really is: with repl=3 and quorum operations I know we're going to push/pull bits across the various AZs within us-east-1. So, does having the client start the conversation with a server in the same AZ save us anything? On Wed, Feb 12, 2014 at 4:14 PM, Ben Bromhead b...@instaclustr.com wrote: $0.01/GB between zones irrespective of IP is correct. As for your original question, depending on the driver you are using you could write a custom co-ordinator node selection policy. For example if you are using the Datastax driver you would extend http://www.datastax.com/drivers/java/2.0/apidocs/com/datastax/driver/core/policies/LoadBalancingPolicy.html ... and set the distance based on which zone the node is in. An alternate method would be to define the zones as data centres and then you could leverage existing DC aware policies (we've never tried this though). Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr | +61 415 936 359 On 13/02/2014, at 8:00 AM, Andrey Ilinykh ailin...@gmail.com wrote: I think you are mistaken. It is true for the same zone; between zones it is $0.01/GB. On Wed, Feb 12, 2014 at 12:17 PM, Russell Bradberry rbradbe...@gmail.com wrote: Not when using private IP addresses. That pricing *only* applies if you are using the public interface or EIP/ENI. If you use the private IP addresses there is no cost associated. On February 12, 2014 at 3:13:58 PM, William Oberman (ober...@civicscience.com) wrote: Same region, cross zone transfer is $0.01 / GB (see http://aws.amazon.com/ec2/pricing/, Data Transfer section). On Wed, Feb 12, 2014 at 3:04 PM, Russell Bradberry rbradbe...@gmail.com wrote: Cross zone data transfer does not cost any extra money. LOCAL_QUORUM = QUORUM if all 6 servers are located in the same logical datacenter. Ensure your clients are connecting to either the local IP or the AWS hostname that is a CNAME to the local IP from within AWS. If you connect to the public IP you will get charged for outbound data transfer. On February 12, 2014 at 2:58:07 PM, Yogi Nerella (ynerella...@gmail.com) wrote: Also, maybe you need to change the read consistency to local_quorum, otherwise the servers still try to read the data from all other data centers. I can understand the latency, but I can't understand how it would save money? The amount of data transferred from the AWS server to the client should be the same no matter where the client is connected? On Wed, Feb 12, 2014 at 10:33 AM, Andrey Ilinykh ailin...@gmail.com wrote: Yes, sure. Taking data from the same zone will reduce latency and save you some money. On Wed, Feb 12, 2014 at 10:13 AM, Brian Tarbox tar...@cabotresearch.com wrote: We're running a C* cluster with 6 servers spread across the four us-east-1 zones. We also spread our clients (hundreds of them) across the four zones. Currently we give our clients a connection string listing all six servers and let C* do its thing. This is all working just fine... and we're paying a fair bit in AWS transfer costs. There is a suspicion that this transfer cost is driven by us passing data around between our C* servers and clients. Would there be any value to trying to get a client to talk to one of the C* servers in its own zone? I understand (at least partially!) about coordinator nodes and replication and know that no matter which server is the coordinator for an operation, replication may cause bits to get transferred to/from servers in other zones.
Having said that...is there a chance that trying to encourage a client to initially contact a server in its own zone would help? Thank you, Brian Tarbox
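If the zones were defined as logical datacenters, as Ben suggests in the thread above, the existing DC-aware policy in the DataStax Java driver could keep each client talking to coordinators in its own zone. A sketch, assuming the driver 2.0 API and that the client's availability zone has been configured as the Cassandra datacenter "us-east-1a"; the names and contact points are illustrative, and as noted in the thread this zones-as-DCs layout is untested:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    public class ZoneLocalClient {
        public static Cluster build(String localZoneAsDc, String... contactPoints) {
            // Prefer coordinators in the client's own "datacenter" (really its AZ),
            // and route to token-owning replicas when possible.
            return Cluster.builder()
                    .addContactPoints(contactPoints)
                    .withLoadBalancingPolicy(
                            new TokenAwarePolicy(new DCAwareRoundRobinPolicy(localZoneAsDc)))
                    .build();
        }
    }

    Usage: Cluster cluster = ZoneLocalClient.build("us-east-1a", "10.0.1.10", "10.0.2.10");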
Re: Opscenter tabs
A vaguely related question... my OpsCenter now has two separate tabs for the same cluster. One tab shows all six nodes and has their agents; the other tab has the same six nodes but no agents. I see no way to get rid of the spurious tab. On Thu, Jan 23, 2014 at 12:47 PM, Ken Hancock ken.hanc...@schange.com wrote: Multiple DCs are still a single cluster in OpsCenter. If you go to Physical View, you should see one column for each data center. Also, the Community edition of OpsCenter, last I saw, only supported a single cluster. On Thu, Jan 23, 2014 at 12:06 PM, Daniel Curry daniel.cu...@arrayent.com wrote: I am unable to find any references on whether the tabs to monitor multiple DCs can be configured to read the DC location. I do not want to change the cluster name itself. Right now I see three tabs all with the same cluster_name: test. I'd like to keep the current cluster name test, but change the OpsCenter tabs to DC1, DC2, and DC3. Is this documented somewhere? -- Daniel Curry Sr Linux Systems Administrator Arrayent, Inc. 2317 Broadway Street, Suite 20 Redwood City, CA 94063 dan...@arrayent.com -- Ken Hancock | System Architect, Advanced Advertising SeaChange International 50 Nagog Park Acton, Massachusetts 01720 ken.hanc...@schange.com | www.schange.com | NASDAQ:SEAC Office: +1 (978) 889-3329
Re: bad interaction between CompositeTypes and Secondary index
The table was created this way; we also avoid altering existing tables. On Tue, Jan 21, 2014 at 4:19 PM, Jacob Rhoden jacob.rho...@me.com wrote: Was the original table created as-is, or created and then altered? It makes a difference, as I have seen this type of thing occur on tables I first created and then updated. Not sure if that issue was fixed in 2.0.4; I'm avoiding altering tables completely for now. __ Sent from iPhone On 22 Jan 2014, at 7:50 am, Brian Tarbox tar...@cabotresearch.com wrote: We're trying to use CompositeTypes and secondary indexes and are getting an assertion failure in ExtendedFilter.java line 258 (running C* 2.0.3) when we call getIndexedColumns. The assertion is for not finding any columns. The strange bit is that if we re-create the column family in question and do not set ComparatorType then things work fine. This seems odd since, as I understand it, the ComparatorType is for controlling the ordering of columns within a row and the secondary index is for finding a subset of rows that contain a particular column value... in other words they seem like they shouldn't have an interaction. It's also puzzling to us that ExtendedFilter asserts in this case... if it finds no columns I would have expected an empty return but not a failure (which our client code saw as a Timeout exception). Any clues would be appreciated. Thanks, Brian Tarbox
changing several things (almost) at once; is this the right order to make the changes?
We're making several changes and I'd like to confirm that our order of making them is reasonable. Right now we have a 4 node system at replicationFactor=2 running 1.1.6. We're moving to a 6 node system at rf=3 running 1.2.12 (I guess). We think the order should be: 1) change to rf=3 and run repair on all nodes while still at 1.1.6 2) upgrade to 1.1.10 (latest on that branch?) 3) upgrade to 1.2.12 (latest on that branch?) 4) run the convert-to-v_Node command 5) add two more servers Is that reasonable? We run in EC2 and I'm planning on testing it all on a new set of servers just in case, but figured I'd ask the experts first in case I'm doing something foolish. Thanks, Brian Tarbox
looking for advice before upgrading from 1.1.6 to 2.0.1
We're currently running our pre-production system on a 4 node EC2 cluster with C* 1.1.6. We have the luxury of a fresh install... rebuilding all our data so we can skip upgrades and just install a clean system. We obviously won't do this very often so we'd like to do it right... take advantage of new features like vnodes (so we can scale more easily... we're very likely going to 6 nodes soon but not yet) and such. We haven't made many changes to the yaml file... any advice on: - configuration settings? - any client side java level changes we have to make? - other features we should really consider? would be most appreciated. Thank you, Brian Tarbox
Re: Cassandra JVM heap sizes on EC2
The advice I heard at the New York C* conference... which we follow... is to use the m2.2xlarge and give it about 8 GB. The m2.4xlarge seems overkill (or at least overpriced). Brian On Fri, Aug 23, 2013 at 6:12 PM, David Laube d...@stormpath.com wrote: Hi All, We are evaluating our JVM heap size configuration on Cassandra 1.2.8 and would like to get some feedback from the community as to what the proper JVM heap size should be for cassandra nodes deployed on Amazon EC2. We are running m2.4xlarge EC2 instances (64GB RAM, 8 core, 2 x 840GB disks) -- so we will have plenty of RAM. I've already consulted the docs at http://www.datastax.com/documentation/cassandra/1.2/mobile/cassandra/operations/ops_tune_jvm_c.html but would love to hear what is working or not working for you in the wild. Since DataStax cautions against using more than 8GB, I'm wondering if it is even advantageous to use slightly more. Thanks, -David Laube
Re: Which of these VPS configurations would perform better for Cassandra ?
We run a cluster in EC2 and it's working very well for us. The standard seems to be M2.2XLarge instances with data living on the ephemeral drives (which means it's local and fast) and backups either to EBS, S3, or just relying on cluster size and replication (we avoid that last idea). Brian On Sun, Aug 4, 2013 at 9:02 PM, Ben Bromhead b...@instaclustr.com wrote: If you want to get a rough idea of how things will perform, fire up YCSB (https://github.com/brianfrankcooper/YCSB/wiki) and run the tests that most closely match how you think your workload will be (run the test clients from a couple of beefy AWS spot-instances for less than a dollar). As you are a new startup without any existing load/traffic patterns, benchmarking will be your best bet. Also have a look at running Cassandra with SmartOS on Joyent. When you run SmartOS on Joyent, virtualisation is done using Solaris zones, an OS based virtualisation, which is at least a quadrillion times better than KVM, Xen etc. Ok, maybe not that much… but it is pretty cool and has the following benefits: - No hardware emulation. - Shared kernel with the host (you don't have to waste precious memory running a guest OS). - ZFS :) Have a read of http://wiki.smartos.org/display/DOC/SmartOS+Virtualization for more info. There are some downsides as well: The version of Cassandra that comes with the SmartOS package management system is old and busted, so you will want to build from source. You will want to be technically confident in running on something a little outside the norm (SmartOS is based on Solaris). Just make sure you test and benchmark all your options; a few days of testing now will save you weeks of pain. Good luck! Ben Bromhead Instaclustr | www.instaclustr.com | @instaclustr http://twitter.com/instaclustr On 05/08/2013, at 12:34 AM, David Schairer dschai...@humbaba.net wrote: Of course -- my point is simply that if you're looking for speed, SSD+KVM, especially in a shared tenant situation, is unlikely to perform the way you want to. If you're building a pure proof of concept that never stresses the system, it doesn't matter, but if you plan an MVP with any sort of scale, you'll want a plan to be on something more robust. I'll also say that it's really important (imho) to be doing even your dev in a config where you have consistency conditions like eventual production -- so make sure you're writing to both nodes and can have cases where eventual consistency delays kick in, or it'll come back to bite you later -- I've seen this force people to redesign their whole data model when they don't plan for it initially. As I said, I haven't tested DO. I've tested very similar configurations at other providers and they were all terrible under load -- and certainly took away most of the benefits of SSD once you stressed writes a bit. XEN+SSD, on modern kernels, should work better, but I didn't test it (linode doesn't offer this, though, and they've had lots of other challenges of late). --DRS On Aug 3, 2013, at 11:40 PM, Ertio Lew ertio...@gmail.com wrote: @David: Like all other start-ups, we too cannot start with all dedicated servers for Cassandra. So right now we have no better choice except for using a VPS :), but we can definitely choose one from amongst a suitable set of VPS configurations. As of now, since we are starting out, could we initiate our cluster with 2 nodes (RF=2), (KVM, 2GB ram, 2 cores, 30GB SSD)? Right now we won't be having a very heavy load on Cassandra for the next few months until we grow our user base.
So, this choice is mainly based on the pricing vs configuration as well as digital ocean's good reputation in the community. On Sun, Aug 4, 2013 at 12:53 AM, David Schairer dschai...@humbaba.net wrote: I've run several lab configurations on linodes; I wouldn't run cassandra on any shared virtual platform for large-scale production, just because your IO performance is going to be really hard to predict. Lots of people do, though -- depends on your cassandra loads and how consistent you need to have performance be, as well as how much of your working set will fit into memory. Remember that linode significantly oversells their CPU as well. The release version of KVM, at least as of a few months ago, still doesn't support TRIM on SSD; that, plus the fact that you don't know how others will use SSDs or if their file systems will keep the SSDs healthy, means that SSD performance on KVM is going to be highly unpredictable. I have not tested digitalocean, but I did test several other KVM+SSD shared-tenant hosting providers aggressively for cassandra a couple months ago; they all failed badly. Your mileage will vary considerably based on what you need out of cassandra, what your data patterns look like, and how you configure your system. That said, I would use xen before KVM for high-performance IO. I have not run Cassandra in any volume on
Re: too many open files
Odd that this discussion happens now as I'm also getting this error. I get a burst of error messages and then the system continues... with no apparent ill effect. I can't tell what the system was doing at the time... here is the stack. BTW OpsCenter says I only have 4 or 5 SSTables in each of my 6 CFs.

ERROR [ReadStage:62384] 2013-07-14 18:04:26,062 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[ReadStage:62384,5,main]
java.io.IOError: java.io.FileNotFoundException: /tmp_vol/cassandra/data/dev_a/portfoliodao/dev_a-portfoliodao-hf-166-Data.db (Too many open files)
at org.apache.cassandra.io.util.CompressedSegmentedFile.getSegment(CompressedSegmentedFile.java:69)
at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:898)
at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:63)
at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:61)
at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:79)
at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:124)
at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:64)
at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1345)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1207)
at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1142)
at org.apache.cassandra.db.Table.getRow(Table.java:378)
at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58)
at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:51)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.FileNotFoundException: /tmp_vol/cassandra/data/dev_a/portfoliodao/dev_a-portfoliodao-hf-166-Data.db (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:67)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:64)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:46)
at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:41)
at org.apache.cassandra.io.util.CompressedSegmentedFile.getSegment(CompressedSegmentedFile.java:63)
... 16 more

On Mon, Jul 15, 2013 at 7:23 AM, Michał Michalski mich...@opera.com wrote: It doesn't tell you anything if a file ends with ic-, except pointing out the SSTable version it uses (ic in this case). Files related to a secondary index contain something like this in the filename: KS-CF.IDX-NAME, while regular CFs do not contain any dots except the one just before the file extension. M. On 15.07.2013 at 09:38, Paul Ingalls wrote: Also, looking through the log, it appears a lot of the files end with ic- which I assume is associated with a secondary index I have on the table. Are secondary indexes really expensive from a file descriptor standpoint? That particular table uses the default compaction scheme...
On Jul 15, 2013, at 12:00 AM, Paul Ingalls paulinga...@gmail.com wrote: I have one table that is using leveled. It was set to 10MB, I will try changing it to 256MB. Is there a good way to merge the existing sstables? On Jul 14, 2013, at 5:32 PM, Jonathan Haddad j...@jonhaddad.com wrote: Are you using leveled compaction? If so, what do you have the file size set at? If you're using the defaults, you'll have a ton of really small files. I believe Albert Tobey recommended using 256MB for the table sstable_size_in_mb to avoid this problem. On Sun, Jul 14, 2013 at 5:10 PM, Paul Ingalls paulinga...@gmail.com wrote: I'm running into a problem where instances of my cluster are hitting over 450K open files. Is this normal for a 4 node 1.2.6 cluster with replication factor of 3 and about 50GB of data on each node? I can push the file descriptor limit up, but I plan on having a much larger load so I'm wondering if I should be looking at something else…. Let me know if you need more info… Paul -- Jon Haddad http://www.rustyrazorblade.com skype: rustyrazorblade
Re: Alternate major compaction
Perhaps I should already know this, but why is running a major compaction considered so bad? We're running 1.1.6. Thanks. On Thu, Jul 11, 2013 at 7:51 AM, Takenori Sato ts...@cloudian.com wrote: Hi, I think it is a common headache for users running a large Cassandra cluster in production. Running a major compaction is not the only cause; there are others. For example, I see two typical scenarios: 1. backup use case 2. active wide row. In the case of 1, say, a piece of data is removed a year later. This means the tombstone on the row is 1 year away from the original row. To remove an expired row entirely, a compaction set has to include all the rows. So, when are the original, 1-year-old row and the tombstoned row included in the same compaction set? It is likely to take one year. In the case of 2, such an active wide row exists in most of the sstable files, and it typically contains many expired columns. But none of them would be removed entirely because a compaction set practically does not include all the row fragments. Btw, there is a very convenient MBean API available: CompactionManager's forceUserDefinedCompaction. You can invoke a minor compaction on a file set you define. So the question is how to find an optimal set of sstable files. Then, I wrote a tool to check garbage, and it prints out some useful information to find such an optimal set. Here's a simple log output.

# /opt/cassandra/bin/checksstablegarbage -e /cassandra_data/UserData/Test5_BLOB-hc-4-Data.db
[Keyspace, ColumnFamily, gcGraceSeconds(gcBefore)] = [UserData, Test5_BLOB, 300(1373504071)]
===
ROW_KEY, TOTAL_SIZE, COMPACTED_SIZE, TOMBSTONED, EXPIRED, REMAINNING_SSTABLE_FILES
===
hello5/100.txt.1373502926003, 40, 40, YES, YES, Test5_BLOB-hc-3-Data.db
---
TOTAL, 40, 40
===

REMAINNING_SSTABLE_FILES means any other sstable files that contain the respective row. So, the following is an optimal set.

# /opt/cassandra/bin/checksstablegarbage -e /cassandra_data/UserData/Test5_BLOB-hc-4-Data.db /cassandra_data/UserData/Test5_BLOB-hc-3-Data.db
[Keyspace, ColumnFamily, gcGraceSeconds(gcBefore)] = [UserData, Test5_BLOB, 300(1373504131)]
===
ROW_KEY, TOTAL_SIZE, COMPACTED_SIZE, TOMBSTONED, EXPIRED, REMAINNING_SSTABLE_FILES
===
hello5/100.txt.1373502926003, 223, 0, YES, YES
---
TOTAL, 223, 0
===

This tool relies on SSTableReader and an aggregation iterator, as Cassandra does in compaction. I was considering sharing this with the community, so let me know if anyone is interested. Ah, note that it is based on 1.0.7, so I will need to check and update it for newer versions. Thanks, Takenori On Thu, Jul 11, 2013 at 6:46 PM, Tomàs Núnez tomas.nu...@groupalia.com wrote: Hi, About a year ago we did a major compaction in our cassandra cluster (a n00b mistake, I know), and since then we've had huge sstables that never get compacted, and we were condemned to repeat the major compaction process every once in a while (we are using the SizeTieredCompaction strategy, and we've not evaluated LeveledCompaction yet, because it has its downsides and we've had no time to test all of them in our environment). I was trying to find a way to solve this situation (that is, do something like a major compaction that writes small sstables, not huge ones as major compaction does), and I couldn't find it in the documentation. I tried cleanup and scrub/upgradesstables, but they don't do that (as the documentation states).
Then I tried deleting all data on a node and bootstrapping it (or nodetool rebuild-ing it), hoping that this way the sstables would get cleaned of deleted records and updates. But the deleted node just copied the sstables from another node as they were, cleaning nothing. So I tried a new approach: I switched the sstable compaction strategy (SizeTiered to Leveled), forcing the sstables to be rewritten from scratch, and then switched it back (Leveled to SizeTiered). It took a while (but so does the major compaction process) and it worked: I have smaller sstables, and I've regained a lot of disk space. I'm happy with the results, but it doesn't seem an orthodox way of cleaning the sstables. What do you think, is it something wrong or crazy? Is there a different way to achieve the same thing? Let's put an example: Suppose you have a write-only
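For anyone wanting to drive the forceUserDefinedCompaction operation that Takenori mentions without writing a full tool, a JMX sketch follows. It assumes the CompactionManager MBean name used by the 1.x line and the two-argument (keyspace, comma-separated data file list) operation signature of that era; the signature has changed across Cassandra versions, so check the CompactionManagerMBean of the release actually in use, and the keyspace and file names below are taken from the example output above purely for illustration:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class UserDefinedCompaction {
        public static void main(String[] args) throws Exception {
            // Connect to the local node's JMX port (7199 by default).
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = connector.getMBeanServerConnection();

            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");

            // Compact exactly the SSTables chosen (e.g. from the garbage-checking tool above).
            mbs.invoke(compactionManager,
                    "forceUserDefinedCompaction",
                    new Object[] { "UserData", "Test5_BLOB-hc-3-Data.db,Test5_BLOB-hc-4-Data.db" },
                    new String[] { String.class.getName(), String.class.getName() });

            connector.close();
        }
    }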
Re: Unreachable Nodes
Have to disagree with the "does no harm" comment just a tiny bit. I had a similar situation recently and coincidentally needed to do a CF truncate. The system rejected the request, saying that not all nodes were up. Nodetool ring said everyone was up, but nodetool gossipinfo said there were vestiges of dead nodes still hanging around. I ended up restarting the entire cluster, which cleared the issue. Brian On Wed, May 22, 2013 at 6:46 AM, Vasileios Vlachos vasileiosvlac...@gmail.com wrote: Hello, Thanks for your fast response. That makes sense. I'll just keep an eye on it then. Many thanks, Vasilis On Wed, May 22, 2013 at 10:54 AM, Alain RODRIGUEZ arodr...@gmail.com wrote: Hi. I think that the unsafeAssassinateEndpoint was the good solution here. I was going to lead you to this solution after reading the first part of your message. "Does anyone know why the dead nodes still appear when we run nodetool gossipinfo but they don't when we run describe cluster from the CLI?" That's a good thing. Gossiper just keeps this information for a while (7 or 10 days by default, off the top of my head), but this doesn't harm your cluster in any way, whereas having UNREACHABLE nodes could have been annoying. By the way, gossipinfo shows you those nodes as STATUS:LEFT, which is good. I am quite sure that this status changed when you used the JMX unsafeAssassinateEndpoint. "do a full cluster restart (I presume that means a rolling restart - not shut-down the entire cluster right???)" A full restart = entire cluster down = down time. It is precisely *not* a rolling restart. To conclude I would say that your cluster seems healthy now (from what I can see), you have no more ghost nodes and nothing to do. Just wait a week or so and look at gossipinfo again. 2013/5/22 Vasileios Vlachos vasileiosvlac...@gmail.com Hello All, A while ago we had 3 cassandra nodes on Amazon. At some point we decided to buy some servers and deploy cassandra there. The problem is that since then we have a list of dead IPs listed as UNREACHABLE nodes when we run describe cluster on cassandra-cli. I have seen other posts which describe similar issues, and the bottom line is it's harmless, but if you want to get rid of it do a full cluster restart (I presume that means a rolling restart - not shut-down the entire cluster right???). Anyway... We also came across another solution: Install libmx4j-java, uncomment the respective line on /etc/default/cassandra, restart the node, go to http://cassandra_node:8081/mbean?objectname=org.apache.cassandra.net%3Atype%3DGossiper;, type in the dead IP/IPs next to unsafeAssassinateEndpoint and invoke it. So we did that on one of the nodes for the list of dead IPs. After running describe cluster on the CLI on every node, we noticed that there were no UNREACHABLE nodes and everything looked OK.
However, when we run nodetool gossipinfo we get the following output:

/10.1.32.97
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2--16fe4da5dbff
LOAD:2.76851457173E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,56713727820156410577229101238628035243
/10.128.16.111
REMOVAL_COORDINATOR:REMOVER,113427455640312821154458202477256070486
STATUS:LEFT,42537039300520238181471502256297362072,1369471488145
/10.128.16.110
REMOVAL_COORDINATOR:REMOVER,1
STATUS:LEFT,42537092606577173116506557155915918934,1369471275829
/10.1.32.100
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2--16fe4da5dbff
LOAD:2.75649392881E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,85070591730234615865843651857942052863
/10.1.32.101
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2--16fe4da5dbff
LOAD:2.71158702006E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,141784319550391026443072753096570088105
/10.1.32.98
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2--16fe4da5dbff
LOAD:2.73163150773E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,113427455640312821154458202477256070486
/10.128.16.112
REMOVAL_COORDINATOR:REMOVER,1
STATUS:LEFT,42537092606577173116506557155915918934,1369471567719
/10.1.32.99
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2--16fe4da5dbff
LOAD:2.72271268395E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,28356863910078205288614550619314017621
/10.1.32.96
RELEASE_VERSION:1.0.11
SCHEMA:b1116df0-b3dd-11e2--16fe4da5dbff
LOAD:2.71494331357E11
RPC_ADDRESS:0.0.0.0
STATUS:NORMAL,0

Does anyone know why the dead nodes still appear when we run nodetool gossipinfo but they don't when we run describe cluster from the CLI? Thank you in advance for your help, Vasilis
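The same Gossiper operation that the thread invokes through the mx4j web page can also be called directly over JMX. A sketch, assuming the MBean name shown in the mx4j URL above and a node listening on the default JMX port; the host and dead IP are illustrative, and as the operation name says, it should only be used for nodes that are truly gone:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class AssassinateDeadNode {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://cassandra_node:7199/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = connector.getMBeanServerConnection();

            ObjectName gossiper = new ObjectName("org.apache.cassandra.net:type=Gossiper");

            // Remove the ghost endpoint from gossip state.
            mbs.invoke(gossiper,
                    "unsafeAssassinateEndpoint",
                    new Object[] { "10.128.16.110" },
                    new String[] { String.class.getName() });

            connector.close();
        }
    }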
best practices on EC2 question
From this list and the NYC* conference it seems that the consensus configuration of C* on EC2 is to put the data on an ephemeral drive and then periodically back the drive up to S3... relying on C*'s inherent fault tolerance to deal with any data loss. Fine, and we're doing this... but we find that transfer rates from S3 back to a rebooted server instance are *very* slow... like 15 MB/second, or roughly a minute per gigabyte. Calling EC2 support resulted in them saying sorry, that's how it is. I'm wondering if anyone a) has found a faster way to transfer to S3, or b) do people skip backups altogether except for huge outages and just let rebooted server instances come up empty to repopulate via C*? An alternative that we had explored for a while was to do a two stage backup: 1) copy a C* snapshot from the ephemeral drive to an EBS drive 2) do an EBS snapshot to S3. The idea being that EBS is quite reliable, S3 is still the emergency backup, and copying back from EBS to ephemeral is likely much faster than the 15 MB/sec we get from S3. Thoughts? Brian
how to monitor nodetool cleanup?
I'm recovering from a significant failure and so am doing lots of nodetool move, removetoken, repair and cleanup. For most of these I can do nodetool netstats to monitor progress but it doesn't show anything for cleanup...how can I monitor the progress of cleanup? On a related note: I'm able to stop all client access to the cluster until things are happy again...is there anything I can do to make move/repair/cleanup go faster? FWIW my problems came from trying to move nodes between EC2 availability zones...which led to 1) killing a node and recreating it in another availability zone 2) new node had different local ip address so cluster thought old node was just down and we had a new node... I did the removetoken on the dead node and gave the new node oldToken-1...but things still got weird and I ended up spending a couple of days cleaning up (which seems odd for only about 300 gig total data). Anyway, any suggestions for monitoring / speeding up cleanup would be appreciated. Brian Tarbox
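Cleanup runs as a compaction-style task, so its progress shows up under the CompactionManager MBean (the same data nodetool compactionstats displays) rather than in netstats. A JMX polling sketch, assuming the Compactions attribute exposed by the 1.2-era CompactionManagerMBean; attribute layout can differ between versions, and the host and poll interval are illustrative:

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class CleanupProgress {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
            JMXConnector connector = JMXConnectorFactory.connect(url);
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");

            // Poll the list of in-flight compaction-style tasks; cleanup shows up here.
            while (true) {
                Object tasks = mbs.getAttribute(compactionManager, "Compactions");
                System.out.println(tasks);
                Thread.sleep(10000);   // loop forever; stop the program when done watching
            }
        }
    }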
Re: Cassandra Summit 2013
Jonathan, I'm a bit puzzled. I had planned to attend Cassandra's major conference in the summer but then the NYC* conference was announced. I spoke with DataStax and was told that there was no summer conference this year and that NYC* was all there was. So, I spent my conference time/budget on it and so now can't attend SFSummit25. NYC* was great but I feel that I got misled...or did I misunderstand somehow??? Brian Tarbox On Fri, Apr 12, 2013 at 11:50 AM, Jonathan Ellis jbel...@gmail.com wrote: Hi all, Last year's Summit saw fantastic talks [1] and over 800 attendees. The feedback was enthusiastic; the most commonly requested improvement was to extend it to two days. We're pleased to deliver just that for 2013! This year's Cassandra Summit will be at Fort Mason in San Francisco, California from June 11th - 12th, with 45+ sessions covering Cassandra use cases, development tips and tricks, war stories, how-tos, and more. The popular meet the experts room will also return. Engineers and committers from companies such as Spotify, eBay, Netflix, Comcast, BlueMountain Capital, and DataStax will be there excited to share their Cassandra experiences. The schedule of talks is about 90% final. To view it and register, visit http://www.datastax.com/company/news-and-events/events/cassandrasummit2013 and use the code SFSummit25 for 25% off. See you there! [1] http://www.datastax.com/company/news-and-events/events/cassandrasummit2012/presentations -- Jonathan Ellis Project Chair, Apache Cassandra co-founder, http://www.datastax.com @spyced
any other NYC* attendees find your usb stick of the proceedings empty?
Last week I attended DataStax's NYC* conference and one of the give-aways was a wooden USB stick. Finally getting around to loading it I find it empty. Anyone else have this problem? Are the conference presentations available somewhere else? Brian Tarbox
Re: cfhistograms
I think we all go through this learning curve. Here is the answer I gave last time this question was asked: The output of this command seems to make no sense unless I think of it as 5 completely separate histograms that just happen to be displayed together. Using this example output should I read it as: my reads all took either 1 or 2 sstable. And separately, I had write latencies of 3,7,19. And separately I had read latencies of 2, 8,69, etc? In other words...each row isn't really a row...i.e. on those 16033 reads from a single SSTable I didn't have 0 write latency, 0 read latency, 0 row size and 0 column count. Is that right?

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1          16033              0             0         0             0
2            303              0             0         0             1
3              0              0             0         0             0
4              0              0             0         0             0
5              0              0             0         0             0
6              0              0             0         0             0
7              0              0             0         0             0
8              0              0             2         0             0
10             0              0             0         0          6261
12             0              0             2         0           117
14             0              0             8         0             0
17             0              3            69         0           255
20             0              7           163         0             0
24             0             19          1369         0             0

On Mon, Mar 25, 2013 at 11:52 AM, Kanwar Sangha kan...@mavenir.com wrote: Can someone explain how to read the cfhistograms o/p?

[root@db4 ~]# nodetool cfhistograms usertable data
usertable/data histograms
Offset   SSTables  Write Latency  Read Latency  Row Size  Column Count
1         2857444           4051             0         0        342711
2         6355104          27021             0         0        201313
3         2579941          61600             0         0        130489
4          374067         119286             0         0         91378
5            9175         210934             0         0         68548
6               0         321098             0         0         54479
7               0         476677             0         0         45427
8               0         734846             0         0         38814
10              0        2867967             4         0         65512
12              0        5366844            22         0         59967
14              0        6911431            36         0         63980
17              0       10155740           127         0        115714
20              0        7432318           302         0        138759
24              0        5231047           969         0        193477
29              0        2368553          2790         0        209998
35              0         859591          4385         0        204751
42              0         456978          3790         0        214658
50              0         306084          2465         0        151838
60              0         223202          2158         0         40277
72              0         122906          2896         0          1735

Thanks Kanwar
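To make the "each column is its own histogram" point concrete, here is a small sketch that reads one column as (bucket offset, count) pairs -- the Read Latency column from the first table above -- and walks the buckets to find an approximate percentile. It is only an illustration of how the buckets are meant to be read, not a tool.

// Sketch: treating one cfhistograms column as an independent histogram.
// Each entry is (bucket offset, count); here, the Read Latency column from the
// table above: 2 reads in the 8-microsecond bucket, 2 in 12, 8 in 14, 69 in 17,
// 163 in 20, 1369 in 24.
public class HistogramColumn {
    public static void main(String[] args) {
        long[] offsets = {8, 12, 14, 17, 20, 24};
        long[] counts  = {2,  2,  8, 69, 163, 1369};

        long total = 0;
        for (long c : counts) total += c;

        // Walk the buckets until we pass 99% of the observations.
        long seen = 0;
        for (int i = 0; i < offsets.length; i++) {
            seen += counts[i];
            if (seen >= 0.99 * total) {
                System.out.println("~p99 read latency <= " + offsets[i] + " microseconds");
                break;
            }
        }
        System.out.println(total + " reads in this histogram");
    }
}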
Re: what addresses to use in EC2 cluster (whenever an instance restarts it gets a new private ip)?
When my EC2 instance failed I restarted it, and added the new private IP address to the list of seed nodes (was this my error?). Nodetool then showed 4 live nodes and one dead one (corresponding to the old private IP address). I'm guessing that what I should have done on the restarted node is start it with -Dreplace_token? In such cases what should I do with the list of seed nodes? I think this is a great opportunity for a technical paper or something on how to set up Cassandra on EC2. :-) BTW: I'm running with encrypted disks running live on ephemeral drives that get periodically copied back to EBS stores so I don't lose anything. Brian On Tue, Feb 12, 2013 at 12:20 PM, aaron morton aa...@thelastpickle.com wrote: Cassandra handles nodes changing IP. The important thing to Cassandra is the token, not the IP. In your case did the replacement node have the same token as the failed one? You can normally work around these issues using commands like nodetool removetoken. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 12/02/2013, at 10:04 AM, Andrey Ilinykh ailin...@gmail.com wrote: You have to use private IPs, but if an instance dies you have to bootstrap it with the replace token flag. If you use EC2 I'd recommend Netflix's Priam tool. It manages all that stuff, plus you have S3 backup. Andrey On Mon, Feb 11, 2013 at 11:35 AM, Brian Tarbox tar...@cabotresearch.com wrote: How do I configure my cluster to run in EC2? In my cassandra.yaml I have IP addresses under seed_provider, listen_address and rpc_address. I tried setting up my cluster using just the EC2 private addresses but when one of my instances failed and I restarted it there was a new private address. Suddenly my cluster thought it had five nodes rather than four. Then I tried using Elastic IP addresses (permanent addresses) but it turns out you get charged for network traffic between elastic addresses even if they are within the cluster. So...how do you configure the cluster when the IP addresses can change out from under you? Thanks. Brian Tarbox
what addresses to use in EC2 cluster (whenever an instance restarts it gets a new private ip)?
How do I configure my cluster to run in EC2? In my cassandra.yaml I have IP addresses under seed_provider, listen_address and rpc_address. I tried setting up my cluster using just the EC2 private addresses but when one of my instances failed and I restarted it there was a new private address. Suddenly my cluster thought it had five nodes rather than four. Then I tried using Elastic IP addresses (permanent addresses) but it turns out you get charged for network traffic between elastic addresses even if they are within the cluster. So...how do you configure the cluster when the IP addresses can change out from under you? Thanks. Brian Tarbox
Re: Upcoming conferences
At what level will the NY talks be? I had been planning on attending Datastax's big summer conference and I might not be able to get approval for both, so I'd like to hear more about this one. On Wed, Jan 30, 2013 at 12:40 PM, Jonathan Ellis jbel...@gmail.com wrote: ApacheCon North America (Portland, Feb 26-28) has a Cassandra track on the 28th: http://na.apachecon.com/schedule/ NY C* Tech Day (NY, March 20) is a 2-track, one-day conference devoted to Cassandra: http://datastax.com/nycassandra2013/ See you there! -- Jonathan Ellis Project Chair, Apache Cassandra co-founder, http://www.datastax.com @spyced
trying to create encrypted ephemeral drive on Amazon
I've heard that on Amazon EC2 I should be using ephemeral drives...but I want/need to be using encrypted volumes. On my local machine I use cryptsetup to encrypt a device and then mount it and so on...but on Amazon I get the error: Cannot open device /dev/xvdb for read-only access. Reading further I wonder if this is even possible based on this statement in the Amazon doc set *An instance store is dedicated to a particular instance; however, the disk subsystem is shared among instances on a host computer* How are other folks achieving performance and encryption on EC2? Thanks.
Re: Is this how to read the output of nodetool cfhistograms?
Wei, Thank you for the explanation (Offset is always the x-axis, the other columns represent the y-axis (taken 5 independent times)). Part of this still doesn't make sense. If I look at just read latencies for example...am I to believe that 1916 times I had a latency of exactly 3229500 usecs? Is this just some weird 5-independent variable mushed together data bucketing???

Offset  SSTables  Write Lat  Read Lat
1109           0        349    642406
1331           0        147   1335840
1597           0        121    640374
1916           0        117   3229500
2299           0         91    683749
2759           0         77    202722

On Tue, Jan 22, 2013 at 12:11 PM, Wei Zhu wz1...@yahoo.com wrote: I agree that Cassandra cfhistograms is probably the most bizarre metrics I have ever come across although it's extremely useful. I believe the offset is actually the metrics it has tracked (x-axis on the traditional histogram) and the number under each column is how many times that value has been recorded (y-axis on the traditional histogram). Your write latency are 17, 20, 24 (microseconds?). 3 writes took 17, 7 writes took 20 and 19 writes took 24. Correct me if I am wrong. Thanks. -Wei -- From: Brian Tarbox tar...@cabotresearch.com To: user@cassandra.apache.org Sent: Tuesday, January 22, 2013 7:27 AM Subject: Re: Is this how to read the output of nodetool cfhistograms? Indeed, but how many Cassandra users have the good fortune to stumble across that page? Just saying that the explanation of the very powerful nodetool commands should be more front and center. Brian On Tue, Jan 22, 2013 at 10:03 AM, Edward Capriolo edlinuxg...@gmail.com wrote: This was described in good detail here: http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/ On Tue, Jan 22, 2013 at 9:41 AM, Brian Tarbox tar...@cabotresearch.com wrote: Thank you! Since this is a very non-standard way to display data it might be worth a better explanation in the various online documentation sets. Thank you again. Brian On Tue, Jan 22, 2013 at 9:19 AM, Mina Naguib mina.nag...@adgear.com wrote: On 2013-01-22, at 8:59 AM, Brian Tarbox tar...@cabotresearch.com wrote: The output of this command seems to make no sense unless I think of it as 5 completely separate histograms that just happen to be displayed together. Using this example output should I read it as: my reads all took either 1 or 2 sstable. And separately, I had write latencies of 3,7,19. And separately I had read latencies of 2, 8,69, etc? In other words...each row isn't really a row...i.e. on those 16033 reads from a single SSTable I didn't have 0 write latency, 0 read latency, 0 row size and 0 column count. Is that right? Correct. A number in any of the metric columns is a count value bucketed in the offset on that row. There are no relationships between other columns on the same row. So your first row says 16033 reads were satisfied by 1 sstable. The other metrics (for example, latency of these reads) is reflected in the histogram under Read Latency, under various other bucketed offsets.

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1          16033              0             0         0             0
2            303              0             0         0             1
3              0              0             0         0             0
4              0              0             0         0             0
5              0              0             0         0             0
6              0              0             0         0             0
7              0              0             0         0             0
8              0              0             2         0             0
10             0              0             0         0          6261
12             0              0             2         0           117
14             0              0             8         0             0
17             0              3            69         0           255
20             0              7           163         0             0
24             0             19          1369         0             0
Is this how to read the output of nodetool cfhistograms?
The output of this command seems to make no sense unless I think of it as 5 completely separate histograms that just happen to be displayed together. Using this example output should I read it as: my reads all took either 1 or 2 sstable. And separately, I had write latencies of 3,7,19. And separately I had read latencies of 2, 8,69, etc? In other words...each row isn't really a row...i.e. on those 16033 reads from a single SSTable I didn't have 0 write latency, 0 read latency, 0 row size and 0 column count. Is that right?

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1          16033              0             0         0             0
2            303              0             0         0             1
3              0              0             0         0             0
4              0              0             0         0             0
5              0              0             0         0             0
6              0              0             0         0             0
7              0              0             0         0             0
8              0              0             2         0             0
10             0              0             0         0          6261
12             0              0             2         0           117
14             0              0             8         0             0
17             0              3            69         0           255
20             0              7           163         0             0
24             0             19          1369         0             0
Re: Is this how to read the output of nodetool cfhistograms?
Thank you! Since this is a very non-standard way to display data it might be worth a better explanation in the various online documentation sets. Thank you again. Brian On Tue, Jan 22, 2013 at 9:19 AM, Mina Naguib mina.nag...@adgear.com wrote: On 2013-01-22, at 8:59 AM, Brian Tarbox tar...@cabotresearch.com wrote: The output of this command seems to make no sense unless I think of it as 5 completely separate histograms that just happen to be displayed together. Using this example output should I read it as: my reads all took either 1 or 2 sstable. And separately, I had write latencies of 3,7,19. And separately I had read latencies of 2, 8,69, etc? In other words...each row isn't really a row...i.e. on those 16033 reads from a single SSTable I didn't have 0 write latency, 0 read latency, 0 row size and 0 column count. Is that right? Correct. A number in any of the metric columns is a count value bucketed in the offset on that row. There are no relationships between other columns on the same row. So your first row says 16033 reads were satisfied by 1 sstable. The other metrics (for example, latency of these reads) is reflected in the histogram under Read Latency, under various other bucketed offsets.

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1          16033              0             0         0             0
2            303              0             0         0             1
3              0              0             0         0             0
4              0              0             0         0             0
5              0              0             0         0             0
6              0              0             0         0             0
7              0              0             0         0             0
8              0              0             2         0             0
10             0              0             0         0          6261
12             0              0             2         0           117
14             0              0             8         0             0
17             0              3            69         0           255
20             0              7           163         0             0
24             0             19          1369         0             0
Re: Is this how to read the output of nodetool cfhistograms?
Indeed, but how many Cassandra users have the good fortune to stumble across that page? Just saying that the explanation of the very powerful nodetool commands should be more front and center. Brian On Tue, Jan 22, 2013 at 10:03 AM, Edward Capriolo edlinuxg...@gmail.com wrote: This was described in good detail here: http://thelastpickle.com/2011/04/28/Forces-of-Write-and-Read/ On Tue, Jan 22, 2013 at 9:41 AM, Brian Tarbox tar...@cabotresearch.com wrote: Thank you! Since this is a very non-standard way to display data it might be worth a better explanation in the various online documentation sets. Thank you again. Brian On Tue, Jan 22, 2013 at 9:19 AM, Mina Naguib mina.nag...@adgear.com wrote: On 2013-01-22, at 8:59 AM, Brian Tarbox tar...@cabotresearch.com wrote: The output of this command seems to make no sense unless I think of it as 5 completely separate histograms that just happen to be displayed together. Using this example output should I read it as: my reads all took either 1 or 2 sstable. And separately, I had write latencies of 3,7,19. And separately I had read latencies of 2, 8,69, etc? In other words...each row isn't really a row...i.e. on those 16033 reads from a single SSTable I didn't have 0 write latency, 0 read latency, 0 row size and 0 column count. Is that right? Correct. A number in any of the metric columns is a count value bucketed in the offset on that row. There are no relationships between other columns on the same row. So your first row says 16033 reads were satisfied by 1 sstable. The other metrics (for example, latency of these reads) is reflected in the histogram under Read Latency, under various other bucketed offsets.

Offset  SSTables  Write Latency  Read Latency  Row Size  Column Count
1          16033              0             0         0             0
2            303              0             0         0             1
3              0              0             0         0             0
4              0              0             0         0             0
5              0              0             0         0             0
6              0              0             0         0             0
7              0              0             0         0             0
8              0              0             2         0             0
10             0              0             0         0          6261
12             0              0             2         0           117
14             0              0             8         0             0
17             0              3            69         0           255
20             0              7           163         0             0
24             0             19          1369         0             0
Re: How can OpsCenter show me Read Request Latency when there are no read requests??
Hmm, that makes sense, but then why is the latency for the reads that get the metric often so high (several thousand uSecs) and why does it so closely track the latency of my normal reads? On Wed, Jan 16, 2013 at 12:14 PM, Tyler Hobbs ty...@datastax.com wrote: When you view OpsCenter metrics, you're generating a small number of reads to fetch the metric data, which is why your read count is near zero instead of actually being zero. Since reads are still occurring, Cassandra will continue to show a read latency. Basically, you're just viewing the latency on the reads to fetch metric data. Normally the number of reads required to view metrics are small enough that they only make a minor difference in your overall read latency average, but when you have no other reads occurring, they're the only reads that are included in the average. On Tue, Jan 15, 2013 at 9:28 PM, Brian Tarbox tar...@cabotresearch.com wrote: I am making heavy use of DataStax OpsCenter to help tune my system and it's great. And yet puzzling. I see my clients do a burst of Reads causing the OpsCenter Read Requests chart to go up and stay up until the clients finish doing their reads. The read request latency chart also goes up, but it stays up even after all the reads are done. At last glance I've had next to zero reads for 10 minutes but still have a read request latency that's basically unchanged from when there were actual reads. How am I to interpret this? Thanks. Brian Tarbox -- Tyler Hobbs DataStax http://datastax.com/
trying to use row_cache (b/c we have hot rows) but nodetool info says zero requests
We have quite wide rows and do a lot of concentrated processing on each row...so I thought I'd try the row cache on one node in my cluster to see if I could detect an effect of using it. The problem is that nodetool info says that even with a two gig row_cache we're getting zero requests. Since my client program is actively processing, and since keycache shows lots of activity, I'm puzzled. Shouldn't any read of a column cause the entire row to be loaded? My entire data file is only 32 gig right now so it's hard to imagine that 2 gig is too small to hold even a single row? Any suggestions how to proceed are appreciated. Thanks. Brian Tarbox
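For what it's worth, row_cache_size_in_mb in cassandra.yaml only sizes the global cache; each column family also has to opt in through its caching option (the default is keys-only), and if none do, the row cache will report zero requests. The sketch below shows the CQL3 form of that change using the DataStax Java driver (2.0-era API); the keyspace/table names are placeholders, on 1.1 the equivalent change is made through cassandra-cli, and the caching option's syntax changed again in later releases.

// Sketch: opting one table into the row cache (CQL3 / DataStax Java driver).
// Keyspace and table names are placeholders; the caching option became a map
// ({'keys': ..., 'rows_per_partition': ...}) in later Cassandra versions.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class EnableRowCache {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            Session session = cluster.connect();
            // row_cache_size_in_mb in cassandra.yaml only sizes the global cache;
            // a table left at the default keys-only caching never touches it.
            session.execute("ALTER TABLE mykeyspace.mytable WITH caching = 'rows_only'");
        } finally {
            cluster.close();
        }
    }
}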
How can OpsCenter show me Read Request Latency when there are no read requests??
I am making heavy use of DataStax OpsCenter to help tune my system and it's great. And yet puzzling. I see my clients do a burst of Reads causing the OpsCenter Read Requests chart to go up and stay up until the clients finish doing their reads. The read request latency chart also goes up, but it stays up even after all the reads are done. At last glance I've had next to zero reads for 10 minutes but still have a read request latency that's basically unchanged from when there were actual reads. How am I to interpret this? Thanks. Brian Tarbox
is there a way to list who is connected to my cluster?
I'd like to be able to find out which processes are connected to my cluster...is there a way to do this? The root problem is that someone is doing an operation or set of operations that is causing DataStax to show high read latency. In trying to find out which of our various programs is doing this I've turned off all of the programs I know of. But DataStax is still saying there are clients running against us...how can I find them? Thanks. Brian Tarbox
puzzled why my cluster is slowing down
I have a 4 node cluster with lots of JVM memory and lots of system memory that slows down when I'm doing lots of writes. Running DataStax charts I see my read and write latency rise from 50-100 u-secs to 1500-4500 u-secs. This is across a 12 hour data load during which time the applied load is high but fairly constant (500-700 writes/sec). I'm trying to understand the slowdown: there is no memory pressure, I've run every option under nodetool to look for bottlenecks (tpstats, compactionStats, etc) and see none. I'm running with keycache and have about 98% hits. What can I check next? Thanks! Brian Tarbox
help tuning compaction..hours of run to get 0% compaction....
I have a column family where I'm doing 500 inserts/sec for 12 hours or so at a time. At some point my performance falls off a cliff due to time spent doing compactions. I'm seeing row after row of logs saying that after 1 or 2 hours of compacting it reduced the data to 100% or 99% of the original. I'm trying to understand what direction this data points me to in terms of configuration changes. a) increase my compaction_throughput_mb_per_sec because I'm falling behind (am I falling behind?) b) enable multi-threaded compaction? Any help is appreciated. Brian
Re: help tuning compaction..hours of run to get 0% compaction....
I have not specified leveled compaction so I guess I'm defaulting to size tiered? My data (in the column family causing the trouble) is insert-once, read-many, update-never. Brian On Mon, Jan 7, 2013 at 3:13 PM, Michael Kjellman mkjell...@barracuda.com wrote: Size tiered or leveled compaction? From: Brian Tarbox tar...@cabotresearch.com Reply-To: user@cassandra.apache.org user@cassandra.apache.org Date: Monday, January 7, 2013 12:03 PM To: user@cassandra.apache.org user@cassandra.apache.org Subject: help tuning compaction..hours of run to get 0% compaction I have a column family where I'm doing 500 inserts/sec for 12 hours or so at a time. At some point my performance falls off a cliff due to time spent doing compactions. I'm seeing row after row of logs saying that after 1 or 2 hours of compacting it reduced the data to 100% or 99% of the original. I'm trying to understand what direction this data points me to in terms of configuration changes. a) increase my compaction_throughput_mb_per_sec because I'm falling behind (am I falling behind?) b) enable multi-threaded compaction? Any help is appreciated. Brian -- Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f
Re: help tuning compaction..hours of run to get 0% compaction....
The problem I see is that it already takes me more than 24 hours just to load my data...during which time the logs say I'm spending tons of time doing compaction. For example in the last 72 hours I've consumed *20 hours* per machine on compaction. Can I conclude from that that I should be (perhaps drastically) increasing my compaction_throughput_mb_per_sec on the theory that I'm getting behind? The fact that it takes me 3 days or more to run a test means it's hard to just play with values and see what works best, so I'm trying to understand the behavior in detail. Thanks. Brian On Mon, Jan 7, 2013 at 4:13 PM, Michael Kjellman mkjell...@barracuda.com wrote: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction If you perform at least twice as many reads as you do writes, leveled compaction may actually save you disk I/O, despite consuming more I/O for compaction. This is especially true if your reads are fairly random and don’t focus on a single, hot dataset. From: Brian Tarbox tar...@cabotresearch.com Reply-To: user@cassandra.apache.org user@cassandra.apache.org Date: Monday, January 7, 2013 12:56 PM To: user@cassandra.apache.org user@cassandra.apache.org Subject: Re: help tuning compaction..hours of run to get 0% compaction I have not specified leveled compaction so I guess I'm defaulting to size tiered? My data (in the column family causing the trouble) is insert-once, read-many, update-never. Brian On Mon, Jan 7, 2013 at 3:13 PM, Michael Kjellman mkjell...@barracuda.com wrote: Size tiered or leveled compaction? From: Brian Tarbox tar...@cabotresearch.com Reply-To: user@cassandra.apache.org user@cassandra.apache.org Date: Monday, January 7, 2013 12:03 PM To: user@cassandra.apache.org user@cassandra.apache.org Subject: help tuning compaction..hours of run to get 0% compaction I have a column family where I'm doing 500 inserts/sec for 12 hours or so at a time. At some point my performance falls off a cliff due to time spent doing compactions. I'm seeing row after row of logs saying that after 1 or 2 hours of compacting it reduced the data to 100% or 99% of the original. I'm trying to understand what direction this data points me to in terms of configuration changes. a) increase my compaction_throughput_mb_per_sec because I'm falling behind (am I falling behind?) b) enable multi-threaded compaction? Any help is appreciated. Brian -- Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f -- Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f
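If, per the article Michael linked, leveled compaction does turn out to fit the workload (it generally pays off for read-heavy tables at the cost of more compaction I/O), the switch itself is a one-statement schema change. Below is a sketch using CQL3 and the DataStax Java driver (2.0-era API); the keyspace/table names are placeholders, and on 1.1-era clusters the same change is made through cassandra-cli instead.

// Sketch: switching a table from size-tiered to leveled compaction (CQL3,
// Cassandra 1.2+). Keyspace and table names are placeholders.
import com.datastax.driver.core.Cluster;

public class SwitchToLeveled {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        try {
            // Leveled compaction trades more compaction I/O for fewer SSTables
            // touched per read, which the linked article recommends mainly for
            // read-heavy tables.
            cluster.connect().execute(
                "ALTER TABLE mykeyspace.mytable " +
                "WITH compaction = {'class': 'LeveledCompactionStrategy'}");
        } finally {
            cluster.close();
        }
    }
}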
Re: Cassandra API for Java
Anyone still use Pelops? On Sun, Dec 30, 2012 at 12:19 PM, Shahryar Sedghi shsed...@gmail.com wrote: I use JDBC with Cassandra 1.1 with CQL 3. I tried both Hector and Thrift and JDBC is much easier to code; I never tried Astyanax. Application servers have built-in connection pooling support for JDBC, but do not provide fail over to other machines, you need to do it at the application level. Another caveat: with both Hector and Thrift without CQL you can retrieve all or a portion of the row keys; CQL (at least on 1.1) does not give you distinct row keys. If you have a use case like this, either you need a hybrid API solution or stick with another API. Regards Shahryar On Fri, Dec 28, 2012 at 10:34 PM, Michael Kjellman mkjell...@barracuda.com wrote: This was asked as recently as one month + 1 day ago btw: http://grokbase.com/t/cassandra/user/12bve4d8e8/java-high-level-client has the longer discussion if you weren't subscribed to the group to see the messages. From: Baskar Sikkayan techba...@gmail.com Reply-To: user@cassandra.apache.org user@cassandra.apache.org Date: Friday, December 28, 2012 7:24 PM To: user@cassandra.apache.org user@cassandra.apache.org Subject: Cassandra API for Java Hi, I am new to Apache Cassandra. Could you please suggest a good Java API (Hector, Thrift, or ...) for Cassandra? Thanks, Baskar.S +91 97394 76008 -- Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f -- Life is what happens while you are making other plans. ~ John Lennon
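For completeness, the (then fairly new) DataStax Java driver is another option for CQL3 over the native protocol. Below is a minimal sketch of a connect-and-query round trip using its 2.0-era API; the contact point, keyspace, table, and column names are placeholders, and it assumes Cassandra 1.2+ with the native transport enabled.

// Sketch: a minimal CQL3 round trip with the DataStax Java driver, one more option
// alongside the Hector/Astyanax/JDBC choices discussed above. All names below are
// placeholders.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class DriverExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")  // any live node; the driver discovers the rest
                .build();
        try {
            Session session = cluster.connect("mykeyspace");
            ResultSet rs = session.execute("SELECT id, ts FROM mytable LIMIT 10");
            for (Row row : rs) {
                System.out.println(row.getLong("id") + " @ " + row.getDate("ts"));
            }
        } finally {
            cluster.close();
        }
    }
}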
Re: State of Cassandra and Java 7
What I saw in all cases was: a) set JAVA_HOME to Java 7, run the program, it fails; b) set JAVA_HOME to Java 6, run the program, it succeeds. I should have better notes but I'm at a 6 person startup so working tools get used and failing tools get deleted. Brian On Fri, Dec 21, 2012 at 3:54 PM, Bryan Talbot btal...@aeriagames.com wrote: Brian, did any of your issues with java 7 result in corrupting data in cassandra? We just ran into an issue after upgrading a test cluster from Cassandra 1.1.5 and Oracle JDK 1.6.0_29-b11 to Cassandra 1.1.7 and 7u10. What we saw is values in columns with validation Class=org.apache.cassandra.db.marshal.LongType that were proper integers becoming corrupted so that they were stored as strings. I don't have a reproducible test case yet but will work on making one over the holiday if I can. For example, a column with a long type that was originally written and stored properly (say with value 1200) was somehow changed during cassandra operations (compaction seems the only possibility) to be the value '1200' with quotes. The data was written using the phpcassa library and that application and library haven't been changed. This has only happened on our test cluster which was upgraded and hasn't happened on our live cluster which was not upgraded. Many of our column families were affected and all affected columns are Long (or bigint for cql3). Errors when reading using the CQL3 command client look like this: Failed to decode value '1356441225' (for column 'expires') as bigint: unpack requires a string argument of length 8 and when reading with cassandra-cli the error is [default@cf] get token['fbc1e9f7cc2c0c2fa186138ed28e5f691613409c0bcff648c651ab1f79f9600b']; = (column=client_id, value=8ec4c29de726ad4db3f89a44cb07909c04f90932d, timestamp=1355836425784329, ttl=648000) A long is exactly 8 bytes, not 10. -Bryan On Mon, Dec 17, 2012 at 7:33 AM, Brian Tarbox tar...@cabotresearch.com wrote: I was using jre-7u9-linux-x64 which was the latest at the time. I'll confess that I did not file any bugs...at the time the advice from both the Cassandra and Zookeeper lists was to stay away from Java 7 (and my boss had had enough of my reporting that *the problem was Java 7* for me to spend a lot more time getting the details). Brian On Sun, Dec 16, 2012 at 4:54 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Sat, Dec 15, 2012 at 7:12 PM, Michael Kjellman mkjell...@barracuda.com wrote: What issues have you run into? Actually curious because we push 1.1.5-7 really hard and have no issues whatsoever. A related question is which version of java 7 did you try? The first releases of java 7 were apparently famous for having many issues but it seems the more recent updates are much more stable. -- Sylvain On Dec 15, 2012, at 7:51 AM, Brian Tarbox tar...@cabotresearch.com wrote: We've reverted all machines back to Java 6 after running into numerous Java 7 issues...some running Cassandra, some running Zookeeper, others just general problems. I don't recall any other major language release being such a mess. On Fri, Dec 14, 2012 at 5:07 PM, Bill de hÓra b...@dehora.net wrote: At least that would be one way of defining officially supported. Not quite, because, Datastax is not Apache Cassandra. the only issue related to Java 7 that I know of is CASSANDRA-4958, but that's osx specific (I wouldn't advise using osx in production anyway) and it's not directly related to Cassandra anyway so you can easily use the beta version of snappy-java as a workaround if you want to.
So that non blocking issue aside, and as far as we know, Cassandra supports Java 7. Is it rock-solid in production? Well, only repeated use in production can tell, and that's not really in the hand of the project. Exactly right. If enough people use Cassandra on Java7 and enough people file bugs about Java 7 and enough people work on bugs for Java 7 then Cassandra will eventually work well enough on Java7. Bill On 14 Dec 2012, at 19:43, Drew Kutcharian d...@venarc.com wrote: In addition, the DataStax official documentation states: Versions earlier than 1.6.0_19 should not be used. Java 7 is not recommended. http://www.datastax.com/docs/1.1/install/install_rpm On Dec 14, 2012, at 9:42 AM, Aaron Turner synfina...@gmail.com wrote: Does Datastax (or any other company) support Cassandra under Java 7? Or will they tell you to downgrade when you have some problem, because they don't support C* running on 7? At least that would be one way of defining officially supported. On Fri, Dec 14, 2012 at 2:22 AM, Sylvain Lebresne sylv...@datastax.com wrote: What kind of official statement do you want? As far as I can be considered an official voice of the project, my statement is: various people run in production with Java 7 and it seems to work. Or to answer the initial question, the only issue related to Java 7 that I
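Bryan's report above comes down to a LongType (bigint) value being exactly 8 big-endian bytes on disk, while the corrupted form is the ASCII string of the number (10 bytes for '1356441225'), which is exactly what the "length 8" unpack error is complaining about. A quick sketch of the difference, with values hard-coded for illustration:

// Sketch: a LongType/bigint value serializes to exactly 8 big-endian bytes;
// the decimal string form of the same number is 10 bytes, which no longer
// decodes as a long on the client side.
import java.nio.ByteBuffer;

public class LongBytes {
    public static void main(String[] args) throws Exception {
        long expires = 1356441225L;

        byte[] asLong = ByteBuffer.allocate(8).putLong(expires).array();
        byte[] asString = String.valueOf(expires).getBytes("US-ASCII");

        System.out.println("LongType encoding: " + asLong.length + " bytes");   // prints 8
        System.out.println("String encoding:   " + asString.length + " bytes"); // prints 10
    }
}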
Re: State of Cassandra and Java 7
I was using jre-7u9-linux-x64 which was the latest at the time. I'll confess that I did not file any bugs...at the time the advice from both the Cassandra and Zookeeper lists was to stay away from Java 7 (and my boss had had enough of my reporting that *the problem was Java 7* for me to spend a lot more time getting the details). Brian On Sun, Dec 16, 2012 at 4:54 AM, Sylvain Lebresne sylv...@datastax.comwrote: On Sat, Dec 15, 2012 at 7:12 PM, Michael Kjellman mkjell...@barracuda.com wrote: What issues have you ran into? Actually curious because we push 1.1.5-7 really hard and have no issues whatsoever. A related question is which which version of java 7 did you try? The first releases of java 7 were apparently famous for having many issues but it seems the more recent updates are much more stable. -- Sylvain On Dec 15, 2012, at 7:51 AM, Brian Tarbox tar...@cabotresearch.com wrote: We've reverted all machines back to Java 6 after running into numerous Java 7 issues...some running Cassandra, some running Zookeeper, others just general problems. I don't recall any other major language release being such a mess. On Fri, Dec 14, 2012 at 5:07 PM, Bill de hÓra b...@dehora.net wrote: At least that would be one way of defining officially supported. Not quite, because, Datastax is not Apache Cassandra. the only issue related to Java 7 that I know of is CASSANDRA-4958, but that's osx specific (I wouldn't advise using osx in production anyway) and it's not directly related to Cassandra anyway so you can easily use the beta version of snappy-java as a workaround if you want to. So that non blocking issue aside, and as far as we know, Cassandra supports Java 7. Is it rock-solid in production? Well, only repeated use in production can tell, and that's not really in the hand of the project. Exactly right. If enough people use Cassandra on Java7 and enough people file bugs about Java 7 and enough people work on bugs for Java 7 then Cassandra will eventually work well enough on Java7. Bill On 14 Dec 2012, at 19:43, Drew Kutcharian d...@venarc.com wrote: In addition, the DataStax official documentation states: Versions earlier than 1.6.0_19 should not be used. Java 7 is not recommended. http://www.datastax.com/docs/1.1/install/install_rpm On Dec 14, 2012, at 9:42 AM, Aaron Turner synfina...@gmail.com wrote: Does Datastax (or any other company) support Cassandra under Java 7? Or will they tell you to downgrade when you have some problem, because they don't support C* running on 7? At least that would be one way of defining officially supported. On Fri, Dec 14, 2012 at 2:22 AM, Sylvain Lebresne sylv...@datastax.com wrote: What kind of official statement do you want? As far as I can be considered an official voice of the project, my statement is: various people run in production with Java 7 and it seems to work. Or to answer the initial question, the only issue related to Java 7 that I know of is CASSANDRA-4958, but that's osx specific (I wouldn't advise using osx in production anyway) and it's not directly related to Cassandra anyway so you can easily use the beta version of snappy-java as a workaround if you want to. So that non blocking issue aside, and as far as we know, Cassandra supports Java 7. Is it rock-solid in production? Well, only repeated use in production can tell, and that's not really in the hand of the project. We do obviously encourage people to try Java 7 as much as possible and report any problem they may run into, but I would have though this goes without saying. 
On Fri, Dec 14, 2012 at 4:05 AM, Rob Coli rc...@palominodb.com wrote: On Thu, Dec 13, 2012 at 11:43 AM, Drew Kutcharian d...@venarc.com wrote: With Java 6 begin EOL-ed soon (https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's the status of Cassandra's Java 7 support? Anyone using it in production? Any outstanding *known* issues? I'd love to see an official statement from the project, due to the sort of EOL issues you're referring to. Unfortunately previous requests on this list for such a statement have gone unanswered. The non-official response is that various people run in production with Java 7 and it seems to work. :) =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix Windows Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin carpe diem quam minimum credula postero -- Join Barracuda Networks in the fight against hunger. To learn how you can help in your community, please visit: http://on.fb.me/UAdL4f
Re: State of Cassandra and Java 7
We've reverted all machines back to Java 6 after running into numerous Java 7 issues...some running Cassandra, some running Zookeeper, others just general problems. I don't recall any other major language release being such a mess. On Fri, Dec 14, 2012 at 5:07 PM, Bill de hÓra b...@dehora.net wrote: At least that would be one way of defining officially supported. Not quite, because, Datastax is not Apache Cassandra. the only issue related to Java 7 that I know of is CASSANDRA-4958, but that's osx specific (I wouldn't advise using osx in production anyway) and it's not directly related to Cassandra anyway so you can easily use the beta version of snappy-java as a workaround if you want to. So that non blocking issue aside, and as far as we know, Cassandra supports Java 7. Is it rock-solid in production? Well, only repeated use in production can tell, and that's not really in the hand of the project. Exactly right. If enough people use Cassandra on Java7 and enough people file bugs about Java 7 and enough people work on bugs for Java 7 then Cassandra will eventually work well enough on Java7. Bill On 14 Dec 2012, at 19:43, Drew Kutcharian d...@venarc.com wrote: In addition, the DataStax official documentation states: Versions earlier than 1.6.0_19 should not be used. Java 7 is not recommended. http://www.datastax.com/docs/1.1/install/install_rpm On Dec 14, 2012, at 9:42 AM, Aaron Turner synfina...@gmail.com wrote: Does Datastax (or any other company) support Cassandra under Java 7? Or will they tell you to downgrade when you have some problem, because they don't support C* running on 7? At least that would be one way of defining officially supported. On Fri, Dec 14, 2012 at 2:22 AM, Sylvain Lebresne sylv...@datastax.com wrote: What kind of official statement do you want? As far as I can be considered an official voice of the project, my statement is: various people run in production with Java 7 and it seems to work. Or to answer the initial question, the only issue related to Java 7 that I know of is CASSANDRA-4958, but that's osx specific (I wouldn't advise using osx in production anyway) and it's not directly related to Cassandra anyway so you can easily use the beta version of snappy-java as a workaround if you want to. So that non blocking issue aside, and as far as we know, Cassandra supports Java 7. Is it rock-solid in production? Well, only repeated use in production can tell, and that's not really in the hand of the project. We do obviously encourage people to try Java 7 as much as possible and report any problem they may run into, but I would have though this goes without saying. On Fri, Dec 14, 2012 at 4:05 AM, Rob Coli rc...@palominodb.com wrote: On Thu, Dec 13, 2012 at 11:43 AM, Drew Kutcharian d...@venarc.com wrote: With Java 6 begin EOL-ed soon (https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's the status of Cassandra's Java 7 support? Anyone using it in production? Any outstanding *known* issues? I'd love to see an official statement from the project, due to the sort of EOL issues you're referring to. Unfortunately previous requests on this list for such a statement have gone unanswered. The non-official response is that various people run in production with Java 7 and it seems to work. 
:) =Rob -- =Robert Coli AIMGTALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb -- Aaron Turner http://synfin.net/ Twitter: @synfinatic http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix Windows Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin carpe diem quam minimum credula postero
Re: Memory Manager
Can you supply your java parameters? On Mon, Nov 12, 2012 at 7:29 AM, Everton Lima peitin.inu...@gmail.com wrote: Hi people, I was using cassandra on a distributed project. I am using java 6 and cassandra 1.1.6. My problem is in the memory manager (I think). My system was throwing a heap limit exception. The problem is that after some inserts (2Gb) the Old Gen area of the heap fills up and cannot be cleaned. This problem only occurs when I use more than one machine; with only one machine, after the inserts the GC cleans the Old Gen. Can someone help me? Thanks! -- Everton Lima Aleixo Bacharel em Ciencia da Computação Universidade Federal de Goiás
problem encrypting keys and data
We have a requirement to store our data encrypted. Our encryption system turns our various strings into byte arrays. So far so good. The problem is that the bytes in our byte arrays are sometimes negative...but when we look at them in the cassandra-cli (or try to programmatically retrieve them) the bytes are all positive, so we of course don't find the expected data. We have tried Byte encoding and UTF8 encoding without luck. In looking at the Byte validator in particular I see nothing that ought to care about the sign of the bytes, but perhaps I'm missing something. Any suggestions would be appreciated, thanks. Brian Tarbox
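One thing that often trips people up here: Java bytes are signed, so encrypted output will routinely contain "negative" bytes, but the underlying bit pattern is the same either way, and cassandra-cli displays BytesType values as unsigned hex. A small sketch (with a made-up ciphertext array) showing the usual way to compare what was written against what the cli shows:

// Sketch: a signed Java byte and its unsigned hex rendering are the same bits.
// Comparing hex strings is usually the safest way to check that what the client
// wrote is what cassandra-cli shows for a BytesType column.
public class SignedBytes {
    public static void main(String[] args) {
        byte[] ciphertext = {(byte) 0xF3, 0x07, (byte) 0x9A, 0x41}; // stand-in for encrypted data

        StringBuilder hex = new StringBuilder();
        for (byte b : ciphertext) {
            // b & 0xFF maps the signed byte (e.g. -13) onto its unsigned value (0xF3)
            hex.append(String.format("%02x", b & 0xFF));
        }
        System.out.println(hex); // f3079a41 -- the form cassandra-cli would typically display
    }
}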
my ubuntu system has a fuseblk filesystem, would it go faster if I changed it to EXT4?
I got some new ubuntu servers to add to my cluster and found that the file system is fuseblk, which really means NTFS. All else being equal would I expect to get any performance boost if I converted the file system to EXT4? Edward Capriolo's Cassandra cookbook seems to suggest so. Thanks. Brian Tarbox
why does my Effective-Ownership and Load from ring give such different answers?
I had a two node cluster that I expanded to four nodes. I ran the token generation script and moved all the nodes so that when I run nodetool ring each node reports 25% Effective-Ownership. However, my load numbers map out to 39%, 30%, 15%, 17%. How can that be? Thanks.
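For reference, with RandomPartitioner the token generation script just spaces tokens evenly across the 0..2^127 range, so equal effective ownership only means equal key ranges; load can still diverge if a few rows are very wide or if nodes still hold data from ranges they no longer own and haven't been cleaned up. A sketch of the arithmetic for a 4-node ring (node count is the only input):

// Sketch: evenly spaced RandomPartitioner tokens for an N-node ring,
// i.e. what "25% effective ownership" corresponds to in token terms.
import java.math.BigInteger;

public class EvenTokens {
    public static void main(String[] args) {
        int nodeCount = 4;
        BigInteger range = BigInteger.valueOf(2).pow(127); // RandomPartitioner token space
        for (int i = 0; i < nodeCount; i++) {
            BigInteger token = range.multiply(BigInteger.valueOf(i))
                                    .divide(BigInteger.valueOf(nodeCount));
            System.out.println("node " + i + ": " + token);
        }
    }
}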
read performance plummeted
I have a two node cluster hosting a 45 gig dataset. I periodically have to read a high fraction (20% or so) of my 'rows', grabbing a few thousand at a time and then processing them. This used to result in about 300-500 reads a second which seemed quite good. Recently that number has plummeted to 20-50 reads a second. The obvious question is what did I change? I certainly added more data, bringing my total load from 38 or so gig to 45 or so gig, but it's hard to imagine that causing this problem. The shape of my data has not changed and I haven't changed any cassandra configuration. Running nodetool tpstats I'm for the first time ever seeing entries under ReadStage Active and Pending, which correlates with slow reads. Running iostat I'm seeing significant iowait (10-50%) where I previously never saw higher than 1-2%. I ran a full compaction on the relevant CF (which took 3.5 hours) to no avail. Any suggestions on where I can look next? Thanks.
can I have a mix of 32 and 64 bit machines in a cluster?
I can't imagine why this would be a problem but I wonder if anyone has experience with running a mix of 32 and 64 bit nodes in a cluster. (I'm not going to do this in production, just trying to make use of the gear I have for my local system). Thanks.