Cassandra counter column family performance

2014-05-13 Thread Batranut Bogdan
Hello all,

I have a counter CF defined as: pk text PRIMARY KEY, a counter, b counter, c 
counter, d counter.
After inserting a few million keys (around 55 million), performance goes down the 
drain: 2-3 nodes in the cluster sit at medium load, and when inserting batches 
of the same length, writes take longer and longer until the whole cluster becomes 
loaded, I get a lot of TExceptions, and the cluster becomes unresponsive.
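
For reference, the table and write pattern look roughly like this (a sketch using 
the Java driver; the keyspace name, table name and batch size are assumptions, 
the columns are the ones listed above):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class CounterSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("myks"); // assumed keyspace

        // Counter table: every non-key column must be a counter.
        session.execute("CREATE TABLE counts (" +
                        "pk text PRIMARY KEY, " +
                        "a counter, b counter, c counter, d counter)");

        // Counters are written with UPDATE increments, not regular INSERTs.
        for (int i = 0; i < 1000; i++) {          // assumed batch size
            session.execute(
                "UPDATE counts SET a = a + 1, b = b + 1, c = c + 1, d = d + 1 " +
                "WHERE pk = 'key-" + i + "'");
        }
        cluster.close();
    }
}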

Did anyone have the same problem?
Feel free to comment and share experiences about counter CF performance.


Re: Schema disagreement errors

2014-05-13 Thread Duncan Sands

Hi Gaurav, a schema versioning bug was fixed in 2.0.7.

Best wishes, Duncan.

On 12/05/14 21:31, Gaurav Sehgal wrote:

We have recently started seeing a lot of Schema Disagreement errors. We are
using Cassandra 2.0.6 with Oracle Java 1.7. I went through the Cassandra FAQ and
followed the below steps:


  * nodetool disablethrift
  * nodetool disablegossip
  * nodetool drain
  * 'kill <pid>'


As per the documentation, the commit logs should have been flushed, but that did
not happen in our case. The commit logs were still there, so I removed them
manually to make sure there were no commit logs when Cassandra started up (which
was fine in our case, as this data can always be replayed). I also deleted the
schema* directory from the /data/system folder.

However, when we started Cassandra back up, the issue started happening again.


Any help would be appreciated

Cheers!
Gaurav






Re: Cassandra MapReduce/Storm/ etc

2014-05-13 Thread Shamim
Hi,
  check out the following links:
1) http://frommyworkshop.blogspot.ru/search/label/Cassandra
2) 
http://frommyworkshop.blogspot.ru/2012/07/single-node-hadoop-cassandra-pig-setup.html
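
Also, for a rough idea of the moving parts: the Hadoop input formats live in the 
org.apache.cassandra.hadoop package of the Cassandra source tree. A minimal, 
untested sketch of the job configuration (keyspace, table and addresses below are 
placeholders, and the helper names are from the 2.0-era cql3 integration -- check 
them against the word count examples shipped with your Cassandra version):

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CassandraJobSetup {
    public static Job buildJob() throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "cassandra-mapreduce-sketch");

        // Placeholder keyspace/table and contact point -- adjust for your cluster.
        ConfigHelper.setInputColumnFamily(job.getConfiguration(), "my_keyspace", "my_table");
        ConfigHelper.setInputInitialAddress(job.getConfiguration(), "10.0.0.1");
        ConfigHelper.setInputRpcPort(job.getConfiguration(), "9160");
        ConfigHelper.setInputPartitioner(job.getConfiguration(), "Murmur3Partitioner");

        // Page through wide partitions instead of materialising them whole.
        CqlConfigHelper.setInputCQLPageRowSize(job.getConfiguration(), "1000");

        // Each mapper is fed the token ranges local to one node, so the
        // "full table scan" is split across the cluster rather than done
        // through a single client.
        job.setInputFormatClass(CqlPagingInputFormat.class);
        return job;
    }
}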
-- 
Best regards
  Shamim A.


11.05.2014, 22:17, Manoj Khangaonkar khangaon...@gmail.com:
 Hi,

 Searching for Cassandra with MapReduce, I am finding that the search results 
 are really dated -- from version 0.7, circa 2010/2011.

 Is there a good blog/article that describes how to use MapReduce on a Cassandra 
 table?

 From my naive understanding, Cassandra is all about partitioning. Querying is 
 based on partition key + clustering column(s).

 Input to MapReduce is a sequence of key/value pairs. For Storm it is a stream of 
 tuples.

 If a database table is the input source for MapReduce or Storm then, in the 
 simple case, this translates to a full table scan of the input table, which can 
 time out and is generally not a recommended access pattern in Cassandra.

 My initial reaction is that if I need to process data with MapReduce or 
 Storm, reading it from Cassandra might not be the optimal way. Storing the 
 output to Cassandra however does make sense.

 If anyone has links to blogs or personal experience in this area, I would 
 appreciate it if you could share them.

 regards


Re: Cassandra 2.0.7 keeps reporting errors due to no space left on device

2014-05-13 Thread Yatong Zhang
Well, I finally resolved this issue by modifying Cassandra to ignore
sstables bigger than a threshold.

Leveled compaction will fall back to size-tiered compaction in some
situations, and that's why I always got some old, huge sstables compacted.
More details can be found in 'LeveledManifest.java', in the
'getCompactionCandidates' function. I modified the 'mostInterestingBucket'
method of 'SizeTieredCompactionStrategy.java' and added a filter before the
method returns:

Iterator<SSTableReader> iter = hottest.left.iterator();
while (iter.hasNext())
{
    SSTableReader mysstable = iter.next();
    // Skip any sstable larger than 1 TiB so it never becomes a compaction candidate.
    if (mysstable.onDiskLength() > 1099511627776L)
    {
        logger.info("Removed candidate {}", mysstable.toString());
        iter.remove();
    }
}


 I don't have much time to do more research to figure out whether this has
side effects or not, but it is a solution for me. I hope this will be
useful to those who have had similar issues.


On Sun, May 4, 2014 at 5:10 PM, Yatong Zhang bluefl...@gmail.com wrote:

 I am using the latest 2.0.7. The 'nodetool tpstats' shows as:

 [root@storage5 bin]# ./nodetool tpstats
 Pool Name                    Active   Pending      Completed   Blocked  All time blocked
 ReadStage                         0         0         628220         0                 0
 RequestResponseStage              0         0        3342234         0                 0
 MutationStage                     0         0        3172116         0                 0
 ReadRepairStage                   0         0          47666         0                 0
 ReplicateOnWriteStage             0         0              0         0                 0
 GossipStage                       0         0         756024         0                 0
 AntiEntropyStage                  0         0              0         0                 0
 MigrationStage                    0         0              0         0                 0
 MemoryMeter                       0         0           6652         0                 0
 MemtablePostFlusher               0         0           7042         0                 0
 FlushWriter                       0         0           4023         0                 0
 MiscStage                         0         0              0         0                 0
 PendingRangeCalculator            0         0             27         0                 0
 commitlog_archiver                0         0              0         0                 0
 InternalResponseStage             0         0              0         0                 0
 HintedHandoff                     0         0             28         0                 0

 Message type   Dropped
 RANGE_SLICE  0
 READ_REPAIR  0
 PAGED_RANGE  0
 BINARY   0
 READ 0
 MUTATION 0
 _TRACE   0
 REQUEST_RESPONSE 0
 COUNTER_MUTATION 0


  And here is another type of error, and these errors seem to occur after
 'disk is full'

 ERROR [SSTableBatchOpen:2] 2014-04-30 13:47:48,348 CassandraDaemon.java
 (line 198) Exception in thread Thread[SSTableBatchOpen:2,5,main]
 org.apache.cassandra.io.sstable.CorruptSSTableException:
 java.io.EOFException
 at
 org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:110)
 at
 org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:64)
 at
 org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:42)
 at
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:458)
 at
 org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:422)
 at
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:203)
 at
 org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:184)
 at
 org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:264)

 at
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)
 Caused by: java.io.EOFException
 at
 java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
 at java.io.DataInputStream.readUTF(DataInputStream.java:589)
 at java.io.DataInputStream.readUTF(DataInputStream.java:564)
 at
 org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:85)
 ... 12 more




 On Sun, May 4, 2014 at 4:59 PM, DuyHai Doan doanduy...@gmail.com wrote:

 The symptoms looks like there are pending compactions stacking up or
 failed compactions so temporary files (-tmp-Data.db) are not 

Schema errors when bootstrapping / restarting node

2014-05-13 Thread Adam Cramer
Hi All,

I'm having some major issues bootstrapping a new node to my cluster.  We
are running 1.2.16, with vnodes enabled.

When a new node starts up (with auto_bootstrap), it selects a host ID and
finds the ring successfully:

INFO 18:42:29,559 JOINING: waiting for ring information

It successfully selects a set of tokens.  Then the weird stuff begins.  I
get this error once, while the node is reading the system keyspace:

ERROR 18:42:32,921 Exception in thread
Thread[InternalResponseStage:1,5,main]
java.lang.NullPointerException
at org.apache.cassandra.utils.ByteBufferUtil.toLong(ByteBufferUtil.java:421)
at org.apache.cassandra.cql.jdbc.JdbcLong.compose(JdbcLong.java:94)
at org.apache.cassandra.db.marshal.LongType.compose(LongType.java:34)
at org.apache.cassandra.cql3.UntypedResultSet$Row.getLong(UntypedResultSet.java:138)
at org.apache.cassandra.db.SystemTable.migrateKeyAlias(SystemTable.java:199)
at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:346)
at org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:66)
at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:47)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)


But it doesn't stop the bootstrap process.  The node successfully
handshakes versions, and pauses before bootstrapping:


 INFO 18:42:59,564 JOINING: schema complete, ready to bootstrap
 INFO 18:42:59,565 JOINING: waiting for pending range calculation
 INFO 18:42:59,565 JOINING: calculation complete, ready to bootstrap
 INFO 18:42:59,565 JOINING: getting bootstrap token
 INFO 18:42:59,705 JOINING: sleeping 30000 ms for pending range setup


After 30 seconds, I get a flood of endless
org.apache.cassandra.db.UnknownColumnFamilyException
errors, and all other nodes in the cluster log the following endlessly:

INFO [HANDSHAKE-/x.x.x.x] 2014-05-09 18:44:36,289
OutboundTcpConnection.java (line 418) Handshaking version with /x.x.x.x


I suspect there may be something wrong with my schemas.  Sometimes while
restarting an existing node, the node will fail to restart, with the
following error, again while reading the system keyspace:

ERROR [InternalResponseStage:5] 2014-05-05 23:56:03,786
CassandraDaemon.java (line 191) Exception in thread
Thread[InternalResponseStage:5,5,main]
org.apache.cassandra.db.marshal.MarshalException: cannot parse 'column1' as
hex bytes
at org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:69)
at org.apache.cassandra.config.ColumnDefinition.fromSchema(ColumnDefinition.java:231)
at org.apache.cassandra.config.CFMetaData.addColumnDefinitionSchema(CFMetaData.java:1524)
at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1456)
at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:306)
at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:444)
at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:356)
at org.apache.cassandra.service.MigrationTask$1.response(MigrationTask.java:66)
at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:47)
at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NumberFormatException: An hex string representing
bytes must have an even length
at org.apache.cassandra.utils.Hex.hexToBytes(Hex.java:52)
at org.apache.cassandra.db.marshal.BytesType.fromString(BytesType.java:65)
... 12 more

I am able to fix this error by clearing out the schema_columns system table
on disk.  After that, a node can boot successfully.

Does anyone have a clue what's going on here?

Thanks!


Re: Can Cassandra client programs use hostnames instead of IPs?

2014-05-13 Thread Ben Bromhead
You can set listen_address in cassandra.yaml to a hostname 
(http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html).
 

Cassandra will use the IP address returned by a DNS query for that hostname. On 
AWS you don't have to assign an Elastic IP; all instances come with a public IP 
that lasts for the instance's lifetime (if you use ec2-classic or your VPC is set 
up to assign them).

Note that whatever hostname you set in a node's listen_address, it will need to 
resolve to the private IP, as AWS instances only have network access via their 
private address. Traffic to an instance's public IP is NATed and forwarded to the 
private address, so you may as well just use the node's IP address.

If you run Hadoop on instances in the same AWS region it will be able to access 
your Cassandra cluster via private IPs. If you run Hadoop externally, just use 
the public IPs. 

If you run in a VPC without public addressing and want to connect from external 
hosts you will want to look at a VPN 
(http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_VPN.html).

Ben Bromhead
Instaclustr | www.instaclustr.com | @instaclustr | +61 415 936 359




On 13/05/2014, at 4:31 AM, Huiliang Zhang zhl...@gmail.com wrote:

 Hi,
 
 Cassandra returns the IPs of the nodes in the Cassandra cluster for further 
 communication between the Hadoop program and the Cassandra cluster. Is there a way 
 to configure the Cassandra cluster to return hostnames instead of IPs? My 
 Cassandra cluster is on AWS and has no Elastic IPs which can be accessed 
 outside AWS.
 
 Thanks,
 Huiliang
 
 



Re: Disable reads during node rebuild

2014-05-13 Thread Aaron Morton
 I'm not able to replace a dead node using the ordinary procedure 
 (bootstrap+join), and would like to rebuild the replacement node from another 
 DC.
Normally, when you want to add a new DC to the cluster, the command to use is 
nodetool rebuild $DC_NAME (with auto_bootstrap: false). That will get the node 
to stream data from $DC_NAME.

 The problem is that if I start a node with auto_bootstrap=false to perform 
 the rebuild, it automatically starts serving empty reads (CL=LOCAL_ONE).

When adding a new DC the nodes won't be processing reads, but that is not the case 
for you.

You should disable the client APIs to prevent the clients from calling the new 
nodes: use -Dcassandra.start_rpc=false and 
-Dcassandra.start_native_transport=false in cassandra-env.sh, or the appropriate 
settings in cassandra.yaml.

Disabling reads from other nodes will be harder. IIRC during bootstrap a 
different timeout (based on ring_delay) is used to detect if the bootstrapping 
node is down. However, if the node is running and you use nodetool rebuild, I'm 
pretty sure the normal gossip failure detectors will kick in, which means you 
cannot disable gossip to prevent reads. Also, we would want the node to be up 
for writes. 

But what you can do is artificially set the severity of the node high so the 
dynamic snitch will route around it. See 
https://github.com/apache/cassandra/blob/cassandra-2.0/src/java/org/apache/cassandra/locator/DynamicEndpointSnitchMBean.java#L37
 

* Set the value to something high on the node you will be rebuilding; the 
number of cores on the system should do. (jmxterm is handy for this: 
http://wiki.cyclopsgroup.org/jmxterm -- or see the sketch after this list.) 
* Check nodetool gossipinfo on the other nodes to see that the SEVERITY app state 
has propagated. 
* Watch completed ReadStage tasks on the node you want to rebuild. If you have 
read repair enabled it will still get some traffic. 
* Do the rebuild. 
* Reset severity to 0.
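
If you'd rather script it than drive jmxterm by hand, something along these lines 
should work; note the MBean object name below is from memory, so verify it with 
jconsole/jmxterm before relying on it:

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SetSeverity {
    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        double severity = args.length > 1 ? Double.parseDouble(args[1]) : 0.0;

        // Assumed object name for the dynamic snitch MBean -- check it first.
        ObjectName snitch = new ObjectName("org.apache.cassandra.db:type=DynamicEndpointSnitch");
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi");

        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Severity is the attribute backed by get/setSeverity in
            // DynamicEndpointSnitchMBean (linked above).
            mbs.setAttribute(snitch, new Attribute("Severity", severity));
        } finally {
            connector.close();
        }
    }
}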

Hope that helps. 
Aaron

-
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 13/05/2014, at 5:18 am, Paulo Ricardo Motta Gomes 
paulo.mo...@chaordicsystems.com wrote:

 Hello,
 
 I'm not able to replace a dead node using the ordinary procedure 
 (bootstrap+join), and would like to rebuild the replacement node from another 
 DC. The problem is that if I start a node with auto_bootstrap=false to 
 perform the rebuild, it automatically starts serving empty reads 
 (CL=LOCAL_ONE).
 
 Is there a way to disable reads from a node while performing rebuild from 
 another datacenter? I tried starting the node in write survey mode, but the 
 nodetool rebuild command does not work in this mode.
 
 Thanks,
 
 -- 
 Paulo Motta
 
 Chaordic | Platform
 www.chaordic.com.br
 +55 48 3232.3200



Re: Storing log structured data in Cassandra without compactions for performance boost.

2014-05-13 Thread Chris Lohfink
What does your data model look like?

 I think it would be best to just disable compactions.

Why? Are you never doing reads? There is also a cost to repairs/bootstrapping 
when you have a ton of sstables. This might be a premature optimization.

If the data is read from a slice of a partition that has been added to over time, 
there will be a part of that row in almost every sstable. That would mean all 
of them (multiple disk seeks per sstable, depending on clustering order) would 
have to be read in order to service the query. The data model can help or 
hurt a lot though.

If you set a TTL on the columns you add, then C* will clean up sstables (if 
size-tiered and post-1.2) once the data has expired. Since you never delete, 
set gc_grace_seconds to 0 so the TTL expiration doesn't result in lingering tombstones.
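
As a sketch of what that can look like (the Java driver here, with an assumed 
keyspace, layout and 30-day retention -- not taken from your post):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class LogTableSketch {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("logs"); // assumed keyspace

        // One partition per source+day, log lines clustered by timestamp.
        // gc_grace_seconds = 0 because nothing is ever deleted explicitly.
        session.execute(
            "CREATE TABLE log_lines (" +
            "  source text, day text, ts timestamp, line text, " +
            "  PRIMARY KEY ((source, day), ts)) " +
            "WITH CLUSTERING ORDER BY (ts DESC) AND gc_grace_seconds = 0");

        // Every write carries a TTL (30 days here), so whole sstables age out.
        session.execute(
            "INSERT INTO log_lines (source, day, ts, line) " +
            "VALUES ('app1', '2014-05-13', '2014-05-13 12:00:00+0000', " +
            "'GET /index.html 200') USING TTL 2592000");

        cluster.close();
    }
}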

---
Chris Lohfink 



On May 6, 2014, at 7:55 PM, Kevin Burton bur...@spinn3r.com wrote:

 I'm looking at storing log data in Cassandra… 
 
 Every record is a unique timestamp for the key, and then the log line for the 
 value.
 
 I think it would be best to just disable compactions.
 
 - there will never be any deletes.
 
 - all the data will be accessed in time range (probably partitioned randomly) 
 and sequentially.
 
 So every time a memtable flushes, we will just keep that SSTable forever.  
 
 Compacting the data is kind of redundant in this situation.
 
 I was thinking the best strategy is to use setcompactionthreshold and set the 
 value VERY high so compactions are never triggered.
 
 Also, it would be IDEAL to be able to tell Cassandra to just drop a full 
 SSTable so that I can truncate older data without having to do a major 
 compaction and without having to mark everything with a tombstone.  Is this 
 possible?
 
 
 
 -- 
 
 Founder/CEO Spinn3r.com
 Location: San Francisco, CA
 Skype: burtonator
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 
 War is peace. Freedom is slavery. Ignorance is strength. Corporations are 
 people.
 



Re: Schema disagreement errors

2014-05-13 Thread Vincent Mallet
Hey Gaurav,

You should consider moving to 2.0.7 which fixes a bunch of these schema
disagreement problems. You could also play around with nodetool
resetlocalschema on the nodes that are behind, but be careful with that
one. I'd go with 2.0.7 first for sure.

Thanks,

   Vince.


On Mon, May 12, 2014 at 12:31 PM, Gaurav Sehgal gsehg...@gmail.com wrote:

 We have recently started seeing a lot of Schema Disagreement errors. We
 are using Cassandra 2.0.6 with Oracle Java 1.7. I went through the
 Cassandra FAQ and followed the below steps:



- nodetool disablethrift
- nodetool disablegossip
- nodetool drain
- 'kill <pid>'


 As per the documentation, the commit logs should have been flushed, but that
 did not happen in our case. The commit logs were still there, so I removed
 them manually to make sure there were no commit logs when Cassandra started
 up (which was fine in our case, as this data can always be replayed). I
 also deleted the schema* directory from the /data/system folder.

 However, when we started Cassandra back up, the issue started happening again.


 Any help would be appreciated

 Cheers!
 Gaurav





How to balance this cluster out ?

2014-05-13 Thread Oleg Dulin

I have a cluster that looks like this:

Datacenter: us-east
==
Replicas: 2

Address   Rack  Status  State   Load       Owns     Token
                                                    113427455640312821154458202477256070484
*.*.*.1   1b    Up      Normal  141.88 GB  66.67%   56713727820156410577229101238628035242
*.*.*.2   1a    Up      Normal  113.2 GB   66.67%   210
*.*.*.3   1d    Up      Normal  102.37 GB  66.67%   113427455640312821154458202477256070484



Obviously, the first node in 1b has 40% more data than the others. If I 
wanted to rebalance this cluster, how would I go about that? Would 
shifting the tokens accomplish what I need, and which tokens?
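
For reference, evenly spaced RandomPartitioner tokens (which these look like) for 
three nodes would be i * 2^127 / 3; a quick check:

import java.math.BigInteger;

public class EvenTokens {
    public static void main(String[] args) {
        // RandomPartitioner token range is 0 .. 2^127; for N nodes the evenly
        // spaced tokens are i * (2^127 / N).
        BigInteger step = BigInteger.valueOf(2).pow(127).divide(BigInteger.valueOf(3));
        for (int i = 0; i < 3; i++) {
            System.out.println(step.multiply(BigInteger.valueOf(i)));
        }
        // Prints 0, 56713727820156410577229101238628035242 and
        // 113427455640312821154458202477256070484 -- which, give or take the
        // 210 on the second node, matches the ring above, so the tokens
        // themselves are already evenly spaced.
    }
}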


Regards,
Oleg




Re: Schema disagreement errors

2014-05-13 Thread Robert Coli
On Tue, May 13, 2014 at 5:11 PM, Donald Smith 
donald.sm...@audiencescience.com wrote:

  I too have noticed that after doing “nodetool flush” (or “nodetool
 drain”), the commit logs are still there. I think they’re NEW (empty)
 commit logs, but I may be wrong. Anyone know?


Assuming they are being correctly marked clean after drain (which
historically has been a nontrivial assumption) they are new, empty
commit log segments which have been recycled.

=Rob



Datacenter understanding question

2014-05-13 Thread ng
If I have a configuration of two data centers with one node each, and the
replication factor is also 1, will these 2 nodes be mirrored/replicated?


Re: Avoiding email duplicates when registering users

2014-05-13 Thread Nikolay Mihaylov
The real question is: if you want the email to be unique, why use a
surrogate UUID primary key at all?

I wonder what the UUID gives you.

If you want a non-email primary key, why not use md5(email)?
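
For completeness, the conditional insert Tyler describes below reports success via 
the [applied] column in its result; a rough sketch with the Java driver (keyspace 
and column names are assumed):

import java.util.UUID;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class RegisterUser {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("accounts"); // assumed keyspace
        UUID userId = UUID.randomUUID();

        // Only succeeds if no row exists yet for this email (Cassandra 2.0+).
        ResultSet rs = session.execute(
            "INSERT INTO email_to_uuid (email, user_id) " +
            "VALUES ('user@example.com', " + userId + ") IF NOT EXISTS");
        Row row = rs.one();

        if (row.getBool("[applied]")) {
            // We own this email: safe to insert into the users table.
        } else {
            // Already registered: the existing row's columns come back
            // alongside [applied] = false, so no extra SELECT is needed.
        }
        cluster.close();
    }
}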




On Wed, May 7, 2014 at 2:19 AM, Tyler Hobbs ty...@datastax.com wrote:


 On Mon, May 5, 2014 at 10:27 AM, Ignacio Martin natx...@gmail.com wrote:


 When a user registers, the server generates a UUID and performs an INSERT
 ... IF NOT EXISTS into the email_to_UUID table. Immediately after, perform
 a SELECT from the same table and see if the read UUID is the same as the
 one we just generated. If it is, we are allowed to INSERT the data into the
 user table, knowing that no one else will be doing it.


 INSERT ... IF NOT EXISTS is the correct thing to do here, but you don't
 need to SELECT afterwards.  If the row does exist, the query results will
 show that the insert was not applied and the existing row will be returned.


 --
 Tyler Hobbs
 DataStax http://datastax.com/



Really need some advices on large data considerations

2014-05-13 Thread Yatong Zhang
Hi,

We're going to deploy a large Cassandra cluster at the PB level. Our scenario
would be:

1. Lots of writes: about 150 writes/second on average, each about 300 KB in size.
2. Relatively very small reads
3. Our data will be never updated
4. But we will delete old data periodically to free space for new data

We've learned that the compaction strategy is an important point, because we've
run into 'no space' trouble with the 'size tiered' compaction strategy.
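
To put the write volume in perspective, a rough back-of-the-envelope calculation
(before replication and compression):

public class IngestEstimate {
    public static void main(String[] args) {
        double writesPerSecond = 150;
        double bytesPerWrite = 300 * 1024;                   // ~300 KB per write
        double perSecond = writesPerSecond * bytesPerWrite;  // ~46 MB/s
        double perDay = perSecond * 86400;                   // ~4 TB/day
        double perYear = perDay * 365;                       // ~1.45 PB/year
        System.out.printf("%.0f MB/s, %.1f TB/day, %.2f PB/year%n",
                perSecond / 1e6, perDay / 1e12, perYear / 1e15);
    }
}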

We've read http://wiki.apache.org/cassandra/LargeDataSetConsiderations -- is this
enough, and is it up to date? From our experience, changing any settings/schema
while a large cluster is online and has been running for some time is really a
pain. So we're gathering more info and hoping for some more practical suggestions
before we set up the Cassandra cluster.

Thanks, and any help is greatly appreciated.


Re: How long are expired values actually returned?

2014-05-13 Thread Sebastian Schmidt
Ah thank you!
Am 12.05.2014 16:31, schrieb Peter Reilly:
  You need to set the grace period as well.

 Peter


 On Thu, May 8, 2014 at 8:44 AM, Sebastian Schmidt isib...@gmail.com
 mailto:isib...@gmail.com wrote:

 Hi,

 I'm using the TTL feature for my application. In my tests, when
 using a
 TTL of 5, the inserted rows are still returned after 7 seconds, and
  after 70 seconds. Is this normal or am I doing something wrong?

 Kind Regards,
 Sebastian




