Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Emalayan Vairavanathan
I am sorry if I was not clear. I was using "nodes" to refer to machines (and 
vice versa).

Let me put it another way... 

The application is composed of multiple instances of an executable. The 
application runs on multiple machines concurrently. All the instances are going 
to issue the same CQL commands and try to create exactly the same namespace and 
column families.

Thank you
Emalayan



 From: Arthur Zubarev 
To: Emalayan Vairavanathan ; user@cassandra.apache.org 
Sent: Thursday, 23 May 2013 1:15 PM
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently
 


So where are the multiple nodes? I am just puzzled  
From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:43 PM
To: Arthur Zubarev ; user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple 
nodes concurrently
  "Would 
each device/machine have its own keyspace?"
 
No. 
All the machines are going to run the exactly same CQL commands and going to 
create the same namespace and column families.
 
Thank 
you
Emalayan
 


 From: Arthur Zubarev 
To: Emalayan Vairavanathan ; user@cassandra.apache.org 
Sent: Thursday, 23 May 2013 12:20 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently

 
Would each device/machine have its own keyspace?
 
Basically, your client needs to take care of verifying a successful creation of 
the schema, plus any other checks, and that is going to be time consuming.  
From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:07 PM
To: user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple 
nodes concurrently
  Hi Arthur and Farraz,

Thank you for getting back to me.

I am trying to avoid sync among concurrent instances and this is why I am 
preferring Option - 2. Further, in my application I have a reasonable window 
between the application initialization phase and the application runtime.  So 
as long as Cassandra can safely handle concurrent creation I should be fine.

Do you have any idea how Cassandra is going to handle concurrent namespace and 
column family creation (here all the instances are going to create the same 
namespace and column families concurrently)? 
- Does Cassandra take much time to agree on a final schema (in case Cassandra 
is using some sort of exponential back-off algorithm to handle schema 
conflicts)? 
- Or is it going to result in schema conflicts which need manual intervention? 
- Or will this result in race conditions? 
- Or some other issues, e.g. memory / CPU / network bottlenecks?  

Thank you
Emalayan
 


 From: Arthur Zubarev 
To: user@cassandra.apache.org; svemala...@yahoo.com 
Sent: Wednesday, 22 May 2013 8:07 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently

 
I am assuming here you want to sync all the 100s of nodes once the application 
is airborne. I suspect this would flood the network and even potentially affect 
the machine itself memory-wise. How are you going to maintain the nodes 
(compaction + repair)? 
 
Regards,

Arthur


 
 
-Original Message-
From: Emalayan Vairavanathan 
To: user 
Sent: Wed, May 22, 2013 8:31 pm
Subject: Creating namespace and column family from multiple nodes concurrently


Hi all,
 
I am implementing a distributed application which runs on 100s of machines 
concurrently. This application is going to use Cassandra as the underlying 
storage.
 
The application creates the schema (namespace and column families) during the 
initialization phase.  It seems I have two options to create the schema.

Option - 1: Using a single node for schema creation.
Option - 2: Having all the nodes (> 100) run the same schema creation logic 
(first, nodes will check whether the schema is already available and then try 
to create the schema if it is not available already; a sketch of this 
check-then-create step follows).  
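
(For illustration only - a minimal sketch of that check-then-create logic, 
assuming a CQL3 client such as the DataStax Java driver with the native 
transport enabled, and hypothetical names app_ks / events. CREATE ... IF NOT 
EXISTS does not exist in 1.2.2, hence the explicit lookup in 
system.schema_keyspaces; note the check only narrows the race between 
concurrent creators, it does not remove it.)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SchemaInit
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // 1.2-era existence check: defined keyspaces are visible in the
        // system keyspace.
        boolean exists = session.execute(
            "SELECT keyspace_name FROM system.schema_keyspaces " +
            "WHERE keyspace_name = 'app_ks'").one() != null;

        if (!exists)
        {
            // Another instance may still create the schema between the
            // check above and the statements below.
            session.execute("CREATE KEYSPACE app_ks WITH replication = " +
                            "{'class': 'SimpleStrategy', 'replication_factor': 3}");
            session.execute("CREATE TABLE app_ks.events " +
                            "(id uuid PRIMARY KEY, payload text)");
        }
        cluster.shutdown();
    }
}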
 
To keep the initialization phase simple, I prefer to go for Option - 2. However 
I am not sure how Cassandra is going to behave if multiple nodes try to create 
the same schema (namespace and column families) concurrently. It would be nice 
if someone can tell me about the implications of Option - 2 with Cassandra 
version 1.2.2.

Please let me know if you have questions.

Thank you
VE

Re: High performance disk io

2013-05-23 Thread aaron morton
>  I am currently trying to really study the effect of the width of a row 
> (being in multiple sstables) vs its 95th percentile read time.
I'd be interested to see your findings. 

I use 3+ SSTables per read (from cfhistograms) as a warning sign to dig 
deeper into the data model. Also the type of query impacts the number of 
SSTables per read; queries by column name can short circuit and may be served 
from (say) 0 or 1 sstables even if the row is spread out. 
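
(For reference: the per-read sstable distribution is the SSTables column in 
the output of nodetool cfhistograms <keyspace> <cf>.)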

> -We don’t change anything and just keep upping our keycache.
> 

800MB is a very high key cache and may result in poor GC performance, which is 
ultimately going to hurt your read latency. Pay attention to what GC is doing, 
both ParNew and CMS, and reduce the key cache if needed. When ParNew runs the 
server is stalled. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 24/05/2013, at 3:16 AM, Edward Capriolo  wrote:

> I have used both rotation disks with lots of RAM as well as SSD devices. An 
> important thing to consider is that SSD devices are not magic. You have 
> big-o-notation in several places. 
> 1) more data large bloom filters
> 2) more data (larger key caches) JVM overhead
> 3) more requests more young gen JVM overhead
> 4) more data longer compaction (even with ssd)
> 5) more writes (more memtable flushing)
> Bottom line: more data more disk seeks
> 
> We have used both the mid level SSD as well as the costly fusion io. Fit in 
> RAM/VFS cache delivers better, more predictable low latency; even with very 
> fast disks the average, 95th, and 99th percentiles can be very far apart. 
> I am currently trying to really study the effect of the width of a row (being 
> in multiple sstables) vs its 95th percentile read time.
> 
> 
> On Thu, May 23, 2013 at 10:43 AM, Christopher Wirt  
> wrote:
> Hi Igor,
> 
>  
> 
> I was talking about 99th percentile from the Cassandra histograms when I said 
> ‘1 or 2 ms for most cf’.
> 
>  
> 
> But we have measured client side too and generally get a couple ms added on 
> top.. as one might expect.
> 
>  
> 
> Anyone interested -
> 
> diskio (my original question) we have tried out the multiple SSD setup and 
> found it to work well and reduce the impact of a repair on node performance.
> 
> We ended up going with the single data directory in cassandra.yaml and mount 
> one SSD against that. Then have a dedicated SSD per large column family.
> 
> We’re now moving all of our nodes to have the same setup.
> 
>  
> 
>  
> 
> Chris
> 
>  
> 
> From: Igor [mailto:i...@4friends.od.ua] 
> Sent: 23 May 2013 15:00
> To: user@cassandra.apache.org
> Subject: Re: High performance disk io
> 
>  
> 
> Hello Christopher,
> 
> BTW, are you talking about 99th percentiles on client side, or about 
> percentiles from cassandra histograms for CF on cassandra side?
> 
> Thanks!
> 
> On 05/22/2013 05:41 PM, Christopher Wirt wrote:
> 
> Hi Igor,
> 
>  
> 
> Yea same here, 15ms for 99th percentile is our max. Currently getting one or 
> two ms for most CF. It goes up at peak times which is what we want to avoid.
> 
>  
> 
> We’re using Cass 1.2.4 w/vnodes and our own barebones driver on top of 
> thrift. Needed to be .NET so Hector and Astyanax were not options.
> 
>  
> 
> Do you use SSDs or multiple SSDs in any kind of configuration or RAID?
> 
>  
> 
> Thanks
> 
>  
> 
> Chris
> 
>  
> 
> From: Igor [mailto:i...@4friends.od.ua] 
> Sent: 22 May 2013 15:07
> To: user@cassandra.apache.org
> Subject: Re: High performance disk io
> 
>  
> 
> Hello
> 
> What level of read performance do you expect? We have limit 15 ms for 99 
> percentile with average read latency near 0.9ms. For some CF 99 percentile 
> actually equals to 2ms, for other - to 10ms, this depends on the data volume 
> you read in each query.
> 
> Tuning read performance involved cleaning up data model, tuning 
> cassandra.yaml, switching from Hector to astyanax, tuning OS parameters.
> 
> On 05/22/2013 04:40 PM, Christopher Wirt wrote:
> 
> Hello,
> 
>  
> 
> We’re looking at deploying a new ring where we want the best possible read 
> performance.
> 
>  
> 
> We’ve setup a cluster with 6 nodes, replication level 3, 32Gb of memory, 8Gb 
> Heap, 800Mb keycache, each holding 40/50Gb of data on a 200Gb SSD and 500Gb 
> SATA for OS and commitlog
> 
> Three column families
> 
> ColFamily1 50% of the load and data
> 
> ColFamily2 35% of the load and data
> 
> ColFamily3 15% of the load and data
> 
>  
> 
> At the moment we are still seeing around 20% disk utilisation and 
> occasionally as high as 40/50% on some nodes at peak time.. we are conducting 
> some semi live testing.
> 
> CPU looks fine, memory is fine, keycache hit rate is about 80% (could be 
> better, so maybe we should be increasing the keycache size?)
> 
>  
> 
> Anyway, we’re looking into what we can do to improve this.
> 
>  
> 
> One conversion we are having at the moment is around the SSD disk setup..
> 
>  
> 
> We are

Re: For those using Cassandra from .Net

2013-05-23 Thread aaron morton
Thanks, when and where is the talk? 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 23/05/2013, at 6:42 AM, Peter Lin  wrote:

> 
> NativeX is giving a talk about using Cassandra with .Net. Our firm created a 
> port of Hector over to .Net late last year.
> 
> Here is the abstract.
> 
> The Perils and Triumphs of using Cassandra at a .NET/Microsoft Shop
> 
> Speakers: Derek Bromenshenkel and Jeff Smoley, Infrastructure Architects at 
> NativeX
> 
>  
> NativeX (formerly W3i) recently transitioned a large portion of their backend 
> infrastructure from Microsoft SQL Server to Apache Cassandra. Today, its 
> Cassandra cluster backs its mobile advertising network supporting over 10 
> million daily active users that produce over 10,000 transactions per second 
> with an average database request latency of under 2 milliseconds. Come hear 
> our story about how we were successful at getting our .NET web apps to 
> reliably connect to Cassandra. Come learn about FluentCassandra, Snowflake, 
> Hector, and IKVM. It's a story of struggle and perseverance, where everyone 
> lives happily ever after.
> 
> 



Re: bootstrapping a new node...

2013-05-23 Thread aaron morton
> 1.  Is compaction supposed to go off during a bootstrapping node?
When a new file is received during streaming it is added to the list of 
SSTables for the CF through the same process as an SSTable flush. Once the 
SSTable count gets high enough compaction will do its thing. 

> 2.  I seem to recall a bootstrap node setting in cassandra.yaml but that was 
> not one of the steps I recall in the datastax docs we went off of……in 1.2.2, 
> is there any setting we need to set for a bootstrapping node that we 
> missed(our other nodes joined just fine though and seem to be working great).
The elders speak of an auto_bootstrap setting from the before time. It 
defaults to true; you can add it to the yaml if you want to disable it. 

If I'm working on a cluster that is under stress I'll increase the 
phi_convict_threshold to 16 via yaml or JMX. I *think* it's not necessary in 
later versions but have not checked. 

> 3.  What can I do to get this node to start streaming files again …can I just 
> reboot the cassandra or should I start from scratch somehow?

Without Ops Centre I use this to track netstat progress (two snapshots taken 
60 seconds apart, so any streaming progress shows up in the diff):

diff <(nodetool netstats) <(sleep 60 && nodetool netstats)  

If you restart the bootstrapping node it will retry the bootstrapping process, 
you should see "Detected previous bootstrap failure; retrying" in the log. 

Use auto_bootstrap to prevent this 

> 4.  IF I need to start from scratch, I assume I a) stop the node, b) wipe 
> commitlog and data directories, c) start the node back up.  Would that be 
> correct?  After all, the other nodes don't seem to know about this new node 
> according to "nodetool ring" command.
yes. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 22/05/2013, at 2:23 AM, "Hiller, Dean"  wrote:

> We are using 1.2.2 cassandra and have rolled on 3 additional nodes to our 6 
> node cluster (totalling 9 so far).  We are trying to roll on node 10 but 
> during the streaming a compaction kicked off which seemed very odd to us.  
> "nodetool netstats" still reported tons of files that were not transferred 
> yet.  Is it normal that compaction might kick off during bootstrapping a 
> new node?  Our node still says "Joining" in "nodetool netstats" as well.  The 
> ring does not show the new node yet either.  Lastly, "nodetool netstats" 
> reports 0% on EVERY single file and this doesn't seem to change.  The 
> bootstrap node seems hung, so a few questions:
> 
> 1.  Is compaction supposed to go off during a bootstrapping node?
> 2.  I seem to recall a bootstrap node setting in cassandra.yaml but that was 
> not one of the steps I recall in the datastax docs we went off of……in 1.2.2, 
> is there any setting we need to set for a bootstrapping node that we 
> missed(our other nodes joined just fine though and seem to be working great).
> 3.  What can I do to get this node to start streaming files again …can I just 
> reboot the cassandra or should I start from scratch somehow?
> 4.  IF I need to start from scratch, I assume I a) stop the node, b) wipe 
> commitlog and data directories, c) start the node back up.  Would that be 
> correct?  After all, the other nodes don't seem to know about this new node 
> according to "nodetool ring" command.
> 
> Thanks for any help on this one,
> Dean



Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-05-23 Thread aaron morton
> Any other ideas?
Sounds like a nasty heisenbug, can you replace or rebuild the machine?
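
(For illustration - a minimal sketch of the "catch the zero length key in 
your map job" suggestion quoted further down this thread. The mapper shape 
and the buildRowKey / buildMutations helpers are hypothetical placeholders, 
not the poster's actual job; only the guard itself is the point.)

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

import org.apache.cassandra.thrift.Mutation;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// BulkOutputFormat consumes (ByteBuffer, List<Mutation>) pairs, so the
// guard sits right before context.write().
public class GuardedBulkLoadMapper
        extends Mapper<LongWritable, Text, ByteBuffer, List<Mutation>>
{
    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException
    {
        ByteBuffer rowKey = buildRowKey(line);

        // A zero-length key hashes to token -1 under RandomPartitioner and
        // fails in the receiving node's SSTableWriter, so drop it here.
        if (rowKey == null || rowKey.remaining() == 0)
        {
            context.getCounter("bulkload", "empty_keys_skipped").increment(1);
            return;
        }
        context.write(rowKey, buildMutations(line));
    }

    private ByteBuffer buildRowKey(Text line)
    {
        return ByteBuffer.wrap(line.toString().getBytes()); // placeholder
    }

    private List<Mutation> buildMutations(Text line)
    {
        return Collections.emptyList(); // placeholder
    }
}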

 Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/05/2013, at 9:36 PM, Michal Michalski  wrote:

> I've finally had some time to experiment a bit with this problem (it occurred 
> twice again) and here's what I found:
> 
> 1. So far (three occurences in total), *when* it happened, it happened only 
> for streaming to  *one* specific C* node (but it works on this node too for 
> 99,9% of the time)
> 2. It happens with compression turned on (cassandra.output.compression.class 
> set to org.apache.cassandra.io.compress.DeflateCompressor, but it doesn't 
> matter what the chunk length is)
> 3. Everything works fine when compression is turned off.
> 
> So it looks like I have a workaround for now, but I don't really understand 
> the root cause of this problem and what's the "right" solution if we want to 
> keep using compression.
> 
> Anyway, the thing that interests me the most is why does it fail so rarely 
> and - assuming it's not a coincidence - why only for one C* node...
> 
> May it be a DeflateCompressor's bug?
> Any other ideas?
> 
> Regards,
> Michał
> 
> 
> W dniu 31.03.2013 12:01, aaron morton pisze:
>>>  but yesterday one of 600 mappers failed
>>> 
>> :)
>> 
>>> From what I can understand by looking into the C* source, it seems to me 
>>> that the problem is caused by an empty (or surprisingly finished?) input 
>>> buffer (?) causing token to be set to -1 which is improper for 
>>> RandomPartitioner:
>> Yes, there is a zero length key which as a -1 token.
>> 
>>> However, I can't figure out what's the root cause of this problem.
>>> Any ideas?
>> mmm, the BulkOutputFormat uses a SSTableSimpleUnsortedWriter and neither of 
>> them check for zero length row keys. I would look there first.
>> 
>> There is no validation in the  AbstractSSTableSimpleWriter, not sure if that 
>> is by design or an oversight. Can you catch the zero length key in your map 
>> job ?
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 28/03/2013, at 2:26 PM, Michal Michalski  wrote:
>> 
>>> We're streaming data to Cassandra directly from MapReduce job using 
>>> BulkOutputFormat. It's been working for more than a year without any 
>>> problems, but yesterday one of 600 mappers faild and we got a 
>>> strange-looking exception on one of the C* nodes.
>>> 
>>> IMPORTANT: It happens on one node and on one cluster only. We've loaded the 
>>> same data to test cluster and it worked.
>>> 
>>> 
>>> ERROR [Thread-1340977] 2013-03-28 06:35:47,695 CassandraDaemon.java (line 
>>> 133) Exception in thread Thread[Thread-1340977,5,main]
>>> java.lang.RuntimeException: Last written key 
>>> DecoratedKey(5664330507961197044404922676062547179, 
>>> 302c6461696c792c32303133303332352c312c646f6d61696e2c756e6971756575736572732c633a494e2c433a6d63635f6d6e635f636172726965725f43656c6c4f6e655f4b61726e6174616b615f2842616e67616c6f7265295f494e2c643a53616d73756e675f47542d49393037302c703a612c673a3133)
>>>  >= current key DecoratedKey(-1, ) writing into 
>>> /cassandra/production/IndexedValues/production-IndexedValues-tmp-ib-240346-Data.db
>>> at 
>>> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:133)
>>> at 
>>> org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:209)
>>> at 
>>> org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:179)
>>> at 
>>> org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
>>> at 
>>> org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)
>>> at 
>>> org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)
>>> at 
>>> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
>>> 
>>> 
>>> From what I can understand by looking into the C* source, it seems to me 
>>> that the problem is caused by an empty (or surprisingly finished?) input 
>>> buffer (?) causing token to be set to -1 which is improper for 
>>> RandomPartitioner:
>>> 
>>> public BigIntegerToken getToken(ByteBuffer key)
>>> {
>>>if (key.remaining() == 0)
>>>return MINIMUM;  // Which is -1
>>>return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
>>> }
>>> 
>>> However, I can't figure out what's the root cause of this problem.
>>> Any ideas?
>>> 
>>> Of course I can't exclude a bug in my code which streams these data, but - 
>>> as I said - it works when loading the same data to test cluster (which has 
>>> different number of nodes, thus different token assignment, which might be 
>>> a case too).
>>> 
>>> Michał
>> 
>> 
> 



Re: Cassandra hangs on large hinted handoffs

2013-05-23 Thread Edward Capriolo
For some reason the 1.0.7 hints actually use a super column :)


On Thu, May 23, 2013 at 6:18 PM, aaron morton wrote:

> I know how this sounds, but upgrading to 1.1.11 is the best approach.
> 1.0X is not getting any fixes, 1.1X is the most stable and still getting
> some patches, and 1.2 is stable and in use.
>
> Hint storage has been redesigned in 1.2.
>
> Any suggestions on how to make the cluster more tolerant to downtimes?
>
> Hints are always seen as an optimisation, their success or otherwise does
> not impact the consistency guarantees.
>
> If you are dealing with a very high throughput, as a workaround you can
> reduce the time that hints are stored for a down node; see the yaml file
> for info.
>
> The behaviour changes if you have lots of small or large columns; this
> is the code from the HintedHandoff manager that selects the page size.
>
> int pageSize = PAGE_SIZE;
> // read less columns (mutations) per page if they are very large
> if (hintStore.getMeanColumns() > 0)
> {
>     int averageColumnSize = (int) (hintStore.getMeanRowSize() / hintStore.getMeanColumns());
>     pageSize = Math.min(PAGE_SIZE, DatabaseDescriptor.getInMemoryCompactionLimit() / averageColumnSize);
>     // page size of 1 does not allow actual paging b/c of >= behavior on startColumn
>     pageSize = Math.max(2, pageSize);
>     logger_.debug("average hinted-row column size is {}; using pageSize of {}", averageColumnSize, pageSize);
> }
>
> If you reduce the in_memory_compaction_limit yaml setting that would
> reduce the page size
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 21/05/2013, at 9:26 PM, Vladimir Volkov  wrote:
>
> Hello.
>
> I'm stress-testing our Cassandra (version 1.0.9) cluster, and tried
> turning off two of the four nodes for half an hour under heavy load. As a
> result I got a large volume of hints on the alive nodes - HintsColumnFamily
> takes about 1.5 GB disk space on each of the nodes. It seems, these hints
> are never replayed successfully.
>
> After I bring other nodes back online, tpstats shows active handoffs, but
> I can't see any writes on the target nodes.
> The log indicates memory pressure - the heap is >80% full (heap size is
> 8GB total, 1GB young).
>
> A fragment of the log:
>  INFO 18:34:05,513 Started hinted handoff for token: 1 with IP: /
> 84.201.162.144
>  INFO 18:34:06,794 GC for ParNew: 300 ms for 1 collections, 5974181760
> used; max is 8588951552
>  INFO 18:34:07,795 GC for ParNew: 263 ms for 1 collections, 6226018744
> used; max is 8588951552
>  INFO 18:34:08,795 GC for ParNew: 256 ms for 1 collections, 6559918392
> used; max is 8588951552
>  INFO 18:34:09,796 GC for ParNew: 231 ms for 1 collections, 6846133712
> used; max is 8588951552
>  WARN 18:34:09,805 Heap is 0.7978131149667941 full.  You may need to
> reduce memtable and/or cache sizes.  Cassandra will now flush up to the two
> largest memtables to free up memory.
>  WARN 18:34:09,805 Flushing CFS(Keyspace='test', ColumnFamily='t2') to
> relieve memory pressure
>  INFO 18:34:09,806 Enqueuing flush of Memtable-t2@639524673(60608588/571839171
> serialized/live bytes, 743266 ops)
>  INFO 18:34:09,807 Writing Memtable-t2@639524673(60608588/571839171
> serialized/live bytes, 743266 ops)
>  INFO 18:34:11,018 GC for ParNew: 449 ms for 2 collections, 6573394480
> used; max is 8588951552
>  INFO 18:34:12,019 GC for ParNew: 265 ms for 1 collections, 6820930056
> used; max is 8588951552
>  INFO 18:34:13,112 GC for ParNew: 331 ms for 1 collections, 6900566728
> used; max is 8588951552
>  INFO 18:34:14,181 GC for ParNew: 269 ms for 1 collections, 7101358936
> used; max is 8588951552
>  INFO 18:34:14,691 Completed flushing
> /mnt/raid/cassandra/data/test/t2-hc-244-Data.db (56156246 bytes)
>  INFO 18:34:15,381 GC for ParNew: 280 ms for 1 collections, 7268441248
> used; max is 8588951552
>  INFO 18:34:35,306 InetAddress /84.201.162.144 is now dead.
>  INFO 18:34:35,306 GC for ConcurrentMarkSweep: 19223 ms for 1 collections,
> 3774714808 used; max is 8588951552
>  INFO 18:34:35,309 InetAddress /84.201.162.144 is now UP
>
> After taking off the load and restarting the service, I still see pending
> handoffs:
> $ nodetool -h localhost tpstats
> Pool Name               Active   Pending  Completed   Blocked  All time blocked
> ReadStage                    0         0    1004257         0                 0
> RequestResponseStage         0         0      92555         0                 0
> MutationStage                0         0          6         0                 0
> ReadRepairStage              0         0      57773         0                 0
> ReplicateOnWriteStage        0         0          0         0                 0
> GossipStage                  0         0     143332         0                 0
> AntiEntropyStage  

Re: Cassandra hangs on large hinted handoffs

2013-05-23 Thread aaron morton
I know how this sounds, but upgrading to 1.1.11 is the best approach. 
1.0X is not getting any fixes, 1.1X is the most stable and still getting some 
patches, and 1.2 is stable and in use. 

Hint storage has been redesigned in 1.2. 

> Any suggestions on how to make the cluster more tolerant to downtimes?
Hints are always seen as an optimisation, their success or otherwise does not 
impact the consistency guarantees. 

If you are dealing with a very high throughput, as a workaround you can reduce 
the time that hints are stored for a down node; see the yaml file for info. 

The behaviour changes if you have lots of small or large columns; this is the 
code from the HintedHandoff manager that selects the page size. 

int pageSize = PAGE_SIZE;
// read less columns (mutations) per page if they are very large
if (hintStore.getMeanColumns() > 0)
{
    int averageColumnSize = (int) (hintStore.getMeanRowSize() / hintStore.getMeanColumns());
    pageSize = Math.min(PAGE_SIZE, DatabaseDescriptor.getInMemoryCompactionLimit() / averageColumnSize);
    // page size of 1 does not allow actual paging b/c of >= behavior on startColumn
    pageSize = Math.max(2, pageSize);
    logger_.debug("average hinted-row column size is {}; using pageSize of {}", averageColumnSize, pageSize);
}
 
If you reduce the in_memory_compaction_limit yaml setting that would reduce the 
page size 
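
(A worked example, assuming for illustration that PAGE_SIZE is 128 and 
in_memory_compaction_limit_in_mb is 64: an average hinted-row column size of 
1MB gives pageSize = min(128, 64MB / 1MB) = 64 mutations per page, and 
halving the limit to 32MB halves the page to 32.)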

Cheers
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/05/2013, at 9:26 PM, Vladimir Volkov  wrote:

> Hello.
> 
> I'm stress-testing our Cassandra (version 1.0.9) cluster, and tried turning 
> off two of the four nodes for half an hour under heavy load. As a result I 
> got a large volume of hints on the alive nodes - HintsColumnFamily takes 
> about 1.5 GB disk space on each of the nodes. It seems, these hints are never 
> replayed successfully.
> 
> After I bring other nodes back online, tpstats shows active handoffs, but I 
> can't see any writes on the target nodes.
> The log indicates memory pressure - the heap is >80% full (heap size is 8GB 
> total, 1GB young).
> 
> A fragment of the log:
>  INFO 18:34:05,513 Started hinted handoff for token: 1 with IP: 
> /84.201.162.144
>  INFO 18:34:06,794 GC for ParNew: 300 ms for 1 collections, 5974181760 used; 
> max is 8588951552
>  INFO 18:34:07,795 GC for ParNew: 263 ms for 1 collections, 6226018744 used; 
> max is 8588951552
>  INFO 18:34:08,795 GC for ParNew: 256 ms for 1 collections, 6559918392 used; 
> max is 8588951552
>  INFO 18:34:09,796 GC for ParNew: 231 ms for 1 collections, 6846133712 used; 
> max is 8588951552
>  WARN 18:34:09,805 Heap is 0.7978131149667941 full.  You may need to reduce 
> memtable and/or cache sizes.  Cassandra will now flush up to the two largest 
> memtables to free up memory.
>  WARN 18:34:09,805 Flushing CFS(Keyspace='test', ColumnFamily='t2') to 
> relieve memory pressure
>  INFO 18:34:09,806 Enqueuing flush of 
> Memtable-t2@639524673(60608588/571839171 serialized/live bytes, 743266 ops)
>  INFO 18:34:09,807 Writing Memtable-t2@639524673(60608588/571839171 
> serialized/live bytes, 743266 ops)
>  INFO 18:34:11,018 GC for ParNew: 449 ms for 2 collections, 6573394480 used; 
> max is 8588951552
>  INFO 18:34:12,019 GC for ParNew: 265 ms for 1 collections, 6820930056 used; 
> max is 8588951552
>  INFO 18:34:13,112 GC for ParNew: 331 ms for 1 collections, 6900566728 used; 
> max is 8588951552
>  INFO 18:34:14,181 GC for ParNew: 269 ms for 1 collections, 7101358936 used; 
> max is 8588951552
>  INFO 18:34:14,691 Completed flushing 
> /mnt/raid/cassandra/data/test/t2-hc-244-Data.db (56156246 bytes)
>  INFO 18:34:15,381 GC for ParNew: 280 ms for 1 collections, 7268441248 used; 
> max is 8588951552
>  INFO 18:34:35,306 InetAddress /84.201.162.144 is now dead.
>  INFO 18:34:35,306 GC for ConcurrentMarkSweep: 19223 ms for 1 collections, 
> 3774714808 used; max is 8588951552
>  INFO 18:34:35,309 InetAddress /84.201.162.144 is now UP
> 
> After taking off the load and restarting the service, I still see pending 
> handoffs:
> $ nodetool -h localhost tpstats
> Pool NameActive   Pending  Completed   Blocked  All 
> time blocked
> ReadStage 0 01004257 0
>  0
> RequestResponseStage  0 0  92555 0
>  0
> MutationStage 0 0  6 0
>  0
> ReadRepairStage   0 0  57773 0
>  0
> ReplicateOnWriteStage 0 0  0 0
>  0
> GossipStage   0 0 143332 0
>  0
> AntiEntropyStage  0 0  0 0
>  0
> MigrationStage0  

Re: Cassandra read repair

2013-05-23 Thread aaron morton
If you are reading and writing at CL QUORUM and getting inconsistent results 
that sounds like a bug. If you are mixing the CL levels such that R + W <= N 
then it's expected behaviour. (For example, with RF = 3, QUORUM reads and 
writes give R + W = 2 + 2 = 4 > 3, so the read and write quorums always 
overlap; writing at ONE and reading at QUORUM gives 1 + 2 = 3 <= 3 and can 
return stale data.) 


Can you reproduce the issue outside of your app ? 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/05/2013, at 8:55 PM, Kais Ahmed  wrote:

> > Checking you do not mean the row key is corrupt and cannot be read. 
> Yes, I can read it but all reads don't return the same result except for CL ALL
> 
> > By default in 1.X and beyond the default read repair chance is 0.1, so it's 
> > only enabled on 10% of requests. 
> You are right, read repair chance is set to 0.1, but I launched a read repair 
> which did not solve the problem. Any idea?
> 
> >What CL are you writing at ? 
> All write are in CL QUORUM
> 
> thank you aaron for your answer. 
> 
> 
> 2013/5/21 aaron morton 
>> Only some keys of one CF are corrupt. 
> Checking you do not mean the row key is corrupt and cannot be read. 
> 
>> I thought using CF ALL, would correct the problem with READ REPAIR, but by 
>> returning to CL QUORUM, the problem persists.
>> 
> 
> By default in 1.X and beyond the default read repair chance is 0.1, so it's 
> only enabled on 10% of requests. 
> 
> 
> In the absence of further writes all reads (at any CL) should return the same 
> value. 
> 
> What CL are you writing at ? 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 19/05/2013, at 1:28 AM, Kais Ahmed  wrote:
> 
>> Hi all,
>> 
>> I encountered a consistency problem one some keys using phpcassa and 
>> Cassandra 1.2.3 since a server crash 
>> 
>> Only some keys of one CF are corrupt. 
>> 
>> I launched a nodetool repair that successfully completed but didn't correct 
>> the issue.
>> 
>> 
>> 
>> When i try to get a corrupt Key with :
>> 
>> CL ONE, the result contains 7 or 8 or 9 columns
>> 
>> CL QUORUM, result contains 8 or 9 columns
>> 
>> CL ALL, the data is consistent and returns always 9 columns
>> 
>> 
>> 
>> I thought using CF ALL, would correct the problem with READ REPAIR, but by 
>> returning to CL QUORUM, the problem persists.
>> 
>> 
>> Thank you for your help
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 



Re: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Robert Coli
On Wed, May 22, 2013 at 11:32 PM, Tamar Fraenkel wrote:

> I am using Hector HLockManagerImpl, which creates a keyspace named
> HLockManagerImpl and CF HLocks.
> For some reason I have a row with a single column that should have expired
> yesterday which is still there.
> I tried deleting it using cli, but it is stuck...
> Any ideas how to delete it?
>

"is still there" is sorta ambiguous. Do you mean that clients see it or
that it is still in the (immutable) data file it was previously in?

If the latter, what is gc_grace_seconds set to? Make sure it's set to a low
value and then make sure that your TTL-expired key is compacted?
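
(For illustration - a minimal sketch of lowering gc_grace_seconds on that CF 
through Hector's DDL API, since the keyspace was created by Hector; the 
cluster name and the exact method names are assumptions worth verifying 
against your Hector version.)

import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

public class LowerGcGrace
{
    public static void main(String[] args)
    {
        Cluster cluster = HFactory.getOrCreateCluster("test-cluster", "localhost:9160");
        KeyspaceDefinition ksDef = cluster.describeKeyspace("HLockManagerImpl");
        for (ColumnFamilyDefinition cfDef : ksDef.getCfDefs())
        {
            if ("HLocks".equals(cfDef.getName()))
            {
                // Low gc_grace so the TTL-expired tombstone can be purged
                // by the next compaction that touches it.
                cfDef.setGcGraceSeconds(3600);
                cluster.updateColumnFamily(cfDef);
            }
        }
    }
}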

=Rob


Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Arthur Zubarev
So where are the multiple nodes? I am just puzzled 

From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:43 PM
To: Arthur Zubarev ; user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently

"Would each device/machine have its own keyspace?"

No. All the machines are going to run exactly the same CQL commands and are 
going to create the same namespace and column families.

Thank you
Emalayan



From: Arthur Zubarev 
To: Emalayan Vairavanathan ; user@cassandra.apache.org 
Sent: Thursday, 23 May 2013 12:20 PM
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently


Would each device/machine have its own keyspace?

Basically, your client needs to take care of verifying a successful creation of 
the schema, plus any other checks, and that is going to be time consuming. 

From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:07 PM
To: user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently

Hi Arthur and Farraz,


Thank you for getting back to me.


I am trying to avoid sync among concurrent instances and this is why I am 
preferring Option - 2. Further, in my application I have a reasonable window 
between the application initialization phase and the application runtime.  So 
as long as Cassandra can safely handle concurrent creation I should be fine.


Do you have any idea how Cassandra is going to handle concurrent namespace and 
column family creation (Here all the instances are going to create the same 
namespace and column families concurrently)? 
- Does Cassandra take much time to agree on a final schema (in case 
Cassandra is using some sort of exponential back-off algorithm to handle 
schema conflicts)? 
- Or is it going to result in schema conflicts which need manual 
intervention?
- Or will this result in race conditions?
- Or some other issues, e.g. memory / CPU / network bottlenecks?  


Thank you
Emalayan



From: Arthur Zubarev 
To: user@cassandra.apache.org; svemala...@yahoo.com 
Sent: Wednesday, 22 May 2013 8:07 PM
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently


I am assuming here you want to sync all the 100s of nodes once the application 
is airborne. I suspect this would flood the network and even potentially affect 
the machine itself memory-wise. How are you going to maintain the nodes 
(compaction+repair)? 


Regards,

Arthur




-Original Message-
From: Emalayan Vairavanathan 
To: user 
Sent: Wed, May 22, 2013 8:31 pm
Subject: Creating namespace and column family from multiple nodes concurrently


Hi all,

I am implementing a distributed application which runs on 100s of machines 
concurrently. This application is going to use Cassandra as the underlying storage.

The application creates the schema (name space and column families) during 
initialization phase.  It seems I have two options to create the schema.


Option - 1: Using a single node for schema creation.
Option - 2: Having all the nodes (> 100) run the same schema 
creation logic (first, nodes will check whether the schema is already available 
and then try to create the schema if it is not available already).  

To keep the initialization phase simple, I prefer to go for Option - 2. However 
I am not sure how Cassandra is going to behave if multiple nodes try to create 
the same schema (namespace and column families) concurrently. It would be nice 
if someone can tell me about the implications of Option - 2 with Cassandra 
version 1.2.2.


Please let me know if you have questions.


Thank you
VE





 








Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Robert Coli
On Thu, May 23, 2013 at 12:07 PM, Emalayan Vairavanathan
 wrote:
> Do you have any idea how Cassandra is going to handle concurrent namespace
> and column family creation (Here all the instances are going to create the
> same namespace and column families concurrently)?
> [...]
> However I am not sure how Cassandra is going to behave if multiple nodes try
> to create the same schema (namespace and column families) concurrently. It
> would be nice if someone can tell me about the implications of Option - 2
> with Cassandra version 1.2.2.

Concurrent CREATE is allegedly working in 1.2.0, per NEWS.txt [1]. I
say allegedly working because this feature was also allegedly working
in 1.1.0. Given past experience, I continue to (perhaps
pessimistically) believe that frequent dynamic updates of schema are
likely to result in schema desynch. I would be interested to hear if
you go down this route and do not encounter problems.

See also CASSANDRA-3794 [2] for details.
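
(One way to check for the desynch Rob describes: nodetool describecluster 
prints the schema versions the nodes are on; more than one version after the 
concurrent CREATEs means the schemas have diverged.)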

=Rob

[1] https://github.com/apache/cassandra/blob/cassandra-1.2/NEWS.txt
[2] https://issues.apache.org/jira/browse/CASSANDRA-3794


Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Emalayan Vairavanathan
"Would each device/machine have its own keyspace?"

No. All the machines are going to run exactly the same CQL commands and are 
going to create the same namespace and column families.

Thank you
Emalayan



 From: Arthur Zubarev 
To: Emalayan Vairavanathan ; user@cassandra.apache.org 
Sent: Thursday, 23 May 2013 12:20 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently
 


Would each device/machine have its own keyspace?
 
Basically, your client needs to take care of verifying a successful creation of 
the schema, plus any other checks, and that is going to be time consuming.  
From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:07 PM
To: user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple nodes concurrently
  Hi Arthur and Farraz,

Thank you for getting back to me.

I am trying to avoid sync among concurrent instances and this is why I am 
preferring Option - 2. Further, in my application I have a reasonable window 
between the application initialization phase and the application runtime.  So 
as long as Cassandra can safely handle concurrent creation I should be fine.

Do you have any idea how Cassandra is going to handle concurrent namespace and 
column family creation (here all the instances are going to create the same 
namespace and column families concurrently)? 
- Does Cassandra take much time to agree on a final schema (in case Cassandra 
is using some sort of exponential back-off algorithm to handle schema 
conflicts)? 
- Or is it going to result in schema conflicts which need manual intervention? 
- Or will this result in race conditions? 
- Or some other issues, e.g. memory / CPU / network bottlenecks?  

Thank you
Emalayan
 


 From: Arthur Zubarev 
To: user@cassandra.apache.org; svemala...@yahoo.com 
Sent: Wednesday, 22 May 2013 8:07 PM
Subject: Re: Creating namespace and column family from multiple nodes concurrently

 
I am assuming here you want to sync all the 100s of nodes once the application 
is airborne. I suspect this would flood the network and even potentially affect 
the machine itself memory-wise. How are you going to maintain the nodes 
(compaction + repair)? 
 
Regards,

Arthur


 
 
-Original Message-
From: Emalayan Vairavanathan 
To: user 
Sent: Wed, May 22, 2013 8:31 pm
Subject: Creating namespace and column family from multiple nodes concurrently


Hi all,
 
I am implementing a distributed application which runs on 100s of machines 
concurrently. This application is going to use Cassandra as the underlying 
storage.
 
The application creates the schema (namespace and column families) during the 
initialization phase.  It seems I have two options to create the schema.

Option - 1: Using a single node for schema creation.
Option - 2: Having all the nodes (> 100) run the same schema creation logic 
(first, nodes will check whether the schema is already available and then try 
to create the schema if it is not available already).  
 
To keep the initialization phase simple, I prefer to go for Option - 2. However 
I am not sure how Cassandra is going to behave if multiple nodes try to create 
the same schema (namespace and column families) concurrently. It would be nice 
if someone can tell me about the implications of Option - 2 with Cassandra 
version 1.2.2.

Please let me know if you have questions.

Thank you
VE

Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Arthur Zubarev
Would each device/machine have its own keyspace?

Basically, your client needs to take care of verifying a successful creation of 
the schema, plus any other checks, and that is going to be time consuming. 

From: Emalayan Vairavanathan 
Sent: Thursday, May 23, 2013 3:07 PM
To: user@cassandra.apache.org 
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently

Hi Arthur and Farraz,


Thank you for getting back to me.


I am trying to avoid sync among concurrent instances and this is why I am 
preferring Option - 2. Further, in my application I have a reasonable window 
between the application initialization phase and the application runtime.  So 
as long as Cassandra can safely handle concurrent creation I should be fine.


Do you have any idea how Cassandra is going to handle concurrent namespace and 
column family creation (Here all the instances are going to create the same 
namespace and column families concurrently)? 
- Does Cassandra take much time to agree on a final schema (in case 
Cassandra is using some sort of exponential back-off algorithm to handle 
schema conflicts)? 
- Or is it going to result in schema conflicts which need manual 
intervention?
- Or will this result in race conditions?
- Or some other issues, e.g. memory / CPU / network bottlenecks?  


Thank you
Emalayan



From: Arthur Zubarev 
To: user@cassandra.apache.org; svemala...@yahoo.com 
Sent: Wednesday, 22 May 2013 8:07 PM
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently


I am assuming here you want to sync all the 100s of nodes once the application 
is airborne. I suspect this would flood the network and even potentially affect 
the machine itself memory-wise. How are you going to maintain the nodes 
(compaction+repair)? 


Regards,

Arthur




-Original Message-
From: Emalayan Vairavanathan 
To: user 
Sent: Wed, May 22, 2013 8:31 pm
Subject: Creating namespace and column family from multiple nodes concurrently


Hi all,

I am implementing a distributed application which runs on 100s of machines 
concurrently. This application is going to use Cassandra as the underlying storage.

The application creates the schema (name space and column families) during 
initialization phase.  It seems I have two options to create the schema.


Option - 1: Using a single node for schema creation.
Option - 2: Having all the nodes (> 100) run the same schema 
creation logic (first, nodes will check whether the schema is already available 
and then try to create the schema if it is not available already).  

To keep the initialization phase simple, I prefer to go for Option - 2. However 
I am not sure how Cassandra is going to behave if multiple nodes try to create 
the same schema (namespace and column families) concurrently. It would be nice 
if someone can tell me about the implications of Option - 2 with Cassandra 
version 1.2.2.


Please let me know if you have questions.


Thank you
VE





 





Re: Creating namespace and column family from multiple nodes concurrently

2013-05-23 Thread Emalayan Vairavanathan
Hi Arthur and Farraz,

Thank you for getting back to me.

I am trying to avoid sync among concurrent instances and this is why I am 
preferring Option - 2. Further, in my application I have a reasonable window 
between the application initialization phase and the application runtime.  So 
as long as Cassandra can safely handle concurrent creation I should be fine.

Do you have any idea how Cassandra is going to handle concurrent namespace and 
column family creation (Here all the instances are going to create the same 
namespace and column families concurrently)? 
        - Does Cassandra take much time to agree on a final schema (in case 
Cassandra is using some sort of exponential back-off algorithm to handle 
schema conflicts)? 
        - Or is it going to result in schema conflicts which need manual 
intervention?
        - Or will this result in race conditions?
        - Or some other issues, e.g. memory / CPU / network bottlenecks?  

Thank you
Emalayan



 From: Arthur Zubarev 
To: user@cassandra.apache.org; svemala...@yahoo.com 
Sent: Wednesday, 22 May 2013 8:07 PM
Subject: Re: Creating namespace and column family from multiple nodes 
concurrently
 


I am assuming here you want to sync all the 100s of nodes once the application 
is airborne. I suspect this would flood the network and even potentially affect 
the machine itself memory-wise. How are you going to maintain the nodes 
(compaction+repair)?


Regards,

Arthur




-Original Message-
From: Emalayan Vairavanathan 
To: user 
Sent: Wed, May 22, 2013 8:31 pm
Subject: Creating namespace and column family from multiple nodes concurrently


Hi all,

I am implementing a distributed application which runs on 100s of machines 
concurrently. This application is going to use Cassandra as the underlying storage.

The application creates the schema (name space and column families) during 
initialization phase.  It seems I have two options to create the schema.

Option - 1: Using a single node for schema creation.
        Option - 2: Having all the nodes (> 100) run the same schema 
creation logic (first, nodes will check whether the schema is already available 
and then try to create the schema if it is not available already).  

To keep the initialization phase simple, I prefer to go for Option - 2. However 
I am not sure how Cassandra is going to behave if multiple nodes try to create 
the same schema (namespace and column families) concurrently. It would be nice 
if someone can tell me about the implications of Option - 2 with Cassandra 
version 1.2.2.

Please let me know if you have questions.

Thank you
VE

Re: write time of CQL3 set items

2013-05-23 Thread Sylvain Lebresne
>   Does anyone know a way I could expose the write time of set items?
>

You cannot currently, unfortunately.
The problem is really just an API one. Since currently you can only ever
query a full collection, you cannot apply writeTime() to only an element,
and applying it to the whole collection doesn't make sense, in the sense
that each element has its own write time, as you said.

We'll likely allow querying individual elements of collections in the
future, at which point getting the write time of said individual elements
will work. But let's say that today we just don't have a syntax yet to make
it work.
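
(For illustration - a minimal sketch of the map-of-text-to-timestamp 
workaround the question mentions, using the DataStax Java driver; the 
keyspace, table and uuid are hypothetical. The client supplies the timestamp 
when it adds an item, which stands in for the per-element write time that 
WRITETIME() cannot expose.)

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class SetWriteTimes
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("app_ks"); // hypothetical keyspace

        // map<text,timestamp> stands in for set<text>: the value carries
        // the per-item write time.
        session.execute("CREATE TABLE user_tags (" +
                        "  user_id uuid PRIMARY KEY," +
                        "  tags map<text, timestamp>)");

        // Adding one item, stamped with a client-supplied timestamp.
        session.execute("UPDATE user_tags " +
                        "SET tags['premium'] = '2013-05-23 12:00:00+0000' " +
                        "WHERE user_id = 62c36092-82a1-3a00-93d1-46196ee77204");

        Row row = session.execute(
                "SELECT tags FROM user_tags " +
                "WHERE user_id = 62c36092-82a1-3a00-93d1-46196ee77204").one();
        System.out.println(row.getMap("tags", String.class, java.util.Date.class));

        cluster.shutdown();
    }
}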

--
Sylvain


Re: exception causes streaming to hang forever

2013-05-23 Thread Yuki Morishita
What kind of error does the other end of the streaming (/10.10.42.36) say?

On Wed, May 22, 2013 at 5:19 PM, Hiller, Dean  wrote:
> We had 3 nodes roll on good and the next 2, we see a remote node with this 
> exception every time we start over and bootstrap the node
>
> ERROR [Streaming to /10.10.42.36:2] 2013-05-22 14:47:59,404 
> CassandraDaemon.java (line 132) Exception in thread Thread[Streaming to 
> /10.10.42.36:2,5,main]
> java.lang.RuntimeException: java.io.IOException: Input/output error
> at com.google.common.base.Throwables.propagate(Throwables.java:160)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Input/output error
> at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
> at 
> sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:405)
> at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:506)
> at 
> org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:90)
> at 
> org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> ... 3 more
>
> Are there any ideas what this is?  Google doesn't really show any useful advice 
> on this and our node has not joined the ring yet so I don't think we can run 
> a repair just yet to avoid it and try synching via another means.  It seems 
> on a streaming failure, it never recovers from this.  Any ideas?
>
> We are on cassandra 1.2.2
>
> Thanks,
> Dean
>



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


Re: Cassandra 1.2 TTL histogram problem

2013-05-23 Thread Yuki Morishita
> Are you sure that it is a good idea to estimate remainingKeys like that?

Since we don't want to scan every row to check overlap and cause heavy
IO automatically, the method can only do a best-effort type of
calculation.
In your case, try running a user defined compaction on that sstable
file. It goes through every row and removes tombstones when droppable.
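
(For illustration - a minimal sketch of triggering a user defined compaction 
over JMX. The MBean name and default port 7199 are standard; the two-string 
operation signature matches the 1.2-era CompactionManagerMBean, and the 
keyspace / file names are placeholders.)

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class UserDefinedCompaction
{
    public static void main(String[] args) throws Exception
    {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try
        {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName compactionManager =
                    new ObjectName("org.apache.cassandra.db:type=CompactionManager");

            // Compact exactly one sstable; Cassandra walks every row in it
            // and drops whatever tombstones are droppable.
            mbs.invoke(compactionManager, "forceUserDefinedCompaction",
                    new Object[] { "mykeyspace", "mykeyspace-mycf-ib-1234-Data.db" },
                    new String[] { "java.lang.String", "java.lang.String" });
        }
        finally
        {
            connector.close();
        }
    }
}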


On Wed, May 22, 2013 at 11:48 AM, cem  wrote:
> Thanks for the answer.
>
> It means that if we use RandomPartitioner it will be very difficult to find an
> sstable without any overlap.
>
> Let me give you an example from my test.
>
> I have ~50 sstables in total and an sstable with droppable ratio 0.9. I use
> GUID for key and only insert (no update -delete) so I dont expect a key in
> different sstables.
>
> I put extra logging to  AbstractCompactionStrategy to see the
> overlaps.size() and keys and remainingKeys:
>
> overlaps.size() is around 30, number of keys for that sstable is around 5 M
> and remainingKeys is always 0.
>
> Are you sure that it is a good idea to estimate remainingKeys like that?
>
> Best Regards,
> Cem
>
>
>
> On Wed, May 22, 2013 at 5:58 PM, Yuki Morishita  wrote:
>>
>> > Can method calculate non-overlapping keys as overlapping?
>>
>> Yes.
>> And randomized keys don't matter here since sstables are sorted by
>> "token" calculated from key by your partitioner, and the method uses
>> sstable's min/max token to estimate overlap.
>>
>> On Tue, May 21, 2013 at 4:43 PM, cem  wrote:
>> > Thank you very much for the swift answer.
>> >
>> > I have one more question about the second part. Can method calculate
>> > non-overlapping keys as overlapping? I mean it uses max and min tokens
>> > and
>> > column count. They can be very close to each other if random keys are
>> > used.
>> >
>> > In my use case I generate a GUID for each key and send a single write
>> > request.
>> >
>> > Cem
>> >
>> > On Tue, May 21, 2013 at 11:13 PM, Yuki Morishita 
>> > wrote:
>> >>
>> >> > Why does Cassandra single table compaction skips the keys that are in
>> >> > the other sstables?
>> >>
>> >> because we don't want to resurrect deleted columns. Say, sstable A has
>> >> the column with timestamp 1, and sstable B has the same column which
>> >> deleted at timestamp 2. Then if we purge that column only from sstable
>> >> B, we would see the column with timestamp 1 again.
>> >>
>> >> > I also dont understand why we have this line in
>> >> > worthDroppingTombstones
>> >> > method
>> >>
>> >> What the method is trying to do is to "guess" how many columns are in the
>> >> rows that don't overlap, without actually going through
>> >> every row in the sstable.
>> >> histogram, min and max row token for every sstables, we use those in
>> >> the method to estimate how many columns the two sstables overlap.
>> >> You may have remainingColumnsRatio of 0 when the two sstables overlap
>> >> almost entirely.
>> >>
>> >>
>> >> On Tue, May 21, 2013 at 3:43 PM, cem  wrote:
>> >> > Hi all,
>> >> >
>> >> > I have a question about ticket
>> >> > https://issues.apache.org/jira/browse/CASSANDRA-3442
>> >> >
>> >> > Why does Cassandra single table compaction skips the keys that are in
>> >> > the
>> >> > other sstables? Please correct if I am wrong.
>> >> >
>> >> > I also dont understand why we have this line in
>> >> > worthDroppingTombstones
>> >> > method:
>> >> >
>> >> > double remainingColumnsRatio = ((double) columns) /
>> >> > (sstable.getEstimatedColumnCount().count() *
>> >> > sstable.getEstimatedColumnCount().mean());
>> >> >
>> >> > remainingColumnsRatio  is always 0 in my case and the droppableRatio
>> >> > is
>> >> > 0.9. Cassandra skips all sstables which are already expired.
>> >> >
>> >> > This line was introduced by
>> >> > https://issues.apache.org/jira/browse/CASSANDRA-4022.
>> >> >
>> >> > Best Regards,
>> >> > Cem
>> >>
>> >>
>> >>
>> >> --
>> >> Yuki Morishita
>> >>  t:yukim (http://twitter.com/yukim)
>> >
>> >
>>
>>
>>
>> --
>> Yuki Morishita
>>  t:yukim (http://twitter.com/yukim)
>
>



-- 
Yuki Morishita
 t:yukim (http://twitter.com/yukim)


write time of CQL3 set items

2013-05-23 Thread Keith Wright
Hi all,

I am using C* 1.2.4 with CQL3 and am taking advantage of the new collection 
support.  One use case I have is that I want a set of text and I need to know 
the time when each item in the set was written.  If I understand CQL3 
correctly, the underlying data engine utilizes composites for sets and maps 
(where sets are just maps with no values).  I was hoping that I could store my 
text items as a set and then somehow get the write time of each item via 
WRITETIME.  Instead it appears that I need to use a map of TEXT to TIMESTAMP, 
which will increase my data set.  Does anyone know a way I could expose the 
write time of set items?

Thanks!


Re: High performance disk io

2013-05-23 Thread Edward Capriolo
I have used both rotation disks with lots of RAM as well as SSD devices. An
important thing to consider is that SSD devices are not magic. You have
big-o-notation in several places.
1) more data large bloom filters
2) more data (larger key caches) JVM overhead
3) more requests more young gen JVM overhead
4) more data longer compaction (even with ssd)
5) more writes (more memtable flushing)
Bottom line: more data more disk seeks

We have used both the mid level SSD as well as the costly fusion io. Fit in
RAM/VFS cache delivers better, more predictable low latency; even with very
fast disks the average, 95th, and 99th percentiles can be very far
apart. I am currently trying to really study the effect of the width of a
row (being in multiple sstables) vs its 95th percentile read time.


On Thu, May 23, 2013 at 10:43 AM, Christopher Wirt wrote:

> Hi Igor,
>
>
> I was talking about 99th percentile from the Cassandra histograms when I
> said ‘1 or 2 ms for most cf’. 
>
>
> But we have measured client side too and generally get a couple ms added
> on top.. as one might expect.
>
>
> Anyone interested - 
>
> diskio (my original question) we have tried out the multiple SSD setup and
> found it to work well and reduce the impact of a repair on node
> performance. 
>
> We ended up going with the single data directory in cassandra.yaml and
> mount one SSD against that. Then have a dedicated SSD per large column
> family.
>
> We’re now moving all of our nodes to have the same setup.
>
>
>
> Chris
>
>
> From: Igor [mailto:i...@4friends.od.ua]
> Sent: 23 May 2013 15:00
> To: user@cassandra.apache.org
> Subject: Re: High performance disk io
>
>
> Hello Christopher,
>
> BTW, are you talking about 99th percentiles on client side, or about
> percentiles from cassandra histograms for CF on cassandra side?
>
> Thanks!
>
> On 05/22/2013 05:41 PM, Christopher Wirt wrote:
>
> Hi Igor, 
>
>  
>
> Yea same here, 15ms for 99th percentile is our max. Currently getting one
> or two ms for most CF. It goes up at peak times which is what we want to
> avoid.
>
>  
>
> We’re using Cass 1.2.4 w/vnodes and our own barebones driver on top of
> thrift. Needed to be .NET so Hector and Astyanax were not options.
>
>  
>
> Do you use SSDs or multiple SSDs in any kind of configuration or RAID?
>
>  
>
> Thanks
>
>  
>
> Chris
>
>  
>
> From: Igor [mailto:i...@4friends.od.ua]
> Sent: 22 May 2013 15:07
> To: user@cassandra.apache.org
> Subject: Re: High performance disk io
>
>  
>
> Hello
>
> What level of read performance do you expect? We have limit 15 ms for 99
> percentile with average read latency near 0.9ms. For some CF 99 percentile
> actually equals to 2ms, for other - to 10ms, this depends on the data
> volume you read in each query.
>
> Tuning read performance involved cleaning up data model, tuning
> cassandra.yaml, switching from Hector to astyanax, tuning OS parameters.
>
> On 05/22/2013 04:40 PM, Christopher Wirt wrote:
>
> Hello,
>
>  
>
> We’re looking at deploying a new ring where we want the best possible read
> performance.
>
>  
>
> We’ve setup a cluster with 6 nodes, replication level 3, 32Gb of memory,
> 8Gb Heap, 800Mb keycache, each holding 40/50Gb of data on a 200Gb SSD and
> 500Gb SATA for OS and commitlog
>
> Three column families
>
> ColFamily1 50% of the load and data
>
> ColFamily2 35% of the load and data
>
> ColFamily3 15% of the load and data
>
>  
>
> At the moment we are still seeing around 20% disk utilisation and
> occasionally as high as 40/50% on some nodes at peak time.. we are
> conducting some semi live testing.
>
> CPU looks fine, memory is fine, keycache hit rate is about 80% (could be
> better, so maybe we should be increasing the keycache size?)
>
>  
>
> Anyway, we’re looking into what we can do to improve this.
>
>  
>
> One conversion we are having at the moment is around the SSD disk setup..*
> ***
>
>  
>
> We are considering moving to have 3 smaller SSD drives and spreading the
> data across those.
>
>  
>
> The possibilities are:
>
> -We have a RAID0 of the smaller SSDs and hope that improves performance. *
> ***
>
> Will this acutally yield better throughput?
>
>  
>
> -We mount the SSDs to different directories and define multiple data
> directories in Cassandra.yaml.
>
> Will not having a layer of RAID controller improve the throughput?
>
>  
>
> -We mount the SSDs to different columns family directories and have a
> single data directory declared in Cassandra.yaml. 
>
> Think this is quite attractive idea.
>
> What are the drawbacks? System column families will be on the main SATA?**
> **
>
>  
>
> -We don’t change anything and just keep upping our keycache.
>
> -Anything you guys can think of.
>
>  
>
> I

RE: High performance disk io

2013-05-23 Thread Christopher Wirt
Hi Igor,

 

I was talking about the 99th percentile from the Cassandra histograms when I
said '1 or 2 ms for most CFs'.

But we have measured client side too and generally get a couple of ms added
on top, as one might expect.

For anyone interested -

On disk IO (my original question): we have tried out the multiple-SSD setup
and found it to work well and to reduce the impact of a repair on node
performance.

We ended up going with the single data directory in cassandra.yaml and
mounting one SSD against that, then having a dedicated SSD per large column
family.

We're now moving all of our nodes to the same setup.
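
For reference, a minimal sketch of that layout (device, keyspace and CF
names here are hypothetical; in 1.2 each CF keeps its files under
<data_file_directory>/<keyspace>/<column_family>):

# one bulk SSD behind the single data directory
mkdir -p /var/lib/cassandra/data
mount /dev/sdb /var/lib/cassandra/data
# a dedicated SSD mounted over the directory of the largest CF
mkdir -p /var/lib/cassandra/data/myks/colfamily1
mount /dev/sdc /var/lib/cassandra/data/myks/colfamily1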

 

 

Chris

 


Re: High performance disk io

2013-05-23 Thread Igor

Hello Christopher,

BTW, are you talking about 99th percentiles on the client side, or about
percentiles from the Cassandra histograms for the CF on the Cassandra side?


Thanks!

On 05/22/2013 05:41 PM, Christopher Wirt wrote:


Hi Igor,

Yea, same here: 15 ms for the 99th percentile is our max. Currently getting
one or two ms for most CFs. It goes up at peak times, which is what we
want to avoid.


We're using Cass 1.2.4 w/vnodes and our own barebones driver on top of 
thrift. Needed to be .NET so Hector and Astyanax were not options.


Do you use SSDs or multiple SSDs in any kind of configuration or RAID?

Thanks

Chris

*From:* Igor [mailto:i...@4friends.od.ua]
*Sent:* 22 May 2013 15:07
*To:* user@cassandra.apache.org
*Subject:* Re: High performance disk io

Hello

What level of read performance do you expect? We have a limit of 15 ms for 
the 99th percentile, with average read latency near 0.9 ms. For some CFs the 
99th percentile actually comes to 2 ms, for others to 10 ms; this depends 
on the data volume you read in each query.


Tuning read performance involved cleaning up the data model, tuning 
cassandra.yaml, switching from Hector to Astyanax, and tuning OS parameters.
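
As a sketch of the OS-parameter side (device name hypothetical; these are
commonly suggested SSD settings, not necessarily the ones used here):

blockdev --setra 8 /dev/sdb                      # small readahead suits random reads
echo deadline > /sys/block/sdb/queue/scheduler   # deadline (or noop) over cfq for SSD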


On 05/22/2013 04:40 PM, Christopher Wirt wrote:

Hello,

We're looking at deploying a new ring where we want the best
possible read performance.

We've set up a cluster with 6 nodes, replication factor 3, 32 GB of
memory, 8 GB heap, 800 MB keycache, each node holding 40-50 GB of data on
a 200 GB SSD, with a 500 GB SATA disk for OS and commitlog

Three column families

ColFamily1 50% of the load and data

ColFamily2 35% of the load and data

ColFamily3 15% of the load and data

At the moment we are still seeing around 20% disk utilisation, and
occasionally as high as 40-50% on some nodes at peak time; we are
conducting some semi-live testing.

CPU looks fine, memory is fine, keycache hit rate is about 80%
(could be better, so maybe we should be increasing the keycache size?)

Anyway, we're looking into what we can do to improve this.

One conversation we are having at the moment is around the SSD disk
setup..

We are considering moving to three smaller SSD drives and
spreading the data across those.

The possibilities are:

-We have a RAID0 of the smaller SSDs and hope that improves
performance.

Will this actually yield better throughput?

-We mount the SSDs to different directories and define multiple
data directories in Cassandra.yaml (a minimal sketch of this option
follows the list).

Will not having a layer of RAID controller improve the throughput?

-We mount the SSDs to different column family directories and
have a single data directory declared in Cassandra.yaml.

We think this is quite an attractive idea.

What are the drawbacks? System column families will be on the main
SATA?

-We don't change anything and just keep upping our keycache.

-Anything you guys can think of.
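
A minimal sketch of the multiple-data-directory option (mount points
hypothetical):

# cassandra.yaml (1.2) - one entry per SSD mount point
data_file_directories:
    - /mnt/ssd1/cassandra/data
    - /mnt/ssd2/cassandra/data
    - /mnt/ssd3/cassandra/data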

Ideas and thoughts welcome. Thanks for your time and expertise.

Chris
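
On the keycache question above, a sketch of the knobs involved (the value
shown is just the 800 MB already mentioned; in 1.2 the key cache is global
and sized in cassandra.yaml):

# cassandra.yaml - empty means auto: min(5% of heap, 100 MB)
key_cache_size_in_mb: 800

# then watch capacity, size and hit rate over time
nodetool info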





Re: Commit Log Magic

2013-05-23 Thread Jonathan Ellis
Sstables must be sorted by token, or we can't compact efficiently.
Since writes usually do not arrive in token order, we stage them first
in a memtable.

(cc user@)
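
A toy sketch of that write path (illustrative only, not Cassandra's actual
code; names are invented): the commit log is cheap because it is a pure
sequential append, while writing each mutation straight into token-sorted
sstables would mean random I/O or constant rewriting.

import os

class ToyStore:
    def __init__(self, log_path, flush_threshold=3):
        self.log = open(log_path, "ab")   # commit log: sequential append only
        self.memtable = {}                # staged writes, arrival order irrelevant
        self.flush_threshold = flush_threshold
        self.sstables = []

    def write(self, key, value):
        self.log.write(f"{key}={value}\n".encode())  # 1. durable sequential append
        self.log.flush()
        os.fsync(self.log.fileno())                  #    (real systems batch the fsync)
        self.memtable[key] = value                   # 2. stage in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # 3. one sorted sstable per flush, so compaction can merge-sort later
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable.clear()                        # log segment now recyclable

store = ToyStore("/tmp/toy_commitlog.bin")
for k, v in [("c", 1), ("a", 2), ("b", 3)]:
    store.write(k, v)
print(store.sstables)   # [[('a', 2), ('b', 3), ('c', 1)]] - sorted despite arrival order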

On Thu, May 23, 2013 at 8:44 AM, Ansar Rafique  wrote:
> Hi Jonathan,
>
> I am Ansar Rafique, and I asked you a few questions two weeks ago about
> Cassandra's implementation. I was watching your presentation where you
> suggested the page below.
>
> http://nosql.mypopescu.com/post/27684111441/cassandra-and-solid-state-drives
>
> I have a question, and I have tried to find the answer but haven't really
> gotten a satisfactory response yet. My question is: why does Cassandra use
> a commit log for durability instead of writing directly to an SSTable?
> Cassandra achieves high write throughput because it stores data first in a
> memtable and then flushes it to disk. Sounds good, but remember that
> Cassandra also writes to the commit log for durability. I checked, and it
> is documented that the write to the memtable and the commit log is
> synchronous, which means it will write to the commit log first and wait
> until that completes before writing to the memtable, or vice versa.
> Writing a transaction to the commit log requires an I/O operation, which
> means for each insert we need an I/O :( to write the data to the commit
> log, and later more I/Os to flush the data to disk again. Isn't writing to
> the commit log overhead? Isn't it better to write the data directly to
> disk instead of to the commit log?
>
> Remember, I/O operations are expensive, and a reduction in I/Os means an
> improvement in performance. If we look at an RDBMS, it stores data in a
> commit log as well as on disk. Fair enough, but suppose we don't insert
> data into a commit log: its performance should then be the same as
> Cassandra's, because it performs an I/O to insert data on disk and
> Cassandra also performs an I/O to insert data into the commit log. Is the
> commit log less expensive? I didn't really understand the magic :) Would
> you like to elaborate on it more?
>
> Thank you in advance for your time. Looking forward to hearing from you.
>
> Regards,
> Ansar Rafique



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced


Re: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Tamar Fraenkel
good point!

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956




On Thu, May 23, 2013 at 2:25 PM,  wrote:

> (Probably will not solve your problem, but worth mentioning): It's not
> enough to check that the clocks of all the servers are synchronized - I
> believe that the client node sets the timestamp for a record being written.
>

RE: column with TTL of 10 seconds lives very long...

2013-05-23 Thread moshe.kranc
(Probably will not solve your problem, but worth mentioning): It's not enough 
to check that the clocks of all the servers are synchronized - I believe that 
the client node sets the timestamp for a record being written. So, you should 
also check the timestamp on your Hector client nodes.
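
A quick way to compare the clocks involved (host names hypothetical,
assuming ssh access; the list should include the Hector client machines,
not just the Cassandra nodes):

for h in hector-client1 cass1 cass2 cass3; do
    echo -n "$h: "; ssh "$h" date +%s.%N
done
# a client clock running fast stamps columns with future timestamps, which
# among other things can make them impossible to delete afterwards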


Re: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Tamar Fraenkel
Hi!

TTL was set:

[default@HLockingManager] get
HLocks['/LockedTopic/31a30c12-652d-45b3-9ac2-0401cce85517'];
=> (column=69b057d4-3578-4326-a9d9-c975cb8316d2,
value=36396230353764342d333537382d343332362d613964392d633937356362383331366432,
timestamp=1369307815049000, ttl=10)


Also, all other lock columns expire as expected.

Thanks,
Tamar

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





RE: column with TTL of 10 seconds lives very long...

2013-05-23 Thread moshe.kranc
Maybe you didn't set the TTL correctly.
Check the TTL of the column using CQL, e.g.:
SELECT TTL(colName) FROM colFamilyName WHERE ;
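
Filled in with the values from the stuck row shown earlier in the thread
(a hypothetical adaptation; the exact quoting and function syntax depend on
the CQL version your 1.0.x node supports):

SELECT TTL('69b057d4-3578-4326-a9d9-c975cb8316d2')
FROM HLocks
WHERE KEY = '/LockedTopic/31a30c12-652d-45b3-9ac2-0401cce85517';
-- an empty TTL here would mean the column was written without one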


AW: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Felipe Sere
This is interesting, as it might affect me too :)
I have been observing deadlocks with HLockManagerImpl which don't get 
resolved for a long time, even though the columns holding the locks should 
only live for about 5-10 secs.

Any ideas how to investigate this further from the Cassandra side?


Re: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Tamar Fraenkel
Thanks for the response.
Running date simultaneously on all nodes (using parallel ssh) shows that
they are synced.
Tamar

*Tamar Fraenkel *
Senior Software Engineer, TOK Media


ta...@tok-media.com
Tel:   +972 2 6409736
Mob:  +972 54 8356490
Fax:   +972 2 5612956





Re: column with TTL of 10 seconds lives very long...

2013-05-23 Thread Nikolay Mihaylov
Did you synchronize the clocks between the servers?


On Thu, May 23, 2013 at 9:32 AM, Tamar Fraenkel  wrote:

> Hi!
> I have a Cassandra cluster with 3 nodes running version 1.0.11.
>
> I am using Hector's HLockManagerImpl, which creates a keyspace named
> HLockManagerImpl and a CF named HLocks.
> For some reason I have a row with a single column that should have expired
> yesterday but is still there.
> I tried deleting it using the cli, but it is stuck...
> Any ideas how to delete it?
>
> Thanks,
>
> *Tamar Fraenkel *
> Senior Software Engineer, TOK Media
>
> ta...@tok-media.com
> Tel:   +972 2 6409736
> Mob:  +972 54 8356490
> Fax:   +972 2 5612956
>
>