Re: Cassandra 1.0 hangs during GC

2012-07-24 Thread Wojciech Meler
Can you provide output from the sar command for the time period when the
long GC occurred?
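For example (assuming sysstat's collector is already running; the file and
times below are placeholders), paging activity around a pause can be pulled
with something like:

    sar -B -f /var/log/sa/sa22 -s 17:40:00 -e 17:50:00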

Regards,
Wojciech Meler


Re: Cassandra 1.0 hangs during GC

2012-07-24 Thread Nikolay Kоvshov
48 GB of RAM on that machine; swap is not used. I will disable swap entirely, 
just in case.
I have 4 Cassandra processes (parts of 4 different clusters), each allocated 8 
GB of heap and currently using 4 GB.

java -version
java version 1.7.0
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)




23.07.2012, 20:12, Joost van de Wijgerd jwijg...@gmail.com:
 How much memory do you have on the machine? It seems you have 8 GB
 reserved for the Cassandra Java process; if this is all the memory on
 the machine you might be swapping. Also, which JVM do you use?

 kind regards

 Joost

 On Mon, Jul 23, 2012 at 10:07 AM, Nikolay Kоvshov nkovs...@yandex.ru wrote:

   On the 21st I migrated to Cassandra 1.1.2 but see no improvement

  cat /var/log/cassandra/Earth1.log | grep "GC for"
  INFO [ScheduledTasks:1] 2012-05-22 17:42:48,445 GCInspector.java (line 123) 
 GC for ParNew: 345 ms for 1 collections, 82451888 used; max is 8464105472
  INFO [ScheduledTasks:1] 2012-05-23 02:47:13,911 GCInspector.java (line 123) 
 GC for ParNew: 312 ms for 1 collections, 110617416 used; max is 8464105472
  INFO [ScheduledTasks:1] 2012-05-23 11:57:54,317 GCInspector.java (line 123) 
 GC for ParNew: 298 ms for 1 collections, 98161920 used; max is 8464105472
  INFO [ScheduledTasks:1] 2012-07-02 08:52:37,019 GCInspector.java (line 123) 
 GC for ParNew: 196886 ms for 1 collections, 2310058496 used; max is 
 8464105472
  INFO [ScheduledTasks:1] 2012-07-16 17:41:25,940 GCInspector.java (line 123) 
 GC for ParNew: 200146 ms for 1 collections, 2345987088 used; max is 
 8464105472
  === Migrated from 1.0.0 to 1.1.2
  INFO [ScheduledTasks:1] 2012-07-21 09:05:08,280 GCInspector.java (line 122) 
 GC for ParNew: 282 ms for 1 collections, 466406864 used; max is 8464105472
  INFO [ScheduledTasks:1] 2012-07-21 12:38:43,132 GCInspector.java (line 122) 
 GC for ParNew: 233 ms for 1 collections, 405269504 used; max is 8464105472
  INFO [ScheduledTasks:1] 2012-07-22 02:29:09,596 GCInspector.java (line 122) 
 GC for ParNew: 253 ms for 1 collections, 389700768 used; max is 8464105472
  INFO [ScheduledTasks:1] 2012-07-22 17:45:46,357 GCInspector.java (line 122) 
 GC for ParNew: 57391 ms for 1 collections, 400083984 used; max is 8464105472

  Memory and yaml memory-related settings are default
  I do not do deletes
  I have 2 CF's and no secondary indexes

  LiveRatio's:
   INFO [pool-1-thread-1] 2012-06-09 02:36:07,759 Memtable.java (line 177) 
 CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
 was 1.0).  calculation took 85ms for 6236 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:47,614 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
 (just-counted was 1.0).  calculation took 8ms for 1 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:51,012 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
 was 1.0).  calculation took 99ms for 1094 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:51,331 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
 (just-counted was 1.0).  calculation took 80ms for 242 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:51,856 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
 was 1.0).  calculation took 505ms for 2678 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:52,881 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
 was 1.0).  calculation took 776ms for 5236 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:52,945 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
 (just-counted was 1.0).  calculation took 64ms for 389 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:55,162 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='PSS') liveRatio is 1.0 (just-counted 
 was 1.0).  calculation took 1378ms for 8948 columns
   INFO [MemoryMeter:1] 2012-07-21 09:04:55,304 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 1.0 
 (just-counted was 1.0).  calculation took 140ms for 1082 columns
   INFO [MemoryMeter:1] 2012-07-21 09:05:08,439 Memtable.java (line 213) 
 CFS(Keyspace='Keyspace1', ColumnFamily='Standard1') liveRatio is 
 2.5763038186160894 (just-counted was 2.5763038186160894).  calculation took 
 8796ms for 102193 columns

  18.07.2012, 07:51, aaron morton aa...@thelastpickle.com:
  Assuming all the memory and yaml settings are default, that does not sound 
 right. The first thought would be the memory meter not counting correctly...
  Do you do a lot of deletes ?
  Do you have a lot of CF's and/or secondary indexes ?
  Can you see log lines about the liveRatio for your cf's ?
  I would upgrade to 1.0.10 before getting too carried away though.
  Cheers

  -
  Aaron Morton
  Freelance Developer
 

Re: Cassandra 1.0 hangs during GC

2012-07-24 Thread Nikolay Kоvshov

I started running sar only recently, after your advice, and have not seen any 
huge GCs on that server since.

At 08:14 there was a GC lasting 4.5 seconds; that's not five minutes of course, 
but still quite an unpleasant value.

I am still waiting for big GC values and will provide the corresponding sar logs.

07:25:01 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
07:35:01 PM      0.00      4.34      9.02      0.00     13.61      0.00      0.00      0.00      0.00
07:45:01 PM      0.00      5.17     20.47      0.00     25.77      0.00      0.00      0.00      0.00
07:55:01 PM      0.00      4.66      8.63      0.00     18.69      0.00      0.00      0.00      0.00
08:05:01 PM      0.00      8.11      8.84      0.00     14.37      0.00      0.00      0.00      0.00
08:15:01 PM      0.00      5.19     21.65      0.00     25.94      0.00      0.00      0.00      0.00
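
In case it helps pin down the pauses exactly, GC logging can also be enabled
(assuming the stock conf/cassandra-env.sh; the log path is a placeholder)
with something like:

    JVM_OPTS="$JVM_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/cassandra/gc.log"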


24.07.2012, 10:22, Wojciech Meler wojciech.me...@gmail.com:
 Can you provide output from the sar command for the time period when the
 long GC occurred?

 Regards,
 Wojciech Meler


Re: Cassandra 1.0 hangs during GC

2012-07-24 Thread Joost van de Wijgerd
You are better off using Sun Java 6 to run Cassandra. In the past
there were issues reported with Java 7. Can you try running it on Sun Java 6?
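
A minimal sketch of the switch (assuming a tarball install with Sun Java 6
under /usr/lib/jvm; adjust paths to your layout):

    # point the startup script at a Java 6 JVM
    export JAVA_HOME=/usr/lib/jvm/java-6-sun
    bin/cassandra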

kind regards

Joost

On Tue, Jul 24, 2012 at 10:04 AM, Nikolay Kоvshov nkovs...@yandex.ru wrote:
 48 GB of RAM on that machine; swap is not used. I will disable swap entirely, 
 just in case.
 I have 4 Cassandra processes (parts of 4 different clusters), each allocated 
 8 GB of heap and currently using 4 GB.

java -version
 java version 1.7.0
 Java(TM) SE Runtime Environment (build 1.7.0-b147)
 Java HotSpot(TM) 64-Bit Server VM (build 21.0-b17, mixed mode)




 [snip: rest of quoted message, identical to the thread above]

keyspace no longer modifiable

2012-07-24 Thread Marco Matarazzo
Greetings.

We have a very strange problem: it seems that sometimes our keyspaces become 
unmodifiable.

user@server:~$ cqlsh -3 -k goh_master cassandra1
Connected to GOH Cluster at cassandra1:9160.
[cqlsh 2.2.0 | Cassandra 1.1.2 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh:goh_master> drop columnfamily agents_blueprints;
cqlsh:goh_master> 

[Here i disconnected, just in case. It's exactly the same if I don't do this.]

user@server:~$ cqlsh -3 -k goh_master cassandra1
Connected to GOH Cluster at cassandra1:9160.
[cqlsh 2.2.0 | Cassandra 1.1.2 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
Use HELP for help.
cqlsh:goh_master> DESCRIBE COLUMNFAMILY agents_blueprints 

CREATE TABLE agents_blueprints (
  agent_id ascii,
  archetype ascii,
  proto_id ascii,
  PRIMARY KEY (agent_id, archetype)
) WITH COMPACT STORAGE AND
  comment='' AND
  caching='KEYS_ONLY' AND
  read_repair_chance=0.10 AND
  gc_grace_seconds=864000 AND
  min_compaction_threshold=4 AND
  max_compaction_threshold=32 AND
  replicate_on_write='true' AND
  compaction_strategy_class='SizeTieredCompactionStrategy' AND
  compression_parameters:sstable_compression='SnappyCompressor';

cqlsh:goh_master> 

It is still possible to write and read data from the tables; they just can't be 
dropped, created or altered.

With 1.1.1 we discovered that a rolling restart of the cluster used to fix the 
problem. That no longer works with 1.1.2, and the only way we have found to get 
out of this situation is to bring down the cluster, delete everything in 
/var/lib/cassandra (everything inside commitlog, data and saved_caches), start 
over with a clean cluster and load the new data again.

This happens to us very often, both on our 3 nodes cluster and on our test 
single-node cluster. We use Ubuntu LTS 12.04, with Sun Oracle Java 6.

Is this a known issue? It is a pretty ugly bug for us.
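
In case this is schema disagreement between nodes, the schema version each
node is on can be checked from cassandra-cli (hostname as in our session
above):

    cassandra-cli -h cassandra1
    [default@unknown] describe cluster;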

--
Marco Matarazzo
== Hex Keep ==

"You can learn more about a man
  in one hour of play
  than in one year of conversation." - Plato






Re: CQL3 and column slices

2012-07-24 Thread Sylvain Lebresne
On Tue, Jul 24, 2012 at 12:09 AM, Josep Blanquer
blanq...@rightscale.com wrote:
 is there some way to express that in CQL3? something logically equivalent to

 SELECT * FROM bug_test WHERE a:b:c:d:e > 1:1:1:1:2 ??

No, there isn't. Not currently at least. But feel free of course to
open a ticket/request on
https://issues.apache.org/jira/browse/CASSANDRA.

I note that I would be curious to know the concrete use case you have
for such type of queries. It would also help as an argument to add
such facilities more quickly (or at all). Typically, "we should
support it in CQL3 because it was possible with Thrift" is
definitely an argument, but a much weaker one without concrete
examples of why it might be useful in the first place.

--
Sylvain


Dropping counter mutations taking longer than rpc_timeout

2012-07-24 Thread Omid Aladini
Hey,

Mutations taking longer than rpc_timeout will be dropped because the
coordinator won't keep waiting for the replicas, and will return a
TimeoutException to the client if the consistency level isn't reached
[1].

In case of counters though, since counter mutations aren't idempotent, the
client is not supposed to retry an increment on TimeoutException. So why
isn't a counter mutation processed regardless of rpc_timeout?

Cheers,
Omid

[1] http://wiki.apache.org/cassandra/FAQ#dropped_messages


going back in time

2012-07-24 Thread mdione.ext
  One of the scenarios I have to take into account for a small Cassandra 
cluster (N=4) is restoring the data back in time. I will have full backups 
for 15 days, and it's possible that I will need to restore, let's say, the 
data from 10 days ago (don't ask, I'm not going into the details why). 

  I know/suspect that restores by KeySpace/ColumnFamily are possible (we 
haven't tested yet), but I wonder whether there would be any side effects 
from stopping all the nodes (assuming a cut to the other KSs is OK), 
restoring the SSTables, and starting the nodes one by one.

  So far we're thinking of RF=4, but also RF=3 or at least RF<N in the 
future. Is all this crazy talk or is it possible? Any side effects in the 
system KS and/or indexes to take into account?

--
Marcos Dione
SysAdmin
Astek Sud-Est
pour FT/TGPF/OPF/PORTAIL/DOP/HEBEX @ Marco Polo
04 97 12 62 45 - mdione@orange.com



_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.



Re: going back in time

2012-07-24 Thread Pierre-Yves Ritschard
mdione@orange.com writes:

   One of the scenarios I have to take into account for a small Cassandra 
 cluster (N=4) is restoring the data back in time. I will have full backups 
 for 15 days, and it's possible that I will need to restore, let's say, the 
 data from 10 days ago (don't ask, I'm not going into the details why). 

   I know/suspect that restores by KeySpace/ColumnFamily are possible (we 
 haven't tested yet), but I wonder whether there would be any side effects 
 from stopping all the nodes (assuming a cut to the other KSs is OK), 
 restoring the SSTables, and starting the nodes one by one.

   So far we're thinking of RF=4, but also RF=3 or at least RF<N in the 
 future. Is all this crazy talk or is it possible? Any side effects in the 
 system KS and/or indexes to take into account?



Snapshots and restores are great for point-in-time recovery. There's no
particular side effect if you're willing to accept the downtime.

If you don't want to take your whole cluster offline you can use
sstableloader as well.


RE: going back in time

2012-07-24 Thread mdione.ext
De : Pierre-Yves Ritschard [mailto:p...@spootnik.org]
 Snapshots and restores are great for point-in-time recovery. There's no
 particular side effect if you're willing to accept the downtime.

  Are you sure? The system KS has no book-keeping about the KSs/CFs? 
For instance, schema changes, etc?

 If you don't want to take your whole cluster offline you can use
 sstableloader as well.

  Sounds wonderful.

_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.



Re: going back in time

2012-07-24 Thread Pierre-Yves Ritschard
mdione@orange.com writes:

 De : Pierre-Yves Ritschard [mailto:p...@spootnik.org]
 Snapshots and restores are great for point-in-time recovery. There's no
 particular side effect if you're willing to accept the downtime.

   Are you sure? The system KS has no book-keeping about the KSs/CFs? 
 For instance, schema changes, etc?


Don't take my word for it, it's easy enough to fill up a small 3 node
cluster and play with it.

Here's a few more things that you should pay attention to:

- If you change the schema you are obviously going to have to reconverge
  to the previous schema.
- You want to avoid pending commitlog entries. If you want to load a
  full snapshot from scratch, the easiest route would be to drop the KS,
  recreate it with the expected schema and load your sstables from disk.
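
A rough sketch of that route (host, paths and names are placeholders, and
this assumes the 1.1-era sstableloader with its -d/--nodes flag):

    # after recreating the keyspace with the expected schema, stream the
    # snapshotted sstables from a directory laid out as <Keyspace>/<CF>/
    bin/sstableloader -d 10.0.0.1 /backups/Keyspace1/Standard1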

 If you don't want to take your whole cluster offline you can use
 sstableloader as well.

   Sounds wonderful.


Re: Migrating data from a 0.8.8 - 1.1.2 ring

2012-07-24 Thread Mike Heffner
On Mon, Jul 23, 2012 at 1:25 PM, Mike Heffner m...@librato.com wrote:

 Hi,

 We are migrating from a 0.8.8 ring to a 1.1.2 ring and we are noticing
 missing data post-migration. We use pre-built/configured AMIs so our
 preferred route is to leave our existing production 0.8.8 untouched and
 bring up a parallel 1.1.2 ring and migrate data into it. Data is written to
 the rings via batch processes so we can easily ensure that both the
 existing and new rings will have the same data post migration.

 [snip]


 The steps we are taking are:

 1. Bring up a 1.1.2 ring in the same AZ/data center configuration with
 tokens matching the corresponding nodes in the 0.8.8 ring.
 2. Create the same keyspace on 1.1.2.
 3. Create each CF in the keyspace on 1.1.2.
 4. Flush each node of the 0.8.8 ring.
 5. Rsync each non-compacted sstable from 0.8.8 to the corresponding node
 in 1.1.2.
 6. Move each 0.8.8 sstable into the 1.1.2 directory structure by renaming
 the file to the  /cassandra/data/keyspace/cf/keyspace-cf... format.
 For example, for the keyspace Metrics and CF epochs_60 we get:
 cassandra/data/Metrics/epochs_60/Metrics-epochs_60-g-941-Data.db.
 7. On each 1.1.2 node run `nodetool -h localhost refresh Metrics CF` for
 each CF in the keyspace. We notice that storage load jumps accordingly.
 8. On each 1.1.2 node run `nodetool -h localhost upgradesstables`. This
 takes awhile but appears to correctly rewrite each sstable in the new 1.1.x
 format. Storage load drops as sstables are compressed.
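
Condensed, steps 4 through 8 look roughly like this per node (CF names from
our schema; nodetool syntax as in 1.1):

    # on each 0.8.8 node
    nodetool -h localhost flush
    # rsync the sstables across, renaming into the 1.1 layout, e.g.
    #   Metrics-epochs_60-g-941-Data.db -> /cassandra/data/Metrics/epochs_60/
    # then on each 1.1.2 node
    nodetool -h localhost refresh Metrics epochs_60
    nodetool -h localhost upgradesstables Metrics epochs_60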


So, after some further testing we've observed that the `upgradesstables`
command is removing data from the sstables, leading to our missing data.
We've repeated the steps above with several variations:

WORKS: refresh -> scrub
WORKS: refresh -> scrub -> major compaction

FAILS: refresh -> upgradesstables
FAILS: refresh -> scrub -> upgradesstables
FAILS: refresh -> scrub -> major compaction -> upgradesstables

So, we are able to migrate our test CFs from a 0.8.8 ring to a 1.1.2 ring
when we use scrub. However, whenever we run an upgradesstables command the
sstables are shrunk significantly and our tests show missing data:

 INFO [CompactionExecutor:4] 2012-07-24 04:27:36,837 CompactionTask.java
(line 109) Compacting
[SSTableReader(path='/raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-51-Data.db')]
 INFO [CompactionExecutor:4] 2012-07-24 04:27:51,090 CompactionTask.java
(line 221) Compacted to
[/raid0/cassandra/data/Metrics/metrics_900/Metrics-metrics_900-hd-58-Data.db,].
 60,449,155 to 2,578,102 (~4% of original) bytes for 4,002 keys at
0.172562MB/s.  Time: 14,248ms.

Is there a scenario where upgradesstables would remove data that a scrub
command wouldn't? According to the documentation, it would appear that the
scrub command is actually more destructive than upgradesstables in terms of
removing data. On 1.1.x, upgradesstables is the documented upgrade command
rather than scrub.

The keyspace is defined as:

Keyspace: Metrics:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [us-east:3]

And the column family above defined as:

ColumnFamily: metrics_900
  Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
  Default column value validator:
org.apache.cassandra.db.marshal.BytesType
  Columns sorted by:
org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.LongType,org.apache.cassandra.db.marshal.UTF8Type,org.apache.cassandra.db.marshal.UTF8Type)
  GC grace seconds: 0
  Compaction min/max thresholds: 4/32
  Read repair chance: 0.1
  DC Local Read repair chance: 0.0
  Replicate on write: true
  Caching: KEYS_ONLY
  Bloom Filter FP chance: default
  Built indexes: []
  Compaction Strategy:
org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
  Compression Options:
sstable_compression:
org.apache.cassandra.io.compress.SnappyCompressor

All rows have a TTL of 30 days, so it's possible that, along with the
gc_grace=0, a small number would be removed during a
compaction/scrub/upgradesstables step. However, the majority should still
be kept as their TTL has not expired yet.

We are still experimenting to see under what conditions this happens, but I
thought I'd send out some more info in case there is something clearly
wrong we're doing here.


Thanks,

Mike
-- 

  Mike Heffner m...@librato.com
  Librato, Inc.


Re: How to Optimize Cassandra Updates (Use of memtables)

2012-07-24 Thread Dean Hiller
I am guessing you already asked if they could give you three 100MB files
instead, so you could parallelize the operation? Or maybe your task
doesn't lend itself well to that.

Dean

On Tue, Jul 24, 2012 at 10:01 AM, Pushpalanka Jayawardhana 
pushpalankaj...@gmail.com wrote:

 Hi all,

 I am dealing with a scenario where I receive a .csv file at 10-minute
 intervals, averaging 300 MB each. I need to update a Cassandra cluster
 according to the data received from the .csv file, after some processing.

 The current approach keeps a HashMap in memory and updates it from the
 processed .csv files, gathering the data to be updated (this data is mostly
 counter updates). Then, periodically (let's say at 2 s intervals), the
 values in the HashMap are read one by one and written to Cassandra.

 I have tried generating sstables and loading data in batches via
 sstableloader, but it is a lot slower than my requirement of near-real-time
 results.

 Are there any hints on what I can try out? Is there any possibility of
 directly updating values in a memtable (instead of using a HashMap) and
 sending those to Cassandra, rather than loading via sstables?



 --
 Pushpalanka Jayawardhana





Re: CQL3 and column slices

2012-07-24 Thread Josep Blanquer
Thanks Sylvain,

 The main argument for this is pagination. Let me try to explain the use
cases, and compare it to RDBMS for better illustration:
 1- Right now, Cassandra doesn't stream the requests, so large resultsets
are a royal pain in the neck to deal with. I.e., if I have a range_slice,
or even a slice query that cuts across 1 million columns...I have to
completely eat it all in the client receiving the response. That is, I'll
need to store 1 million results in the client no matter what, and that can
be quite prohibitive.
 2- In an effort to alleviate that, one can be smarter in the client and
play the pagination game... i.e., start slicing at some column and get the
next N results, then start the slice at the last column seen and get N
more, etc. (see the sketch after this list). That results in many more
queries from the smart client, but at least it would allow you to handle
large result sets. (That's where the need for the CQL query in my original
email comes from.)
3- There's another important factor related to this problem in my opinion:
the LIMIT clause in Cassandra (in both CQL and Thrift) is a "required"
field. What I mean by required is that Cassandra requires an explicit
count to operate underneath. So it is really different from RDBMS
semantics, where no LIMIT means you'll get all the results (instead of the
high, yet still bounded, count of 10K or 20K max resultset rows Cassandra
enforces by default)... and I cannot tell you how many problems we've had
with developers forgetting about these default counts in queries, and
realizing that some had results truncated because of that... in my mind,
LIMIT should only be used to restrict results... queries with no LIMIT
should always return all results (much like RDBMS)... otherwise the query
seems the same but it is semantically different.
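
As a sketch of the pagination game above (hypothetical table with PRIMARY
KEY (key, a, b, c, d, e); in current CQL3 only the last clustering
component in the WHERE clause may use an inequality, which is exactly the
limitation I'm describing):

    -- first page
    SELECT * FROM bug_test WHERE key = 'x' LIMIT 1000;
    -- next page, restarting after the last composite seen, e.g. 1:1:1:1:2;
    -- the prefix must be pinned with equalities:
    SELECT * FROM bug_test
     WHERE key = 'x' AND a = 1 AND b = 1 AND c = 1 AND d = 1 AND e > 2
     LIMIT 1000;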

So, all in all I think that the main problem/use case I'm facing is that
Cassandra cannot stream resultsets. If it did, I believe that the need for
my pagination use case would basically disappear, since it'd be the
transport/client that would throttle how many results are stored in the
client buffer at any point in time. At the same time, I believe that with a
streaming protocol you could simply change Cassandra internals to have
infinite default limits... since there would be no reason to stop
scanning (unless an explicit LIMIT clause was specified by the client).
That would give you not only the SQL-equivalent syntax, but also the
equivalent semantics of most current DBs.

I hope that makes sense. That being said, are there any plans for streaming
results? I believe that without that (and especially with the new CQL
restrictions) it becomes much more difficult to use Cassandra with wide rows
and large resultsets (which, in my mind, is one of its sweet spots). I
believe that if that doesn't happen it would a) force the clients to be
built in a much more complex and inefficient way to handle wide rows or b)
force users to use different, less efficient datamodels for their
data. Both seem bad propositions to me, as they wouldn't be taking
advantage of Cassandra's power, therefore diminishing its value.

 Cheers,

 Josep M.


On Tue, Jul 24, 2012 at 3:11 AM, Sylvain Lebresne sylv...@datastax.comwrote:

 On Tue, Jul 24, 2012 at 12:09 AM, Josep Blanquer
 blanq...@rightscale.com wrote:
  is there some way to express that in CQL3? something logically
 equivalent to
 
   SELECT * FROM bug_test WHERE a:b:c:d:e > 1:1:1:1:2 ??

 No, there isn't. Not currently at least. But feel free of course to
 open a ticket/request on
 https://issues.apache.org/jira/browse/CASSANDRA.

 I note that I would be curious to know the concrete use case you have
 for such type of queries. It would also help as an argument to add
 such facilities more quickly (or at all). Typically, "we should
 support it in CQL3 because it was possible with Thrift" is
 definitely an argument, but a much weaker one without concrete
 examples of why it might be useful in the first place.

 --
 Sylvain



Re: Bringing a dead node back up after fixing hardware issues

2012-07-24 Thread Brandon Williams
On Mon, Jul 23, 2012 at 10:24 PM, Eran Chinthaka Withana
eran.chinth...@gmail.com wrote:
 Thanks Brandon for the answer (and I didn't know driftx = Brandon Williams.
 Thanks for your awesome support in Cassandra IRC)

Thanks :)

 Increasing CL is tricky for us for now, as our RF on that datacenter is 2
 and CL is set to ONE. If we make the CL to be LOCAL_QUORUM, then, if a node
 goes down we will have trouble. I will try to increase the RF to 3 in that
 data center and set the CL to LOCAL_QUORUM if nothing works out.

Increasing the RF and using LOCAL_QUORUM is the right thing in
this case.  By choosing CL.ONE, you are agreeing that read misses are
acceptable.  If they are not, then adjusting your RF/CL is the only
path.

 About decommissioning: if the node goes down, there is no way of running
 that command on it, right? IIUC, decommissioning should be run on the
 node that needs to be decommissioned.

Well, decom and removetoken are both ways of removing a node.  The
former is for a live node, and the latter is for a dead node.  Since
your node was actually alive you could have decommissioned it.
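
For reference, a sketch of the two (nodetool syntax as of 1.x; the token is
a placeholder):

    # on the node that is leaving, while it is still live:
    nodetool -h localhost decommission
    # from any live node, for a node that is already dead:
    nodetool -h localhost removetoken 85070591730234615865843651857942052864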

 Coming back to the original question: without touching the CL, can we bring
 back a dead node (after fixing it) and somehow tell Cassandra that the node
 is back up, and not send it read requests until it has all the data?

No, as I said, you are accepting this behavior by choosing CL.ONE.

-Brandon


Re: TimedOutException caused by

2012-07-24 Thread J. Amding


aaron morton aaron at thelastpickle.com writes:

 
 The cluster is running into GC problems and this is slowing it down under the 
stress test. When it slows down, one or more of the nodes fails to perform 
the write within rpc_timeout. This causes the coordinator of the write to raise 
the TimedOutException. 
 Your options are:
 
 * allocate more memory
 * ease back on the stress test 
 * work at CL QUORUM so that one node failing does not result in the error 
 
 see also http://wiki.apache.org/cassandra/FAQ#slows_down_after_lotso_inserts
 
 
 Cheers
  
 
 
 
 
 
 -
 Aaron Morton
 Freelance Developer
  at aaronmorton
 http://www.thelastpickle.com
 
 
 
 On 28/05/2012, at 12:59 PM, Jason Tang wrote:
 Hi
 My system is a 4-node, 64-bit Cassandra cluster, 6 GB heap per node, default 
configuration (which means 1/3 of the heap for memtables), replication factor 3, 
write ALL, read ONE.
 When I run stress load testing, I get this TimedOutException, some 
operations fail, and all traffic hangs for a while. 
 
 When I ran a 1 GB, 32-bit standalone Cassandra, I didn't see such frequent 
stop-the-world behavior.
 
 So I wonder what kind of operation will hang the Cassandra system. 
 
 
 How to collect information for tuning?
 
 From the system log and documentation, I guess there are three types of 
operations:
 1) Flush memtable when it reaches max size
 
 2) Compact SSTables (why?)
 3) Java GC
 
 system.log:
 
  INFO [main] 2012-05-25 16:12:17,054 ColumnFamilyStore.java (line 688) 
Enqueuing flush of Memtable-LocationInfo at 1229893321(53/66 serialized/live 
bytes, 2 ops)
  INFO [FlushWriter:1] 2012-05-25 16:12:17,054 Memtable.java (line 239) 
 Writing 
Memtable-LocationInfo at 1229893321(53/66 serialized/live bytes, 2 ops)
  INFO [FlushWriter:1] 2012-05-25 16:12:17,166 Memtable.java (line 275) 
Completed flushing /var/proclog/raw/cassandra/data/system/LocationInfo-hb-2-
Data.db (163 bytes)
 
 ...
 
  INFO [CompactionExecutor:441] 2012-05-28 08:02:55,345 CompactionTask.java 
 (line 112) Compacting 
 [SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-41-Data.db'), 
 SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-32-Data.db'), 
 SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-37-Data.db'), 
 SSTableReader(path='/var/proclog/raw/cassandra/data/myks/queue-hb-53-Data.db')]
 ...
 
 
  WARN [ScheduledTasks:1] 2012-05-28 08:02:26,619 GCInspector.java (line 146) 
Heap is 0.7993011015621736 full.  You may need to reduce memtable and/or cache 
sizes.  Cassandra will now flush up to the two largest memtables to free up 
memory.  Adjust flush_largest_memtables_at threshold in cassandra.yaml if you 
don't want Cassandra to do this automatically
  INFO [ScheduledTasks:1] 2012-05-28 08:02:54,980 GCInspector.java (line 123) 
GC for ConcurrentMarkSweep: 728 ms for 2 collections, 3594946600 used; max is 
6274678784
  INFO [ScheduledTasks:1] 2012-05-28 08:41:34,030 GCInspector.java (line 123) 
GC for ParNew: 1668 ms for 1 collections, 4171503448 used; max is 6274678784
  INFO [ScheduledTasks:1] 2012-05-28 08:41:48,978 GCInspector.java (line 123) 
GC for ParNew: 1087 ms for 1 collections, 2623067496 used; max is 6274678784
  INFO [ScheduledTasks:1] 2012-05-28 08:41:48,987 GCInspector.java (line 123) 
GC for ConcurrentMarkSweep: 3198 ms for 3 collections, 2623361280 used; max is 
6274678784
 
 
 
 Timeout Exception:
 
 Caused by: org.apache.cassandra.thrift.TimedOutException: null
         at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:19495) ~[na:na]
         at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:1035) ~[na:na]
         at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:1009) ~[na:na]
         at me.prettyprint.cassandra.service.KeyspaceServiceImpl$1.execute(KeyspaceServiceImpl.java:95) ~[na:na]
         ... 64 common frames omitted
 
 
 BRs
 //Tang Weiqiang
 
 
 
 
 
 
 

Hi, I've been running into the same type of issue, but on a single machine with 
CL ONE, also with a custom insertion stress utility. What would I need to do to 
address the timeouts? By "allocate more memory" do you mean increasing the heap 
size in the environment conf file?
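Concretely, assuming the stock conf/cassandra-env.sh (the sizes below are
made up), would that be something like:

    MAX_HEAP_SIZE="8G"
    HEAP_NEWSIZE="800M"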

Thanks,
J.







Fwd: Call for Papers for ApacheCon Europe 2012 now open!

2012-07-24 Thread Jonathan Ellis
There are Big Data and NoSQL tracks where Cassandra talks would be appropriate.


-- Forwarded message --
From: Nick Burch nick.bu...@alfresco.com
Date: Thu, Jul 19, 2012 at 1:14 PM
Subject: Call for Papers for ApacheCon Europe 2012 now open!
To: committ...@apache.org


Hi All

We're pleased to announce that the Call for Papers for ApacheCon
Europe 2012 is finally open!

(For those who don't already know, ApacheCon Europe will be taking
place between the 5th and the 9th of November this year, in Sinsheim,
Germany.)

If you'd like to submit a talk proposal, please visit the conference
website at http://www.apachecon.eu/ and sign up for a new account.
Once you've signed up, use your dashboard to enter your speaker bio,
then submit your talk proposal(s). There's more information on the CFP
page on the conference website.

We welcome talk proposals from all projects, from right across the
breadth of projects at the foundation! To make things easier for talk
selection and scheduling, we'd ask that you tag your proposal with the
track that it most closely fits within. The details of the tracks, and
what projects they expect to cover, are available at
http://www.apachecon.eu/tracks/.

(If your project/group of projects was intending to submit a track,
and missed the deadline, then please get in touch with us on
apachecon-disc...@apache.org  straight away, so we can work out if
it's possible to squeeze you in...)

The CFP will close on Friday 3rd August, so you've a little over two weeks
to send in your talk proposal. Don't put it off! We'll look forward to
seeing some great ones shortly!

Thanks
Nick
(On behalf of the Conferences committee)


-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com