Re: cassandra + spark / pyspark
Adding to the conversation... there are 3 great open source options available:

1. Calliope http://tuplejump.github.io/calliope/ — This was the first library out, some time late last year (as I recall), and I have been using it for a while; mostly very stable. It uses Hadoop I/O with Cassandra (note that it doesn't require Hadoop).

2. DataStax spark-cassandra-connector https://github.com/datastax/spark-cassandra-connector — The main difference is that this one uses CQL3. Again a great library, though it has a few issues. It is by far the most actively developed; it still uses Thrift for minor things, but all the heavy lifting is done in CQL3.

3. Stratio Deep https://github.com/Stratio/stratio-deep — Has a lot more to offer if you use the whole Stratio stack: Deep is for Spark, Stratio Streaming is built on top of Spark Streaming, Stratio META is something similar to Shark or Spark SQL, and finally Stratio Cassandra, which is a fork of Cassandra with advanced Lucene-based indexing.
Re: cassandra + spark / pyspark
Re "2. still uses thrift for minor stuff" -- I think the only call still using Thrift is describe_ring, to get an estimate of the ratio of partition keys within each token range.

Re "3.": Stratio has a talk today at the SF Summit, presenting Stratio META. For the folks not attending the conference, the video should be available within about a month afterwards.

On Thu, Sep 11, 2014 at 6:23 AM, abhinav chowdary abhinav.chowd...@gmail.com wrote: ...
Re: cassandra + spark / pyspark
Ok. DataStax and Stratio require Mesos, Hadoop YARN, or some other third party to get Spark cluster HA. What about Calliope? Is it sufficient to have Cassandra + Calliope + Spark to be able to process aggregations? In my case we have quite a lot of data, so doing aggregation only in memory is impossible. Does Calliope support a not-in-memory mode for Spark? Thanks, Oleg. On Thu, Sep 11, 2014 at 9:23 PM, abhinav chowdary abhinav.chowd...@gmail.com wrote: ...
Re: cassandra + spark / pyspark
Hi Oleg, I am the creator of Calliope. Calliope doesn't force any deployment model... that means you can run it with Mesos or Hadoop or standalone. To be fair, the other libs mentioned here should work that way too. Spark cluster HA can be provided using ZooKeeper even in the standalone deployment mode. Can you explain what you mean by in-memory aggregations not being possible? With Calliope being able to utilize secondary indexes and also our Stargate indexes (distributed Lucene indexing for C*), I am sure we can handle any scenario. Calliope is used in production at many large organizations over very, very big data. Feel free to mail me directly, and we can work with you to get you started. Regards, Rohit *Founder CEO, **Tuplejump, Inc.* www.tuplejump.com *The Data Engineering Platform* On Thu, Sep 11, 2014 at 8:09 PM, Oleg Ruchovets oruchov...@gmail.com wrote: ...
[RELEASE] Apache Cassandra 2.1.0
The Cassandra team is pleased to announce the release of the final version of Apache Cassandra 2.1.0.

Cassandra 2.1.0 brings a number of new features and improvements, including (but not limited to):
- Improved support for Windows
- A new incremental repair option[4, 5]
- A better row cache that can cache only the head of partitions[6]
- Off-heap memtables[7]
- Numerous performance improvements[8, 9]
- CQL improvements and additions: user-defined types, tuple types, 2ndary indexing of collections, ...[10]
- An improved stress tool[11]

Please refer to the release notes[1] and changelog[2] for details.

Both source and binary distributions of Cassandra 2.1.0 can be downloaded at: http://cassandra.apache.org/download/

As usual, a debian package is available from the project APT repository[3] (you will need to use the 21x series).

The Cassandra team

[1]: http://goo.gl/k4eM39 (CHANGES.txt)
[2]: http://goo.gl/npCsro (NEWS.txt)
[3]: http://wiki.apache.org/cassandra/DebianPackaging
[4]: http://goo.gl/MjohJp
[5]: http://goo.gl/f8jSme
[6]: http://goo.gl/6TJPH6
[7]: http://goo.gl/YT7znJ
[8]: http://goo.gl/Rg3tdA
[9]: http://goo.gl/JfDBGW
[10]: http://goo.gl/kQl7GW
[11]: http://goo.gl/OTNqiQ
Re: Mutation Stage does not finish
Hello, the jstack output can be seen at http://pastebin.com/LXnNyY3U. I ran tpstats today and always get the same output:

Pool Name              Active  Pending   Completed  Blocked  All time blocked
ReadStage              0       0         0          0        0
RequestResponseStage   0       0         0          0        0
MutationStage          32      58        32690042   0        0
ReadRepairStage        0       0         0          0        0
ReplicateOnWriteStage  0       0         0          0        0
GossipStage            0       0         0          0        0
AntiEntropyStage       0       0         0          0        0
MigrationStage         0       0         0          0        0
MemoryMeter            0       0         98         0        0
MemtablePostFlusher    0       0         7          0        0
FlushWriter            0       0         5          0        0
MiscStage              0       0         0          0        0
commitlog_archiver     0       0         0          0        0
InternalResponseStage  0       0         0          0        0

OpsCenter shows the following status:
Status: Active - Starting
Gossip: Down
Thrift: Down
Native Transport: Down
Pending Tasks: 0

Thanks, Eduardo

On Wed, Sep 10, 2014 at 10:30 PM, Benedict Elliott Smith belliottsm...@datastax.com wrote: Could you post the results of jstack on the process somewhere? On Thu, Sep 11, 2014 at 7:07 AM, Robert Coli rc...@eventbrite.com wrote: On Wed, Sep 10, 2014 at 1:53 PM, Eduardo Cusa eduardo.c...@usmediaconsulting.com wrote: No, it is still running the Mutation Stage. If you're sure that it is not receiving Hinted Handoff, then the only mutations in question can be from the replay of the commit log. The commit log should take less than forever to replay. =Rob
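To check whether a backlog like this is actually draining, it can help to poll `nodetool tpstats` periodically and track the Pending column for MutationStage over time. A minimal parsing sketch (plain Python; the column positions are assumed from the output format above and may differ between Cassandra versions):

```python
def pending_for(stage: str, tpstats_output: str) -> int:
    # Each pool line is: name, Active, Pending, Completed, Blocked,
    # All time blocked. Return the Pending count for the given stage.
    for line in tpstats_output.splitlines():
        parts = line.split()
        if parts and parts[0] == stage:
            return int(parts[2])
    raise ValueError(f"stage {stage!r} not found in tpstats output")

# Hypothetical sample, shaped like the tpstats output in this thread:
sample = """\
Pool Name      Active  Pending  Completed  Blocked  All time blocked
ReadStage      0       0        0          0        0
MutationStage  32      58       32690042   0        0
"""
assert pending_for("MutationStage", sample) == 58
assert pending_for("ReadStage", sample) == 0
```

Polling this every few seconds (e.g. feeding it the output of `nodetool tpstats`) distinguishes a slowly draining commitlog replay from a genuinely stuck stage: in the former case, Pending falls and Completed rises between samples.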
Re: Quickly loading C* dataset into memory (row cache)
What are you referring to when you say memory store? RAM disk? memcached? Thanks, Danny On Wed, Sep 10, 2014 at 1:11 AM, DuyHai Doan doanduy...@gmail.com wrote: Rob Coli strikes again, you're Doing It Wrong, and he's right :D Using Cassandra as a distributed cache is a bad idea, seriously. Putting 6GB into the row cache is another one. On Tue, Sep 9, 2014 at 9:21 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Sep 9, 2014 at 12:10 PM, Danny Chan tofuda...@gmail.com wrote: Is there a method to quickly load a large dataset into the row cache? I use row caching as I want the entire dataset to be in memory. You're doing it wrong. Use a memory store. =Rob
Re: [RELEASE] Apache Cassandra 2.1.0
Thanks for this new version, which seems to bring a lot of interesting new features and improvements! Definitely interested in trying the new counters and incremental repairs. Congrats. PS: I am also quite curious to know what is still inside the heap :D. Maybe the key cache? So what is the recommended heap size when running 2.1 (with memtables off-heap)? 2014-09-11 17:05 GMT+02:00 Sylvain Lebresne sylv...@datastax.com: ...
Re: [RELEASE] Apache Cassandra 2.1.0
Congrats team, I know you worked hard on it!! One question: where can users get a DataStax Java driver that supports this version? If so, is it released? Best Regards, -Tony Anecito Founder/President MyUniPortal LLC http://www.myuniportal.com On Thursday, September 11, 2014 9:05 AM, Sylvain Lebresne sylv...@datastax.com wrote: ...
Re: [RELEASE] Apache Cassandra 2.1.0
Yes, it was released: Java driver 2.1. On Sep 11, 2014 8:33 AM, Tony Anecito adanec...@yahoo.com wrote: ...
Re: cassandra + spark / pyspark
Thank you, Rohit. I sent the email to you. Thanks, Oleg. On Thu, Sep 11, 2014 at 10:51 PM, Rohit Rai ro...@tuplejump.com wrote: ...
Re: Quickly loading C* dataset into memory (row cache)
On Thu, Sep 11, 2014 at 8:30 AM, Danny Chan tofuda...@gmail.com wrote: What are you referring to when you say memory store? RAM disk? memcached? In 2014, probably Redis? =Rob
Detecting bitrot with incremental repair
jbellis talked about incremental repair, which is great, but as I understood, repair was also somewhat responsible for detecting and repairing bitrot on long-lived sstables. If repair doesn't do it, what will? Thanks, John...
Re: Detecting bitrot with incremental repair
On Thu, Sep 11, 2014 at 9:44 AM, John Sumsion sumsio...@familysearch.org wrote: jbellis talked about incremental repair, which is great, but as I understood, repair was also somewhat responsible for detecting and repairing bitrot on long-lived sstables. SSTable checksums, and the checksums on individual compressed (and only compressed) partitions, provide some of this functionality, at the very least giving some visibility into bitrot-style corruption. If repair doesn't do it, what will? Read repair will help, but only repair is capable of providing the guarantee you need. Cassandra probably needs partition checksums on uncompressed partitions as well, and then to mark an sstable un-repaired when it detects a corrupt read. =Rob
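To illustrate the general idea (this is a toy sketch, not Cassandra's actual checksum implementation; the function names are hypothetical), a checksum stored alongside a data block lets a reader detect bitrot at read time:

```python
import zlib

def store_block(data: bytes) -> tuple[bytes, int]:
    # Persist the block together with its CRC32, the way an sstable
    # might store a per-chunk checksum at write time.
    return data, zlib.crc32(data)

def read_block(data: bytes, checksum: int) -> bytes:
    # On read, recompute the checksum; a mismatch means the bytes
    # changed (bitrot or other corruption) since they were written.
    if zlib.crc32(data) != checksum:
        raise IOError("checksum mismatch: possible bitrot")
    return data

block, crc = store_block(b"some partition bytes")
assert read_block(block, crc) == b"some partition bytes"

# Flip one bit to simulate bitrot: the read now fails loudly
# instead of silently returning corrupt data.
rotted = bytes([block[0] ^ 0x01]) + block[1:]
try:
    read_block(rotted, crc)
except IOError:
    pass  # corruption detected
```

The point made above is exactly this gap: a checksum can only detect corruption when the block is actually read and a checksum exists for it, which is why rarely-read, uncompressed data still needs repair (or additional checksums) to get the same guarantee.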
Re: Mutation Stage does not finish
Robert/Elliott, I deleted the commit logs, restarted Cassandra, and finally the node is up. Thanks for the help! Regards, Eduardo On Thu, Sep 11, 2014 at 12:08 PM, Eduardo Cusa eduardo.c...@usmediaconsulting.com wrote: ...
Re: Mutation Stage does not finish
On Thu, Sep 11, 2014 at 10:34 AM, Eduardo Cusa eduardo.c...@usmediaconsulting.com wrote: I deleted commit logs, restarted cassandra and finally the node is up. Do you have some crazy workload where you do a huge amount of deletes or something? Replaying a commitlog should not take longer than a few tens of minutes in the worst-case scenario. =Rob
Is it possible to bootstrap the 1st node of a new DC?
When setting up a new (additional) data center, the documentation tells us to use nodetool rebuild -- old dc to fill up the node(s) in the new DC, and to disable auto_bootstrap. I'm wondering if it is possible to fill the node with auto_bootstrap=true instead of a nodetool rebuild command. If so, how will Cassandra decide from where to stream the data? The reason I'm asking is that when using rebuild, I've learned from experience that the node immediately joins the cluster and starts accepting reads (from other DCs) for the range it owns. But since its data is not complete yet, it can't return anything. This seems to be a dangerous side effect of this procedure, and therefore it can't be used. Thanks, Tom
Re: Is it possible to bootstrap the 1st node of a new DC?
Thanks, Rob. I actually tried using LOCAL_ONE instead of ONE, but I still saw this problem. Maybe I missed some queries when updating to LOCAL_ONE. Anyway, it's good to know that this is supposed to work. Tom On Thu, Sep 11, 2014 at 10:28 PM, Robert Coli rc...@eventbrite.com wrote: On Thu, Sep 11, 2014 at 1:18 PM, Tom van den Berge t...@drillster.com wrote: When setting up a new (additional) data center, the documentation tells us to use nodetool rebuild -- old dc to fill up the node(s) in the new dc, and to disable auto_bootstrap. I'm wondering if it is possible to fill the node with auto_bootstrap=true instead of a nodetool rebuild command. If so, how will Cassandra decide from where to stream the data? Yes, if that node can hold 100% of the replicas for the new DC. Cassandra will decide from where to stream the data in the same way it normally does, by picking one replica per range and streaming from it. But you probably don't generally want to do this, rebuild exists for this use case. The reason I'm asking is that when using rebuild, I've learned from experience that the node immediately joins the cluster, and starts accepting reads (from other DCs) for the range it owns. But since the data is not complete yet, it can't return anything. This seems to be a dangerous side effect of this procedure, and therefore can't be used. Yes, that's why LOCAL_ONE ConsistencyLevel was created. Use it, and LOCAL_QUORUM, instead of ONE and QUORUM. =Rob -- Drillster BV Middenburcht 136 3452MT Vleuten Netherlands +31 30 755 5330 Open your free account at www.drillster.com
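The reason LOCAL_QUORUM sidesteps the problem described above is just arithmetic: a quorum is a strict majority of replicas, and LOCAL_QUORUM computes it over the local datacenter's replicas only, so a still-empty remote DC never participates. A quick sketch of the counting (plain Python, not driver code; the RF values are illustrative):

```python
def quorum(replicas: int) -> int:
    # A quorum is a strict majority: floor(replicas / 2) + 1.
    return replicas // 2 + 1

# Example: RF=3 in each of two DCs.
rf_dc1, rf_dc2 = 3, 3

# Plain QUORUM counts all replicas cluster-wide: 4 of 6 must answer,
# so the coordinator may route reads to the new, still-empty DC.
assert quorum(rf_dc1 + rf_dc2) == 4

# LOCAL_QUORUM counts only the local DC's replicas: 2 of 3,
# all in a DC that actually has the data.
assert quorum(rf_dc1) == 2
```

The same reasoning applies to ONE vs. LOCAL_ONE: both need a single reply, but LOCAL_ONE restricts the candidate replicas to the local DC, which is why any lingering ONE/QUORUM queries would still occasionally hit the rebuilding DC.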
Re: Mutation Stage does not finish
Yes, we have a huge amount of inserts that can be repeated; now we are working on a new data model. On Thu, Sep 11, 2014 at 2:54 PM, Robert Coli rc...@eventbrite.com wrote: ...