Re: Convert single node C* to cluster (rebalancing problem)

2017-06-15 Thread Varun Gupta
Akhil,

As per the blog, nodetool status shows the data size for node1 even for token
ranges it does not own. Isn't this a bug in Cassandra?

Yes, the data will still be present on disk, but nodetool status should
reflect only the token ranges the node owns.

On Thu, Jun 15, 2017 at 6:17 PM, Akhil Mehra  wrote:

> Hi,
>
> I put together a blog post explaining possible reasons for an unbalanced
> Cassandra cluster.
>
> http://abiasforaction.net/unbalanced-cassandra-cluster/
>
> Let me know if you have any questions.
>
> Cheers,
> Akhil
>
>
> On Thu, Jun 15, 2017 at 5:54 PM, Affan Syed  wrote:
>
>> John,
>>
>> I am a co-worker with Junaid -- he is out sick, so just wanted to confirm
>> that one of your shots in the dark is correct. This is an RF of 1x:
>>
>> "CREATE KEYSPACE orion WITH replication = {'class': 'SimpleStrategy',
>> 'replication_factor': '1'}  AND durable_writes = true;"
>>
>> However, how does the RF affect the redistribution of key/data?
>>
>> Affan
>>
>> - Affan
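[With 'replication_factor': 1 there is only one copy of each row, so, as John suggests below, ownership percentages can shift without any data actually moving. If the intent is for both nodes to hold the data, the RF has to be raised and a repair run afterwards. A minimal sketch of doing that, assuming the DataStax Java driver 3.x (the thread doesn't say which driver is in use, and the contact point is illustrative):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class RaiseReplication {
    public static void main(String[] args) {
        // Contact point is illustrative, not taken from the thread.
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect()) {
            // Keyspace definition as quoted above, with the RF raised to 2 as an example.
            session.execute("ALTER KEYSPACE orion WITH replication = "
                    + "{'class': 'SimpleStrategy', 'replication_factor': 2}");
            // The ALTER only changes metadata; run 'nodetool repair orion'
            // on each node afterwards so the new replicas are populated.
        }
    }
}

Per-keyspace ownership can then be checked with 'nodetool status orion', which is what John means further down by calling out a specific keyspace.]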
>>
>> On Wed, Jun 14, 2017 at 1:16 AM, John Hughes 
>> wrote:
>>
>>> OP, I was just looking at your original numbers and I have some
>>> questions:
>>>
>>> 270GB on one node and 414KB on the other, but something close to 50/50
>>> on "Owns(effective)".
>>> What replication factor are your keyspaces set up with? 1x or 2x or ??
>>>
>>> I would say you are seeing 50/50 because the tokens are allocated
>>> 50/50 (others on the list, please correct what are for me really just
>>> assumptions), but I would hazard a guess that your replication factor
>>> is still 1x, so it isn't moving anything around. Or your keyspace
>>> replication is incorrect and isn't being distributed (I have had issues with
>>> the AWSMultiRegionSnitch and not getting the region correct [us-east vs
>>> us-east-1]). It doesn't throw an error, but it doesn't work very well either
>>> =)
>>>
>>> Can you do a 'describe keyspace XXX' and show the first line (the CREATE
>>> KEYSPACE line)?
>>>
>>> Mind you, these are all just shots in the dark from here.
>>>
>>> Cheers,
>>>
>>>
>>> On Tue, Jun 13, 2017 at 3:13 AM Junaid Nasir  wrote:
>>>
 Is the OP expecting a perfect 50%/50% split?


 The best result I got was a 240GB/30GB split, which I think is not
 properly balanced.


> Also, what are your outputs when you call out specific keyspaces? Do
> the numbers get more even?


 I don't know what you mean by *call out specific keyspaces* - can you
 please explain that a bit?


 If your schema is not modelled correctly you can easily end up with
> unevenly distributed data.


 I think that is the problem. The initial 270GB of data might not be modeled
 correctly. I have run a lot of tests on the 270GB dataset, including
 downsizing it to 5GB; they all resulted in the same uneven distribution. I
 also tested a dummy 2GB dataset, which was balanced evenly. Coming from an
 RDBMS, I didn't give much thought to data modeling. Can anyone please point
 me to some resources regarding this problem?

 On Tue, Jun 13, 2017 at 3:24 AM, Akhil Mehra 
 wrote:

> Great point John.
>
> The OP should also note that data distribution depends on your
> schema and incoming data profile.
>
> If your schema is not modelled correctly you can easily end up with
> unevenly distributed data.
>
> Cheers,
> Akhil
>
> On Tue, Jun 13, 2017 at 3:36 AM, John Hughes 
> wrote:
>
>> Is the OP expecting a perfect 50%/50% split? That, in my experience,
>> is not going to happen; it is almost always off by anywhere from a fraction
>> of a percent to a couple of percent.
>>
>> Datacenter: eu-west
>> ===
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
>> UN  XX.XX.XX.XX  22.71 GiB  256     47.6%             57dafdde-2f62-467c-a8ff-c91e712f89c9  1c
>> UN  XX.XX.XX.XX  17.17 GiB  256     51.3%             d2a65c51-087d-48de-ae1f-a41142eb148d  1b
>> UN  XX.XX.XX.XX  26.15 GiB  256     52.4%             acf5dd34-5b81-4e5b-b7be-85a7fccd8e1c  1c
>> UN  XX.XX.XX.XX  16.64 GiB  256     50.2%             6c8842dd-a966-467c-a7bc-bd6269ce3e7e  1a
>> UN  XX.XX.XX.XX  24.39 GiB  256     49.8%             fd92525d-edf2-4974-8bc5-a350a8831dfa  1a
>> UN  XX.XX.XX.XX  23.8 GiB   256     48.7%             bdc597c0-718c-4ef6-b3ef-7785110a9923  1b
>>
>> Though maybe part of what you are experiencing can be cleared up by
>> repair/compaction/cleanup. Also, what are your outputs when you call out
>> specific keyspaces? Do the numbers get more even?
>>
>> Cheers,
>>
>> On Mon, Jun 12, 2017 at 5:22 AM Akhil Mehra 
>> wrote:
>>
>>> auto_bootstrap is true by default. Ensure it's set to true. On startup,
>>> look at your logs for your auto_bootstrap value.

Impact of Write without consistency level and mutation failures on reads and cluster

2017-06-15 Thread srinivasarao daruna
Hi,

Recently, one of our Spark jobs was missing the Cassandra consistency level
property and the number-of-concurrent-writes property.

Due to that, some mutations failed, as we saw in tpstats. Also, we
observed that read timeouts are occurring not only on the table the job
inserts into, but also on other tables that have always had a proper
consistency level. We started a repair, but due to the volume of data it
might take a day or two to complete. Meanwhile, we wanted to get some input.
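[For reference, the two settings in question can be supplied when the job is configured; a minimal sketch, assuming the DataStax Spark Cassandra Connector is being used (the contact point and values are illustrative, not taken from the job):

import org.apache.spark.SparkConf;

public class IngestJobConf {
    public static SparkConf build() {
        return new SparkConf()
                .setAppName("ingest-job")
                .set("spark.cassandra.connection.host", "10.0.0.1") // illustrative
                // Consistency level the connector uses for writes.
                .set("spark.cassandra.output.consistency.level", "LOCAL_QUORUM")
                // Cap on concurrent asynchronous writes per Spark task;
                // setting this too high can overwhelm the nodes.
                .set("spark.cassandra.output.concurrent.writes", "5");
    }
}]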

The error raised a lot of questions:
1) Is there a relation between mutation failures and read timeouts or overall
cluster performance? If yes, how?

2) When I checked the log, I found a warning in debug.log, as below:
SELECT * FROM our_table WHERE partition_key = required_value LIMIT 5000:
total time 20353 msec - timeout 2 msec

Actual query:
SELECT * FROM our_table WHERE partition_key = required_value

Even though we are hitting the partition key, I do not understand the reason
for such a huge read time, or the timeouts.
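[One way to pin down where the time goes is to trace the query and inspect the per-step events (sstable reads, tombstone scans, and so on). A minimal sketch, assuming the DataStax Java driver 3.x; the contact point, keyspace name, and bind value are placeholders:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.QueryTrace;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;

public class TraceOneRead {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("10.0.0.1").build();
             Session session = cluster.connect("our_keyspace")) {
            PreparedStatement ps = session.prepare(
                    "SELECT * FROM our_table WHERE partition_key = ?");
            // Trace just this one execution.
            Statement stmt = ps.bind("required_value").enableTracing();
            ResultSet rs = session.execute(stmt);
            QueryTrace trace = rs.getExecutionInfo().getQueryTrace();
            System.out.printf("trace %s: %d us total%n",
                    trace.getTraceId(), trace.getDurationMicros());
            for (QueryTrace.Event e : trace.getEvents()) {
                // Each event names a server-side step and the node it ran on;
                // a wide partition or tombstone scans usually show up here.
                System.out.printf("  %s on %s%n", e.getDescription(), e.getSource());
            }
        }
    }
}]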

3) We are using prepared statements to query the tables from the API. How can
we set the fetch size so that it won't use LIMIT 5000?
Any thoughts?
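[On question 3: the fetch size (page size) can be set per statement or as a cluster-wide default, and the driver then pages through the partition instead of pulling one large chunk. A minimal sketch, again assuming the DataStax Java driver 3.x (the value 500 and the contact point are only examples):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;

public class PagedRead {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("10.0.0.1") // illustrative
                // Cluster-wide default page size.
                .withQueryOptions(new QueryOptions().setFetchSize(500))
                .build();
             Session session = cluster.connect("our_keyspace")) { // assumed keyspace
            PreparedStatement ps = session.prepare(
                    "SELECT * FROM our_table WHERE partition_key = ?");
            // Per-statement override of the default page size.
            Statement stmt = ps.bind("required_value").setFetchSize(500);
            int rows = 0;
            for (Row row : session.execute(stmt)) {
                rows++; // the driver fetches the next page transparently
            }
            System.out.println("read " + rows + " rows");
        }
    }
}]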


Thank You,
Regards,
Srini


Re: Convert single node C* to cluster (rebalancing problem)

2017-06-15 Thread Akhil Mehra
Hi,

I put together a blog post explaining possible reasons for an unbalanced
Cassandra cluster.

http://abiasforaction.net/unbalanced-cassandra-cluster/

Let me know if you have any questions.

Cheers,
Akhil


On Thu, Jun 15, 2017 at 5:54 PM, Affan Syed  wrote:

> John,
>
> I am a co-worker with Junaid -- he is out sick, so just wanted to confirm
> that one of your shots in the dark is correct. This is an RF of 1x:
>
> "CREATE KEYSPACE orion WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': '1'}  AND durable_writes = true;"
>
> However, how does the RF affect the redistribution of key/data?
>
> Affan
>
> - Affan
>
> On Wed, Jun 14, 2017 at 1:16 AM, John Hughes 
> wrote:
>
>> OP, I was just looking at your original numbers and I have some questions:
>>
>> 270GB on one node and 414KB on the other, but something close to 50/50 on
>> "Owns(effective)".
>> What replication factor are your keyspaces set up with? 1x or 2x or ??
>>
>> I would say you are seeing 50/50 because the tokens are allocated
>> 50/50 (others on the list, please correct what are for me really just
>> assumptions), but I would hazard a guess that your replication factor
>> is still 1x, so it isn't moving anything around. Or your keyspace
>> replication is incorrect and isn't being distributed (I have had issues with
>> the AWSMultiRegionSnitch and not getting the region correct [us-east vs
>> us-east-1]). It doesn't throw an error, but it doesn't work very well either
>> =)
>>
>> Can you do a 'describe keyspace XXX' and show the first line (the CREATE
>> KEYSPACE line)?
>>
>> Mind you, these are all just shots in the dark from here.
>>
>> Cheers,
>>
>>
>> On Tue, Jun 13, 2017 at 3:13 AM Junaid Nasir  wrote:
>>
>>> Is the OP expecting a perfect 50%/50% split?
>>>
>>>
>>> The best result I got was a 240GB/30GB split, which I think is not
>>> properly balanced.
>>>
>>>
 Also, what are your outputs when you call out specific keyspaces? Do
 the numbers get more even?
>>>
>>>
>>> I don't know what you mean by *call out specific keyspaces* - can you
>>> please explain that a bit?
>>>
>>>
>>> If your schema is not modelled correctly you can easily end up with
 unevenly distributed data.
>>>
>>>
>>> I think that is the problem. The initial 270GB of data might not be modeled
>>> correctly. I have run a lot of tests on the 270GB dataset, including
>>> downsizing it to 5GB; they all resulted in the same uneven distribution. I
>>> also tested a dummy 2GB dataset, which was balanced evenly. Coming from an
>>> RDBMS, I didn't give much thought to data modeling. Can anyone please point
>>> me to some resources regarding this problem?
>>>
>>> On Tue, Jun 13, 2017 at 3:24 AM, Akhil Mehra 
>>> wrote:
>>>
 Great point John.

 The OP should also note that data distribution depends on your
 schema and incoming data profile.

 If your schema is not modelled correctly you can easily end up with
 unevenly distributed data.

 Cheers,
 Akhil

 On Tue, Jun 13, 2017 at 3:36 AM, John Hughes 
 wrote:

> Is the OP expecting a perfect 50%/50% split? That, in my experience,
> is not going to happen; it is almost always off by anywhere from a fraction
> of a percent to a couple of percent.
>
> Datacenter: eu-west
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  XX.XX.XX.XX  22.71 GiB  256     47.6%             57dafdde-2f62-467c-a8ff-c91e712f89c9  1c
> UN  XX.XX.XX.XX  17.17 GiB  256     51.3%             d2a65c51-087d-48de-ae1f-a41142eb148d  1b
> UN  XX.XX.XX.XX  26.15 GiB  256     52.4%             acf5dd34-5b81-4e5b-b7be-85a7fccd8e1c  1c
> UN  XX.XX.XX.XX  16.64 GiB  256     50.2%             6c8842dd-a966-467c-a7bc-bd6269ce3e7e  1a
> UN  XX.XX.XX.XX  24.39 GiB  256     49.8%             fd92525d-edf2-4974-8bc5-a350a8831dfa  1a
> UN  XX.XX.XX.XX  23.8 GiB   256     48.7%             bdc597c0-718c-4ef6-b3ef-7785110a9923  1b
>
> Though maybe part of what you are experiencing can be cleared up by
> repair/compaction/cleanup. Also, what are your outputs when you call out
> specific keyspaces? Do the numbers get more even?
>
> Cheers,
>
> On Mon, Jun 12, 2017 at 5:22 AM Akhil Mehra 
> wrote:
>
>> auto_bootstrap is true by default. Ensure it's set to true. On startup,
>> look at your logs for your auto_bootstrap value; look at the node
>> configuration line in your log file.
>>
>> Akhil
>>
>> On Mon, Jun 12, 2017 at 6:18 PM, Junaid Nasir  wrote:
>>
>>> No, I didn't set it (left it at default value)
>>>
>>> On Fri, Jun 9, 2017 at 3:18 AM, ZAIDI, ASAD A 
>>> wrote:
>>>
 Did you make sure 

Re: Question: Large partition warning

2017-06-15 Thread kurt greaves
FYI, a ticket already existed for this. I've submitted a patch that fixes this
specific issue, but it looks like there are a few other properties that will
suffer from the same problem. As I said on the ticket, we should probably fix
these up even though setting things this high is generally bad practice. If
other people want to weigh in and agree, I'll create another ticket.
https://issues.apache.org/jira/browse/CASSANDRA-13172


Re: Upgrade from 3.0.6, where's the documentation?

2017-06-15 Thread Riccardo Ferrari
Jeff,

Thank you so much for your answer. If you say there are two very important
fixes in the next release, I believe we can wait a couple of weeks.

Thanks!

On Fri, Jun 16, 2017 at 12:35 AM, Jeff Jirsa  wrote:

>
>
> On 2017-06-14 07:05 (-0700), Riccardo Ferrari  wrote:
> > Hi list,
> >
> > It's been a while since I upgraded my C* to 3.0.6; nevertheless I would
> > like to give TWCS a try (available since 3.0.7).
> >
> > What happened to the upgrade documentation? I used to read a
> > step-by-step procedure from DataStax, but it looks like they are not
> > supporting it anymore; on the flip side, I can't find anything meaningful
> > on the cassandra.apache.org website. What am I missing?
> >
>
> The cassandra.apache.org docs are a work in progress.
>
> In the mean time, the old cached versions of the datastax pages are
> reasonably accessible:
>
> https://web.archive.org/web/20161004153529/http://docs.datastax.com:80/en/latest-upgrade/upgrade/cassandra/upgrdBestPractCassandra.html
>
> https://web.archive.org/web/20161004145950/http://docs.datastax.com:80/en/latest-upgrade/upgrade/cassandra/upgrdCassandraDetails.html
>
>
> > What is the most stable 3.0.X version, is it the latest?
> >
>
> You'll want to wait for 3.0.14 if you can, it'll have 2 very important
> fixes in it. ETA is probably early next week.
>
>
>
>
>


Re: Upgrade from 3.0.6, where's the documentation?

2017-06-15 Thread Jeff Jirsa


On 2017-06-14 07:05 (-0700), Riccardo Ferrari  wrote: 
> Hi list,
> 
> It's been a while since I upgraded my C* to 3.0.6; nevertheless I would
> like to give TWCS a try (available since 3.0.7).
> 
> What happened to the upgrade documentation? I used to read a
> step-by-step procedure from DataStax, but it looks like they are not
> supporting it anymore; on the flip side, I can't find anything meaningful
> on the cassandra.apache.org website. What am I missing?
> 

The cassandra.apache.org docs are a work in progress. 

In the mean time, the old cached versions of the datastax pages are reasonably 
accessible:

https://web.archive.org/web/20161004153529/http://docs.datastax.com:80/en/latest-upgrade/upgrade/cassandra/upgrdBestPractCassandra.html

https://web.archive.org/web/20161004145950/http://docs.datastax.com:80/en/latest-upgrade/upgrade/cassandra/upgrdCassandraDetails.html


> What is the most stable 3.0.X version, is it the latest?
> 

You'll want to wait for 3.0.14 if you can, it'll have 2 very important fixes in 
it. ETA is probably early next week. 







Re: Bottleneck for small inserts?

2017-06-15 Thread Eric Pederson
Here are a couple of iostat snapshots showing the spikes in disk queue size
(in these cases correlating with spikes in w/s and %util):

Device:   rrqm/s    wrqm/s     r/s       w/s     rsec/s     wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda         0.00      5.63    0.00      2.33      0.00      63.73     27.31      0.00   0.57   0.41   0.10
sdb         0.00      0.00   48.03  17990.63   3679.73  143925.07      8.18     23.39   1.30   0.01  22.57
dm-0        0.00      0.00    0.00      0.30      0.00       2.40      8.00      0.00   2.00   0.67   0.02
dm-2        0.00      0.00   48.03  17990.63   3679.73  143925.07      8.18     23.56   1.30   0.01  22.83
dm-3        0.00      0.00    0.00      7.67      0.00      61.33      8.00      0.00   0.44   0.10   0.08



Device:   rrqm/s    wrqm/s     r/s       w/s     rsec/s     wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda         0.00      2.10    0.00      1.33      0.00      27.47     20.60      0.00   0.25   0.18   0.02
sdb         0.00  16309.00  109.43   2714.23   2609.87  152186.40     54.82     11.44   4.05   0.08  23.54
dm-0        0.00      0.00    0.00      0.10      0.00       0.80      8.00      0.00   0.00   0.00   0.00
dm-2        0.00      0.00  109.43  19023.30   2609.87  152186.40      8.09    273.89  14.30   0.01  23.64
dm-3        0.00      0.00    0.00      3.33      0.00      26.67      8.00      0.00   0.25   0.07   0.02


-- Eric

On Wed, Jun 14, 2017 at 11:17 PM, Eric Pederson  wrote:

> Using cassandra-stress with the out of the box schema I am seeing around
> 140k rows/second throughput using 1 client on each of 3 client machines.
> On the servers:
>
>- CPU utilization: 43% usr/20% sys, 55%/28%, 70%/10% (the last number
>is the older box)
>- Inbound network traffic: 174 Mbps, 190 Mbps, 178 Mbps
>- Disk writes/sec: ~10k each server
>- Disk utilization is in the low single digits but spikes up to 50%
>- Disk queue size is in the low single digits but spikes up into the
>mid hundreds. I even saw it in the thousands. I had not noticed this
>before.
>
> The disk stats come from iostat -xz 1.  Given the low reported
> utilization percentages I would not expect to see any disk queue buildup,
> even in the low single digits.
>
> Going to 2 cassandra-stress clients per machine the throughput dropped to
> 133k rows/sec.
>
>- CPU utilization: 13% usr/5% sys, 15%/25%, 40%/22% on the older box
>- Inbound network RX: 100Mbps, 125Mbps, 120Mbps
>- Disk utilization is a little lower, but with the same spiky behavior
>
> Going to 3 cassandra-stress clients per machine the throughput dropped to
> 110k rows/sec
>
>- CPU utilization: 15% usr/20% sys,  15%/20%, 40%/20% on the older box
>- Inbound network RX dropped to 130 Mbps
>- Disk utilization stayed roughly the same
>
> I noticed that with the standard cassandra-stress schema GC is not an
> issue.   But with my application-specific schema there is a lot of GC on
> the slower box.  Also with the application-specific schema I can't seem to
> get past 36k rows/sec.   The application schema has 64 columns (mostly
> ints) and the key is (date,sequence#).   The standard stress schema has a
> lot fewer columns and no clustering column.
>
> Thanks,
>
>
>
> -- Eric
>
> On Wed, Jun 14, 2017 at 1:47 AM, Eric Pederson  wrote:
>
>> Shoot - I didn't see that one.  I subscribe to the digest but was
>> focusing on the direct replies and accidentally missed Patrick and Jeff
>> Jirsa's messages.  Sorry about that...
>>
>> I've been using a combination of cassandra-stress, cqlsh COPY FROM and a
>> custom C++ application for my ingestion testing.   My default setting for
>> my custom client application is 96 threads, and then by default I run one
>> client application process on each of 3 machines.  I tried
>> doubling/quadrupling the number of client threads (and doubling/tripling
>> the number of client processes but keeping the threads per process the
>> same) but didn't see any change.   If I recall correctly I started getting
>> timeouts after I went much beyond concurrent_writes which is 384 (for a 48
>> CPU box) - meaning at 500 threads per client machine I started seeing
>> timeouts.I'll try again to be sure.
>>
>> For the purposes of this conversation I will try to always use
>> cassandra-stress to keep the number of unknowns limited.  I will run
>> more cassandra-stress clients tomorrow in line with Patrick's 3-5 per
>> server recommendation.
>>
>> Thanks!
>>
>>
>> -- Eric
>>
>> On Wed, Jun 14, 2017 at 12:40 AM, Jonathan Haddad 
>> wrote:
>>
>>> Did you try adding more client stress nodes as Patrick recommended?
>>>
>>> On Tue, Jun 13, 2017 at 9:31 PM Eric Pederson  wrote:
>>>
 Scratch that theory - the flamegraphs show that GC is only 3-4% of the two
 newer machines' overall processing, compared to 18% on the slow machine.

 I took that machine out of the cluster completely and recreated the
 

UNSUBSCRIBE

2017-06-15 Thread Patten, David E
UNSUBSCRIBE



Re: Reaper v0.6.1 released

2017-06-15 Thread Matthew O'Riordan
Awesome indeed.

On Thu, Jun 15, 2017 at 11:34 AM, Carlos Rolo  wrote:

> Great!
>
> Thanks a lot!
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> linkedin.com/in/carlosjuzarterolo
> Mobile: +351 918 918 100
> www.pythian.com
>
> On Thu, Jun 15, 2017 at 7:56 AM, Aiman Parvaiz 
> wrote:
>
>> Great work!! Thanks
>>
>> Sent from my iPhone
>>
>> On Jun 14, 2017, at 11:30 PM, Shalom Sagges 
>> wrote:
>>
>> That's awesome!! Thanks for contributing! 
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>>  We Create Meaningful Connections
>>
>>
>>
>> On Thu, Jun 15, 2017 at 2:32 AM, Jonathan Haddad 
>> wrote:
>>
>>> Hey folks!
>>>
>>> I'm proud to announce the 0.6.1 release of the Reaper project, the open
>>> source repair management tool for Apache Cassandra.
>>>
>>> This release improves the Cassandra backend significantly, making it a
>>> first class citizen for storing repair schedules and managing repair
>>> progress.  It's no longer necessary to manage a PostgreSQL DB in addition
>>> to your Cassandra DB.
>>>
>>> We've been very active since we forked the original Spotify repo.  Since
>>> this time we've added:
>>>
>>> * A native Cassandra backend
>>> * Support for versions > 2.0
>>> * Merged in the WebUI, maintained by Stefan Podkowinski (
>>> https://github.com/spodkowinski/cassandra-reaper-ui)
>>> * Support for incremental repair (probably best to avoid till Cassandra
>>> 4.0, see CASSANDRA-9143)
>>>
>>> We're excited to continue making improvements past the original intent
>>> of the project.  With the lack of Cassandra 3.0 support in OpsCenter,
>>> there's a gap that needs to be filled for tools that help with managing a
>>> cluster.  Alex Dejanovski showed me a prototype he recently put together
>>> for a really nice view into cluster health.  We're also looking to add in
>>> support for common cluster operations like snapshots, upgradesstables, cleanup,
>>> and setting options at runtime.
>>>
>>> Grab it here: https://github.com/thelastpickle/cassandra-reaper
>>>
>>> Feedback / bug reports / ideas are very much appreciated.
>>>
>>> We have a dedicated, low-traffic ML here:
>>> https://groups.google.com/forum/#!forum/tlp-apache-cassandra-reaper-users
>>>
>>> Jon Haddad
>>> Principal Consultant, The Last Pickle
>>> http://thelastpickle.com/
>>>
>>
>>
>>
>>
>


-- 

Regards,

Matthew O'Riordan
CEO who codes
Ably - simply better realtime 

Ably News: Ably push notifications have gone live


Re: Reaper v0.6.1 released

2017-06-15 Thread Carlos Rolo
Great!

Thanks a lot!

Regards,

Carlos Juzarte Rolo
Cassandra Consultant / Datastax Certified Architect / Cassandra MVP

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
linkedin.com/in/carlosjuzarterolo
Mobile: +351 918 918 100
www.pythian.com

On Thu, Jun 15, 2017 at 7:56 AM, Aiman Parvaiz  wrote:

> Great work!! Thanks
>
> Sent from my iPhone
>
> On Jun 14, 2017, at 11:30 PM, Shalom Sagges 
> wrote:
>
> That's awesome!! Thanks for contributing! 
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
>  We Create Meaningful Connections
>
>
>
> On Thu, Jun 15, 2017 at 2:32 AM, Jonathan Haddad 
> wrote:
>
>> Hey folks!
>>
>> I'm proud to announce the 0.6.1 release of the Reaper project, the open
>> source repair management tool for Apache Cassandra.
>>
>> This release improves the Cassandra backend significantly, making it a
>> first class citizen for storing repair schedules and managing repair
>> progress.  It's no longer necessary to manage a PostgreSQL DB in addition
>> to your Cassandra DB.
>>
>> We've been very active since we forked the original Spotify repo.  Since
>> this time we've added:
>>
>> * A native Cassandra backend
>> * Support for versions > 2.0
>> * Merged in the WebUI, maintained by Stefan Podkowinski (
>> https://github.com/spodkowinski/cassandra-reaper-ui)
>> * Support for incremental repair (probably best to avoid till Cassandra
>> 4.0, see CASSANDRA-9143)
>>
>> We're excited to continue making improvements past the original intent of
>> the project.  With the lack of Cassandra 3.0 support in OpsCenter, there's
>> a gap that needs to be filled for tools that help with managing a cluster.
>> Alex Dejanovski showed me a prototype he recently put together for a really
>> nice view into cluster health.  We're also looking to add in support for
>> common cluster operations like snapshots, upgradesstables, cleanup, and setting
>> options at runtime.
>>
>> Grab it here: https://github.com/thelastpickle/cassandra-reaper
>>
>> Feedback / bug reports / ideas are very much appreciated.
>>
>> We have a dedicated, low-traffic ML here:
>> https://groups.google.com/forum/#!forum/tlp-apache-cassandra-reaper-users
>>
>> Jon Haddad
>> Principal Consultant, The Last Pickle
>> http://thelastpickle.com/
>>
>
>
>
>


Re: Cassandra cost vs an RDBMS?

2017-06-15 Thread Dimitry Lvovsky
Ali,
An RDBMS is solving a different problem than Cassandra, so it's an
apples-to-oranges comparison from where I sit.  If you don't need Cassandra
(though it sounds like you do), then don't use it -- the extra nodes and
higher tiers of AWS are a function of the features it offers.  Imagine what
your costs would be with an RDBMS if you tried to replicate the features of
Cassandra.


- Dimitry




On Thu, Jun 15, 2017 at 10:17 AM, Ali Akhtar  wrote:

> A client recently inquired about the costs of running Cassandra vs a
> traditional RDBMS like Postgres or Mysql, in the cloud.
>
> They are releasing a b2b product similar to Slack, Trello, etc which will
> have a free tier. And they're concerned about the costs of running it on
> Cassandra, and whether it may be too expensive if it gets popular.
>
> They have a write heavy workload, where data is being received 24/7,
> analyzed and the results written to Cassandra. A few times a day, users
> will view the results of the analysis, which will be the read portion of
> the system.
>
> It's my understanding that it may cost slightly more, e.g. 10-15%, to run
> this system on Cassandra vs an RDBMS, because it needs more nodes and a
> higher tier of AWS / GCE instances.
>
> Can anyone who has experience scaling Cassandra share their insights?
>
> Costs, metrics (e.g users, requests per second), etc would be really
> helpful!
>


Cassandra cost vs an RDBMS?

2017-06-15 Thread Ali Akhtar
A client recently inquired about the costs of running Cassandra vs a
traditional RDBMS like Postgres or Mysql, in the cloud.

They are releasing a b2b product similar to Slack, Trello, etc which will
have a free tier. And they're concerned about the costs of running it on
Cassandra, and whether it may be too expensive if it gets popular.

They have a write heavy workload, where data is being received 24/7,
analyzed and the results written to Cassandra. A few times a day, users
will view the results of the analysis, which will be the read portion of
the system.

It's my understanding that it may cost slightly more, e.g. 10-15%, to run this
system on Cassandra vs an RDBMS, because it needs more nodes and a higher
tier of AWS / GCE instances.

Can anyone who has experience scaling Cassandra share their insights?

Costs, metrics (e.g users, requests per second), etc would be really
helpful!


Re: Reaper v0.6.1 released

2017-06-15 Thread Aiman Parvaiz
Great work!! Thanks

Sent from my iPhone

On Jun 14, 2017, at 11:30 PM, Shalom Sagges wrote:

That's awesome!! Thanks for contributing! 


Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections





On Thu, Jun 15, 2017 at 2:32 AM, Jonathan Haddad wrote:
Hey folks!

I'm proud to announce the 0.6.1 release of the Reaper project, the open source 
repair management tool for Apache Cassandra.

This release improves the Cassandra backend significantly, making it a first 
class citizen for storing repair schedules and managing repair progress.  It's 
no longer necessary to manage a PostgreSQL DB in addition to your Cassandra DB.

We've been very active since we forked the original Spotify repo.  Since this 
time we've added:

* A native Cassandra backend
* Support for versions > 2.0
* Merged in the WebUI, maintained by Stefan Podkowinski 
(https://github.com/spodkowinski/cassandra-reaper-ui)
* Support for incremental repair (probably best to avoid till Cassandra 4.0, 
see CASSANDRA-9143)

We're excited to continue making improvements past the original intent of the 
project.  With the lack of Cassandra 3.0 support in OpsCenter, there's a gap 
that needs to be filled for tools that help with managing a cluster.  Alex 
Dejanovski showed me a prototype he recently put together for a really nice 
view into cluster health.  We're also looking to add in support for common 
cluster operations like snapshots, upgradesstables, cleanup, and setting options at 
runtime.

Grab it here: https://github.com/thelastpickle/cassandra-reaper

Feedback / bug reports / ideas are very much appreciated.

We have a dedicated, low traffic ML here: 
https://groups.google.com/forum/#!forum/tlp-apache-cassandra-reaper-users

Jon Haddad
Principal Consultant, The Last Pickle
http://thelastpickle.com/




Re: Reaper v0.6.1 released

2017-06-15 Thread Shalom Sagges
That's awesome!! Thanks for contributing! 


Shalom Sagges
DBA
T: +972-74-700-4035
 
 We Create Meaningful Connections



On Thu, Jun 15, 2017 at 2:32 AM, Jonathan Haddad  wrote:

> Hey folks!
>
> I'm proud to announce the 0.6.1 release of the Reaper project, the open
> source repair management tool for Apache Cassandra.
>
> This release improves the Cassandra backend significantly, making it a
> first class citizen for storing repair schedules and managing repair
> progress.  It's no longer necessary to manage a PostgreSQL DB in addition
> to your Cassandra DB.
>
> We've been very active since we forked the original Spotify repo.  Since
> this time we've added:
>
> * A native Cassandra backend
> * Support for versions > 2.0
> * Merged in the WebUI, maintained by Stefan Podkowinski (
> https://github.com/spodkowinski/cassandra-reaper-ui)
> * Support for incremental repair (probably best to avoid till Cassandra
> 4.0, see CASSANDRA-9143)
>
> We're excited to continue making improvements past the original intent of
> the project.  With the lack of Cassandra 3.0 support in OpsCenter, there's
> a gap that needs to be filled for tools that help with managing a cluster.
> Alex Dejanovski showed me a prototype he recently put together for a really
> nice view into cluster health.  We're also looking to add in support for
> common cluster operations like snapshots, upgradesstables, cleanup, and setting
> options at runtime.
>
> Grab it here: https://github.com/thelastpickle/cassandra-reaper
>
> Feedback / bug reports / ideas are very much appreciated.
>
> We have a dedicated, low-traffic ML here:
> https://groups.google.com/forum/#!forum/tlp-apache-cassandra-reaper-users
>
> Jon Haddad
> Principal Consultant, The Last Pickle
> http://thelastpickle.com/
>
