cqlsh COPY ... TO ... doesn't work if one node down

2018-06-29 Thread Dmitry Simonov
Hello!

I have a Cassandra cluster with 5 nodes.
There is a (relatively small) keyspace X with RF = 5.
One node is down.

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.0.0.82   253.64 MB  256     100.0%            839bef9d-79af-422c-a21f-33bdcf4493c1  rack1
UN  10.0.0.154  255.92 MB  256     100.0%            ce23f3a7-67d2-47c0-9ece-7a5dd67c4105  rack1
UN  10.0.0.76   461.26 MB  256     100.0%            c8e18603-0ede-43f0-b713-3ff47ad92323  rack1
UN  10.0.0.94   575.78 MB  256     100.0%            9a324dbc-5ae1-4788-80e4-d86dcaae5a4c  rack1
DN  10.0.0.47   ?          256     100.0%            7b628ca2-4e47-457a-ba42-5191f7e5374b  rack1

I am trying to export some data using COPY TO, but it fails after long retries.
Why does it fail?
How can I make the copy?
There should be 4 copies of each row on the other (alive) replicas.
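
(Side note: which replicas own a given row can be double-checked with
nodetool getendpoints; the key value below is just a placeholder.)

nodetool getendpoints X Y some_partition_key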

cqlsh 10.0.0.154 -e "COPY X.Y TO 'backup/X.Y' WITH NUMPROCESSES=1"

Using 1 child processes

Starting copy of X.Y with columns [key, column1, value].
2018-06-29 19:12:23,661 Failed to create connection pool for new host 10.0.0.47:
Traceback (most recent call last):
  File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/cluster.py", line 2476, in run_add_or_renew_pool
    new_pool = HostConnection(host, distance, self)
  File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/pool.py", line 332, in __init__
    self._connection = session.cluster.connection_factory(host.address)
  File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/cluster.py", line 1205, in connection_factory
    return self.connection_class.factory(address, self.connect_timeout, *args, **kwargs)
  File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/connection.py", line 332, in factory
    conn = cls(host, *args, **kwargs)
  File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/io/asyncorereactor.py", line 344, in __init__
    self._connect_socket()
  File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/connection.py", line 371, in _connect_socket
    raise socket.error(sockerr.errno, "Tried connecting to %s. Last error: %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
OSError: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:12:23,665 Host 10.0.0.47 has been marked down
2018-06-29 19:12:29,674 Error attempting to reconnect to 10.0.0.47, scheduling retry in 2.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:12:36,684 Error attempting to reconnect to 10.0.0.47, scheduling retry in 4.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:12:45,696 Error attempting to reconnect to 10.0.0.47, scheduling retry in 8.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:12:58,716 Error attempting to reconnect to 10.0.0.47, scheduling retry in 16.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:13:19,756 Error attempting to reconnect to 10.0.0.47, scheduling retry in 32.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:13:56,834 Error attempting to reconnect to 10.0.0.47, scheduling retry in 64.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:15:05,887 Error attempting to reconnect to 10.0.0.47, scheduling retry in 128.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:17:18,982 Error attempting to reconnect to 10.0.0.47, scheduling retry in 256.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
2018-06-29 19:21:40,064 Error attempting to reconnect to 10.0.0.47, scheduling retry in 512.0 seconds: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last error: timed out
:1:(4, 'Interrupted system call')
IOError:
IOError:
IOError:
IOError:
IOError:


-- 
Best Regards,
Dmitry Simonov


Re: [EXTERNAL] Re: consultant recommendations

2018-06-29 Thread Joe Schwartz
Aaron Morton at the Last Pickle is solid; he knows his stuff.

Also, like Sean said, nothing against Instaclustr; they are good folks
too!

Joe


Joseph B. Schwartz
Western Region Sales Director
Mobile: 408-316-0289

On Fri, Jun 29, 2018 at 11:46 AM, Durity, Sean R <
sean_r_dur...@homedepot.com> wrote:

> I haven’t ever hired a Cassandra consultant, but the company named The
> Last Pickle (yes, an odd name) has some outstanding Cassandra experts. Not
> sure how they work, but worth a mention here.
>
>
>
> Nothing against Instaclustr. There are great folks there, too.
>
>
>
>
>
> Sean Durity
>
>
>
> *From:* Evelyn Smith 
> *Sent:* Friday, June 29, 2018 1:54 PM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Re: consultant recommendations
>
>
>
> Hey Randy
>
>
>
> Instaclustr provides consulting services for Cassandra as well as managed
> services if you are looking to offload the admin burden.
>
>
>
> https://www.instaclustr.com/services/cassandra-consulting/
> 
>
>
>
> Alternatively, send me an email at evelyn.ba...@instaclustr.com and I’d
> be happy to chase this up on Monday with the head of consulting (it’s
> Friday night my time).
>
>
>
> Cheers,
>
> Evelyn.
>
>
>
> On 30 Jun 2018, at 2:26 am, Randy Lynn  wrote:
>
>
>
> Having some OOM issues. Would love to get feedback from the group on what
> companies/consultants you might use?
>
> --
>
> *Randy Lynn *
> rl...@getavail.com
>
> office:
>
> 859.963.1616 <+1-859-963-1616>ext 202
>
> 163 East Main Street - Lexington, KY 40507 - USA
> 
>
>
> getavail.com
> 
>
>
>


RE: [EXTERNAL] Re: consultant recommendations

2018-06-29 Thread Durity, Sean R
I haven’t ever hired a Cassandra consultant, but the company named The Last 
Pickle (yes, an odd name) has some outstanding Cassandra experts. Not sure how 
they work, but worth a mention here.

Nothing against Instaclustr. There are great folks there, too.


Sean Durity

From: Evelyn Smith 
Sent: Friday, June 29, 2018 1:54 PM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: consultant recommendations

Hey Randy

Instaclustr provides consulting services for Cassandra as well as managed 
services if you are looking to offload the admin burden.

https://www.instaclustr.com/services/cassandra-consulting/

Alternatively, send me an email at 
evelyn.ba...@instaclustr.com and I’d be 
happy to chase this up on Monday with the head of consulting (it’s Friday night 
my time).

Cheers,
Evelyn.

On 30 Jun 2018, at 2:26 am, Randy Lynn <rl...@getavail.com> wrote:

Having some OOM issues. Would love to get feedback from the group on what 
companies/consultants you might use?

--
Randy Lynn
rl...@getavail.com

office:

859.963.1616  ext 202


163 East Main Street - Lexington, KY 40507 - USA


getavail.com




Re: consultant recommendations

2018-06-29 Thread Evelyn Smith
Hey Randy

Instaclustr provides consulting services for Cassandra as well as managed 
services if you are looking to offload the admin burden.

https://www.instaclustr.com/services/cassandra-consulting/ 


Alternatively, send me an email at evelyn.ba...@instaclustr.com and I’d be 
happy to chase this up on Monday with the head of consulting (it’s Friday night 
my time).

Cheers,
Evelyn.

> On 30 Jun 2018, at 2:26 am, Randy Lynn  wrote:
> 
> Having some OOM issues. Would love to get feedback from the group on what 
> companies/consultants you might use?
> 
> -- 
> Randy Lynn 
> rl...@getavail.com <> 
> 
> office: 
> 859.963.1616  ext 202 
> 163 East Main Street - Lexington, KY 40507 - USA 
> 
>    getavail.com 


consultant recommendations

2018-06-29 Thread Randy Lynn
Having some OOM issues. Would love to get feedback from the group on what
companies/consultants you might use?

-- 
Randy Lynn
rl...@getavail.com

office:
859.963.1616 <+1-859-963-1616> ext 202
163 East Main Street - Lexington, KY 40507 - USA

 getavail.com 


Re: C* in multiple AWS AZ's

2018-06-29 Thread Pradeep Chhetri
Ohh, I see now. It makes sense. Thanks a lot.

On Fri, Jun 29, 2018 at 9:17 PM, Randy Lynn  wrote:

> data is only lost if you stop the node. between restarts the storage is
> fine.
>
> On Fri, Jun 29, 2018 at 10:39 AM, Pradeep Chhetri 
> wrote:
>
>> Isnt NVMe storage an instance storage ie. the data will be lost in case
>> the instance restarts. How are you going to make sure that there is no data
>> loss in case instance gets rebooted?
>>
>> On Fri, 29 Jun 2018 at 7:00 PM, Randy Lynn  wrote:
>>
>>> GPFS - Rahul FTW! Thank you for your help!
>>>
>>> Yes, Pradeep - migrating to i3 from r3. moving for NVMe storage, I did
>>> not have the benefit of doing benchmarks.. but we're moving from 1,500 IOPS
>>> so I intrinsically know we'll get better throughput.
>>>
>>> On Fri, Jun 29, 2018 at 7:21 AM, Rahul Singh <
>>> rahul.xavier.si...@gmail.com> wrote:
>>>
 Totally agree. GPFS for the win. EC2 multi region snitch is an
 automation tool like Ansible or Puppet. Unless you have two orders of
 magnitude more servers than you do now, you don’t need it.

 Rahul
 On Jun 29, 2018, 6:18 AM -0400, kurt greaves ,
 wrote:

 Yes. You would just end up with a rack named differently to the AZ.
 This is not a problem as racks are just logical. I would recommend
 migrating all your DCs to GPFS though for consistency.

 On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:

> So we have two data centers already running..
>
> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site
> tunnel.. I'm wanting to move the current US-EAST from AZ 1a to 1e..
> I know all docs say use ec2multiregion for multi-DC.
>
> I like the GPFS idea. would that work with the multi-DC too?
> What's the downside? status would report rack of 1a, even though in 1e?
>
> Thanks in advance for the help/thoughts!!
>
>
> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
> wrote:
>
>> There is a need for a repair with both DCs as rebuild will not stream
>> all replicas, so unless you can guarantee you were perfectly consistent 
>> at
>> time of rebuild you'll want to do a repair after rebuild.
>>
>> On another note you could just replace the nodes but use GPFS instead
>> of EC2 snitch, using the same rack name.
>>
>> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, <
>> rahul.xavier.si...@gmail.com> wrote:
>>
>>> Parallel load is the best approach and then switch your Data access
>>> code to only access the new hardware. After you verify that there are no
>>> local read / writes on the OLD dc and that the updates are only via 
>>> Gossip,
>>> then go ahead and change the replication factor on the key space to have
>>> zero replicas in the old DC. Then you can decommissioned.
>>>
>>> This way you are hundred percent sure that you aren’t missing any
>>> new data. No need for a DC to DC repair but a repair is always healthy.
>>>
>>> Rahul
>>> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn ,
>>> wrote:
>>>
>>> Already running with Ec2.
>>>
>>> My original thought was a new DC parallel to the current, and then
>>> decommission the other DC.
>>>
>>> Also my data load is small right now.. I know small is relative
>>> term.. each node is carrying about 6GB..
>>>
>>> So given the data size, would you go with parallel DC or let the new
>>> AZ carry a heavy load until the others are migrated over?
>>> and then I think "repair" to cleanup the replications?
>>>
>>>
>>> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
>>> rahul.xavier.si...@gmail.com> wrote:
>>>
 You don’t have to use EC2 snitch on AWS but if you have already
 started with it , it may put a node in a different DC.

 If your data density won’t be ridiculous You could add 3 to
 different DC/ Region and then sync up. After the new DC is operational 
 you
 can remove one at a time on the old DC and at the same time add to the 
 new
 one.

 Rahul
 On Jun 28, 2018, 9:03 AM -0500, Randy Lynn ,
 wrote:

 I have a 6-node cluster I'm migrating to the new i3 types.
 But at the same time I want to migrate to a different AZ.

 What happens if I do the "running node replace method" with 1 node
 at a time moving to the new AZ. Meaning, I'll have temporarily;

 5 nodes in AZ 1c
 1 new node in AZ 1e.

 I'll wash-rinse-repeat till all 6 are on the new machine type and
 in the new AZ.

 Any thoughts about whether this gets weird with the Ec2Snitch and a
 RF 3?

 --
 Randy Lynn
 rl...@getavail.com

 office:
 859.963.1616 <+1-859-963-1616> ext 202
 163 East Main Street - Lexington, 

Re: C* in multiple AWS AZ's

2018-06-29 Thread Randy Lynn
Data is only lost if you stop the node. Between restarts the storage is
fine.

On Fri, Jun 29, 2018 at 10:39 AM, Pradeep Chhetri 
wrote:

> Isnt NVMe storage an instance storage ie. the data will be lost in case
> the instance restarts. How are you going to make sure that there is no data
> loss in case instance gets rebooted?
>
> On Fri, 29 Jun 2018 at 7:00 PM, Randy Lynn  wrote:
>
>> GPFS - Rahul FTW! Thank you for your help!
>>
>> Yes, Pradeep - migrating to i3 from r3. moving for NVMe storage, I did
>> not have the benefit of doing benchmarks.. but we're moving from 1,500 IOPS
>> so I intrinsically know we'll get better throughput.
>>
>> On Fri, Jun 29, 2018 at 7:21 AM, Rahul Singh <
>> rahul.xavier.si...@gmail.com> wrote:
>>
>>> Totally agree. GPFS for the win. EC2 multi region snitch is an
>>> automation tool like Ansible or Puppet. Unless you have two orders of
>>> magnitude more servers than you do now, you don’t need it.
>>>
>>> Rahul
>>> On Jun 29, 2018, 6:18 AM -0400, kurt greaves ,
>>> wrote:
>>>
>>> Yes. You would just end up with a rack named differently to the AZ. This
>>> is not a problem as racks are just logical. I would recommend migrating all
>>> your DCs to GPFS though for consistency.
>>>
>>> On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:
>>>
 So we have two data centers already running..

 AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site
 tunnel.. I'm wanting to move the current US-EAST from AZ 1a to 1e..
 I know all docs say use ec2multiregion for multi-DC.

 I like the GPFS idea. would that work with the multi-DC too?
 What's the downside? status would report rack of 1a, even though in 1e?

 Thanks in advance for the help/thoughts!!


 On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
 wrote:

> There is a need for a repair with both DCs as rebuild will not stream
> all replicas, so unless you can guarantee you were perfectly consistent at
> time of rebuild you'll want to do a repair after rebuild.
>
> On another note you could just replace the nodes but use GPFS instead
> of EC2 snitch, using the same rack name.
>
> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, <
> rahul.xavier.si...@gmail.com> wrote:
>
>> Parallel load is the best approach and then switch your Data access
>> code to only access the new hardware. After you verify that there are no
>> local read / writes on the OLD dc and that the updates are only via 
>> Gossip,
>> then go ahead and change the replication factor on the key space to have
>> zero replicas in the old DC. Then you can decommissioned.
>>
>> This way you are hundred percent sure that you aren’t missing any new
>> data. No need for a DC to DC repair but a repair is always healthy.
>>
>> Rahul
>> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn ,
>> wrote:
>>
>> Already running with Ec2.
>>
>> My original thought was a new DC parallel to the current, and then
>> decommission the other DC.
>>
>> Also my data load is small right now.. I know small is relative
>> term.. each node is carrying about 6GB..
>>
>> So given the data size, would you go with parallel DC or let the new
>> AZ carry a heavy load until the others are migrated over?
>> and then I think "repair" to cleanup the replications?
>>
>>
>> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
>> rahul.xavier.si...@gmail.com> wrote:
>>
>>> You don’t have to use EC2 snitch on AWS but if you have already
>>> started with it , it may put a node in a different DC.
>>>
>>> If your data density won’t be ridiculous You could add 3 to
>>> different DC/ Region and then sync up. After the new DC is operational 
>>> you
>>> can remove one at a time on the old DC and at the same time add to the 
>>> new
>>> one.
>>>
>>> Rahul
>>> On Jun 28, 2018, 9:03 AM -0500, Randy Lynn ,
>>> wrote:
>>>
>>> I have a 6-node cluster I'm migrating to the new i3 types.
>>> But at the same time I want to migrate to a different AZ.
>>>
>>> What happens if I do the "running node replace method" with 1 node
>>> at a time moving to the new AZ. Meaning, I'll have temporarily;
>>>
>>> 5 nodes in AZ 1c
>>> 1 new node in AZ 1e.
>>>
>>> I'll wash-rinse-repeat till all 6 are on the new machine type and in
>>> the new AZ.
>>>
>>> Any thoughts about whether this gets weird with the Ec2Snitch and a
>>> RF 3?
>>>
>>> --
>>> Randy Lynn
>>> rl...@getavail.com
>>>
>>> office:
>>> 859.963.1616 <+1-859-963-1616> ext 202
>>> 163 East Main Street - Lexington, KY 40507 - USA
>>> 
>>>
>>>  getavail.com 
>>>
>>>
>>
>>
>> --

Re: Cassandra read/sec and write/sec

2018-06-29 Thread Eric Evans
On Thu, Jun 28, 2018 at 5:19 PM Abdul Patel  wrote:
>
> Hi all
>
> We use Prometheus to monitor Cassandra and then put it on Grafana for
> dashboards.
> What's the parameter to measure throughput of Cassandra?

I'm not sure how you're getting metrics from Cassandra to Prometheus,
or, if you're using the JMX exporter agent, how your configuration
might be rewriting metric names, so I'll just refer you to the
documentation of the JMX metrics in the hope you can infer what you
need from that.

http://cassandra.apache.org/doc/latest/operating/metrics.html

There are metrics that can tell you the number of queries being made,
broken down by request type:

http://cassandra.apache.org/doc/latest/operating/metrics.html#client-request-metrics

Use table metrics if you want the rates of reads/writes against the
data that those client requests translate into:

http://cassandra.apache.org/doc/latest/operating/metrics.html#table-metrics

Another measure of throughput you might want to monitor is compaction.
For example, the rate of compactions, and the number of bytes
compacted, are both of interest.  Pending compactions can give you an
idea of saturation.

http://cassandra.apache.org/doc/latest/operating/metrics.html#compaction-metrics
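
As a concrete starting point, these are the MBean names from the docs above
(keyspace/table are placeholders you would substitute): graphing the rate of
the request/latency counters gives you reads/sec and writes/sec, and the
compaction ones cover the saturation side.

  org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency
  org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency
  org.apache.cassandra.metrics:type=Table,keyspace=<ks>,scope=<table>,name=ReadLatency
  org.apache.cassandra.metrics:type=Table,keyspace=<ks>,scope=<table>,name=WriteLatency
  org.apache.cassandra.metrics:type=Compaction,name=PendingTasks
  org.apache.cassandra.metrics:type=Compaction,name=BytesCompacted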

Hope this helps.

-- 
Eric Evans
john.eric.ev...@gmail.com

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: C* in multiple AWS AZ's

2018-06-29 Thread Pradeep Chhetri
Isn't NVMe storage instance storage, i.e. the data will be lost in case the
instance restarts? How are you going to make sure that there is no data
loss in case the instance gets rebooted?

On Fri, 29 Jun 2018 at 7:00 PM, Randy Lynn  wrote:

> GPFS - Rahul FTW! Thank you for your help!
>
> Yes, Pradeep - migrating to i3 from r3. moving for NVMe storage, I did not
> have the benefit of doing benchmarks.. but we're moving from 1,500 IOPS so
> I intrinsically know we'll get better throughput.
>
> On Fri, Jun 29, 2018 at 7:21 AM, Rahul Singh  > wrote:
>
>> Totally agree. GPFS for the win. EC2 multi region snitch is an automation
>> tool like Ansible or Puppet. Unless you have two orders of magnitude more
>> servers than you do now, you don’t need it.
>>
>> Rahul
>> On Jun 29, 2018, 6:18 AM -0400, kurt greaves ,
>> wrote:
>>
>> Yes. You would just end up with a rack named differently to the AZ. This
>> is not a problem as racks are just logical. I would recommend migrating all
>> your DCs to GPFS though for consistency.
>>
>> On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:
>>
>>> So we have two data centers already running..
>>>
>>> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site
>>> tunnel.. I'm wanting to move the current US-EAST from AZ 1a to 1e..
>>> I know all docs say use ec2multiregion for multi-DC.
>>>
>>> I like the GPFS idea. would that work with the multi-DC too?
>>> What's the downside? status would report rack of 1a, even though in 1e?
>>>
>>> Thanks in advance for the help/thoughts!!
>>>
>>>
>>> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
>>> wrote:
>>>
 There is a need for a repair with both DCs as rebuild will not stream
 all replicas, so unless you can guarantee you were perfectly consistent at
 time of rebuild you'll want to do a repair after rebuild.

 On another note you could just replace the nodes but use GPFS instead
 of EC2 snitch, using the same rack name.

 On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
 wrote:

> Parallel load is the best approach and then switch your Data access
> code to only access the new hardware. After you verify that there are no
> local read / writes on the OLD dc and that the updates are only via 
> Gossip,
> then go ahead and change the replication factor on the key space to have
> zero replicas in the old DC. Then you can decommissioned.
>
> This way you are hundred percent sure that you aren’t missing any new
> data. No need for a DC to DC repair but a repair is always healthy.
>
> Rahul
> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn ,
> wrote:
>
> Already running with Ec2.
>
> My original thought was a new DC parallel to the current, and then
> decommission the other DC.
>
> Also my data load is small right now.. I know small is relative term..
> each node is carrying about 6GB..
>
> So given the data size, would you go with parallel DC or let the new
> AZ carry a heavy load until the others are migrated over?
> and then I think "repair" to cleanup the replications?
>
>
> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
> rahul.xavier.si...@gmail.com> wrote:
>
>> You don’t have to use EC2 snitch on AWS but if you have already
>> started with it , it may put a node in a different DC.
>>
>> If your data density won’t be ridiculous You could add 3 to different
>> DC/ Region and then sync up. After the new DC is operational you can 
>> remove
>> one at a time on the old DC and at the same time add to the new one.
>>
>> Rahul
>> On Jun 28, 2018, 9:03 AM -0500, Randy Lynn ,
>> wrote:
>>
>> I have a 6-node cluster I'm migrating to the new i3 types.
>> But at the same time I want to migrate to a different AZ.
>>
>> What happens if I do the "running node replace method" with 1 node at
>> a time moving to the new AZ. Meaning, I'll have temporarily;
>>
>> 5 nodes in AZ 1c
>> 1 new node in AZ 1e.
>>
>> I'll wash-rinse-repeat till all 6 are on the new machine type and in
>> the new AZ.
>>
>> Any thoughts about whether this gets weird with the Ec2Snitch and a
>> RF 3?
>>
>> --
>> Randy Lynn
>> rl...@getavail.com
>>
>> office:
>> 859.963.1616 <+1-859-963-1616> ext 202
>> 163 East Main Street - Lexington, KY 40507 - USA
>> 
>>
>>  getavail.com 
>>
>>
>
>
> --
> Randy Lynn
> rl...@getavail.com
>
> office:
> 859.963.1616 <+1-859-963-1616> ext 202
> 163 East Main Street - Lexington, KY 40507 - USA
> 
>
>  getavail.com 
>
>

Re: C* in multiple AWS AZ's

2018-06-29 Thread Randy Lynn
GPFS - Rahul FTW! Thank you for your help!

Yes, Pradeep - migrating to i3 from r3, moving for NVMe storage. I did not
have the benefit of doing benchmarks, but we're moving from 1,500 IOPS so
I intrinsically know we'll get better throughput.

On Fri, Jun 29, 2018 at 7:21 AM, Rahul Singh 
wrote:

> Totally agree. GPFS for the win. EC2 multi region snitch is an automation
> tool like Ansible or Puppet. Unless you have two orders of magnitude more
> servers than you do now, you don’t need it.
>
> Rahul
> On Jun 29, 2018, 6:18 AM -0400, kurt greaves ,
> wrote:
>
> Yes. You would just end up with a rack named differently to the AZ. This
> is not a problem as racks are just logical. I would recommend migrating all
> your DCs to GPFS though for consistency.
>
> On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:
>
>> So we have two data centers already running..
>>
>> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site tunnel..
>> I'm wanting to move the current US-EAST from AZ 1a to 1e..
>> I know all docs say use ec2multiregion for multi-DC.
>>
>> I like the GPFS idea. would that work with the multi-DC too?
>> What's the downside? status would report rack of 1a, even though in 1e?
>>
>> Thanks in advance for the help/thoughts!!
>>
>>
>> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
>> wrote:
>>
>>> There is a need for a repair with both DCs as rebuild will not stream
>>> all replicas, so unless you can guarantee you were perfectly consistent at
>>> time of rebuild you'll want to do a repair after rebuild.
>>>
>>> On another note you could just replace the nodes but use GPFS instead of
>>> EC2 snitch, using the same rack name.
>>>
>>> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
>>> wrote:
>>>
 Parallel load is the best approach and then switch your Data access
 code to only access the new hardware. After you verify that there are no
 local read / writes on the OLD dc and that the updates are only via Gossip,
 then go ahead and change the replication factor on the key space to have
 zero replicas in the old DC. Then you can decommissioned.

 This way you are hundred percent sure that you aren’t missing any new
 data. No need for a DC to DC repair but a repair is always healthy.

 Rahul
 On Jun 28, 2018, 9:15 AM -0500, Randy Lynn , wrote:

 Already running with Ec2.

 My original thought was a new DC parallel to the current, and then
 decommission the other DC.

 Also my data load is small right now.. I know small is relative term..
 each node is carrying about 6GB..

 So given the data size, would you go with parallel DC or let the new AZ
 carry a heavy load until the others are migrated over?
 and then I think "repair" to cleanup the replications?


 On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
 rahul.xavier.si...@gmail.com> wrote:

> You don’t have to use EC2 snitch on AWS but if you have already
> started with it , it may put a node in a different DC.
>
> If your data density won’t be ridiculous You could add 3 to different
> DC/ Region and then sync up. After the new DC is operational you can 
> remove
> one at a time on the old DC and at the same time add to the new one.
>
> Rahul
> On Jun 28, 2018, 9:03 AM -0500, Randy Lynn ,
> wrote:
>
> I have a 6-node cluster I'm migrating to the new i3 types.
> But at the same time I want to migrate to a different AZ.
>
> What happens if I do the "running node replace method" with 1 node at
> a time moving to the new AZ. Meaning, I'll have temporarily;
>
> 5 nodes in AZ 1c
> 1 new node in AZ 1e.
>
> I'll wash-rinse-repeat till all 6 are on the new machine type and in
> the new AZ.
>
> Any thoughts about whether this gets weird with the Ec2Snitch and a RF
> 3?
>
> --
> Randy Lynn
> rl...@getavail.com
>
> office:
> 859.963.1616 <+1-859-963-1616> ext 202
> 163 East Main Street - Lexington, KY 40507 - USA
> 
>
>  getavail.com 
>
>


 --
 Randy Lynn
 rl...@getavail.com

 office:
 859.963.1616 <+1-859-963-1616> ext 202
 163 East Main Street - Lexington, KY 40507 - USA
 

  getavail.com 


>>
>>
>> --
>> Randy Lynn
>> rl...@getavail.com
>>
>> office:
>> 859.963.1616 <+1-859-963-1616> ext 202
>> 163 East Main Street - Lexington, KY 40507 - USA
>> 
>>
>>  getavail.com 
>>
>


-- 
Randy Lynn
rl...@getavail.com

office:
859.963.1616 

Common knowledge on C* heap size/file _cache_size_in_mb/other RAM usage parameters

2018-06-29 Thread Vsevolod Filaretov
What are general community guidelines on setting up C* heap size,
file_cache_size_in_mb, offheap space usage and other RAM usage settings?
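
(To be explicit about which knobs are meant - the values below are just
placeholders, not recommendations: the heap is set in conf/cassandra-env.sh,
or via -Xms/-Xmx in conf/jvm.options on newer versions, and
file_cache_size_in_mb lives in conf/cassandra.yaml.)

  # conf/cassandra-env.sh (placeholder value)
  MAX_HEAP_SIZE="16G"

  # conf/cassandra.yaml (placeholder value)
  file_cache_size_in_mb: 512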

Are there any general guidelines like "if your total data size per node is
X and your median/max partition size is Y, your RAM usage settings had better
be Z or things will be kinda slow"? If not, are there any ways to deduce such
values other than trial and error?

Our case: 2.4 TB/node, median partition size 340 MB, max partition size 2 GB,
and we can't consider partition schema changes.

Thank you.


Re: C* in multiple AWS AZ's

2018-06-29 Thread Rahul Singh
Totally agree. GPFS for the win. EC2 multi region snitch is an automation tool 
like Ansible or Puppet. Unless you have two orders of magnitude more servers 
than you do now, you don’t need it.

Rahul
On Jun 29, 2018, 6:18 AM -0400, kurt greaves , wrote:
> Yes. You would just end up with a rack named differently to the AZ. This is 
> not a problem as racks are just logical. I would recommend migrating all your 
> DCs to GPFS though for consistency.
>
> > On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:
> > > So we have two data centers already running..
> > >
> > > AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site tunnel.. 
> > > I'm wanting to move the current US-EAST from AZ 1a to 1e..
> > > I know all docs say use ec2multiregion for multi-DC.
> > >
> > > I like the GPFS idea. would that work with the multi-DC too?
> > > What's the downside? status would report rack of 1a, even though in 1e?
> > >
> > > Thanks in advance for the help/thoughts!!
> > >
> > >
> > > > On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves  
> > > > wrote:
> > > > > There is a need for a repair with both DCs as rebuild will not stream 
> > > > > all replicas, so unless you can guarantee you were perfectly 
> > > > > consistent at time of rebuild you'll want to do a repair after 
> > > > > rebuild.
> > > > >
> > > > > On another note you could just replace the nodes but use GPFS instead 
> > > > > of EC2 snitch, using the same rack name.
> > > > >
> > > > > > On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
> > > > > >  wrote:
> > > > > > > Parallel load is the best approach and then switch your Data 
> > > > > > > access code to only access the new hardware. After you verify 
> > > > > > > that there are no local read / writes on the OLD dc and that the 
> > > > > > > updates are only via Gossip, then go ahead and change the 
> > > > > > > replication factor on the key space to have zero replicas in the 
> > > > > > > old DC. Then you can decommissioned.
> > > > > > >
> > > > > > > This way you are hundred percent sure that you aren’t missing any 
> > > > > > > new data. No need for a DC to DC repair but a repair is always 
> > > > > > > healthy.
> > > > > > >
> > > > > > > Rahul
> > > > > > > On Jun 28, 2018, 9:15 AM -0500, Randy Lynn , 
> > > > > > > wrote:
> > > > > > > > Already running with Ec2.
> > > > > > > >
> > > > > > > > My original thought was a new DC parallel to the current, and 
> > > > > > > > then decommission the other DC.
> > > > > > > >
> > > > > > > > Also my data load is small right now.. I know small is relative 
> > > > > > > > term.. each node is carrying about 6GB..
> > > > > > > >
> > > > > > > > So given the data size, would you go with parallel DC or let 
> > > > > > > > the new AZ carry a heavy load until the others are migrated 
> > > > > > > > over?
> > > > > > > > and then I think "repair" to cleanup the replications?
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh 
> > > > > > > > >  wrote:
> > > > > > > > > > You don’t have to use EC2 snitch on AWS but if you have 
> > > > > > > > > > already started with it , it may put a node in a different 
> > > > > > > > > > DC.
> > > > > > > > > >
> > > > > > > > > > If your data density won’t be ridiculous You could add 3 to 
> > > > > > > > > > different DC/ Region and then sync up. After the new DC is 
> > > > > > > > > > operational you can remove one at a time on the old DC and 
> > > > > > > > > > at the same time add to the new one.
> > > > > > > > > >
> > > > > > > > > > Rahul
> > > > > > > > > > On Jun 28, 2018, 9:03 AM -0500, Randy Lynn 
> > > > > > > > > > , wrote:
> > > > > > > > > > > I have a 6-node cluster I'm migrating to the new i3 types.
> > > > > > > > > > > But at the same time I want to migrate to a different AZ.
> > > > > > > > > > >
> > > > > > > > > > > What happens if I do the "running node replace method" 
> > > > > > > > > > > with 1 node at a time moving to the new AZ. Meaning, I'll 
> > > > > > > > > > > have temporarily;
> > > > > > > > > > >
> > > > > > > > > > > 5 nodes in AZ 1c
> > > > > > > > > > > 1 new node in AZ 1e.
> > > > > > > > > > >
> > > > > > > > > > > I'll wash-rinse-repeat till all 6 are on the new machine 
> > > > > > > > > > > type and in the new AZ.
> > > > > > > > > > >
> > > > > > > > > > > Any thoughts about whether this gets weird with the 
> > > > > > > > > > > Ec2Snitch and a RF 3?
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Randy Lynn
> > > > > > > > > > > rl...@getavail.com
> > > > > > > > > > >
> > > > > > > > > > > office:
> > > > > > > > > > > 859.963.1616 ext 202
> > > > > > > > > > > 163 East Main Street - Lexington, KY 40507 - USA
> > > > > > > > > > >
> > > > > > > > > > > getavail.com
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Randy Lynn
> > > > > > > > rl...@getavail.com
> > > > > > > >
> > > > > > > > office:
> > > > > > > > 859.963.1616 ext 202
> > > > > > > > 163 East Main 

Re: C* in multiple AWS AZ's

2018-06-29 Thread kurt greaves
Yes. You would just end up with a rack named differently to the AZ. This is
not a problem as racks are just logical. I would recommend migrating all
your DCs to GPFS though for consistency.
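
For anyone following along, a minimal GossipingPropertyFileSnitch sketch
(placeholder dc/rack values - the idea being you keep the same names the
Ec2Snitch was already reporting, e.g. rack "1a"):

  # conf/cassandra.yaml
  endpoint_snitch: GossipingPropertyFileSnitch

  # conf/cassandra-rackdc.properties
  dc=us-east
  rack=1a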

On Fri., 29 Jun. 2018, 09:04 Randy Lynn,  wrote:

> So we have two data centers already running..
>
> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site tunnel..
> I'm wanting to move the current US-EAST from AZ 1a to 1e..
> I know all docs say use ec2multiregion for multi-DC.
>
> I like the GPFS idea. would that work with the multi-DC too?
> What's the downside? status would report rack of 1a, even though in 1e?
>
> Thanks in advance for the help/thoughts!!
>
>
> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
> wrote:
>
>> There is a need for a repair with both DCs as rebuild will not stream all
>> replicas, so unless you can guarantee you were perfectly consistent at time
>> of rebuild you'll want to do a repair after rebuild.
>>
>> On another note you could just replace the nodes but use GPFS instead of
>> EC2 snitch, using the same rack name.
>>
>> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
>> wrote:
>>
>>> Parallel load is the best approach and then switch your Data access code
>>> to only access the new hardware. After you verify that there are no local
>>> read / writes on the OLD dc and that the updates are only via Gossip, then
>>> go ahead and change the replication factor on the key space to have zero
>>> replicas in the old DC. Then you can decommissioned.
>>>
>>> This way you are hundred percent sure that you aren’t missing any new
>>> data. No need for a DC to DC repair but a repair is always healthy.
>>>
>>> Rahul
>>> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn , wrote:
>>>
>>> Already running with Ec2.
>>>
>>> My original thought was a new DC parallel to the current, and then
>>> decommission the other DC.
>>>
>>> Also my data load is small right now.. I know small is relative term..
>>> each node is carrying about 6GB..
>>>
>>> So given the data size, would you go with parallel DC or let the new AZ
>>> carry a heavy load until the others are migrated over?
>>> and then I think "repair" to cleanup the replications?
>>>
>>>
>>> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
>>> rahul.xavier.si...@gmail.com> wrote:
>>>
 You don’t have to use EC2 snitch on AWS but if you have already started
 with it , it may put a node in a different DC.

 If your data density won’t be ridiculous You could add 3 to different
 DC/ Region and then sync up. After the new DC is operational you can remove
 one at a time on the old DC and at the same time add to the new one.

 Rahul
 On Jun 28, 2018, 9:03 AM -0500, Randy Lynn , wrote:

 I have a 6-node cluster I'm migrating to the new i3 types.
 But at the same time I want to migrate to a different AZ.

 What happens if I do the "running node replace method" with 1 node at a
 time moving to the new AZ. Meaning, I'll have temporarily;

 5 nodes in AZ 1c
 1 new node in AZ 1e.

 I'll wash-rinse-repeat till all 6 are on the new machine type and in
 the new AZ.

 Any thoughts about whether this gets weird with the Ec2Snitch and a RF
 3?

 --
 Randy Lynn
 rl...@getavail.com

 office:
 859.963.1616 <+1-859-963-1616> ext 202
 163 East Main Street - Lexington, KY 40507 - USA
 

  getavail.com 


>>>
>>>
>>> --
>>> Randy Lynn
>>> rl...@getavail.com
>>>
>>> office:
>>> 859.963.1616 <+1-859-963-1616> ext 202
>>> 163 East Main Street - Lexington, KY 40507 - USA
>>> 
>>>
>>>  getavail.com 
>>>
>>>
>
>
> --
> Randy Lynn
> rl...@getavail.com
>
> office:
> 859.963.1616 <+1-859-963-1616> ext 202
> 163 East Main Street - Lexington, KY 40507 - USA
>
>  getavail.com 
>


Re: C* in multiple AWS AZ's

2018-06-29 Thread Pradeep Chhetri
Just curious -

From which instance type are you migrating to i3, and what are the
reasons to move to the i3 type?

Are you going to take advantage of NVMe instance storage - if yes, how?

Since we are also migrating our cluster on AWS and are currently using
r4 instances, I was interested to know whether you did a comparison between
the r4 and i3 types.

Regards,
Pradeep

On Fri, Jun 29, 2018 at 4:49 AM, Randy Lynn  wrote:

> So we have two data centers already running..
>
> AP-SYDNEY, and US-EAST.. I'm using Ec2Snitch over a site-to-site tunnel..
> I'm wanting to move the current US-EAST from AZ 1a to 1e..
> I know all docs say use ec2multiregion for multi-DC.
>
> I like the GPFS idea. would that work with the multi-DC too?
> What's the downside? status would report rack of 1a, even though in 1e?
>
> Thanks in advance for the help/thoughts!!
>
>
> On Thu, Jun 28, 2018 at 6:20 PM, kurt greaves 
> wrote:
>
>> There is a need for a repair with both DCs as rebuild will not stream all
>> replicas, so unless you can guarantee you were perfectly consistent at time
>> of rebuild you'll want to do a repair after rebuild.
>>
>> On another note you could just replace the nodes but use GPFS instead of
>> EC2 snitch, using the same rack name.
>>
>> On Fri., 29 Jun. 2018, 00:19 Rahul Singh, 
>> wrote:
>>
>>> Parallel load is the best approach and then switch your Data access code
>>> to only access the new hardware. After you verify that there are no local
>>> read / writes on the OLD dc and that the updates are only via Gossip, then
>>> go ahead and change the replication factor on the key space to have zero
>>> replicas in the old DC. Then you can decommissioned.
>>>
>>> This way you are hundred percent sure that you aren’t missing any new
>>> data. No need for a DC to DC repair but a repair is always healthy.
>>>
>>> Rahul
>>> On Jun 28, 2018, 9:15 AM -0500, Randy Lynn , wrote:
>>>
>>> Already running with Ec2.
>>>
>>> My original thought was a new DC parallel to the current, and then
>>> decommission the other DC.
>>>
>>> Also my data load is small right now.. I know small is relative term..
>>> each node is carrying about 6GB..
>>>
>>> So given the data size, would you go with parallel DC or let the new AZ
>>> carry a heavy load until the others are migrated over?
>>> and then I think "repair" to cleanup the replications?
>>>
>>>
>>> On Thu, Jun 28, 2018 at 10:09 AM, Rahul Singh <
>>> rahul.xavier.si...@gmail.com> wrote:
>>>
 You don’t have to use EC2 snitch on AWS but if you have already started
 with it , it may put a node in a different DC.

 If your data density won’t be ridiculous You could add 3 to different
 DC/ Region and then sync up. After the new DC is operational you can remove
 one at a time on the old DC and at the same time add to the new one.

 Rahul
 On Jun 28, 2018, 9:03 AM -0500, Randy Lynn , wrote:

 I have a 6-node cluster I'm migrating to the new i3 types.
 But at the same time I want to migrate to a different AZ.

 What happens if I do the "running node replace method" with 1 node at a
 time moving to the new AZ. Meaning, I'll have temporarily;

 5 nodes in AZ 1c
 1 new node in AZ 1e.

 I'll wash-rinse-repeat till all 6 are on the new machine type and in
 the new AZ.

 Any thoughts about whether this gets weird with the Ec2Snitch and a RF
 3?

 --
 Randy Lynn
 rl...@getavail.com

 office:
 859.963.1616 <+1-859-963-1616> ext 202
 163 East Main Street - Lexington, KY 40507 - USA
 

  getavail.com 


>>>
>>>
>>> --
>>> Randy Lynn
>>> rl...@getavail.com
>>>
>>> office:
>>> 859.963.1616 <+1-859-963-1616> ext 202
>>> 163 East Main Street - Lexington, KY 40507 - USA
>>> 
>>>
>>>  getavail.com 
>>>
>>>
>
>
> --
> Randy Lynn
> rl...@getavail.com
>
> office:
> 859.963.1616 <+1-859-963-1616> ext 202
> 163 East Main Street - Lexington, KY 40507 - USA
> 
>
>  getavail.com 
>