Re: Convert single node C* to cluster (rebalancing problem)

2017-06-16 Thread Jeff Jirsa


On 2017-06-16 10:31 (-0700), John Hughes  wrote: 
> Hi Affan,
> 
> Others can likely speak to this more authoritatively, I'm sure, but with an
> RF of 1 I would not expect it to rebalance. If you had 4 nodes and an
> RF of 2, I would expect it to.
> 

Even with an RF of 1, any token range that moves to the new (joining) node will 
result in data transfer.

What won't happen automatically, however, is data being removed from the 
source. "nodetool cleanup" is provided to do that.  Until you run "cleanup", no 
data will be removed from the original hosts.

Note, however, that if you're not sure things are properly 
bootstrapped/repaired, don't run cleanup until you've confirmed they are. If you 
had RF=1 and you joined a node with auto_bootstrap=false, the node would 
immediately join the ring and own token ranges whose data it never received, and 
with RF=1 there's no other replica to repair from. You can either remove that 
node (nodetool decommission) and re-add it with auto_bootstrap=true to recover 
most of the old data (this isn't PERFECTLY safe, but it's better than not having 
the data at all). Alternatively, you could use 'sstableloader' to re-stream the 
data into your cluster.

Again, that's only an issue if you had RF=1 and joined a node with 
auto_bootstrap=false.
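
For anyone who did end up in that state, a rough sketch of the 
decommission-and-rebootstrap route (commands only; the service restart step and 
data paths depend on your install, so treat this as an outline rather than a 
recipe):

    # on the node that joined with auto_bootstrap=false
    nodetool decommission      # leave the ring, streaming back whatever data it does hold
    # wipe its data directories (including the system keyspace) so it can bootstrap fresh,
    # set auto_bootstrap: true in cassandra.yaml (or remove the line - true is the default),
    # then restart Cassandra and watch it stream its ranges from the original node:
    nodetool netstats
    # only once you're satisfied the data has arrived, reclaim space on the ORIGINAL node:
    nodetool cleanup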







Re: Convert single node C* to cluster (rebalancing problem)

2017-06-16 Thread John Hughes
Hi Affan,

Others can likely speak to this more authoritatively, I'm sure, but with an
RF of 1 I would not expect it to rebalance. If you had 4 nodes and an
RF of 2, I would expect it to.

As a side note, I tend to grow and shrink my clusters to do upgrades and
such, and I rarely run anything less than 6 nodes (which is what I consider
the safe minimum [context: AWS single region with 3 AZs]).

Also, you might want to clean up all old snapshots (nodetool clearsnapshot)
and incremental backups (manually removing the contents of the local 'backups'
dirs), and then run a cleanup just to see how that affects the numbers that
nodetool status is showing you.
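
A minimal sketch of that housekeeping pass, assuming the default data directory 
layout and a keyspace name of my_keyspace (adjust both to your environment):

    nodetool clearsnapshot        # drop all snapshots on this node
    # incremental backups sit in a 'backups' dir next to each table's sstables:
    find /var/lib/cassandra/data -type d -name backups -exec rm -rf {} +
    nodetool cleanup              # rewrite sstables, dropping ranges this node no longer owns
    nodetool status my_keyspace   # re-check Load / Owns(effective) for one keyspace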



On Thu, Jun 15, 2017 at 1:54 AM Affan Syed  wrote:

> John,
>
> I am a co-worker with Junaid -- he is out sick, so just wanted to confirm
> that one of your shots in the dark is correct. This is an RF of 1:
>
> "CREATE KEYSPACE orion WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': '1'}  AND durable_writes = true;"
>
> However, how does the RF affect the redistribution of key/data?
>
> Affan
>
> - Affan
>
> On Wed, Jun 14, 2017 at 1:16 AM, John Hughes 
> wrote:
>
>> OP, I was just looking at your original numbers and I have some questions:
>>
>> 270GB on one node and 414KB on the other, but something close to 50/50 on
>> "Owns(effective)".
>> What replication factor are your keyspaces set up with? 1x or 2x or ??
>>
>> I would say you are seeing 50/50 because the tokens are allocated
>> 50/50 (others on the list please correct what are for me really just
>> assumptions), but I would hazard a guess that your replication factor
>> is still 1x, so it isn't moving anything around. Or your keyspace
>> replication is incorrect and isn't being distributed (I have had issues with
>> the AWSMultiRegionSnitch and not getting the region correct [us-east vs
>> us-east-1]). It doesn't throw an error, but it doesn't work very well either
>> =)
>>
>> Can you do a 'describe keyspace XXX' and show the first line (the CREATE
>> KEYSPACE line)?
>>
>> Mind you, these are all just shots in the dark from here.
>>
>> Cheers,
>>
>>
>> On Tue, Jun 13, 2017 at 3:13 AM Junaid Nasir  wrote:
>>
>>> Is the OP expecting a perfect 50%/50% split?
>>>
>>>
>>> The best result I got was a 240GB/30GB split, which I think is not properly
>>> balanced.
>>>
>>>
 Also, what are your outputs when you call out specific keyspaces? Do
 the numbers get more even?
>>>
>>>
>>> I don't know what you mean by *call out specific keyspaces*? Can you
>>> please explain that a bit.
>>>
>>>
>>> If your schema is not modelled correctly you can easily end up with unevenly
>>> distributed data.
>>>
>>>
>>> I think that is the problem. The initial 270GB of data might not be modeled
>>> correctly. I have run a lot of tests on the 270GB data, including downsizing it
>>> to 5GB; they all resulted in the same uneven distribution. I also tested a
>>> dummy dataset of 2GB which was balanced evenly. Coming from a relational DB, I
>>> didn't give much thought to data modeling. Can anyone please point me to some
>>> resources regarding this problem?
>>>
>>> On Tue, Jun 13, 2017 at 3:24 AM, Akhil Mehra 
>>> wrote:
>>>
 Great point John.

 The OP should also note that data distribution also depends on your
 schema and incoming data profile.

 If your schema is not modelled correctly you can easily end up with unevenly
 distributed data.

 Cheers,
 Akhil

 On Tue, Jun 13, 2017 at 3:36 AM, John Hughes 
 wrote:

> Is the OP expecting a perfect 50%/50% split? That, in my experience,
> is not going to happen; it is almost always shifted from a fraction of a
> percent to a couple percent.
>
> Datacenter: eu-west
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens   Owns (effective)  Host ID
> Rack
> UN  XX.XX.XX.XX22.71 GiB  256  47.6%
> 57dafdde-2f62-467c-a8ff-c91e712f89c9  1c
> UN  XX.XX.XX.XX  17.17 GiB  256  51.3%
> d2a65c51-087d-48de-ae1f-a41142eb148d  1b
> UN  XX.XX.XX.XX  26.15 GiB  256  52.4%
> acf5dd34-5b81-4e5b-b7be-85a7fccd8e1c  1c
> UN  XX.XX.XX.XX   16.64 GiB  256  50.2%
> 6c8842dd-a966-467c-a7bc-bd6269ce3e7e  1a
> UN  XX.XX.XX.XX  24.39 GiB  256  49.8%
> fd92525d-edf2-4974-8bc5-a350a8831dfa  1a
> UN  XX.XX.XX.XX   23.8 GiB   256  48.7%
> bdc597c0-718c-4ef6-b3ef-7785110a9923  1b
>
> Though maybe part of what you are experiencing can be cleared up by
> repair/compaction/cleanup. Also, what are your outputs when you call out
> specific keyspaces? Do the numbers get more even?
>
> Cheers,
>
> On Mon, Jun 12, 2017 at 5:22 AM Akhil Mehra 
> wrote:
>
>> auto_bootstrap is true by default. Ensure it's set to 

Re: Question: Large partition warning

2017-06-16 Thread Jeff Jirsa


On 2017-06-16 10:18 (-0700), "Jeff Jirsa" wrote: 
> 
> 
> On 2017-06-15 16:33 (-0700), kurt greaves  wrote: 
> > FYI, a ticket already existed for this. I've submitted a patch that fixes this
> > specific issue, but it looks like there are a few other properties that will
> > suffer from the same problem. As I said on the ticket, we should probably fix these
> > up even though setting things this high is generally bad practice. If other
> > people want to weigh in and think the same, I'll create another ticket.
> > https://issues.apache.org/jira/browse/CASSANDRA-13172
> > 
> 
> If you kick off tests, I'll review+commit for you (though at a quick glance, you 
> probably also want to update Config.java to make it a long).

Ignore my comment about Config.java; it was wrong. Your patch looks fine, 
just run the tests.

- Jeff




Re: Question: Large partition warning

2017-06-16 Thread Jeff Jirsa


On 2017-06-15 16:33 (-0700), kurt greaves  wrote: 
> FYI, a ticket already existed for this. I've submitted a patch that fixes this
> specific issue, but it looks like there are a few other properties that will
> suffer from the same problem. As I said on the ticket, we should probably fix these
> up even though setting things this high is generally bad practice. If other
> people want to weigh in and think the same, I'll create another ticket.
> https://issues.apache.org/jira/browse/CASSANDRA-13172
> 

If you kick off tests, I'll review+commit for you (though at a quick glance, you 
probably also want to update Config.java to make it a long).

- Jeff




Re: Impact of Write without consistency level and mutation failures on reads and cluster

2017-06-16 Thread Jeff Jirsa


On 2017-06-15 19:10 (-0700), srinivasarao daruna  
wrote: 
> Hi,
> 
> Recently one of our Spark jobs was missing the Cassandra consistency level
> property and the number-of-concurrent-writes property.

Just for the record, you still have a consistency level set, it's just set to 
whatever your driver/spark defaults to (probably LOCAL_ONE). This probably 
means it's firing writes faster than you'd expect (no backpressure), which may 
have contributed to your problems.
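
If the writes go through the spark-cassandra-connector, the knobs in question 
are set on the Spark conf - the parameter names below are from memory of the 
connector docs, so verify them against the connector version you run:

    spark-submit \
      --conf spark.cassandra.output.consistency.level=LOCAL_QUORUM \
      --conf spark.cassandra.output.concurrent.writes=5 \
      ... rest of your job arguments ...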


> 
> Due to that, some mutations failed (we saw them in tpstats). Also, we
> observed read timeouts occurring not only on the table that the job
> inserts into, but also on other tables, which have always had a proper
> consistency level. We started a repair, but due to the volume of data it
> might take a day or two to complete. Meanwhile, we wanted to get some inputs.
> 
> The errors raised a lot of questions.
> 1) Is there a relation between mutation failures, read timeouts, and overall
> cluster performance? If yes, how?
> 

When the cluster is heavily loaded, you'll see both dropped mutations and read 
timeouts, yes. 

It's also true that reads can impact writes, and writes can impact reads - 
especially since it's all in one shared JVM process, with shared garbage 
collection.

> 2) When I checked the log, I found a warning in debug.log as below:
> SELECT * FROM our_table WHERE partition_key = required_value LIMIT 5000:
> total time 20353 msec - timeout 2 msec
> 
> Actual query:
> SELECT * FROM our_table WHERE partition_key = required_value
> 
> Even though we are hitting the partition key, I do not understand the reason
> for such a huge read time and the timeouts.

Likely related to JVM GC pauses. How big is that partition (nodetool cfstats 
may help here)? Are you seeing a lot of other GC pauses going on (you should 
have monitoring, or at least glance at the log for 'GCInspector' lines)? 
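
For example (the keyspace is a placeholder, the table name is taken from the 
query above, and the log path assumes a default package install):

    nodetool cfstats my_keyspace.our_table     # look at "Compacted partition maximum bytes"
    grep GCInspector /var/log/cassandra/system.log | tail -20    # recent GC pause lines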

> 
> 3) We are using prepared statements to query the tables from the API. How can
> we set the fetch size so that it won't use LIMIT 5000?
> Any thoughts?
> 
> 

Driver dependent, but most of them offer this for prepared statements as well 
(on the DataStax Java driver, setFetchSize() on the statement you execute). The 
Java driver also offers it globally via 
Cluster.builder().withQueryOptions(new QueryOptions().setFetchSize(100)).
 






Re: Tool to manage cassandra

2017-06-16 Thread daemeon reiydelle
Ambari





*Daemeon C.M. Reiydelle | USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*


*"It is better to be insulted with the truth than kissed with a lie”*

On Fri, Jun 16, 2017 at 6:01 AM, Ram Bhatia  wrote:

> Hi
>
> May I know, if there is a tool similar to Oracle Enterprise Manager for
> managing Cassandra?
>
> Thank you in advance for your help,
> Ram Bhatia


Re: Tool to manage cassandra

2017-06-16 Thread Oskar Kjellin
Do you mean Priam, Nitan?


> On 16 Jun 2017, at 16:48, Nitan Kainth  wrote:
> 
> You can try Atlas from Netflix; it is open source.
> 
>> On Jun 16, 2017, at 8:52 AM, Surbhi Gupta  wrote:
>> 
>> If you are using DSE then you can use OpsCenter.
>> 
>> On Fri, Jun 16, 2017 at 6:01 AM Ram Bhatia  wrote:
>>> Hi
>>> 
>>> May I know, if there is a tool similar to Oracle Enterprise Manager for
>>> managing Cassandra?
>>> 
>>> Thank you in advance for your help,
>>> Ram Bhatia
> 


Re: Tool to manage cassandra

2017-06-16 Thread Haris Altaf
I use DBeaver: http://dbeaver.jkiss.org/download/enterprise/

On Fri, 16 Jun 2017 at 19:48 Nitan Kainth  wrote:

> You can try Atlas from Netflix; it is open source.
>
> On Jun 16, 2017, at 8:52 AM, Surbhi Gupta 
> wrote:
>
> If you are using DSE then you can use OpsCenter.
>
> On Fri, Jun 16, 2017 at 6:01 AM Ram Bhatia  wrote:
>
>> Hi
>>
>> May I know, if there is a tool similar to Oracle Enterprise Manager for
>> managing Cassandra?
>>
>> Thank you in advance for your help,
>> Ram Bhatia
--
regards,
Haris


Re: Tool to manage cassandra

2017-06-16 Thread Nitan Kainth
You can try Atlas from Netflix; it is open source.

> On Jun 16, 2017, at 8:52 AM, Surbhi Gupta  wrote:
> 
> If you are using DSE then you can use OpsCenter.
> 
> On Fri, Jun 16, 2017 at 6:01 AM Ram Bhatia wrote:
> Hi
> 
> May I know, if there is a tool similar to Oracle Enterprise Manager for
> managing Cassandra?
> 
> Thank you in advance for your help,
> Ram Bhatia



Re: Tool to manage cassandra

2017-06-16 Thread Surbhi Gupta
If you are using DSE then you can use OpsCenter.

On Fri, Jun 16, 2017 at 6:01 AM Ram Bhatia  wrote:

> Hi
>
> May I know, if there is a tool similar to Oracle Enterprise Manager for
> managing Cassandra?
>
> Thank you in advance for your help,
> Ram Bhatia


Tool to manage cassandra

2017-06-16 Thread Ram Bhatia
Hi

 

May I know, if there is a tool similar to Oracle Enterprise Manager for managing Cassandra?

 

Thank you in advance for your help,

Ram Bhatia




Re: Convert single node C* to cluster (rebalancing problem)

2017-06-16 Thread Akhil Mehra
Good point Varun.

Here is how I understand it but feel free to disagree.

According to the documentation, the Load column in the nodetool status
output is "The amount of file system data under the cassandra data
directory after excluding all content in the snapshots subdirectories.
Because all SSTable data files are included, any data that is not cleaned
up (such as TTL-expired cells or tombstoned data) is counted."

nodetool cleanup has been specifically added to remove unwanted data after
adding a new node to the cluster.

I am assuming it's too expensive to calculate the actual load amount
excluding unwanted data.

Thus it seems the tool is working according to specification.

Cheers,
Akhil




On Fri, Jun 16, 2017 at 4:39 PM, Varun Gupta  wrote:

>
> Akhil,
>
> As per the blog, nodetool status shows the data size for node1 even for token
> ranges it does not own. Isn't this a bug in Cassandra?
>
> Yes, the on-disk data will be present, but it should be reflected in nodetool
> status.
>
> On Thu, Jun 15, 2017 at 6:17 PM, Akhil Mehra  wrote:
>
>> Hi,
>>
>> I put together a blog explaining possible reasons for unbalanced
>> Cassandra nodes.
>>
>> http://abiasforaction.net/unbalanced-cassandra-cluster/
>>
>> Let me know if you have any questions.
>>
>> Cheers,
>> Akhil
>>
>>
>> On Thu, Jun 15, 2017 at 5:54 PM, Affan Syed  wrote:
>>
>>> John,
>>>
>>> I am a co-worker with Junaid -- he is out sick, so just wanted to
>>> confirm that one of your shots in the dark is correct. This is an RF of 1:
>>>
>>> "CREATE KEYSPACE orion WITH replication = {'class': 'SimpleStrategy',
>>> 'replication_factor': '1'}  AND durable_writes = true;"
>>>
>>> However, how does the RF affect the redistribution of key/data?
>>>
>>> Affan
>>>
>>> - Affan
>>>
>>> On Wed, Jun 14, 2017 at 1:16 AM, John Hughes 
>>> wrote:
>>>
 OP, I was just looking at your original numbers and I have some
 questions:

 270GB on one node and 414KB on the other, but something close to 50/50
 on "Owns(effective)".
 What replication factor are your keyspaces set up with? 1x or 2x or ??

 I would say you are seeing 50/50 because the tokens are allocated
 50/50 (others on the list please correct what are for me really just
 assumptions), but I would hazard a guess that your replication factor
 is still 1x, so it isn't moving anything around. Or your keyspace
 replication is incorrect and isn't being distributed (I have had issues with
 the AWSMultiRegionSnitch and not getting the region correct [us-east vs
 us-east-1]). It doesn't throw an error, but it doesn't work very well either
 =)

 Can you do a 'describe keyspace XXX' and show the first line (the CREATE
 KEYSPACE line)?

 Mind you, these are all just shots in the dark from here.

 Cheers,


 On Tue, Jun 13, 2017 at 3:13 AM Junaid Nasir  wrote:

> Is the OP expecting a perfect 50%/50% split?
>
>
> The best result I got was a 240GB/30GB split, which I think is not properly
> balanced.
>
>
>> Also, what are your outputs when you call out specific keyspaces? Do
>> the numbers get more even?
>
>
> I don't know what you mean by *call out specific keyspaces*? Can you
> please explain that a bit.
>
>
> If your schema is not modelled correctly you can easily end up with
> unevenly distributed data.
>
>
> I think that is the problem. The initial 270GB of data might not be modeled
> correctly. I have run a lot of tests on the 270GB data, including downsizing it
> to 5GB; they all resulted in the same uneven distribution. I also tested a
> dummy dataset of 2GB which was balanced evenly. Coming from a relational DB, I
> didn't give much thought to data modeling. Can anyone please point me to some
> resources regarding this problem?
>
> On Tue, Jun 13, 2017 at 3:24 AM, Akhil Mehra 
> wrote:
>
>> Great point John.
>>
>> The OP should also note that data distribution also depends on your
>> schema and incoming data profile.
>>
>> If your schema is not modelled correctly you can easily end up with
>> unevenly distributed data.
>>
>> Cheers,
>> Akhil
>>
>> On Tue, Jun 13, 2017 at 3:36 AM, John Hughes 
>> wrote:
>>
>>> Is the OP expecting a perfect 50%/50% split? That, in my experience,
>>> is not going to happen; it is almost always shifted from a fraction of a
>>> percent to a couple percent.
>>>
>>> Datacenter: eu-west
>>> ===
>>> Status=Up/Down
>>> |/ State=Normal/Leaving/Joining/Moving
>>> --  Address        Load       Tokens   Owns (effective)  Host ID
>>>   Rack
>>> UN  XX.XX.XX.XX22.71 GiB  256  47.6%
>>> 57dafdde-2f62-467c-a8ff-c91e712f89c9  1c