Read Repairs and CL

2016-08-27 Thread kurt Greaves
Looking at the wiki for the read path (
http://wiki.apache.org/cassandra/ReadPathForUsers), in the bottom diagram
for reading with a read repair, it states the following when "reading from
all replica nodes" after there is a hash mismatch:

> If hashes do not match, do conflict resolution. First step is to read all
> data from all replica nodes excluding the fastest replica (since CL=ALL)

In the bottom left of the diagram it also states:

> In this example:
> RF>=2
> CL=ALL
The "(since CL=ALL)" implies that the CL for the read during the read repair
is based on the CL of the query. However, I don't think that makes sense at
other CLs. Anyway, I just want to clarify what CL the read for the read
repair occurs at for cases where the overall query CL is not ALL.
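
For concreteness, here is a sketch of the knobs involved; the table name and
values are hypothetical, not from the wiki page. The table-level probabilities
control when a background read repair fires, while the query's own CL is set
per session in cqlsh:

    -- cqlsh sketch (Cassandra 2.x/3.x); my_ks.my_table is hypothetical
    ALTER TABLE my_ks.my_table
      WITH read_repair_chance = 0.1            -- global (cross-DC) read repair probability
       AND dclocal_read_repair_chance = 0.1;   -- DC-local read repair probability

    CONSISTENCY QUORUM;
    SELECT * FROM my_ks.my_table WHERE id = 42;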

Thanks,
Kurt.

-- 
Kurt Greaves
k...@instaclustr.com
www.instaclustr.com


Re: New data center to an existing cassandra cluster

2016-08-27 Thread laxmikanth sadula
Yes, RF=3 in the existing datacenters DC1 & DC2, and it will be the same RF in the
new datacenter DC3 which I'm going to add.
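
For context, a sketch of the keyspace change and the resulting QUORUM
arithmetic (the keyspace name is hypothetical; the RF values are the ones from
this thread):

    -- cqlsh: extend replication to the new DC before running 'nodetool rebuild'
    ALTER KEYSPACE my_ks
      WITH replication = {'class': 'NetworkTopologyStrategy',
                          'DC1': 3, 'DC2': 3, 'DC3': 3};

    -- 9 replicas in total, so QUORUM = floor(9/2) + 1 = 5 responses,
    -- which the 3 replicas in DC3 can never satisfy on their own.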


On Sat, Aug 27, 2016 at 11:15 PM, Alexander DEJANOVSKI <
adejanov...@gmail.com> wrote:

> Reads at QUORUM in DC3 will involve DC1 and DC2, as they require a
> response from more than half the replicas across the whole cluster.
>
> If you're using RF=3 in each DC, each read will need at least 5 responses,
> which DC3 cannot provide on its own.
>
> You could have trouble if DC3 held more than half the replicas, but I
> guess/hope that is not the case; if so, you're fine.
>
> You would be in trouble, though, if you were using LOCAL_QUORUM on DC3 or
> ONE on any DC.
>
>
>
> On Sat, Aug 27, 2016 at 19:11, Surbhi Gupta wrote:
>
>> Yes, it will have issues while the new nodes are building,
>> so it is always advised to use LOCAL_QUORUM instead of QUORUM and
>> LOCAL_ONE instead of ONE.
>>
>> On 27 August 2016 at 09:45, laxmikanth sadula 
>> wrote:
>>
>>> Hi,
>>>
>>> I'm going to add a new data center DC3 to an existing cassandra cluster
>>> which already has 2 data centers, DC1 and DC2.
>>>
>>> The thing I'm worried about is the tables in one keyspace that use
>>> QUORUM reads and NOT LOCAL_QUORUM.
>>> So while adding the new data center with auto_bootstrap: false and
>>> 'nodetool rebuild', will queries to tables in this keyspace have any
>>> issues?
>>>
>>> Thanks in advance.
>>>
>>> --
>>> Regards,
>>> Laxmikanth
>>>
>>
>>


-- 
Regards,
Laxmikanth
99621 38051


How to understand the dynamic snitch update interval?

2016-08-27 Thread Jun Wu
Hi there,
I have a question about the dynamic snitch, specifically for reads.
The dynamic snitch wraps the other snitches, and for reads it plays a very
important role, as described in this article:
http://www.datastax.com/dev/blog/dynamic-snitching-in-cassandra-past-present-and-future
But I am confused by the update interval. The default dynamic snitch update
interval is 100 ms. Does that mean the scores are recalculated every 100 ms
based on latency history? We know that the process/service time for each read
request is much less than 100 ms, so during those 100 ms tons of read requests
are processed, and each request may leave a latency record.

So my questions are:
1. For the latency history, is the score based on all latency samples from the
past 100 ms, or only on the latest few samples?
2. After the scores are calculated, is the node with the lowest score chosen
as the replica to read from for the next 100 ms, and then after the next
100 ms the scores are recalculated and the whole process repeats?
  Any comment would be appreciated. Thanks in advance!
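
For reference, a sketch of the cassandra.yaml settings I'm asking about (the
values shown are the shipped defaults, to the best of my understanding):

    # cassandra.yaml (2.x/3.x)
    dynamic_snitch_update_interval_in_ms: 100    # how often replica scores are recalculated
    dynamic_snitch_reset_interval_in_ms: 600000  # how often accumulated latency samples are reset
    dynamic_snitch_badness_threshold: 0.1        # how much worse the pinned replica must score
                                                 # before traffic is routed to other replicas
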
Jun   

Re: New data center to an existing cassandra cluster

2016-08-27 Thread Jeff Jirsa
If you’ve repaired at the point you alter the keyspace to add the third DC, and
you write with QUORUM, and your RF per DC is the same (for example, 3 in each
DC), then you’ll likely get correct reads, as long as none of the other nodes
go down while you rebuild.

There are a lot of IFs and ANDs in that statement. Be sure you hit ALL of them
or you may miss data on reads.
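
A minimal sketch of that sequence, with a hypothetical keyspace name and the DC
names from this thread (the ALTER is a cqlsh statement; the rest is nodetool):

    # 1. Repair so DC1/DC2 replicas agree before the topology change
    nodetool repair my_ks

    # 2. Extend replication to the new DC (cqlsh):
    #    ALTER KEYSPACE my_ks WITH replication =
    #      {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3, 'DC3': 3};

    # 3. On each DC3 node (started with auto_bootstrap: false), stream existing data
    nodetool rebuild -- DC1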

From: laxmikanth sadula 
Reply-To: "user@cassandra.apache.org" 
Date: Saturday, August 27, 2016 at 9:45 AM
To: "user@cassandra.apache.org" 
Subject: New data center to an existing cassandra cluster

 

Hi,

I'm going to add a new data center DC3 to an existing cassandra cluster which
already has 2 data centers, DC1 and DC2.

The thing I'm worried about is the tables in one keyspace that use QUORUM
reads and NOT LOCAL_QUORUM.
So while adding the new data center with auto_bootstrap: false and 'nodetool
rebuild', will queries to tables in this keyspace have any issues?

Thanks in advance.

-- 

Regards, 

Laxmikanth





Re: New data center to an existing cassandra cluster

2016-08-27 Thread Alexander DEJANOVSKI
Reads at QUORUM in DC3 will involve DC1 and DC2, as they require a
response from more than half the replicas across the whole cluster.

If you're using RF=3 in each DC, each read will need at least 5 responses,
which DC3 cannot provide on its own.

You could have trouble if DC3 held more than half the replicas, but I
guess/hope that is not the case; if so, you're fine.

You would be in trouble, though, if you were using LOCAL_QUORUM on DC3 or ONE
on any DC.



On Sat, Aug 27, 2016 at 19:11, Surbhi Gupta wrote:

> Yes, it will have issues while the new nodes are building,
> so it is always advised to use LOCAL_QUORUM instead of QUORUM and
> LOCAL_ONE instead of ONE.
>
> On 27 August 2016 at 09:45, laxmikanth sadula 
> wrote:
>
>> Hi,
>>
>> I'm going to add a new data center DC3 to an existing cassandra cluster
>> which already has 2 data centers, DC1 and DC2.
>>
>> The thing I'm worried about is the tables in one keyspace that use QUORUM
>> reads and NOT LOCAL_QUORUM.
>> So while adding the new data center with auto_bootstrap: false and
>> 'nodetool rebuild', will queries to tables in this keyspace have any
>> issues?
>>
>> Thanks in advance.
>>
>> --
>> Regards,
>> Laxmikanth
>>
>
>


Re: New data center to an existing cassandra cluster

2016-08-27 Thread Surbhi Gupta
Yes, it will have issues while the new nodes are building,
so it is always advised to use LOCAL_QUORUM instead of QUORUM and LOCAL_ONE
instead of ONE
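
As a sketch, the consistency level is chosen by the client, for example per
cqlsh session (drivers set it per session or per statement); the keyspace and
table names here are hypothetical:

    CONSISTENCY LOCAL_QUORUM;   -- stays within the coordinator's local DC
    SELECT * FROM my_ks.my_table WHERE id = 42;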

On 27 August 2016 at 09:45, laxmikanth sadula 
wrote:

> Hi,
>
> I'm going to add a new data center DC3 to an existing cassandra cluster
> which already has 2 data centers, DC1 and DC2.
>
> The thing I'm worried about is the tables in one keyspace that use QUORUM
> reads and NOT LOCAL_QUORUM.
> So while adding the new data center with auto_bootstrap: false and 'nodetool
> rebuild', will queries to tables in this keyspace have any issues?
>
> Thanks in advance.
>
> --
> Regards,
> Laxmikanth
>


New data center to an existing cassandra cluster

2016-08-27 Thread laxmikanth sadula
Hi,

I'm going to add a new data center DC3 to an existing cassandra cluster
which already has 2 data centers, DC1 and DC2.

The thing I'm worried about is the tables in one keyspace that use QUORUM
reads and NOT LOCAL_QUORUM.
So while adding the new data center with auto_bootstrap: false and 'nodetool
rebuild', will queries to tables in this keyspace have any issues?

Thanks in advance.

-- 
Regards,
Laxmikanth


Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-08-27 Thread Benedict Elliott Smith
I did not claim you had no evidence, only that your statement lacked
justification.  Again, nuance is important.

I was suggesting that blanket statements to the user mailing list, made
without the necessary caveats and countermanding the defaults without
'justification' (explanation, reasoning), are liable to cause confusion about
what best practice is. I attempted to provide some of the missing context to
minimise this confusion while still largely agreeing with you.

However you should also bear in mind that you work as a field engineer for
DataStax, and as such your sample of installation behaviours will be
biased - towards those where the defaults have not worked well.



On Saturday, 27 August 2016, Ryan Svihla  wrote:

> I have been trying to get the docs fixed for this for the past 3 months,
> and there is already a ticket open for changing the defaults. I don't feel
> like I'm working from a small amount of evidence here. All my observation
> from 3 years of work in the field suggests compaction keeps coming up as
> the bottleneck when you push Cassandra ingest.
> 0.6 as an initial setting has fixed 20+ broken clusters in practice, and it
> improved overall performance in every case over defaults ranging from 0.33
> down to 0.03 (the yaml suggests one flush writer per core; add in the
> prevalence of HT and you see a lot of 24+ flush-writer systems in the wild).
>
> No disrespect intended, but that default hasn't worked out well at all in
> my exposure to it, and 0.6 has never been worse than the default yet.
> Obviously write patterns, heap configuration, memtable size limits and what
> not affect the exact optimal setting, and I've rarely had it end up at 0.6
> after a tuning exercise. I never intended that as a blanket recommendation,
> just a starting one.
>
> _
> From: Benedict Elliott Smith  >
> Sent: Friday, August 26, 2016 9:40 AM
> Subject: Re: Guidelines for configuring Thresholds for Cassandra metrics
> To:  >
>
>
> The default when I wrote it was 0.4 but it was found this did not saturate
> flush writers in JBOD configurations. Iirc it now defaults to 1/(1+#disks)
> which is not a terrible default, but obviously comes out much lower if you
> have many disks.
>
> This smaller value behaves better for peak performance, but in a live
> system where compaction is king, not saturating flush in return for lower
> write amplification (from flushing larger memtables) will indeed often be a
> win.
>
> 0.6, however, is probably not the best default unless you have a lot of
> tables being actively written to, in which case even 0.8 would be fine.
> With a single main table receiving your writes at a given time, 0.4 is
> probably an optimal value, when making this trade off against peak
> performance.
>
> Anyway, it's probably better to file a ticket to discuss defaults and
> documentation than making a statement like this without justification. I
> can see where you're coming from, but it's confusing for users to have such
> blanket guidance that counters the defaults.  If the defaults can be
> improved (which I agree they can) it's probably better to do that, along
> with better documentation, so the nuance is accounted for.
>
>
> On Friday, 26 August 2016, Ryan Svihla  > wrote:
>
>>
>> Forgot the most important thing. Logs
>> ERROR you should investigate
>> WARN you should have a list of known ones. Use case dependent. Ideally
>> you change configuration accordingly.
>> * PoolCleaner (slab or native) - a good indication the node is tuned badly
>> if you see a ton of this. Set memtable_cleanup_threshold to 0.6 as an
>> initial attempt to configure this correctly. This is a complex topic to
>> dive into, so that may not be the best number, but it'll likely be better
>> than the default; why it's not the default is a big conversation.
>> There are a bunch of other logs I look for that are escaping me at
>> present but that's a good start
>>
>> -regards,
>>
>> Ryan Svihla
>>
>>
>>
>>
>> On Fri, Aug 26, 2016 at 7:21 AM -0500, "Ryan Svihla" > > wrote:
>>
>> Thomas,
>>>
>>> Not all metrics are KPIs; some are only useful when researching a specific
>>> issue or after a use-case-specific threshold has been set.
>>>
>>> The main "canaries" I monitor are:
>>> * Pending compactions (dependent on the compaction strategy chosen, but
>>> 1000 is a sign of severe issues in all cases)
>>> * dropped mutations (more than one I treat as an event to investigate; I
>>> believe in allowing operational overhead, and any evidence of load shedding
>>> suggests I may not have as much as I thought)
>>> * blocked anything (flush writers, etc. - more than one I investigate)
>>> * system hints (more than 1k I investigate)
>>> * heap usage 

Re: Guidelines for configuring Thresholds for Cassandra metrics

2016-08-27 Thread Ryan Svihla
I have been trying to get the docs fixed for this for the past 3 months, and
there is already a ticket open for changing the defaults. I don't feel like
I'm working from a small amount of evidence here. All my observation from 3
years of work in the field suggests compaction keeps coming up as the
bottleneck when you push Cassandra ingest.

0.6 as an initial setting has fixed 20+ broken clusters in practice, and it
improved overall performance in every case over defaults ranging from 0.33
down to 0.03 (the yaml suggests one flush writer per core; add in the
prevalence of HT and you see a lot of 24+ flush-writer systems in the wild).

No disrespect intended, but that default hasn't worked out well at all in my
exposure to it, and 0.6 has never been worse than the default yet. Obviously
write patterns, heap configuration, memtable size limits and what not affect
the exact optimal setting, and I've rarely had it end up at 0.6 after a tuning
exercise. I never intended that as a blanket recommendation, just a starting
one.
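
For reference, a sketch of where this knob lives in cassandra.yaml; the values
below are illustrative starting points in the spirit of this thread, not the
shipped defaults:

    # cassandra.yaml (2.1+)
    memtable_flush_writers: 2          # if unset, the default varies by version (per data directory or per core)
    memtable_cleanup_threshold: 0.6    # if unset, defaults to 1 / (memtable_flush_writers + 1)
    # memtable_heap_space_in_mb / memtable_offheap_space_in_mb bound the total
    # memtable space that the cleanup threshold is measured against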

_
From: Benedict Elliott Smith 
Sent: Friday, August 26, 2016 9:40 AM
Subject: Re: Guidelines for configuring Thresholds for Cassandra metrics
To:  


The default when I wrote it was 0.4 but it was found this did not saturate 
flush writers in JBOD configurations. Iirc it now defaults to 1/(1+#disks) 
which is not a terrible default, but obviously comes out much lower if you have 
many disks.
This smaller value behaves better for peak performance, but in a live system 
where compaction is king, not saturating flush in return for lower write
amplification (from flushing larger memtables) will indeed often be a win.
0.6, however, is probably not the best default unless you have a lot of tables 
being actively written to, in which case even 0.8 would be fine. With a single 
main table receiving your writes at a given time, 0.4 is probably an optimal 
value, when making this trade off against peak performance.
Anyway, it's probably better to file a ticket to discuss defaults and 
documentation than making a statement like this without justification. I can 
see where you're coming from, but it's confusing for users to have such blanket 
guidance that counters the defaults.  If the defaults can be improved (which I 
agree they can) it's probably better to do that, along with better 
documentation, so the nuance is accounted for.

On Friday, 26 August 2016, Ryan Svihla  wrote:

Forgot the most important thing. Logs:
* ERROR - you should investigate.
* WARN - you should have a list of known ones. Use case dependent. Ideally you
  change configuration accordingly.
* PoolCleaner (slab or native) - a good indication the node is tuned badly if
  you see a ton of this. Set memtable_cleanup_threshold to 0.6 as an initial
  attempt to configure this correctly. This is a complex topic to dive into,
  so that may not be the best number, but it'll likely be better than the
  default; why it's not the default is a big conversation.
There are a bunch of other logs I look for that are escaping me at present,
but that's a good start.
-regards,
Ryan Svihla



On Fri, Aug 26, 2016 at 7:21 AM -0500, "Ryan Svihla"  wrote:

Thomas,
Not all metrics are KPIs; some are only useful when researching a specific
issue or after a use-case-specific threshold has been set.

The main "canaries" I monitor are:
* Pending compactions (dependent on the compaction strategy chosen, but 1000
  is a sign of severe issues in all cases)
* Dropped mutations (more than one I treat as an event to investigate; I
  believe in allowing operational overhead, and any evidence of load shedding
  suggests I may not have as much as I thought)
* Blocked anything (flush writers, etc. - more than one I investigate)
* System hints (more than 1k I investigate)
* Heap usage and GC time vary a lot by use case and collector chosen; I aim
  for below 65% usage as an average with G1, but this again varies by use case
  a great deal. Sometimes I just look at the chart and query patterns, and if
  they don't line up I have to do other, deeper investigations
* Read and write latencies exceeding SLA are also use case dependent. For
  those that have none, I tend to push towards p99, with a middle-end
  SSD-based system having 100 ms and a spindle-based system having 600 ms at
  CL ONE and assuming a "typical" query pattern (again, query patterns and CL
  vary here)
* Cell count and partition size vary greatly by hardware and GC tuning, but in
  the absence of all other relevant information I like to keep cell count for
  a partition below 100k and size below 100 MB. I do however have many
  successful use cases running more, and I've had some fail well before that.
  Hardware and tuning tradeoffs shift this around a lot.
There is unfortunately, as you'll note, a lot of nuance, and the load out
really changes what looks right (down to the model of SSDs - I have different
expectations for p99s if it's a model I 
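
A minimal sketch of checking several of those canaries with stock nodetool
(these subcommands exist in 2.x/3.x; the keyspace name is hypothetical):

    # Pending compactions
    nodetool compactionstats

    # Blocked flush writers, dropped mutations, and other thread-pool stats
    nodetool tpstats

    # Coordinator-level read/write latency percentiles
    nodetool proxyhistograms

    # Partition size and cell-count extremes per table (cfstats in 2.x, tablestats in 3.x)
    nodetool cfstats my_ks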

problem with transactions

2016-08-27 Thread Nikhil Sharma
Hi,

Initially, one “if exists” query was not working on cassandra 3.0.8. It
started working after we downgraded to 2.2.7. Now, there are some more “if
exists” queries which are not working on cassandra 2.2.7 in some cases. Is
anybody else facing the same problem with transactions? We are using the
latest oracle jdk on stock amazon AMI.
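
For what it's worth, a sketch of the kind of conditional statement in question;
the schema and values are hypothetical, not our actual ones:

    -- Lightweight transaction: only applied if the row already exists.
    UPDATE my_ks.users
       SET email = 'someone@example.com'
     WHERE user_id = 42
        IF EXISTS;
    -- The response carries an [applied] column; the Paxos round runs at
    -- SERIAL (or LOCAL_SERIAL) consistency, which is worth checking when
    -- behaviour differs across versions.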

Regards,
Nikhil Sharma