Re: Re : Cluster performance after enabling SSL

2016-09-14 Thread sai krishnam raju potturi
Thanks a lot. That was really good info.

On Tue, Sep 13, 2016, 15:41 G P  wrote:

> Read this:
>
> http://www.aifb.kit.edu/images/5/58/IC2E2014-Performance_Overhead_TLS.pdf
>
> It can cause bigger variances in latencies, but not much.
> Tuesday, 13 September 2016, 08:01PM +01:00, from sai krishnam raju potturi
> pskraj...@gmail.com:
>
>
> hi;
>   will enabling SSL (node-to-node) cause an overhead in the performance of
> Cassandra? We have tried it out on a small test cluster while running
> Cassandra-stress tool, and did not see much difference in terms of read and
> write latencies.
>  Could somebody throw some light regarding any impact SSL will have on
> large clusters in terms of performance. Thanks in advance.
>
> Cassandra-version (2.1.15)
>
> thanks
> Sai
>
>


Re: Re : Cluster performance after enabling SSL

2016-09-14 Thread sai krishnam raju potturi
thanks Surbhi; we'll do further tests regarding this. The per-node tps are
lower, but for the overall cluster the tps are around 90k.

On Tue, Sep 13, 2016 at 3:25 PM, Surbhi Gupta 
wrote:

> We have seen a little overhead in latencies while enabling the
> client_encryption.
> Our cluster gets around 40-50K reads and writes per second.
>
> On 13 September 2016 at 12:01, sai krishnam raju potturi <
> pskraj...@gmail.com> wrote:
>
>> hi;
>>   will enabling SSL (node-to-node) cause an overhead in the performance
>> of Cassandra? We have tried it out on a small test cluster while running
>> Cassandra-stress tool, and did not see much difference in terms of read and
>> write latencies.
>>  Could somebody throw some light regarding any impact SSL will have
>> on large clusters in terms of performance. Thanks in advance.
>>
>> Cassandra-version (2.1.15)
>>
>> thanks
>> Sai
>>
>
>


Re: race condition for quorum consistency

2016-09-14 Thread Tyler Hobbs
On Wed, Sep 14, 2016 at 3:49 PM, Nicolas Douillet <
nicolas.douil...@gmail.com> wrote:

> - during read requests, cassandra will ask one node for the data and the
> others involved in the CL for a digest, and if the digests do not all match,
> will ask them for the entire data, handle the merge and finally will ask
> those nodes for a background repair. Your write may have succeeded during
> this time.


This is very good info, but as a minor correction, the repair here will
happen in the foreground before the response is returned to the client.
So, at least from a single client's perspective, you get monotonic reads.
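
A toy sketch of that foreground (blocking) read repair, in Python. This is
only a model for illustration, not Cassandra's code: the Replica class, the
(timestamp, value) cells and quorum_read are made-up names, and digests are
skipped by comparing the full cells directly.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Toy replica holding one cell as a (timestamp, value) pair."""
    name: str
    cell: tuple  # (write timestamp, value)


def quorum_read(contacted):
    """Return the newest value, repairing stale contacted replicas first."""
    newest = max(r.cell for r in contacted)  # highest timestamp wins
    for replica in contacted:
        if replica.cell != newest:
            # Foreground repair: the stale replica is updated *before*
            # the coordinator answers the client.
            replica.cell = newest
    return newest[1]


a = Replica("A", (2, "v2"))   # already has the new write
b = Replica("B", (1, "v1"))   # still stale
c = Replica("C", (1, "v1"))   # still stale

first = quorum_read([a, b])   # sees v2 and repairs B on the way
second = quorum_read([b, c])  # B now holds v2, so this read sees v2 too
print(first, second)
```

In this model, once one quorum read has returned v2, two of the three
replicas hold v2, so any later quorum read must touch at least one of them.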


-- 
Tyler Hobbs
DataStax 


Re: race condition for quorum consistency

2016-09-14 Thread Nicolas Douillet
Hi,

In my opinion the guarantee provided by Cassandra is:
  if your write request at Quorum *succeeds*, then the next (after the
write response) read requests at Quorum (that succeed too) will be
consistent
  (actually CL.Write + CL.Read > RF)
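
The inequality can be checked with a few lines of Python. This is just the
arithmetic, with hypothetical helper names (quorum, overlaps); consistency
levels are given as the number of replicas that must answer.

```python
def quorum(rf: int) -> int:
    """Number of replicas in a quorum for replication factor rf."""
    return rf // 2 + 1


def overlaps(write_cl: int, read_cl: int, rf: int) -> bool:
    """True if every read set must share a replica with every write set."""
    return write_cl + read_cl > rf


rf = 3
q = quorum(rf)
print(overlaps(q, q, rf))   # QUORUM write + QUORUM read
print(overlaps(1, q, rf))   # ONE write + QUORUM read
```

With RF=3 a quorum is 2 replicas, so QUORUM + QUORUM gives 2 + 2 > 3, while
ONE + QUORUM gives 1 + 2, which is not greater than 3.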

Of course, until you have received a valid response to your write request
at Quorum the cluster is in an inconsistent state, and you have *to retry
your write request.*

That said, Cassandra provides some other important behaviors that will tend
to reduce the time of this inconsistent state :

   - the coordinator will not send the request only to the nodes that
   should answer to satisfy the CL, but to all nodes that should have
   the data (of course with RF=3, only A,B are involved)

   - during read requests, cassandra will ask one node for the data and the
   others involved in the CL for a digest, and if the digests do not all
   match, will ask them for the entire data, handle the merge and finally
   will ask those nodes for a background repair. Your write may have
   succeeded during this time.

   - according to a chance ratio, cassandra will *sometimes* send a read to
   all nodes holding the data, not only the ones involved in the CL, and
   execute background repairs

   - you have to schedule repairs regularly


I'd add that if some nodes do not manage to handle write requests in time,
they may be under pressure, and there is only a small chance that they
succeed on a read request :)

And finally what is time? From where/when? You may schedule one read after
another but receive its result before the other's. Writing at Quorum is not
writing within a transaction, you'll certainly have to make some tradeoffs.

Regards,

--
Nicolas




On Wed, Sep 14, 2016 at 21:14, Alexander Dejanovski
wrote:

> My understanding of the described scenario is that the write hasn't
> succeeded when reads are fired, as B and C haven't processed the mutation
> yet.
>
> There would be 3 clients here and not 2 : C1 writes, C2 and C3 read.
>
> So the race condition could still happen in this particular case.
>
> On Wed, Sep 14, 2016 at 21:07, Work wrote:
>
>> Hi Alex:
>>
>> Hmmm ... Assuming clock skew is eliminated And assuming nodes are up
>> and available ... And assuming quorum writes and quorum reads and everyone
>> waiting for success ( which is NOT The OP scenario), Two different clients
>> will be guaranteed to see all successful writes, or be told that read
>> failed.
>>
>> C1 writes at quorum to A,B
>> C2 reads at quorum.
>> So it tries to read from ALL nodes, A,B, C.
>> If A,B respond --> success
>> If A,C respond --> conflict
>> If B, C respond --> conflict
>> Because a quorum (2 nodes) responded, the coordinator will return the
>> latest time stamp and may issue read repair depending on YAML settings.
>>
>> So where do you see only one client having this guarantee?
>>
>> Regards,
>>
>> James
>>
>> On Sep 14, 2016, at 4:00 AM, Alexander DEJANOVSKI 
>> wrote:
>>
>> Hi,
>>
>> the analysis is valid, and strong consistency the Cassandra way means
>> that one client writing at quorum, then reading at quorum will always see
>> his previous write.
>> Two different clients have no guarantee to see the same data when using
>> quorum, as illustrated in your example.
>>
>> Only options here are to route requests to specific clients based on some
>> id to guarantee the sequence of operations outside of Cassandra (the same
>> client will always be responsible for a set of ids), or raise the CL to ALL
>> at the expense of availability (you should not do that).
>>
>>
>> Cheers,
>>
>> Alex
>>
>> On Wed, Sep 14, 2016 at 11:47, Qi Li wrote:
>>
>>> hi all,
>>>
>>> we are using quorum consistency, and we *suspect* there may be a race
>>> condition during the write. lets say RF is 3. so write will wait for at
>>> least 2 nodes to ack. suppose there is only 1 node acked(node A). the other
>>> 2 nodes(node B, C) are still waiting to update. there come two read requests
>>> one read is having the data responded from the node B and C, so version
>>> 1 is returned.
>>> the other node is having data responded from node A and B, so the latest
>>> version 2 is returned.
>>>
>>> so clients are getting different data at the same time. is this a valid
>>> analysis? if so, is there any options we can set to deal with this issue?
>>>
>>> thanks
>>> Ken
>>>
>> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>


Re: race condition for quorum consistency

2016-09-14 Thread Alexander Dejanovski
My understanding of the described scenario is that the write hasn't
succeeded when reads are fired, as B and C haven't processed the mutation
yet.

There would be 3 clients here and not 2 : C1 writes, C2 and C3 read.

So the race condition could still happen in this particular case.

On Wed, Sep 14, 2016 at 21:07, Work wrote:

> Hi Alex:
>
> Hmmm ... Assuming clock skew is eliminated And assuming nodes are up
> and available ... And assuming quorum writes and quorum reads and everyone
> waiting for success ( which is NOT The OP scenario), Two different clients
> will be guaranteed to see all successful writes, or be told that read
> failed.
>
> C1 writes at quorum to A,B
> C2 reads at quorum.
> So it tries to read from ALL nodes, A,B, C.
> If A,B respond --> success
> If A,C respond --> conflict
> If B, C respond --> conflict
> Because a quorum (2 nodes) responded, the coordinator will return the
> latest time stamp and may issue read repair depending on YAML settings.
>
> So where do you see only one client having this guarantee?
>
> Regards,
>
> James
>
> On Sep 14, 2016, at 4:00 AM, Alexander DEJANOVSKI 
> wrote:
>
> Hi,
>
> the analysis is valid, and strong consistency the Cassandra way means that
> one client writing at quorum, then reading at quorum will always see his
> previous write.
> Two different clients have no guarantee to see the same data when using
> quorum, as illustrated in your example.
>
> Only options here are to route requests to specific clients based on some
> id to guarantee the sequence of operations outside of Cassandra (the same
> client will always be responsible for a set of ids), or raise the CL to ALL
> at the expense of availability (you should not do that).
>
>
> Cheers,
>
> Alex
>
> On Wed, Sep 14, 2016 at 11:47, Qi Li wrote:
>
>> hi all,
>>
>> we are using quorum consistency, and we *suspect* there may be a race
>> condition during the write. lets say RF is 3. so write will wait for at
>> least 2 nodes to ack. suppose there is only 1 node acked(node A). the other
>> 2 nodes(node B, C) are still waiting to update. there come two read requests
>> one read is having the data responded from the node B and C, so version 1
>> is returned.
>> the other node is having data responded from node A and B, so the latest
>> version 2 is returned.
>>
>> so clients are getting different data at the same time. is this a valid
>> analysis? if so, is there any options we can set to deal with this issue?
>>
>> thanks
>> Ken
>>
> --
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: race condition for quorum consistency

2016-09-14 Thread Work
Hi Alex:

Hmmm ... Assuming clock skew is eliminated, and assuming nodes are up and
available, and assuming quorum writes and quorum reads with everyone waiting
for success (which is NOT the OP's scenario), two different clients will be
guaranteed to see all successful writes, or be told that the read failed.

C1 writes at quorum to A,B
C2 reads at quorum. 
So it tries to read from ALL nodes, A,B, C.
If A,B respond --> success
If A,C respond --> conflict
If B, C respond --> conflict
Because a quorum (2 nodes) responded, the coordinator will return the latest 
time stamp and may issue read repair depending on YAML settings.
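
A small Python sketch of the three cases listed above, assuming the quorum
write has already been acknowledged by A and B while C still holds the old
version. The replicas dict, the timestamps and coordinator_result are
hypothetical names for illustration, not anything from Cassandra itself.

```python
from itertools import combinations

# State after C1's QUORUM write was acknowledged by A and B (C lags behind).
replicas = {"A": (2, "v2"), "B": (2, "v2"), "C": (1, "v1")}


def coordinator_result(responders):
    """Pick the value with the latest timestamp among the responders."""
    return max(replicas[name] for name in responders)[1]


# Every possible pair of responders for a QUORUM read with RF=3.
results = {pair: coordinator_result(pair) for pair in combinations("ABC", 2)}
for pair, value in results.items():
    print(pair, "->", value)
```

Each pair contains A or B, so in this model the "conflict" cases still
resolve to the newer value by timestamp.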

So where do you see only one client having this guarantee?

Regards,

James

> On Sep 14, 2016, at 4:00 AM, Alexander DEJANOVSKI  
> wrote:
> 
> Hi, 
> 
> the analysis is valid, and strong consistency the Cassandra way means that 
> one client writing at quorum, then reading at quorum will always see his 
> previous write.
> Two different clients have no guarantee to see the same data when using 
> quorum, as illustrated in your example.
> 
> Only options here are to route requests to specific clients based on some id 
> to guarantee the sequence of operations outside of Cassandra (the same client 
> will always be responsible for a set of ids), or raise the CL to ALL at the 
> expense of availability (you should not do that).
> 
>  
> Cheers,
> 
> Alex
> 
>> On Wed, Sep 14, 2016 at 11:47, Qi Li wrote:
>> hi all,
>> 
>> we are using quorum consistency, and we *suspect* there may be a race 
>> condition during the write. lets say RF is 3. so write will wait for at 
>> least 2 nodes to ack. suppose there is only 1 node acked(node A). the other 
>> 2 nodes(node B, C) are still waiting to update. there come two read requests
>> one read is having the data responded from the node B and C, so version 1 is
>> returned.
>> the other node is having data responded from node A and B, so the latest 
>> version 2 is returned.
>> 
>> so clients are getting different data at the same time. is this a valid 
>> analysis? if so, is there any options we can set to deal with this issue? 
>> 
>> thanks
>> Ken


How Fast Does Information Spread With Gossip?

2016-09-14 Thread jerome
Hi,


I was curious if anyone had any kind of statistics or ballpark figures on how 
long it takes information to propagate through a cluster with Gossip? I'm 
particularly interested in how fast information about the liveness of a node 
spreads. For example, in an n-node cluster the median amount of time it takes 
for all nodes to learn that a node went down is f(n) seconds. Is a minute a 
reasonable upper bound for most clusters? Too high, too low?


Thanks,

Jerome


Re: unsubscribe

2016-09-14 Thread Alain RODRIGUEZ
Hi,

Sending a message to user-unsubscr...@cassandra.apache.org is the right way
to go if you want to unsubscribe from the "Cassandra User" mailing list,

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France


2016-09-12 18:02 GMT+02:00 Spencer Brown :

> ubsubscribe
>


Re: race condition for quorum consistency

2016-09-14 Thread Alexander DEJANOVSKI
Hi,

the analysis is valid, and strong consistency the Cassandra way means that
one client writing at quorum, then reading at quorum will always see his
previous write.
Two different clients have no guarantee to see the same data when using
quorum, as illustrated in your example.

Only options here are to route requests to specific clients based on some
id to guarantee the sequence of operations outside of Cassandra (the same
client will always be responsible for a set of ids), or raise the CL to ALL
at the expense of availability (you should not do that).
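
One way the first option could look, as a minimal Python sketch: hash the id
so that the same id always lands on the same client. The worker names and
owner_for are hypothetical, and this says nothing about how the requests
reach those clients in the first place.

```python
import hashlib


def owner_for(entity_id: str, workers: list) -> str:
    """Map an id to one worker, stably across processes and restarts."""
    digest = hashlib.sha256(entity_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(workers)
    return workers[index]


workers = ["client-1", "client-2", "client-3"]  # hypothetical client names
for entity_id in ["order-17", "order-18", "order-17"]:
    print(entity_id, "->", owner_for(entity_id, workers))
```

Note that with this simple modulo scheme, changing the number of workers
remaps most ids, so the sequencing guarantee only holds while the worker
list is stable.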


Cheers,

Alex

On Wed, Sep 14, 2016 at 11:47, Qi Li wrote:

> hi all,
>
> we are using quorum consistency, and we *suspect* there may be a race
> condition during the write. lets say RF is 3. so write will wait for at
> least 2 nodes to ack. suppose there is only 1 node acked(node A). the other
> 2 nodes(node B, C) are still waiting to update. there come two read requests
> one read is having the data responded from the node B and C, so version 1
> is returned.
> the other node is having data responded from node A and B, so the latest
> version 2 is returned.
>
> so clients are getting different data at the same time. is this a valid
> analysis? if so, is there any options we can set to deal with this issue?
>
> thanks
> Ken
>


Re: unsubscribe

2016-09-14 Thread Alain RODRIGUEZ
Hi,

Sending a message to user-unsubscr...@cassandra.apache.org is the right way
to go if you want to unsubscribe from the "Cassandra User" mailing list,

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-07 5:33 GMT+02:00 dhanesh malviya :

>
>
> --
> Regards,
> Dhanesh Malviya..
>


race condition for quorum consistency

2016-09-14 Thread Qi Li
hi all,

we are using quorum consistency, and we *suspect* there may be a race
condition during the write. let's say RF is 3, so a write will wait for at
least 2 nodes to ack. suppose only 1 node has acked (node A) and the other
2 nodes (node B, C) are still waiting to update. then two read requests come in:
one read gets its data from nodes B and C, so version 1
is returned.
the other read gets its data from nodes A and B, so the latest
version 2 is returned.

so clients are getting different data at the same time. is this a valid
analysis? if so, are there any options we can set to deal with this issue?
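
The scenario can be written down as a toy Python model. It assumes a
snapshot where only A has applied the write and ignores read repair; the
replicas dict, the timestamps and quorum_read are made-up names for
illustration.

```python
# Only A has applied the write so far; the write has not reached quorum.
replicas = {"A": (2, "v2"), "B": (1, "v1"), "C": (1, "v1")}


def quorum_read(responders):
    """Return the value with the latest timestamp among two responders."""
    return max(replicas[name] for name in responders)[1]


read_1 = quorum_read(("B", "C"))  # neither replica has the new write
read_2 = quorum_read(("A", "B"))  # A already has it
print(read_1, read_2)
```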

thanks
Ken


Re: How to query '%' character using LIKE operator in Cassandra 3.7?

2016-09-14 Thread DuyHai Doan
Ok you're right, I get your point

LIKE '%%esc%' --> startWith('%esc')

LIKE 'escape%%' -->  = 'escape%'

What I strongly suspect is that in the source code of SASI, we parse the
%xxx% expression BEFORE applying the escape. That would explain the observed
behavior. E.g.:

LIKE '%%esc%'  parsed as %xxx% where xxx = %esc

LIKE 'escape%%' parsed as xxx% where xxx = escape%

Let me check in the source code and try to reproduce the issue
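
To make the hypothesis concrete, here is a small Python model of "parse the
outer % first, never unescape". It is not taken from the SASI source, which
has not been checked at this point; parse_like and matches are hypothetical
names, and lower-casing stands in for 'case_sensitive': 'false'.

```python
def parse_like(pattern: str):
    """Classify a LIKE pattern by its outer '%' signs, with no unescaping."""
    leading = pattern.startswith("%")
    trailing = pattern.endswith("%") and len(pattern) > 1
    if leading and trailing:
        return "CONTAINS", pattern[1:-1]
    if leading:
        return "SUFFIX", pattern[1:]
    if trailing:
        return "PREFIX", pattern[:-1]
    return "EQUALS", pattern


def matches(value: str, pattern: str) -> bool:
    """Apply the parsed pattern to one stored value."""
    mode, term = parse_like(pattern)
    value, term = value.lower(), term.lower()  # index is case-insensitive
    return {
        "CONTAINS": term in value,
        "SUFFIX": value.endswith(term),
        "PREFIX": value.startswith(term),
        "EQUALS": value == term,
    }[mode]


rows = ["%escapeme", "escape%me", "escape%esc"]
for pattern in ("%%esc%", "escape%%"):
    print(pattern, parse_like(pattern), [v for v in rows if matches(v, pattern)])
```

Under this model '%%esc%' matches %escapeme and escape%esc, and 'escape%%'
matches escape%me and escape%esc, which lines up with the rows in the cqlsh
output quoted below.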



On Tue, Sep 13, 2016 at 7:24 PM, Mikhail Krupitskiy <
mikhail.krupits...@jetbrains.com> wrote:

> Looks like we have a different understanding of what results are expected.
> I based my understanding on
> http://docs.datastax.com/en/cql/3.3/cql/cql_using/useSASIIndex.html
> According to the doc ‘esc’ is a pattern for exact match and I guess that
> there is no semantic difference between two LIKE patterns (both
> patterns should be treated as ‘exact match'): ‘%%esc’ and ‘esc’.
>
> SELECT * FROM escape WHERE val LIKE '%%esc%'; --> Give all results
> *containing* '%esc' so *%esc*apeme is a possible match and also escape
> *%esc*
>
> Why ‘containing’? I expect that it should be ’starting’..
>
>
> SELECT * FROM escape WHERE val LIKE 'escape%%' --> Give all results
> *starting* with 'escape%' so *escape%*me is a valid result and also
> *escape%*esc
>
> Why ’starting’? I expect that it should be ‘exact matching’.
>
> Also I expect that “ LIKE ‘%s%sc%’ ” will return ‘escape%esc’ but it
> returns nothing (CASSANDRA-12573).
>
> What am I missing?
>
> Thanks,
> Mikhail
>
> On 13 Sep 2016, at 19:31, DuyHai Doan  wrote:
>
> CREATE CUSTOM INDEX ON test.escape(val) USING 
> 'org.apache.cassandra.index.sasi.SASIIndex'
> WITH OPTIONS = {'mode': 'CONTAINS', 'analyzer_class':
> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
> 'case_sensitive': 'false'};
>
> I don't see any problem in the results you got
>
> SELECT * FROM escape WHERE val LIKE '%%esc%'; --> Give all results
> *containing* '%esc' so *%esc*apeme is a possible match and also escape
> *%esc*
>
> Why ‘containing’? I expect that it should be ’starting’..
>
>
> SELECT * FROM escape WHERE val LIKE 'escape%%' --> Give all results
> *starting* with 'escape%' so *escape%*me is a valid result and also
> *escape%*esc
>
> Why ’starting’? I expect that it should be ‘exact matching’.
>
>
> On Tue, Sep 13, 2016 at 5:58 PM, Mikhail Krupitskiy <
> mikhail.krupits...@jetbrains.com> wrote:
>
>> Thanks for the reply.
>> Could you please provide what index definition did you use?
>> With the index from my script I get the following results:
>>
>> cqlsh:test> select * from escape;
>>
>>  id | val
>> ----+------------
>>   1 | %escapeme
>>   2 | escape%me
>>   3 | escape%esc
>>
>> Contains search
>>
>> cqlsh:test> SELECT * FROM escape WHERE val LIKE '%%esc%';
>>
>>  id | val
>> ----+------------
>>   1 | %escapeme
>>   3 | escape%esc
>>
>> (2 rows)
>>
>>
>> Prefix search
>>
>> cqlsh:test> SELECT * FROM escape WHERE val LIKE 'escape%%';
>>
>>  id | val
>> ----+------------
>>   2 | escape%me
>>   3 | escape%esc
>>
>> Thanks,
>> Mikhail
>>
>> On 13 Sep 2016, at 18:16, DuyHai Doan  wrote:
>>
>> Use % to escape %
>>
>> cqlsh:test> select * from escape;
>>
>>  id | val
>> ----+------------
>>   1 | %escapeme
>>   2 | escape%me
>>
>>
>> Contains search
>>
>> cqlsh:test> SELECT * FROM escape WHERE val LIKE '%%esc%';
>>
>>  id | val
>> ----+------------
>>   1 | %escapeme
>>
>> (1 rows)
>>
>>
>> Prefix search
>>
>> cqlsh:test> SELECT * FROM escape WHERE val LIKE 'escape%%';
>>
>>  id | val
>> ----+------------
>>   2 | escape%me
>>
>> On Tue, Sep 13, 2016 at 5:06 PM, Mikhail Krupitskiy <
>> mikhail.krupits...@jetbrains.com> wrote:
>>
>>> Hi Cassandra guys,
>>>
>>> I use Cassandra 3.7 and wondering how to use ‘%’ as a simple char in a
>>> search pattern.
>>> Here is my test script:
>>>
>>> DROP keyspace if exists kmv;
>>> CREATE keyspace if not exists kmv WITH REPLICATION = { 'class' :
>>> 'SimpleStrategy', 'replication_factor':'1'} ;
>>> USE kmv;
>>> CREATE TABLE if not exists kmv (id int, c1 text, c2 text, PRIMARY
>>> KEY(id, c1));
>>> CREATE CUSTOM INDEX ON kmv.kmv (c2) USING
>>> 'org.apache.cassandra.index.sasi.SASIIndex' WITH OPTIONS = {
>>> 'analyzed' : 'true',
>>> 'analyzer_class' :
>>> 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
>>> 'case_sensitive' : 'false',
>>> 'mode' : 'CONTAINS'
>>> };
>>>
>>> INSERT into kmv (id, c1, c2) values (1, 'f22', 'qwe%asd');
>>> INSERT into kmv (id, c1, c2) values (2, 'f22', '%asd');
>>> INSERT into kmv (id, c1, c2) values (3, 'f22', 'asd%');
>>> INSERT into kmv (id, c1, c2) values (4, 'f22', 'asd%1');
>>> INSERT into kmv (id, c1, c2) values (5, 'f22', 'qweasd');
>>>
>>> SELECT c2 from kmv.kmv where c2 like '_pattern_';
>>>
>>> _pattern_ '%%%' finds all columns that contain %.
>>> How to find columns that start with ‘%’ or ‘%a’?
>>> How to find columns that end with ‘%’?
>>> What about more complex patterns: '%qwe%a%sd%’? How to