Re: Issue with ALLOW FILTERING

2014-08-06 Thread Jens Rantil
Hi Sávio,

I am really surprised by this. Could anyone explain why ALLOW FILTERING
is only allowed when using a secondary index and not together with a PRIMARY
KEY? I'm struggling to see any reason for it not being supported.

Also, I don't believe the CQL specification makes it entirely clear that
only secondary indexes are supported. Or is it considered
implementation-specific under what circumstances ALLOW FILTERING can be used?

Thanks,
Jens


On Tue, Aug 5, 2014 at 8:11 PM, Sávio S. Teles de Oliveira 
savio.te...@cuia.com.br wrote:

 You need to create an index on attribute *c.*


 2014-08-05 9:24 GMT-03:00 Jens Rantil jens.ran...@tink.se:

 Hi,

 I'm having an issue with ALLOW FILTERING with Cassandra 2.0.8. See a
 minimal example here:
 https://gist.github.com/JensRantil/ec43622c26acb56e5bc9

 I expect the second-to-last query to fail, but the last query to return a
 single row. In particular, I expect the last SELECT to first select using the
 clustering primary id and then do filtering.

 I've been reading the ALLOW FILTERING section of
 https://cassandra.apache.org/doc/cql3/CQL.html#selectStmt and can't wrap my
 head around why this won't work.

 Could anyone clarify this for me?

 Thanks,
 Jens




 --
 Best regards,
 Sávio S. Teles de Oliveira
 voice: +55 62 9136 6996
 http://br.linkedin.com/in/savioteles
 Master's student in Computer Science - UFG
 Software Architect
 CUIA Internet Brasil



Re: Issue with ALLOW FILTERING

2014-08-06 Thread Sylvain Lebresne
On Wed, Aug 6, 2014 at 9:41 AM, Jens Rantil jens.ran...@tink.se wrote


 I'm struggling to see any reason for it not being supported.


The time to implement it, plus a bunch of internal implementation reasons
that make it not as trivial to support as you seem to suggest (of
course, this is open source, so you are welcome to have a look if that's a
particular itch you want to scratch; there is even a JIRA ticket:
https://issues.apache.org/jira/browse/CASSANDRA-6377).



 Or is it considered implementation specific under what circumstances ALLOW
 FILTERING can be used?


Currently, it kind of is. ALLOW FILTERING makes it possible to execute some
queries that couldn't be executed otherwise, but not every query. Again, the
things that are not supported are unsupported mainly for implementation
reasons, nothing more, and that may/will change in the future. That said, I'm
not saying the documentation cannot be improved (though I'm not sure having
the doc say this doesn't work would be a lot more helpful than trying it and
having the implementation say this doesn't work).
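
Until something like CASSANDRA-6377 is implemented, the behaviour expected
earlier in this thread (select by primary key first, then filter) can be
approximated on the client. The following is only a sketch in plain Python
with no driver involved. The rows, the column names a, b and c, and both
helper functions are hypothetical, since the schema from the gist is not
reproduced here:

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical data: (a, b) stands in for the primary key, c is not indexed.
ROWS: List[Row] = [
    {"a": 1, "b": 1, "c": 10},
    {"a": 1, "b": 2, "c": 20},
    {"a": 2, "b": 1, "c": 20},
]


def select_by_key(rows: List[Row], a: int, b: int) -> List[Row]:
    """Stand-in for a SELECT restricted by the full primary key."""
    return [row for row in rows if row["a"] == a and row["b"] == b]


def filter_client_side(rows: List[Row], **predicates: Any) -> List[Row]:
    """Keep only the rows whose columns equal every given predicate value."""
    return [
        row
        for row in rows
        if all(row.get(column) == value for column, value in predicates.items())
    ]


if __name__ == "__main__":
    # First narrow down by key, then filter the small result on column c.
    print(filter_client_side(select_by_key(ROWS, a=1, b=2), c=20))
```

Because the key lookup already narrows the result to very few rows, the
extra filtering step on the client is cheap in this case.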

--
Sylvain








Re: Cassandra process exiting mysteriously

2014-08-06 Thread Duncan Sands

Hi Clint,


INFO [StorageServiceShutdownHook] 2014-08-05 19:14:51,903
ThriftServer.java (line 141) Stop listening to thrift clients
  INFO [StorageServiceShutdownHook] 2014-08-05 19:14:51,920 Server.java
(line 182) Stop listening for CQL clients
  INFO [StorageServiceShutdownHook] 2014-08-05 19:14:51,930
Gossiper.java (line 1279) Announcing shutdown
  INFO [StorageServiceShutdownHook] 2014-08-05 19:14:53,930
MessagingService.java (line 683) Waiting for messaging service to
quiesce
  INFO [ACCEPT-/127.0.0.10] 2014-08-05 19:14:53,931
MessagingService.java (line 923) MessagingService has terminated the
accept() thread

Does anyone have any ideas about how to debug this?  Looking around on
google I found some threads suggesting that this could occur from an
OOM error 
(http://stackoverflow.com/questions/23755040/cassandra-exits-with-no-errors).


This doesn't look like an OOM to me. If the kernel OOM killer kills Cassandra,
then Cassandra instantly vaporizes and there will be nothing in the Cassandra
logs (you will find information about the OOM in the system logs though, e.g.
in dmesg). The log snippet above shows an orderly shutdown, which is
completely different from the instant OOM kill.
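
To check for a kernel-level OOM kill, the system log can be searched for the
OOM killer's messages. A rough sketch, assuming the log text has already been
captured (for example from dmesg). The function name and the sample lines are
made up for illustration, and the exact wording of the kernel messages varies
between kernel versions:

```python
import re
from typing import List, Tuple

# Matches both "Kill process 123 (name)" and "Killed process 123 (name)".
OOM_KILL = re.compile(r"Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text: str) -> List[Tuple[int, str]]:
    """Return the unique (pid, process name) pairs the OOM killer reported."""
    found: List[Tuple[int, str]] = []
    for line in log_text.splitlines():
        match = OOM_KILL.search(line)
        if match:
            victim = (int(match.group(1)), match.group(2))
            if victim not in found:
                found.append(victim)
    return found


# Illustrative sample only, not taken from a real machine.
SAMPLE = """\
[12345.678901] Out of memory: Kill process 4242 (java) score 912 or sacrifice child
[12345.678950] Killed process 4242 (java) total-vm:8123456kB, anon-rss:4012345kB
[12400.000000] eth0: link is up
"""

if __name__ == "__main__":
    print(find_oom_kills(SAMPLE))
```

An empty result for the time of the crash would be consistent with the
orderly shutdown seen in the Cassandra log.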


Ciao, Duncan.


Re: Issue with ALLOW FILTERING

2014-08-06 Thread Jens Rantil
Sylvain,

Your answer was what I was hoping for - that means it's not impossible to
solve ;)

I'll keep an eye on the issue, and if I find the time I will dig into
the code.

Thanks,
Jens


On Wed, Aug 6, 2014 at 10:03 AM, Sylvain Lebresne sylv...@datastax.com
wrote:

 On Wed, Aug 6, 2014 at 9:41 AM, Jens Rantil jens.ran...@tink.se wrote


 I'm struggling to see any reason for it not being supported.


 The time to implement it, plus a bunch of internal implementation reasons
 that make it not as trivial to support as you seem to suggest (of
 course, this is open source, so you are welcome to have a look if that's a
 particular itch you want to scratch; there is even a JIRA ticket:
 https://issues.apache.org/jira/browse/CASSANDRA-6377).



 Or is it considered implementation specific under what circumstances
 ALLOW FILTERING can be used?


 Currently, it kind of is. ALLOW FILTERING makes it possible to execute some
 queries that couldn't be executed otherwise, but not every query. Again, the
 things that are not supported are unsupported mainly for implementation
 reasons, nothing more, and that may/will change in the future. That said, I'm
 not saying the documentation cannot be improved (though I'm not sure having
 the doc say this doesn't work would be a lot more helpful than trying it and
 having the implementation say this doesn't work).

 --
 Sylvain









RE: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-06 Thread DE VITO Dominique
 The discussion about racks & NTS is also mentioned in this recent article:
 planetcassandra.org/multi-data-center-replication-in-nosql-databases/

 The last section may be of interest to you.

Thanks DuyHai.

Note that this section is also part of C* anti-patterns 
http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html

But I think it's missing some advice for vnodes (something like: since tokens
are randomly generated, define a single rack when using vnodes?).

D.


From: DuyHai Doan [mailto:doanduy...@gmail.com]
Sent: Tuesday, 5 August 2014 20:07
To: user@cassandra.apache.org
Subject: RE: vnode and NetworkTopologyStrategy: not playing well together ?


The discussion about racks & NTS is also mentioned in this recent article:
http://planetcassandra.org/multi-data-center-replication-in-nosql-databases/

The last section may be of interest to you.
On 5 August 2014 at 18:14, DE VITO Dominique
dominique.dev...@thalesgroup.com wrote:
 Jonathan wrote:

 Yes, if you have only 1 machine in a rack then your cluster will be
 imbalanced.  You're going to be able to dream up all sorts of weird failure
 cases when you choose a scenario like RF=2 & totally imbalanced network arch.

 Vnodes attempt to solve the problem of imbalanced rings by choosing so many 
 tokens that it's improbable that the ring will be imbalanced.

Storage/load distribution = function(1st replica placement, other replica placement)

Vnodes solve the balancing problem for the 1st replica placement. So yes, I
agree with you, but for the 1st replica placement only.

But NetworkTopologyStrategy (NTS) influences the placement of the other (2+)
replicas. Since the best behavior of NTS relies on the token distribution, and
you have no control over tokens with vnodes, the best option I see with
**vnodes** is to use only one rack with NTS.

Dominique


-----Original Message-----
From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behalf Of
Jonathan Haddad
Sent: Tuesday, 5 August 2014 18:04
To: user@cassandra.apache.org
Subject: Re: vnode and NetworkTopologyStrategy: not playing well together ?

Yes, if you have only 1 machine in a rack then your cluster will be imbalanced.
You're going to be able to dream up all sorts of weird failure cases when you
choose a scenario like RF=2 & totally imbalanced network arch.

Vnodes attempt to solve the problem of imbalanced rings by choosing so many 
tokens that it's improbable that the ring will be imbalanced.



On Tue, Aug 5, 2014 at 8:57 AM, DE VITO Dominique
dominique.dev...@thalesgroup.com wrote:
 First, thanks for your answer.

 This is incorrect.  Network Topology w/ Vnodes will be fine, assuming you've 
 got RF= # of racks.

 IMHO, it's not a good enough condition.
 Let's use an example with RF=2

 N1/rack_1   N2/rack_1   N3/rack_1   N4/rack_2

 Here, you have RF= # of racks
 And due to NetworkTopologyStrategy, N4 will store *all* the cluster data, 
 leading to a completely imbalanced cluster.
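
 The four-node example above can be checked with a small simulation. This is
 only a sketch of a simplified rack-aware placement (walk the ring clockwise
 and prefer nodes in racks not used yet), not Cassandra's actual
 NetworkTopologyStrategy code. The node and rack names come from the example;
 the ring size, the number of tokens per node, the seed and all function names
 are arbitrary assumptions:

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

RING_SIZE = 2 ** 16  # small integer token space keeps the arithmetic exact


def place_replicas(ring: List[Tuple[int, str]], racks: Dict[str, str],
                   start: int, rf: int) -> List[str]:
    """Walk clockwise from ring[start], preferring racks not used yet."""
    replicas: List[str] = []
    skipped: List[str] = []
    seen_racks = set()
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node in replicas or node in skipped:
            continue
        if racks[node] in seen_racks:
            skipped.append(node)  # same rack as an earlier replica
        else:
            replicas.append(node)
            seen_racks.add(racks[node])
        if len(replicas) == rf:
            return replicas
    # Fewer racks than rf: fall back to the nodes skipped earlier.
    return replicas + skipped[: rf - len(replicas)]


def ownership(racks: Dict[str, str], rf: int, tokens_per_node: int,
              seed: int = 0) -> Dict[str, float]:
    """Fraction of the token space stored on each node (sums to rf)."""
    rng = random.Random(seed)
    nodes = sorted(racks)
    tokens = rng.sample(range(RING_SIZE), tokens_per_node * len(nodes))
    ring = sorted((tok, nodes[i % len(nodes)]) for i, tok in enumerate(tokens))
    owned: Dict[str, int] = defaultdict(int)
    for i, (token, _) in enumerate(ring):
        width = (token - ring[i - 1][0]) % RING_SIZE  # range (previous, token]
        for node in place_replicas(ring, racks, i, rf):
            owned[node] += width
    return {node: owned[node] / RING_SIZE for node in nodes}


EXAMPLE = {"N1": "rack_1", "N2": "rack_1", "N3": "rack_1", "N4": "rack_2"}

if __name__ == "__main__":
    for node, share in ownership(EXAMPLE, rf=2, tokens_per_node=256).items():
        print(node, round(share, 3))
```

 In this model N4 ends up in the replica set of every range, whatever the
 tokens are, because it is the only node in rack_2. The remaining copy of
 each range is spread over N1, N2 and N3.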

 IMHO, it happens when using nodes *or* vnodes.

 Since well-balanced clusters with NetworkTopologyStrategy rely on a carefully
 chosen token distribution/path along the ring *and* tokens are randomly
 generated with vnodes, my guess is that with vnodes and
 NetworkTopologyStrategy it's better to define a single (logical) rack, because
 carefully chosen tokens clash with randomly generated ones.

 I don't see other options left.
 Do you see other ones ?

 Regards,
 Dominique




 -----Original Message-----
 From: jonathan.had...@gmail.com [mailto:jonathan.had...@gmail.com] On Behalf Of
 Jonathan Haddad
 Sent: Tuesday, 5 August 2014 17:43
 To: user@cassandra.apache.org
 Subject: Re: vnode and NetworkTopologyStrategy: not playing well together ?

 This is incorrect.  Network Topology w/ Vnodes will be fine, assuming you've 
 got RF= # of racks.  For each token, replicas are chosen based on the 
 strategy.  Essentially, you could have a wild imbalance in token ownership, 
 but it wouldn't matter because the replicas would be distributed across the 
 rest of the machines.

 http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architectureDataDistributeReplication_c.html

 On Tue, Aug 5, 2014 at 8:19 AM, DE VITO Dominique
 dominique.dev...@thalesgroup.com wrote:
 Hi,



 My understanding is that NetworkTopologyStrategy does NOT play well
 with vnodes, due to:

 · Vnode = tokens are (usually) randomly generated (AFAIK)

 · NetworkTopologyStrategy = requires carefully chosen tokens for
 all nodes in order not to get a VERY unbalanced ring 

Re: Cassandra process exiting mysteriously

2014-08-06 Thread Clint Kelly
Hi Duncan,

Thanks for your help.

I am at a loss as to what is causing this process to stop then.  I
would not expect the Cassandra process to finish until my code calls
Process#destroy, but it seems to non-deterministically stop much
earlier sometimes.

FWIW I have seen failures on another machine this morning which also
look orderly.  These nodes never even get to the point where they
announce they are listening for CQL clients.

If anyone has any ideas on what to look for, I would really appreciate
it.  I will try turning logging up to DEBUG and see if that produces
any useful errors.
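
One more thing that might help narrow this down is the exit status of the
child process. A small sketch, assuming the common Unix conventions: Python's
subprocess module reports death by signal N as -N, while shells (and, as far
as I know, Java's Process on Unix) report it as 128 + N. The function name is
made up, and a status above 128 could in principle also be a plain exit code:

```python
import signal


def describe_exit(code: int) -> str:
    """Turn a process exit status into a human-readable description."""
    if code < 0:
        number = -code          # Python subprocess convention
    elif code > 128:
        number = code - 128     # shell-style 128 + N convention
    else:
        return f"exited on its own with status {code}"
    try:
        name = signal.Signals(number).name
    except ValueError:
        name = "unknown"
    return f"terminated by signal {number} ({name})"


if __name__ == "__main__":
    for status in (0, 1, 137, 143, -9):
        print(status, "->", describe_exit(status))
```

A status that maps to SIGKILL would point towards something like the kernel
OOM killer, while SIGTERM would be consistent with an orderly shutdown that
some other process requested.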

Best regards,
Clint




On Wed, Aug 6, 2014 at 1:11 AM, Duncan Sands duncan.sa...@gmail.com wrote:
 Hi Clint,


 INFO [StorageServiceShutdownHook] 2014-08-05 19:14:51,903
 ThriftServer.java (line 141) Stop listening to thrift clients
   INFO [StorageServiceShutdownHook] 2014-08-05 19:14:51,920 Server.java
 (line 182) Stop listening for CQL clients
   INFO [StorageServiceShutdownHook] 2014-08-05 19:14:51,930
 Gossiper.java (line 1279) Announcing shutdown
   INFO [StorageServiceShutdownHook] 2014-08-05 19:14:53,930
 MessagingService.java (line 683) Waiting for messaging service to
 quiesce
   INFO [ACCEPT-/127.0.0.10] 2014-08-05 19:14:53,931
 MessagingService.java (line 923) MessagingService has terminated the
 accept() thread

 Does anyone have any ideas about how to debug this?  Looking around on
 google I found some threads suggesting that this could occur from an
 OOM error
 (http://stackoverflow.com/questions/23755040/cassandra-exits-with-no-errors).


 this doesn't look like an OOM to me.  If the kernel OOM kills Cassandra then
 Cassandra instantly vaporizes, and there will be nothing in the Cassandra
 logs (you will find information about the OOM in the system logs though, eg
 in dmesg).  In the log snippet above you see an orderly shutdown, this is
 completely different to the instant OOM kill.

 Ciao, Duncan.


Re: Cassandra process exiting mysteriously

2014-08-06 Thread Robert Coli
On Wed, Aug 6, 2014 at 1:11 AM, Duncan Sands duncan.sa...@gmail.com wrote:

 this doesn't look like an OOM to me.  If the kernel OOM kills Cassandra
 then Cassandra instantly vaporizes, and there will be nothing in the
 Cassandra logs (you will find information about the OOM in the system logs
 though, eg in dmesg).  In the log snippet above you see an orderly
 shutdown, this is completely different to the instant OOM kill.


Not really.

https://issues.apache.org/jira/browse/CASSANDRA-7507

=Rob


Re: Cassandra process exiting mysteriously

2014-08-06 Thread Robert Coli
On Wed, Aug 6, 2014 at 1:12 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Aug 6, 2014 at 1:11 AM, Duncan Sands duncan.sa...@gmail.com
 wrote:

 this doesn't look like an OOM to me.  If the kernel OOM kills Cassandra
 then Cassandra instantly vaporizes, and there will be nothing in the
 Cassandra logs (you will find information about the OOM in the system logs
 though, eg in dmesg).  In the log snippet above you see an orderly
 shutdown, this is completely different to the instant OOM kill.


 Not really.

 https://issues.apache.org/jira/browse/CASSANDRA-7507


To be clear, there are two different OOMs here; I am talking about the JVM
OOM, not the system-level one. As CASSANDRA-7507 indicates, a JVM OOM does not
necessarily result in the Cassandra process dying, and can in fact trigger a
clean shutdown.

System level OOM will in fact send the equivalent of KILL, which will not
trigger the clean shutdown hook in Cassandra.

=Rob


Re: Node stuck during nodetool rebuild

2014-08-06 Thread Vasileios Vlachos
Hello Mark and Rob,

Thank you very much for your input, I will increase the phi threshold and
report back any progress.

Vasilis
On 5 Aug 2014 21:52, Mark Reddy mark.re...@boxever.com wrote:

 Hi Vasilis,

 To add to what Rob said:

 I believe you might be able to tune the phi detector threshold to help
 this operation complete, hopefully someone with direct experience of same
 will chime in.


 I have been through this operation where streams break due to a node
 falsely being marked down (flapping). In an attempt to mitigate this, I
 increased the phi_convict_threshold in cassandra.yaml from 8 to 10, after
 which the rebuild was able to complete successfully. The default value for
 phi_convict_threshold is 8, with 12 being the maximum recommended value.
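
 To get a feel for what the threshold means, here is a simplified model of a
 phi accrual failure detector. It assumes exponentially distributed heartbeat
 intervals, so phi = t / (mean * ln 10). This is a sketch of the general
 idea, not Cassandra's actual implementation, and the 1 second mean heartbeat
 interval is an assumption:

```python
import math


def phi(seconds_since_heartbeat: float, mean_interval: float) -> float:
    """Suspicion level: -log10 of the chance the heartbeat is merely late."""
    return seconds_since_heartbeat / (mean_interval * math.log(10))


def silence_before_conviction(threshold: float, mean_interval: float) -> float:
    """Seconds without a heartbeat before phi reaches the threshold."""
    return threshold * mean_interval * math.log(10)


if __name__ == "__main__":
    for threshold in (8, 10, 12):
        seconds = silence_before_conviction(threshold, mean_interval=1.0)
        print(f"threshold {threshold}: convicted after {seconds:.1f}s of silence")
```

 Under these assumptions, going from 8 to 10 stretches the tolerated silence
 from roughly 18.4 to roughly 23.0 seconds, which gives a busy node a little
 more slack during streaming.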


 Mark


 On Tue, Aug 5, 2014 at 7:22 PM, Robert Coli rc...@eventbrite.com wrote:

 On Tue, Aug 5, 2014 at 1:28 AM, Vasileios Vlachos 
 vasileiosvlac...@gmail.com wrote:

 The problem is that the nodetool seems to be stuck, and nodetool
 netstats on node1 of DC2 appears to be stuck at 10% streaming a 5G file
 from node2 at DC1. This doesn't tally with nodetool netstats when running
 it against either of the DC1 nodes. The DC1 nodes don't think they stream
 anything to DC2.


 Yes, streaming is fragile and breaks and hangs forever and your only
 option in most cases is to stop the rebuilding node, nuke its data, and
 start again.

 I believe you might be able to tune the phi detector threshold to help
 this operation complete, hopefully someone with direct experience of same
 will chime in.

 =Rob






Re: Node stuck during nodetool rebuild

2014-08-06 Thread Vasileios Vlachos
Actually, there is something else I would like to ask. Do you know if phi is
related to streaming_socket_timeout_in_ms? It seems to be set to infinity
by default. Could that be related to the hanging behaviour of rebuild? Would
you recommend changing the default, or have I completely misinterpreted its
meaning?

Many thanks,

Vasilis






Re: Cassandra process exiting mysteriously

2014-08-06 Thread Clint Kelly
Hi Rob,

Thanks for the clarification; this is really useful.  I'll run some
experiments to see if the problem is a JVM OOM on our build machine.

Best regards,
Clint

On Wed, Aug 6, 2014 at 1:14 PM, Robert Coli rc...@eventbrite.com wrote:
 On Wed, Aug 6, 2014 at 1:12 PM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Aug 6, 2014 at 1:11 AM, Duncan Sands duncan.sa...@gmail.com
 wrote:

 this doesn't look like an OOM to me.  If the kernel OOM kills Cassandra
 then Cassandra instantly vaporizes, and there will be nothing in the
 Cassandra logs (you will find information about the OOM in the system logs
 though, eg in dmesg).  In the log snippet above you see an orderly shutdown,
 this is completely different to the instant OOM kill.


 Not really.

 https://issues.apache.org/jira/browse/CASSANDRA-7507


 To be clear, there's two different OOMs here, I am talking about the JVM
 OOM, not system level. As CASSANDRA-7507 indicates, JVM OOM does not
 necessarily result in the cassandra process dying, and can in fact trigger
 clean shutdown.

 System level OOM will in fact send the equivalent of KILL, which will not
 trigger the clean shutdown hook in Cassandra.

 =Rob


Re: Issue with ALLOW FILTERING

2014-08-06 Thread Robert Coli
On Wed, Aug 6, 2014 at 1:46 AM, Jens Rantil jens.ran...@tink.se wrote:

 Your answer was what I was hoping for - that means it's not impossible to
 solve ;)

 I'll keep an eye on the issue and in case I have the time I will dig into
 some code.


I just have to mention here that:

ALLOW FILTERING should be renamed PROBABLY TIMEOUT in order to properly
describe its typical performance.

If you find yourself having to use ALLOW FILTERING, it is possible you are
Doing It Wrong.. :D
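
A toy model of why that is. This is plain Python, not Cassandra, and every
name in it is made up for illustration: a lookup by partition key only
touches the rows of one partition, while a filtering scan has to examine
every row in the table.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]


class ToyTable:
    """Rows grouped by partition key, as a stand-in for a table."""

    def __init__(self) -> None:
        self.partitions: Dict[int, List[Row]] = defaultdict(list)

    def insert(self, key: int, row: Row) -> None:
        self.partitions[key].append(row)

    def by_partition(self, key: int,
                     predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
        """Return (matches, rows examined) for a single partition."""
        rows = self.partitions.get(key, [])
        return [row for row in rows if predicate(row)], len(rows)

    def scan_with_filter(self,
                         predicate: Callable[[Row], bool]) -> Tuple[List[Row], int]:
        """Return (matches, rows examined) for a scan over every partition."""
        matches: List[Row] = []
        examined = 0
        for rows in self.partitions.values():
            examined += len(rows)
            matches.extend(row for row in rows if predicate(row))
        return matches, examined


def build_table(partitions: int = 1000, rows_per_partition: int = 10) -> ToyTable:
    table = ToyTable()
    for key in range(partitions):
        for i in range(rows_per_partition):
            table.insert(key, {"key": key, "c": i})
    return table


if __name__ == "__main__":
    table = build_table()
    wanted = lambda row: row["key"] == 42 and row["c"] == 7
    print("by partition:", table.by_partition(42, wanted)[1], "rows examined")
    print("filtering scan:", table.scan_with_filter(wanted)[1], "rows examined")
```

Both calls return the same single row, but the scan examines 10,000 rows to
find it while the partition lookup examines 10.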

=Rob