RE: [EXTERNAL] Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-28 Thread Durity, Sean R
This may sound a bit harsh, but I teach my developers that if they are trying to use ALLOW FILTERING – they are doing it wrong! We often choose Cassandra for its high availability and scalability characteristics. We love no downtime. ALLOW FILTERING is breaking the rules of availability and

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-28 Thread Attila Wind
Hi Shalom, Thanks for your notes! So you have also experienced this... fine. Then maybe the best rules to follow are these: a) never(!) run an "ALLOW FILTERING" query on a production cluster b) if you need these queries, build a test cluster (somehow) and mirror the data (somehow) OR add

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-28 Thread shalom sagges
Hi Attila, I'm definitely no guru, but I've experienced several cases where people at my company used allow filtering and caused major performance issues. As data size increases, the impact will be stronger. If you have large partitions, performance will decrease. GC can be affected. And if GC

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-28 Thread Attila Wind
Hi Gurus, It looks like this thread has stalled. However, I would be very curious to hear answers regarding b) ... Does anyone have any comments on that? I do see this as a potential production outage risk now... Especially as we are planning to run analysis queries by hand exactly like that over the cluster...

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-23 Thread Attila Wind
Hi again, so staying with a) for a second... "Why am I using ALLOW FILTERING in the first place?" Fully agreed! To put it this way: as a reviewer I never want to see the string "allow filtering" occur in any SELECT issued by production code. I clearly consider it an indicator of a wrong

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-23 Thread shalom sagges
"a) Interesting... But only if you do not provide the partition key, right? (So IN() is for the partition key?)" I think you should ask yourself a different question. Why am I using ALLOW FILTERING in the first place? What happens if I remove it from the query? I prefer to denormalize the data to

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-22 Thread Attila Wind
Hi, "When you run a query with allow filtering, Cassandra doesn't know where the data is located, so it has to go node by node, searching for the requested data." a) Interesting... But only if you do not provide the partition key, right? (So IN() is for the partition key?) b) Still does

Re: Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-22 Thread shalom sagges
Hi Vsevolod, 1) Why such behavior? I thought any given SELECT request is handled by a limited subset of C* nodes and not by all of them, as per connection consistency/table replication settings, in case. When you run a query with allow filtering, Cassandra doesn't know where the data is located,
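The routing point in this reply can be illustrated with a toy model. This is only a minimal sketch and not Cassandra's actual placement logic: the node names, the replication factor of 3, the MD5-based placement, and the helper names `replicas_for` and `candidate_nodes` are all assumptions made for illustration. It shows that a query restricted to one partition key maps to a small set of replica nodes, while a query with no partition key restriction has no such set and every node is a candidate.

```python
import hashlib
from typing import List, Optional, Set

# Assumption: 8 nodes as in the original post; the names are made up.
NODES = [f"node{i}" for i in range(1, 9)]
# Assumption: the thread does not state the replication factor.
REPLICATION_FACTOR = 3


def replicas_for(partition_key: str) -> List[str]:
    """Toy placement: hash the key to a start node, then take RF consecutive nodes."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(REPLICATION_FACTOR)]


def candidate_nodes(partition_keys: Optional[List[str]]) -> Set[str]:
    """Nodes that may hold data for a query.

    With partition keys, only the replicas of those keys are candidates.
    Without them, the location of matching rows is unknown, so every node is.
    """
    if not partition_keys:
        return set(NODES)
    nodes: Set[str] = set()
    for key in partition_keys:
        nodes.update(replicas_for(key))
    return nodes


if __name__ == "__main__":
    print(sorted(candidate_nodes(["user-1"])))  # a few replicas
    print(sorted(candidate_nodes(None)))        # the whole cluster
    many_keys = [f"user-{i}" for i in range(200)]
    print(len(candidate_nodes(many_keys)))      # large IN lists spread widely
```

In this model a single key maps to three nodes, whereas a large IN list over many partition keys ends up spread across most or all of the cluster, so supplying partition keys alone does not guarantee that only a few nodes are involved.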

Select in allow filtering stalls whole cluster. How to prevent such behavior?

2019-05-22 Thread Vsevolod Filaretov
Hello everyone, We have an 8-node C* cluster with a large volume of unbalanced data. The usual per-partition selects work reasonably well and are processed by a limited number of nodes, but if a user issues SELECT WHERE IN () ALLOW FILTERING, the command brings all 8 nodes to a halt and makes them unresponsive to