[jira] [Created] (CASSANDRA-8141) Versioned rows

2014-10-19 Thread Robert Stupp (JIRA)
Robert Stupp created CASSANDRA-8141:
---

 Summary: Versioned rows
 Key: CASSANDRA-8141
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8141
 Project: Cassandra
  Issue Type: New Feature
Reporter: Robert Stupp


People still talk about global locks and distributed transactions. I think 
that introducing such things is both painful to implement and dangerous for a 
distributed application.

But it could be manageable to introduce versioned rows.

By versioned rows I mean the ability to issue a SELECT against data that was 
valid at a specified timestamp - something like {{SELECT ... WITH READTIME=1413724696473}}.

In combination with something like {{UPDATE ... IF NOT MODIFIED SINCE 
1413724696473}} it could be powerful. (Sure, this one could already be 
achieved by the application today.) 
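
For illustration, a minimal sketch of one way an application can emulate this 
today with the DataStax Java driver (the keyspace, table and column names are 
invented for this example, and it is a value-based compare-and-set rather than 
a real time-based check):

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class ConditionalUpdateSketch
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_ks");

        // Read the value and remember the write time we saw it at.
        Row row = session.execute(
                "SELECT val, WRITETIME(val) FROM my_table WHERE id = ?", 42L).one();
        String valueWhenRead = row.getString("val");
        long writeTimeWhenRead = row.getLong(1);   // roughly what READTIME would expose

        // There is no server-side "IF NOT MODIFIED SINCE <timestamp>", so we
        // emulate it with a conditional update on the value we read: the update
        // only applies if nobody changed the row in the meantime.
        ResultSet rs = session.execute(
                "UPDATE my_table SET val = ? WHERE id = ? IF val = ?",
                "new value", 42L, valueWhenRead);
        boolean applied = rs.one().getBool("[applied]");
        System.out.println(applied
                           ? "updated (unchanged since " + writeTimeWhenRead + ")"
                           : "row was modified since we read it");

        cluster.close();
    }
}
{code}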

It's just an idea I'd like to discuss.

We already have something like versioned rows implicitly, since the old data 
is still in the SSTables. Besides that, it could be necessary to:
* not throw away old columns/rows for some configurable timespan
* extend the row cache to optionally maintain old data
* (surely something more)






[jira] [Commented] (CASSANDRA-8116) HSHA fails with default rpc_max_threads setting

2014-10-19 Thread Mike Adamson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176336#comment-14176336
 ] 

Mike Adamson commented on CASSANDRA-8116:
-

It's pretty much guaranteed to OOM because the number of handlers per 
SelectorThread is based on the max pool size, which by default is 
Integer.MAX_VALUE. This change happened as part of CASSANDRA-7594.

I suppose you could check and throw if the value is Integer.MAX_VALUE, but you 
aren't going to be able to check every value. It also happens when the node is 
started, so it's pretty immediate, which is why I suggested a doc change rather 
than a code change.
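
As a rough illustration only - this is not the attached patch, and the names 
are simplified assumptions rather than the real config/startup code - such a 
check might look like:

{code:java}
import org.apache.cassandra.exceptions.ConfigurationException;

// Hypothetical startup validation sketch; names and structure are invented
// for illustration and do not match the actual DatabaseDescriptor code.
public final class RpcMaxThreadsCheck
{
    static void validate(String rpcServerType, int rpcMaxThreads) throws ConfigurationException
    {
        // HSHA sizes its per-SelectorThread handler pool from rpc_max_threads,
        // so leaving it at the "unlimited" default (Integer.MAX_VALUE) makes
        // the node try to build an enormous pool and OOM soon after startup.
        if ("hsha".equalsIgnoreCase(rpcServerType) && rpcMaxThreads == Integer.MAX_VALUE)
        {
            throw new ConfigurationException(
                    "rpc_max_threads must be set to a finite value in cassandra.yaml " +
                    "when rpc_server_type is hsha");
        }
        // A merely "too large" finite value would still slip through, which is
        // the "you aren't going to be able to check every value" problem above.
    }
}
{code}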

 HSHA fails with default rpc_max_threads setting
 ---

 Key: CASSANDRA-8116
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8116
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Mike Adamson
Assignee: Mike Adamson
Priority: Minor
 Fix For: 2.0.11, 2.1.1

 Attachments: 8116.txt


 The HSHA server fails with an 'Out of heap space' error if rpc_max_threads 
 is left at its default setting (unlimited) in cassandra.yaml.
 I'm not proposing any code change for this, but have submitted a patch for a 
 comment change in cassandra.yaml indicating that rpc_max_threads needs to be 
 changed if you use HSHA.





[jira] [Commented] (CASSANDRA-7949) LCS compaction low performance, many pending compactions, nodes are almost idle

2014-10-19 Thread Nikolai Grigoriev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176426#comment-14176426
 ] 

Nikolai Grigoriev commented on CASSANDRA-7949:
--

[~krummas]

Marcus,

Which patch are you talking about? I am running the latest DSE with Cassandra 
2.0.10.

 LCS compaction low performance, many pending compactions, nodes are almost 
 idle
 ---

 Key: CASSANDRA-7949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: DSE 4.5.1-1, Cassandra 2.0.8
Reporter: Nikolai Grigoriev
 Attachments: iostats.txt, nodetool_compactionstats.txt, 
 nodetool_tpstats.txt, pending compactions 2day.png, system.log.gz, vmstat.txt


 I've been evaluating a new cluster of 15 nodes (32 cores, 6x800Gb SSD disks + 
 2x600Gb SAS, 128Gb RAM, OEL 6.5) and I've built a simulator that creates a 
 load similar to the load in our future product. Before running the simulator 
 I had to pre-generate enough data. This was done using Java code and the 
 DataStax Java driver (a minimal sketch of such a loader is shown after this 
 description). Without going too deep into details, two tables have been 
 generated. Each table currently has about 55M rows and between a few dozen 
 and a few thousand columns in each row.
 This data generation process produced a massive amount of non-overlapping 
 data, so the activity was write-only and highly parallel. This is not the 
 type of traffic that the system will ultimately have to deal with; in the 
 future it will be a mix of reads and updates to existing data. This is just 
 to explain the choice of LCS, not to mention the expensive SSD disk space.
 At some point while generating the data I noticed that the compactions 
 started to pile up. I knew that I was overloading the cluster, but I still 
 wanted the generation test to complete, expecting to give the cluster enough 
 time afterwards to finish the pending compactions and get ready for real 
 traffic.
 However, after the storm of write requests had stopped, I noticed that the 
 number of pending compactions remained constant (and even climbed up a 
 little) on all nodes. After trying to tune some parameters (like setting the 
 compaction bandwidth cap to 0) I noticed a strange pattern: the nodes were 
 compacting one of the CFs in a single stream using virtually no CPU and no 
 disk I/O. This process was taking hours. It would then be followed by a 
 short burst of a few dozen compactions running in parallel (CPU at 2000%, 
 some disk I/O - up to 10-20%) before getting stuck again for many hours 
 doing one compaction at a time. So it looks like this:
 # nodetool compactionstats
 pending tasks: 3351
    compaction type   keyspace   table         completed     total           unit   progress
    Compaction        myks       table_list1   66499295588   1910515889913   bytes  3.48%
 Active compaction remaining time : n/a
 # df -h
 ...
 /dev/sdb   1.5T  637G  854G  43%  /cassandra-data/disk1
 /dev/sdc   1.5T  425G  1.1T  29%  /cassandra-data/disk2
 /dev/sdd   1.5T  429G  1.1T  29%  /cassandra-data/disk3
 # find . -name '*table_list1*Data*' | grep -v snapshot | wc -l
 1310
 Among these files I see:
 1043 files of 161Mb (my sstable size is 160Mb)
 9 large files - 3 between 1 and 2Gb, 3 of 5-8Gb, and one each of 55Gb, 70Gb 
 and 370Gb
 263 files of various sizes - between a few dozen Kb and 160Mb
 I've been running the heavy load for about 1.5 days, it has been close to 3 
 days since then, and the number of pending compactions does not go down.
 I have applied one of the not-so-obvious recommendations to disable 
 multithreaded compaction, and that seems to be helping a bit - some nodes 
 have started to have fewer pending compactions. About half of the cluster, 
 in fact. But even there they are sitting idle most of the time, lazily 
 compacting in one stream with CPU at ~140% and occasionally doing bursts of 
 compaction work for a few minutes.
 I am wondering if this is really a bug, or something in the LCS logic that 
 would manifest itself only in such an edge-case scenario where lots of 
 unique data has been loaded quickly.
 By the way, I see this pattern only for one of the two tables - the one that 
 has about 4 times more data than the other (space-wise; the number of rows 
 is the same). It looks like all these pending compactions are really only 
 for that larger table.
 I'll be attaching the relevant logs shortly.
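
For context, a minimal sketch of the kind of write-only, highly parallel 
loader described above (the keyspace, table, columns and payload are invented 
for this example; the real simulator is not attached), using the DataStax Java 
driver:

{code:java}
import java.util.ArrayList;
import java.util.List;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;

public class LoadGenerator
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_ks");
        PreparedStatement insert = session.prepare(
                "INSERT INTO table_list1 (key, col, val) VALUES (?, ?, ?)");

        List<ResultSetFuture> inFlight = new ArrayList<>();
        for (long key = 0; key < 55_000_000L; key++)
        {
            // Non-overlapping, write-only data: every (key, col) pair is
            // unique, so nothing written here ever updates an existing cell.
            for (int col = 0; col < 100; col++)
            {
                BoundStatement bound = insert.bind(key, col, "some payload");
                inFlight.add(session.executeAsync(bound));

                // Bound the number of in-flight async writes so the client
                // does not exhaust its own memory.
                if (inFlight.size() >= 1000)
                {
                    for (ResultSetFuture f : inFlight)
                        f.getUninterruptibly();
                    inFlight.clear();
                }
            }
        }
        for (ResultSetFuture f : inFlight)
            f.getUninterruptibly();
        cluster.close();
    }
}
{code}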





[jira] [Commented] (CASSANDRA-4476) Support 2ndary index queries with only non-EQ clauses

2014-10-19 Thread Alexey Filippov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14176444#comment-14176444
 ] 

Alexey Filippov commented on CASSANDRA-4476:


Regarding {{Type.allowsIndexQuery}}, which operations do we want to support 
index queries for now? {{LT, LTE, GTE, GT}} seem to work just fine; what about 
{{IN}} and {{NEQ}}? 


 Support 2ndary index queries with only non-EQ clauses
 -

 Key: CASSANDRA-4476
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4476
 Project: Cassandra
  Issue Type: Improvement
  Components: API, Core
Reporter: Sylvain Lebresne
Priority: Minor
  Labels: cql
 Fix For: 2.1.2


 Currently, a query that uses 2ndary indexes must have at least one EQ clause 
 (on an indexed column). Given that indexed CFs are local (and use the 
 LocalPartitioner, which orders the rows by the type of the indexed column), 
 we should extend 2ndary indexes to allow querying indexed columns even when 
 no EQ clause is provided.
 As far as I can tell, the main problem to solve for this is to update 
 KeysSearcher.highestSelectivityPredicate(), i.e. how do we estimate the 
 selectivity of non-EQ clauses? I note however that if we can do that 
 estimate reasonably accurately, this might provide better performance even 
 for index queries that have both EQ and non-EQ clauses, because some non-EQ 
 clauses may have a much better selectivity than EQ ones (say you index both 
 the user country and birth date; for SELECT * FROM users WHERE country = 
 'US' AND birthdate > 'Jan 2009' AND birthdate < 'July 2009', you'd better 
 use the birthdate index first).
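
To make the selectivity question above concrete, here is a toy sketch (this is 
not the real KeysSearcher code; the class, the method names and the 
uniform-distribution assumption are invented purely for illustration) of how 
EQ and range clauses might be compared:

{code:java}
public final class SelectivityToy
{
    // Fraction of rows an EQ clause is expected to match, given some estimate
    // of how many rows carry that particular value.
    static double eqSelectivity(long estimatedMatchingRows, long totalRows)
    {
        return (double) estimatedMatchingRows / Math.max(1, totalRows);
    }

    // Fraction of rows a range clause is expected to match, assuming the
    // indexed values are roughly uniform between min and max. Real data is
    // rarely uniform, which is exactly why the estimate is the hard part.
    static double rangeSelectivity(double min, double max, double lower, double upper)
    {
        double span = Math.max(1e-9, max - min);
        double covered = Math.max(0, Math.min(upper, max) - Math.max(lower, min));
        return covered / span;
    }

    public static void main(String[] args)
    {
        long totalUsers = 1_000_000;
        // country = 'US': suppose roughly 40% of the users match.
        double countryEq = eqSelectivity(400_000, totalUsers);
        // birthdate in a 6-month window out of ~60 years of birthdates (days).
        double birthdateRange = rangeSelectivity(0, 60 * 365, 49 * 365, 49 * 365 + 182);

        String driveWith = birthdateRange < countryEq ? "birthdate" : "country";
        System.out.printf("country EQ: %.4f, birthdate range: %.4f -> drive with the %s index%n",
                          countryEq, birthdateRange, driveWith);
    }
}
{code}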


