[jira] [Resolved] (CASSANDRA-9957) Unable to build Apache Cassandra Under Debian 8 OS with the provided ant script

2015-08-03 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9957.
---
Resolution: Not A Problem

Something is broken in your environment, but this is not a C* bug.

 Unable to build Apache Cassandra Under Debian 8 OS with the provided ant 
 script
 ---

 Key: CASSANDRA-9957
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9957
 Project: Cassandra
  Issue Type: Bug
 Environment: PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
 NAME="Debian GNU/Linux"
 VERSION_ID="8"
 VERSION="8 (jessie)"
 ID=debian
 HOME_URL="http://www.debian.org/"
 SUPPORT_URL="http://www.debian.org/support/"
 BUG_REPORT_URL="https://bugs.debian.org/"
  
 java version "1.8.0_45"
 Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
 Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)
 Apache Ant(TM) version 1.9.5 compiled on May 31 2015
Reporter: Adelin M.Ghanayem
  Labels: Cassandra, ant, build, build.xml

 Trying to use the CCM (Cassandra Cluster Manager) tool, I've been blocked by 
 an issue related to compiling the Cassandra source. CCM installs Cassandra by 
 building its source before anything else; however, CCM threw an error. 
 You can find all the info you need here: 
 https://gist.github.com/AdelinGhanaem/593d1c8a63857113d0a7 
 I then tried to download the source and compile it with ant jar, but 
 I got the same error. 
 Basically, the jars that are installed when running ant jar are corrupted! 
 Extracting them with jar xf throws an error. 
 The only way I could build the source was by downloading the jars by hand 
 from Maven. I've described the error and the process in this post:
  
 http://mradelin.blogspot.com/2015/07/error-packaging-cassandra-220-db-source_31.html
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-02 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5220:
--
Assignee: Marcus Olsson

 Repair improvements when using vnodes
 -

 Key: CASSANDRA-5220
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.0 beta 1
Reporter: Brandon Williams
Assignee: Marcus Olsson
  Labels: performance, repair
 Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
 cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
 cassandra-3.0-5220.patch


 Currently when using vnodes, repair takes much longer to complete than 
 without them.  This appears at least in part because it's using a session per 
 range and processing them sequentially.  This generates a lot of log spam 
 with vnodes, and while being gentler and lighter on hard disk deployments, 
 ssd-based deployments would often prefer that repair be as fast as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-5220) Repair improvements when using vnodes

2015-08-02 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5220:
--
Reviewer: Stefania  (was: Yuki Morishita)

Reassigning review to [~Stefania]

 Repair improvements when using vnodes
 -

 Key: CASSANDRA-5220
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.2.0 beta 1
Reporter: Brandon Williams
Assignee: Marcus Olsson
  Labels: performance, repair
 Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, 
 cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, 
 cassandra-3.0-5220.patch


 Currently when using vnodes, repair takes much longer to complete than 
 without them.  This appears at least in part because it's using a session per 
 range and processing them sequentially.  This generates a lot of log spam 
 with vnodes, and while being gentler and lighter on hard disk deployments, 
 ssd-based deployments would often prefer that repair be as fast as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9947) nodetool verify is broken

2015-07-31 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-9947:
-

 Summary: nodetool verify is broken
 Key: CASSANDRA-9947
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9947
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Priority: Critical
 Fix For: 2.2.x


Raised these issues on CASSANDRA-5791, but didn't revert/re-open, so they were 
ignored:

We mark sstables that fail verification as unrepaired, but that's not going to 
do what you think.  What it means is that the local node will use that sstable 
in the next repair, but other nodes will not. So all we'll end up doing is 
streaming whatever data we can read from it, to the other replicas.  If we 
could magically mark whatever sstables correspond on the remote nodes, to the 
data in the local sstable, that would work, but we can't.

IMO what we should do is:

* scrub, because it's quite likely we'll fail reading from the sstable 
otherwise, and
* full repair across the data range covered by the sstable

Additionally,

* I'm not sure that keeping extended verify code around is worth it. Since 
the point is to work around not having a checksum, we could just scrub instead. 
This is slightly more heavyweight but it would be a one-time cost (scrub would 
build a new checksum) and we wouldn't have to worry about keeping two versions 
of almost-the-same-code in sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9947) nodetool verify is broken

2015-07-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649199#comment-14649199
 ] 

Jonathan Ellis commented on CASSANDRA-9947:
---

IMO we should disable verify for 2.2.1 until we can rearchitect it since this 
is a nontrivial change.

 nodetool verify is broken
 -

 Key: CASSANDRA-9947
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9947
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Priority: Critical
 Fix For: 2.2.x


 Raised these issues on CASSANDRA-5791, but didn't revert/re-open, so they 
 were ignored:
 We mark sstables that fail verification as unrepaired, but that's not going 
 to do what you think.  What it means is that the local node will use that 
 sstable in the next repair, but other nodes will not. So all we'll end up 
 doing is streaming whatever data we can read from it, to the other replicas.  
 If we could magically mark whatever sstables correspond on the remote nodes, 
 to the data in the local sstable, that would work, but we can't.
 IMO what we should do is:
 * scrub, because it's quite likely we'll fail reading from the sstable 
 otherwise, and
 * full repair across the data range covered by the sstable
 Additionally,
 * I'm not sure that keeping extended verify code around is worth it. Since 
 the point is to work around not having a checksum, we could just scrub 
 instead. This is slightly more heavyweight but it would be a one-time cost 
 (scrub would build a new checksum) and we wouldn't have to worry about 
 keeping two versions of almost-the-same-code in sync.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-5791) A nodetool command to validate all sstables in a node

2015-07-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649200#comment-14649200
 ] 

Jonathan Ellis commented on CASSANDRA-5791:
---

Created CASSANDRA-9947 to follow up.

 A nodetool command to validate all sstables in a node
 -

 Key: CASSANDRA-5791
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5791
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: sankalp kohli
Assignee: Jeff Jirsa
Priority: Minor
 Fix For: 2.2.0 beta 1

 Attachments: cassandra-5791-20150319.diff, 
 cassandra-5791-patch-3.diff, cassandra-5791.patch-2


 Currently there is no nodetool command to validate all sstables on disk. The 
 only way to do this is to run a repair and see if it succeeds, but we cannot 
 repair the system keyspace. 
 We can also run upgradesstables, but that rewrites all the sstables. 
 This command should check the hash of all sstables and return whether all 
 data is readable or not. This should NOT care about consistency. 
 The compressed sstables do not have a hash, so I'm not sure how it will work there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-8143) Partitioner should not be accessed through StorageService

2015-07-31 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-8143:
---

 Partitioner should not be accessed through StorageService
 -

 Key: CASSANDRA-8143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Branimir Lambov
Assignee: Branimir Lambov
 Fix For: 3.0 beta 1


 The configured partitioner is no longer the only partitioner in use in the 
 database, as e.g. index tables use LocalPartitioner.
 To make sure the correct partitioner is used for each table, accesses of 
 StorageService.getPartitioner() should be replaced with retrieval of the 
 CFS-specific partitioner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9031) nodetool info -T throws ArrayOutOfBounds when the node has not joined the cluster

2015-07-31 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9031:
--
Reviewer: Stefania

[~Stefania] to review

 nodetool info -T throws ArrayOutOfBounds when the node has not joined the 
 cluster
 -

 Key: CASSANDRA-9031
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9031
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Ron Kuris
Assignee: Yuki Morishita
 Fix For: 2.1.x

 Attachments: patch.txt


 To reproduce, bring up a node that does not join the cluster, either using 
 -Dcassandra.write_survey=true or -Dcassandra.join_ring=false, then run 
 'nodetool info -T'. You'll get the following stack trace:
 {code}ID : e384209f-f7a9-4cff-8fd5-03adfaa0d846
 Gossip active  : true
 Thrift active  : true
 Native Transport active: true
 Load   : 76.69 KB
 Generation No  : 1427229938
 Uptime (seconds)   : 728
 Heap Memory (MB)   : 109.93 / 826.00
 Off Heap Memory (MB)   : 0.01
 Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, 
 Size: 0
   at java.util.ArrayList.rangeCheck(ArrayList.java:635)
   at java.util.ArrayList.get(ArrayList.java:411)
   at org.apache.cassandra.tools.NodeProbe.getEndpoint(NodeProbe.java:676)
   at 
 org.apache.cassandra.tools.NodeProbe.getDataCenter(NodeProbe.java:694)
   at org.apache.cassandra.tools.NodeCmd.printInfo(NodeCmd.java:666)
   at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1277){code}
 After applying the attached patch, the new error is:
 {code}ID : a7d76a2a-82d2-4faa-94e1-a30df6663ebb
 Gossip active  : true
 Thrift active  : false
 Native Transport active: false
 Load   : 89.36 KB
 Generation No  : 1427231804
 Uptime (seconds)   : 12
 Heap Memory (MB)   : 135.49 / 826.00
 Off Heap Memory (MB)   : 0.01
 Exception in thread "main" java.lang.RuntimeException: This node does not 
 have any tokens. Perhaps it is not part of the ring?
   at org.apache.cassandra.tools.NodeProbe.getEndpoint(NodeProbe.java:678)
   at 
 org.apache.cassandra.tools.NodeProbe.getDataCenter(NodeProbe.java:698)
   at org.apache.cassandra.tools.NodeCmd.printInfo(NodeCmd.java:676)
   at org.apache.cassandra.tools.NodeCmd.main(NodeCmd.java:1313){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9483) Document incompatibilities with -XX:+PerfDisableSharedMem

2015-07-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649730#comment-14649730
 ] 

Jonathan Ellis commented on CASSANDRA-9483:
---

{{working:}} instead of {{working.}}, otherwise +1

 Document incompatibilities with -XX:+PerfDisableSharedMem
 -

 Key: CASSANDRA-9483
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9483
 Project: Cassandra
  Issue Type: Task
  Components: Config, Documentation & website
Reporter: Tyler Hobbs
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 3.0 beta 1

 Attachments: news_update.txt


 We recently discovered that [the Jolokia agent is incompatible with  the 
 -XX:+PerfDisableSharedMem JVM 
 option|https://github.com/rhuss/jolokia/issues/198].  I assume that this may 
 affect other monitoring tools as well.
 If we are going to leave this enabled by default, we should document the 
 potential problems with it.  A combination of a comment in 
 {{cassandra-env.sh}} (and the Windows equivalent) and a comment in NEWS.txt 
 should suffice, I think.
 If possible, it would be good to figure out what other tools are affected and 
 also mention them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9949) maxPurgeableTimestamp needs to check memtables too

2015-07-31 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9949:
--
Assignee: Stefania

 maxPurgeableTimestamp needs to check memtables too
 --

 Key: CASSANDRA-9949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Stefania
 Fix For: 2.1.x, 2.2.x


 overlapIterator/maxPurgeableTimestamp don't include the memtables, so a 
 very-out-of-order write could be ignored



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8143) Partitioner should not be accessed through StorageService

2015-07-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649505#comment-14649505
 ] 

Jonathan Ellis commented on CASSANDRA-8143:
---

From IRC:

{noformat}
exlt this is going to be a hold the presses moment - we're going to jack up 
jenkins really quickly with this not being fixed
exlt ... the last run (as well as several before that) of that branch job was 
aborted for running out of control - 
http://cassci.datastax.com/view/Dev/view/blambov/job/blambov-8143-partitioner-dtest/17/
exlt this shouldn't have been merged
{noformat}

reverted pending a fix

 Partitioner should not be accessed through StorageService
 -

 Key: CASSANDRA-8143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Branimir Lambov
Assignee: Branimir Lambov
 Fix For: 3.0 beta 1


 The configured partitioner is no longer the only partitioner in use in the 
 database, as e.g. index tables use LocalPartitioner.
 To make sure the correct partitioner is used for each table, accesses of 
 StorageService.getPartitioner() should be replaced with retrieval of the 
 CFS-specific partitioner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9949) maxPurgeableTimestamp needs to check memtables too

2015-07-31 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-9949:
-

 Summary: maxPurgeableTimestamp needs to check memtables too
 Key: CASSANDRA-9949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
 Fix For: 2.1.x, 2.2.x


overlapIterator/maxPurgeableTimestamp don't include the memtables, so a 
very-out-of-order write could be ignored



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9949) maxPurgeableTimestamp needs to check memtables too

2015-07-31 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14649931#comment-14649931
 ] 

Jonathan Ellis commented on CASSANDRA-9949:
---

Nit: should probably reverse the order of the predicates in {{timestamp < 
getMaxPurgeableTimestamp() && localDeletionTime < gcBefore}} since the former 
is expensive while the latter is not.
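
To make the suggestion concrete, here is a minimal, self-contained sketch of the short-circuit ordering; the method names are stand-ins borrowed from the comment above, not the actual Cassandra code:

{code}
// Hedged sketch: put the cheap integer comparison first so the expensive
// maxPurgeableTimestamp() lookup (which would consult overlapping sstables
// and memtables) only runs for tombstones that are old enough to purge at all.
public final class PurgeCheckSketch {
    static long maxPurgeableTimestamp() {
        // Stand-in for the expensive overlap scan.
        return Long.MAX_VALUE;
    }

    static boolean purgeable(long timestamp, int localDeletionTime, int gcBefore) {
        return localDeletionTime < gcBefore            // cheap, evaluated first
            && timestamp < maxPurgeableTimestamp();    // expensive, short-circuited
    }

    public static void main(String[] args) {
        System.out.println(purgeable(42L, 100, 200)); // true with these stand-in values
    }
}
{code}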

 maxPurgeableTimestamp needs to check memtables too
 --

 Key: CASSANDRA-9949
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9949
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
 Fix For: 2.1.x, 2.2.x


 overlapIterator/maxPurgeableTimestamp don't include the memtables, so a 
 very-out-of-order write could be ignored



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9935) Repair fails with RuntimeException

2015-07-30 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9935:
--
Assignee: Yuki Morishita

 Repair fails with RuntimeException
 --

 Key: CASSANDRA-9935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
 Project: Cassandra
  Issue Type: Bug
 Environment: C* 2.1.8, Debian Wheezy
Reporter: mlowicki
Assignee: Yuki Morishita
 Attachments: db1.sync.lati.osa.cassandra.log


 We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
 to 2.1.8 it started to work faster, but now it fails with:
 {code}
 ...
 [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
 for range (-5474076923322749342,-5468600594078911162] finished
 [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
 for range (-8631877858109464676,-8624040066373718932] finished
 [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
 for range (-5372806541854279315,-5369354119480076785] finished
 [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
 for range (8166489034383821955,8168408930184216281] finished
 [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
 for range (6084602890817326921,6088328703025510057] finished
 [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
 for range (-781874602493000830,-781745173070807746] finished
 [2015-07-29 20:44:03,957] Repair command #4 finished
 error: nodetool failed, check server logs
 -- StackTrace --
 java.lang.RuntimeException: nodetool failed, check server logs
 at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
 at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
 {code}
 After running:
 {code}
 nodetool repair --partitioner-range --parallel --in-local-dc sync
 {code}
 Last records in logs regarding repair are:
 {code}
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
 (-7695808664784761779,-7693529816291585568] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
 (806371695398849,8065203836608925992] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
 (-5474076923322749342,-5468600594078911162] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
 Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
 (-8631877858109464676,-8624040066373718932] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
 (-5372806541854279315,-5369354119480076785] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
 (8166489034383821955,8168408930184216281] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
 (6084602890817326921,6088328703025510057] finished
 INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
 Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
 (-781874602493000830,-781745173070807746] finished
 {code}
 but a bit above I see (at least two times in attached log):
 {code}
 ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
 Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
 (5765414319217852786,5781018794516851576] failed with error 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
 org.apache.cassandra.exceptions.RepairException: [repair 
 #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
 (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
 at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
 [na:1.7.0_80]
 at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
 [na:1.7.0_80]
 at 
 org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
  ~[apache-cassandra-2.1.8.jar:2.1.8]
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
 [apache-cassandra-2.1.8.jar:2.1.8]
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) 
 [na:1.7.0_80]
 at 

[jira] [Commented] (CASSANDRA-8325) Cassandra 2.1.x fails to start on FreeBSD (JVM crash)

2015-07-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647671#comment-14647671
 ] 

Jonathan Ellis commented on CASSANDRA-8325:
---

Basically we're waiting for someone who wants to run it badly enough to do the 
work.

 Cassandra 2.1.x fails to start on FreeBSD (JVM crash)
 -

 Key: CASSANDRA-8325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8325
 Project: Cassandra
  Issue Type: Bug
 Environment: FreeBSD 10.0 with openjdk version 1.7.0_71, 64-Bit 
 Server VM
Reporter: Leonid Shalupov
 Fix For: 2.1.x

 Attachments: hs_err_pid1856.log, system.log, unsafeCopy1.txt, 
 untested_8325.patch


 See attached error file after JVM crash
 {quote}
 FreeBSD xxx.intellij.net 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu 
 Jan 16 22:34:59 UTC 2014 
 r...@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC  amd64
 {quote}
 {quote}
  % java -version
 openjdk version "1.7.0_71"
 OpenJDK Runtime Environment (build 1.7.0_71-b14)
 OpenJDK 64-Bit Server VM (build 24.71-b01, mixed mode)
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9944) sstablesplit.bat does not split sstables

2015-07-30 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9944:
--
Assignee: Stefania

 sstablesplit.bat does not split sstables
 

 Key: CASSANDRA-9944
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9944
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Philip Thompson
Assignee: Stefania
 Fix For: 2.2.x


 The dtest {{sstablesplit_test.py:TestSStableSplit.split_test}} is failing on 
 Windows on 2.2-head. An sstable of approximately 280MB is created, and then we 
 run sstablesplit.bat on it. By default, we should split into 50MB sstables, 
 giving us six new sstables. Instead, nothing happens, and we are left with 
 the original.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9942) sstableofflinerelevel and sstablelevelreset don't have Windows versions

2015-07-30 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9942:
--
Assignee: Paulo Motta

 sstableofflinerelevel and sstablelevelreset don't have Windows versions
 -

 Key: CASSANDRA-9942
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9942
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Philip Thompson
Assignee: Paulo Motta
 Fix For: 2.2.x


 These two tools, located in tools/bin, do not have corresponding .bat versions, 
 so they do not run on Windows. This is also breaking their related dtests on 
 Windows.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9946) use ioprio_set instead of throttling by default

2015-07-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648573#comment-14648573
 ] 

Jonathan Ellis commented on CASSANDRA-9946:
---

Things I don't know:

# What is the Windows equivalent? /cc [~JoshuaMcKenzie]
# Should we pick a one-size-fits-all priority, or allow the user to override 
class/priority? /cc [~a...@ooyala.com]

 use ioprio_set instead of throttling by default
 ---

 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


 Compaction throttling works as designed, but it has two drawbacks:
 * it requires manual tuning to choose the right value for a given machine
 * it does not allow compaction to burst above its limit if there is 
 additional i/o capacity available while there are fewer application requests 
 to serve
 Using ioprio_set instead solves both of these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9946) use ioprio_set on compaction threads by default instead of manually throttling

2015-07-30 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9946:
--
Summary: use ioprio_set on compaction threads by default instead of 
manually throttling  (was: use ioprio_set instead of throttling by default)

 use ioprio_set on compaction threads by default instead of manually throttling
 --

 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


 Compaction throttling works as designed, but it has two drawbacks:
 * it requires manual tuning to choose the right value for a given machine
 * it does not allow compaction to burst above its limit if there is 
 additional i/o capacity available while there are fewer application requests 
 to serve
 Using ioprio_set instead solves both of these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9946) use ioprio_set instead of throttling by default

2015-07-30 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-9946:
-

 Summary: use ioprio_set instead of throttling by default
 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


Compaction throttling works as designed, but it has two drawbacks:

* it requires manual tuning to choose the right value for a given machine
* it does not allow compaction to burst above its limit if there is 
additional i/o capacity available while there are fewer application requests to 
serve

Using ioprio_set instead solves both of these problems.
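
For illustration only, a rough, untested sketch of what calling ioprio_set from a compaction thread could look like via JNA; the class names, the JNA dependency, the choice of the idle class, and the x86_64 syscall number are assumptions for discussion, not a proposed implementation:

{code}
import com.sun.jna.Library;
import com.sun.jna.Native;

// Sketch: lower the calling thread's I/O priority with the Linux ioprio_set
// syscall.  Constants below are the Linux/x86_64 values.
public final class IoPrioSketch {
    interface CLib extends Library {
        CLib INSTANCE = Native.load("c", CLib.class);
        long syscall(long number, Object... args); // glibc's variadic syscall(2) wrapper
    }

    private static final long NR_IOPRIO_SET      = 251; // __NR_ioprio_set on x86_64
    private static final int  IOPRIO_WHO_PROCESS = 1;   // "who" is a thread/process id
    private static final int  IOPRIO_CLASS_IDLE  = 3;   // only scheduled when the disk is idle
    private static final int  IOPRIO_CLASS_SHIFT = 13;

    /** Marks the calling thread's I/O as idle-class (pid 0 = calling thread). */
    static long markCallingThreadIdle() {
        int ioprio = IOPRIO_CLASS_IDLE << IOPRIO_CLASS_SHIFT;
        return CLib.INSTANCE.syscall(NR_IOPRIO_SET, IOPRIO_WHO_PROCESS, 0, ioprio);
    }

    public static void main(String[] args) {
        System.out.println("ioprio_set returned " + markCallingThreadIdle());
    }
}
{code}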



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9843) Augment or replace partition index with adaptive range filters

2015-07-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648568#comment-14648568
 ] 

Jonathan Ellis commented on CASSANDRA-9843:
---

[~danchia], the first thing we'd need is an ARF implementation that supports 
Cell.

 Augment or replace partition index with adaptive range filters
 --

 Key: CASSANDRA-9843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9843
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: T Jake Luciani
  Labels: performance

 Adaptive range filters are, in principle, bloom filters for range queries.  
 They provide a space-efficient way to avoid scanning a partition when we can 
 tell that we do not contain any data for the range requested.  Like BF, they 
 can return false positives but not false negatives.
 The implementation is of course totally different from BF.  ARF is a tree 
 where each leaf of the tree is a range of data and a bit, either on or off, 
 denoting whether we have *some* data in that range.
 ARF are described here: http://www.vldb.org/pvldb/vol6/p1714-kossmann.pdf
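
As a toy illustration of the leaf-with-a-bit idea (not the paper's adaptive structure, and not a proposed design), something like the following gives range queries Bloom-filter-like semantics: false positives are possible when a leaf is coarse, false negatives are not:

{code}
import java.util.Map;
import java.util.TreeMap;

// Toy sketch only: leaves partition the key space, each carrying one bit that
// says whether *some* data exists in that range.
public final class ArfSketch {
    // leaf start -> { leaf end, occupied bit }
    private final TreeMap<Long, long[]> leaves = new TreeMap<>();

    void addLeaf(long start, long end, boolean occupied) {
        leaves.put(start, new long[] { end, occupied ? 1 : 0 });
    }

    /** True if any leaf overlapping [start, end) has its bit set. */
    boolean mayContain(long start, long end) {
        Long from = leaves.floorKey(start);
        for (Map.Entry<Long, long[]> e : leaves.tailMap(from == null ? start : from).entrySet()) {
            long leafStart = e.getKey(), leafEnd = e.getValue()[0];
            if (leafStart >= end) break;
            if (leafEnd > start && e.getValue()[1] == 1) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ArfSketch arf = new ArfSketch();
        arf.addLeaf(0, 100, false);
        arf.addLeaf(100, 200, true);
        System.out.println(arf.mayContain(10, 50));  // false -> safe to skip the partition
        System.out.println(arf.mayContain(90, 120)); // true  -> must read (possibly a false positive)
    }
}
{code}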



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9946) use ioprio_set on compaction threads by default instead of manually throttling

2015-07-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14648652#comment-14648652
 ] 

Jonathan Ellis commented on CASSANDRA-9946:
---

CFQ is the default on Debian and RHEL.  Is there a syscall that can check if it's 
enabled first?



 use ioprio_set on compaction threads by default instead of manually throttling
 --

 Key: CASSANDRA-9946
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9946
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Ariel Weisberg
 Fix For: 3.x


 Compaction throttling works as designed, but it has two drawbacks:
 * it requires manual tuning to choose the right value for a given machine
 * it does not allow compaction to burst above its limit if there is 
 additional i/o capacity available while there are fewer application requests 
 to serve
 Using ioprio_set instead solves both of these problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646843#comment-14646843
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

Reverted 3bdcaa336a6e6a9727c333b433bb9f5d3afc0fb1 and 
b93f05d7d1490c6146576a35f5a572d9d0e72399 pending a fix.

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 alpha 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9894) Serialize the header only once per message

2015-07-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9894:
--
Reviewer: Ariel Weisberg

[~aweisberg] to review

 Serialize the header only once per message
 --

 Key: CASSANDRA-9894
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9894
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Benedict
 Fix For: 3.0 beta 1


 One last improvement I'd like to do on the serialization side is that we 
 currently serialize the {{SerializationHeader}} for each partition. That 
 header contains the serialized columns in particular and for range queries, 
 serializing that for every partition is wasted (note that it's only a problem 
 for the messaging protocol as for sstable we only write the header once per 
 sstable).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-29 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-6477:
---

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 alpha 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8931) IndexSummary (and Index) should store the token, and the minimal key to unambiguously direct a query

2015-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646225#comment-14646225
 ] 

Jonathan Ellis commented on CASSANDRA-8931:
---

Good idea.  This will save a lot of memory.

 IndexSummary (and Index) should store the token, and the minimal key to 
 unambiguously direct a query
 

 Key: CASSANDRA-8931
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8931
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Stefania
  Labels: performance

 Since these files are likely sticking around a little longer, it is probably 
 worth optimising them. A relatively simple change to Index and IndexSummary 
 could reduce the amount of space required significantly, reduce the CPU 
 burden of lookup, and hopefully bound the amount of space needed as key size 
 grows. On writing first we always store the token before the key (if it is 
 different to the key); then we simply truncate the whole record to the 
 minimum length necessary to answer an inequality search. Since the data file 
 contains the key also, we can corroborate we have the right key once we've 
 looked up. Since BFs are used to reduce unnecessary lookups, we don't save 
 much by ruling the false positives out one step earlier. 
  An improved follow up version would be to use a trie of shortest length to 
 answer inequality lookups, as this would also ensure very long keys with 
 common prefixes would not significantly increase the size of the index or 
 summary. This would translate to a trie index for the summary keying into a 
 static trie page for the index.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8931) IndexSummary (and Index) should store the token, and the minimal key to unambiguously direct a query

2015-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646126#comment-14646126
 ] 

Jonathan Ellis commented on CASSANDRA-8931:
---

bq. then we simply truncate the whole record to the minimum length necessary to 
answer an inequality search

Meaning, we only store enough to disambiguate from the records before and after?
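
To make the question concrete, a hedged sketch of "store only enough to disambiguate": keep the shortest prefix of each entry that still sorts after its predecessor, and rely on the data file's full key for the final check. This only illustrates the idea under discussion, not the proposed implementation:

{code}
import java.util.Arrays;

public final class MinimalKeySketch {
    /** Shortest prefix of key that still sorts strictly after the previous key (byte order). */
    static byte[] shortestSeparator(byte[] previous, byte[] key) {
        int i = 0;
        while (i < previous.length && i < key.length && previous[i] == key[i])
            i++;
        // one differing (or extra) byte is enough to answer an inequality search
        return Arrays.copyOf(key, Math.min(key.length, i + 1));
    }

    public static void main(String[] args) {
        byte[] prev = "applepie".getBytes();
        byte[] key  = "applesauce".getBytes();
        System.out.println(new String(shortestSeparator(prev, key))); // "apples": 6 bytes instead of 10
    }
}
{code}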

 IndexSummary (and Index) should store the token, and the minimal key to 
 unambiguously direct a query
 

 Key: CASSANDRA-8931
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8931
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Stefania
  Labels: performance

 Since these files are likely sticking around a little longer, it is probably 
 worth optimising them. A relatively simple change to Index and IndexSummary 
 could reduce the amount of space required significantly, reduce the CPU 
 burden of lookup, and hopefully bound the amount of space needed as key size 
 grows. On writing first we always store the token before the key (if it is 
 different to the key); then we simply truncate the whole record to the 
 minimum length necessary to answer an inequality search. Since the data file 
 contains the key also, we can corroborate we have the right key once we've 
 looked up. Since BFs are used to reduce unnecessary lookups, we don't save 
 much by ruling the false positives out one step earlier. 
  An improved follow up version would be to use a trie of shortest length to 
 answer inequality lookups, as this would also ensure very long keys with 
 common prefixes would not significantly increase the size of the index or 
 summary. This would translate to a trie index for the summary keying into a 
 static trie page for the index.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9738) Migrate key-cache to be fully off-heap

2015-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646091#comment-14646091
 ] 

Jonathan Ellis commented on CASSANDRA-9738:
---

Did you mean to link a different issue?

 Migrate key-cache to be fully off-heap
 --

 Key: CASSANDRA-9738
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9738
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.x


 Key cache still uses a concurrent map on-heap. This could go to off-heap and 
 feels doable now after CASSANDRA-8099.
 Evaluation should be done in advance based on a POC to prove that pure 
 off-heap counter cache buys a performance and/or gc-pressure improvement.
 In theory, elimination of on-heap management of the map should buy us some 
 benefit.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9895) Batchlog RF1 writes to a single node but not itself.

2015-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646142#comment-14646142
 ] 

Jonathan Ellis commented on CASSANDRA-9895:
---

Writing to localhost doesn't really improve our durability though, since it's 
already involved as the coordinator pushing the batch through.  (To be clear, 
it improves durability more than logging to nothing, but much less than logging 
to a different node.)

 Batchlog RF1 writes to a single node but not itself.
 -

 Key: CASSANDRA-9895
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9895
 Project: Cassandra
  Issue Type: Bug
Reporter: T Jake Luciani
Assignee: Aleksey Yeschenko
 Fix For: 2.1.x, 3.0 beta 1


 In the BatchlogManager, when selecting the endpoints to write the batchlog 
 to, for RF=1 we filter out any down nodes and the local node. 
 This means we require two nodes up but only write to one.  Why? This affects 
 availability since we need two nodes to write at CL.ONE.  
 If we *require* two copies of the batchlog then we should include ourselves in 
 the calculation.
 If we allow a batchlog write with only a single node up then we should write 
 to the local batchlog.
 The code is here: 
 https://github.com/apache/cassandra/blob/1c80b04be1d47d03bbde888cea960f5ff8a95d58/src/java/org/apache/cassandra/db/BatchlogManager.java#L530
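
A minimal sketch of the filtering behaviour being questioned, with assumed names rather than the actual BatchlogManager code: the coordinator and any down nodes are removed from the candidate list, so with RF=1 a second live node must be up even though only one copy is written:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Illustrative only; names and structure are assumptions.
public final class BatchlogEndpointSketch {
    static List<String> candidates(List<String> endpoints, String local, Predicate<String> isAlive) {
        List<String> filtered = new ArrayList<>();
        for (String ep : endpoints)
            if (!ep.equals(local) && isAlive.test(ep)) // drop ourselves and any down node
                filtered.add(ep);
        return filtered;
    }

    public static void main(String[] args) {
        Set<String> up = Set.of("10.0.0.1", "10.0.0.2");
        // Coordinator 10.0.0.1 never logs locally; the single remaining live
        // node receives the only copy of the batchlog entry.
        System.out.println(candidates(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"),
                                      "10.0.0.1", up::contains)); // [10.0.0.2]
    }
}
{code}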



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645359#comment-14645359
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

/cc [~nickmbailey]

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Stefania
Priority: Minor
  Labels: benedict-to-commit, compaction
 Fix For: 3.0 alpha 1

 Attachments: 7066.txt


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily clean up completed files, or conversely not clean up files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.
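
A minimal sketch of the proposed startup cleanup, under assumed names (this is not existing Cassandra code): every live sstable lists its direct ancestors, and anything whose generation appears in the union of those sets is a leftover that can be deleted:

{code}
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public final class AncestorCleanupSketch {
    record SSTable(int generation, Set<Integer> ancestors) {}

    /** Generations that are ancestors of some live sstable and therefore safe to delete. */
    static Set<Integer> leftovers(Collection<SSTable> onDisk) {
        Set<Integer> ancestorUnion = new HashSet<>();
        for (SSTable t : onDisk)
            ancestorUnion.addAll(t.ancestors());
        Set<Integer> toDelete = new HashSet<>();
        for (SSTable t : onDisk)
            if (ancestorUnion.contains(t.generation()))
                toDelete.add(t.generation());
        return toDelete;
    }

    public static void main(String[] args) {
        // sstable 5 was compacted from 1 and 2; since 1 and 2 are still on disk they are leftovers
        List<SSTable> onDisk = List.of(
                new SSTable(1, Set.of()),
                new SSTable(2, Set.of()),
                new SSTable(5, Set.of(1, 2)));
        System.out.println(leftovers(onDisk)); // prints generations 1 and 2 (order may vary)
    }
}
{code}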



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9790) CommitLogUpgradeTest.test{20,21} failure

2015-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9790:
--
Assignee: Ariel Weisberg  (was: Sylvain Lebresne)

 CommitLogUpgradeTest.test{20,21} failure
 

 Key: CASSANDRA-9790
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9790
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Michael Shuler
Assignee: Ariel Weisberg
Priority: Blocker
  Labels: test-failure
 Fix For: 3.0 beta 1


 These test failures started with the 8099 commit.
 {noformat}
 Stacktrace
 java.lang.IllegalArgumentException
   at java.nio.Buffer.limit(Buffer.java:275)
   at 
 org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:583)
   at 
 org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:592)
   at 
 org.apache.cassandra.db.marshal.CompositeType.splitName(CompositeType.java:197)
   at 
 org.apache.cassandra.db.LegacyLayout.decodeClustering(LegacyLayout.java:235)
   at 
 org.apache.cassandra.db.LegacyLayout.decodeCellName(LegacyLayout.java:127)
   at 
 org.apache.cassandra.db.LegacyLayout.readLegacyCellBody(LegacyLayout.java:672)
   at 
 org.apache.cassandra.db.LegacyLayout.readLegacyCell(LegacyLayout.java:643)
   at 
 org.apache.cassandra.db.LegacyLayout$8.computeNext(LegacyLayout.java:713)
   at 
 org.apache.cassandra.db.LegacyLayout$8.computeNext(LegacyLayout.java:702)
   at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
   at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
   at 
 com.google.common.collect.Iterators$PeekingImpl.hasNext(Iterators.java:1149)
   at 
 org.apache.cassandra.db.LegacyLayout.toUnfilteredRowIterator(LegacyLayout.java:310)
   at 
 org.apache.cassandra.db.LegacyLayout.onWireCellstoUnfilteredRowIterator(LegacyLayout.java:298)
   at 
 org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:670)
   at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:276)
   at 
 org.apache.cassandra.db.commitlog.CommitLogTestReplayer.replayMutation(CommitLogTestReplayer.java:66)
   at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.replaySyncSection(CommitLogReplayer.java:464)
   at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:370)
   at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:145)
   at 
 org.apache.cassandra.db.commitlog.CommitLogUpgradeTest.testRestore(CommitLogUpgradeTest.java:105)
   at 
 org.apache.cassandra.db.commitlog.CommitLogUpgradeTest.test21(CommitLogUpgradeTest.java:66)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9533) Make batch commitlog mode easier to tune

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644344#comment-14644344
 ] 

Jonathan Ellis commented on CASSANDRA-9533:
---

I think this goes without saying, but to be explicit, I'm not willing to spend 
time optimizing for single-threaded performance.  (Even if it's a regression 
from 2.0.)

 Make batch commitlog mode easier to tune
 

 Key: CASSANDRA-9533
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9533
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Benedict
 Fix For: 3.x


 As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms 
 from a maximum time to wait between fsync to the minimum time, so one must be 
 very careful to keep it small enough that most writers aren't kept waiting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9853) loadConfig() called twice on startup

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644350#comment-14644350
 ] 

Jonathan Ellis commented on CASSANDRA-9853:
---

+1

 loadConfig() called twice on startup
 

 Key: CASSANDRA-9853
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9853
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.x

 Attachments: 9853.txt


 {{YamlConfigurationLoader.loadConfig()}} is called twice on startup from 
 {{org.apache.cassandra.locator.SimpleSeedProvider#getSeeds}} and 
 {{org.apache.cassandra.config.DatabaseDescriptor#forceStaticInitialization}}.
 It's not nice, but not fatal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9853) loadConfig() called twice on startup

2015-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9853:
--
Reviewer: Jonathan Ellis  (was: Aleksey Yeschenko)

 loadConfig() called twice on startup
 

 Key: CASSANDRA-9853
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9853
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.x

 Attachments: 9853.txt


 {{YamlConfigurationLoader.loadConfig()}} is called twice on startup from 
 {{org.apache.cassandra.locator.SimpleSeedProvider#getSeeds}} and 
 {{org.apache.cassandra.config.DatabaseDescriptor#forceStaticInitialization}}.
 It's not nice, but not fatal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9880) ScrubTest.testScrubOutOfOrder should generate test file on the fly

2015-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9880:
--
Reviewer: Stefania

[~Stefania] to review

 ScrubTest.testScrubOutOfOrder should generate test file on the fly
 --

 Key: CASSANDRA-9880
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9880
 Project: Cassandra
  Issue Type: Bug
Reporter: Yuki Morishita
Assignee: Yuki Morishita
Priority: Blocker
  Labels: test-failure
 Fix For: 3.0 beta 1


 ScrubTest#testScrubOutOfOrder is failing on trunk because the serialization 
 format has changed since its pre-generated out-of-order SSTable was created.
 We should change the test to generate the out-of-order SSTable on the fly so that we 
 don't need to bother generating an SSTable by hand again.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9775) some paging dtests fail/flap on trunk

2015-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9775:
--
Reviewer: Benjamin Lerer

 some paging dtests fail/flap on trunk
 -

 Key: CASSANDRA-9775
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9775
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Jim Witschey
Assignee: Sylvain Lebresne
Priority: Blocker
 Fix For: 3.0 beta 1


 Several paging dtests fail on trunk:
 [static_columns_paging_test|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/junit/paging_test/TestPagingData/static_columns_paging_test/history/]
 [test_undefined_page_size_default|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/junit/paging_test/TestPagingSize/test_undefined_page_size_default/history/]
 [test_failure_threshold_deletions|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/junit/paging_test/TestPagingWithDeletions/test_failure_threshold_deletions/history/]
 I'm not sure if these are all rooted in the same underlying problem, so I 
 defer to whoever takes this ticket on.
 [~thobbs] I'm assigning you because this is about paging, but reassign as you 
 see fit. Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9895) Batchlog RF1 writes to a single node but not itself.

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644552#comment-14644552
 ] 

Jonathan Ellis commented on CASSANDRA-9895:
---

The idea was that the batchlog should give you the guarantee that you won't 
lose atomicity unless you lose 3 machines during the request (coordinator plus 
two others).  Allowing coordinator to be one of the replicas weakens this to 2.

 Batchlog RF1 writes to a single node but not itself.
 -

 Key: CASSANDRA-9895
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9895
 Project: Cassandra
  Issue Type: Bug
Reporter: T Jake Luciani
Assignee: Aleksey Yeschenko
 Fix For: 2.1.x, 3.0 beta 1


 In the BatchlogManager, when selecting the endpoints to write the batchlog 
 to, for RF=1 we filter out any down nodes and the local node. 
 This means we require two nodes up but only write to one.  Why? This affects 
 availability since we need two nodes to write at CL.ONE.  
 If we *require* two copies of the batchlog then we should include ourselves in 
 the calculation.
 If we allow a batchlog write with only a single node up then we should write 
 to the local batchlog.
 The code is here: 
 https://github.com/apache/cassandra/blob/1c80b04be1d47d03bbde888cea960f5ff8a95d58/src/java/org/apache/cassandra/db/BatchlogManager.java#L530



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644630#comment-14644630
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

bq. if we lose the new file, say, then we will delete the old files on 
startup none-the-wiser. The result being a partial replacement of the sstables 
(perhaps with nothing at all).

Isn't that worse than what I proposed a while back?

bq. log that the new [sstables] are in progress, then when they're done, we 
clear the in progress log file and delete the old files. If the process dies 
in between those two steps (very rare, deletes are fast) [or if the log file is 
corrupted] we have some extra redundant data left but correctness is preserved.

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Stefania
Priority: Minor
  Labels: benedict-to-commit, compaction
 Fix For: 3.0 alpha 1

 Attachments: 7066.txt


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily clean up completed files, or conversely not clean up files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644659#comment-14644659
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

Granted, but a bug in the implementation could lead to similar results.  I'd be 
a lot more comfortable with a design whose failure scenario is we do extra 
compaction than we silently lose data.

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Stefania
Priority: Minor
  Labels: benedict-to-commit, compaction
 Fix For: 3.0 alpha 1

 Attachments: 7066.txt


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily clean up completed files, or conversely not clean up files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-9801) Use vints where it makes sense

2015-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-9801:
---
  Assignee: Ariel Weisberg  (was: Sylvain Lebresne)

We need to make progress towards fewer broken tests, not break more, even when 
we're convinced it's not the new code's fault.

I've reverted and am assigning to Ariel to finish up, if necessary, since he's 
already working on CASSANDRA-9865.

 Use vints where it makes sense
 --

 Key: CASSANDRA-9801
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9801
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Ariel Weisberg
 Fix For: 3.0 alpha 1


 CASSANDRA-9705 has switched to vints for a number of things, but there are 
 some I've missed.
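 For context, a minimal sketch of the general idea behind variable-length integer 
 encoding (illustrative only, not the exact encoding CASSANDRA-9705 introduced): 
 small values cost one byte, larger values cost more.
 {code}
 import java.io.ByteArrayOutputStream;

 // Illustrative varint sketch: 7 bits of payload per byte, high bit set on all
 // but the last byte. Not the actual C* encoding, just the general principle.
 class VarIntSketch
 {
     static byte[] encode(long value)
     {
         ByteArrayOutputStream out = new ByteArrayOutputStream();
         while ((value & ~0x7FL) != 0)
         {
             out.write((int) ((value & 0x7F) | 0x80));
             value >>>= 7;
         }
         out.write((int) value);
         return out.toByteArray();
     }

     // assumes the array contains exactly one encoded value
     static long decode(byte[] bytes)
     {
         long value = 0;
         int shift = 0;
         for (byte b : bytes)
         {
             value |= (long) (b & 0x7F) << shift;
             shift += 7;
         }
         return value;
     }
 }
 {code}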



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9801) Use vints where it makes sense

2015-07-28 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9801:
--
Fix Version/s: (was: 3.0 alpha 1)
   3.0 beta 1

 Use vints where it makes sense
 --

 Key: CASSANDRA-9801
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9801
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Ariel Weisberg
 Fix For: 3.0 beta 1


 CASSANDRA-9705 has switched to vints for a number of things, but there are 
 some I've missed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644619#comment-14644619
 ] 

Jonathan Ellis commented on CASSANDRA-7066:
---

bq. If you lose one of these files you are SOL

Is that SOL as in "now we can't tell which is new so we have to keep both and 
do redundant compaction", or as in "now the node can't start up"?

 Simplify (and unify) cleanup of compaction leftovers
 

 Key: CASSANDRA-7066
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Stefania
Priority: Minor
  Labels: benedict-to-commit, compaction
 Fix For: 3.0 alpha 1

 Attachments: 7066.txt


 Currently we manage a list of in-progress compactions in a system table, 
 which we use to cleanup incomplete compactions when we're done. The problem 
 with this is that 1) it's a bit clunky (and leaves us in positions where we 
 can unnecessarily cleanup completed files, or conversely not cleanup files 
 that have been superseded); and 2) it's only used for a regular compaction - 
 no other compaction types are guarded in the same way, so can result in 
 duplication if we fail before deleting the replacements.
 I'd like to see each sstable store in its metadata its direct ancestors, and 
 on startup we simply delete any sstables that occur in the union of all 
 ancestor sets. This way as soon as we finish writing we're capable of 
 cleaning up any leftovers, so we never get duplication. It's also much easier 
 to reason about.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9664) Allow MV's select statements to be more complex

2015-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14644760#comment-14644760
 ] 

Jonathan Ellis commented on CASSANDRA-9664:
---

Materializing aggregates is out of scope here.  See CASSANDRA-9778 for that.

 Allow MV's select statements to be more complex
 ---

 Key: CASSANDRA-9664
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9664
 Project: Cassandra
  Issue Type: New Feature
Reporter: Carl Yeksigian

 [Materialized Views|https://issues.apache.org/jira/browse/CASSANDRA-6477] add 
 support for a syntax which includes a {{SELECT}} statement, but only allows 
 selection of direct columns, and does not allow any filtering to take place.
 We should add support to the MV {{SELECT}} statement to bring better parity 
 with the normal CQL {{SELECT}} statement, specifically simple functions in 
 the selected columns, as well as specifying a {{WHERE}} clause.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9265) Add checksum to saved cache files

2015-07-27 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14642737#comment-14642737
 ] 

Jonathan Ellis commented on CASSANDRA-9265:
---

Throwing out the cache isn't a showstopper since we can always rebuild it, but 
it will hurt performance until it's done.  Upgrading an entire cluster will be 
slower since you need to wait longer between machines.  So if possible it's 
nice to preserve compatibility.

 Add checksum to saved cache files
 -

 Key: CASSANDRA-9265
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9265
 Project: Cassandra
  Issue Type: Improvement
Reporter: Ariel Weisberg
 Fix For: 3.x


 Saved caches are not covered by a checksum. We should at least emit a 
 checksum. My suggestion is a large checksum of the whole file (convenient 
 offline validation), and then smaller per record checksums after each record 
 is written (possibly a subset of the incrementally maintained larger 
 checksum).
 I wouldn't go for anything fancy to try to recover from corruption since it 
 is just a saved cache. If corruption is detected while reading I would just 
 have it bail out. I would rather have less code to review and test in this 
 instance.
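 For illustration, a minimal sketch of the suggested layout - a CRC32 after each 
 record plus a whole-file CRC32 at the end; the framing and names here are 
 assumptions, not a proposed format:
 {code}
 import java.io.*;
 import java.util.zip.CRC32;

 // Illustrative sketch of checksummed cache writing: a CRC32 after every record
 // plus a whole-file CRC32 appended at the end for convenient offline validation.
 class ChecksummedCacheWriterSketch
 {
     static void write(DataOutputStream out, Iterable<byte[]> records) throws IOException
     {
         CRC32 fileCrc = new CRC32();
         for (byte[] record : records)
         {
             out.writeInt(record.length);
             out.write(record);
             fileCrc.update(record);

             CRC32 recordCrc = new CRC32();
             recordCrc.update(record);
             out.writeInt((int) recordCrc.getValue()); // per-record checksum
         }
         out.writeLong(fileCrc.getValue()); // whole-file checksum
     }
 }
 {code}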



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9889) Disable scripted UDFs by default

2015-07-25 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641771#comment-14641771
 ] 

Jonathan Ellis commented on CASSANDRA-9889:
---

bq. Requiring that permission for script-UDFs would effectively always disable 
the sandbox for them.

Well, it's acknowledging reality, which is that if you allow users to create 
scripted UDFs then you need to trust them not to do something dumb.

 Disable scripted UDFs by default
 

 Key: CASSANDRA-9889
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9889
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Minor
 Fix For: 3.0.0 rc1


 (Follow-up to CASSANDRA-9402)
 TL;DR: this ticket is about adding another config option to enable scripted 
 UDFs.
 Securing Java-UDFs is much easier than scripted UDFs.
 The secure execution of scripted UDFs heavily relies on how secure a 
 particular script provider implementation is. Nashorn is probably pretty good 
 at this - but (as discussed offline with [~iamaleksey]) we are not certain. 
 This becomes worse with other JSR-223 providers (which need to be installed 
 by the user anyway).
 E.g.:
 {noformat}
 # Enables use of scripted UDFs.
 # Java UDFs are always enabled, if enable_user_defined_functions is true.
 # Enable this option to be able to use UDFs with language javascript or any 
 custom JSR-223 provider.
 enable_scripted_user_defined_functions: false
 {noformat}
 TBH, I would feel more comfortable having this one. But we should review 
 this along with enable_user_defined_functions for 4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9892) Add support for unsandboxed UDF

2015-07-24 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-9892:
-

 Summary: Add support for unsandboxed UDF
 Key: CASSANDRA-9892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Robert Stupp
Priority: Minor


From discussion on CASSANDRA-9402,

The approach postgresql takes is to distinguish between trusted (sandboxed) 
and untrusted (anything goes) UDF languages. 

Creating an untrusted language always requires superuser mode. Once that is 
done, creating functions in it requires nothing special.

Personally I would be fine with this approach, but I think it would be more 
useful to have the extra permission on creating the function, and also wouldn't 
require adding explicit CREATE LANGUAGE.

So I'd suggest just providing different CQL permissions for trusted and 
untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
UNTRUSTED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9481) FENCED UDFs

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9481.
---
Resolution: Won't Fix

Largely unnecessary with the successful resolution of 9402.  

 FENCED UDFs
 ---

 Key: CASSANDRA-9481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9481
 Project: Cassandra
  Issue Type: New Feature
Reporter:  Brian Hess

 Related to security/sandboxing of UDFs (CASSANDRA-9042)
 Essentially, the UDF will run in a separate process when it is registered as 
 FENCED, and run in-process when it is registered as UNFENCED.
 This doesn't necessarily remove all the issues, but it does help mitigate 
 some of them - especially since it would (optionally) run as another user.
 This could look like the following with Cassandra:
 - FENCED is a GRANTable privilege
 - In cassandra.yaml you can specify the user to use when launching the 
 separate process (so that it is not the same user that is running the 
 database - or optionally is)
   - This is good so that the UDF can't stop the database, delete database 
 files, etc.
 - For FENCED UDFs, IPC would be used to transfer rows to the UDF and to 
 return results. We could use CQL rows for the data. This could be shared 
 memory or sockets (Unix or TCP - slight preference for sockets for some 
 follow-on ideas).
 - Ideally, switching from FENCED to UNFENCED would be just a DDL change. That 
 is, the API would work such that a simple ALTER FUNCTION myFunction(DOUBLE, 
 DOUBLE) UNFENCED would change it.
 - If you wanted, because this is a separate process you could use a separate 
 class loader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9889) Disable scripted UDFs by default

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640421#comment-14640421
 ] 

Jonathan Ellis commented on CASSANDRA-9889:
---

What if we made scripted UDFs always untrusted?  CASSANDRA-9892

 Disable scripted UDFs by default
 

 Key: CASSANDRA-9889
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9889
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Minor
 Fix For: 3.0.0 rc1


 (Follow-up to CASSANDRA-9402)
 TL;DR: this ticket is about adding another config option to enable scripted 
 UDFs.
 Securing Java-UDFs is much easier than scripted UDFs.
 The secure execution of scripted UDFs heavily relies on how secure a 
 particular script provider implementation is. Nashorn is probably pretty good 
 at this - but (as discussed offline with [~iamaleksey]) we are not certain. 
 This becomes worse with other JSR-223 providers (which need to be installed 
 by the user anyway).
 E.g.:
 {noformat}
 # Enables use of scripted UDFs.
 # Java UDFs are always enabled, if enable_user_defined_functions is true.
 # Enable this option to be able to use UDFs with language javascript or any 
 custom JSR-223 provider.
 enable_scripted_user_defined_functions: false
 {noformat}
 TBH, I would feel more comfortable having this one. But we should review 
 this along with enable_user_defined_functions for 4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9889) Disable scripted UDFs by default

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640519#comment-14640519
 ] 

Jonathan Ellis commented on CASSANDRA-9889:
---

I think it's the same issue.  If Groovy is always untrusted, then you require 
the user to have CREATE UNTRUSTED permission and the problem is solved.

 Disable scripted UDFs by default
 

 Key: CASSANDRA-9889
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9889
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
Priority: Minor
 Fix For: 3.0.0 rc1


 (Follow-up to CASSANDRA-9402)
 TL;DR: this ticket is about adding another config option to enable scripted 
 UDFs.
 Securing Java-UDFs is much easier than scripted UDFs.
 The secure execution of scripted UDFs heavily relies on how secure a 
 particular script provider implementation is. Nashorn is probably pretty good 
 at this - but (as discussed offline with [~iamaleksey]) we are not certain. 
 This becomes worse with other JSR-223 providers (which need to be installed 
 by the user anyway).
 E.g.:
 {noformat}
 # Enables use of scripted UDFs.
 # Java UDFs are always enabled, if enable_user_defined_functions is true.
 # Enable this option to be able to use UDFs with language javascript or any 
 custom JSR-223 provider.
 enable_scripted_user_defined_functions: false
 {noformat}
 TBH, I would feel more comfortable having this one. But we should review 
 this along with enable_user_defined_functions for 4.0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9868) Archive commitlogs tests failing

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9868:
--
Assignee: Stefania

 Archive commitlogs tests failing
 

 Key: CASSANDRA-9868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9868
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Assignee: Stefania
Priority: Blocker
 Fix For: 3.0 alpha 1

 Attachments: commitlog_archiving.properties


 A number of archive commitlog dtests (snapshot_tests.py) are failing on trunk 
 at the point in the tests where the node is asked to restore data from 
 archived commitlogs. It appears that the snapshot functionality works, but 
 the 
 [assertion|https://github.com/riptano/cassandra-dtest/blob/master/snapshot_test.py#L312]
  regarding data that should have been restored from archived commitlogs 
 fails. I also tested this manually on trunk and found no success in restoring 
 either, so it appears to not just be a test issue. I should note that archiving 
 the commitlogs seems to work (they are actually copied); rather, restoring them 
 is the issue. Attached is the commitlog properties file (to show the commands 
 used).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9865) Broken vint encoding, at least when interacting with OHCProvider

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641022#comment-14641022
 ] 

Jonathan Ellis commented on CASSANDRA-9865:
---

Is this still a problem now that 9863 is resolved?

 Broken vint encoding, at least when interacting with OHCProvider
 

 Key: CASSANDRA-9865
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9865
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
 Fix For: 3.0 alpha 1

 Attachments: 9865-hacky-test.txt


 I haven't investigated this very closely so I only have a slightly hacky way 
 to show the problem, but if you apply the patch attached, you'll see that the 
 vints serialized and the ones deserialized are not the same. If you 
 remove the use of vints (as is currently on trunk, but only due to this issue 
 because we do want to use vints), everything works correctly.
 I'm honestly not sure where the problem is, but it sounds like it could be 
 either in {{NIODataInputStream}} or in the {{OHCProvider}} since it's used on 
 that test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9764) dtest for many UPDATE batches, low contention fails on trunk

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641023#comment-14641023
 ] 

Jonathan Ellis commented on CASSANDRA-9764:
---

[~enigmacurry] is this still a problem with 9863 committed?

 dtest for many UPDATE batches, low contention fails on trunk
 

 Key: CASSANDRA-9764
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9764
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Jim Witschey
Assignee: Sylvain Lebresne
Priority: Blocker
 Fix For: 3.0 alpha 1


 {{paxos_tests.py:TestPaxos.contention_test_multi_iterations}} fails 
 consistently on trunk ([cassci 
 history|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/paxos_tests/TestPaxos/contention_test_multi_iterations/history/]).
  The test works by creating 8 workers, each of which increments an integer 
 100 times using UPDATE. Based on the test failures, it looks like 2 or 3 of 
 the updates are consistently dropped. Other dtests that don't run as many 
 iterations but have more contention succeed.
 I'm assigning you, [~slebresne], because you wrote the test and were the last 
 person to modify it, but feel free to reassign. Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9764) dtest for many UPDATE batches, low contention fails on trunk

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641023#comment-14641023
 ] 

Jonathan Ellis edited comment on CASSANDRA-9764 at 7/24/15 8:21 PM:


[~mambocab] is this still a problem with 9863 committed?


was (Author: jbellis):
[~enigmacurry] is this still a problem with 9863 committed?

 dtest for many UPDATE batches, low contention fails on trunk
 

 Key: CASSANDRA-9764
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9764
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Jim Witschey
Assignee: Sylvain Lebresne
Priority: Blocker
 Fix For: 3.0 alpha 1


 {{paxos_tests.py:TestPaxos.contention_test_multi_iterations}} fails 
 consistently on trunk ([cassci 
 history|http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastSuccessfulBuild/testReport/paxos_tests/TestPaxos/contention_test_multi_iterations/history/]).
  The test works by creating 8 workers, each of which increments an integer 
 100 times using UPDATE. Based on the test failures, it looks like 2 or 3 of 
 the updates are consistently dropped. Other dtests that don't run as many 
 iterations but have more contention succeed.
 I'm assigning you, [~slebresne], because you wrote the test and were the last 
 person to modify it, but feel free to reassign. Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9868) Archive commitlogs tests failing

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9868:
--
Assignee: Ariel Weisberg  (was: Stefania)

 Archive commitlogs tests failing
 

 Key: CASSANDRA-9868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9868
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Shawn Kumar
Assignee: Ariel Weisberg
Priority: Blocker
 Fix For: 3.0 alpha 1

 Attachments: commitlog_archiving.properties


 A number of archive commitlog dtests (snapshot_tests.py) are failing on trunk 
 at the point in the tests where the node is asked to restore data from 
 archived commitlogs. It appears that the snapshot functionality works, but 
 the 
 [assertion|https://github.com/riptano/cassandra-dtest/blob/master/snapshot_test.py#L312]
  regarding data that should have been restored from archived commitlogs 
 fails. I also tested this manually on trunk and found no success in restoring 
 either, so it appears to not just be a test issue. I should note that archiving 
 the commitlogs seems to work (they are actually copied); rather, restoring them 
 is the issue. Attached is the commitlog properties file (to show the commands 
 used).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9865) Broken vint encoding, at least when interacting with OHCProvider

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9865:
--
Assignee: Ariel Weisberg

 Broken vint encoding, at least when interacting with OHCProvider
 

 Key: CASSANDRA-9865
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9865
 Project: Cassandra
  Issue Type: Bug
Reporter: Sylvain Lebresne
Assignee: Ariel Weisberg
 Fix For: 3.0 alpha 1

 Attachments: 9865-hacky-test.txt


 I haven't investigated this very closely so I only have a slightly hacky way 
 to show the problem, but if you apply the patch attached, you'll see that the 
 vints serialized and the ones deserialized are not the same. If you 
 remove the use of vints (as is currently on trunk, but only due to this issue 
 because we do want to use vints), everything works correctly.
 I'm honestly not sure where the problem is, but it sounds like it could be 
 either in {{NIODataInputStream}} or in the {{OHCProvider}} since it's used on 
 that test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9741) cfhistograms dtest flaps on trunk and 2.2

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9741:
--
Assignee: Ariel Weisberg  (was: Tyler Hobbs)

 cfhistograms dtest flaps on trunk and 2.2
 -

 Key: CASSANDRA-9741
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9741
 Project: Cassandra
  Issue Type: Bug
Reporter: Jim Witschey
Assignee: Ariel Weisberg
 Fix For: 2.2.x, 3.0.x


 {{jmx_test.py:TestJMX.cfhistograms_test}} flaps on CassCI under trunk and 2.2.
 On 2.2, it fails one of its assertions when {{'Unable to compute when 
 histogram overflowed'}} is found in the output of {{nodetool cfhistograms}}. 
 Here's the failure history for 2.2:
 http://cassci.datastax.com/view/cassandra-2.2/job/cassandra-2.2_dtest/lastCompletedBuild/testReport/junit/jmx_test/TestJMX/cfhistograms_test/history/
 On trunk, it fails when an error about a {{WriteFailureException}} during 
 hinted handoff is found in the C* logs after the tests run ([example cassci 
 output|http://cassci.datastax.com/view/trunk/job/trunk_dtest/315/testReport/junit/jmx_test/TestJMX/cfhistograms_test/]).
  Here's the failure history for trunk:
 http://cassci.datastax.com/view/trunk/job/trunk_dtest/lastCompletedBuild/testReport/junit/jmx_test/TestJMX/cfhistograms_test/history/
 I haven't seen it fail locally yet, but haven't run the test more than a 
 couple times because it takes a while.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7392) Abort in-progress queries that time out

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-7392:
--
Reviewer: Ariel Weisberg  (was: Benjamin Lerer)

(Benjamin is out next week, so re-handing review to Ariel.)

 Abort in-progress queries that time out
 ---

 Key: CASSANDRA-7392
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Stefania
 Fix For: 3.x


 Currently we drop queries that time out before we get to them (because the node 
 is overloaded) but not queries that time out while being processed.  
 (Particularly common for index queries on data that shouldn't be indexed.)  
 Adding the latter and logging when we have to interrupt one gets us a poor 
 man's slow query log for free.
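 A rough sketch of the idea (illustrative only, not the eventual implementation): 
 check elapsed time while iterating and abort, with a log line, once the timeout 
 is exceeded.
 {code}
 import java.util.Iterator;
 import java.util.concurrent.TimeoutException;

 // Illustrative sketch: wrap a row iterator and abort (with a log line) once the
 // query has been running longer than its timeout.
 class TimeBoundedIteratorSketch<T> implements Iterator<T>
 {
     private final Iterator<T> delegate;
     private final long deadlineNanos;

     TimeBoundedIteratorSketch(Iterator<T> delegate, long timeoutMillis)
     {
         this.delegate = delegate;
         this.deadlineNanos = System.nanoTime() + timeoutMillis * 1_000_000L;
     }

     public boolean hasNext()
     {
         if (System.nanoTime() > deadlineNanos)
         {
             // poor man's slow query log: record what we had to interrupt
             System.err.println("Aborting query that exceeded its timeout");
             throw new RuntimeException(new TimeoutException("query timed out"));
         }
         return delegate.hasNext();
     }

     public T next()
     {
         return delegate.next();
     }
 }
 {code}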



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8906) Experiment with optimizing partition merging when we can prove that some sources don't overlap

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-8906:
--
Fix Version/s: 3.x

 Experiment with optimizing partition merging when we can prove that some 
 sources don't overlap
 --

 Key: CASSANDRA-8906
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8906
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Ariel Weisberg
  Labels: compaction, performance
 Fix For: 3.x


 When we merge a partition from two sources and it turns out that those 2 
 sources don't overlap for that partition, we still end up doing one 
 comparison by row in the first source. However, if we can prove that the 2 
 sources don't overlap, for example by using the sstable min/max clustering 
 values that we store, we could speed this up. Note that in practice it's a 
 little bit more hairy because we need to deal with N sources, but that's 
 probably not too hard either.
 I'll note that using the sstable min/max clustering values is not terribly 
 precise. We could do better if we were to push the same reasoning inside the 
 merge iterator, by for instance using the sstable per-partition index, which 
 can in theory tell us things like "don't bother comparing rows until the end 
 of this row block". This is quite a bit more involved though, so maybe not 
 worth the complexity.
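 As a sketch of the min/max based idea (illustrative only; the inputs are assumed 
 to be each source's min and max clustering for the partition, already ordered by 
 min): if no source's max reaches the next source's min, we can concatenate 
 instead of merging row by row.
 {code}
 import java.util.*;

 // Illustrative sketch: if source i-1's max clustering is strictly below source
 // i's min clustering (per the comparator) for every i, the sources don't
 // overlap and can be concatenated rather than merged row by row.
 class NonOverlappingMergeSketch
 {
     static <C> boolean canConcatenate(List<C> sortedMins, List<C> sortedMaxs, Comparator<C> cmp)
     {
         // sortedMins/sortedMaxs hold each source's min and max clustering,
         // with sources already ordered by their min clustering.
         for (int i = 1; i < sortedMins.size(); i++)
             if (cmp.compare(sortedMaxs.get(i - 1), sortedMins.get(i)) >= 0)
                 return false; // sources i-1 and i overlap: fall back to a real merge
         return true;
     }
 }
 {code}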



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8906) Experiment with optimizing partition merging when we can prove that some sources don't overlap

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-8906:
--
Assignee: Ariel Weisberg

 Experiment with optimizing partition merging when we can prove that some 
 sources don't overlap
 --

 Key: CASSANDRA-8906
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8906
 Project: Cassandra
  Issue Type: Improvement
Reporter: Sylvain Lebresne
Assignee: Ariel Weisberg
  Labels: compaction, performance
 Fix For: 3.x


 When we merge a partition from two sources and it turns out that those 2 
 sources don't overlap for that partition, we still end up doing one 
 comparison by row in the first source. However, if we can prove that the 2 
 sources don't overlap, for example by using the sstable min/max clustering 
 values that we store, we could speed this up. Note that in practice it's a 
 little bit more hairy because we need to deal with N sources, but that's 
 probably not too hard either.
 I'll note that using the sstable min/max clustering values is not terribly 
 precise. We could do better if we were to push the same reasoning inside the 
 merge iterator, by for instance using the sstable per-partition index, which 
 can in theory tell us things like "don't bother comparing rows until the end 
 of this row block". This is quite a bit more involved though, so maybe not 
 worth the complexity.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9258) Range movement causes CPU & performance impact

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14641218#comment-14641218
 ] 

Jonathan Ellis commented on CASSANDRA-9258:
---

No.

 Range movement causes CPU & performance impact
 --

 Key: CASSANDRA-9258
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9258
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.1.4
Reporter: Rick Branson
 Fix For: 2.1.x


 Observing big CPU & latency regressions when doing range movements on 
 clusters with many tens of thousands of vnodes. See CPU usage increase by 
 ~80% when a single node is being replaced.
 Top methods are:
 1) Ljava/math/BigInteger;.compareTo in 
 Lorg/apache/cassandra/dht/ComparableObjectToken;.compareTo 
 2) Lcom/google/common/collect/AbstractMapBasedMultimap;.wrapCollection in 
 Lcom/google/common/collect/AbstractMapBasedMultimap$AsMap$AsMapIterator;.next
 3) Lorg/apache/cassandra/db/DecoratedKey;.compareTo in 
 Lorg/apache/cassandra/dht/Range;.contains
 Here's a sample stack from a thread dump:
 {code}
 Thrift:50673 daemon prio=10 tid=0x7f2f20164800 nid=0x3a04af runnable 
 [0x7f2d878d]
java.lang.Thread.State: RUNNABLE
   at org.apache.cassandra.dht.Range.isWrapAround(Range.java:260)
   at org.apache.cassandra.dht.Range.contains(Range.java:51)
   at org.apache.cassandra.dht.Range.contains(Range.java:110)
   at 
 org.apache.cassandra.locator.TokenMetadata.pendingEndpointsFor(TokenMetadata.java:916)
   at 
 org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:775)
   at 
 org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:541)
   at 
 org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:616)
   at 
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1101)
   at 
 org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1083)
   at 
 org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:976)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3996)
   at 
 org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3980)
   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
   at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9259:
--
Assignee: Ariel Weisberg

 Bulk Reading from Cassandra
 ---

 Key: CASSANDRA-9259
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter:  Brian Hess
Assignee: Ariel Weisberg

 This ticket is following on from the 2015 NGCC.  This ticket is designed to 
 be a place for discussing and designing an approach to bulk reading.
 The goal is to have a bulk reading path for Cassandra.  That is, a path 
 optimized to grab a large portion of the data for a table (potentially all of 
 it).  This is a core element in the Spark integration with Cassandra, and the 
 speed at which Cassandra can deliver bulk data to Spark is limiting the 
 performance of Spark-plus-Cassandra operations.  This is especially of 
 importance as Cassandra will (likely) leverage Spark for internal operations 
 (for example CASSANDRA-8234).
 The core CQL to consider is the following:
 SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND 
 Token(partitionKey) <= Y
 Here, we choose X and Y to be contained within one token range (perhaps 
 considering the primary range of a node without vnodes, for example).  This 
 query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk 
 operations via Spark (or other processing frameworks - ETL, etc).  There are 
 a few causes (e.g., inefficient paging).
 There are a few approaches that could be considered.  First, we consider a 
 new Streaming Compaction approach.  The key observation here is that a bulk 
 read from Cassandra is a lot like a major compaction, though instead of 
 outputting a new SSTable we would output CQL rows to a stream/socket/etc.  
 This would be similar to a CompactionTask, but would strip out some 
 unnecessary things in there (e.g., some of the indexing, etc). Predicates and 
 projections could also be encapsulated in this new StreamingCompactionTask, 
 for example.
 Another approach would be an alternate storage format.  For example, we might 
 employ Parquet (just as an example) to store the same data as in the primary 
 Cassandra storage (aka SSTables).  This is akin to Global Indexes (an 
 alternate storage of the same data optimized for a particular query).  Then, 
 Cassandra can choose to leverage this alternate storage for particular CQL 
 queries (e.g., range scans).
 These are just 2 suggestions to get the conversation going.
 One thing to note is that it will be useful to have this storage segregated 
 by token range so that when you extract via these mechanisms you do not get 
 replications-factor numbers of copies of the data.  That will certainly be an 
 issue for some Spark operations (e.g., counting).  Thus, we will want 
 per-token-range storage (even for single disks), so this will likely leverage 
 CASSANDRA-6696 (though, we'll want to also consider the single disk case).
 It is also worth discussing what the success criteria is here.  It is 
 unlikely to be as fast as EDW or HDFS performance (though, that is still a 
 good goal), but being within some percentage of that performance should be 
 set as success.  For example, 2x as long as doing bulk operations on HDFS 
 with similar node count/size/etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9830:
--
Reviewer: Joshua McKenzie

 Option to disable bloom filter in highest level of LCS sstables
 ---

 Key: CASSANDRA-9830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Paulo Motta
Priority: Minor
  Labels: performance
 Fix For: 3.x


 We expect about 90% of data to be in the highest level of LCS in a fully 
 populated series.  (See also CASSANDRA-9829.)
 Thus if the user is primarily asking for data (partitions) that has actually 
 been inserted, the bloom filter on the highest level only helps reject 
 sstables about 10% of the time.
 We should add an option that suppresses bloom filter creation on top-level 
 sstables.  This will dramatically reduce memory usage for LCS and may even 
 improve performance as we no longer check a low-value filter.
 (This is also an idea from RocksDB.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9890) Bytecode inspection for Java-UDFs

2015-07-24 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9890:
--
Reviewer: T Jake Luciani

 Bytecode inspection for Java-UDFs
 -

 Key: CASSANDRA-9890
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9890
 Project: Cassandra
  Issue Type: Improvement
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0.0 rc1


 (Follow-up to CASSANDRA-9402)
 For Java-UDFs we could inspect the compiled Java byte code to find usages of 
 the Java language that are forbidden to UDFs.
 These include usages of:
 * {{synchronized}} keyword
 * call to {{j.l.Object.wait}}
 * call to {{j.l.Object.notify}}
 * call to {{j.l.Object.notifyAll}}
 * call to {{j.l.Object.getClass}}
 * calls to specific methods of currently allowed classes in the driver (but 
 would need some investigation)
 By inspecting the byte code _before_ the class is actually used, even dirty 
 constructs like the following would be impossible:
 {noformat}
 CREATE OR REPLACE FUNCTION ... AS $$  return Math.sin(val);
 }
 {
   // anonymous initializer code
 }
 static {
   // static initializer code
 $$;
 {noformat}
 (inspired by [this blog 
 post|http://jordan-wright.com/blog/2015/03/08/elasticsearch-rce-vulnerability-cve-2015-1427/])
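 As an illustration of this kind of check, a sketch using the ASM library (not 
 the actual implementation) that flags synchronized blocks and calls to 
 {{Object.wait/notify/notifyAll/getClass}}:
 {code}
 import org.objectweb.asm.*;

 // Illustrative sketch with ASM: scan compiled UDF bytecode and flag forbidden
 // constructs (monitorenter from synchronized blocks, calls to Object.wait/
 // notify/notifyAll/getClass). Not the actual C* implementation.
 class UDFByteCodeInspectorSketch
 {
     static boolean isForbidden(byte[] classBytes)
     {
         final boolean[] forbidden = { false };
         new ClassReader(classBytes).accept(new ClassVisitor(Opcodes.ASM5)
         {
             @Override
             public MethodVisitor visitMethod(int access, String name, String desc,
                                              String signature, String[] exceptions)
             {
                 return new MethodVisitor(Opcodes.ASM5)
                 {
                     @Override
                     public void visitInsn(int opcode)
                     {
                         if (opcode == Opcodes.MONITORENTER) // synchronized block
                             forbidden[0] = true;
                     }

                     @Override
                     public void visitMethodInsn(int opcode, String owner, String name,
                                                 String desc, boolean itf)
                     {
                         if ("java/lang/Object".equals(owner)
                             && (name.equals("wait") || name.equals("notify")
                                 || name.equals("notifyAll") || name.equals("getClass")))
                             forbidden[0] = true;
                     }
                 };
             }
         }, 0);
         return forbidden[0];
     }
 }
 {code}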



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640988#comment-14640988
 ] 

Jonathan Ellis commented on CASSANDRA-9830:
---

Good point, but IMO we should still make disabled the default [in 3.x] and let 
users enable it if necessary.

 Option to disable bloom filter in highest level of LCS sstables
 ---

 Key: CASSANDRA-9830
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9830
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Paulo Motta
Priority: Minor
  Labels: performance
 Fix For: 3.x


 We expect about 90% of data to be in the highest level of LCS in a fully 
 populated series.  (See also CASSANDRA-9829.)
 Thus if the user is primarily asking for data (partitions) that has actually 
 been inserted, the bloom filter on the highest level only helps reject 
 sstables about 10% of the time.
 We should add an option that suppresses bloom filter creation on top-level 
 sstables.  This will dramatically reduce memory usage for LCS and may even 
 improve performance as we no longer check a low-value filter.
 (This is also an idea from RocksDB.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9892) Add support for unsandboxed UDF

2015-07-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14640442#comment-14640442
 ] 

Jonathan Ellis commented on CASSANDRA-9892:
---

I get what you mean, but from a user's perspective it would mean we trust the 
server to guarantee that the function can't do bad things.

We could use a different term if that's confusing though.

 Add support for unsandboxed UDF
 ---

 Key: CASSANDRA-9892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
 Project: Cassandra
  Issue Type: New Feature
Reporter: Jonathan Ellis
Assignee: Robert Stupp
Priority: Minor

 From discussion on CASSANDRA-9402,
 The approach postgresql takes is to distinguish between trusted (sandboxed) 
 and untrusted (anything goes) UDF languages. 
 Creating an untrusted language always requires superuser mode. Once that is 
 done, creating functions in it requires nothing special.
 Personally I would be fine with this approach, but I think it would be more 
 useful to have the extra permission on creating the function, and also 
 wouldn't require adding explicit CREATE LANGUAGE.
 So I'd suggest just providing different CQL permissions for trusted and 
 untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
 create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
 UNTRUSTED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9498) If more than 65K columns, sparse layout will break

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9498.
---
   Resolution: Duplicate
 Assignee: (was: Benedict)
Fix Version/s: (was: 3.0 beta 1)

 If more than 65K columns, sparse layout will break
 --

 Key: CASSANDRA-9498
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9498
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Priority: Minor

 Follow up to CASSANDRA-8099. It is a relatively small bug, since the exposed 
 population of users is likely to be very low, but fixing it in a good way is 
 a bit tricky. I'm filing a separate JIRA, because I would like us to address 
 this by introducing a writeVInt method to DataOutputStreamPlus, that we can 
 also exploit to improve the encoding of timestamps and deletion times, and 
 this JIRA will help to track the dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9881) Rows with negative-sized keys can't be skipped by sstablescrub

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9881:
--
Assignee: Stefania

 Rows with negative-sized keys can't be skipped by sstablescrub
 --

 Key: CASSANDRA-9881
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9881
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Stefania
Priority: Minor
 Fix For: 2.1.x


 It is possible to have corruption in such a way that scrub (on or offline) 
 can't skip the row, so you end up in a loop where this just keeps repeating:
 {noformat}
 WARNING: Row starting at position 2087453 is unreadable; skipping to next 
 Reading row at 2087453 
 row (unreadable key) is -1 bytes
 {noformat}
 The workaround is to just delete the problem sstable since you were going to 
 have to repair anyway, but it would still be nice to salvage the rest of the 
 sstable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9498) If more than 65K columns, sparse layout will break

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9498:
--
Assignee: Benedict

Let's limit this to not imposing any new backwards compatibility challenges for 
b1.  We can do more in 3.x.

 If more than 65K columns, sparse layout will break
 --

 Key: CASSANDRA-9498
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9498
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Minor
 Fix For: 3.0 beta 1


 Follow up to CASSANDRA-8099. It is a relatively small bug, since the exposed 
 population of users is likely to be very low, but fixing it in a good way is 
 a bit tricky. I'm filing a separate JIRA, because I would like us to address 
 this by introducing a writeVInt method to DataOutputStreamPlus, that we can 
 also exploit to improve the encoding of timestamps and deletion times, and 
 this JIRA will help to track the dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2015-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14638908#comment-14638908
 ] 

Jonathan Ellis commented on CASSANDRA-7937:
---

I think we can do as well with a simpler approach by using MessagingService 
queues as a proxy for target's load.  (If the target is overwhelmed it will 
read slower from the socket and our queue will not drain; if it is not 
more-than-usually-overwhelmed but clients are sending us so many requests for 
that target that we still can't drain it fast enough, then we should also pause 
accepting extra requests.)

See CASSANDRA-9318 and in particular my summary 
[here|https://issues.apache.org/jira/browse/CASSANDRA-9318?focusedCommentId=14604649page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14604649].

(NB feel free to reassign to Jacek if he has free cycles.)

 Apply backpressure gently when overloaded with writes
 -

 Key: CASSANDRA-7937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Cassandra 2.0
Reporter: Piotr Kołaczkowski
Assignee: Jacek Lewandowski
  Labels: performance

 When writing huge amounts of data into C* cluster from analytic tools like 
 Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
 This is because analytic tools typically write data as fast as they can in 
 parallel, from many nodes and they are not artificially rate-limited, so C* 
 is the bottleneck here. Also, increasing the number of nodes doesn't really 
 help, because in a collocated setup this also increases number of 
 Hadoop/Spark nodes (writers) and although possible write performance is 
 higher, the problem still remains.
 We observe the following behavior:
 1. data is ingested at an extreme fast pace into memtables and flush queue 
 fills up
 2. the available memory limit for memtables is reached and writes are no 
 longer accepted
 3. the application gets hit by write timeout, and retries repeatedly, in 
 vain 
 4. after several failed attempts to write, the job gets aborted 
 Desired behaviour:
 1. data is ingested at an extreme fast pace into memtables and flush queue 
 fills up
 2. after exceeding some memtable fill threshold, C* applies adaptive rate 
 limiting to writes - the more the buffers are filled-up, the less writes/s 
 are accepted, however writes still occur within the write timeout.
 3. thanks to slowed down data ingestion, now flush can finish before all the 
 memory gets used
 Of course the details how rate limiting could be done are up for a discussion.
 It may be also worth considering putting such logic into the driver, not C* 
 core, but then C* needs to expose at least the following information to the 
 driver, so we could calculate the desired maximum data rate:
 1. current amount of memory available for writes before they would completely 
 block
 2. total amount of data queued to be flushed and flush progress (amount of 
 data to flush remaining for the memtable currently being flushed)
 3. average flush write speed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9402) Implement proper sandboxing for UDFs

2015-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14639147#comment-14639147
 ] 

Jonathan Ellis commented on CASSANDRA-9402:
---

nio is whitelisted, but my understanding is that's only checked *if* the 
SecurityManager approves.  All i/o (file, socket) is prohibited there.

 Implement proper sandboxing for UDFs
 

 Key: CASSANDRA-9402
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9402
 Project: Cassandra
  Issue Type: Task
Reporter: T Jake Luciani
Assignee: Robert Stupp
Priority: Critical
  Labels: docs-impacting, security
 Fix For: 3.0 beta 1

 Attachments: 9402-warning.txt


 We want to avoid a security exploit for our users.  We need to make sure we 
 ship 2.2 UDFs with good defaults so someone exposing it to the internet 
 accidentally doesn't open themselves up to having arbitrary code run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-7937) Apply backpressure gently when overloaded with writes

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-7937.
---
Resolution: Later
  Assignee: (was: Jacek Lewandowski)

Marking Later, we can reopen if 9318 proves insufficient.

 Apply backpressure gently when overloaded with writes
 -

 Key: CASSANDRA-7937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7937
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Cassandra 2.0
Reporter: Piotr Kołaczkowski
  Labels: performance

 When writing huge amounts of data into C* cluster from analytic tools like 
 Hadoop or Apache Spark, we can see that often C* can't keep up with the load. 
 This is because analytic tools typically write data as fast as they can in 
 parallel, from many nodes and they are not artificially rate-limited, so C* 
 is the bottleneck here. Also, increasing the number of nodes doesn't really 
 help, because in a collocated setup this also increases number of 
 Hadoop/Spark nodes (writers) and although possible write performance is 
 higher, the problem still remains.
 We observe the following behavior:
 1. data is ingested at an extreme fast pace into memtables and flush queue 
 fills up
 2. the available memory limit for memtables is reached and writes are no 
 longer accepted
 3. the application gets hit by write timeout, and retries repeatedly, in 
 vain 
 4. after several failed attempts to write, the job gets aborted 
 Desired behaviour:
 1. data is ingested at an extreme fast pace into memtables and flush queue 
 fills up
 2. after exceeding some memtable fill threshold, C* applies adaptive rate 
 limiting to writes - the more the buffers are filled-up, the less writes/s 
 are accepted, however writes still occur within the write timeout.
 3. thanks to slowed down data ingestion, now flush can finish before all the 
 memory gets used
 Of course the details how rate limiting could be done are up for a discussion.
 It may be also worth considering putting such logic into the driver, not C* 
 core, but then C* needs to expose at least the following information to the 
 driver, so we could calculate the desired maximum data rate:
 1. current amount of memory available for writes before they would completely 
 block
 2. total amount of data queued to be flushed and flush progress (amount of 
 data to flush remaining for the memtable currently being flushed)
 3. average flush write speed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9318) Bound the number of in-flight requests at the coordinator

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9318:
--
Assignee: Jacek Lewandowski  (was: Ariel Weisberg)

 Bound the number of in-flight requests at the coordinator
 -

 Key: CASSANDRA-9318
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9318
 Project: Cassandra
  Issue Type: Improvement
Reporter: Ariel Weisberg
Assignee: Jacek Lewandowski
 Fix For: 2.1.x, 2.2.x


 It's possible to somewhat bound the amount of load accepted into the cluster 
 by bounding the number of in-flight requests and request bytes.
 An implementation might do something like track the number of outstanding 
 bytes and requests and if it reaches a high watermark disable read on client 
 connections until it goes back below some low watermark.
 Need to make sure that disabling read on the client connection won't 
 introduce other issues.
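 A minimal sketch of the watermark idea; {{pauseReads}} and {{resumeReads}} are 
 hypothetical hooks standing in for whatever mechanism actually disables read on 
 client connections:
 {code}
 import java.util.concurrent.atomic.AtomicLong;

 // Illustrative watermark sketch: track outstanding request bytes; stop reading
 // from client connections above the high watermark, resume below the low one.
 // The pauseReads/resumeReads hooks are assumed to be idempotent.
 class InflightLimiterSketch
 {
     private final AtomicLong inflightBytes = new AtomicLong();
     private final long highWatermark;
     private final long lowWatermark;
     private final Runnable pauseReads;
     private final Runnable resumeReads;

     InflightLimiterSketch(long high, long low, Runnable pauseReads, Runnable resumeReads)
     {
         this.highWatermark = high;
         this.lowWatermark = low;
         this.pauseReads = pauseReads;
         this.resumeReads = resumeReads;
     }

     void onRequestStarted(long bytes)
     {
         if (inflightBytes.addAndGet(bytes) >= highWatermark)
             pauseReads.run();   // stop accepting new work from clients
     }

     void onRequestCompleted(long bytes)
     {
         if (inflightBytes.addAndGet(-bytes) <= lowWatermark)
             resumeReads.run();  // safe to read from client connections again
     }
 }
 {code}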



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9483) Document incompatibilities with -XX:+PerfDisableSharedMem

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9483:
--
Assignee: T Jake Luciani  (was: Tyler Hobbs)

 Document incompatibilities with -XX:+PerfDisableSharedMem
 -

 Key: CASSANDRA-9483
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9483
 Project: Cassandra
  Issue Type: Task
  Components: Config, Documentation & website
Reporter: Tyler Hobbs
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 3.0 beta 1


 We recently discovered that [the Jolokia agent is incompatible with the 
 -XX:+PerfDisableSharedMem JVM 
 option|https://github.com/rhuss/jolokia/issues/198].  I assume that this may 
 affect other monitoring tools as well.
 If we are going to leave this enabled by default, we should document the 
 potential problems with it.  A combination of a comment in 
 {{cassandra-env.sh}} (and the Windows equivalent) and a comment in NEWS.txt 
 should suffice, I think.
 If possible, it would be good to figure out what other tools are affected and 
 also mention them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9416) 3.x should refuse to start on JVM_VERSION < 1.8

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9416:
--
Assignee: Philip Thompson

 3.x should refuse to start on JVM_VERSION < 1.8
 ---

 Key: CASSANDRA-9416
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9416
 Project: Cassandra
  Issue Type: Task
Reporter: Michael Shuler
Assignee: Philip Thompson
Priority: Minor
  Labels: lhf
 Fix For: 3.0 beta 1

 Attachments: trunk-9416.patch


 When I was looking at CASSANDRA-9408, I noticed that 
 {{conf/cassandra-env.sh}} and {{conf/cassandra-env.ps1}} do JVM version 
 checking and should get updated for 3.x to refuse to start with JVM_VERSION < 
 1.8.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9498) If more than 65K columns, sparse layout will break

2015-07-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14639216#comment-14639216
 ] 

Jonathan Ellis commented on CASSANDRA-9498:
---

With 9499 finished what is left here?

 If more than 65K columns, sparse layout will break
 --

 Key: CASSANDRA-9498
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9498
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Priority: Minor
 Fix For: 3.0 beta 1


 Follow up to CASSANDRA-8099. It is a relatively small bug, since the exposed 
 population of users is likely to be very low, but fixing it in a good way is 
 a bit tricky. I'm filing a separate JIRA, because I would like us to address 
 this by introducing a writeVInt method to DataOutputStreamPlus, that we can 
 also exploit to improve the encoding of timestamps and deletion times, and 
 this JIRA will help to track the dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9717) TestCommitLog segment size dtests fail on trunk

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9717:
--
Assignee: Jim Witschey  (was: Branimir Lambov)
Reviewer: Ariel Weisberg

 TestCommitLog segment size dtests fail on trunk
 ---

 Key: CASSANDRA-9717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9717
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Jim Witschey
Assignee: Jim Witschey
Priority: Blocker
 Fix For: 3.0 beta 1


 The test checks the commit log segment size when the specified size is 32MB. It 
 fails for me locally and on cassci. ([cassci 
 link|http://cassci.datastax.com/view/trunk/job/trunk_dtest/305/testReport/commitlog_test/TestCommitLog/default_segment_size_test/])
 The command to run the test by itself is {{CASSANDRA_VERSION=git:trunk 
 nosetests commitlog_test.py:TestCommitLog.default_segment_size_test}}.
 EDIT: a similar test, 
 {{commitlog_test.py:TestCommitLog.small_segment_size_test}}, also fails with 
 a similar error.
 The solution here may just be to change the expected size or the acceptable 
 error -- the result isn't far off. I'm happy to make the dtest change if 
 that's the solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9799) RangeTombstonListTest sometimes fails on trunk

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9799:
--
Reviewer: Joshua McKenzie

 RangeTombstonListTest sometimes fails on trunk
 --

 Key: CASSANDRA-9799
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9799
 Project: Cassandra
  Issue Type: Test
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
  Labels: test
 Fix For: 3.0 beta 1


 I've seen random failures with {{RangeTombstoneList.addAllRandomTest}}. The 
 problem is 2 inequalities in {{RangeTombstoneList.insertFrom}} that should be 
 inclusive rather than strict when we deal with boundaries between ranges. In 
 practice, that makes us consider ranges like {{[3, 3)}} during addition, 
 which is nonsensical.
 Attaching a patch as well as a test that reproduces the problem (extracted 
 from {{addAllRandomTest}} with a failing seed).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9847) Don't serialize CFMetaData in read responses

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9847:
--
Reviewer: Joshua McKenzie

 Don't serialize CFMetaData in read responses
 

 Key: CASSANDRA-9847
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9847
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 3.0 beta 1


 Our CFMetaData ids are 16 bytes long, which for small messages is a 
 non-trivial part of the size (and we currently serialize the id unnecessarily 
 with every partition). At least for read responses, we don't really need to 
 serialize it at all, since we always know which query a response belongs to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9828) Minor improvements to RowStats

2015-07-23 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9828:
--
Reviewer: Joshua McKenzie

 Minor improvements to RowStats
 --

 Key: CASSANDRA-9828
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9828
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 3.0 beta 1


 There are some small improvements/refactorings I'd like to do for {{RowStats}}. 
 More specifically, I'm attaching 3 commits:
 # the first one merely renames {{RowStats}} to {{EncodingStats}}. {{RowStats}} 
 was not a terribly helpful name, while {{EncodingStats}} at least gives a 
 sense of why the thing exists.
 # the 2nd one improves the serialization of those {{EncodingStats}}. 
 {{EncodingStats}} holds both a {{minTimestamp}} and a 
 {{minLocalDeletionTime}}, both of which are unix timestamps (or at least 
 should be almost all the time for the timestamp, by convention) and so are 
 fairly big numbers that don't get much love (if any) from vint encoding. So 
 the patch introduces hard-coded epoch numbers for both that roughly 
 correspond to now, and subtracts them from the actual {{EncodingStats}} 
 numbers to make them more amenable to vint encoding. It does mean the exact 
 encoding size will deteriorate over time, but it'll take a while before it 
 becomes useless and we'll probably have made more changes to the encodings 
 by then anyway (and/or we can change the epoch numbers regularly with new 
 versions of the messaging protocol if we so wish).
 # the last patch is just a small, simple cleanup.
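
The epoch-offset trick in the 2nd commit is easy to see with a small sketch (an assumption for illustration; the epoch constant and method names below are hypothetical, the real values live in the patch): subtracting a hard-coded recent epoch leaves a small delta that vint encoding can store in a few bytes.

{code}
public final class EpochOffsetSketch
{
    // Hypothetical epoch roughly corresponding to "now" when the format is frozen (mid-2015, in microseconds).
    static final long TIMESTAMP_EPOCH_MICROS = 1_437_000_000_000_000L;

    static long toWireValue(long timestampMicros)
    {
        return timestampMicros - TIMESTAMP_EPOCH_MICROS; // small delta, cheap to vint-encode
    }

    static long fromWireValue(long wireValue)
    {
        return wireValue + TIMESTAMP_EPOCH_MICROS;
    }

    public static void main(String[] args)
    {
        long nowMicros = System.currentTimeMillis() * 1000;
        System.out.println("raw = " + nowMicros + ", offset-encoded = " + toWireValue(nowMicros));
    }
}
{code}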



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637211#comment-14637211
 ] 

Jonathan Ellis commented on CASSANDRA-9302:
---

([~aholmber] will provide us a pure-python murmur hash, so we can start in on 
the cqlsh side of TAR while that's happening.)

 Optimize cqlsh COPY FROM, part 3
 

 Key: CASSANDRA-9302
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jonathan Ellis
Assignee: David Kua
 Fix For: 2.1.x


 We've had some discussion of moving to Spark CSV import for bulk load in 3.x, 
 but people need a good bulk load tool now.  One option is to add a separate 
 Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
 from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
 we point people, rather than adding more tools that need to be supported 
 indefinitely.
 Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
 CASSANDRA-8225.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9304) COPY TO improvements

2015-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9304:
--
Reviewer: Stefania Alborghetti

[~stefania_alborghetti] to review

 COPY TO improvements
 

 Key: CASSANDRA-9304
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9304
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Kua
Priority: Minor
  Labels: cqlsh
 Fix For: 2.1.x


 COPY FROM has gotten a lot of love.  COPY TO not so much.  One obvious 
 improvement could be to parallelize reading and writing (write one page of 
 data while fetching the next).
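
The parallelization suggested above is a classic bounded producer/consumer pipeline. As a rough illustration only (cqlsh is Python; the class and names below are made up), a Java sketch of overlapping "fetch next page" with "write current page":

{code}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public final class PagedExportSketch
{
    private static final List<String> END = Collections.emptyList(); // end-of-stream marker

    public static void main(String[] args) throws InterruptedException
    {
        // Capacity 2 lets the fetcher stay one page ahead of the writer.
        BlockingQueue<List<String>> pages = new ArrayBlockingQueue<>(2);

        Thread fetcher = new Thread(() -> {
            try
            {
                for (int page = 0; page < 5; page++) // stand-in for driver paging
                    pages.put(Arrays.asList("row-" + page + "-a", "row-" + page + "-b"));
                pages.put(END);
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
            }
        });
        fetcher.start();

        // The writer consumes the current page while the fetcher is already fetching the next one.
        for (List<String> page = pages.take(); page != END; page = pages.take())
            for (String row : page)
                System.out.println(row); // stand-in for CSV output

        fetcher.join();
    }
}
{code}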



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9448) Metrics should use up to date nomenclature

2015-07-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14637659#comment-14637659
 ] 

Jonathan Ellis commented on CASSANDRA-9448:
---

Now that we don't cache entire partitions I actually think rowCache makes more 
sense.

 Metrics should use up to date nomenclature
 --

 Key: CASSANDRA-9448
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9448
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Sam Tunnicliffe
Assignee: Stefania
  Labels: docs-impacting, jmx
 Fix For: 3.0 beta 1


 There are a number of exposed metrics that currently are named using the old 
 nomenclature of columnfamily and rows (meaning partitions).
 It would be good to audit all metrics and update any names to match what they 
 actually represent; we should probably do that in a single sweep to avoid a 
 confusing mixture of old and new terminology. 
 As we'd need to do this in a major release, I've initially set the fixver for 
 3.0 beta1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9871) Cannot replace token does not exist - DN node removed as Fat Client

2015-07-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9871:
--
Assignee: Stefania

 Cannot replace token does not exist - DN node removed as Fat Client
 ---

 Key: CASSANDRA-9871
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9871
 Project: Cassandra
  Issue Type: Bug
Reporter: Sebastian Estevez
Assignee: Stefania
 Fix For: 2.1.x


 We lost a node due to disk failure, we tried to replace it via 
 -Dcassandra.replace_address per -- 
 http://docs.datastax.com/en/cassandra/2.1/cassandra/operations/opsReplaceNode.html
 The node would not come up with these errors in the system.log:
 {code}
 INFO  [main] 2015-07-22 03:20:06,722  StorageService.java:500 - Gathering 
 node replacement information for /10.171.115.233
 ...
 INFO  [SharedPool-Worker-1] 2015-07-22 03:22:34,281  Gossiper.java:954 - 
 InetAddress /10.111.183.101 is now UP
 INFO  [GossipTasks:1] 2015-07-22 03:22:59,300  Gossiper.java:735 - FatClient 
 /10.171.115.233 has been silent for 3ms, removing from gossip
 ERROR [main] 2015-07-22 03:23:28,485  CassandraDaemon.java:541 - Exception 
 encountered during startup
 java.lang.UnsupportedOperationException: Cannot replace token 
 -1013652079972151677 which does not exist!
 {code}
 It is not clear why Gossiper removed the node as a FatClient, given that it 
 was a full node before it died and it had tokens assigned to it (including 
 -1013652079972151677) in system.peers and nodetool ring. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9644) DTCS configuration proposals for handling consequences of repairs

2015-07-21 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9644:
--
Labels: compaction dtcs  (was: dtcs)

 DTCS configuration proposals for handling consequences of repairs
 -

 Key: CASSANDRA-9644
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9644
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Antti Nissinen
  Labels: compaction, dtcs
 Fix For: 3.x, 2.1.x

 Attachments: node0_20150621_1646_time_graph.txt, 
 node0_20150621_2320_time_graph.txt, node0_20150623_1526_time_graph.txt, 
 node1_20150621_1646_time_graph.txt, node1_20150621_2320_time_graph.txt, 
 node1_20150623_1526_time_graph.txt, node2_20150621_1646_time_graph.txt, 
 node2_20150621_2320_time_graph.txt, node2_20150623_1526_time_graph.txt, 
 nodetool status infos.txt, sstable_compaction_trace.txt, 
 sstable_compaction_trace_snipped.txt, sstable_counts.jpg


 This is a document bringing up some issues that arise when DTCS is used to 
 compact time series data in a three-node cluster. DTCS is currently 
 configured with a few parameters that keep the configuration fairly simple, 
 but that might cause problems in certain special cases, such as recovering 
 from the flood of small SSTables produced by a repair operation. We are 
 suggesting some ideas that might be a starting point for further 
 discussions. The following sections contain:
 - Description of the Cassandra setup
 - Feeding process of the data
 - Failure testing
 - Issues caused by the repair operations for the DTCS
 - Proposal for the DTCS configuration parameters
 Attachments are included to support the discussion and there is a separate 
 section explaining them.
 Cassandra setup and data model
 - The cluster is composed of three nodes running Cassandra 2.1.2. The 
 replication factor is two and read and write consistency levels are ONE.
 - Data is time series data. Data is saved so that one row contains a certain 
 time span of data for a given metric ( 20 days in this case). The row key 
 contains information about the start time of the time span and the metric 
 name. The column name gives the offset from the beginning of the time span. 
 The column timestamp is set by adding the offset to the timestamp in the row 
 key (i.e. the actual timestamp of the data point). The data model is 
 analogous to the KairosDB implementation.
 - The average sampling rate is 10 seconds, varying significantly from metric 
 to metric.
 - 100 000 metrics are fed to Cassandra.
 - max_sstable_age_days is set to 5 days (the objective is to keep SSTable 
 files at a manageable size, around 50 GB).
 - TTL is not in use in the test.
 Procedure for the failure test
 - Data is first dumped to Cassandra for 11 days and the dumping is then 
 stopped so that DTCS has a chance to finish all compactions. Data is dumped 
 with fake timestamps so that the column timestamp is set when data is 
 written to Cassandra.
 - One of the nodes is taken down and new data is dumped on top of the 
 earlier data, covering a couple of hours' worth of data (faked timestamps).
 - Dumping is stopped and the node is kept down for a few hours.
 - The node is brought back up and nodetool repair is run on the node that 
 was down.
 Consequences
 - The repair operation leads to a massive amount of new SSTables far back in 
 the history. The new SSTables cover time spans similar to the files that 
 were created by DTCS before the shutdown of one of the nodes.
 - To be able to compact the small files, max_sstable_age_days would have to 
 be increased to allow compaction to handle them. However, in a practical 
 case the time window then grows so large that the generated files become 
 huge, which is not desirable. The compaction also combines one very large 
 file with a bunch of small files in several phases, which is not efficient. 
 Generating really large files may also lead to out-of-disk-space problems.
 - See the list of time graphs later in the document.
 Improvement proposals for the DTCS configuration
 Below is a list of desired properties for the configuration. Current 
 parameters are mentioned where available.
 - Initial window size (currently: base_time_seconds)
 - The number of similar-size windows used for bucketing (currently: 
 min_threshold)
 - The multiplier for the window size when it is increased (currently: 
 min_threshold). We would like this to be independent of the min_threshold 
 parameter, so that you can actually control how fast the window size 
 increases.
 - Maximum length of the time window inside which files are assigned to a 
 certain bucket (not currently defined). This means that expansion of the 
 time window length is restricted. When the limit is 

[jira] [Updated] (CASSANDRA-9644) DTCS configuration proposals for handling consequences of repairs

2015-07-21 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9644:
--
Assignee: Marcus Eriksson

 DTCS configuration proposals for handling consequences of repairs
 -

 Key: CASSANDRA-9644
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9644
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Antti Nissinen
Assignee: Marcus Eriksson
  Labels: compaction, dtcs
 Fix For: 3.x, 2.1.x

 Attachments: node0_20150621_1646_time_graph.txt, 
 node0_20150621_2320_time_graph.txt, node0_20150623_1526_time_graph.txt, 
 node1_20150621_1646_time_graph.txt, node1_20150621_2320_time_graph.txt, 
 node1_20150623_1526_time_graph.txt, node2_20150621_1646_time_graph.txt, 
 node2_20150621_2320_time_graph.txt, node2_20150623_1526_time_graph.txt, 
 nodetool status infos.txt, sstable_compaction_trace.txt, 
 sstable_compaction_trace_snipped.txt, sstable_counts.jpg


 This is a document bringing up some issues that arise when DTCS is used to 
 compact time series data in a three-node cluster. DTCS is currently 
 configured with a few parameters that keep the configuration fairly simple, 
 but that might cause problems in certain special cases, such as recovering 
 from the flood of small SSTables produced by a repair operation. We are 
 suggesting some ideas that might be a starting point for further 
 discussions. The following sections contain:
 - Description of the Cassandra setup
 - Feeding process of the data
 - Failure testing
 - Issues caused by the repair operations for the DTCS
 - Proposal for the DTCS configuration parameters
 Attachments are included to support the discussion and there is a separate 
 section explaining them.
 Cassandra setup and data model
 - The cluster is composed of three nodes running Cassandra 2.1.2. The 
 replication factor is two and read and write consistency levels are ONE.
 - Data is time series data. Data is saved so that one row contains a certain 
 time span of data for a given metric ( 20 days in this case). The row key 
 contains information about the start time of the time span and the metric 
 name. The column name gives the offset from the beginning of the time span. 
 The column timestamp is set by adding the offset to the timestamp in the row 
 key (i.e. the actual timestamp of the data point). The data model is 
 analogous to the KairosDB implementation.
 - The average sampling rate is 10 seconds, varying significantly from metric 
 to metric.
 - 100 000 metrics are fed to Cassandra.
 - max_sstable_age_days is set to 5 days (the objective is to keep SSTable 
 files at a manageable size, around 50 GB).
 - TTL is not in use in the test.
 Procedure for the failure test
 - Data is first dumped to Cassandra for 11 days and the dumping is then 
 stopped so that DTCS has a chance to finish all compactions. Data is dumped 
 with fake timestamps so that the column timestamp is set when data is 
 written to Cassandra.
 - One of the nodes is taken down and new data is dumped on top of the 
 earlier data, covering a couple of hours' worth of data (faked timestamps).
 - Dumping is stopped and the node is kept down for a few hours.
 - The node is brought back up and nodetool repair is run on the node that 
 was down.
 Consequences
 - The repair operation leads to a massive amount of new SSTables far back in 
 the history. The new SSTables cover time spans similar to the files that 
 were created by DTCS before the shutdown of one of the nodes.
 - To be able to compact the small files, max_sstable_age_days would have to 
 be increased to allow compaction to handle them. However, in a practical 
 case the time window then grows so large that the generated files become 
 huge, which is not desirable. The compaction also combines one very large 
 file with a bunch of small files in several phases, which is not efficient. 
 Generating really large files may also lead to out-of-disk-space problems.
 - See the list of time graphs later in the document.
 Improvement proposals for the DTCS configuration
 Below is a list of desired properties for the configuration. Current 
 parameters are mentioned where available.
 - Initial window size (currently: base_time_seconds)
 - The number of similar-size windows used for bucketing (currently: 
 min_threshold)
 - The multiplier for the window size when it is increased (currently: 
 min_threshold). We would like this to be independent of the min_threshold 
 parameter, so that you can actually control how fast the window size 
 increases.
 - Maximum length of the time window inside which files are assigned to a 
 certain bucket (not currently defined). This means that expansion of the 
 time window length is 

[jira] [Commented] (CASSANDRA-9851) Write Durability Failures Even During Batch Commit Mode

2015-07-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14634149#comment-14634149
 ] 

Jonathan Ellis commented on CASSANDRA-9851:
---

Can you bisect?

 Write Durability Failures Even During Batch Commit Mode 
 

 Key: CASSANDRA-9851
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9851
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian, x86_64, Kernel 3.16.7
Reporter: Joel Knighton
 Attachments: n1.log, n2.log, n3.log, n4.log, n5.log


 Reproducible as of a66863861136a29dc04d7bc3b319f9f8fae0f49f on trunk, as well 
 as in other recent commits.
 Durability of writes seems to be violated, even under batch commitlog mode. 
 This issue was discovered by a test that adds a range of values to a CQL Set, 
 with no deletes issued. The test is available here: 
 https://github.com/riptano/jepsen/blob/cassandra/cassandra/src/cassandra/collections/set.clj#L56.
 During this write pattern, random nodes in the 5-node cluster are kill -9ed. 
 Once all nodes have been brought back up, another read at CL.ALL is issued. 
 This read fails to return values that have previously been successfully read 
 from the cluster. This problem is not reproducible on 2.1.* or 2.2.
 Log files from each node are attached.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632541#comment-14632541
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

Why not just apply MV maintenance to streamed rows the way we do 2i maintenance?

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632542#comment-14632542
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

Where do we rely on a single node?

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-18 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632545#comment-14632545
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

The majority of use cases are going to be denormalizing what are today query 
tables, i.e., I want to give the client what it needs by scanning a single 
partition.  Doing extra queries to save disk space may occasionally be 
necessary but it is not the norm.

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631410#comment-14631410
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

bq. From a user's perspective, I agree with Sylvain that the MV should respect 
the CL. I wouldn't expect to do a write at ALL, then do a read and get an old 
record back.

But the other side of that coin is that we're effectively promoting all 
operations to at least QUORUM regardless of what the user asked for...

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631422#comment-14631422
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

1. Paired replica?  What?

2. Under what conditions does replica BL save you from replaying coordinator BL?

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631426#comment-14631426
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

Pedantically you are correct.  Which is why I said effectively and not 
literally. :)

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631466#comment-14631466
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

No, you're right.  Synchronous MV updates are a terrible idea, which is more 
obvious when considering the case of more than one MV.  In the extreme case you 
could touch every node in the cluster.

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631471#comment-14631471
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

If there are multiple MVs being updated, do they get merged into a single set 
of batchlogs?  (I.e. just one on the coordinator and one on each base replica, 
instead of one per MV.)

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631734#comment-14631734
 ] 

Jonathan Ellis commented on CASSANDRA-6477:
---

I disagree about making synchronous the default.  As Jake points out, that can 
kill your availability even on a single MV if you are unlucky with replica 
placement, and it's virtually guaranteed to kill it with many MVs.  I would go 
so far as to say that synchronous MV updates are not useful and we should not 
bother adding them.

 Materialized Views (was: Global Indexes)
 

 Key: CASSANDRA-6477
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
 Project: Cassandra
  Issue Type: New Feature
  Components: API, Core
Reporter: Jonathan Ellis
Assignee: Carl Yeksigian
  Labels: cql
 Fix For: 3.0 beta 1

 Attachments: test-view-data.sh, users.yaml


 Local indexes are suitable for low-cardinality data, where spreading the 
 index across the cluster is a Good Thing.  However, for high-cardinality 
 data, local indexes require querying most nodes in the cluster even if only a 
 handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9842) Creation of partition and update of static columns in the same LWT fails

2015-07-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-9842.
---
Resolution: Not A Problem

Cassandra's behavior is correct: the partition not existing is not the same as 
it existing with a null value.

 Creation of partition and update of static columns in the same LWT fails
 

 Key: CASSANDRA-9842
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9842
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: cassandra-2.1.8 on Ubuntu 15.04
Reporter: Chandra Sekar

 Inserting a row (into a non-existent partition) and updating a static 
 column in the same LWT fails. Creating the partition before performing the 
 LWT works.
 h3. Table Definition
 {code}
 create table txtable(pcol bigint, ccol bigint, scol bigint static, ncol text, 
 primary key((pcol), ccol));
 {code}
 h3. Inserting row in non-existent partition and updating static column in one 
 LWT
 {code}
 begin batch
 insert into txtable (pcol, ccol, ncol) values (1, 1, 'A');
 update txtable set scol = 1 where pcol = 1 if scol = null;
 apply batch;
 [applied]
 ---
  False
 {code}
 h3. Creating partition before LWT
 {code}
 insert into txtable (pcol, scol) values (1, null) if not exists;
 begin batch
 insert into txtable (pcol, ccol, ncol) values (1, 1, 'A');
 update txtable set scol = 1 where pcol = 1 if scol = null;
 apply batch;
 [applied]
 ---
  True
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9843) Augment or replace partition index with adaptive range filters

2015-07-17 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-9843:
-

 Summary: Augment or replace partition index with adaptive range 
filters
 Key: CASSANDRA-9843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9843
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: T Jake Luciani


Adaptive range filters are, in principle, bloom filters for range queries.  
They provide a space-efficient way to avoid scanning a partition when we can 
tell that we do not contain any data for the range requested.  Like BF, they 
can return false positives but not false negatives.

The implementation is of course totally different from BF.  ARF is a tree where 
each leaf of the tree is a range of data and a bit, either on or off, denoting 
whether we have *some* data in that range.

ARF are described here: http://www.vldb.org/pvldb/vol6/p1714-kossmann.pdf
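
As a toy illustration of the leaf-level idea (an assumption only: a flat leaf table rather than the paper's adaptive tree, with made-up names), the lookup that lets a scan skip a data file might look like:

{code}
import java.util.TreeMap;

public final class ArfSketch
{
    // Leaves keyed by range start; each leaf covers [start, start + width) and stores one bit:
    // "this data file may contain something in this range".
    private final TreeMap<Long, Boolean> leaves = new TreeMap<>();

    ArfSketch(long min, long max, int leafCount)
    {
        long width = (max - min) / leafCount;
        for (int i = 0; i < leafCount; i++)
            leaves.put(min + i * width, false);
    }

    void markPresent(long key)
    {
        leaves.put(leaves.floorKey(key), true); // flip the bit for the leaf covering this key
    }

    // May return false positives (a whole leaf is flagged), never false negatives.
    boolean mayContain(long start, long end)
    {
        for (Boolean bit : leaves.subMap(leaves.floorKey(start), true, end, true).values())
            if (bit)
                return true;
        return false;
    }

    public static void main(String[] args)
    {
        ArfSketch arf = new ArfSketch(0, 1_000, 10);
        arf.markPresent(250);
        System.out.println(arf.mayContain(200, 300)); // true
        System.out.println(arf.mayContain(600, 700)); // false -> safe to skip the scan
    }
}
{code}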



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9842) Creation of partition and update of static columns in the same LWT fails

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632248#comment-14632248
 ] 

Jonathan Ellis commented on CASSANDRA-9842:
---

The LWT clause is evaluated before the rest of the batch.  Statement order 
doesn't matter.

 Creation of partition and update of static columns in the same LWT fails
 

 Key: CASSANDRA-9842
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9842
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: cassandra-2.1.8 on Ubuntu 15.04
Reporter: Chandra Sekar

 Inserting a row (into a non-existent partition) and updating a static 
 column in the same LWT fails. Creating the partition before performing the 
 LWT works.
 h3. Table Definition
 {code}
 create table txtable(pcol bigint, ccol bigint, scol bigint static, ncol text, 
 primary key((pcol), ccol));
 {code}
 h3. Inserting row in non-existent partition and updating static column in one 
 LWT
 {code}
 begin batch
 insert into txtable (pcol, ccol, ncol) values (1, 1, 'A');
 update txtable set scol = 1 where pcol = 1 if scol = null;
 apply batch;
 [applied]
 ---
  False
 {code}
 h3. Creating partition before LWT
 {code}
 insert into txtable (pcol, scol) values (1, null) if not exists;
 begin batch
 insert into txtable (pcol, ccol, ncol) values (1, 1, 'A');
 update txtable set scol = 1 where pcol = 1 if scol = null;
 apply batch;
 [applied]
 ---
  True
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9843) Augment or replace partition index with adaptive range filters

2015-07-17 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-9843:
--
Labels: performance  (was: )

 Augment or replace partition index with adaptive range filters
 --

 Key: CASSANDRA-9843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9843
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: T Jake Luciani
  Labels: performance

 Adaptive range filters are, in principle, bloom filters for range queries.  
 They provide a space-efficient way to avoid scanning a partition when we can 
 tell that we do not contain any data for the range requested.  Like BF, they 
 can return false positives but not false negatives.
 The implementation is of course totally different from BF.  ARF is a tree 
 where each leaf of the tree is a range of data and a bit, either on or off, 
 denoting whether we have *some* data in that range.
 ARF are described here: http://www.vldb.org/pvldb/vol6/p1714-kossmann.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9843) Augment or replace partition index with adaptive range filters

2015-07-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14632257#comment-14632257
 ] 

Jonathan Ellis commented on CASSANDRA-9843:
---

Rather than just adding an ARF per partition (the way we used to have a BF -- 
the difference is that BF is not useful for scans but this would be), we may be 
able to adapt this further by moving our index into the ARF.  Instead of just a 
bit indicating yes or no we could have the offset for the start of each range 
[that we do have data for] in the leaf.

(The adaptive in ARF means you can tune it to index hot parts of the data 
range in greater detail, without increasing the total memory used, at the cost 
of less detail for the cold ranges.  We could do this in Cassandra as well, 
writing updated ARF to a new file.  This could reduce the memory problems of 
pulling the indexes for very large partitions into memory.  However, the paper 
describes very good results even without adaptation, so this is not required 
for proof of concept.)

 Augment or replace partition index with adaptive range filters
 --

 Key: CASSANDRA-9843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9843
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: T Jake Luciani
  Labels: performance

 Adaptive range filters are, in principle, bloom filters for range queries.  
 They provide a space-efficient way to avoid scanning a partition when we can 
 tell that we do not contain any data for the range requested.  Like BF, they 
 can return false positives but not false negatives.
 The implementation is of course totally different from BF.  ARF is a tree 
 where each leaf of the tree is a range of data and a bit, either on or off, 
 denoting whether we have *some* data in that range.
 ARF are described here: http://www.vldb.org/pvldb/vol6/p1714-kossmann.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

