[jira] [Updated] (CASSANDRA-10056) Fix AggregationTest post-test error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-10056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10056: --- Reviewer: Benjamin Lerer [~blerer] to review Fix AggregationTest post-test error messages Key: CASSANDRA-10056 URL: https://issues.apache.org/jira/browse/CASSANDRA-10056 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Trivial Fix For: 2.2.x AggregationTest prints error messages after test execution since some UDTs cannot be dropped. It's not critical to the tests themselves, but fixing it makes the log cleaner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9237) Gossip messages subject to head of line blocking by other intra-cluster traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694318#comment-14694318 ] Jonathan Ellis commented on CASSANDRA-9237: --- WFM. Gossip messages subject to head of line blocking by other intra-cluster traffic --- Key: CASSANDRA-9237 URL: https://issues.apache.org/jira/browse/CASSANDRA-9237 Project: Cassandra Issue Type: Improvement Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0.0 rc1 Reported as an issue over less-than-perfect networks like VPNs between data centers. Gossip goes over the small message socket, where "small" is < 64k, which isn't particularly small. This is done for performance, to keep most traffic on one hot socket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
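One way to avoid the head-of-line blocking described above is to give gossip a dedicated connection instead of sharing the small-message socket. A minimal sketch of that routing decision (the `ConnectionRouter`, `route`, and the three-way connection split are illustrative assumptions, not Cassandra's actual OutboundTcpConnection code):

```java
public class ConnectionRouter {
    enum ConnectionType { GOSSIP, SMALL_MESSAGE, LARGE_MESSAGE }

    static final int SMALL_MESSAGE_LIMIT = 64 * 1024; // "small" is < 64k

    // Pick an outbound connection for a message. Gossip gets its own
    // connection so it can never queue behind other intra-cluster traffic,
    // which may itself carry payloads close to the 64k limit.
    static ConnectionType route(boolean isGossip, int payloadBytes) {
        if (isGossip)
            return ConnectionType.GOSSIP;
        return payloadBytes < SMALL_MESSAGE_LIMIT
             ? ConnectionType.SMALL_MESSAGE
             : ConnectionType.LARGE_MESSAGE;
    }

    public static void main(String[] args) {
        System.out.println(route(true, 100));           // GOSSIP
        System.out.println(route(false, 100));          // SMALL_MESSAGE
        System.out.println(route(false, 128 * 1024));   // LARGE_MESSAGE
    }
}
```

The point is that latency-critical gossip messages never share a send queue with bulk traffic, so a slow VPN link draining a large payload cannot delay failure-detector heartbeats.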
[jira] [Commented] (CASSANDRA-10060) Reuse TemporalRow when updating multiple MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-10060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14694493#comment-14694493 ] Jonathan Ellis commented on CASSANDRA-10060: Does this combine the batchlogs generated at the replica too? Would that help? Reuse TemporalRow when updating multiple MaterializedViews -- Key: CASSANDRA-10060 URL: https://issues.apache.org/jira/browse/CASSANDRA-10060 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 3.0.0 rc1 If a table has 5 associated MVs, the current logic reads the existing row for the incoming mutation 5 times. If we reuse the data from the first MV update, we can cut out any further reads. We know the existing data isn't changing because we are holding a lock on the partition. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
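The read-once optimization described in the ticket amounts to memoizing the existing-row lookup across the N view updates. A toy sketch of the shape of the change (the `memoize` helper, the `Supplier<String>` stand-in for TemporalRow, and the read counter are all hypothetical; safe only because, as the ticket notes, the partition lock guarantees the row cannot change underneath us):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class ReusedRowDemo {
    static AtomicInteger diskReads = new AtomicInteger();

    // Stand-in for the expensive read of the existing base-table row.
    static String readExistingRow() {
        diskReads.incrementAndGet();
        return "existing-row";
    }

    // Wrap the read so the first view update performs it and the
    // remaining view updates reuse the cached result.
    static Supplier<String> memoize(Supplier<String> read) {
        return new Supplier<>() {
            String cached;
            public String get() {
                if (cached == null) cached = read.get();
                return cached;
            }
        };
    }

    public static void main(String[] args) {
        Supplier<String> row = memoize(ReusedRowDemo::readExistingRow);
        for (int view = 0; view < 5; view++)   // 5 MVs, one base mutation
            row.get();
        System.out.println("reads=" + diskReads.get()); // reads=1, not 5
    }
}
```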
[jira] [Commented] (CASSANDRA-10052) Bring one node down, makes the whole cluster go down for a second
[ https://issues.apache.org/jira/browse/CASSANDRA-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693539#comment-14693539 ] Jonathan Ellis commented on CASSANDRA-10052: How do you have clients connecting to non-localhost, if you've configured it to listen on localhost? Bring one node down, makes the whole cluster go down for a second - Key: CASSANDRA-10052 URL: https://issues.apache.org/jira/browse/CASSANDRA-10052 Project: Cassandra Issue Type: Bug Reporter: Sharvanath Pathak Priority: Critical When a node goes down, the other nodes learn that through gossip, and I do see the log from Gossiper.java: {code} private void markDead(InetAddress addr, EndpointState localState) { if (logger.isTraceEnabled()) logger.trace("marking as down {}", addr); localState.markDead(); liveEndpoints.remove(addr); unreachableEndpoints.put(addr, System.nanoTime()); logger.info("InetAddress {} is now DOWN", addr); for (IEndpointStateChangeSubscriber subscriber : subscribers) subscriber.onDead(addr, localState); if (logger.isTraceEnabled()) logger.trace("Notified " + subscribers); } {code} saying "InetAddress 192.168.101.1 is now DOWN" in Cassandra's system log. Now on all the other nodes the client side (Java driver) says "Cannot connect to any host, scheduling retry in 1000 milliseconds". They eventually do reconnect, but some queries fail during this intermediate period. To me it seems like when the server pushes the nodeDown event, it calls getRpcAddress(endpoint), and thus sends localhost as the argument in the nodeDown event. As in org.apache.cassandra.transport.Server.java: {code} public void onDown(InetAddress endpoint) { server.connectionTracker.send(Event.StatusChange.nodeDown(getRpcAddress(endpoint), server.socket.getPort())); } {code} getRpcAddress returns localhost for any endpoint if cassandra.yaml uses localhost as the rpc_address configuration (which, by the way, is the default). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved
[ https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9917: -- Assignee: Paulo Motta (was: Carl Yeksigian) MVs should validate gc grace seconds on the tables involved --- Key: CASSANDRA-9917 URL: https://issues.apache.org/jira/browse/CASSANDRA-9917 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Paulo Motta Labels: materializedviews Fix For: 3.0 beta 1 For correctness reasons (potential resurrection of dropped values), batchlog entries are TTL'd with the lowest gc grace seconds of all the tables involved in a batch. This means that if gc_grace_seconds is set to 0 in one of the tables, the batchlog entry will be dead on arrival, and never replayed. We should probably warn against such LOGGED writes taking place in general, but for MVs we must validate that gc_grace_seconds on the base table (and on the MV table, if we should allow altering it there at all) is never set too low. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
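The rule the ticket wants enforced can be stated in a few lines: the batchlog entry's TTL is the minimum gc_grace_seconds over the tables in the batch, so a zero anywhere makes the entry dead on arrival. A hedged sketch of that check (the `GcGraceValidator` class and `validateForView` method are hypothetical names, not the actual patch):

```java
public class GcGraceValidator {
    // The batchlog entry is TTL'd with the lowest gc_grace_seconds of all
    // tables in the batch; if any table has gc_grace_seconds = 0 the entry
    // expires immediately and is never replayed.
    static int batchlogTtl(int... gcGraceSeconds) {
        int min = Integer.MAX_VALUE;
        for (int g : gcGraceSeconds)
            min = Math.min(min, g);
        return min;
    }

    // Reject MV creation/alteration that would break batchlog replay.
    static void validateForView(int baseTableGcGrace) {
        if (batchlogTtl(baseTableGcGrace) == 0)
            throw new IllegalArgumentException(
                "gc_grace_seconds = 0 would make MV batchlog entries dead on arrival");
    }

    public static void main(String[] args) {
        System.out.println("ttl=" + batchlogTtl(864000, 3600)); // ttl=3600
        try {
            validateForView(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```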
[jira] [Commented] (CASSANDRA-10045) Sparse/Dense decision should be made per-row, not per-file
[ https://issues.apache.org/jira/browse/CASSANDRA-10045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14693545#comment-14693545 ] Jonathan Ellis commented on CASSANDRA-10045: Okay, but let's keep fixver targeted at the must-have release. Sparse/Dense decision should be made per-row, not per-file -- Key: CASSANDRA-10045 URL: https://issues.apache.org/jira/browse/CASSANDRA-10045 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 3.0.0 rc1 Marking this as beta 1 in the hope I have time to rustle it up and get it reviewed beforehand. If I do not, I will let it slide, but our behaviour right now is not brilliant for workloads with a variance in density, and it should not be challenging to make a more targeted decision. We can also make use of CASSANDRA-9894 to make column encoding more efficient in many, even dense, cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10045) Sparse/Dense decision should be made per-row, not per-file
[ https://issues.apache.org/jira/browse/CASSANDRA-10045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10045: --- Fix Version/s: (was: 3.0 beta 1) 3.0.0 rc1 Sparse/Dense decision should be made per-row, not per-file -- Key: CASSANDRA-10045 URL: https://issues.apache.org/jira/browse/CASSANDRA-10045 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 3.0.0 rc1 Marking this as beta 1 in the hope I have time to rustle it up and get it reviewed beforehand. If I do not, I will let it slide, but our behaviour right now is not brilliant for workloads with a variance in density, and it should not be challenging to make a more targeted decision. We can also make use of CASSANDRA-9894 to make column encoding more efficient in many, even dense, cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10052) Bringing one node down, makes the whole cluster go down for a second
[ https://issues.apache.org/jira/browse/CASSANDRA-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10052: --- Assignee: Stefania I see. Sounds like we should just special-case it and not send anything from onDown if a peer listening on localhost goes down. Bringing one node down, makes the whole cluster go down for a second Key: CASSANDRA-10052 URL: https://issues.apache.org/jira/browse/CASSANDRA-10052 Project: Cassandra Issue Type: Bug Reporter: Sharvanath Pathak Assignee: Stefania Priority: Critical When a node goes down, the other nodes learn that through gossip, and I do see the log from Gossiper.java: {code} private void markDead(InetAddress addr, EndpointState localState) { if (logger.isTraceEnabled()) logger.trace("marking as down {}", addr); localState.markDead(); liveEndpoints.remove(addr); unreachableEndpoints.put(addr, System.nanoTime()); logger.info("InetAddress {} is now DOWN", addr); for (IEndpointStateChangeSubscriber subscriber : subscribers) subscriber.onDead(addr, localState); if (logger.isTraceEnabled()) logger.trace("Notified " + subscribers); } {code} saying "InetAddress 192.168.101.1 is now DOWN" in Cassandra's system log. Now on all the other nodes the client side (Java driver) says "Cannot connect to any host, scheduling retry in 1000 milliseconds". They eventually do reconnect, but some queries fail during this intermediate period. To me it seems like when the server pushes the nodeDown event, it calls getRpcAddress(endpoint), and thus sends localhost as the argument in the nodeDown event. As in org.apache.cassandra.transport.Server.java: {code} public void onDown(InetAddress endpoint) { server.connectionTracker.send(Event.StatusChange.nodeDown(getRpcAddress(endpoint), server.socket.getPort())); } {code} getRpcAddress returns localhost for any endpoint if cassandra.yaml uses localhost as the rpc_address configuration (which, by the way, is the default). 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
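The special-case suggested in the update above boils down to suppressing the push when the peer's rpc_address resolves to loopback, since telling every client "localhost is down" makes each of them think its own coordinator died. A hedged sketch of that filter (the `DownEventFilter` class and `shouldSendNodeDown` method are hypothetical, not the actual patch):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class DownEventFilter {
    // Hypothetical guard for Server.onDown: when rpc_address is the
    // localhost default, the address is meaningless to remote clients,
    // so no nodeDown event should be broadcast for it.
    static boolean shouldSendNodeDown(InetAddress rpcAddress) {
        return !rpcAddress.isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // 127.0.0.1 is suppressed; a routable peer address is not.
        System.out.println(shouldSendNodeDown(InetAddress.getByName("127.0.0.1")));     // false
        System.out.println(shouldSendNodeDown(InetAddress.getByName("192.168.101.1"))); // true
    }
}
```

`InetAddress.getByName` does no DNS lookup for literal IP addresses, so the example is self-contained.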
[jira] [Updated] (CASSANDRA-10052) Bringing one node down, makes the whole cluster go down for a second
[ https://issues.apache.org/jira/browse/CASSANDRA-10052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10052: --- Reviewer: Olivier Michallat (was: Sylvain Lebresne) Bringing one node down, makes the whole cluster go down for a second Key: CASSANDRA-10052 URL: https://issues.apache.org/jira/browse/CASSANDRA-10052 Project: Cassandra Issue Type: Bug Reporter: Sharvanath Pathak Assignee: Stefania Priority: Critical When a node goes down, the other nodes learn that through gossip, and I do see the log from Gossiper.java: {code} private void markDead(InetAddress addr, EndpointState localState) { if (logger.isTraceEnabled()) logger.trace("marking as down {}", addr); localState.markDead(); liveEndpoints.remove(addr); unreachableEndpoints.put(addr, System.nanoTime()); logger.info("InetAddress {} is now DOWN", addr); for (IEndpointStateChangeSubscriber subscriber : subscribers) subscriber.onDead(addr, localState); if (logger.isTraceEnabled()) logger.trace("Notified " + subscribers); } {code} saying "InetAddress 192.168.101.1 is now DOWN" in Cassandra's system log. Now on all the other nodes the client side (Java driver) says "Cannot connect to any host, scheduling retry in 1000 milliseconds". They eventually do reconnect, but some queries fail during this intermediate period. To me it seems like when the server pushes the nodeDown event, it calls getRpcAddress(endpoint), and thus sends localhost as the argument in the nodeDown event. As in org.apache.cassandra.transport.Server.java: {code} public void onDown(InetAddress endpoint) { server.connectionTracker.send(Event.StatusChange.nodeDown(getRpcAddress(endpoint), server.socket.getPort())); } {code} getRpcAddress returns localhost for any endpoint if cassandra.yaml uses localhost as the rpc_address configuration (which, by the way, is the default). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10049) Commitlog initialization failure
[ https://issues.apache.org/jira/browse/CASSANDRA-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10049: --- Fix Version/s: (was: 3.0 beta 1) 3.0.0 rc1 Commitlog initialization failure Key: CASSANDRA-10049 URL: https://issues.apache.org/jira/browse/CASSANDRA-10049 Project: Cassandra Issue Type: Bug Reporter: T Jake Luciani Assignee: Branimir Lambov Fix For: 3.0.0 rc1 I've encountered this error locally during some dtests. It looks like a race condition in the commit log code. http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/lastCompletedBuild/testReport/consistency_test/TestAccuracy/test_network_topology_strategy_users_2/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10045) Sparse/Dense decision should be made per-row, not per-file
[ https://issues.apache.org/jira/browse/CASSANDRA-10045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692307#comment-14692307 ] Jonathan Ellis commented on CASSANDRA-10045: Will this change the sstable format? If not, there is no rush to get it in before b1. Sparse/Dense decision should be made per-row, not per-file -- Key: CASSANDRA-10045 URL: https://issues.apache.org/jira/browse/CASSANDRA-10045 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Benedict Assignee: Benedict Priority: Minor Fix For: 3.0 beta 1 Marking this as beta 1 in the hope I have time to rustle it up and get it reviewed beforehand. If I do not, I will let it slide, but our behaviour right now is not brilliant for workloads with a variance in density, and it should not be challenging to make a more targeted decision. We can also make use of CASSANDRA-9894 to make column encoding more efficient in many, even dense, cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8887) Direct (de)compression of internode communication
[ https://issues.apache.org/jira/browse/CASSANDRA-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8887: -- Fix Version/s: 3.x Direct (de)compression of internode communication - Key: CASSANDRA-8887 URL: https://issues.apache.org/jira/browse/CASSANDRA-8887 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Matt Stump Assignee: Ariel Weisberg Fix For: 3.x Internode compression is on by default. Currently we allocate one set of buffers for the raw data, and then compress which results in another set of buffers. This greatly increases the GC load. We can decrease the GC load by doing direct compression/decompression of the communication buffers. This is the same work as done in CASSANDRA-8464 but applied to internode communication. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
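The double-buffering the ticket complains about can be avoided by compressing straight between direct buffers. A minimal sketch of the idea, assuming Java 11+ (the `Deflater`/`Inflater` ByteBuffer overloads were added in Java 11; the `DirectCompressDemo` class and `roundTrip` helper are illustrative, not Cassandra's internode code, which uses its own compressors):

```java
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DirectCompressDemo {
    // Compress and decompress directly between off-heap ByteBuffers, so the
    // bytes never detour through extra on-heap byte[] copies. Returns the
    // compressed size after verifying a lossless round trip.
    static int roundTrip(byte[] payload) {
        ByteBuffer raw = ByteBuffer.allocateDirect(payload.length);
        raw.put(payload).flip();

        ByteBuffer compressed = ByteBuffer.allocateDirect(payload.length + 64);
        Deflater deflater = new Deflater();
        deflater.setInput(raw);                 // direct buffer in
        deflater.finish();
        deflater.deflate(compressed);           // direct buffer out
        deflater.end();
        compressed.flip();
        int compressedSize = compressed.remaining();

        ByteBuffer restored = ByteBuffer.allocateDirect(payload.length);
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        try {
            inflater.inflate(restored);
        } catch (DataFormatException e) {
            throw new AssertionError("round trip failed", e);
        }
        inflater.end();
        restored.flip();

        byte[] out = new byte[restored.remaining()];
        restored.get(out);
        if (!java.util.Arrays.equals(out, payload))
            throw new AssertionError("decompressed bytes differ");
        return compressedSize;
    }

    public static void main(String[] args) {
        byte[] payload = "internode message payload ".repeat(100).getBytes();
        System.out.println("compressed " + payload.length + " -> "
                           + roundTrip(payload) + " bytes");
    }
}
```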
[jira] [Resolved] (CASSANDRA-4175) Reduce memory, disk space, and cpu usage with a column name/id map
[ https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-4175. --- Resolution: Duplicate Assignee: (was: Jason Brown) Fix Version/s: (was: 3.x) Column name duplication is removed in CASSANDRA-8099. (See https://github.com/pcmanus/cassandra/blob/8099_engine_refactor/guide_8099.md.) (We can do slightly better by encoding column ids in the schema, but doing it on a per-sstable basis is almost as good from a disk space perspective.) IMO we should leave dealing with highly duplicated column *values* to the compression layer. Reduce memory, disk space, and cpu usage with a column name/id map -- Key: CASSANDRA-4175 URL: https://issues.apache.org/jira/browse/CASSANDRA-4175 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Labels: performance We spend a lot of memory on column names, both transiently (during reads) and more permanently (in the row cache). Compression mitigates this on disk but not on the heap. The overhead is significant for typical small column values, e.g., ints. Even though we intern once we get to the memtable, this affects writes too via very high allocation rates in the young generation, hence more GC activity. Now that CQL3 provides us some guarantees that column names must be defined before they are inserted, we could create a map of (say) 32-bit int column ids to names, and use that internally right up until we return a resultset to the client. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
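The name/id map proposed in the description is a straightforward interning structure. A minimal sketch, with the `ColumnIdMap` class name and its methods invented for illustration (CASSANDRA-8099 ultimately solved the duplication differently, as the resolution comment notes):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ColumnIdMap {
    // CQL3 guarantees column names are declared before use, so each name
    // can be replaced internally by a compact int id and only translated
    // back when building the client result set.
    private final Map<String, Integer> nameToId = new HashMap<>();
    private final List<String> idToName = new ArrayList<>();

    int idFor(String name) {
        return nameToId.computeIfAbsent(name, n -> {
            idToName.add(n);
            return idToName.size() - 1;
        });
    }

    String nameFor(int id) {
        return idToName.get(id);
    }

    public static void main(String[] args) {
        ColumnIdMap map = new ColumnIdMap();
        System.out.println(map.idFor("user_id"));   // 0
        System.out.println(map.idFor("email"));     // 1
        System.out.println(map.idFor("user_id"));   // 0 (interned, not re-allocated)
        System.out.println(map.nameFor(1));         // email
    }
}
```

Every in-flight cell then carries a 4-byte id instead of a heap String, which is where the memory and young-gen allocation savings come from.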
[jira] [Updated] (CASSANDRA-9749) CommitLogReplayer continues startup after encountering errors
[ https://issues.apache.org/jira/browse/CASSANDRA-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9749: -- Reviewer: Ariel Weisberg [~aweisberg] to review CommitLogReplayer continues startup after encountering errors - Key: CASSANDRA-9749 URL: https://issues.apache.org/jira/browse/CASSANDRA-9749 Project: Cassandra Issue Type: Bug Reporter: Blake Eggleston Assignee: Branimir Lambov Fix For: 2.2.x There are a few places where the commit log recovery method either skips sections or just returns when it encounters errors. Specifically if it can't read the header here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L298 Or if there are compressor problems here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L314 and here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L366 Whether these are user-fixable or not, I think we should require more direct user intervention (ie: fix what's wrong, or remove the bad file and restart) since we're basically losing data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9945) Add transparent data encryption core classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9945: -- Fix Version/s: (was: 3.x) 3.2 Add transparent data encryption core classes Key: CASSANDRA-9945 URL: https://issues.apache.org/jira/browse/CASSANDRA-9945 Project: Cassandra Issue Type: Improvement Reporter: Jason Brown Assignee: Jason Brown Labels: encryption Fix For: 3.2 This patch will add the core infrastructure classes necessary for transparent data encryption (file-level encryption), as required for CASSANDRA-6018 and CASSANDRA-9633. The phrase "transparent data encryption", while not the most aesthetically pleasing, seems to be used throughout the database industry (Oracle, SQL Server, DataStax Enterprise) to describe file-level encryption, so we'll go with that as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9882) DTCS (maybe other strategies) can block flushing when there are lots of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-9882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14682256#comment-14682256 ] Jonathan Ellis commented on CASSANDRA-9882: --- I don't see any value in making this configurable. DTCS (maybe other strategies) can block flushing when there are lots of sstables Key: CASSANDRA-9882 URL: https://issues.apache.org/jira/browse/CASSANDRA-9882 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Marcus Eriksson Labels: dtcs Fix For: 2.1.x, 2.2.x MemtableFlushWriter tasks can get blocked by Compaction getNextBackgroundTask. This is in a wonky cluster with 200k sstables in the CF, but seems bad for flushing to be blocked by getNextBackgroundTask when we are trying to make these new smart strategies that may take some time to calculate what to do. {noformat} MemtableFlushWriter:21 daemon prio=10 tid=0x7ff7ad965000 nid=0x6693 waiting for monitor entry [0x7ff78a667000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:237) - waiting to lock 0x0006fcdbbf60 (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518) at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178) at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234) at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1475) at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1127) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - 0x000743b3ac38 (a java.util.concurrent.ThreadPoolExecutor$Worker) MemtableFlushWriter:19 daemon prio=10 tid=0x7ff7ac57a000 nid=0x649b waiting for monitor entry [0x7ff78b8ee000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:237) - waiting to lock 0x0006fcdbbf60 (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518) at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178) at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234) at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1475) at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1127) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) CompactionExecutor:14 daemon prio=10 tid=0x7ff7ad359800 nid=0x4d59 runnable [0x7fecce3ea000] java.lang.Thread.State: RUNNABLE at org.apache.cassandra.io.sstable.SSTableReader.equals(SSTableReader.java:628) at com.google.common.collect.ImmutableSet.construct(ImmutableSet.java:206) at com.google.common.collect.ImmutableSet.construct(ImmutableSet.java:220) at 
com.google.common.collect.ImmutableSet.access$000(ImmutableSet.java:74) at com.google.common.collect.ImmutableSet$Builder.build(ImmutableSet.java:531) at com.google.common.collect.Sets$1.immutableCopy(Sets.java:606) at org.apache.cassandra.db.ColumnFamilyStore.getOverlappingSSTables(ColumnFamilyStore.java:1352) at
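The thread dump above shows MemtableFlushWriter threads blocked on the WrappingCompactionStrategy monitor while a CompactionExecutor thread holds it inside getNextBackgroundTask. One hedged sketch of a fix shape (not the actual Cassandra patch; `AsyncNotifyDemo` and `notifyAddedAsync` are invented names) is to hand the notification to a dedicated executor so the flush path returns immediately:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncNotifyDemo {
    // Instead of the flush thread calling handleNotification synchronously
    // (and blocking on the strategy monitor), submit the notification to a
    // single-thread executor and return at once. Returns true when the
    // flush path provably did not wait for the handler.
    static boolean demo() {
        ExecutorService notifier = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch handled = new CountDownLatch(1);
        try {
            notifier.submit(() -> {
                try { release.await(); } catch (InterruptedException ignored) {}
                handled.countDown();     // slow strategy bookkeeping runs here
            });
            // We reach this line while the notification is still pending:
            // proof the "flush" caller did not block on the handler.
            boolean flushNotBlocked = handled.getCount() == 1;
            release.countDown();
            return flushNotBlocked && handled.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            return false;
        } finally {
            notifier.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() ? "flush path unblocked" : "unexpected blocking");
    }
}
```

The ordering guarantee of a single-thread executor keeps notifications applied in submission order, which the strategy bookkeeping would require.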
[jira] [Updated] (CASSANDRA-9487) CommitLogTest hangs intermittently in 2.0
[ https://issues.apache.org/jira/browse/CASSANDRA-9487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9487: -- Reviewer: Ariel Weisberg Not merged yet AFAIK. Assigning aweisberg to review. CommitLogTest hangs intermittently in 2.0 - Key: CASSANDRA-9487 URL: https://issues.apache.org/jira/browse/CASSANDRA-9487 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Branimir Lambov Fix For: 2.0.x Attachments: system.log Possibly related to CASSANDRA-8992 ? 2.0 unit tests are hanging periodically in the same way (I have not gone through all the branches, so can't say we're in the clear everywhere - marking for just 2.x at the moment). CommitLogTest hung system.log attached from local reproduction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14682261#comment-14682261 ] Jonathan Ellis commented on CASSANDRA-4338: --- Is this obsoleted by CASSANDRA-9500? Experiment with direct buffer in SequentialWriter - Key: CASSANDRA-4338 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Branimir Lambov Priority: Minor Labels: performance Fix For: 2.1.x Attachments: 4338-gc.tar.gz, 4338.benchmark.png, 4338.benchmark.snappycompressor.png, 4338.single_node.read.png, 4338.single_node.write.png, gc-4338-patched.png, gc-trunk-me.png, gc-trunk.png, gc-with-patch-me.png Using a direct buffer instead of a heap-based byte[] should let us avoid a copy into native memory when we flush the buffer. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10047) nodetool aborts when attempting to cleanup a keyspace with no ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-10047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10047: --- Priority: Minor (was: Critical) nodetool aborts when attempting to cleanup a keyspace with no ranges Key: CASSANDRA-10047 URL: https://issues.apache.org/jira/browse/CASSANDRA-10047 Project: Cassandra Issue Type: Bug Components: Core Environment: 2.1.8 Reporter: Russell Bradberry Priority: Minor When running nodetool cleanup in a DC that has no ranges for a keyspace, nodetool will abort with the following message when attempting to cleanup that keyspace: {code} Aborted cleaning up atleast one column family in keyspace ks, check server logs for more information. error: nodetool failed, check server logs -- StackTrace -- java.lang.RuntimeException: nodetool failed, check server logs at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290) at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202) {code} The error messages in the logs are : {code} CompactionManager.java:370 - Cleanup cannot run before a node has joined the ring {code} This behavior prevents subsequent keyspaces from getting cleaned up. The error message is also misleading as it suggests that the only reason a node may not have ranges for a keyspace is because it has yet to join the ring. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8887) Direct (de)compression of internode communication
[ https://issues.apache.org/jira/browse/CASSANDRA-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8887: -- Priority: Minor (was: Major) Direct (de)compression of internode communication - Key: CASSANDRA-8887 URL: https://issues.apache.org/jira/browse/CASSANDRA-8887 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Matt Stump Assignee: Ariel Weisberg Priority: Minor Fix For: 3.x Internode compression is on by default. Currently we allocate one set of buffers for the raw data, and then compress which results in another set of buffers. This greatly increases the GC load. We can decrease the GC load by doing direct compression/decompression of the communication buffers. This is the same work as done in CASSANDRA-8464 but applied to internode communication. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8457) nio MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8457: -- Priority: Minor (was: Major) nio MessagingService Key: CASSANDRA-8457 URL: https://issues.apache.org/jira/browse/CASSANDRA-8457 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Assignee: Ariel Weisberg Priority: Minor Labels: performance Fix For: 3.x Thread-per-peer (actually two each incoming and outbound) is a big contributor to context switching, especially for larger clusters. Let's look at switching to nio, possibly via Netty. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9259: -- Priority: Critical (was: Major) Bulk Reading from Cassandra --- Key: CASSANDRA-9259 URL: https://issues.apache.org/jira/browse/CASSANDRA-9259 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brian Hess Assignee: Ariel Weisberg Priority: Critical Fix For: 3.x This ticket is following on from the 2015 NGCC. This ticket is designed to be a place for discussing and designing an approach to bulk reading. The goal is to have a bulk reading path for Cassandra. That is, a path optimized to grab a large portion of the data for a table (potentially all of it). This is a core element in the Spark integration with Cassandra, and the speed at which Cassandra can deliver bulk data to Spark is limiting the performance of Spark-plus-Cassandra operations. This is especially of importance as Cassandra will (likely) leverage Spark for internal operations (for example CASSANDRA-8234). The core CQL to consider is the following: SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND Token(partitionKey) <= Y Here, we choose X and Y to be contained within one token range (perhaps considering the primary range of a node without vnodes, for example). This query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk operations via Spark (or other processing frameworks - ETL, etc). There are a few causes (e.g., inefficient paging). There are a few approaches that could be considered. First, we consider a new Streaming Compaction approach. The key observation here is that a bulk read from Cassandra is a lot like a major compaction, though instead of outputting a new SSTable we would output CQL rows to a stream/socket/etc. This would be similar to a CompactionTask, but would strip out some unnecessary things in there (e.g., some of the indexing, etc). 
Predicates and projections could also be encapsulated in this new StreamingCompactionTask, for example. Another approach would be an alternate storage format. For example, we might employ Parquet (just as an example) to store the same data as in the primary Cassandra storage (aka SSTables). This is akin to Global Indexes (an alternate storage of the same data optimized for a particular query). Then, Cassandra can choose to leverage this alternate storage for particular CQL queries (e.g., range scans). These are just 2 suggestions to get the conversation going. One thing to note is that it will be useful to have this storage segregated by token range so that when you extract via these mechanisms you do not get replications-factor numbers of copies of the data. That will certainly be an issue for some Spark operations (e.g., counting). Thus, we will want per-token-range storage (even for single disks), so this will likely leverage CASSANDRA-6696 (though, we'll want to also consider the single disk case). It is also worth discussing what the success criteria is here. It is unlikely to be as fast as EDW or HDFS performance (though, that is still a good goal), but being within some percentage of that performance should be set as success. For example, 2x as long as doing bulk operations on HDFS with similar node count/size/etc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
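The per-token-range extraction described above (each `Token(pk) > X AND Token(pk) <= Y` slice read exactly once, no replication-factor duplicates) reduces to covering the ring with contiguous half-open ranges. A hedged sketch of that bookkeeping (the `TokenRangeSplitter` class is a hypothetical helper, not driver or server code):

```java
import java.util.ArrayList;
import java.util.List;

public class TokenRangeSplitter {
    // Cover the token ring [min, max] with N contiguous half-open
    // (start, end] ranges, each of which maps to one
    // "SELECT ... WHERE Token(pk) > start AND Token(pk) <= end" query.
    static List<long[]> split(long min, long max, int parts) {
        List<long[]> ranges = new ArrayList<>();
        long span = Long.divideUnsigned(max - min, parts); // overflow-safe width
        long start = min;
        for (int i = 0; i < parts; i++) {
            long end = (i == parts - 1) ? max : start + span;
            ranges.add(new long[]{ start, end });          // (start, end]
            start = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Full Murmur3-style long token ring split into 4 slices.
        for (long[] r : split(Long.MIN_VALUE, Long.MAX_VALUE, 4))
            System.out.println("(" + r[0] + ", " + r[1] + "]");
    }
}
```

Because the ranges are contiguous and non-overlapping, summing per-range results (e.g., a Spark count) touches each row exactly once.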
[jira] [Updated] (CASSANDRA-8906) Experiment with optimizing partition merging when we can prove that some sources don't overlap
[ https://issues.apache.org/jira/browse/CASSANDRA-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8906: -- Priority: Minor (was: Major) Experiment with optimizing partition merging when we can prove that some sources don't overlap -- Key: CASSANDRA-8906 URL: https://issues.apache.org/jira/browse/CASSANDRA-8906 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Ariel Weisberg Priority: Minor Labels: compaction, performance Fix For: 3.x When we merge a partition from two sources and it turns out that those 2 sources don't overlap for that partition, we still end up doing one comparison per row in the first source. However, if we can prove that the 2 sources don't overlap, for example by using the sstable min/max clustering values that we store, we could speed this up. Note that in practice it's a little bit more hairy because we need to deal with N sources, but that's probably not too hard either. I'll note that using the sstable min/max clustering values is not terribly precise. We could do better if we were to push the same reasoning inside the merge iterator, for instance by using the sstable per-partition index, which can in theory tell us things like "don't bother comparing rows until the end of this row block". This is quite a bit more involved though, so maybe not worth the complexity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
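The proposed fast path can be sketched in isolation (hypothetical names, not Cassandra's actual merge iterator): if one source's last row sorts before the other's first row, the sources provably don't overlap and can be concatenated in O(1) comparisons instead of comparing row by row.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NonOverlapMerge {
    /**
     * Merge two sorted lists. If the sources provably don't overlap
     * (the max of one sorts before the min of the other), concatenate
     * them instead of doing a per-element comparison.
     */
    static <T> List<T> merge(List<T> a, List<T> b, Comparator<T> cmp) {
        List<T> out = new ArrayList<>(a.size() + b.size());
        if (a.isEmpty() || b.isEmpty() || cmp.compare(a.get(a.size() - 1), b.get(0)) < 0) {
            out.addAll(a); out.addAll(b);          // fast path: no overlap
        } else if (cmp.compare(b.get(b.size() - 1), a.get(0)) < 0) {
            out.addAll(b); out.addAll(a);          // fast path, reversed order
        } else {
            int i = 0, j = 0;                      // slow path: standard merge
            while (i < a.size() && j < b.size())
                out.add(cmp.compare(a.get(i), b.get(j)) <= 0 ? a.get(i++) : b.get(j++));
            while (i < a.size()) out.add(a.get(i++));
            while (j < b.size()) out.add(b.get(j++));
        }
        return out;
    }
}
```

Generalizing to N sources, as the ticket notes, means checking pairwise min/max bounds before falling back to the full merge.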
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9259: -- Issue Type: New Feature (was: Improvement) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9237) Gossip messages subject to head of line blocking by other intra-cluster traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9237: -- Fix Version/s: 3.0.0 rc1 Gossip messages subject to head of line blocking by other intra-cluster traffic --- Key: CASSANDRA-9237 URL: https://issues.apache.org/jira/browse/CASSANDRA-9237 Project: Cassandra Issue Type: Improvement Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0.0 rc1 Reported as an issue over less-than-perfect networks, like VPNs between data centers. Gossip goes over the small-message socket, where "small" means up to 64k, which isn't particularly small. This is done for performance, to keep most traffic on one hot socket. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9259) Bulk Reading from Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9259: -- Fix Version/s: 3.x -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9237) Gossip messages subject to head of line blocking by other intra-cluster traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14692509#comment-14692509 ] Jonathan Ellis commented on CASSANDRA-9237: --- (IMO either of those would also be appropriate for 2.2.x.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9023) 2.0.13 write timeouts on driver
[ https://issues.apache.org/jira/browse/CASSANDRA-9023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-9023. --- Resolution: Cannot Reproduce Fix Version/s: (was: 2.0.x) 2.0.13 write timeouts on driver --- Key: CASSANDRA-9023 URL: https://issues.apache.org/jira/browse/CASSANDRA-9023 Project: Cassandra Issue Type: Bug Environment: For testing, using only a single node; hardware configuration as follows: CPU(s): 16 On-line CPU(s) list: 0-15 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 1 NUMA node(s): 1 Vendor ID: GenuineIntel CPU MHz: 2000.174 L1d cache: 32K L1i cache: 32K L2 cache: 256K L3 cache: 20480K NUMA node0 CPU(s): 0-15 OS: Linux version 2.6.32-504.8.1.el6.x86_64 (mockbu...@c6b9.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-11) (GCC) ) Disk: there is only a single disk in RAID; I think space is 500 GB, of which 5 GB is used. Reporter: anishek Assignee: Ariel Weisberg Attachments: out_system.log Initially asked @ http://www.mail-archive.com/user@cassandra.apache.org/msg41621.html and was suggested to post here. If any more details are required, please let me know. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9237) Gossip messages subject to head of line blocking by other intra-cluster traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14692554#comment-14692554 ] Jonathan Ellis commented on CASSANDRA-9237: --- Why not just switch it back to GOSSIP + INTERNAL if we're going to consider that? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9940) ReadResponse serializes and then deserializes local responses
[ https://issues.apache.org/jira/browse/CASSANDRA-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9940: -- Fix Version/s: 3.x ReadResponse serializes and then deserializes local responses - Key: CASSANDRA-9940 URL: https://issues.apache.org/jira/browse/CASSANDRA-9940 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.x Noticed this reviewing CASSANDRA-9894. It would be nice to not have to do this busy work. Benedict said it wasn't straightforward to avoid because it's being done to allow the read op order group to close. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7061) High accuracy, low overhead local read/write tracing
[ https://issues.apache.org/jira/browse/CASSANDRA-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7061: -- Assignee: (was: Ariel Weisberg) Fix Version/s: (was: 3.x) High accuracy, low overhead local read/write tracing Key: CASSANDRA-7061 URL: https://issues.apache.org/jira/browse/CASSANDRA-7061 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict External profilers are pretty inadequate for getting accurate information at the granularity we're working at: tracing is too high overhead, so measures something completely different, and sampling suffers from bias of attribution due to the way the stack traces are retrieved. Hyperthreading can make this even worse. I propose to introduce an extremely low overhead tracing feature that must be enabled with a system property that will trace operations within the node only, so that we can perform various accurate low level analyses of performance. This information will include threading info, so that we can trace hand off delays and actual active time spent processing an operation. With the property disabled there will be no increased burden of tracing, however I hope to keep the total trace burden to less than one microsecond, and any single trace command to a few tens of nanos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8906) Experiment with optimizing partition merging when we can prove that some sources don't overlap
[ https://issues.apache.org/jira/browse/CASSANDRA-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8906: -- Assignee: (was: Ariel Weisberg) Fix Version/s: (was: 3.x) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9241) ByteBuffer.array() without ByteBuffer.arrayOffset() + ByteBuffer.position() is a bug
[ https://issues.apache.org/jira/browse/CASSANDRA-9241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9241: -- Reviewer: Stefania [~Stefania] to review ByteBuffer.array() without ByteBuffer.arrayOffset() + ByteBuffer.position() is a bug Key: CASSANDRA-9241 URL: https://issues.apache.org/jira/browse/CASSANDRA-9241 Project: Cassandra Issue Type: Bug Reporter: Ariel Weisberg Assignee: Ariel Weisberg Priority: Minor Fix For: 3.0.x I found one instance of this on OHCProvider, so it makes sense to review all usages since there aren't that many. Some suspect things: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/FastByteOperations.java#L197 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L1877 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/TokenSerializer.java#L40 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java#L178 https://github.com/apache/cassandra/blob/trunk/tools/stress/src/org/apache/cassandra/stress/operations/predefined/CqlOperation.java#L104 https://github.com/apache/cassandra/blob/trunk/tools/stress/src/org/apache/cassandra/stress/operations/predefined/CqlOperation.java#L543 https://github.com/apache/cassandra/blob/trunk/tools/stress/src/org/apache/cassandra/stress/operations/predefined/CqlOperation.java#L563 I made this list off of 8099, so I might have missed some instances on trunk. FastByteOperations makes me cross-eyed, so it is worth a second pass to make sure offsets in byte buffers are handled correctly. Generally I like to use the full incantation even when I have done things like allocate the buffer on the stack locally, for copy-paste/refactoring reasons and to make clear to new users how the API is supposed to work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
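The bug pattern is easy to demonstrate in isolation: for a sliced or duplicated buffer, `array()` exposes the entire backing array, so reads must start at `arrayOffset() + position()`. A self-contained illustration (the helper name is made up for the example):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class ArrayOffsetDemo {
    /** Correct: honour arrayOffset() and position() when reading the backing array. */
    static byte[] remainingBytes(ByteBuffer bb) {
        byte[] out = new byte[bb.remaining()];
        System.arraycopy(bb.array(), bb.arrayOffset() + bb.position(), out, 0, bb.remaining());
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer whole = ByteBuffer.wrap("hello world".getBytes(StandardCharsets.UTF_8));
        whole.position(6);
        ByteBuffer slice = whole.slice();   // arrayOffset() == 6, position() == 0
        // Buggy pattern: new String(slice.array()) yields "hello world",
        // because array() alone returns the whole backing array.
        System.out.println(new String(remainingBytes(slice), StandardCharsets.UTF_8));
    }
}
```

This is the "full incantation" the comment refers to: it stays correct whether the buffer was freshly allocated, wrapped, sliced, or duplicated.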
[jira] [Commented] (CASSANDRA-9237) Gossip messages subject to head of line blocking by other intra-cluster traffic
[ https://issues.apache.org/jira/browse/CASSANDRA-9237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14692505#comment-14692505 ] Jonathan Ellis commented on CASSANDRA-9237: --- I see two options for 3.0, both of which are better than the status quo: # Reduce the small-message threshold # Go back to the old heuristic of putting gossip and internal responses on a separate socket The problem with #2 in the past was that read responses, which are quite large, got jumbled in too. (REQUEST_RESPONSE is too large an umbrella.) We could split those out to their own verb, but it's not clear to me that putting write acks on the low traffic socket is a win. Any redefinition of liveness or heartbeat generation belongs in a new ticket and is something of an open-ended research project with no clear answers imo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6434) Repair-aware gc grace period
[ https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6434: -- Reviewer: Yuki Morishita (was: sankalp kohli) Repair-aware gc grace period - Key: CASSANDRA-6434 URL: https://issues.apache.org/jira/browse/CASSANDRA-6434 Project: Cassandra Issue Type: New Feature Components: Core Reporter: sankalp kohli Assignee: Marcus Eriksson Fix For: 3.0 beta 1 Since the reason for gcgs is to ensure that we don't purge tombstones until every replica has been notified, it's redundant in a world where we're tracking repair times per sstable (and repairing frequently), i.e., a world where we default to incremental repair a la CASSANDRA-5351. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-6434) Repair-aware gc grace period
[ https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14680212#comment-14680212 ] Jonathan Ellis edited comment on CASSANDRA-6434 at 8/10/15 2:48 PM: Sylvain is out for another week. Can you review [~kohlisankalp]? Edit: turns out Yuki is already working on it, assigning to him. was (Author: jbellis): Sylvain is out for another week. Can you review [~kohlisankalp]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6434) Repair-aware gc grace period
[ https://issues.apache.org/jira/browse/CASSANDRA-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6434: -- Reviewer: sankalp kohli (was: Sylvain Lebresne) Sylvain is out for another week. Can you review [~kohlisankalp]? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10006) 2.1 format sstable filenames with tmp are not handled by 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10006: --- Reviewer: Yuki Morishita [~yukim] to review 2.1 format sstable filenames with tmp are not handled by 3.0 -- Key: CASSANDRA-10006 URL: https://issues.apache.org/jira/browse/CASSANDRA-10006 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Stefania Fix For: 3.0 beta 1 In 3.0, {{Descriptor.fromFilename()}} doesn't handle tmp in sstable filenames in the 2.1 (ka) format. If you start 3.0 with one of these filenames, you'll see an exception like the following: {noformat} ERROR [main] 2015-08-05 10:15:57,872 CassandraDaemon.java:623 - Exception encountered during startup java.lang.AssertionError: Invalid file name system-schema_columns-tmp-ka-5-Filter.db in /tmp/dtest-Jstsy2/test/node1/data/system/schema_columns-296e9c049bec3085827dc17d3df2122a at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:291) ~[main/:na] at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:190) ~[main/:na] at org.apache.cassandra.service.StartupChecks$7$1.visitFile(StartupChecks.java:226) ~[main/:na] at org.apache.cassandra.service.StartupChecks$7$1.visitFile(StartupChecks.java:218) ~[main/:na] at java.nio.file.Files.walkFileTree(Files.java:2670) ~[na:1.8.0_45] at java.nio.file.Files.walkFileTree(Files.java:2742) ~[na:1.8.0_45] at org.apache.cassandra.service.StartupChecks$7.execute(StartupChecks.java:251) ~[main/:na] at org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:103) ~[main/:na] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:163) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:504) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:610) [main/:na] {noformat} I can reliably reproduce this with an [upgrade 
dtest|https://github.com/thobbs/cassandra-dtest/blob/8099-backwards-compat/upgrade_tests/cql_tests.py#L5126-L5162] from CASSANDRA-9704, but it should also be reproducible by simply starting 3.0 with a filename like the one from the error message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-7423) Allow updating individual subfields of UDT
[ https://issues.apache.org/jira/browse/CASSANDRA-7423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-7423: -- Assignee: Benjamin Lerer Allow updating individual subfields of UDT -- Key: CASSANDRA-7423 URL: https://issues.apache.org/jira/browse/CASSANDRA-7423 Project: Cassandra Issue Type: Improvement Components: API, Core Reporter: Tupshin Harper Assignee: Benjamin Lerer Labels: cql Fix For: 3.x Since user defined types were implemented in CASSANDRA-5590 as blobs (you have to rewrite the entire type in order to make any modifications), they can't be safely used without LWT for any operation that wants to modify a subset of the UDT's fields by any client process that is not authoritative for the entire blob. When trying to use UDTs to model complex records (particularly with nesting), this is not an exceptional circumstance; it is the totally expected normal situation. The use of UDTs for anything non-trivial is harmful to either performance or consistency or both. Edit: to clarify, I believe that most potential uses of UDTs should be considered anti-patterns until/unless we have field-level r/w access to individual elements of the UDT, with individual timestamps and standard LWW semantics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10020) Support eager retries for range queries
[ https://issues.apache.org/jira/browse/CASSANDRA-10020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10020: --- Priority: Minor (was: Critical) Issue Type: New Feature (was: Bug) Summary: Support eager retries for range queries (was: Range queries don't go on all replicas. ) Support eager retries for range queries --- Key: CASSANDRA-10020 URL: https://issues.apache.org/jira/browse/CASSANDRA-10020 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Gautam Kumar Priority: Minor A simple query like `select * from table` may time out if one of the nodes fails. We had a 4-node cassandra cluster with RF=3 and CL=LOCAL_QUORUM. The range query is issued to only two replicas, as per ConsistencyLevel.java: liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor(keyspace))); If a node in this sublist fails, the query will time out. Why don't you issue range queries to all replicas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
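The quoted selection logic can be reproduced in isolation to see why a single node failure causes a timeout; `blockFor` is hardcoded here to LOCAL_QUORUM with RF=3, and the class and node names are hypothetical, for illustration only.

```java
import java.util.List;

public class RangeEndpointSelection {
    /** Mirrors the quoted line: contact only the first blockFor live endpoints. */
    static List<String> filterForQuery(List<String> liveEndpoints, int blockFor) {
        return liveEndpoints.subList(0, Math.min(liveEndpoints.size(), blockFor));
    }

    public static void main(String[] args) {
        int blockFor = 3 / 2 + 1;  // LOCAL_QUORUM with RF=3 -> 2
        // Three replicas are alive, but only the first two are contacted;
        // if either fails mid-query, the range query times out even though
        // a third live replica also holds the data.
        System.out.println(filterForQuery(List.of("n1", "n2", "n3"), blockFor));
    }
}
```

An eager retry, as the new summary suggests, would fall back to the remaining live replica instead of waiting for the full timeout.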
[jira] [Updated] (CASSANDRA-10014) Deletions using clustering keys not reflected in MV
[ https://issues.apache.org/jira/browse/CASSANDRA-10014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10014: --- Since Version: 3.0 alpha 1 (was: 3.0.x) Deletions using clustering keys not reflected in MV --- Key: CASSANDRA-10014 URL: https://issues.apache.org/jira/browse/CASSANDRA-10014 Project: Cassandra Issue Type: Bug Reporter: Stefan Podkowinski Assignee: Carl Yeksigian Fix For: 3.0.0 rc1 I wrote a test to reproduce an [issue|http://stackoverflow.com/questions/31810841/cassandra-materialized-view-shows-stale-data/31860487] reported on SO and turns out this is easily reproducible. There seems to be a bug preventing deletes to be propagated to MVs in case a clustering key is used. See [here|https://github.com/spodkowinski/cassandra/commit/1c064523c8d8dbee30d46a03a0f58d3be97800dc] for test case (testClusteringKeyTombstone should fail). It seems {{MaterializedView.updateAffectsView()}} will not consider the delete relevant for the view as {{partition.deletionInfo().isLive()}} will be true during the test. In other test cases isLive will return false, which seems to be the actual problem here. I'm not even sure the root cause is MV specific, but wasn't able to dig much deeper as I'm not familiar with the slightly confusing semantics around DeletionInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10015) Create tool to debug why expired sstables are not getting dropped
[ https://issues.apache.org/jira/browse/CASSANDRA-10015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10015: --- Reviewer: Stefania Create tool to debug why expired sstables are not getting dropped - Key: CASSANDRA-10015 URL: https://issues.apache.org/jira/browse/CASSANDRA-10015 Project: Cassandra Issue Type: Improvement Reporter: Marcus Eriksson Assignee: Marcus Eriksson Fix For: 3.x, 2.1.x, 2.0.x, 2.2.x Sometimes fully expired sstables are not getting dropped, and it is a real pain to manually find out why. A tool that outputs which sstables block expired ones (by having older data than the newest tombstone in an expired sstable) would save a lot of time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
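One plausible shape for such a check, sketched with hypothetical names (the real rule also involves partition overlap, which is elided here): an sstable blocks an expired one when it contains data older than the newest tombstone in the expired sstable.

```java
import java.util.ArrayList;
import java.util.List;

public class ExpiredBlockerCheck {
    /** Minimal stand-in for per-sstable metadata (hypothetical, for illustration). */
    record SSTable(String name, long minTimestamp, long maxTombstoneTimestamp) {}

    /**
     * A candidate blocks the expired sstable if it holds data older than the
     * newest tombstone in the expired sstable: dropping the expired sstable
     * would lose the tombstone shadowing that data.
     */
    static List<String> blockers(SSTable expired, List<SSTable> candidates) {
        List<String> out = new ArrayList<>();
        for (SSTable s : candidates)
            if (s != expired && s.minTimestamp() < expired.maxTombstoneTimestamp())
                out.add(s.name());
        return out;
    }
}
```

A tool built around this predicate would simply print the offending sstable names, which is exactly the manual hunt the ticket wants to automate.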
[jira] [Updated] (CASSANDRA-10006) 2.1 format sstable filenames with tmp are not handled by 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10006: --- Assignee: Stefania (was: Yuki Morishita) 2.1 format sstable filenames with tmp are not handled by 3.0 -- Key: CASSANDRA-10006 URL: https://issues.apache.org/jira/browse/CASSANDRA-10006 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Stefania Fix For: 3.0 beta 1 In 3.0, {{Descriptor.fromFilename()}} doesn't handle tmp in sstable filenames in the 2.1 (ka) format. If you start 3.0 with one of these filenames, you'll see an exception like the following: {noformat} ERROR [main] 2015-08-05 10:15:57,872 CassandraDaemon.java:623 - Exception encountered during startup java.lang.AssertionError: Invalid file name system-schema_columns-tmp-ka-5-Filter.db in /tmp/dtest-Jstsy2/test/node1/data/system/schema_columns-296e9c049bec3085827dc17d3df2122a at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:291) ~[main/:na] at org.apache.cassandra.io.sstable.Descriptor.fromFilename(Descriptor.java:190) ~[main/:na] at org.apache.cassandra.service.StartupChecks$7$1.visitFile(StartupChecks.java:226) ~[main/:na] at org.apache.cassandra.service.StartupChecks$7$1.visitFile(StartupChecks.java:218) ~[main/:na] at java.nio.file.Files.walkFileTree(Files.java:2670) ~[na:1.8.0_45] at java.nio.file.Files.walkFileTree(Files.java:2742) ~[na:1.8.0_45] at org.apache.cassandra.service.StartupChecks$7.execute(StartupChecks.java:251) ~[main/:na] at org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:103) ~[main/:na] at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:163) [main/:na] at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:504) [main/:na] at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:610) [main/:na] {noformat} I can reliably reproduce this with an [upgrade 
dtest|https://github.com/thobbs/cassandra-dtest/blob/8099-backwards-compat/upgrade_tests/cql_tests.py#L5126-L5162] from CASSANDRA-9704, but it should also be reproducible by simply starting 3.0 with a filename like the one from the error message. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
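The startup check walks the data directory and hands every filename to {{Descriptor.fromFilename()}}, which asserts on the legacy tmp marker. A minimal sketch of the parsing problem — a hypothetical parser, not Cassandra's Descriptor code; the names and splitting rules here are illustrative only:

```python
# Hypothetical sketch: the 2.1 (ka) format can insert a "tmp" token into the
# filename, so a parser that expects exactly <cf>-<version>-<generation>-
# <component> fields must strip it before the remaining fields line up.
# Not Cassandra's actual parsing logic.

def parse_legacy_sstable_name(filename):
    """Parse '<cf>[-tmp]-<version>-<generation>-<component>' style names."""
    parts = filename.rsplit(".", 1)[0].split("-")
    # Drop an optional legacy "tmp" marker so the field positions line up.
    if "tmp" in parts:
        parts.remove("tmp")
    component = parts[-1]
    generation = int(parts[-2])
    version = parts[-3]
    cf = "-".join(parts[:-3])
    return cf, version, generation, component

# The filename from the error message above parses once "tmp" is dropped.
print(parse_legacy_sstable_name("system-schema_columns-tmp-ka-5-Filter.db"))
```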
[jira] [Updated] (CASSANDRA-10008) Upgrading SSTables fails on 2.2.0 (after upgrade from 2.1.2)
[ https://issues.apache.org/jira/browse/CASSANDRA-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10008: --- Assignee: Chris Moos Upgrading SSTables fails on 2.2.0 (after upgrade from 2.1.2) Key: CASSANDRA-10008 URL: https://issues.apache.org/jira/browse/CASSANDRA-10008 Project: Cassandra Issue Type: Bug Reporter: Chris Moos Assignee: Chris Moos Fix For: 2.2.x Attachments: CASSANDRA-10008.patch Running *nodetool upgradesstables* fails with the following after upgrading to 2.2.0 from 2.1.2: {code} error: null -- StackTrace -- java.lang.AssertionError at org.apache.cassandra.db.lifecycle.LifecycleTransaction.checkUnused(LifecycleTransaction.java:428) at org.apache.cassandra.db.lifecycle.LifecycleTransaction.split(LifecycleTransaction.java:408) at org.apache.cassandra.db.compaction.CompactionManager.parallelAllSSTableOperation(CompactionManager.java:268) at org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:373) at org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:1524) at org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2521) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10008) Upgrading SSTables fails on 2.2.0 (after upgrade from 2.1.2)
[ https://issues.apache.org/jira/browse/CASSANDRA-10008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10008: --- Reviewer: Joshua McKenzie [~JoshuaMcKenzie] to review Upgrading SSTables fails on 2.2.0 (after upgrade from 2.1.2) Key: CASSANDRA-10008 URL: https://issues.apache.org/jira/browse/CASSANDRA-10008 Project: Cassandra Issue Type: Bug Reporter: Chris Moos Assignee: Chris Moos Fix For: 2.2.x Attachments: CASSANDRA-10008.patch Running *nodetool upgradesstables* fails with the following after upgrading to 2.2.0 from 2.1.2: {code} error: null -- StackTrace -- java.lang.AssertionError at org.apache.cassandra.db.lifecycle.LifecycleTransaction.checkUnused(LifecycleTransaction.java:428) at org.apache.cassandra.db.lifecycle.LifecycleTransaction.split(LifecycleTransaction.java:408) at org.apache.cassandra.db.compaction.CompactionManager.parallelAllSSTableOperation(CompactionManager.java:268) at org.apache.cassandra.db.compaction.CompactionManager.performSSTableRewrite(CompactionManager.java:373) at org.apache.cassandra.db.ColumnFamilyStore.sstablesRewrite(ColumnFamilyStore.java:1524) at org.apache.cassandra.service.StorageService.upgradeSSTables(StorageService.java:2521) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10014) Deletions using clustering keys not reflected in MV
[ https://issues.apache.org/jira/browse/CASSANDRA-10014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10014: --- Assignee: Carl Yeksigian Deletions using clustering keys not reflected in MV --- Key: CASSANDRA-10014 URL: https://issues.apache.org/jira/browse/CASSANDRA-10014 Project: Cassandra Issue Type: Bug Reporter: Stefan Podkowinski Assignee: Carl Yeksigian Fix For: 3.0.0 rc1 I wrote a test to reproduce an [issue|http://stackoverflow.com/questions/31810841/cassandra-materialized-view-shows-stale-data/31860487] reported on SO, and it turns out this is easily reproducible. There seems to be a bug preventing deletes from being propagated to MVs when a clustering key is used. See [here|https://github.com/spodkowinski/cassandra/commit/1c064523c8d8dbee30d46a03a0f58d3be97800dc] for the test case (testClusteringKeyTombstone should fail). It seems {{MaterializedView.updateAffectsView()}} will not consider the delete relevant for the view, as {{partition.deletionInfo().isLive()}} will be true during the test. In other test cases isLive will return false, which seems to be the actual problem here. I'm not even sure the root cause is MV specific, but I wasn't able to dig much deeper as I'm not familiar with the slightly confusing semantics around DeletionInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10014) Deletions using clustering keys not reflected in MV
[ https://issues.apache.org/jira/browse/CASSANDRA-10014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10014: --- Fix Version/s: 3.0.0 rc1 Deletions using clustering keys not reflected in MV --- Key: CASSANDRA-10014 URL: https://issues.apache.org/jira/browse/CASSANDRA-10014 Project: Cassandra Issue Type: Bug Reporter: Stefan Podkowinski Assignee: Carl Yeksigian Fix For: 3.0.0 rc1 I wrote a test to reproduce an [issue|http://stackoverflow.com/questions/31810841/cassandra-materialized-view-shows-stale-data/31860487] reported on SO, and it turns out this is easily reproducible. There seems to be a bug preventing deletes from being propagated to MVs when a clustering key is used. See [here|https://github.com/spodkowinski/cassandra/commit/1c064523c8d8dbee30d46a03a0f58d3be97800dc] for the test case (testClusteringKeyTombstone should fail). It seems {{MaterializedView.updateAffectsView()}} will not consider the delete relevant for the view, as {{partition.deletionInfo().isLive()}} will be true during the test. In other test cases isLive will return false, which seems to be the actual problem here. I'm not even sure the root cause is MV specific, but I wasn't able to dig much deeper as I'm not familiar with the slightly confusing semantics around DeletionInfo. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9916) batch_mutate failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-9916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9916: -- Assignee: Paulo Motta (was: Benjamin Lerer) batch_mutate failing on trunk - Key: CASSANDRA-9916 URL: https://issues.apache.org/jira/browse/CASSANDRA-9916 Project: Cassandra Issue Type: Bug Components: Core Reporter: Mike Adamson Assignee: Paulo Motta Fix For: 3.0.0 rc1 {{batch_mutate}} is failing on trunk with the following error: {noformat} java.lang.AssertionError: current = ColumnDefinition{name=b@706172656e745f70617468, type=org.apache.cassandra.db.marshal.BytesType, kind=STATIC, componentIndex=null, indexName=cfs_parent_path, indexType=KEYS}, new = ColumnDefinition{name=b@70617468, type=org.apache.cassandra.db.marshal.BytesType, kind=STATIC, componentIndex=null, indexName=cfs_path, indexType=KEYS} at org.apache.cassandra.db.rows.ArrayBackedRow$SortedBuilder.setColumn(ArrayBackedRow.java:617) at org.apache.cassandra.db.rows.ArrayBackedRow$SortedBuilder.addCell(ArrayBackedRow.java:630) at org.apache.cassandra.db.LegacyLayout$CellGrouper.addCell(LegacyLayout.java:891) at org.apache.cassandra.db.LegacyLayout$CellGrouper.addAtom(LegacyLayout.java:843) at org.apache.cassandra.db.LegacyLayout.getNextRow(LegacyLayout.java:390) at org.apache.cassandra.db.LegacyLayout.toUnfilteredRowIterator(LegacyLayout.java:326) at org.apache.cassandra.db.LegacyLayout.toUnfilteredRowIterator(LegacyLayout.java:288) at org.apache.cassandra.thrift.CassandraServer.createMutationList(CassandraServer.java:1110) at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:1249) {noformat} The following mutations was passed to {{batch_mutate}} to get this error {noformat} mutationMap = {java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]= {inode=[ Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:80 62 00 04 70 61 74 68 00, value:2F, timestamp:1438165021749))), 
Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:80 62 00 0B 70 61 72 65 6E 74 5F 70 61 74 68 00, value:6E 75 6C 6C, timestamp:1438165021749))), Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:80 62 00 08 73 65 6E 74 69 6E 65 6C 00, value:78, timestamp:1438165021749))), Mutation(column_or_supercolumn:ColumnOrSuperColumn(column:Column(name:80 62 00 04 64 61 74 61 00, value:00 00 00 08 64 61 74 61 73 74 61 78 00 00 00 05 75 73 65 72 73 01 FF 00 00 00 00 00 04 00 00 00 01, timestamp:1438165021749))) ]}} {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8099) Refactor and modernize the storage engine
[ https://issues.apache.org/jira/browse/CASSANDRA-8099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660772#comment-14660772 ] Jonathan Ellis commented on CASSANDRA-8099: --- That's on Sylvain's list when he gets back in two weeks. Refactor and modernize the storage engine - Key: CASSANDRA-8099 URL: https://issues.apache.org/jira/browse/CASSANDRA-8099 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 3.0 beta 1 Attachments: 8099-nit The current storage engine (which for this ticket I'll loosely define as the code implementing the read/write path) is suffering from old age. One of the main problems is that the only structure it deals with is the cell, which completely ignores the higher-level CQL structure that groups cells into (CQL) rows. This leads to many inefficiencies, like the fact that during a read we have to group cells multiple times (to count on the replica, then to count on the coordinator, then to produce the CQL resultset) because we forget about the grouping right away each time (so lots of useless cell name comparisons in particular). But beyond inefficiencies, having to manually recreate the CQL structure every time we need it for something is hindering new features and makes the code more complex than it should be. Said storage engine also has tons of technical debt. To pick an example, the fact that during range queries we update {{SliceQueryFilter.count}} is pretty hacky and error prone. Or the overly complex ways {{AbstractQueryPager}} has to go through simply to remove the last query result. So I want to bite the bullet and modernize this storage engine. I propose to do 2 main things: # Make the storage engine more aware of the CQL structure.
In practice, instead of having partitions be a simple iterable map of cells, it should be an iterable list of rows (each row itself composed of per-column cells, though obviously not exactly the same kind of cell we have today). # Make the engine more iterative. What I mean here is that in the read path, we end up reading all cells into memory (we put them in a ColumnFamily object), but there is really no reason to. If instead we were working with iterators all the way through, we could get to a point where we're basically transferring data from disk to the network, and we should be able to reduce GC substantially. Please note that such a refactor should provide some performance improvements right off the bat, but that's not its primary goal either. Its primary goal is to simplify the storage engine and add abstractions that are better suited to further optimizations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
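The two proposals above can be sketched with a toy generator pipeline — plain illustrative Python, not Cassandra code — showing how grouping a cell stream into CQL-style rows lazily keeps only one row resident at a time:

```python
# Toy illustration (NOT Cassandra code) of the "more iterative" read path:
# instead of materializing every cell into one big in-memory object before
# answering, a generator pipeline groups cells into rows lazily, so data can
# flow from "disk" to "network" with only one row resident at a time.

from itertools import groupby

def cells_from_disk():
    # Stand-in for a disk scan: (row_key, column, value) triples in row order.
    for row_key in range(3):
        for col in ("a", "b"):
            yield (row_key, col, row_key * 10)

def rows_iterative(cells):
    """Group a cell stream into rows lazily, one row at a time."""
    for row_key, group in groupby(cells, key=lambda c: c[0]):
        yield {"key": row_key, "cells": {col: val for _, col, val in group}}

# Pulling the first row materializes only that row, not the whole partition.
first_row = next(rows_iterative(cells_from_disk()))
print(first_row)
```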
[jira] [Commented] (CASSANDRA-9966) batched CAS statements are not serializable
[ https://issues.apache.org/jira/browse/CASSANDRA-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660777#comment-14660777 ] Jonathan Ellis commented on CASSANDRA-9966: --- [~kohlisankalp] may also be able to help. batched CAS statements are not serializable --- Key: CASSANDRA-9966 URL: https://issues.apache.org/jira/browse/CASSANDRA-9966 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sam Overton Assignee: Sylvain Lebresne Priority: Critical Fix For: 3.x, 2.2.x It is possible to batch CAS statements such that their outcome is different from the outcome had they been executed sequentially outside of a batch. E.g., starting from
a | b | c
a | 1 | 1

BEGIN BATCH
UPDATE foo SET b=2 WHERE a='a' IF c=1
UPDATE foo SET c=2 WHERE a='a' IF b=1
APPLY BATCH

results in
a | b | c
a | 2 | 2

If these statements were not batched, the outcome would be
UPDATE foo SET b=2 WHERE a='a' IF c=1
a | b | c
a | 2 | 1

UPDATE foo SET c=2 WHERE a='a' IF b=1
applied=false (pre-condition b=1 not met)

Cassandra already checks for incompatible preconditions within a batch (e.g. one statement with IF c=1 and another statement with IF c=2). It should also check for mutations to columns in one statement that affect the pre-conditions of another statement, or it should evaluate the statement pre-conditions sequentially after applying the mutations of the previous statement to an in-memory model of the partition. For backwards compatibility this would have to be a new strict batch mode, e.g. BEGIN STRICT BATCH -- This message was sent by Atlassian JIRA (v6.3.4#6332)
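The anomaly in the ticket can be reproduced with a minimal executable model — illustrative Python, not Cassandra's Paxos implementation: a batched CAS checks every IF pre-condition against the pre-batch snapshot before applying anything, while sequential execution re-reads the row between statements, so the second pre-condition can fail.

```python
# Minimal model (NOT Cassandra's implementation) of why the batch is not
# serializable: batched CAS evaluates all IF conditions against the pre-batch
# snapshot, then applies all mutations; sequential execution lets each IF see
# the mutations of the statements before it.

def apply_batched(row, statements):
    """All conditions checked against a snapshot; mutations applied together."""
    snapshot = dict(row)
    if all(cond(snapshot) for cond, _ in statements):
        for _, mutate in statements:
            mutate(row)
    return row

def apply_sequential(row, statements):
    """Each condition sees the mutations of the statements before it."""
    for cond, mutate in statements:
        if cond(row):
            mutate(row)
    return row

# UPDATE foo SET b=2 WHERE a='a' IF c=1 / UPDATE foo SET c=2 WHERE a='a' IF b=1
stmts = [
    (lambda r: r["c"] == 1, lambda r: r.update(b=2)),
    (lambda r: r["b"] == 1, lambda r: r.update(c=2)),
]

print(apply_batched({"a": "a", "b": 1, "c": 1}, stmts))    # both applied
print(apply_sequential({"a": "a", "b": 1, "c": 1}, stmts)) # second IF fails
```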
[jira] [Commented] (CASSANDRA-10007) Repeated rows in paged result
[ https://issues.apache.org/jira/browse/CASSANDRA-10007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660775#comment-14660775 ] Jonathan Ellis commented on CASSANDRA-10007: Is this the same as CASSANDRA-10010? Repeated rows in paged result - Key: CASSANDRA-10007 URL: https://issues.apache.org/jira/browse/CASSANDRA-10007 Project: Cassandra Issue Type: Bug Components: Core Reporter: Adam Holmberg Assignee: Benjamin Lerer Labels: client-impacting Fix For: 3.x Attachments: paging-test.py We noticed an anomaly in paged results while testing against 3.0.0-alpha1. It seems that unbounded selects can return rows repeated at page boundaries. Furthermore, the number of repeated rows seems to dither in count across consecutive runs of the same query. Does not reproduce on 2.2.0 and earlier. I also noted that this behavior only manifests on multi-node clusters. The attached script shows this behavior when run against 3.0.0-alpha1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
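The symptom — the same row re-emitted at each page boundary, with a count that varies run to run — is what one would expect if a pager resumes from the last returned key with an inclusive rather than exclusive comparison. A hypothetical sketch of that mechanism (not the 3.0 code or the driver, whose actual root cause the ticket leaves open):

```python
# Illustrative pager (hypothetical, not Cassandra or driver code): resuming
# from the last returned key with an INCLUSIVE comparison re-emits that key at
# the start of every following page, duplicating rows at page boundaries.

def fetch_all(rows, size, inclusive):
    """Page through sorted `rows`, resuming from the last key of each page."""
    out, cursor = [], None
    while True:
        if cursor is None:
            page = rows[:size]
        else:
            later = [r for r in rows if (r >= cursor if inclusive else r > cursor)]
            page = later[:size]
        out.extend(page)
        if len(page) < size:  # a short page means we are done
            return out
        cursor = page[-1]

print(fetch_all(list(range(10)), 4, inclusive=False))  # correct: 10 rows
print(fetch_all(list(range(10)), 4, inclusive=True))   # rows 3, 6, 9 repeat
```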
[jira] [Updated] (CASSANDRA-9966) batched CAS statements are not serializable
[ https://issues.apache.org/jira/browse/CASSANDRA-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9966: -- Fix Version/s: 2.2.x batched CAS statements are not serializable --- Key: CASSANDRA-9966 URL: https://issues.apache.org/jira/browse/CASSANDRA-9966 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sam Overton Assignee: Sylvain Lebresne Priority: Critical Fix For: 3.x, 2.2.x It is possible to batch CAS statements such that their outcome is different from the outcome had they been executed sequentially outside of a batch. E.g., starting from
a | b | c
a | 1 | 1

BEGIN BATCH
UPDATE foo SET b=2 WHERE a='a' IF c=1
UPDATE foo SET c=2 WHERE a='a' IF b=1
APPLY BATCH

results in
a | b | c
a | 2 | 2

If these statements were not batched, the outcome would be
UPDATE foo SET b=2 WHERE a='a' IF c=1
a | b | c
a | 2 | 1

UPDATE foo SET c=2 WHERE a='a' IF b=1
applied=false (pre-condition b=1 not met)

Cassandra already checks for incompatible preconditions within a batch (e.g. one statement with IF c=1 and another statement with IF c=2). It should also check for mutations to columns in one statement that affect the pre-conditions of another statement, or it should evaluate the statement pre-conditions sequentially after applying the mutations of the previous statement to an in-memory model of the partition. For backwards compatibility this would have to be a new strict batch mode, e.g. BEGIN STRICT BATCH -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10001) Bug in encoding of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10001: --- Assignee: Stefania Bug in encoding of sstables --- Key: CASSANDRA-10001 URL: https://issues.apache.org/jira/browse/CASSANDRA-10001 Project: Cassandra Issue Type: Bug Reporter: T Jake Luciani Assignee: Stefania Priority: Blocker Fix For: 3.0 beta 1 Fixing the compaction dtest I noticed we aren't encoding map data correctly in sstables. The following code fails from the newly committed {{compaction_test.py:TestCompaction_with_SizeTieredCompactionStrategy.large_compaction_warning_test}} {code}
session.execute("CREATE TABLE large(userid text PRIMARY KEY, properties map<int, text>) with compression = {}")
for i in range(200):  # ensures partition size larger than compaction_large_partition_warning_threshold_mb
    session.execute("UPDATE ks.large SET properties[%i] = '%s' WHERE userid = 'user'" % (i, get_random_word(strlen)))
ret = session.execute("SELECT properties from ks.large where userid = 'user'")
assert len(ret) == 1
self.assertEqual(200, len(ret[0][0].keys()))
{code} The last assert is failing with only 91 keys. The large values are causing flushes vs staying in the memtable, so the issue is somewhere in the serialization of collections in sstables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9966) batched CAS statements are not serializable
[ https://issues.apache.org/jira/browse/CASSANDRA-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9966: -- Fix Version/s: 3.x batched CAS statements are not serializable --- Key: CASSANDRA-9966 URL: https://issues.apache.org/jira/browse/CASSANDRA-9966 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sam Overton Assignee: Sylvain Lebresne Priority: Critical Fix For: 3.x It is possible to batch CAS statements such that their outcome is different from the outcome had they been executed sequentially outside of a batch. E.g., starting from
a | b | c
a | 1 | 1

BEGIN BATCH
UPDATE foo SET b=2 WHERE a='a' IF c=1
UPDATE foo SET c=2 WHERE a='a' IF b=1
APPLY BATCH

results in
a | b | c
a | 2 | 2

If these statements were not batched, the outcome would be
UPDATE foo SET b=2 WHERE a='a' IF c=1
a | b | c
a | 2 | 1

UPDATE foo SET c=2 WHERE a='a' IF b=1
applied=false (pre-condition b=1 not met)

Cassandra already checks for incompatible preconditions within a batch (e.g. one statement with IF c=1 and another statement with IF c=2). It should also check for mutations to columns in one statement that affect the pre-conditions of another statement, or it should evaluate the statement pre-conditions sequentially after applying the mutations of the previous statement to an in-memory model of the partition. For backwards compatibility this would have to be a new strict batch mode, e.g. BEGIN STRICT BATCH -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9927: -- Assignee: Paulo Motta Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Assignee: Paulo Motta Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9928) Add Support for multiple non-primary key columns in Materialized View primary keys
[ https://issues.apache.org/jira/browse/CASSANDRA-9928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9928: -- Fix Version/s: (was: 3.0 beta 1) 3.x Add Support for multiple non-primary key columns in Materialized View primary keys -- Key: CASSANDRA-9928 URL: https://issues.apache.org/jira/browse/CASSANDRA-9928 Project: Cassandra Issue Type: Improvement Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.x Currently we don't allow >1 non-primary-key column from the base table in a MV primary key. We should remove this restriction, assuming we continue filtering out nulls. With nulls allowed in the MV columns there are a lot of multiplicative implications we need to think through. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660218#comment-14660218 ] Jonathan Ellis edited comment on CASSANDRA-9927 at 8/7/15 4:23 AM: --- I'm okay with either require explicit grants or always validate against base table for 3.0. So let's go with the latter. was (Author: jbellis): I'm okay with either require explicit grants or always validate against base table for 3.0. Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9967) Determine if a Materialized View is built (consistent with its base table after its creation)
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9967: -- Issue Type: New Feature (was: Improvement) Determine if a Materialized View is built (consistent with its base table after its creation) - Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: New Feature Reporter: Alan Boudreault Priority: Minor Fix For: 3.x Since MVs are eventually consistent with their base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9967) Determine if a Materialized View is built (consistent with its base table after its creation)
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9967: -- Fix Version/s: (was: 3.0 beta 1) 3.x Determine if a Materialized View is built (consistent with its base table after its creation) - Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: Improvement Reporter: Alan Boudreault Fix For: 3.x Since MVs are eventually consistent with their base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8684) Replace usage of Adler32 with CRC32
[ https://issues.apache.org/jira/browse/CASSANDRA-8684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8684: -- Reviewer: T Jake Luciani (was: Aleksey Yeschenko) Giving review to Jake since he did the original benchmarking back in CASSANDRA-5862. Replace usage of Adler32 with CRC32 --- Key: CASSANDRA-8684 URL: https://issues.apache.org/jira/browse/CASSANDRA-8684 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ariel Weisberg Assignee: Ariel Weisberg Fix For: 3.0 beta 1 Attachments: CRCBenchmark.java, PureJavaCrc32.java, Sample.java I could not find a situation in which Adler32 outperformed PureJavaCrc32, much less the intrinsic from Java 8. For small allocations PureJavaCrc32 was much faster, probably due to the JNI overhead of invoking the native Adler32 implementation, where the array has to be allocated and copied. I tested on a 65w Sandy Bridge i5 running Ubuntu 14.04 with JDK 1.7.0_71 as well as a c3.8xlarge running Ubuntu 14.04. I think it makes sense to stop using Adler32 when generating new checksums. c3.8xlarge, results are time in milliseconds, lower is better:
||Allocation size||Adler32||CRC32||PureJavaCrc32||
|64|47636|46075|25782|
|128|36755|36712|23782|
|256|31194|32211|22731|
|1024|27194|28792|22010|
|1048576|25941|27807|21808|
|536870912|25957|27840|21836|

i5:
||Allocation size||Adler32||CRC32||PureJavaCrc32||
|64|50539|50466|26826|
|128|37092|38533|24553|
|256|30630|32938|23459|
|1024|26064|29079|22592|
|1048576|24357|27911|22481|
|536870912|24838|28360|22853|

Another fun fact: performance of the CRC32 intrinsic appears to double from Sandy Bridge to Haswell, unless I am measuring something different when going from Linux/Sandy to Haswell/OS X. The intrinsic/JDK 8 implementation also operates against DirectByteBuffers better, and coding against the wrapper will get that boost when run with Java 8. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
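For a quick feel for the comparison, here is a self-contained sketch using Python's zlib bindings rather than the Java benchmark attached to the ticket; absolute numbers depend entirely on the platform and are not comparable to the tables above:

```python
# Sketch of a CRC32-vs-Adler32 micro-benchmark using Python's zlib bindings
# (NOT the ticket's Java/JMH benchmark; timings here are only indicative and
# platform-dependent). Also shows the two checksums produce different values.

import timeit
import zlib

payload = bytes(range(256)) * 4096  # 1 MiB buffer

crc = zlib.crc32(payload)
adler = zlib.adler32(payload)
print(f"crc32=0x{crc:08x} adler32=0x{adler:08x}")

for name, fn in (("crc32", zlib.crc32), ("adler32", zlib.adler32)):
    t = timeit.timeit(lambda: fn(payload), number=200)
    print(f"{name}: {t:.3f}s for 200 x 1 MiB")
```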
[jira] [Updated] (CASSANDRA-9967) Determine if a Materialized View is built (consistent with its base table after its creation)
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9967: -- Priority: Minor (was: Major) Determine if a Materialized View is built (consistent with its base table after its creation) - Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: Improvement Reporter: Alan Boudreault Priority: Minor Fix For: 3.x Since MVs are eventually consistent with their base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9967) Determine if a Materialized View is finished building, without having to query each node
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9967: -- Summary: Determine if a Materialized View is finished building, without having to query each node (was: Determine if a Materialized View is built (consistent with its base table after its creation)) Determine if a Materialized View is finished building, without having to query each node Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: New Feature Reporter: Alan Boudreault Priority: Minor Fix For: 3.x Since MVs are eventually consistent with their base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9967) Determine if a Materialized View is built (consistent with its base table after its creation)
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661287#comment-14661287 ] Jonathan Ellis commented on CASSANDRA-9967: --- For 3.0 you can query each node's local state. ([~carlyeks], can you explain how?) For 3.x I agree it would be useful to simplify this. Determine if a Materialized View is built (consistent with its base table after its creation) - Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: Improvement Reporter: Alan Boudreault Fix For: 3.x Since MVs are eventually consistent with their base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10001) Bug in encoding of sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-10001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10001: --- Reviewer: T Jake Luciani Bug in encoding of sstables --- Key: CASSANDRA-10001 URL: https://issues.apache.org/jira/browse/CASSANDRA-10001 Project: Cassandra Issue Type: Bug Reporter: T Jake Luciani Assignee: Stefania Priority: Blocker Fix For: 3.0 beta 1 Fixing the compaction dtest I noticed we aren't encoding map data correctly in sstables. The following code fails from the newly committed {{compaction_test.py:TestCompaction_with_SizeTieredCompactionStrategy.large_compaction_warning_test}} {code}
session.execute("CREATE TABLE large(userid text PRIMARY KEY, properties map<int, text>) with compression = {}")
for i in range(200):  # ensures partition size larger than compaction_large_partition_warning_threshold_mb
    session.execute("UPDATE ks.large SET properties[%i] = '%s' WHERE userid = 'user'" % (i, get_random_word(strlen)))
ret = session.execute("SELECT properties from ks.large where userid = 'user'")
assert len(ret) == 1
self.assertEqual(200, len(ret[0][0].keys()))
{code} The last assert is failing with only 91 keys. The large values are causing flushes vs staying in the memtable, so the issue is somewhere in the serialization of collections in sstables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-10002) Repeated slices on RowSearchers are incorrect
[ https://issues.apache.org/jira/browse/CASSANDRA-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-10002: --- Reviewer: Stefania (was: Aleksey Yeschenko) [~Stefania] to review Repeated slices on RowSearchers are incorrect - Key: CASSANDRA-10002 URL: https://issues.apache.org/jira/browse/CASSANDRA-10002 Project: Cassandra Issue Type: Bug Components: Core Reporter: Tyler Hobbs Assignee: Tyler Hobbs Fix For: 3.0 beta 1 In {{AbstractThreadUnsafePartition}}, repeated {{slice()}} calls on a {{RowSearcher}} can produce incorrect results. This is caused by only performing a binary search over a sublist (based on {{nextIdx}}), but not taking {{nextIdx}} into account when using the search result index. I made a quick fix in [this commit|https://github.com/thobbs/cassandra/commit/73725ea6825c9c0da1fa4986b01f39ae08130e10] on one of my branches, but the full fix also needs to cover {{ReverseRowSearcher}} and include a test to reproduce the issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9738) Migrate key-cache to be fully off-heap
[ https://issues.apache.org/jira/browse/CASSANDRA-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659996#comment-14659996 ] Jonathan Ellis commented on CASSANDRA-9738: --- What does code coverage show? Because I know that would be Ariel's first question. :) Migrate key-cache to be fully off-heap -- Key: CASSANDRA-9738 URL: https://issues.apache.org/jira/browse/CASSANDRA-9738 Project: Cassandra Issue Type: Sub-task Reporter: Robert Stupp Assignee: Robert Stupp Fix For: 3.0.0 rc1 Key cache still uses a concurrent map on-heap. This could go off-heap and feels doable now after CASSANDRA-8099. Evaluation should be done in advance based on a POC to prove that a pure off-heap key cache buys a performance and/or gc-pressure improvement. In theory, elimination of on-heap management of the map should buy us some benefit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-8234) CTAS (CREATE TABLE AS SELECT)
[ https://issues.apache.org/jira/browse/CASSANDRA-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-8234: -- Description: Continuous request from users is the ability to do CREATE TABLE AS SELECT. The simplest form would be copying the entire table. More advanced would allow specifying the columns and UDFs to call as well as filtering rows out in WHERE. More advanced still would be to get all the way to allowing JOIN, for which we probably want to integrate Spark. was: Continuous request from users is the ability to do CREATE TABLE AS SELECT... The COPY command can be enhanced to perform simple and customized copies of existing tables to satisfy the need. - Simple copy is COPY table a TO new table b. - Custom copy can mimic Postgres: (e.g. COPY (SELECT * FROM country WHERE country_name LIKE 'A%') TO …) Summary: CTAS (CREATE TABLE AS SELECT) (was: CTAS for COPY) CTAS (CREATE TABLE AS SELECT) - Key: CASSANDRA-8234 URL: https://issues.apache.org/jira/browse/CASSANDRA-8234 Project: Cassandra Issue Type: New Feature Components: Tools Reporter: Robin Schumacher Fix For: 3.x Continuous request from users is the ability to do CREATE TABLE AS SELECT. The simplest form would be copying the entire table. More advanced would allow specifying the columns and UDFs to call as well as filtering rows out in WHERE. More advanced still would be to get all the way to allowing JOIN, for which we probably want to integrate Spark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660003#comment-14660003 ] Jonathan Ellis commented on CASSANDRA-5220: --- Committed. Thanks, Marcus and Stefania! Repair improvements when using vnodes - Key: CASSANDRA-5220 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.0 beta 1 Reporter: Brandon Williams Assignee: Marcus Olsson Labels: performance, repair Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, cassandra-3.0-5220.patch Currently when using vnodes, repair takes much longer to complete than without them. This appears at least in part because it's using a session per range and processing them sequentially. This generates a lot of log spam with vnodes, and while being gentler and lighter on hard disk deployments, ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
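The inefficiency described in CASSANDRA-5220 above — one repair session per vnode range, processed sequentially — suggests the shape of the fix: coalesce contiguous ranges before creating sessions. A hedged sketch with hypothetical types (not the committed patch):

```java
import java.util.ArrayList;
import java.util.List;

public class RangeMerge {
    // Hypothetical sketch (not the committed CASSANDRA-5220 patch): instead of
    // starting one repair session per vnode range, contiguous ranges are
    // coalesced first so far fewer sessions (and far less log spam) result.
    record Range(long left, long right) {}

    static List<Range> coalesce(List<Range> sortedRanges) {
        List<Range> out = new ArrayList<>();
        for (Range r : sortedRanges) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).right() == r.left())
                // Extend the previous merged range instead of opening a new session.
                out.set(last, new Range(out.get(last).left(), r.right()));
            else
                out.add(r);
        }
        return out;
    }

    public static void main(String[] args) {
        // Three vnode ranges, two of them contiguous -> two sessions instead of three.
        List<Range> merged = coalesce(List.of(
                new Range(0, 10), new Range(10, 20), new Range(30, 40)));
        System.out.println(merged.size()); // 2
    }
}
```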
[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660201#comment-14660201 ] Jonathan Ellis commented on CASSANDRA-9927: --- I'm happy with leaving MV permissions to be set explicitly. Inheriting base permissions is definitely not the right thing in all situations. NB: Aleksey pointed out that we do need to require SELECT on the base table when CREATEing an MV. Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660201#comment-14660201 ] Jonathan Ellis edited comment on CASSANDRA-9927 at 8/6/15 3:58 PM: --- I'm happy with leaving MV permissions to be set explicitly. Inheriting base permissions is definitely not the right thing in all situations. NB: Aleksey pointed out that we do need to require SELECT on the base table when CREATEing an MV. was (Author: jbellis): I'm happy with leaving MV permissions explicity. Inheriting base permissions is definitely not the right thing in all situations. NB: Aleksey pointed out that we do to require SELECT on the base table when CREATEing an MV. Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9953) Snapshot file handlers are not released after snapshot deleted
[ https://issues.apache.org/jira/browse/CASSANDRA-9953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-9953. --- Resolution: Duplicate Snapshot file handlers are not released after snapshot deleted -- Key: CASSANDRA-9953 URL: https://issues.apache.org/jira/browse/CASSANDRA-9953 Project: Cassandra Issue Type: Bug Components: Core Reporter: Imri Zvik We are seeing a lot of opened file descriptors to deleted snapshots (deleted using nodetool clearsnapshot):
{code}
java 128657 cassandra DEL REG 253,2  569272514 /var/lib/cassandra/data/accounts/account_store_data/snapshots/feb5f790-316e-11e5-aec0-472b0d6e3fd4/accounts-account_store_data-jb-264593-Index.db
java 128657 cassandra DEL REG 253,2 1610616657 /var/lib/cassandra/data/accounts/account_store_counters/snapshots/03aa4710-316f-11e5-aec0-472b0d6e3fd4/accounts-account_store_counters-jb-635527-Index.db
java 128657 cassandra DEL REG 253,2 1610613856 /var/lib/cassandra/data/accounts/account_store_counters/snapshots/43c17170-316f-11e5-aec0-472b0d6e3fd4/accounts-account_store_counters-jb-635675-Index.db
java 128657 cassandra DEL REG 253,2 1610613052 /var/lib/cassandra/data/accounts/account_store_counters/snapshots/18e001a0-3170-11e5-aec0-472b0d6e3fd4/accounts-account_store_counters-jb-636200-Index.db
[root@cassandra002 ~]# lsof -np 128657 | grep -c DEL
56682
{code}
They are probably created by the routine repair process, but they are never cleared (restarting the Cassandra process clears them, of course). We are seeing these also after all repair processes finished, and no repair process is running in the cluster. There are no errors or fatals in the system.log. We are using Datastax community edition 2.0.13, installed from RPMs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14660218#comment-14660218 ] Jonathan Ellis commented on CASSANDRA-9927: --- I'm okay with either requiring explicit grants or always validating against the base table for 3.0. Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9985) Introduce our own AbstractIterator
[ https://issues.apache.org/jira/browse/CASSANDRA-9985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9985: -- Reviewer: Ariel Weisberg Introduce our own AbstractIterator -- Key: CASSANDRA-9985 URL: https://issues.apache.org/jira/browse/CASSANDRA-9985 Project: Cassandra Issue Type: Sub-task Components: Core Reporter: Benedict Assignee: Benedict Priority: Trivial Fix For: 3.0.0 rc1 The Guava AbstractIterator not only has unnecessary method call depth, it is difficult to debug without attaching source. Since it's absolutely trivial to write our own, and it's used widely within the codebase, I think we should do so. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9533) Make batch commitlog mode easier to tune
[ https://issues.apache.org/jira/browse/CASSANDRA-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9533: -- Reviewer: Ariel Weisberg [~aweisberg] to review Make batch commitlog mode easier to tune Key: CASSANDRA-9533 URL: https://issues.apache.org/jira/browse/CASSANDRA-9533 Project: Cassandra Issue Type: Improvement Reporter: Jonathan Ellis Assignee: Benedict Fix For: 3.x As discussed in CASSANDRA-9504, 2.1 changed commitlog_sync_batch_window_in_ms from the maximum time to wait between fsyncs to the minimum time, so one must be very careful to keep it small enough that most writers aren't kept waiting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
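The max-vs-min distinction in CASSANDRA-9533 above is easy to miss. A toy model of the two semantics (a simplified sketch, not the actual commit log code): with a minimum window, the next fsync cannot happen until the window since the last sync has fully elapsed, so early writers wait; with a maximum window, a sync may run as soon as a write arrives, and the window only caps how long it can be deferred.

```java
public class BatchWindow {
    // Toy model of the semantic change described in CASSANDRA-9533.
    // Minimum-window semantics: every write waits until the full window since
    // the last sync has elapsed before the next fsync can run.
    static long nextSyncAtMinWindow(long lastSyncMs, long windowMs) {
        return lastSyncMs + windowMs; // writers arriving earlier must wait
    }

    // Maximum-window semantics: sync as soon as a write arrives, but never
    // later than the window cap.
    static long nextSyncAtMaxWindow(long lastSyncMs, long windowMs, long writeArrivedMs) {
        return Math.min(writeArrivedMs, lastSyncMs + windowMs);
    }

    public static void main(String[] args) {
        System.out.println(nextSyncAtMinWindow(0, 10));    // 10 -- the write at t=3 waits
        System.out.println(nextSyncAtMaxWindow(0, 10, 3)); // 3  -- the write syncs immediately
    }
}
```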
[jira] [Updated] (CASSANDRA-9992) Sending batchlog verb to previous versions
[ https://issues.apache.org/jira/browse/CASSANDRA-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9992: -- Fix Version/s: (was: 3.0 beta 1) 3.0.0 rc1 Sending batchlog verb to previous versions -- Key: CASSANDRA-9992 URL: https://issues.apache.org/jira/browse/CASSANDRA-9992 Project: Cassandra Issue Type: Bug Reporter: Carl Yeksigian Assignee: Carl Yeksigian Fix For: 3.0.0 rc1 We are currently sending {{Verb.BATCHLOG_MUTATION}} in {{StorageProxy.syncWriteToBatchlog}} and {{StorageProxy.asyncRemoveFromBatchlog}} to previous versions, which do not have that Verb. We should be sending them {{Verb.MUTATION}} instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
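The fix described in CASSANDRA-9992 above amounts to version-gating the verb choice per target node. A hedged sketch with hypothetical constants and types (the real code would consult the target's actual messaging version):

```java
// Hedged sketch (hypothetical enum and version constant, not the actual
// StorageProxy code): choose the verb based on the target node's messaging
// version, falling back to MUTATION for nodes that predate BATCHLOG_MUTATION.
enum Verb { MUTATION, BATCHLOG_MUTATION }

public class VerbSelector {
    static final int VERSION_30 = 10; // assumed version constant, for illustration only

    static Verb batchlogVerbFor(int targetMessagingVersion) {
        return targetMessagingVersion >= VERSION_30
                ? Verb.BATCHLOG_MUTATION   // 3.0+ node: understands the new verb
                : Verb.MUTATION;           // older node: send the verb it knows
    }

    public static void main(String[] args) {
        System.out.println(batchlogVerbFor(9));  // MUTATION
        System.out.println(batchlogVerbFor(10)); // BATCHLOG_MUTATION
    }
}
```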
[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14658812#comment-14658812 ] Jonathan Ellis commented on CASSANDRA-9927: --- Why can't we just inherit base table permissions for 3.0? Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9927) Security for MaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-9927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659028#comment-14659028 ] Jonathan Ellis commented on CASSANDRA-9927: --- Long term, we do want to support this. But it's pretty late to start design for 3.0. Security for MaterializedViews -- Key: CASSANDRA-9927 URL: https://issues.apache.org/jira/browse/CASSANDRA-9927 Project: Cassandra Issue Type: Task Reporter: T Jake Luciani Labels: materializedviews Fix For: 3.0 beta 1 We need to think about how to handle security wrt materialized views. Since they are based on a source table we should possibly inherit the same security model as that table. However I can see cases where users would want to create different security auth for different views. esp once we have CASSANDRA-9664 and users can filter out sensitive data. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3
[ https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659423#comment-14659423 ] Jonathan Ellis commented on CASSANDRA-9302: --- There's no need to be passive aggressive. Here's the reason it was tagged Later, straight from the comments: bq. Whatever we end up with under the hood, I think that cqlsh and COPY are the right front end to present to users rather than a separate loader executable. Optimize cqlsh COPY FROM, part 3 Key: CASSANDRA-9302 URL: https://issues.apache.org/jira/browse/CASSANDRA-9302 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Jonathan Ellis Assignee: David Kua Fix For: 2.1.x We've had some discussion moving to Spark CSV import for bulk load in 3.x, but people need a good bulk load tool now. One option is to add a separate Java bulk load tool (CASSANDRA-9048), but if we can match that performance from cqlsh I would prefer to leave COPY FROM as the preferred option to which we point people, rather than adding more tools that need to be supported indefinitely. Previous work on COPY FROM optimization was done in CASSANDRA-7405 and CASSANDRA-8225. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14659424#comment-14659424 ] Jonathan Ellis commented on CASSANDRA-5220: --- I'm okay with adding this to 3.0, since otherwise we'll need to wait for either 8110 or 4.0, and I don't think that's fair to Marcus since he had the first version written months ago. Repair improvements when using vnodes - Key: CASSANDRA-5220 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.0 beta 1 Reporter: Brandon Williams Assignee: Marcus Olsson Labels: performance, repair Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, cassandra-3.0-5220.patch Currently when using vnodes, repair takes much longer to complete than without them. This appears at least in part because it's using a session per range and processing them sequentially. This generates a lot of log spam with vnodes, and while being gentler and lighter on hard disk deployments, ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654755#comment-14654755 ] Jonathan Ellis commented on CASSANDRA-5220: --- We can't support repair anyway with older-version nodes until we have CASSANDRA-8110, so don't worry about it here. Repair improvements when using vnodes - Key: CASSANDRA-5220 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.0 beta 1 Reporter: Brandon Williams Assignee: Marcus Olsson Labels: performance, repair Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, cassandra-3.0-5220.patch Currently when using vnodes, repair takes much longer to complete than without them. This appears at least in part because it's using a session per range and processing them sequentially. This generates a lot of log spam with vnodes, and while being gentler and lighter on hard disk deployments, ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9945) Add transparent data encryption core classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654250#comment-14654250 ] Jonathan Ellis commented on CASSANDRA-9945: --- 3.2 actually. (We should branch 3.1 from 3.0 on release.) Add transparent data encryption core classes Key: CASSANDRA-9945 URL: https://issues.apache.org/jira/browse/CASSANDRA-9945 Project: Cassandra Issue Type: Improvement Reporter: Jason Brown Assignee: Jason Brown Labels: encryption Fix For: 3.x This patch will add the core infrastructure classes necessary for transparent data encryption (file-level encryption), as required for CASSANDRA-6018 and CASSANDRA-9633. The phrase transparent data encryption, while not the most aesthetically pleasing, seems to be used throughout the database industry (Oracle, SQL Server, Datastax Enterprise) to describe file level encryption, so we'll go with that, as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9129) HintedHandoff in pending state forever after upgrading to 2.0.14 from 2.0.11 and 2.0.12
[ https://issues.apache.org/jira/browse/CASSANDRA-9129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9129: -- Reviewer: Aleksey Yeschenko HintedHandoff in pending state forever after upgrading to 2.0.14 from 2.0.11 and 2.0.12 --- Key: CASSANDRA-9129 URL: https://issues.apache.org/jira/browse/CASSANDRA-9129 Project: Cassandra Issue Type: Bug Environment: Ubuntu 12.04.5 LTS AWS (m3.xlarge) 15G RAM 4 core Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz Cassandra 2.0.14 Reporter: Russ Lavoie Assignee: Sam Tunnicliffe Fix For: 2.0.x Attachments: 9129-2.0.txt Upgrading from Cassandra 2.0.11 or 2.0.12 to 2.0.14 I am seeing a pending hinted hand off that never clears. New hinted hand offs that go into pending waiting for a node to come up clear as expected. But 1 always remains. I went through the following steps. 1) stop cassandra 2) Upgrade cassandra to 2.0.14 3) Start cassandra 4) nodetool tpstats There are no errors in the logs to help with this issue. 
I ran a few nodetool commands to get some data and pasted them below. Below is what is shown after running nodetool status on each node in the ring:
{code}
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns   Host ID  Rack
UN  NODE1    279.8 MB   256     34.9%  HOSTID   rack1
UN  NODE2    279.79 MB  256     33.0%  HOSTID   rack1
UN  NODE3    279.87 MB  256     32.1%  HOSTID   rack1
{code}
Below is what is shown after running nodetool tpstats on each node in the ring, showing a single HintedHandoff in pending status that never clears:
{code}
Pool Name               Active  Pending  Completed  Blocked  All time blocked
ReadStage                    0        0      14550        0                 0
RequestResponseStage         0        0     113040        0                 0
MutationStage                0        0     168873        0                 0
ReadRepairStage              0        0       1147        0                 0
ReplicateOnWriteStage        0        0          0        0                 0
GossipStage                  0        0     232112        0                 0
CacheCleanupExecutor         0        0          0        0                 0
MigrationStage               0        0          0        0                 0
MemoryMeter                  0        0          6        0                 0
FlushWriter                  0        0         38        0                 0
ValidationExecutor           0        0          0        0                 0
InternalResponseStage        0        0          0        0                 0
AntiEntropyStage             0        0          0        0                 0
MemtablePostFlusher          0        0       1333        0                 0
MiscStage                    0        0          0        0                 0
PendingRangeCalculator       0        0          6        0                 0
CompactionExecutor           0        0        178        0                 0
commitlog_archiver           0        0          0        0                 0
HintedHandoff                0        1        133        0                 0

Message type      Dropped
RANGE_SLICE             0
READ_REPAIR             0
PAGED_RANGE             0
BINARY                  0
READ                    0
MUTATION                0
_TRACE                  0
REQUEST_RESPONSE        0
COUNTER_MUTATION        0
{code}
Below is what is shown after running nodetool cfstats system.hints on all 3 nodes:
{code}
Keyspace: system
	Read Count: 0
	Read Latency: NaN ms.
	Write Count: 0
	Write Latency: NaN ms.
	Pending Tasks: 0
		Table: hints
		SSTable count: 0
		Space used (live), bytes: 0
		Space used (total), bytes: 0
		Off heap memory used (total), bytes: 0
		SSTable Compression Ratio: 0.0
		Number of keys (estimate): 0
		Memtable cell count: 0
		Memtable data size, bytes: 0
		Memtable switch count: 0
		Local read count: 0
		Local read latency: 0.000 ms
		Local write count: 0
		Local write latency: 0.000 ms
		Pending tasks: 0
		Bloom filter false positives: 0
		Bloom filter false ratio: 0.0
[jira] [Commented] (CASSANDRA-5220) Repair improvements when using vnodes
[ https://issues.apache.org/jira/browse/CASSANDRA-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654147#comment-14654147 ] Jonathan Ellis commented on CASSANDRA-5220: --- Very substantial. Excited to get this in! Repair improvements when using vnodes - Key: CASSANDRA-5220 URL: https://issues.apache.org/jira/browse/CASSANDRA-5220 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.2.0 beta 1 Reporter: Brandon Williams Assignee: Marcus Olsson Labels: performance, repair Attachments: 5220-yourkit.png, 5220-yourkit.tar.bz2, cassandra-3.0-5220-1.patch, cassandra-3.0-5220-2.patch, cassandra-3.0-5220.patch Currently when using vnodes, repair takes much longer to complete than without them. This appears at least in part because it's using a session per range and processing them sequentially. This generates a lot of log spam with vnodes, and while being gentler and lighter on hard disk deployments, ssd-based deployments would often prefer that repair be as fast as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9932) Make all partitions btree backed
[ https://issues.apache.org/jira/browse/CASSANDRA-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9932: -- Reviewer: Ariel Weisberg Make all partitions btree backed Key: CASSANDRA-9932 URL: https://issues.apache.org/jira/browse/CASSANDRA-9932 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Benedict Fix For: 3.0.0 rc1 Following on from the other btree related refactors, this patch makes all partition (and partition-like) objects backed by the same basic structure: {{AbstractBTreePartition}}. With two main offshoots: {{ImmutableBTreePartition}} and {{AtomicBTreePartition}} The main upshot is a 30% net code reduction, meaning better exercise of btree code paths and fewer new code paths to go wrong. A secondary upshort is that, by funnelling all our comparisons through a btree, there is a higher likelihood of icache occupancy and we have only one area to focus delivery of improvements for their enjoyment by all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign
[ https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14653870#comment-14653870 ] Jonathan Ellis commented on CASSANDRA-9459: --- First reaction: I'd rather use some kind of function call syntax so that it's distinct from normal columns. Second reaction: Not sure conflating with UDF is much better. Maybe need to think on this some more. SecondaryIndex API redesign --- Key: CASSANDRA-9459 URL: https://issues.apache.org/jira/browse/CASSANDRA-9459 Project: Cassandra Issue Type: Improvement Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Fix For: 3.0 beta 1 For some time now the index subsystem has been a pain point and in large part this is due to the way that the APIs and principal classes have grown organically over the years. It would be a good idea to conduct a wholesale review of the area and see if we can come up with something a bit more coherent. A few starting points: * There's a lot in AbstractPerColumnSecondaryIndex and its subclasses which could be pulled up into SecondaryIndexSearcher (note that to an extent, this is done in CASSANDRA-8099). * SecondaryIndexManager is overly complex and several of its functions should be simplified/re-examined. The handling of which columns are indexed and index selection on both the read and write paths are somewhat dense and unintuitive. * The SecondaryIndex class hierarchy is rather convoluted and could use some serious rework. There are a number of outstanding tickets which we should be able to roll into this higher level one as subtasks (but I'll defer doing that until getting into the details of the redesign): * CASSANDRA-7771 * CASSANDRA-8103 * CASSANDRA-9041 * CASSANDRA-4458 * CASSANDRA-8505 Whilst they're not hard dependencies, I propose that this be done on top of both CASSANDRA-8099 and CASSANDRA-6717. 
The former largely because the storage engine changes may facilitate a friendlier index API, but also because of the changes to SIS mentioned above. As for 6717, the changes to schema tables there will help facilitate CASSANDRA-7771. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved
[ https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9917: -- Reviewer: Marcus Eriksson MVs should validate gc grace seconds on the tables involved --- Key: CASSANDRA-9917 URL: https://issues.apache.org/jira/browse/CASSANDRA-9917 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Carl Yeksigian Fix For: 3.0 beta 1 For correctness reasons (potential resurrection of dropped values), batchlog entries are TTL'd with the lowest gc_grace_seconds of all the tables involved in a batch. It means that if gc gs is set to 0 in one of the tables, the batchlog entry will be dead on arrival, and never replayed. We should probably warn against such LOGGED writes taking place, in general, but for MVs, we must validate that gc gs on the base table (and on the MV table, if we should allow altering gc gs there at all), is never set too low, or else. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
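The TTL rule described in CASSANDRA-9917 above is simple to state in code. A sketch (illustrative only, not the actual batchlog code): the entry's TTL is the minimum gc_grace_seconds across the batch's tables, so a single table with gc_grace_seconds of 0 makes the entry expire immediately and never replay.

```java
import java.util.List;

public class BatchlogTtl {
    // Illustrative sketch of the rule in CASSANDRA-9917: a batchlog entry is
    // TTL'd with the smallest gc_grace_seconds of the tables in the batch.
    // A table with gc_grace_seconds == 0 therefore makes the entry dead on
    // arrival, which is why MV base tables must validate against low values.
    static int batchlogTtl(List<Integer> gcGraceSeconds) {
        return gcGraceSeconds.stream().min(Integer::compare).orElse(0);
    }

    public static void main(String[] args) {
        System.out.println(batchlogTtl(List.of(864000, 864000))); // 864000 -- healthy batch
        System.out.println(batchlogTtl(List.of(864000, 0)));      // 0 -- dead on arrival
    }
}
```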
[jira] [Updated] (CASSANDRA-9917) MVs should validate gc grace seconds on the tables involved
[ https://issues.apache.org/jira/browse/CASSANDRA-9917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9917: -- Assignee: Carl Yeksigian MVs should validate gc grace seconds on the tables involved --- Key: CASSANDRA-9917 URL: https://issues.apache.org/jira/browse/CASSANDRA-9917 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Carl Yeksigian Fix For: 3.0 beta 1 For correctness reasons (potential resurrection of dropped values), batchlog entries are TTL'd with the lowest gc_grace_seconds of all the tables involved in a batch. It means that if gc gs is set to 0 in one of the tables, the batchlog entry will be dead on arrival, and never replayed. We should probably warn against such LOGGED writes taking place, in general, but for MVs, we must validate that gc gs on the base table (and on the MV table, if we should allow altering gc gs there at all), is never set too low, or else. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9961) cqlsh should have DESCRIBE MATERIALIZED VIEW
[ https://issues.apache.org/jira/browse/CASSANDRA-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9961: -- Assignee: Stefania cqlsh should have DESCRIBE MATERIALIZED VIEW Key: CASSANDRA-9961 URL: https://issues.apache.org/jira/browse/CASSANDRA-9961 Project: Cassandra Issue Type: Improvement Reporter: Carl Yeksigian Assignee: Stefania Labels: materializedviews Fix For: 3.0 beta 1 cqlsh doesn't currently produce describe output that can be used to recreate a MV. Needs to add a new {{DESCRIBE MATERIALIZED VIEW}} command, and also add to {{DESCRIBE KEYSPACE}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9961) cqlsh should have DESCRIBE MATERIALIZED VIEW
[ https://issues.apache.org/jira/browse/CASSANDRA-9961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9961: -- Reviewer: Benjamin Lerer cqlsh should have DESCRIBE MATERIALIZED VIEW Key: CASSANDRA-9961 URL: https://issues.apache.org/jira/browse/CASSANDRA-9961 Project: Cassandra Issue Type: Improvement Reporter: Carl Yeksigian Assignee: Stefania Labels: materializedviews Fix For: 3.0 beta 1 cqlsh doesn't currently produce describe output that can be used to recreate a MV. Needs to add a new {{DESCRIBE MATERIALIZED VIEW}} command, and also add to {{DESCRIBE KEYSPACE}}. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9963) Compaction not starting for new tables
[ https://issues.apache.org/jira/browse/CASSANDRA-9963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652030#comment-14652030 ] Jonathan Ellis commented on CASSANDRA-9963: --- is this something we could catch with a utest? Compaction not starting for new tables -- Key: CASSANDRA-9963 URL: https://issues.apache.org/jira/browse/CASSANDRA-9963 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Marcus Eriksson Fix For: 2.1.x Attachments: 0001-dont-use-isEnabled-since-that-checks-isActive.patch Something committed since 2.1.8 broke cassandra-2.1 HEAD {noformat} create keyspace test with replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; create table test.stcs ( a int PRIMARY KEY , b int); {noformat} repeat more than 4 times: {noformat} insert into test.stcs (a, b) VALUES ( 1, 1); nodetool flush test stcs ls data dir/test/stcs-* {noformat} See a bunch of sstables where STCS should have kicked in and compacted them down some. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9971) Static variables with small page sizes
[ https://issues.apache.org/jira/browse/CASSANDRA-9971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9971: -- Assignee: Benjamin Lerer Static variables with small page sizes -- Key: CASSANDRA-9971 URL: https://issues.apache.org/jira/browse/CASSANDRA-9971 Project: Cassandra Issue Type: Bug Components: Tests Environment: Local Reporter: Steve Wang Assignee: Benjamin Lerer Fix For: 3.x Attachments: static_paging_test.py Selecting static variables with small page sizes causes them to display as None. With large page sizes and non-static variables, tests still pass. Works fine in 2.1.x. Not sure if it runs in 2.2.x (I can't seem to run C* version 2.2.x). Run the test below to see the error. Remove the list on line 21 to see what's actually erroring. Related to 8502. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9945) Add transparent data encryption core classes
[ https://issues.apache.org/jira/browse/CASSANDRA-9945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9945: -- Fix Version/s: (was: 3.0 beta 1) 3.x Add transparent data encryption core classes Key: CASSANDRA-9945 URL: https://issues.apache.org/jira/browse/CASSANDRA-9945 Project: Cassandra Issue Type: Improvement Reporter: Jason Brown Assignee: Jason Brown Labels: encryption Fix For: 3.x This patch will add the core infrastructure classes necessary for transparent data encryption (file-level encryption), as required for CASSANDRA-6018 and CASSANDRA-9633. The phrase transparent data encryption, while not the most aesthetically pleasing, seems to be used throughout the database industry (Oracle, SQL Server, Datastax Enterprise) to describe file level encryption, so we'll go with that, as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9889) Disable scripted UDFs by default
[ https://issues.apache.org/jira/browse/CASSANDRA-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652574#comment-14652574 ] Jonathan Ellis commented on CASSANDRA-9889: --- Very well, I do not throw a binding -1. Disable scripted UDFs by default Key: CASSANDRA-9889 URL: https://issues.apache.org/jira/browse/CASSANDRA-9889 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Minor Fix For: 3.0.0 rc1 (Follow-up to CASSANDRA-9402) TL;DR this ticket adds another config option to enable scripted UDFs. Securing Java UDFs is much easier than securing scripted UDFs. The secure execution of scripted UDFs heavily relies on how secure a particular script provider implementation is. Nashorn is probably pretty good at this - but (as discussed offline with [~iamaleksey]) we are not certain. This becomes worse with other JSR-223 providers (which need to be installed by the user anyway). E.g.: {noformat} # Enables use of scripted UDFs. # Java UDFs are always enabled, if enable_user_defined_functions is true. # Enable this option to be able to use UDFs with language javascript or any custom JSR-223 provider. enable_scripted_user_defined_functions: false {noformat} TBH: I would feel more comfortable having this one. But we should review this along with enable_user_defined_functions for 4.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
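The proposed yaml options gate script-language UDFs separately from Java UDFs. A hypothetical sketch of that gating decision, written here only to make the two-switch semantics concrete (the function and its signature are illustrative, not Cassandra code):

```python
def udf_allowed(language, enable_udfs, enable_scripted_udfs):
    """Hypothetical gate mirroring the proposed yaml switches.

    enable_udfs          -> enable_user_defined_functions
    enable_scripted_udfs -> enable_scripted_user_defined_functions
    """
    if not enable_udfs:
        return False             # no UDFs of any kind
    if language == "java":
        return True              # Java UDFs need only the main switch
    return enable_scripted_udfs  # javascript / other JSR-223 need the extra switch
```

With the proposed default of enable_scripted_user_defined_functions: false, javascript UDFs are rejected while Java UDFs remain usable.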
[jira] [Updated] (CASSANDRA-6018) Add option to encrypt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6018: -- Reviewer: Branimir Lambov Add option to encrypt commitlog Key: CASSANDRA-6018 URL: https://issues.apache.org/jira/browse/CASSANDRA-6018 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jason Brown Assignee: Jason Brown Labels: commit_log, encryption, security Fix For: 3.x We are going to start using cassandra for a billing system, and while I can encrypt sstables at rest (via Datastax Enterprise), commit logs are more or less plain text. Thus, an attacker would be able to easily read, for example, credit card numbers in the clear text commit log (if the calling app does not encrypt the data itself before sending it to cassandra). I want to allow the option of encrypting the commit logs, most likely controlled by a property in the yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-6018) Add option to encrypt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-6018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-6018: -- Fix Version/s: (was: 3.0 beta 1) 3.x Add option to encrypt commitlog Key: CASSANDRA-6018 URL: https://issues.apache.org/jira/browse/CASSANDRA-6018 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jason Brown Assignee: Jason Brown Labels: commit_log, encryption, security Fix For: 3.x We are going to start using cassandra for a billing system, and while I can encrypt sstables at rest (via Datastax Enterprise), commit logs are more or less plain text. Thus, an attacker would be able to easily read, for example, credit card numbers in the clear text commit log (if the calling app does not encrypt the data itself before sending it to cassandra). I want to allow the option of encrypting the commit logs, most likely controlled by a property in the yaml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
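The idea in CASSANDRA-6018 is to encrypt commitlog data before it reaches disk so mutations (e.g. credit card numbers) are not readable in cleartext. A toy sketch of the write/replay framing only, with an XOR stand-in where a real cipher such as AES would go; all names are hypothetical and XOR is NOT secure:

```python
import struct

def toy_cipher(data: bytes, key: bytes) -> bytes:
    # placeholder for a real cipher (e.g. AES); XOR is symmetric, so the
    # same call both "encrypts" and "decrypts" in this sketch
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def frame_encrypted(mutation: bytes, key: bytes) -> bytes:
    """Length-prefix the ciphertext so replay knows how many bytes to read."""
    ct = toy_cipher(mutation, key)
    return struct.pack(">I", len(ct)) + ct

def read_frame(segment: bytes, key: bytes) -> bytes:
    """Inverse of frame_encrypted: read one framed mutation back out."""
    (n,) = struct.unpack(">I", segment[:4])
    return toy_cipher(segment[4:4 + n], key)
```

The point of the framing is that an attacker reading the raw segment sees only length prefixes and ciphertext, while replay can still walk the log mutation by mutation.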
[jira] [Commented] (CASSANDRA-9889) Disable scripted UDFs by default
[ https://issues.apache.org/jira/browse/CASSANDRA-9889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14652564#comment-14652564 ] Jonathan Ellis commented on CASSANDRA-9889: --- I could be missing something, but I'm not a huge fan of adding config switches that replicate limited pieces of authz functionality. Isn't this config switch the equivalent of "don't grant EXECUTE TRUSTED to anyone"? Disable scripted UDFs by default Key: CASSANDRA-9889 URL: https://issues.apache.org/jira/browse/CASSANDRA-9889 Project: Cassandra Issue Type: Improvement Reporter: Robert Stupp Assignee: Robert Stupp Priority: Minor Fix For: 3.0.0 rc1 (Follow-up to CASSANDRA-9402) TL;DR this ticket adds another config option to enable scripted UDFs. Securing Java UDFs is much easier than securing scripted UDFs. The secure execution of scripted UDFs heavily relies on how secure a particular script provider implementation is. Nashorn is probably pretty good at this - but (as discussed offline with [~iamaleksey]) we are not certain. This becomes worse with other JSR-223 providers (which need to be installed by the user anyway). E.g.: {noformat} # Enables use of scripted UDFs. # Java UDFs are always enabled, if enable_user_defined_functions is true. # Enable this option to be able to use UDFs with language javascript or any custom JSR-223 provider. enable_scripted_user_defined_functions: false {noformat} TBH: I would feel more comfortable having this one. But we should review this along with enable_user_defined_functions for 4.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9967) Determine if a Materialized View is built (consistent with its base table after its creation)
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9967: -- Fix Version/s: (was: 3.0.0 rc1) 3.0 beta 1 Determine if a Materialized View is built (consistent with its base table after its creation) - Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: Improvement Reporter: Alan Boudreault Fix For: 3.0 beta 1 Since an MV is eventually consistent with its base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9967) Determine if a Materialized View is built (consistent with its base table after its creation)
[ https://issues.apache.org/jira/browse/CASSANDRA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-9967: -- Priority: Major (was: Minor) Determine if a Materialized View is built (consistent with its base table after its creation) - Key: CASSANDRA-9967 URL: https://issues.apache.org/jira/browse/CASSANDRA-9967 Project: Cassandra Issue Type: Improvement Reporter: Alan Boudreault Fix For: 3.0 beta 1 Since an MV is eventually consistent with its base table, it would be nice if we could easily know the state of the MV after its creation, so we could wait until the MV is built before doing some operations. // cc [~mbroecheler] [~tjake] [~carlyeks] [~enigmacurry] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
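The ask above amounts to "poll until the view build is complete before proceeding." A generic sketch of such a wait loop with an injected is_built() predicate; the helper and its parameters are hypothetical, and in practice the predicate would query the server for the view's build status rather than be a local function:

```python
import time

def wait_until_built(is_built, timeout=60.0, interval=0.1,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll the injected is_built() predicate until it returns True or the
    timeout elapses. clock/sleep are injectable to keep the sketch testable."""
    deadline = clock() + timeout
    while clock() < deadline:
        if is_built():
            return True
        sleep(interval)
    return is_built()  # one final check at the deadline
```

A caller would create the MV, then call wait_until_built(...) before issuing reads that assume the view is consistent with its base table.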
[jira] [Commented] (CASSANDRA-9955) In 3 node Cluster, when 1 node was forced down, data failures are observed in other 2 nodes.
[ https://issues.apache.org/jira/browse/CASSANDRA-9955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14651868#comment-14651868 ] Jonathan Ellis commented on CASSANDRA-9955: --- I'm not sure what failures you're referring to. Nothing in the log you posted looks unexpected. In 3 node Cluster, when 1 node was forced down, data failures are observed in other 2 nodes. Key: CASSANDRA-9955 URL: https://issues.apache.org/jira/browse/CASSANDRA-9955 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.0.14, Hector Client (1.0.1), Red Hat Linux OS, Reporter: Amit Singh Chowdhery Issue: On a 3 node cluster, inserts are happening normally, but when 1 node is pulled down, after a few minutes the application stops and then failures start appearing on both remaining nodes. Hector exception logs: hector: ERROR m.p.c.c.ConcurrentHClientPool - Transport exception in re-opening client in release on ConcurrentCassandraClientPoolByHost. Cassandra Debug Logs : DEBUG [OptionalTasks:1] 2015-07-31 11:57:37,698 ColumnFamilyStore.java (line 300) retryPolicy for local is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:38,969 ColumnFamilyStore.java (line 300) retryPolicy for encryptionKey is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:39,492 ColumnFamilyStore.java (line 300) retryPolicy for vouchers.c_per__batchIdIdx is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:39,504 ColumnFamilyStore.java (line 300) retryPolicy for vouchers.TX_STATEIdx is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:39,824 ColumnFamilyStore.java (line 300) retryPolicy for vouchers.c_per__serialNumberIdx is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:39,824 ColumnFamilyStore.java (line 300) retryPolicy for vouchers.c_per__subStateIdx is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:39,828 ColumnFamilyStore.java (line 300) retryPolicy for vouchers is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:40,011 ColumnFamilyStore.java (line 300) retryPolicy for voucherHistory is 0.99 
DEBUG [OptionalTasks:1] 2015-07-31 11:57:40,021 ColumnFamilyStore.java (line 300) retryPolicy for vshash is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:40,180 ColumnFamilyStore.java (line 300) retryPolicy for vouchersByPurgeDate is 0.99 DEBUG [OptionalTasks:1] 2015-07-31 11:57:40,395 ColumnFamilyStore.java (line 300) retryPolicy for serialNums is 0.99 DEBUG [Thrift:7] 2015-07-31 11:57:40,452 CassandraServer.java (line 311) get_slice DEBUG [Thrift:35] 2015-07-31 11:57:40,452 CassandraServer.java (line 943) batch_mutate DEBUG [MutationStage:56] 2015-07-31 11:57:40,453 StorageProxy.java (line 928) Adding hint for /192.168.5.65 DEBUG [Thrift:35] 2015-07-31 11:57:40,453 Tracing.java (line 159) request complete DEBUG [Thrift:7] 2015-07-31 11:57:40,453 RowDigestResolver.java (line 62) resolving 2 responses DEBUG [Thrift:7] 2015-07-31 11:57:40,453 RowDigestResolver.java (line 94) resolve: 0 ms. DEBUG [Thrift:7] 2015-07-31 11:57:40,454 StorageProxy.java (line 1275) Read: 1 ms. Steps to reproduce: Step 1: In a 3 node cluster, start inserting records on any 2 nodes. Step 2: Take down the node on which data insertion was not happening (init 0). Step 3: Failures can be seen on the other 2 nodes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
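The "Adding hint for /192.168.5.65" line in the log above is expected behavior: with one replica down, the coordinator stores the mutation locally as a hint and replays it when the replica returns. A toy sketch of that hinted-handoff mechanism, with names that are purely illustrative and no relation to Cassandra's actual classes:

```python
from collections import defaultdict

class HintStore:
    """Toy hinted-handoff buffer: queue mutations for a down replica and
    replay them when the replica comes back."""

    def __init__(self):
        self.hints = defaultdict(list)

    def write(self, replica, mutation, replica_up):
        if replica_up:
            return ("applied", mutation)
        # replica is down: store a hint instead, as in "Adding hint for ..."
        self.hints[replica].append(mutation)
        return ("hinted", mutation)

    def replay(self, replica):
        """Drain and return all hints queued for a recovered replica."""
        pending, self.hints[replica] = self.hints[replica], []
        return pending
```

So the coordinator keeps accepting writes while the node is down; whether the client sees failures depends on the consistency level, not on hinting itself.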
[jira] [Resolved] (CASSANDRA-9957) Unable to build Apache Cassandra Under Debian 8 OS with the provided ant script
[ https://issues.apache.org/jira/browse/CASSANDRA-9957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-9957. --- Resolution: Not A Problem Something is broken in your environment, but this is not a C* bug. Unable to build Apache Cassandra Under Debian 8 OS with the provided ant script --- Key: CASSANDRA-9957 URL: https://issues.apache.org/jira/browse/CASSANDRA-9957 Project: Cassandra Issue Type: Bug Environment: PRETTY_NAME=Debian GNU/Linux 8 (jessie) NAME=Debian GNU/Linux VERSION_ID=8 VERSION=8 (jessie) ID=debian HOME_URL=http://www.debian.org/; SUPPORT_URL=http://www.debian.org/support/; BUG_REPORT_URL=https://bugs.debian.org/; java version 1.8.0_45 Java(TM) SE Runtime Environment (build 1.8.0_45-b14) Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode) Apache Ant(TM) version 1.9.5 compiled on May 31 2015 Reporter: Adelin M.Ghanayem Labels: Cassandra, ant, build, build.xml Trying to use the tool CCM (Cassandra Cluster Manager), I was blocked by an issue related to compiling the Cassandra source. CCM installs Cassandra and builds its source before anything else. However, CCM threw an error: https://gist.github.com/AdelinGhanaem/593d1c8a63857113d0a7 (here you can find all the info you need). I then tried to download the source and compile it using ant jar, but got the same error. Basically, the jars that are installed when running ant jar are corrupted! Extracting them with jar xf threw an error. The only way I could build the source was by downloading the jars by hand from Maven. I've described the error and the process in this post: http://mradelin.blogspot.com/2015/07/error-packaging-cassandra-220-db-source_31.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)