[jira] [Updated] (CASSANDRA-11052) Cannot use Java 8 lambda expression inside UDF code body

2016-02-08 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-11052:
-
Attachment: 11052-2.patch

> Cannot use Java 8 lambda expression inside UDF code body
> 
>
> Key: CASSANDRA-11052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11052
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: DOAN DuyHai
> Assignee: Robert Stupp
> Fix For: 3.x
>
> Attachments: 11052-2.patch, 11052.patch
>
>
> When creating the following **UDF** using Java 8 lambda syntax
> {code:sql}
>  CREATE FUNCTION IF NOT EXISTS music.udf(state map<text, bigint>, styles list<text>)
>  RETURNS NULL ON NULL INPUT
>  RETURNS map<text, bigint>
>  LANGUAGE java
>  AS $$
>    styles.forEach((Object o) -> {
>        String style = (String)o;
>        if(state.containsKey(style)) {
>            state.put(style, (Long)state.get(style)+1);
>        } else {
>            state.put(style, 1L);
>        }
>    });
>
>    return state;
>  $$;
> {code}
>  I got the following exception:
> {code:java}
> Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Could not compile function 'music.udf' from Java source: org.apache.cassandra.exceptions.InvalidRequestException: Java source compilation failed:
> Line 2: The type java.util.function.Consumer cannot be resolved. It is indirectly referenced from required .class files
> Line 2: The method forEach(Consumer) from the type Iterable refers to the missing type Consumer
> Line 2: The target type of this expression must be a functional interface
>   at com.datastax.driver.core.Responses$Error.asException(Responses.java:136)
>   at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:179)
>   at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:184)
>   at com.datastax.driver.core.RequestHandler.access$2500(RequestHandler.java:43)
>   at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:798)
>   at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:617)
>   at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1005)
>   at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:928)
>   at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
>   at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
>   at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
>   at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:276)
>   at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:263)
>   at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:318)
>   at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:304)
>   at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
>   ... 1 more
> {code}
>  It looks like the compiler requires importing java.util.function.Consumer, but I
> have checked the source code and the compiler options already support Java 8
> source code, so I'm pretty puzzled here ...
> /cc [~snazy]
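
Since the errors above all concern java.util.function.Consumer, one possible workaround is to write the same counting logic without a lambda. The following is only a sketch in plain Java: StyleCounter and countStyles are hypothetical names, the argument types assume the UDF takes a map of text to bigint and a list of text, and it has not been verified that the loop compiles inside the UDF sandbox of any particular Cassandra version.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; in a UDF only the loop and the return
// statement would go between the $$ ... $$ delimiters.
public class StyleCounter {

    // Same counting logic as the UDF body in the report, written with a
    // for-each loop so that no functional interface is referenced.
    static Map<String, Long> countStyles(Map<String, Long> state, List<String> styles) {
        for (String style : styles) {
            Long current = state.get(style);
            state.put(style, current == null ? 1L : current + 1);
        }
        return state;
    }

    public static void main(String[] args) {
        Map<String, Long> state = new HashMap<>();
        System.out.println(countStyles(state, Arrays.asList("rock", "jazz", "rock")));
    }
}
```

The loop relies only on Iterable and Iterator, which predate Java 8, so it avoids the type the compiler could not resolve.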



--

[jira] [Commented] (CASSANDRA-11052) Cannot use Java 8 lambda expression inside UDF code body

2016-02-07 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136352#comment-15136352
 ] 

Sean Bridges commented on CASSANDRA-11052:
--

What is the purpose of UDFByteCodeVerifier? Non-Java UDFs don't use 
UDFByteCodeVerifier, so it shouldn't be used to enforce policies like 
disallowing java.lang.invoke or use of the common fork join pool. 
JavaScript UDFs can do both now.

It seems all policy/security enforcement should be done using 
ThreadAwareSecurityManager or some other method common to all UDFs.

To stop UDFs from using the common fork join pool, we could set the system 
property java.util.concurrent.ForkJoinPool.common.threadFactory to a thread 
factory that never creates new threads. This would disallow using the common 
pool for the entire JVM, though.
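
A minimal sketch of such a factory, for illustration only. NoNewThreadsFactory is a hypothetical name and not part of any patch on this ticket. The JDK loads the class named in the property reflectively, so it needs a public no-argument constructor, and the property has to be set (or passed with -D) before the common pool is first used.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;

// Hypothetical factory that rejects every request for a worker thread.
public class NoNewThreadsFactory implements ForkJoinPool.ForkJoinWorkerThreadFactory {

    @Override
    public ForkJoinWorkerThread newThread(ForkJoinPool pool) {
        // Returning null signals that the request to create a thread is rejected.
        return null;
    }

    public static void main(String[] args) {
        // Must run before anything touches the common pool; in practice this
        // would more likely be a -D flag on the JVM command line.
        System.setProperty(
                "java.util.concurrent.ForkJoinPool.common.threadFactory",
                NoNewThreadsFactory.class.getName());

        // Shows which factory the common pool ended up with.
        System.out.println(ForkJoinPool.commonPool().getFactory().getClass().getName());
    }
}
```

Note that work submitted to a pool without workers may still be run by the submitting thread when it joins the task, so this limits thread creation rather than blocking every use of the pool.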

I can add tests to make sure ThreadAwareSecurityManager does not allow using 
java.lang.invoke in a malicious way. I can also add tests that make sure 
ThreadAwareSecurityManager doesn't allow UDFs to create a new ForkJoinPool.


> Cannot use Java 8 lambda expression inside UDF code body
> 
>
> Key: CASSANDRA-11052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11052
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: DOAN DuyHai
> Assignee: Robert Stupp
> Fix For: 3.x
>
> Attachments: 11052.patch
>
>

[jira] [Commented] (CASSANDRA-11052) Cannot use Java 8 lambda expression inside UDF code body

2016-02-07 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136326#comment-15136326
 ] 

Sean Bridges commented on CASSANDRA-11052:
--

Thanks for the feedback, I'll try to limit what is allowed.

What are we trying to protect against? Is this to secure Cassandra in a 
multi-tenant environment against a malicious tenant, or to stop a user from 
accidentally causing instability?

> Cannot use Java 8 lambda expression inside UDF code body
> 
>
> Key: CASSANDRA-11052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11052
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: DOAN DuyHai
> Assignee: Robert Stupp
> Fix For: 3.x
>
> Attachments: 11052.patch
>
>

[jira] [Updated] (CASSANDRA-11052) Cannot use Java 8 lambda expression inside UDF code body

2016-01-31 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-11052:
-
Attachment: 11052.patch

> Cannot use Java 8 lambda expression inside UDF code body
> 
>
> Key: CASSANDRA-11052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11052
> Project: Cassandra
> Issue Type: Bug
> Components: CQL
> Reporter: DOAN DuyHai
> Assignee: Robert Stupp
> Fix For: 3.x
>
> Attachments: 11052.patch
>
>

[jira] [Commented] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-27 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14185680#comment-14185680
 ] 

Sean Bridges commented on CASSANDRA-8177:
-

[~jbellis] Is the fix version of 2.1.2 right? I'm not sure if this affects 2.1.

 sequential repair is much more expensive than parallel repair
 -

 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges
Assignee: Yuki Morishita
 Fix For: 2.1.2

 Attachments: cassc-week.png, iostats.png


 This is with 2.0.10
 The attached graph shows io read/write throughput (as measured with iostat) 
 when doing repairs.
 The large hump on the left is a sequential repair of one node.  The two much 
 smaller peaks on the right are parallel repairs.
 This is a 3 node cluster using vnodes (I know vnodes on small clusters isn't 
 recommended).  Cassandra reports load of 40 gigs.
 We noticed a similar problem with a larger cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-24 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183783#comment-14183783
 ] 

Sean Bridges edited comment on CASSANDRA-8177 at 10/25/14 12:18 AM:


{quote}
My guess for sequential repair generating lots of IO is that, when reading from 
snapshot, it is hitting disk for each snapshot SSTable to read its bloom 
filters, index files etc
{quote}

When you snapshot, you are hardlinking the original sstables, so the snapshot 
and the originals are the same files; the OS cache shouldn't be the difference.
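
The point about hard links can be checked with a small sketch. HardLinkDemo and the file names are hypothetical, and the example assumes the temporary directory sits on a filesystem that supports hard links.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: a hard link is a second name for the same file.
public class HardLinkDemo {

    // Creates a file and a hard link to it, then reports whether both
    // paths refer to the same underlying file.
    static boolean linkSharesFile() {
        try {
            Path dir = Files.createTempDirectory("snapshot-demo");
            Path original = Files.write(dir.resolve("data.db"), new byte[] {1, 2, 3});
            Path snapshot = Files.createLink(dir.resolve("data-snapshot.db"), original);
            try {
                return Files.isSameFile(original, snapshot);
            } finally {
                Files.delete(snapshot);
                Files.delete(original);
                Files.delete(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("same file: " + linkSharesFile());
    }
}
```

Because both names resolve to one file, pages already cached for the original are the same pages a read through the link would use.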


was (Author: sgbridges):
{quote}
My guess for sequential repair generating lots of IO is that, when reading from 
snapshot, it is hitting disk for each snapshot SSTable to read its bloom 
filters, index files etc
{quote}

When you snapshot you are hardlinking the old and original sstables, they are 
the same file, so the os cache shouldn't be the difference

 sequential repair is much more expensive than parallel repair
 -

 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges
Assignee: Yuki Morishita
 Attachments: cassc-week.png, iostats.png




[jira] [Commented] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-24 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183783#comment-14183783
 ] 

Sean Bridges commented on CASSANDRA-8177:
-

{quote}
My guess for sequential repair generating lots of IO is that, when reading from 
snapshot, it is hitting disk for each snapshot SSTable to read its bloom 
filters, index files etc
{quote}

When you snapshot, you are hardlinking the original sstables, so the snapshot 
and the original are the same file; the OS cache shouldn't be the difference.

 sequential repair is much more expensive than parallel repair
 -

 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges
Assignee: Yuki Morishita
 Attachments: cassc-week.png, iostats.png




[jira] [Created] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-23 Thread Sean Bridges (JIRA)
Sean Bridges created CASSANDRA-8177:
---

 Summary: sequential repair is much more expensive than parallel 
repair
 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges


This is with 2.0.10

The attached graph shows io read/write throughput (as measured with iostat) 
when doing repairs.

The large hump on the left is a sequential repair of one node.  The two much 
smaller peaks on the right are parallel repairs.

This is a 3 node cluster using vnodes (I know vnodes on small clusters isn't 
recommended).  Cassandra reports load of 40 gigs.

We noticed a similar problem with a larger cluster.





[jira] [Updated] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-23 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-8177:

Attachment: iostats.png

 sequential repair is much more expensive than parallel repair
 -

 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges
 Attachments: iostats.png




[jira] [Commented] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-23 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14182157#comment-14182157
 ] 

Sean Bridges commented on CASSANDRA-8177:
-

We can't easily upgrade to 2.1.

I don't think this issue is a dupe of CASSANDRA-5220. Looking at the graphs, I 
think something is quite wrong with either sequential or parallel repair. With 
a 3 node cluster, using sequential repair shouldn't cause repairs to take 13 
times as long and use a lot more IO.


 sequential repair is much more expensive than parallel repair
 -

 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges
 Attachments: iostats.png




[jira] [Reopened] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-23 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges reopened CASSANDRA-8177:
-

 sequential repair is much more expensive than parallel repair
 -

 Key: CASSANDRA-8177
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges
 Attachments: iostats.png




[jira] [Commented] (CASSANDRA-6456) log listen address at startup

2014-01-22 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879451#comment-13879451
 ] 

Sean Bridges commented on CASSANDRA-6456:
-

Sorry, didn't know you were waiting for me.  Latest patch looks good to me.

 log listen address at startup
 -

 Key: CASSANDRA-6456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6456
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Jeremy Hanna
Assignee: Sean Bridges
Priority: Trivial
 Attachments: 6456_v4_trunk.patch, CASSANDRA-6456-2.patch, 
 CASSANDRA-6456-3.patch, CASSANDRA-6456.patch


 When looking through logs from a cluster, it's sometimes handy to be able to 
 tell a node's address from its logs.  It would be convenient if, on startup, 
 we logged the listen address for that node.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Updated] (CASSANDRA-6456) log listen address at startup

2014-01-05 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-6456:


Attachment: CASSANDRA-6456-3.patch

 log listen address at startup
 -

 Key: CASSANDRA-6456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6456
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Jeremy Hanna
Assignee: Sean Bridges
Priority: Trivial
 Attachments: CASSANDRA-6456-2.patch, CASSANDRA-6456-3.patch, 
 CASSANDRA-6456.patch




[jira] [Commented] (CASSANDRA-6456) log listen address at startup

2014-01-05 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862774#comment-13862774
 ] 

Sean Bridges commented on CASSANDRA-6456:
-

The new patch removes all the log lines that this makes redundant.

 log listen address at startup
 -

 Key: CASSANDRA-6456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6456
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Jeremy Hanna
Assignee: Sean Bridges
Priority: Trivial
 Attachments: CASSANDRA-6456-2.patch, CASSANDRA-6456-3.patch, 
 CASSANDRA-6456.patch




[jira] [Updated] (CASSANDRA-6456) log listen address at startup

2014-01-02 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-6456:


Attachment: CASSANDRA-6456-2.patch

 log listen address at startup
 -

 Key: CASSANDRA-6456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6456
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Jeremy Hanna
Assignee: Sean Bridges
Priority: Trivial
 Attachments: CASSANDRA-6456-2.patch, CASSANDRA-6456.patch




[jira] [Commented] (CASSANDRA-6456) log listen address at startup

2014-01-02 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13861183#comment-13861183
 ] 

Sean Bridges commented on CASSANDRA-6456:
-

New patch attached.

{quote}
I think we should change the format to a single line (helps when grep'ing) 
(see this gist)
{quote}

Changed to log on a single line with slightly modified format to be consistent 
with other log lines. 

{quote}
For the original intent of this JIRA I think we need to add a call to get 
address or something. As the IP's in the yaml can be left blank.
{quote}

I added a line to log InetAddress.getLocalHost() on startup in case listen 
address is not set
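
A minimal sketch of that fallback, not the code from the attached patch: the class name, method name and message wording below are hypothetical, and only InetAddress.getLocalHost() is a real JDK call.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper; the real patch may structure this differently.
final class ListenAddressLogSketch {
    /** Builds a single log line describing the address this node listens on. */
    static String describe(String configuredListenAddress) {
        // Use the configured value when the yaml actually sets one.
        if (configuredListenAddress != null && !configuredListenAddress.trim().isEmpty()) {
            return "listen_address=" + configuredListenAddress.trim();
        }
        // Otherwise report what the JVM resolves for the local host.
        try {
            return "listen_address unset; InetAddress.getLocalHost()="
                    + InetAddress.getLocalHost().getHostAddress();
        } catch (UnknownHostException e) {
            return "listen_address unset; local host could not be resolved: " + e.getMessage();
        }
    }
}
```

Keeping everything on one line means a single grep over the logs of a cluster shows each node's address.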


{quote}
I think this makes some ad-hoc config logging redundant as well?
{quote}

A couple of log lines were removed with the original patch, let me know if 
there are more to remove.


 log listen address at startup
 -

 Key: CASSANDRA-6456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6456
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Jeremy Hanna
Assignee: Sean Bridges
Priority: Trivial
 Attachments: CASSANDRA-6456-2.patch, CASSANDRA-6456.patch


 When looking through logs from a cluster, sometimes it's handy to know the 
 address a node is from the logs.  It would be convenient if on startup, we 
 indicated the listen address for that node.





[jira] [Updated] (CASSANDRA-6456) log listen address at startup

2014-01-01 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-6456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-6456:


Attachment: CASSANDRA-6456.patch

This patch logs all config settings on startup, excepting some settings which 
may contain passwords
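
As a rough illustration of the idea (not the patch itself), the sketch below walks the public fields of a config object by reflection and skips a deny-list of sensitive names. ExampleConfig, the field names and the SENSITIVE set are all assumptions made for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;
import java.util.TreeMap;

final class ConfigLogSketch {
    // Hypothetical stand-in for the node configuration class.
    static final class ExampleConfig {
        public String cluster_name = "Test Cluster";
        public String listen_address = "10.0.0.1";
        public String keystore_password = "secret";
    }

    // Hypothetical deny-list of settings that must never reach the log.
    static final Set<String> SENSITIVE =
            new HashSet<>(Arrays.asList("keystore_password", "truststore_password"));

    /** Renders all non-sensitive public instance fields on one line, sorted by name. */
    static String describe(Object config) {
        Map<String, String> values = new TreeMap<>();
        for (Field field : config.getClass().getFields()) {
            if (Modifier.isStatic(field.getModifiers()) || SENSITIVE.contains(field.getName())) {
                continue; // skip constants and anything that may hold a password
            }
            try {
                values.put(field.getName(), String.valueOf(field.get(config)));
            } catch (IllegalAccessException e) {
                values.put(field.getName(), "<unreadable>");
            }
        }
        StringJoiner line = new StringJoiner("; ", "Node configuration: [", "]");
        for (Map.Entry<String, String> entry : values.entrySet()) {
            line.add(entry.getKey() + "=" + entry.getValue());
        }
        return line.toString();
    }
}
```

A deny-list is the simplest choice here; an allow-list would be safer if new sensitive settings are added later.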

 log listen address at startup
 -

 Key: CASSANDRA-6456
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6456
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Jeremy Hanna
Assignee: Jeremy Hanna
Priority: Trivial
 Attachments: CASSANDRA-6456.patch


 When looking through logs from a cluster, sometimes it's handy to know the 
 address a node is from the logs.  It would be convenient if on startup, we 
 indicated the listen address for that node.





[jira] [Commented] (CASSANDRA-5293) formalize that timestamps are epoch-in-micros in 2.0

2013-04-02 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620144#comment-13620144
 ] 

Sean Bridges commented on CASSANDRA-5293:
-

This will break a lot of our code as we use non epoch-in-micro values as 
timestamps quite a bit.  It is very handy for ensuring order when you have 
another monotonically increasing id available.  

As an example we compute meta data for versioned objects, and store the meta 
data in cassandra.  The version id is a monotonically increasing long, and we 
write the meta data to cassandra with a timestamp of the version id.  Due to 
retries, multiple machines may be processing the same object with different 
version ids, but since we always write to cassandra with a timestamp of the 
version id, the latest version id always wins.
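
The effect can be sketched with a toy in-memory store that, like Cassandra, keeps the write with the highest timestamp. The class and method names are hypothetical, and tie-breaking between equal timestamps is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Toy last-write-wins store; not Cassandra code.
final class VersionTimestampSketch {
    private static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Cell> cells = new HashMap<>();

    /** Applies a write only if its timestamp is newer than what is already stored. */
    void write(String key, String value, long timestamp) {
        Cell existing = cells.get(key);
        if (existing == null || timestamp > existing.timestamp) {
            cells.put(key, new Cell(value, timestamp));
        }
    }

    /** Returns the stored value, or null when nothing has been written for the key. */
    String read(String key) {
        Cell cell = cells.get(key);
        return cell == null ? null : cell.value;
    }
}
```

If a retry for version 5 arrives after version 7 was stored, writing with the version id as the timestamp leaves version 7 in place regardless of arrival order.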

We have a couple other use cases, but having a user set timestamp that does not 
have to be an epoch-in-micros is very useful.

If you want a real timestamp, perhaps it is better to add a new 
timestamp-micros field which is set by the co-ordinator, and not visible to 
thrift/cql.

 formalize that timestamps are epoch-in-micros in 2.0
 

 Key: CASSANDRA-5293
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5293
 Project: Cassandra
  Issue Type: Task
  Components: Core
Reporter: Jonathan Ellis
 Fix For: 2.0


 We've worked around don't assume timestamps are actually timestamps but the 
 utility is not worth the complexity and lost opportunities to optimize this 
 imposes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (CASSANDRA-5392) cassandra-all 1.2.0 pom missing netty dependency

2013-03-27 Thread Sean Bridges (JIRA)
Sean Bridges created CASSANDRA-5392:
---

 Summary: cassandra-all 1.2.0 pom missing netty dependency
 Key: CASSANDRA-5392
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5392
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Affects Versions: 1.2.3
Reporter: Sean Bridges
 Fix For: 1.2.4


It seems that cassandra depends on netty now, however the pom excludes this 
dependency.  This was previously reported as CASSANDRA-5181, but the fix for 
5181 added netty to the dependency-management section of the pom, not the 
dependencies section.



[jira] [Updated] (CASSANDRA-5392) cassandra-all 1.2.0 pom missing netty dependency

2013-03-27 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-5392:


Attachment: CASSANDRA-5392.txt

 cassandra-all 1.2.0 pom missing netty dependency
 

 Key: CASSANDRA-5392
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5392
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Affects Versions: 1.2.3
Reporter: Sean Bridges
 Fix For: 1.2.4

 Attachments: CASSANDRA-5392.txt


 It seems that cassandra depends on netty now, however the pom excludes this 
 dependency.  This was previously reported as CASSANDRA-5181, but the fix for 
 5181 added netty to the dependency-management section of the pom, not the 
 dependencies section.



[jira] [Commented] (CASSANDRA-2494) Quorum reads are not consistent

2011-04-22 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023233#comment-13023233
 ] 

Sean Bridges commented on CASSANDRA-2494:
-

I think the guarantee of quorum reads not seeing old writes once a quorum read 
sees a new write is very useful.  I suspect most people already think that 
this guarantee holds, including, it seems, Jonathan Ellis, whose quote can be 
found in the email thread linked to in the bug:

The important guarantee this gives you is that once one quorum read sees the 
new value, all others will too.   You can't see the newest version, then see an 
older version on a subsequent write [sic, I
assume he meant read], which is the characteristic of non-strong consistency





 Quorum reads are not consistent
 ---

 Key: CASSANDRA-2494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges

 As discussed in this thread,
 http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
 Quorum reads should be consistent.  Assume we have a cluster of 3 nodes 
 (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but 
 not Y and Z, then a read from X should not return N unless the read is 
 committed to at  least two nodes.  To ensure this, a read from X should wait 
 for an ack of the read repair write from either Y or Z before returning.
 Are there system tests for cassandra?  If so, there should be a test similar 
 to the original post in the email thread.  One thread should write 1,2,3... 
 at consistency level ONE.  Another thread should read at consistency level 
 QUORUM from a random host, and verify that each read is >= the last read.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-2494) Quorum reads are not consistent

2011-04-22 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023242#comment-13023242
 ] 

Sean Bridges commented on CASSANDRA-2494:
-

To be clear, this is a new guarantee.  The current guarantee is R+W>N gives you 
consistency.  This bug is asking that a quorum read of A means that A has been 
committed to a quorum of nodes.

How can we ensure the quorum read property that you want ?

If when reading at quorum, and no quorum can be found which agrees on a 
particular value, then the coordinator (?) will wait for acks of read repair 
writes (or perhaps just do normal writes) to be returned from a sufficient 
number of nodes to ensure that the value has been committed to a quorum of 
nodes.

Without this new guarantee it is hard for readers to function correctly.  The 
reader does not know that the quorum write failed, or is still in progress, so 
without reading at ALL, the R+W>N guarantee does not help the reader.
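
For reference, the arithmetic behind the existing guarantee can be sketched as follows, where R and W are the number of replicas that must acknowledge a read or a write and N is the replication factor. The class and method names are made up for the example.

```java
// Illustrative arithmetic only; not Cassandra code.
final class QuorumSketch {
    /** Number of replicas in a quorum for the given replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read replica set must intersect every successful write replica set. */
    static boolean overlaps(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }
}
```

With N = 3, a quorum is 2, so QUORUM reads with QUORUM writes overlap (2 + 2 > 3), while QUORUM reads with writes at ONE do not (2 + 1 = 3). The condition also says nothing about a write that failed or is still in flight, which is the case this issue is about.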





 Quorum reads are not consistent
 ---

 Key: CASSANDRA-2494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges

 As discussed in this thread,
 http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
 Quorum reads should be consistent.  Assume we have a cluster of 3 nodes 
 (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but 
 not Y and Z, then a read from X should not return N unless the read is 
 committed to at  least two nodes.  To ensure this, a read from X should wait 
 for an ack of the read repair write from either Y or Z before returning.
 Are there system tests for cassandra?  If so, there should be a test similar 
 to the original post in the email thread.  One thread should write 1,2,3... 
 at consistency level ONE.  Another thread should read at consistency level 
 QUORUM from a random host, and verify that each read is >= the last read.



[jira] [Issue Comment Edited] (CASSANDRA-2494) Quorum reads are not consistent

2011-04-22 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13023242#comment-13023242
 ] 

Sean Bridges edited comment on CASSANDRA-2494 at 4/22/11 3:23 PM:
--

To be clear, this is a new guarantee.  The current guarantee is R+W>N gives you 
consistency.  This bug is asking that a successful quorum read of A means that 
A has been committed to a quorum of nodes.

How can we ensure the quorum read property that you want ?

If when reading at quorum, and no quorum can be found which agrees on a 
particular value, then the coordinator ( ? ) will wait for acks of read repair 
writes (or perhaps just do normal writes) to be returned from a sufficient 
number of nodes to ensure that the value has been committed to a quorum of 
nodes.

Without this new guarantee it is hard for readers to function correctly.  The 
reader does not know that the quorum write failed, or is still in progress, so 
without reading at ALL, the R+W>N guarantee does not help the reader.





  was (Author: sbridges):
To be clear, this is a new guarantee.  The current guarantee is R+W>N gives 
you consistency.  This bug is asking that a quorum read of A means that A has 
been committed to a quorum of nodes.

How can we ensure the quorum read property that you want ?

If when reading at quorum, and no quorum can be found which agrees on a 
particular value, then the coordinator (?) will wait for acks of read repair 
writes (or perhaps just do normal writes) to be returned from a sufficient 
number of nodes to ensure that the value has been committed to a quorum of 
nodes.

Without this new guarantee it is hard for readers to function correctly.  The 
reader does not know that the quorum write failed, or is still in progress, so 
without reading at ALL, the R+W>N guarantee does not help the reader.




  
 Quorum reads are not consistent
 ---

 Key: CASSANDRA-2494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges

 As discussed in this thread,
 http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
 Quorum reads should be consistent.  Assume we have a cluster of 3 nodes 
 (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but 
 not Y and Z, then a read from X should not return N unless the read is 
 committed to at  least two nodes.  To ensure this, a read from X should wait 
 for an ack of the read repair write from either Y or Z before returning.
 Are there system tests for cassandra?  If so, there should be a test similar 
 to the original post in the email thread.  One thread should write 1,2,3... 
 at consistency level ONE.  Another thread should read at consistency level 
 QUORUM from a random host, and verify that each read is >= the last read.



[jira] [Created] (CASSANDRA-2494) Quorum reads are not consistent

2011-04-17 Thread Sean Bridges (JIRA)
Quorum reads are not consistent
---

 Key: CASSANDRA-2494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges


As discussed in this thread,

http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html

If we have a cluster of 3 nodes (X,Y,Z) and a replication factor of 3, Quorum 
reads should be consistent.  If a write of N is committed to X, but not Y and 
Z, then a read from X should not return N unless the read is committed to at  
least two nodes.  To ensure this, a read from X should wait for an ack of the 
read repair write from either Y or Z before returning.

Are there system tests for cassandra?  If so, there should be a test similar to 
the original post in the email thread.  One thread should write 1,2,3... at 
consistency level ONE.  Another thread should read at consistency level QUORUM 
from a random host, and verify that each read is >= the last read.
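
The reader side of such a test might boil down to a check like the one below. This is only a sketch of the verification step: it assumes the values returned by successive QUORUM reads have already been collected into an array, and it leaves out the Cassandra client calls entirely.

```java
// Hypothetical verification helper for the proposed system test.
final class MonotonicReadCheckSketch {
    /**
     * Returns the index of the first read that is smaller than the read before it,
     * or -1 when the sequence never goes backwards.
     */
    static int firstRegression(long[] reads) {
        for (int i = 1; i < reads.length; i++) {
            if (reads[i] < reads[i - 1]) {
                return i;
            }
        }
        return -1;
    }
}
```

A sequence such as 1, 3, 2 would be flagged at index 2, which is the situation the issue describes: a quorum read sees a newer value and a later quorum read sees an older one.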



[jira] [Updated] (CASSANDRA-2494) Quorum reads are not consistent

2011-04-17 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-2494:


Description: 
As discussed in this thread,

http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html

Quorum reads should be consistent.  Assume we have a cluster of 3 nodes (X,Y,Z) 
and a replication factor of 3. If a write of N is committed to X, but not Y and 
Z, then a read from X should not return N unless the read is committed to at  
least two nodes.  To ensure this, a read from X should wait for an ack of the 
read repair write from either Y or Z before returning.

Are there system tests for cassandra?  If so, there should be a test similar to 
the original post in the email thread.  One thread should write 1,2,3... at 
consistency level ONE.  Another thread should read at consistency level QUORUM 
from a random host, and verify that each read is >= the last read.

  was:
As discussed in this thread,

http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html

If we have a cluster of 3 nodes (X,Y,Z) and a replication factor of 3, Quorum 
reads should be consistent.  If a write of N is committed to X, but not Y and 
Z, then a read from X should not return N unless the read is committed to at  
least two nodes.  To ensure this, a read from X should wait for an ack of the 
read repair write from either Y or Z before returning.

Are there system tests for cassandra?  If so, there should be a test similar to 
the original post in the email thread.  One thread should write 1,2,3... at 
consistency level ONE.  Another thread should read at consistency level QUORUM 
from a random host, and verify that each read is >= the last read.


 Quorum reads are not consistent
 ---

 Key: CASSANDRA-2494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges

 As discussed in this thread,
 http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
 Quorum reads should be consistent.  Assume we have a cluster of 3 nodes 
 (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but 
 not Y and Z, then a read from X should not return N unless the read is 
 committed to at  least two nodes.  To ensure this, a read from X should wait 
 for an ack of the read repair write from either Y or Z before returning.
 Are there system tests for cassandra?  If so, there should be a test similar 
 to the original post in the email thread.  One thread should write 1,2,3... 
 at consistency level ONE.  Another thread should read at consistency level 
 QUORUM from a random host, and verify that each read is >= the last read.



[jira] [Commented] (CASSANDRA-2494) Quorum reads are not consistent

2011-04-17 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020898#comment-13020898
 ] 

Sean Bridges commented on CASSANDRA-2494:
-

Peter Schuller wrote,

However, it sounds like what is being asked for is not that they don't 
propagate in the event of a write failure, but just that reads don't see the 
writes until they are sufficiently propagated to guarantee that any future 
QUORUM read will also see the data.

Yes, that is the issue.  The comment in the bug about writing at ONE and 
reading at QUORUM is just a way of testing this new guarantee in a distributed 
test, if Cassandra has those.

 Quorum reads are not consistent
 ---

 Key: CASSANDRA-2494
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2494
 Project: Cassandra
  Issue Type: Bug
Reporter: Sean Bridges

 As discussed in this thread,
 http://www.mail-archive.com/user@cassandra.apache.org/msg12421.html
 Quorum reads should be consistent.  Assume we have a cluster of 3 nodes 
 (X,Y,Z) and a replication factor of 3. If a write of N is committed to X, but 
 not Y and Z, then a read from X should not return N unless the read is 
 committed to at  least two nodes.  To ensure this, a read from X should wait 
 for an ack of the read repair write from either Y or Z before returning.
 Are there system tests for cassandra?  If so, there should be a test similar 
 to the original post in the email thread.  One thread should write 1,2,3... 
 at consistency level ONE.  Another thread should read at consistency level 
 QUORUM from a random host, and verify that each read is >= the last read.



[jira] Updated: (CASSANDRA-1187) make the number of compaction threads configurable

2010-07-31 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-1187:


Attachment: CASSANDRA-1187-2.patch

Is this what you were thinking of?  

The patch adds a new ConcurrentCompactedRow which can read columns from 
multiple SSTables in parallel.  I'm not sure how much parallelism this patch 
gives.  For the case where two SSTables have no rows in common, there is no 
benefit.

Trying to read from multiple rows in parallel seems like it would get messy.

 make the number of compaction threads configurable
 --

 Key: CASSANDRA-1187
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1187
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sean Bridges
 Attachments: CASSANDRA-1187-2.patch, CASSANDRA-1187.patch


 On our test machines, compaction is the limiting factor when we are writing 
 to Cassandra.  It's easy to write to Cassandra faster than the single 
 compaction thread can keep up, leading to a large number of sstables.
 In one extreme example, we inserted a TB of data into a single cassandra node 
 overnight, and ended up with 100,000 sstables, which took another two days to 
 finish compacting.
 If the number of compaction threads was configurable, we could tune cassandra 
 to support a higher write workload.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-912) First-class commandline interface

2010-07-10 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-912:
---

Attachment: CASSANDRA-912-2.patch.txt

rebased previous patch to trunk

 First-class commandline interface
 -

 Key: CASSANDRA-912
 URL: https://issues.apache.org/jira/browse/CASSANDRA-912
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 0.6
Reporter: Eric Evans
 Fix For: 0.7

 Attachments: CASSANDRA-912-2.patch.txt, CASSANDRA-912.patch


 While a useful tool for education and simple tests, cassandra-cli is 
 ultimately limited by the fact that column names and values are binary, (and 
 eventually keys will be as well, see CASSANDRA-767). 
 The current approach when writing consists of encoding column names as UTF8, 
 and passing the value as a byte[] of the String parsed from the command. When 
 performing a read, the column names outputted are the result of the 
 toString() method of the comparator (the result of which is not always 
 meaningful), and values are again treated as raw strings. This is almost 
 certainly broken anywhere that the CF comparator is not UTF8Type and values 
 are anything but strings.
 One possible approach would be to follow HBase's lead and simply allow binary 
 values to be encoded as strings (see: 
 http://wiki.apache.org/hadoop/Hbase/Shell).
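
 As a rough sketch of what encoding binary values as strings could look like: printable ASCII is passed through and every other byte is written as a \xNN escape. The class and method names are hypothetical, and the exact escaping rules used by the HBase shell may differ.

```java
// Hypothetical encoder; shown only to illustrate the escaping idea.
final class BinaryStringSketch {
    /** Renders bytes as text, escaping non-printable bytes and backslashes as \xNN. */
    static String toStringBinary(byte[] bytes) {
        StringBuilder out = new StringBuilder();
        for (byte b : bytes) {
            int ch = b & 0xFF; // treat the byte as unsigned
            if (ch >= 0x20 && ch <= 0x7E && ch != '\\') {
                out.append((char) ch);
            } else {
                out.append(String.format("\\x%02X", ch));
            }
        }
        return out.toString();
    }
}
```

 Escaping the backslash itself keeps the encoding unambiguous, so a matching decoder could turn the string back into the original bytes.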




[jira] Updated: (CASSANDRA-1187) make the number of compaction threads configurable

2010-06-13 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-1187:


Attachment: CASSANDRA-1187.patch

This patch allows setting the number of threads used in compaction.

A queue is created for each column family, and only one compaction thread is 
allowed to compact a column family at a time.
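
The scheduling idea can be sketched as below. This is not the code in the attached patch: the class names are hypothetical, and the per-queue logic follows the SerialExecutor pattern from the java.util.concurrent.Executor javadoc, layered over a shared pool whose size would be the configurable thread count.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;

// Hypothetical scheduler: one FIFO queue per column family over a shared pool.
final class CompactionSchedulerSketch {
    /** Runs its tasks one at a time, in order, on the shared pool. */
    static final class SerialQueue implements Executor {
        private final Queue<Runnable> tasks = new ArrayDeque<>();
        private final Executor pool;
        private Runnable active;

        SerialQueue(Executor pool) {
            this.pool = pool;
        }

        @Override
        public synchronized void execute(Runnable task) {
            tasks.add(() -> {
                try {
                    task.run();
                } finally {
                    scheduleNext(); // hand the next queued compaction to the pool
                }
            });
            if (active == null) {
                scheduleNext();
            }
        }

        private synchronized void scheduleNext() {
            active = tasks.poll();
            if (active != null) {
                pool.execute(active);
            }
        }
    }

    private final Executor pool;
    private final Map<String, SerialQueue> queues = new ConcurrentHashMap<>();

    CompactionSchedulerSketch(Executor pool) {
        this.pool = pool;
    }

    /** Queues a compaction; tasks for the same column family never overlap. */
    void submit(String columnFamily, Runnable compaction) {
        queues.computeIfAbsent(columnFamily, cf -> new SerialQueue(pool)).execute(compaction);
    }
}
```

Passing something like Executors.newFixedThreadPool(n) as the pool bounds total compaction parallelism at n, while different column families can still compact concurrently.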

 make the number of compaction threads configurable
 --

 Key: CASSANDRA-1187
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1187
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.6.1
Reporter: Sean Bridges
 Attachments: CASSANDRA-1187.patch


 On our test machines, compaction is the limiting factor when we are writing 
 to Cassandra.  It's easy to write to Cassandra faster than the single 
 compaction thread can keep up, leading to a large number of sstables.
 In one extreme example, we inserted a TB of data into a single cassandra node 
 overnight, and ended up with 100,000 sstables, which took another two days to 
 finish compacting.
 If the number of compaction threads was configurable, we could tune cassandra 
 to support a higher write workload.




[jira] Updated: (CASSANDRA-912) First-class commandline interface

2010-05-24 Thread Sean Bridges (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Bridges updated CASSANDRA-912:
---

Attachment: CASSANDRA-912.patch

 First-class commandline interface
 -

 Key: CASSANDRA-912
 URL: https://issues.apache.org/jira/browse/CASSANDRA-912
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Affects Versions: 0.6
Reporter: Eric Evans
 Attachments: CASSANDRA-912.patch


 While a useful tool for education and simple tests, cassandra-cli is 
 ultimately limited by the fact that column names and values are binary, (and 
 eventually keys will be as well, see CASSANDRA-767). 
 The current approach when writing consists of encoding column names as UTF8, 
 and passing the value as a byte[] of the String parsed from the command. When 
 performing a read, the column names outputted are the result of the 
 toString() method of the comparator (the result of which is not always 
 meaningful), and values are again treated as raw strings. This is almost 
 certainly broken anywhere that the CF comparator is not UTF8Type and values 
 are anything but strings.
 One possible approach would be to follow HBase's lead and simply allow binary 
 values to be encoded as strings (see: 
 http://wiki.apache.org/hadoop/Hbase/Shell).
