[jira] [Updated] (CASSANDRA-11569) Track message latency across DCs

2016-04-13 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-11569:
--
Attachment: CASSANDRA-11569.patch

> Track message latency across DCs
> 
>
> Key: CASSANDRA-11569
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11569
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Attachments: CASSANDRA-11569.patch
>
>
> Since we have the timestamp at which a message is created and when it arrives, we 
> can compute an approximate transit time relatively easily, which would remove the 
> need for more complex hacks to determine latency between DCs.
> Although the metric is not going to be very meaningful when NTP is not set up, it 
> is pretty common to have NTP configured, and even with clock drift nothing is 
> really hurt except the metric becoming inaccurate.
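The idea in the description can be sketched as follows. This is an illustrative Python sketch, not the attached patch; the class and method names are assumptions. The key point is that the received-minus-created difference is only as trustworthy as NTP, so drifted (negative) samples are discarded rather than recorded.

```python
import time

class CrossDcLatencyTracker:
    """Approximate cross-DC latency from message creation timestamps.

    Accuracy depends on NTP: clock drift between DCs shifts (or even
    negates) the measured values, so negative samples are dropped.
    """

    def __init__(self):
        self.samples_ms = []

    def on_message_received(self, created_at_ms, received_at_ms=None):
        if received_at_ms is None:
            received_at_ms = int(time.time() * 1000)
        latency_ms = received_at_ms - created_at_ms
        if latency_ms >= 0:  # clock drift can make this negative
            self.samples_ms.append(latency_ms)
        return latency_ms
```

In a real implementation the samples would feed a per-DC histogram metric rather than a list.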



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11550) Make the fanout size for LeveledCompactionStrategy to be configurable

2016-04-13 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240467#comment-15240467
 ] 

Dikang Gu commented on CASSANDRA-11550:
---

[~krummas], yes, we do have some write-heavy use cases, and I'm testing 
different fanout sizes to reduce the write amplification of the system, which 
could potentially reduce both CPU and disk usage.

What's your concern with making the fanout size configurable?

Thanks.
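The trade-off being discussed can be sketched with back-of-the-envelope formulas. This is a simplification for illustration, not Cassandra code; the constants and function names are assumptions. Each level in leveled compaction holds roughly `fanout` times more data than the level above it, and a byte can be rewritten up to once per level boundary it crosses.

```python
def level_capacities(fanout, levels=4, base_sstables=4):
    """Per-level sstable capacity: each level holds `fanout` times
    more data than the one above it."""
    return [base_sstables * fanout ** level for level in range(1, levels + 1)]

def max_write_amplification(fanout, levels):
    """Rough upper bound: a byte may be rewritten up to `fanout` times
    per level as it migrates down through `levels` levels."""
    return fanout * levels
```

A larger fanout means fewer levels for the same data volume (lower total rewrites), at the cost of bigger level-to-level compactions, which is why tuning it per workload is attractive.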

> Make the fanout size for LeveledCompactionStrategy to be configurable
> -
>
> Key: CASSANDRA-11550
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11550
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
> Attachments: 
> 0001-make-fanout-size-for-leveledcompactionstrategy-to-be.patch
>
>
> Currently, the fanout size for LeveledCompactionStrategy is hard-coded in the 
> system (10). It would be useful to make the fanout size tunable, so 
> that we can adjust it for different use cases.
> Furthermore, we could then change the size dynamically.





[jira] [Updated] (CASSANDRA-11569) Track message latency across DCs

2016-04-13 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-11569:
--
Status: Patch Available  (was: Open)

> Track message latency across DCs
> 
>
> Key: CASSANDRA-11569
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11569
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Minor
> Attachments: CASSANDRA-11569.patch
>
>
> Since we have the timestamp at which a message is created and when it arrives, we 
> can compute an approximate transit time relatively easily, which would remove the 
> need for more complex hacks to determine latency between DCs.
> Although the metric is not going to be very meaningful when NTP is not set up, it 
> is pretty common to have NTP configured, and even with clock drift nothing is 
> really hurt except the metric becoming inaccurate.





[jira] [Created] (CASSANDRA-11572) SStableloader does not stream data if the Cassandra table was altered to drop some column

2016-04-13 Thread manuj singh (JIRA)
manuj singh created CASSANDRA-11572:
---

 Summary: SStableloader does not stream data if the Cassandra table 
was altered to drop some column
 Key: CASSANDRA-11572
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11572
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
Reporter: manuj singh


SSTableLoader stops working whenever the Cassandra table is altered to drop 
a column. The following error is shown:

Could not retrieve endpoint ranges:
java.lang.IllegalArgumentException
java.lang.RuntimeException: Could not retrieve endpoint ranges:
at 
org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
at 
org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:275)
at 
org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
at 
org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
at 
org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
at 
org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
at 
org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
at 
org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
at 
org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
at 
org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
at 
org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
... 2 more

The only solution is then to drop the table and create it again. 





[jira] [Updated] (CASSANDRA-11374) LEAK DETECTED during repair

2016-04-13 Thread Anubhav Kale (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anubhav Kale updated CASSANDRA-11374:
-
Attachment: Leak_Logs_2.zip
Leak_Logs_1.zip

Attached Leak_Logs*.zip, which show this error on Cassandra 2.1.13 while 
bootstrapping. This is a consistent repro for us. Our node size is ~300 GB.

The process stays up after the leak message, but doesn't do much and the node 
is eventually removed from gossip (thus doesn't show up in gossipinfo / status 
on other nodes).

The only workaround seems to be letting the node boot with auto_bootstrap=false 
and then do a nodetool rebuild.

> LEAK DETECTED during repair
> ---
>
> Key: CASSANDRA-11374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11374
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jean-Francois Gosselin
>Assignee: Marcus Eriksson
> Attachments: Leak_Logs_1.zip, Leak_Logs_2.zip
>
>
> When running a range repair we are seeing the following LEAK DETECTED errors:
> {noformat}
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,261 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@5ee90b43) to class 
> org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@367168611:[[OffHeapBitSet]]
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@4ea9d4a7) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1875396681:Memory@[7f34b905fd10..7f34b9060b7a)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,262 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@27a6b614) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@838594402:Memory@[7f34bae11ce0..7f34bae11d84)
>  was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2016-03-17 06:58:52,263 Ref.java:179 - LEAK 
> DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@64e7b566) to class 
> org.apache.cassandra.io.util.SafeMemory$MemoryTidy@674656075:Memory@[7f342deab4e0..7f342deb7ce0)
>  was not released before the reference was garbage collected
> {noformat}





[jira] [Updated] (CASSANDRA-11571) Optimize the overlapping lookup, by calculating all the bounds in advance.

2016-04-13 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-11571:
--
Attachment: 0001-Optimize-the-overlapping-lookup-by-calculating-all-t.patch

> Optimize the overlapping lookup, by calculating all the bounds in advance.
> --
>
> Key: CASSANDRA-11571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11571
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
> Attachments: 
> 0001-Optimize-the-overlapping-lookup-by-calculating-all-t.patch
>
>
> When L0 sstables back up (because of repair or other reasons), I find that a 
> lot of CPU is used constructing the Bounds.
> {code}
> "CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
> nid=0x2303ab runnable [0x7f824d735000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
> at org.apache.cassandra.dht.Bounds.<init>(Bounds.java:44)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
> - locked <0x7f8e11e67900> (a 
> org.apache.cassandra.db.compaction.LeveledManifest)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
> - locked <0x7f8e11b1d780> (a 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
> - locked <0x7f8e110931a0> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> From the code, we may construct the bounds multiple times; my patch optimizes 
> this by calculating them in advance.
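The optimization described above can be sketched as follows. This is an illustrative Python sketch, not the attached patch; the `Bounds` helper and cache shape are assumptions. Instead of constructing a fresh bound object for every overlap check, each sstable's bounds are computed once and reused:

```python
class Bounds:
    """Closed token range [left, right]."""
    def __init__(self, left, right):
        self.left, self.right = left, right

    def intersects(self, other):
        return self.left <= other.right and other.left <= self.right

def overlapping(first_token, last_token, sstables, bounds_cache):
    """Find sstables overlapping [first_token, last_token], reusing
    per-sstable Bounds computed once instead of on every call."""
    promoted = Bounds(first_token, last_token)
    result = []
    for sstable in sstables:
        if sstable not in bounds_cache:  # computed at most once per sstable
            bounds_cache[sstable] = Bounds(sstable.first, sstable.last)
        if bounds_cache[sstable].intersects(promoted):
            result.append(sstable)
    return result
```

When L0 backs up, `overlapping` is called repeatedly over largely the same sstable set, so amortizing the bound construction removes the hot allocation seen in the stack trace.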





[jira] [Updated] (CASSANDRA-11571) Optimize the overlapping lookup, by calculating all the bounds in advance.

2016-04-13 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-11571:
--
Reviewer: Marcus Eriksson
  Status: Patch Available  (was: Open)

> Optimize the overlapping lookup, by calculating all the bounds in advance.
> --
>
> Key: CASSANDRA-11571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11571
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
> Attachments: 
> 0001-Optimize-the-overlapping-lookup-by-calculating-all-t.patch
>
>
> When L0 sstables back up (because of repair or other reasons), I find that a 
> lot of CPU is used constructing the Bounds.
> {code}
> "CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
> nid=0x2303ab runnable [0x7f824d735000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
> at org.apache.cassandra.dht.Bounds.<init>(Bounds.java:44)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
> - locked <0x7f8e11e67900> (a 
> org.apache.cassandra.db.compaction.LeveledManifest)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
> - locked <0x7f8e11b1d780> (a 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
> - locked <0x7f8e110931a0> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> From the code, we may construct the bounds multiple times; my patch optimizes 
> this by calculating them in advance.





[jira] [Updated] (CASSANDRA-11571) Optimize the overlapping lookup, by calculating all the bounds in advance.

2016-04-13 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-11571:
--
Fix Version/s: 3.x

> Optimize the overlapping lookup, by calculating all the bounds in advance.
> --
>
> Key: CASSANDRA-11571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11571
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Dikang Gu
>Assignee: Dikang Gu
> Fix For: 3.x
>
>
> When L0 sstables back up (because of repair or other reasons), I find that a 
> lot of CPU is used constructing the Bounds.
> {code}
> "CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
> nid=0x2303ab runnable [0x7f824d735000]
>java.lang.Thread.State: RUNNABLE
> at 
> org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
> at org.apache.cassandra.dht.Bounds.<init>(Bounds.java:44)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
> at 
> org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
> - locked <0x7f8e11e67900> (a 
> org.apache.cassandra.db.compaction.LeveledManifest)
> at 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
> - locked <0x7f8e11b1d780> (a 
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
> - locked <0x7f8e110931a0> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
> at 
> org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> From the code, we may construct the bounds multiple times; my patch optimizes 
> this by calculating them in advance.





[jira] [Updated] (CASSANDRA-11571) Optimize the overlapping lookup, by calculating all the bounds in advance.

2016-04-13 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-11571:
--
Description: 
When L0 sstables back up (because of repair or other reasons), I find that a 
lot of CPU is used constructing the Bounds.

{code}
"CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
nid=0x2303ab runnable [0x7f824d735000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
at org.apache.cassandra.dht.Bounds.(Bounds.java:44)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
- locked <0x7f8e11e67900> (a 
org.apache.cassandra.db.compaction.LeveledManifest)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
- locked <0x7f8e11b1d780> (a 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
- locked <0x7f8e110931a0> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

From the code, we may construct the bounds multiple times; my patch optimizes 
this by calculating them in advance.

  was:
When L0 sstables back up (because of repair or other reasons), I find that a 
lot of CPU is used constructing the Bounds.

"CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
nid=0x2303ab runnable [0x7f824d735000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
at org.apache.cassandra.dht.Bounds.(Bounds.java:44)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
- locked <0x7f8e11e67900> (a 
org.apache.cassandra.db.compaction.LeveledManifest)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
- locked <0x7f8e11b1d780> (a 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
- locked <0x7f8e110931a0> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

From the code, we may construct the bounds multiple times; my patch optimizes 
this by calculating them in advance.


> Optimize the overlapping lookup, by calculating all the bounds in advance.
> --
>
> Key: CASSANDRA-11571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11571
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> When L0 sstables back up (because of repair or other reasons), I find that a 
> lot of CPU is used constructing the Bounds.
> {code}
> "CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 

[jira] [Updated] (CASSANDRA-11571) Optimize the overlapping lookup, by calculating all the bounds in advance.

2016-04-13 Thread Dikang Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dikang Gu updated CASSANDRA-11571:
--
Description: 
When L0 sstables back up (because of repair or other reasons), I find that a 
lot of CPU is used constructing the Bounds.

{code}
"CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
nid=0x2303ab runnable [0x7f824d735000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
at org.apache.cassandra.dht.Bounds.(Bounds.java:44)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
- locked <0x7f8e11e67900> (a 
org.apache.cassandra.db.compaction.LeveledManifest)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
- locked <0x7f8e11b1d780> (a 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
- locked <0x7f8e110931a0> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

From the code, we may construct the bounds multiple times; my patch optimizes 
this by calculating them in advance.

  was:
When L0 sstables back up (because of repair or other reasons), I find that a 
lot of CPU is used constructing the Bounds.

{code}
"CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
nid=0x2303ab runnable [0x7f824d735000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
at org.apache.cassandra.dht.Bounds.(Bounds.java:44)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
- locked <0x7f8e11e67900> (a 
org.apache.cassandra.db.compaction.LeveledManifest)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
- locked <0x7f8e11b1d780> (a 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
- locked <0x7f8e110931a0> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

From the code, we may construct the bounds multiple times; my patch optimizes 
this by calculating them in advance.


> Optimize the overlapping lookup, by calculating all the bounds in advance.
> --
>
> Key: CASSANDRA-11571
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11571
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Compaction
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>
> When L0 sstables back up (because of repair or other reasons), I find that a 
> lot of CPU is used constructing the Bounds.
> {code}
> "CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 

[jira] [Created] (CASSANDRA-11571) Optimize the overlapping lookup, by calculating all the bounds in advance.

2016-04-13 Thread Dikang Gu (JIRA)
Dikang Gu created CASSANDRA-11571:
-

 Summary: Optimize the overlapping lookup, by calculating all the 
bounds in advance.
 Key: CASSANDRA-11571
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11571
 Project: Cassandra
  Issue Type: Improvement
  Components: Compaction
Reporter: Dikang Gu
Assignee: Dikang Gu


When L0 sstables back up (because of repair or other reasons), I find that a 
lot of CPU is used constructing the Bounds.

"CompactionExecutor:223" #1557 daemon prio=1 os_prio=4 tid=0x7f88f401d800 
nid=0x2303ab runnable [0x7f824d735000]
   java.lang.Thread.State: RUNNABLE
at 
org.apache.cassandra.dht.AbstractBounds.strictlyWrapsAround(AbstractBounds.java:86)
at org.apache.cassandra.dht.Bounds.(Bounds.java:44)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:533)
at 
org.apache.cassandra.db.compaction.LeveledManifest.overlapping(LeveledManifest.java:520)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCandidatesFor(LeveledManifest.java:595)
at 
org.apache.cassandra.db.compaction.LeveledManifest.getCompactionCandidates(LeveledManifest.java:349)
- locked <0x7f8e11e67900> (a 
org.apache.cassandra.db.compaction.LeveledManifest)
at 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getNextBackgroundTask(LeveledCompactionStrategy.java:97)
- locked <0x7f8e11b1d780> (a 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.getNextBackgroundTask(WrappingCompactionStrategy.java:78)
- locked <0x7f8e110931a0> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:250)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

From the code, we may construct the bounds multiple times; my patch optimizes 
this by calculating them in advance.





[jira] [Commented] (CASSANDRA-5977) Structure for cfstats output (JSON, YAML, or XML)

2016-04-13 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240150#comment-15240150
 ] 

Yuki Morishita commented on CASSANDRA-5977:
---

Thanks for the update. Overall, we are headed in the right direction.

A couple of comments:

* I think you don't need to use {{EMPTY}} from commons-lang3. Just use {{""}} 
instead; you can check it with {{String#isEmpty()}}.
* The new {{StatsKeyspace}} and the existing {{KeyspaceStats}} look almost the 
same. Can you merge those two?

And this one is debatable, but this is the JSON (and YAML) format I prefer:

{code:javascript}
{
  "system": {
    "tables": {
      "local": {
        "...": 1
      }
    }
  }
}
{code}

over what the patch outputs now:

{code:javascript}
{
  "keyspaces": [
    {
      "name": "system",
      "tables": [
        {
          "name": "local"
        }
      ]
    }
  ]
}

WDYT?
Is it doable to print out the former?
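The restructuring proposed in the comment, turning the name-keyed arrays into nested maps, can be sketched like this. This is an illustrative helper, not part of the patch; the function name and the `sstables` attribute in the example are assumptions:

```python
def nest_by_name(stats):
    """Convert an array-of-objects stats layout into a nested-map
    layout keyed by keyspace name and table name."""
    nested = {}
    for keyspace in stats["keyspaces"]:
        tables = {}
        for table in keyspace["tables"]:
            # keep all attributes except the name, which becomes the key
            attrs = {k: v for k, v in table.items() if k != "name"}
            tables[table["name"]] = attrs
        nested[keyspace["name"]] = {"tables": tables}
    return nested
```

The nested form is friendlier for lookups (`stats["system"]["tables"]["local"]`) at the cost of losing a stable ordering, which may be why it is flagged as debatable.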

> Structure for cfstats output (JSON, YAML, or XML)
> -
>
> Key: CASSANDRA-5977
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5977
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Alyssa Kwan
>Assignee: Shogo Hoshii
>Priority: Minor
>  Labels: Tools
> Fix For: 3.x
>
> Attachments: CASSANDRA-5977-trunk.patch, sample_result.zip, 
> tablestats_sample_result.json, tablestats_sample_result.txt, 
> tablestats_sample_result.yaml, trunk-tablestats.patch, trunk-tablestats.patch
>
>
> nodetool cfstats should take a --format arg that structures the output in 
> JSON, YAML, or XML.  This would be useful for piping into another script that 
> can easily parse this and act on it.  It would also help those of us who use 
> things like MCollective gather aggregate stats across clusters/nodes.
> Thoughts?  I can submit a patch.





[jira] [Commented] (CASSANDRA-11340) Heavy read activity on system_auth tables can cause apparent livelock

2016-04-13 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240140#comment-15240140
 ] 

Jeff Jirsa commented on CASSANDRA-11340:


Apologies for not being as diligent with followups as you deserve. I suspect 
it's not the size of the cluster, but the number (and rate) of clients 
reconnecting that triggers this. We're sitting at about 2k connected (tcp/9042 
ESTABLISHED) clients per server.

I do appreciate you trying to reproduce. I'm going to try again, as well.




> Heavy read activity on system_auth tables can cause apparent livelock
> -
>
> Key: CASSANDRA-11340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11340
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Jeff Jirsa
>Assignee: Aleksey Yeschenko
> Attachments: mass_connect.py, prepare_mass_connect.py
>
>
> Reproduced in at least 2.1.9. 
> It appears possible for queries against system_auth tables to trigger 
> speculative retry, which causes auth to block on traffic going off node. In 
> some cases, it appears possible for threads to become deadlocked, causing 
> load on the nodes to increase sharply. This happens even in clusters where the 
> RF of system_auth == N, since all requests being served locally puts the bar 
> for the 99th-percentile speculative retry (SR) threshold pretty low. 
> Incomplete stack trace below, but we haven't yet figured out what exactly is 
> blocking:
> {code}
> Thread 82291: (state = BLOCKED)
>  - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information 
> may be imprecise)
>  - java.util.concurrent.locks.LockSupport.parkNanos(long) @bci=11, line=338 
> (Compiled frame)
>  - 
> org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUntil(long)
>  @bci=28, line=307 (Compiled frame)
>  - org.apache.cassandra.utils.concurrent.SimpleCondition.await(long, 
> java.util.concurrent.TimeUnit) @bci=76, line=63 (Compiled frame)
>  - org.apache.cassandra.service.ReadCallback.await(long, 
> java.util.concurrent.TimeUnit) @bci=25, line=92 (Compiled frame)
>  - 
> org.apache.cassandra.service.AbstractReadExecutor$SpeculatingReadExecutor.maybeTryAdditionalReplicas()
>  @bci=39, line=281 (Compiled frame)
>  - org.apache.cassandra.service.StorageProxy.fetchRows(java.util.List, 
> org.apache.cassandra.db.ConsistencyLevel) @bci=175, line=1338 (Compiled frame)
>  - org.apache.cassandra.service.StorageProxy.readRegular(java.util.List, 
> org.apache.cassandra.db.ConsistencyLevel) @bci=9, line=1274 (Compiled frame)
>  - org.apache.cassandra.service.StorageProxy.read(java.util.List, 
> org.apache.cassandra.db.ConsistencyLevel, 
> org.apache.cassandra.service.ClientState) @bci=57, line=1199 (Compiled frame)
>  - 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(org.apache.cassandra.service.pager.Pageable,
>  org.apache.cassandra.cql3.QueryOptions, int, long, 
> org.apache.cassandra.service.QueryState) @bci=35, line=272 (Compiled frame)
>  - 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(org.apache.cassandra.service.QueryState,
>  org.apache.cassandra.cql3.QueryOptions) @bci=105, line=224 (Compiled frame)
>  - org.apache.cassandra.auth.Auth.selectUser(java.lang.String) @bci=27, 
> line=265 (Compiled frame)
>  - org.apache.cassandra.auth.Auth.isExistingUser(java.lang.String) @bci=1, 
> line=86 (Compiled frame)
>  - 
> org.apache.cassandra.service.ClientState.login(org.apache.cassandra.auth.AuthenticatedUser)
>  @bci=11, line=206 (Compiled frame)
>  - 
> org.apache.cassandra.transport.messages.AuthResponse.execute(org.apache.cassandra.service.QueryState)
>  @bci=58, line=82 (Compiled frame)
>  - 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(io.netty.channel.ChannelHandlerContext,
>  org.apache.cassandra.transport.Message$Request) @bci=75, line=439 (Compiled 
> frame)
>  - 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(io.netty.channel.ChannelHandlerContext,
>  java.lang.Object) @bci=6, line=335 (Compiled frame)
>  - 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(io.netty.channel.ChannelHandlerContext,
>  java.lang.Object) @bci=17, line=105 (Compiled frame)
>  - 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(java.lang.Object)
>  @bci=9, line=333 (Compiled frame)
>  - 
> io.netty.channel.AbstractChannelHandlerContext.access$700(io.netty.channel.AbstractChannelHandlerContext,
>  java.lang.Object) @bci=2, line=32 (Compiled frame)
>  - io.netty.channel.AbstractChannelHandlerContext$8.run() @bci=8, line=324 
> (Compiled frame)
>  - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=511 
> (Compiled frame)
>  - 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run()
>  @bci=5, line=164 (Compiled frame)
>  - org.apache.cassandra.concurrent.SEPWorker.run() 

[jira] [Commented] (CASSANDRA-11566) read time out when do count(*)

2016-04-13 Thread nizar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240124#comment-15240124
 ] 

nizar commented on CASSANDRA-11566:
---

Hello Tyler, thanks for answering my question. So how should this use case be 
handled: by using Cassandra's special COUNTER type to track the count of things?
And why is count(*) timing out? As you can see from my nodetool cfstats output 
above, the total number of records is not very large.
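For reference, a hedged sketch of that counter approach (the table and column names here are made up for illustration, they are not from the schema above):

{noformat}
CREATE TABLE test.flag_counts (
    s_id text,
    flag boolean,
    cnt counter,
    PRIMARY KEY ((s_id, flag))
);

-- incremented on every write instead of scanning at read time
UPDATE test.flag_counts SET cnt = cnt + 1 WHERE s_id = 'some-id' AND flag = false;
{noformat}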

> read time out when do count(*)
> --
>
> Key: CASSANDRA-11566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11566
> Project: Cassandra
>  Issue Type: Bug
> Environment: staging
>Reporter: nizar
> Fix For: 3.3
>
>
> Hello, I am using Cassandra 3.3 (DataStax). I keep getting read timeouts even 
> if I set the limit to 1. It would make sense if the limit were a high number, 
> but timing out with limit 1 seems odd?
> [cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
> cqlsh:test> select count(*) from test.my_view where s_id=? and flag=false 
> limit 1;
> OperationTimedOut: errors={}, last_host=
> my key look like this :
> CREATE MATERIALIZED VIEW test.my_view AS
>   SELECT *
>   FROM table_name
> WHERE id IS NOT NULL AND processed IS NOT NULL AND time IS NOT NULL AND id 
> IS NOT NULL
>   PRIMARY KEY ( ( s_id, flag ), time, id )
>   WITH CLUSTERING ORDER BY ( time ASC );
>  I have 5 nodes with replica 3
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc': '3'}  AND durable_writes = true;
> Below is the result of nodetool cfstats
> Keyspace: test
> Read Count: 128770
> Read Latency: 1.42208769123243 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Flushes: 0
> Table: tableName
> SSTable count: 3
> Space used (live): 280777032
> Space used (total): 280777032
> Space used by snapshots (total): 0
> Off heap memory used (total): 2850227
> SSTable Compression Ratio: 0.24706731995327527
> Number of keys (estimate): 1277211
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 0
> Local read count: 3
> Local read latency: 0.396 ms
> Local write count: 0
> Local write latency: NaN ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 1589848
> Bloom filter off heap memory used: 1589824
> Index summary off heap memory used: 1195691
> Compression metadata off heap memory used: 64712
> Compacted partition minimum bytes: 311
> Compacted partition maximum bytes: 535
> Compacted partition mean bytes: 458
> Average live cells per slice (last five minutes): 102.92671205446536
> Maximum live cells per slice (last five minutes): 103
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Table: my_view
> SSTable count: 4
> Space used (live): 126114270
> Space used (total): 126114270
> Space used by snapshots (total): 0
> Off heap memory used (total): 91588
> SSTable Compression Ratio: 0.1652453778228639
> Number of keys (estimate): 8
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 0
> Local read count: 128767
> Local read latency: 1.590 ms
> Local write count: 0
> Local write latency: NaN ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 96
> Bloom filter off heap memory used: 64
> Index summary off heap memory used: 140
> Compression metadata off heap memory used: 91384
> Compacted partition minimum bytes: 3974
> Compacted partition maximum bytes: 386857368
> Compacted partition mean bytes: 26034715
> Average live cells per slice (last five minutes): 102.99462595230145
> Maximum live cells per slice (last five minutes): 103
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Thank you.
> Nizar





[jira] [Comment Edited] (CASSANDRA-9666) Provide an alternative to DTCS

2016-04-13 Thread Lucas de Souza Santos (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240056#comment-15240056
 ] 

Lucas de Souza Santos edited comment on CASSANDRA-9666 at 4/13/16 10:02 PM:


I work for the biggest online news site in Brazil, where we are building a 
timeseries system with Cassandra as the persistence layer. At the end of last 
year we decided to use DTCS. This timeseries implementation is rolling out to 
production to replace a legacy Cacti deployment.

Right now the cluster is composed of 10 Dell PE R6XX machines, some 610 and 
some 620, all with SAS disks, 8 CPUs, and 32GB of RAM, running Linux CentOS 6 
with kernel 2.6.32.

Since Jan 20 2016 we have been running cassandra21-2.1.12 on JRE 7. At that 
point we were just doing some tests, receiving ~140k points/minute. The cluster 
was fine, using STCS (the default) as the compaction strategy.

At the end of February I switched to DTCS and we doubled the load to around 
200k points/minute. A week later we saw CPU load growing, together with disk 
space and memory usage. At first we thought it was our own increased usage, so 
we built some dashboards to visualize the data.

About 3 weeks ago the cluster started to see timeouts, and we lost a node at 
least twice; a reboot was needed to bring it back.

Things I have done trying to fix/improve the cluster:

Upgraded JRE 7 to JDK 8, configured the G1 GC, and lowered 
memtable_cleanup_threshold to 0.10 (it was 0.20; raising this value made the 
problem worse).

Changed all applications using Cassandra to consistency ONE, because GC pauses 
were pushing nodes out of the cluster and we were seeing a lot of timeouts.
After those changes the cluster behaved better, but we were not confident about 
growing the number of requests. Last week I noticed that restarting any node 
took at least 15 minutes, sometimes 30, just to load/open sstables. I checked 
the data on disk and saw that Cassandra had created more than 10 million 
sstables. I couldn't even do a simple "ls" in any datadir (I have 14 
keyspaces).

Searching for Cassandra issues with DTCS, we found TWCS as an alternative, 
along with reports of several of the problems we had seen with DTCS. Afraid of 
a crash in production, I couldn't even wait for a complete test in QA, so I 
decided to apply TWCS to our biggest keyspace. The result was impressive: from 
more than 2.5 million sstables down to around 30 per node (after full 
compaction). No data loss, and at that point no change in load or memory. Given 
these results, yesterday (03/12/2016) I decided to apply TWCS to all 14 
keyspaces, and today the result, at least for me, is mind-blowing.

Now I have around 500 sstables per node, summed across all keyspaces: from 10 
million down to 500! The load5 dropped from ~6 to ~0.5, and Cassandra released 
around 3GB of RAM per node. Disk usage dropped from ~150GB to ~120GB. Right 
after that, the number of requests went up from 120k to 190k per minute and we 
are seeing no change in load.
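For anyone wanting to make the same switch, it can be applied in place with an ALTER TABLE; a sketch, assuming the TWCS jar is already on each node's classpath (the keyspace name here is a placeholder):

{noformat}
ALTER TABLE my_keyspace.ts_number WITH compaction = {
    'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': '7'};
{noformat}

Existing sstables are only regrouped into the new windows once compaction runs; the numbers above were taken after a full compaction.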



Our DTCS create table:
CREATE TABLE IF NOT EXISTS %s.ts_number (id text, date timeuuid, value double, 
PRIMARY KEY (id, date))
WITH CLUSTERING ORDER BY (date ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys':'ALL', 'rows_per_partition':'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy', 
'tombstone_compaction_interval': '7', 'min_threshold': '8', 'max_threshold': 
'64', 'timestamp_resolution': 'MILLISECONDS', 'base_time_seconds': '3600', 
'max_sstable_age_days': '365'}
AND compression = {'crc_check_chance': '0.5', 'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = %d
AND gc_grace_seconds = 0
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';


TWCS:
CREATE TABLE .ts_number (
id text,
date timeuuid,
value double,
PRIMARY KEY (id, date)
) WITH CLUSTERING ORDER BY (date ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"100"}'
AND comment = ''
AND compaction = {'compaction_window_unit': 'DAYS', 
'compaction_window_size': '7', 'class': 
'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy'}
AND compression = {'crc_check_chance': '0.5', 'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 0
AND max_index_interval = 

[jira] [Created] (CASSANDRA-11570) Concurrent execution of prepared statement returns invalid JSON as result

2016-04-13 Thread Alexander Ryabets (JIRA)
Alexander Ryabets created CASSANDRA-11570:
-

 Summary: Concurrent execution of prepared statement returns 
invalid JSON as result
 Key: CASSANDRA-11570
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11570
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 3.2, C++ or C# driver
Reporter: Alexander Ryabets
 Attachments: CassandraPreparedStatementsTest.zip, test_neptunao.cql

When I use a prepared statement for async execution of multiple statements, I 
get JSON with broken data: the keys are totally corrupted while the values 
appear normal.

I first encountered this issue while stress testing our project with a custom 
script. We are using the DataStax C++ driver and execute statements from 
different fibers.

To isolate the problem, I wrote a simple C# program which starts multiple Tasks 
in a loop. Each task uses a single shared prepared statement to read data from 
the database. As you can see, the results are a total mess.

I've attached an archive with a console C# project (one .cs file) which just 
prints the resulting JSON. 
Here is the main part of the C# code.

{noformat}
static void Main(string[] args)
{
  const int task_count = 300;

  using(var cluster = Cluster.Builder().AddContactPoints("127.0.0.1").Build())
  {
    using(var session = cluster.Connect())
    {
      var prepared = session.Prepare("select json * from test_neptunao.ubuntu");
      var tasks = new Task[task_count];
      for(int i = 0; i < task_count; i++)
      {
        tasks[i] = Query(prepared, session);
      }
      Task.WaitAll(tasks);
    }
  }
  Console.ReadKey();
}

private static Task Query(PreparedStatement prepared, ISession session)
{
  var stmt = prepared.Bind();
  stmt.SetConsistencyLevel(ConsistencyLevel.One);
  return session.ExecuteAsync(stmt).ContinueWith(tr =>
  {
    foreach(var row in tr.Result)
    {
      // GetValue is generic in the DataStax C# driver
      var value = row.GetValue<string>(0);
      Console.WriteLine(value);
    }
  });
}
{noformat}

I also attached a CQL script with the test DB schema.

{noformat}
CREATE KEYSPACE IF NOT EXISTS test_neptunao
WITH replication = {
'class' : 'SimpleStrategy',
'replication_factor' : 3
};

use test_neptunao;

create table if not exists ubuntu (
id timeuuid PRIMARY KEY,
precise_pangolin text,
trusty_tahr text,
wily_werewolf text, 
vivid_vervet text,
saucy_salamander text,
lucid_lynx text
);
{noformat}





[jira] [Commented] (CASSANDRA-9666) Provide an alternative to DTCS

2016-04-13 Thread Lucas de Souza Santos (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15240056#comment-15240056
 ] 

Lucas de Souza Santos commented on CASSANDRA-9666:
--

I work for the biggest online news site in Brazil, where we are building a 
timeseries system with Cassandra as the persistence layer. At the end of last 
year we decided to use DTCS. This timeseries implementation is rolling out to 
production to replace a legacy Cacti deployment.

Right now the cluster is composed of 10 Dell PE R6XX machines, some 610 and 
some 620, all with SAS disks, 8 CPUs, and 32GB of RAM, running Linux CentOS 6 
with kernel 2.6.32.

Since Jan 20 2016 we are running cassandra21-2.1.12 over JRE 7. At that point 
we were just doing some tests, receiving ~140k points/minute. The cluster was 
fine and using STCS (the default) as compaction strategy.

At the end of February I changed to DTCS and we doubled the load, passing to 
around 200k points/minute. A week after, we saw the cpu load growing up, 
together with disc space and memory. First we thought it was us using more, so 
we built some dashboards to visualize the data.

About 3 weeks ago the cluster started to get some timeouts and we lost a node 
at least two times, a reboot was needed to get the node back.

Things I have done trying to fix/improve the cluster:

Upgraded JRE 7 to JDK 8, configured the G1 GC, and lowered 
memtable_cleanup_threshold to 0.10 (it was 0.20; raising this value made the 
problem worse).

Changed all applications using cassandra to use consistency ONE because GC 
pause was putting nodes out of the cluster and we were receiving a lot of 
timeouts.
After those changes the cluster was better to use but we were not confident in 
growing the number of requests. Last week I noticed a problem when restarting 
any node, it took at least 15 minutes, sometimes 30 minutes, just to load/open 
sstables. I checked the data on disc and saw that cassandra created more than 
10 million sstables. I couldn't do a simple "ls" in any datadir (I have 14 
keyspaces).

> Provide an alternative to DTCS
> --
>
> Key: CASSANDRA-9666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9666
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
> Fix For: 2.1.x, 2.2.x
>
> Attachments: dashboard-DTCS_to_TWCS.png, dtcs-twcs-io.png, 
> dtcs-twcs-load.png
>
>
> DTCS is great for time series data, but it comes with caveats that make it 
> difficult to use in production (typical operator behaviors such as bootstrap, 
> removenode, and repair have MAJOR caveats as they relate to 
> max_sstable_age_days, and hints/read repair break the selection algorithm).
> I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices 
> the tiered nature of DTCS in order to address some of DTCS' operational 
> shortcomings. I believe it is necessary to propose an alternative rather than 
> simply adjusting DTCS, because it fundamentally removes the tiered nature in 
> order to remove the parameter max_sstable_age_days - the result is very very 
> different, even if it is heavily inspired by DTCS. 
> Specifically, rather than creating a number of windows of ever increasing 
> sizes, this strategy allows an operator to choose the window size, compact 
> with STCS within the first window of that size, and aggressive compact down 
> to a single sstable once that window is no longer current. The window size is 
> a combination of unit (minutes, hours, days) and size (1, etc), such that an 
> operator can expect all data using a block of that size to be compacted 
> together (that is, if your unit is hours, and size is 6, you will create 
> roughly 4 sstables per day, each one containing roughly 6 hours of data). 
> The result addresses a number of the problems with 
> DateTieredCompactionStrategy:
> - At the present time, DTCS’s first window is compacted using an unusual 
> selection criteria, which prefers files with earlier timestamps, but ignores 
> sizes. In TimeWindowCompactionStrategy, the first window data will be 
> compacted with the well tested, fast, reliable STCS. All STCS options can be 
> passed to TimeWindowCompactionStrategy to configure the first window’s 
> compaction behavior.
> - HintedHandoff may put old data in new sstables, but it will have little 
> impact other than slightly reduced efficiency (sstables will cover a wider 
> range, but the old timestamps will not impact sstable selection criteria 
> during compaction)
> - ReadRepair may put old data in new sstables, but it will have little impact 
> other than slightly reduced efficiency (sstables will cover a wider range, 
> but the old timestamps will not impact sstable selection criteria during 
> compaction)
> - Small, old sstables resulting from streams of any kind will be swiftly and 
> aggressively compacted with the other sstables matching their similar 
> maxTimestamp, without causing sstables in neighboring windows to grow in size.

[jira] [Updated] (CASSANDRA-9666) Provide an alternative to DTCS

2016-04-13 Thread Lucas de Souza Santos (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lucas de Souza Santos updated CASSANDRA-9666:
-
Attachment: dashboard-DTCS_to_TWCS.png

> Provide an alternative to DTCS
> --
>
> Key: CASSANDRA-9666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9666
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jeff Jirsa
>Assignee: Jeff Jirsa
> Fix For: 2.1.x, 2.2.x
>
> Attachments: dashboard-DTCS_to_TWCS.png, dtcs-twcs-io.png, 
> dtcs-twcs-load.png
>
>
> DTCS is great for time series data, but it comes with caveats that make it 
> difficult to use in production (typical operator behaviors such as bootstrap, 
> removenode, and repair have MAJOR caveats as they relate to 
> max_sstable_age_days, and hints/read repair break the selection algorithm).
> I'm proposing an alternative, TimeWindowCompactionStrategy, that sacrifices 
> the tiered nature of DTCS in order to address some of DTCS' operational 
> shortcomings. I believe it is necessary to propose an alternative rather than 
> simply adjusting DTCS, because it fundamentally removes the tiered nature in 
> order to remove the parameter max_sstable_age_days - the result is very very 
> different, even if it is heavily inspired by DTCS. 
> Specifically, rather than creating a number of windows of ever increasing 
> sizes, this strategy allows an operator to choose the window size, compact 
> with STCS within the first window of that size, and aggressive compact down 
> to a single sstable once that window is no longer current. The window size is 
> a combination of unit (minutes, hours, days) and size (1, etc), such that an 
> operator can expect all data using a block of that size to be compacted 
> together (that is, if your unit is hours, and size is 6, you will create 
> roughly 4 sstables per day, each one containing roughly 6 hours of data). 
> The result addresses a number of the problems with 
> DateTieredCompactionStrategy:
> - At the present time, DTCS’s first window is compacted using an unusual 
> selection criteria, which prefers files with earlier timestamps, but ignores 
> sizes. In TimeWindowCompactionStrategy, the first window data will be 
> compacted with the well tested, fast, reliable STCS. All STCS options can be 
> passed to TimeWindowCompactionStrategy to configure the first window’s 
> compaction behavior.
> - HintedHandoff may put old data in new sstables, but it will have little 
> impact other than slightly reduced efficiency (sstables will cover a wider 
> range, but the old timestamps will not impact sstable selection criteria 
> during compaction)
> - ReadRepair may put old data in new sstables, but it will have little impact 
> other than slightly reduced efficiency (sstables will cover a wider range, 
> but the old timestamps will not impact sstable selection criteria during 
> compaction)
> - Small, old sstables resulting from streams of any kind will be swiftly and 
> aggressively compacted with the other sstables matching their similar 
> maxTimestamp, without causing sstables in neighboring windows to grow in size.
> - The configuration options are explicit and straightforward - the tuning 
> parameters leave little room for error. The window is set in common, easily 
> understandable terms such as “12 hours”, “1 Day”, “30 days”. The 
> minute/hour/day options are granular enough for users keeping data for hours, 
> and users keeping data for years. 
> - There is no explicitly configurable max sstable age, though sstables will 
> naturally stop compacting once new data is written in that window. 
> - Streaming operations can create sstables with old timestamps, and they'll 
> naturally be joined together with sstables in the same time bucket. This is 
> true for bootstrap/repair/sstableloader/removenode. 
> - It remains true that if old data and new data is written into the memtable 
> at the same time, the resulting sstables will be treated as if they were new 
> sstables, however, that no longer negatively impacts the compaction 
> strategy’s selection criteria for older windows. 
> Patch provided for : 
> - 2.1: https://github.com/jeffjirsa/cassandra/commits/twcs-2.1 
> - 2.2: https://github.com/jeffjirsa/cassandra/commits/twcs-2.2
> - trunk (post-8099):  https://github.com/jeffjirsa/cassandra/commits/twcs 
> Rebased, force-pushed July 18, with bug fixes for estimated pending 
> compactions and potential starvation if more than min_threshold tables 
> existed in current window but STCS did not consider them viable candidates
> Rebased, force-pushed Aug 20 to bring in relevant logic from CASSANDRA-9882
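The windowing described above can be sketched in a few lines (Python; the helper names are illustrative, not Cassandra's actual implementation): each sstable's maximum timestamp is floored to the start of its window, and sstables sharing a window are compacted together.

```python
# Illustrative sketch of TWCS-style window bucketing; names and structure
# are hypothetical, not the actual Cassandra implementation.
UNIT_SECONDS = {"MINUTES": 60, "HOURS": 3600, "DAYS": 86400}

def window_start(ts_seconds, unit="HOURS", size=6):
    """Floor a timestamp to the start of its compaction window."""
    window = UNIT_SECONDS[unit] * size
    return ts_seconds - (ts_seconds % window)

def bucket_sstables(max_timestamps, unit="HOURS", size=6):
    """Group sstables (represented by their max timestamps) by window."""
    buckets = {}
    for ts in max_timestamps:
        buckets.setdefault(window_start(ts, unit, size), []).append(ts)
    return buckets

# With unit=HOURS and size=6 there are 4 windows per day, so a full day of
# data settles into roughly 4 buckets (eventually one sstable each).
day = [i * 3600 for i in range(24)]  # one sstable max-timestamp per hour
print(len(bucket_sstables(day)))  # 4
```

This is why streamed sstables with old timestamps land in the right bucket naturally: the bucket is derived from data timestamps, not from when the file was created.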



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11117) ColUpdateTimeDeltaHistogram histogram overflow

2016-04-13 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-11117:

Reproduced In: 3.0.0
Fix Version/s: 3.x
   3.0.x

I am reproducing this in some of my tests in 3.0.0

> ColUpdateTimeDeltaHistogram histogram overflow
> --
>
> Key: CASSANDRA-11117
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11117
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Chris Lohfink
>Priority: Minor
> Fix For: 3.0.x, 3.x
>
>
> {code}
> getting attribute Mean of 
> org.apache.cassandra.metrics:type=ColumnFamily,name=ColUpdateTimeDeltaHistogram
>  threw an exceptionjavax.management.RuntimeMBeanException: 
> java.lang.IllegalStateException: Unable to compute ceiling for max when 
> histogram overflowed
> {code}
> Given that this histogram already has 164 buckets, I wonder if there is 
> something weird with the computation that's causing the values to be so 
> large? It appears to be coming from updates to system.local
> {code}
> org.apache.cassandra.metrics:type=Table,keyspace=system,scope=local,name=ColUpdateTimeDeltaHistogram
> {code}
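For context, this metric is backed by a histogram with exponentially growing bucket boundaries that errors once a recorded value exceeds the largest boundary. A simplified sketch of that failure mode (Python; the growth factor and class shape are assumptions, not Cassandra's exact implementation):

```python
# Simplified sketch of an estimated histogram with exponentially growing
# bucket boundaries; the 1.2 growth factor is an assumption, not
# Cassandra's exact bucket layout.
def make_offsets(n_buckets, factor=1.2):
    offsets, last = [], 1
    for _ in range(n_buckets):
        offsets.append(last)
        last = max(last + 1, int(last * factor))
    return offsets

class EstimatedHistogram:
    def __init__(self, n_buckets=164):
        self.offsets = make_offsets(n_buckets)
        self.overflowed = False

    def add(self, value):
        # A value larger than the last boundary cannot be bucketed.
        if value > self.offsets[-1]:
            self.overflowed = True

    def max(self):
        if self.overflowed:
            raise ValueError(
                "Unable to compute ceiling for max when histogram overflowed")
        return self.offsets[-1]

h = EstimatedHistogram()
h.add(10**30)  # e.g. a delta computed from a wildly wrong timestamp
try:
    h.max()
except ValueError as e:
    print(e)  # the same kind of error the MBean surfaces
```

A single absurd update delta (for instance, one derived from an uninitialized or far-future timestamp) is enough to poison the max computation for the whole histogram.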



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


svn commit: r1739007 - in /cassandra/site: publish/download/index.html publish/index.html src/settings.py

2016-04-13 Thread jake
Author: jake
Date: Wed Apr 13 20:32:02 2016
New Revision: 1739007

URL: http://svn.apache.org/viewvc?rev=1739007&view=rev
Log:
3.5 rel

Modified:
cassandra/site/publish/download/index.html
cassandra/site/publish/index.html
cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1739007&r1=1739006&r2=1739007&view=diff
==
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Wed Apr 13 20:32:02 2016
@@ -49,34 +49,16 @@
 
 Cassandra is moving to a monthly release process called Tick-Tock.  
Even-numbered releases (e.g. 3.2) contain new features; 
odd-numbered releases (e.g. 3.3) contain bug fixes only.  If a critical 
bug is found, a patch will be released against the most recent bug fix release. 
 http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/;>Read 
more about tick-tock here.
 
-The latest tick-tock release is 3.4, released on
-2016-03-08.
+The latest tick-tock release is 3.5, released on
+2016-04-13.
 
 
 
   
-  http://www.apache.org/dyn/closer.lua/cassandra/3.4/apache-cassandra-3.4-bin.tar.gz;>apache-cassandra-3.4-bin.tar.gz
-  [http://www.apache.org/dist/cassandra/3.4/apache-cassandra-3.4-bin.tar.gz.asc;>PGP]
-  [http://www.apache.org/dist/cassandra/3.4/apache-cassandra-3.4-bin.tar.gz.md5;>MD5]
-  [http://www.apache.org/dist/cassandra/3.4/apache-cassandra-3.4-bin.tar.gz.sha1;>SHA1]
-  
-  
-   http://wiki.apache.org/cassandra/DebianPackaging;>Debian 
installation instructions
-  
-
-
-
-
-The previous tick-tock bugfix release is 3.3, released on
-2016-02-09.
-
-
-
-  
-  http://www.apache.org/dyn/closer.lua/cassandra/3.3/apache-cassandra-3.3-bin.tar.gz;>apache-cassandra-3.3-bin.tar.gz
-  [http://www.apache.org/dist/cassandra/3.3/apache-cassandra-3.3-bin.tar.gz.asc;>PGP]
-  [http://www.apache.org/dist/cassandra/3.3/apache-cassandra-3.3-bin.tar.gz.md5;>MD5]
-  [http://www.apache.org/dist/cassandra/3.3/apache-cassandra-3.3-bin.tar.gz.sha1;>SHA1]
+  http://www.apache.org/dyn/closer.lua/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz;>apache-cassandra-3.5-bin.tar.gz
+  [http://www.apache.org/dist/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc;>PGP]
+  [http://www.apache.org/dist/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.md5;>MD5]
+  [http://www.apache.org/dist/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.sha1;>SHA1]
   
   
http://wiki.apache.org/cassandra/DebianPackaging;>Debian 
installation instructions

Modified: cassandra/site/publish/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1739007&r1=1739006&r2=1739007&view=diff
==
--- cassandra/site/publish/index.html (original)
+++ cassandra/site/publish/index.html Wed Apr 13 20:32:02 2016
@@ -77,7 +77,7 @@
   
   
   
-  http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/;>Tick-Tock
 release 3.4 (http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-3.4;>Changes)
+  http://www.planetcassandra.org/blog/cassandra-2-2-3-0-and-beyond/;>Tick-Tock
 release 3.5 (http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-3.5;>Changes)
   
   
 

Modified: cassandra/site/src/settings.py
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1739007&r1=1739006&r2=1739007&view=diff
==
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Wed Apr 13 20:32:02 2016
@@ -92,9 +92,9 @@ SITE_POST_PROCESSORS = {
 }
 
 class CassandraDef(object):
-ticktock_both_exist = True
-ticktock_version = '3.4'
-ticktock_version_date = '2016-03-08'
+ticktock_both_exist = False
+ticktock_version = '3.5'
+ticktock_version_date = '2016-04-13'
 ticktock_odd_version = '3.3'
 ticktock_odd_version_date = '2016-02-09'
 stable_version = '3.0.5'




[jira] [Created] (CASSANDRA-11569) Track message latency across DCs

2016-04-13 Thread Chris Lohfink (JIRA)
Chris Lohfink created CASSANDRA-11569:
-

 Summary: Track message latency across DCs
 Key: CASSANDRA-11569
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11569
 Project: Cassandra
  Issue Type: Improvement
  Components: Observability
Reporter: Chris Lohfink
Assignee: Chris Lohfink
Priority: Minor


Since we have the timestamp when a message is created and when it arrives, we 
can get an approximate transit time relatively easily, which would remove the 
need for more complex hacks to determine latency between DCs.

Although this is not going to be very meaningful when NTP is not set up, it is 
pretty common to have NTP configured, and even with clock drift nothing is 
really hurt except the metric becoming wacky.
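The idea can be sketched as follows (Python; illustrative only, not the attached patch): subtract the creation timestamp carried by the message from the local arrival time, and record the delta per sender DC.

```python
# Illustrative sketch, not the actual patch: approximate cross-DC latency
# from the creation timestamp a message already carries.
import time
from collections import defaultdict

dc_latencies = defaultdict(list)

def record_arrival(message_created_ms, sender_dc, now_ms=None):
    """Record approximate one-way latency for a message from sender_dc.

    Accuracy depends entirely on clock sync: drift between nodes shifts
    the measurement (and can even make it negative), but it is cheap.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    dc_latencies[sender_dc].append(now_ms - message_created_ms)

record_arrival(1_000, "dc2", now_ms=1_042)
print(dc_latencies["dc2"])  # [42]
```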



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session

2016-04-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239940#comment-15239940
 ] 

Robert Stupp commented on CASSANDRA-11425:
--

CASSANDRA-11555 could help with the cached prepared statement restriction.
I'd really like the ability to have the CQL source available - that also 
allows us to produce better debug logs and to log more information with 
exceptions.

> Add prepared query parameter to trace for "Execute CQL3 prepared query" 
> session
> ---
>
> Key: CASSANDRA-11425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11425
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>
> For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
> not show us any information about the prepared query being executed on the 
> session, so we can't see what query the session is executing.
> I think this makes performance tuning on Cassandra difficult.
> So, In this ticket, I'd like to add the prepared query parameter on Execute 
> session trace like this.
> {noformat}
> cqlsh:system_traces> select * from sessions ;
>  session_id   | client| command | coordinator | 
> duration | parameters 
>   
> | request | started_at
> --+---+-+-+--+--+-+-
>  a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>666 |  \{'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'\} | Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
>  a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>109 |  
>{'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 
> 1'} |Preparing CQL3 query | 2016-03-24 13:37:59.998000+
>  a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>126 |  
>  {'query': 'INSERT INTO test.test2(id,value) VALUES 
> (?,?)'} |Preparing CQL3 query | 2016-03-24 13:37:59.996000+
>  a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 
> 'SELECT * FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
>  a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 |   
>857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 
> 'INSERT INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 
> 'SERIAL'} | Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
> {noformat}
> Now, "Execute CQL3 prepared query" session displays its query.
> I believe that this additional information would help operators a lot.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11339) WHERE clause in SELECT DISTINCT can be ignored

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239938#comment-15239938
 ] 

Alex Petrov commented on CASSANDRA-11339:
-

I've run the tests; there are several failures, but all seem to be unrelated 
and pass locally.

> WHERE clause in SELECT DISTINCT can be ignored
> --
>
> Key: CASSANDRA-11339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11339
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Philip Thompson
>Assignee: Alex Petrov
> Fix For: 2.2.x, 3.x
>
> Attachments: 
> 0001-Add-validation-for-distinct-queries-disallowing-quer.patch
>
>
> I've tested this out on 2.1-head. I'm not sure if it's the same behavior on 
> newer versions.
> For a given table t, with {{PRIMARY KEY (id, v)}} the following two queries 
> return the same result:
> {{SELECT DISTINCT id FROM t WHERE v > X ALLOW FILTERING}}
> {{SELECT DISTINCT id FROM t}}
> The WHERE clause in the former is silently ignored, and all ids are 
> returned, regardless of the value of v in any row. 
> It seems like this has been a known issue for a while:
> http://stackoverflow.com/questions/26548788/select-distinct-cql-ignores-where-clause
> However, if we don't support filtering on anything but the partition key, we 
> should reject the query rather than silently dropping the WHERE clause.
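A toy sketch of why the restriction has no effect (Python, not Cassandra internals): a DISTINCT-partition-key query iterates partitions rather than rows, so a predicate on a non-partition column never gets evaluated.

```python
# Illustrative sketch of the silently-ignored WHERE clause: the DISTINCT
# partition-key path only materializes partition keys, so the predicate
# has nothing to test against.
table = {  # partition id -> list of clustering/regular values v
    1: [5, 20],
    2: [1, 2],
}

def select_distinct_id(table, predicate=None):
    result = []
    for pid in table:
        # Per-row values are never consulted on this path, so `predicate`
        # (the WHERE clause) is deliberately ignored here, mirroring the bug.
        result.append(pid)
    return result

X = 10
# Both queries return every id, whether or not any row satisfies v > X.
print(select_distinct_id(table, predicate=lambda v: v > X))  # [1, 2]
print(select_distinct_id(table))                             # [1, 2]
```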



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11425) Add prepared query parameter to trace for "Execute CQL3 prepared query" session

2016-04-13 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-11425:
-
Description: 
For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
not show us any information about the prepared query being executed on the 
session, so we can't see what query the session is executing.
I think this makes performance tuning on Cassandra difficult.

So, In this ticket, I'd like to add the prepared query parameter on Execute 
session trace like this.

{noformat}
cqlsh:system_traces> select * from sessions ;

 session_id   | client| command | coordinator | 
duration | parameters   

| request | started_at
--+---+-+-+--+--+-+-
 a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 666 |  \{'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT 
* FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'\} | 
Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
 a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 109 |  
   {'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1'} |   
 Preparing CQL3 query | 2016-03-24 13:37:59.998000+
 a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 126 |  
 {'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)'} |   
 Preparing CQL3 query | 2016-03-24 13:37:59.996000+
 a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 764 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT 
* FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | 
Execute CQL3 prepared query | 2016-03-24 13:37:59.998000+
 a00176d0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 857 | {'consistency_level': 'QUORUM', 'page_size': '5000', 'query': 'INSERT 
INTO test.test2(id,value) VALUES (?,?)', 'serial_consistency_level': 'SERIAL'} 
| Execute CQL3 prepared query | 2016-03-24 13:37:59.997000+
{noformat}

Now, "Execute CQL3 prepared query" session displays its query.
I believe that this additional information would help operators a lot.

  was:
For now, the system_traces.sessions rows for "Execute CQL3 prepared query" do 
not show us any information about the prepared query which is executed on the 
session. So we can't see what query is the session executing.
I think this makes performance tuning difficult on Cassandra.

So, In this ticket, I'd like to add the prepared query parameter on Execute 
session trace like this.

{noformat}
cqlsh:system_traces> select * from sessions ;

 session_id   | client| command | coordinator | 
duration | parameters   

| request | started_at
--+---+-+-+--+--+-+-
 a001ec00-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 666 |  {'consistency_level': 'ONE', 'page_size': '5000', 'query': 'SELECT 
* FROM test.test2 WHERE id=? LIMIT 1', 'serial_consistency_level': 'SERIAL'} | 
Execute CQL3 prepared query | 2016-03-24 13:38:00.00+
 a0019de0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 109 |  
   {'query': 'SELECT * FROM test.test2 WHERE id=? LIMIT 1'} |   
 Preparing CQL3 query | 2016-03-24 13:37:59.998000+
 a0014fc0-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 126 |  
 {'query': 'INSERT INTO test.test2(id,value) VALUES (?,?)'} |   
 Preparing CQL3 query | 2016-03-24 13:37:59.996000+
 a0019de1-f1c5-11e5-b14a-6fe1292cf9f1 | 127.0.0.1 |   QUERY |   127.0.0.1 | 
 764 |  {'consistency_level': 'ONE', 'page_size': 

[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest

2016-04-13 Thread Nick Bailey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239926#comment-15239926
 ] 

Nick Bailey commented on CASSANDRA-7190:


Well, the schema attached to the last sstable is only "right" if an sstable 
has been flushed since the last schema change. Really, the point is that this 
ticket and 9587 are geared towards very different audiences, so we should 
treat them differently.

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Priority: Minor
>  Labels: lhf
>
> followup from CASSANDRA-6326



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239904#comment-15239904
 ] 

Aleksey Yeschenko commented on CASSANDRA-11567:
---

There is a difference, but for this purpose, nope. Feel free to bump it in 3.6.

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421. 
> Netty 4.0.34 -> 4.0.36.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10624) Support UDT in CQLSSTableWriter

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239896#comment-15239896
 ] 

Alex Petrov commented on CASSANDRA-10624:
-

Great, thank you!

> Support UDT in CQLSSTableWriter
> ---
>
> Key: CASSANDRA-10624
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10624
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Sylvain Lebresne
>Assignee: Alex Petrov
> Fix For: 3.6
>
> Attachments: 0001-Add-support-for-UDTs-to-CQLSStableWriter.patch, 
> 0001-Support-UDTs-in-CQLSStableWriterV2.patch
>
>
> As far as I can tell, there is no way to use a UDT with {{CQLSSTableWriter}} 
> since there is no way to declare it, and thus {{CQLSSTableWriter.Builder}} 
> knows of no UDTs when parsing the {{CREATE TABLE}} statement passed.
> In terms of API, I think the simplest would be to allow to pass types to the 
> builder in the same way we pass the table definition. So something like:
> {noformat}
> String type = "CREATE TYPE myKs.vertex (x int, y int, z int)";
> String schema = "CREATE TABLE myKs.myTable ("
>   + "  k int PRIMARY KEY,"
>   + "  s set"
>   + ")";
> String insert = ...;
> CQLSSTableWriter writer = CQLSSTableWriter.builder()
>   .inDirectory("path/to/directory")
>   .withType(type)
>   .forTable(schema)
>   .using(insert).build();
> {noformat}
> I'll note that, implementation wise, this might be a bit simpler after the 
> changes of CASSANDRA-10365 (as it makes it easy to pass specific types 
> during the preparation of the create statement).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11566) read time out when do count(*)

2016-04-13 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239857#comment-15239857
 ] 

Tyler Hobbs commented on CASSANDRA-11566:
-

Using {{LIMIT 1}} with {{SELECT count\(*\)}} doesn't limit Cassandra to only 
counting one row -- it will still count all rows.  The {{LIMIT}} applies to the 
number of returned rows, not the number of processed rows.
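A toy sketch of that distinction (Python, not Cassandra internals): the aggregation scans every row, and the limit is applied only to the single aggregate row that comes back.

```python
# Illustrative only: LIMIT truncates returned rows, not rows scanned.
rows = list(range(1_000_000))  # pretend these are rows matching the query

def count_star_with_limit(rows, limit):
    scanned = 0
    for _ in rows:          # the count still touches every row...
        scanned += 1
    result = [scanned]      # ...and produces a single aggregate row
    return result[:limit], scanned  # LIMIT applies only to the result

result, scanned = count_star_with_limit(rows, limit=1)
print(result, scanned)  # [1000000] 1000000 -- one row returned, all rows read
```

So `LIMIT 1` cannot make the count cheap; on a large partition the full scan still has to beat the read timeout.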

> read time out when do count(*)
> --
>
> Key: CASSANDRA-11566
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11566
> Project: Cassandra
>  Issue Type: Bug
> Environment: staging
>Reporter: nizar
> Fix For: 3.3
>
>
> Hello, I am using Cassandra 3.3 (DataStax), and I keep getting read timeouts 
> even though I set the limit to 1. It would make sense if the limit were a 
> high number, but timing out with only LIMIT 1 seems odd.
> [cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
> cqlsh:test> select count(*) from test.my_view where s_id=? and flag=false 
> limit 1;
> OperationTimedOut: errors={}, last_host=
> My view definition looks like this:
> CREATE MATERIALIZED VIEW test.my_view AS
>   SELECT *
>   FROM table_name
>   WHERE id IS NOT NULL AND processed IS NOT NULL AND time IS  NOT NULL AND id 
> IS NOT NULL
>   PRIMARY KEY ( ( s_id, flag ), time, id )
>   WITH CLUSTERING ORDER BY ( time ASC );
> I have 5 nodes with replication factor 3:
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc': '3'}  AND durable_writes = true;
> Below is the output of nodetool cfstats:
> Keyspace: test
> Read Count: 128770
> Read Latency: 1.42208769123243 ms.
> Write Count: 0
> Write Latency: NaN ms.
> Pending Flushes: 0
> Table: tableName
> SSTable count: 3
> Space used (live): 280777032
> Space used (total): 280777032
> Space used by snapshots (total): 0
> Off heap memory used (total): 2850227
> SSTable Compression Ratio: 0.24706731995327527
> Number of keys (estimate): 1277211
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 0
> Local read count: 3
> Local read latency: 0.396 ms
> Local write count: 0
> Local write latency: NaN ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 1589848
> Bloom filter off heap memory used: 1589824
> Index summary off heap memory used: 1195691
> Compression metadata off heap memory used: 64712
> Compacted partition minimum bytes: 311
> Compacted partition maximum bytes: 535
> Compacted partition mean bytes: 458
> Average live cells per slice (last five minutes): 102.92671205446536
> Maximum live cells per slice (last five minutes): 103
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Table: my_view
> SSTable count: 4
> Space used (live): 126114270
> Space used (total): 126114270
> Space used by snapshots (total): 0
> Off heap memory used (total): 91588
> SSTable Compression Ratio: 0.1652453778228639
> Number of keys (estimate): 8
> Memtable cell count: 0
> Memtable data size: 0
> Memtable off heap memory used: 0
> Memtable switch count: 0
> Local read count: 128767
> Local read latency: 1.590 ms
> Local write count: 0
> Local write latency: NaN ms
> Pending flushes: 0
> Bloom filter false positives: 0
> Bloom filter false ratio: 0.0
> Bloom filter space used: 96
> Bloom filter off heap memory used: 64
> Index summary off heap memory used: 140
> Compression metadata off heap memory used: 91384
> Compacted partition minimum bytes: 3974
> Compacted partition maximum bytes: 386857368
> Compacted partition mean bytes: 26034715
> Average live cells per slice (last five minutes): 102.99462595230145
> Maximum live cells per slice (last five minutes): 103
> Average tombstones per slice (last five minutes): 1.0
> Maximum tombstones per slice (last five minutes): 1
> Thank you.
> Nizar



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-11568) cassandra-stress won't accept a user profile from a path with a space in the name

2016-04-13 Thread Joshua McKenzie (JIRA)
Joshua McKenzie created CASSANDRA-11568:
---

 Summary: cassandra-stress won't accept a user profile from a path 
with a space in the name
 Key: CASSANDRA-11568
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11568
 Project: Cassandra
  Issue Type: Bug
Reporter: Joshua McKenzie
Priority: Minor


Haven't tested on Linux, but I assume it's a similar story there (though why 
anyone would use a path with a space on *nix is beyond me).

{noformat}
> cassandra-stress user profile="d:\src\test 
> space\cassandra\tools\cqlstress-counter-example.yaml" "ops=(write=1,query1=2)"
Invalid parameter space\cassandra\tools\cqlstress-counter-example.yaml
{noformat}

I can't find a permutation of the above quotation placement that'll get it to 
parse the path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11567) Update netty version

2016-04-13 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11567:
---
Issue Type: Improvement  (was: Bug)

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421. 
> Netty 4.0.34 -> 4.0.36.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-11567) Update netty version

2016-04-13 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11567:
---
Description: 
Mainly for prereq to CASSANDRA-11421. 

Netty 4.0.34 -> 4.0.36.

  was:Mainly for prereq to CASSANDRA-11421


> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421. 
> Netty 4.0.34 -> 4.0.36.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239734#comment-15239734
 ] 

Jason Brown edited comment on CASSANDRA-11567 at 4/13/16 6:12 PM:
--

bq. This is tick tock - no difference between 3.6 and 4.0

Is that true? Some things (like changes to protocols) are only at major revs I 
thought.

Either way, I'm largely fine with the netty upgrade. Just wanted to be sure 
we're cool with the dep update on a non-major upgrade.


was (Author: jasobrown):
bq. This is tick tock - no difference between 3.6 and 4.0

Is that true? Some things (like changes to protocols) are only at major revs I 
thought

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: Allow instantiation of UDTs and tuples in UDFs

2016-04-13 Thread snazy
Repository: cassandra
Updated Branches:
  refs/heads/trunk ea0a97263 -> 5288d434b


Allow instantiation of UDTs and tuples in UDFs

patch by Robert Stupp; reviewed by Tyler Hobbs for CASSANDRA-10818


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5288d434
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5288d434
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5288d434

Branch: refs/heads/trunk
Commit: 5288d434b3b559c7006fa001a2dc56f4f4b2e2c3
Parents: ea0a972
Author: Robert Stupp 
Authored: Wed Apr 13 20:12:29 2016 +0200
Committer: Robert Stupp 
Committed: Wed Apr 13 20:12:29 2016 +0200

--
 CHANGES.txt |   1 +
 doc/cql3/CQL.textile|  49 ++-
 .../cql3/functions/JavaBasedUDFunction.java |   9 +-
 .../cassandra/cql3/functions/JavaUDF.java   |   5 +-
 .../cql3/functions/ScriptBasedUDFunction.java   |  74 ++
 .../cql3/functions/UDFByteCodeVerifier.java |   2 +-
 .../cassandra/cql3/functions/UDFContext.java| 105 +
 .../cql3/functions/UDFContextImpl.java  | 146 +++
 .../cassandra/cql3/functions/UDFunction.java|   6 +
 .../cassandra/cql3/functions/UDHelper.java  |  10 +-
 .../cassandra/cql3/functions/JavaSourceUDF.txt  |   7 +-
 .../cql3/validation/entities/UFTest.java| 144 +-
 .../entities/udfverify/CallClone.java   |   5 +-
 .../entities/udfverify/CallComDatastax.java |   5 +-
 .../entities/udfverify/CallFinalize.java|   5 +-
 .../entities/udfverify/CallOrgApache.java   |   5 +-
 .../entities/udfverify/ClassWithField.java  |   5 +-
 .../udfverify/ClassWithInitializer.java |   5 +-
 .../udfverify/ClassWithInitializer2.java|   5 +-
 .../udfverify/ClassWithInitializer3.java|   5 +-
 .../udfverify/ClassWithStaticInitializer.java   |   5 +-
 .../entities/udfverify/GoodClass.java   |   5 +-
 .../entities/udfverify/UseOfSynchronized.java   |   5 +-
 .../udfverify/UseOfSynchronizedWithNotify.java  |   5 +-
 .../UseOfSynchronizedWithNotifyAll.java |   5 +-
 .../udfverify/UseOfSynchronizedWithWait.java|   5 +-
 .../udfverify/UseOfSynchronizedWithWaitL.java   |   5 +-
 .../udfverify/UseOfSynchronizedWithWaitLI.java  |   5 +-
 28 files changed, 587 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5288d434/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 8067962..3b5f1b7 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.6
+ * Allow instantiation of UDTs and tuples in UDFs (CASSANDRA-10818)
  * Support UDT in CQLSSTableWriter (CASSANDRA-10624)
  * Support for non-frozen user-defined types, updating
individual fields of user-defined types (CASSANDRA-7423)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/5288d434/doc/cql3/CQL.textile
--
diff --git a/doc/cql3/CQL.textile b/doc/cql3/CQL.textile
index b0173a6..9fd5f85 100644
--- a/doc/cql3/CQL.textile
+++ b/doc/cql3/CQL.textile
@@ -2059,6 +2059,49 @@ CREATE FUNCTION fct_using_udt ( udtarg 
frozen<custom_type> )
 
 User-defined functions can be used in "@SELECT@":#selectStmt, 
"@INSERT@":#insertStmt and "@UPDATE@":#updateStmt statements.
 
+The implicitly available @udfContext@ field (or binding for script UDFs) 
provides the necessary functionality to create new UDT and tuple values.
+
+bc(sample). 
+CREATE TYPE custom_type (txt text, i int);
+CREATE FUNCTION fct_using_udt ( somearg int )
+  RETURNS NULL ON NULL INPUT
+  RETURNS custom_type
+  LANGUAGE java
+  AS $$
+UDTValue udt = udfContext.newReturnUDTValue();
+udt.setString("txt", "some string");
+udt.setInt("i", 42);
+return udt;
+  $$;
+
+The definition of the @UDFContext@ interface can be found in the Apache 
Cassandra source code for @org.apache.cassandra.cql3.functions.UDFContext@.
+
+bc(sample). 
+public interface UDFContext
+{
+UDTValue newArgUDTValue(String argName);
+UDTValue newArgUDTValue(int argNum);
+UDTValue newReturnUDTValue();
+UDTValue newUDTValue(String udtName);
+TupleValue newArgTupleValue(String argName);
+TupleValue newArgTupleValue(int argNum);
+TupleValue newReturnTupleValue();
+TupleValue newTupleValue(String cqlDefinition);
+}
+
+Java UDFs already have some imports for common interfaces and classes defined. 
These imports are:
+Please note that these convenience imports are not available for script UDFs.
+
+bc(sample). 
+import java.nio.ByteBuffer;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import 
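
To complement the committed @udfContext@ UDT example above, here is a tuple counterpart in the same style. This is only a sketch, not part of the committed documentation: the function name @fct_making_tuple@ and the tuple layout are invented for illustration, and it relies on the @newReturnTupleValue()@ method shown in the @UDFContext@ interface.

```
CREATE FUNCTION fct_making_tuple ( somearg int )
  RETURNS NULL ON NULL INPUT
  RETURNS tuple<text, int>
  LANGUAGE java
  AS $$
    // newReturnTupleValue() allocates a TupleValue matching the declared return type
    TupleValue tup = udfContext.newReturnTupleValue();
    tup.setString(0, "some string");
    tup.setInt(1, 42);
    return tup;
  $$;
```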

[jira] [Updated] (CASSANDRA-10818) Allow instantiation of UDTs and tuples in UDFs

2016-04-13 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-10818:
-
   Resolution: Fixed
Fix Version/s: (was: 3.x)
   3.6
   Status: Resolved  (was: Patch Available)

Thank you!
Committed as 5288d434b3b559c7006fa001a2dc56f4f4b2e2c3 to trunk.

> Allow instantiation of UDTs and tuples in UDFs
> --
>
> Key: CASSANDRA-10818
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10818
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.6
>
>
> Currently UDF implementations cannot create new UDT instances.
> There's no way to create a new UDT instance without having the 
> {{com.datastax.driver.core.DataType}} to be able to call 
> {{com.datastax.driver.core.UserType.newValue()}}.
> From a quick look into the related code in {{JavaUDF}}, {{DataType}} and 
> {{UserType}} classes it looks fine to expose information about return and 
> argument types via {{JavaUDF}}.
> Have to find some solution for script UDFs - but feels doable, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239734#comment-15239734
 ] 

Jason Brown commented on CASSANDRA-11567:
-

bq. This is tick tock - no difference between 3.6 and 4.0

Is that true? Some things (like changes to protocols) are only at major revs I 
thought

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239717#comment-15239717
 ] 

Jason Brown edited comment on CASSANDRA-11567 at 4/13/16 6:09 PM:
--

I thought about it, and don't we typically update a dependency only at major 
versions? Are you targeting the 3.x series, or 4.0?

/cc wdyt, [~iamaleksey]?


was (Author: jasobrown):
I though about it, and don't we typically update a dependency only at major 
versions? Are you targeting the 3.x series, or 4.0?

/cc wdyt, [~iamaleksey]?

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10818) Allow instantiation of UDTs and tuples in UDFs

2016-04-13 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-10818:
-
Summary: Allow instantiation of UDTs and tuples in UDFs  (was: Evaluate 
exposure of DataType instances from JavaUDF class)

> Allow instantiation of UDTs and tuples in UDFs
> --
>
> Key: CASSANDRA-10818
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10818
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> Currently UDF implementations cannot create new UDT instances.
> There's no way to create a new UDT instance without having the 
> {{com.datastax.driver.core.DataType}} to be able to call 
> {{com.datastax.driver.core.UserType.newValue()}}.
> From a quick look into the related code in {{JavaUDF}}, {{DataType}} and 
> {{UserType}} classes it looks fine to expose information about return and 
> argument types via {{JavaUDF}}.
> Have to find some solution for script UDFs - but feels doable, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11567) Update netty version

2016-04-13 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239724#comment-15239724
 ] 

T Jake Luciani commented on CASSANDRA-11567:


This is tick tock :) no difference between 3.6 and 4.0

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239717#comment-15239717
 ] 

Jason Brown edited comment on CASSANDRA-11567 at 4/13/16 6:06 PM:
--

I though about it, and don't we typically update a dependency only at major 
versions? Are you targeting the 3.x series, or 4.0?

/cc wdyt, [~iamaleksey]?


was (Author: jasobrown):
I though about it, and don't we typically update a dependency only at major 
versions? Are you targeting the 3.x series, or 4.0?

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239717#comment-15239717
 ] 

Jason Brown commented on CASSANDRA-11567:
-

I though about it, and don't we typically update a dependency only at major 
versions? Are you targeting the 3.x series, or 4.0?

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2016-04-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239712#comment-15239712
 ] 

Robert Stupp commented on CASSANDRA-10783:
--

Alright - CassCI is happy now

> Allow literal value as parameter of UDF & UDA
> -
>
> Key: CASSANDRA-10783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10783
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: CQL3, UDF, client-impacting, doc-impacting
> Fix For: 3.x
>
>
> I have defined the following UDF
> {code:sql}
> CREATE OR REPLACE FUNCTION  maxOf(current int, testValue int) RETURNS NULL ON 
> NULL INPUT 
> RETURNS int 
> LANGUAGE java 
> AS  'return Math.max(current,testValue);'
> CREATE TABLE maxValue(id int primary key, val int);
> INSERT INTO maxValue(id, val) VALUES(1, 100);
> SELECT maxOf(val, 101) FROM maxValue WHERE id=1;
> {code}
> I got the following error message:
> {code}
> SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] 
> message="line 1:19 no viable alternative at input '101' (SELECT maxOf(val1, 
> [101]...)">
> {code}
>  It would be nice to allow literal value as parameter of UDF and UDA too.
>  I was thinking about a use case for a UDA groupBy() function where the end 
> user can *inject* at runtime a literal value to select which aggregation he 
> wants to display, something similar to GROUP BY ... HAVING 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11567) Update netty version

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239705#comment-15239705
 ] 

Jason Brown commented on CASSANDRA-11567:
-

Assuming the CI tests are good, +1

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11421) Eliminate allocations of byte array for UTF8 String serializations

2016-04-13 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239693#comment-15239693
 ] 

T Jake Luciani commented on CASSANDRA-11421:


Will update once CASSANDRA-11567 is in

> Eliminate allocations of byte array for UTF8 String serializations
> --
>
> Key: CASSANDRA-11421
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11421
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Nitsan Wakart
>Assignee: Nitsan Wakart
>
> When profiling a read workload (YCSB workload c) on Cassandra 3.2.1, I noticed 
> that a large part of the allocation profile was generated from String.getBytes() 
> calls on CBUtil::writeString.
> I have fixed up the code to use a thread local cached ByteBuffer and 
> CharsetEncoder to eliminate the allocations. This results in improved 
> allocation profile, and a mild improvement in performance.
> The fix is available here:
> https://github.com/nitsanw/cassandra/tree/fix-write-string-allocation



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11567) Update netty version

2016-04-13 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239692#comment-15239692
 ] 

T Jake Luciani commented on CASSANDRA-11567:


[branch| https://github.com/tjake/cassandra/tree/update-netty]
[testall| http://cassci.datastax.com/job/tjake-update-netty-testall]
[dtest| http://cassci.datastax.com/job/tjake-update-netty-dtest]

> Update netty version
> 
>
> Key: CASSANDRA-11567
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
> Project: Cassandra
>  Issue Type: Bug
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
>Priority: Minor
> Fix For: 3.6
>
>
> Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] cassandra git commit: Bump version to 3.0.6 in build.xml

2016-04-13 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/trunk cc90d0423 -> ea0a97263


Bump version to 3.0.6 in build.xml


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fd24b7c0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fd24b7c0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fd24b7c0

Branch: refs/heads/trunk
Commit: fd24b7c0d62e7c452fd4aa3ddeb8363221cd0c4e
Parents: 4238cdd
Author: Tyler Hobbs 
Authored: Wed Apr 13 12:51:58 2016 -0500
Committer: Tyler Hobbs 
Committed: Wed Apr 13 12:51:58 2016 -0500

--
 build.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fd24b7c0/build.xml
--
diff --git a/build.xml b/build.xml
index 27de95c..271481f 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 
 
 
-<property name="base.version" value="3.0.5"/>
+<property name="base.version" value="3.0.6"/>
 
 
<property name="scm.url" value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>



[2/2] cassandra git commit: Merge branch 'cassandra-3.0' into trunk

2016-04-13 Thread tylerhobbs
Merge branch 'cassandra-3.0' into trunk

Conflicts:
build.xml


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/ea0a9726
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/ea0a9726
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/ea0a9726

Branch: refs/heads/trunk
Commit: ea0a972639510c54f35e1ec40023ef19f9c088f7
Parents: cc90d04 fd24b7c
Author: Tyler Hobbs 
Authored: Wed Apr 13 12:53:22 2016 -0500
Committer: Tyler Hobbs 
Committed: Wed Apr 13 12:53:22 2016 -0500

--

--




cassandra git commit: Bump version to 3.0.6 in build.xml

2016-04-13 Thread tylerhobbs
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 4238cdd99 -> fd24b7c0d


Bump version to 3.0.6 in build.xml


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fd24b7c0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fd24b7c0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fd24b7c0

Branch: refs/heads/cassandra-3.0
Commit: fd24b7c0d62e7c452fd4aa3ddeb8363221cd0c4e
Parents: 4238cdd
Author: Tyler Hobbs 
Authored: Wed Apr 13 12:51:58 2016 -0500
Committer: Tyler Hobbs 
Committed: Wed Apr 13 12:51:58 2016 -0500

--
 build.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/fd24b7c0/build.xml
--
diff --git a/build.xml b/build.xml
index 27de95c..271481f 100644
--- a/build.xml
+++ b/build.xml
@@ -25,7 +25,7 @@
 
 
 
-<property name="base.version" value="3.0.5"/>
+<property name="base.version" value="3.0.6"/>
 
 
<property name="scm.url" value="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=tree"/>



[jira] [Created] (CASSANDRA-11567) Update netty version

2016-04-13 Thread T Jake Luciani (JIRA)
T Jake Luciani created CASSANDRA-11567:
--

 Summary: Update netty version
 Key: CASSANDRA-11567
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11567
 Project: Cassandra
  Issue Type: Bug
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Priority: Minor
 Fix For: 3.6


Mainly for prereq to CASSANDRA-11421



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9625) GraphiteReporter not reporting

2016-04-13 Thread Ruoran Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239684#comment-15239684
 ] 

Ruoran Wang edited comment on CASSANDRA-9625 at 4/13/16 5:49 PM:
-

I tried the following dumb fix: I applied a similar change to 
ColumnFamilyMetrics, where 
cfs.getCompactionStrategy().getEstimatedRemainingTasks() is called. 
I hard-coded it to return 21 when getEstimatedRemainingTasks is taking too long. 
The graph 
(https://issues.apache.org/jira/secure/attachment/12798541/Screen%20Shot%202016-04-13%20at%2010.40.58%20AM.png)
 shows that when it's busy, pendingCompaction reads 21, but the graphite-reporter 
now continues to collect other metrics instead of being blocked.

{noformat}
diff --git a/src/java/org/apache/cassandra/metrics/CompactionMetrics.java 
b/src/java/org/apache/cassandra/metrics/CompactionMetrics.java
index f7a99e1..e2ac22b 100644
--- a/src/java/org/apache/cassandra/metrics/CompactionMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/CompactionMetrics.java
@@ -18,8 +18,13 @@
 package org.apache.cassandra.metrics;
 
 import java.util.*;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
 import java.util.concurrent.ThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
 
 import com.yammer.metrics.Metrics;
 import com.yammer.metrics.core.Counter;
@@ -31,12 +36,17 @@ import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.db.compaction.CompactionInfo;
 import org.apache.cassandra.db.compaction.CompactionManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * Metrics for compaction.
  */
 public class CompactionMetrics implements 
CompactionManager.CompactionExecutorStatsCollector
 {
+
+private static final Logger logger = 
LoggerFactory.getLogger(CompactionMetrics.class);
+
 public static final MetricNameFactory factory = new 
DefaultNameFactory("Compaction");
 
 // a synchronized identity set of running tasks to their compaction info
@@ -57,15 +67,36 @@ public class CompactionMetrics implements 
CompactionManager.CompactionExecutorSt
 {
 public Integer value()
 {
-int n = 0;
-// add estimate number of compactions need to be done
-for (String keyspaceName : Schema.instance.getKeyspaces())
-{
-for (ColumnFamilyStore cfs : 
Keyspace.open(keyspaceName).getColumnFamilyStores())
-n += 
cfs.getCompactionStrategy().getEstimatedRemainingTasks();
+// The collector thread is likely to be blocked by compactions
+// This is a quick fix to avoid losing metrics
+ExecutorService executor = Executors.newSingleThreadExecutor();
+
+final Future<Integer> future = executor.submit(new Callable<Integer>() {
+@Override
+public Integer call() throws Exception {
+int n = 0;
+// add estimate number of compactions need to be done
+for (String keyspaceName : 
Schema.instance.getKeyspaces())
+{
+for (ColumnFamilyStore cfs : 
Keyspace.open(keyspaceName).getColumnFamilyStores())
+n += 
cfs.getCompactionStrategy().getEstimatedRemainingTasks();
+}
+// add number of currently running compactions
+return n + compactions.size();
+}
+});
+
+try {
+return future.get(20, TimeUnit.SECONDS);
+} catch (TimeoutException e) {
+future.cancel(true);
+logger.error("Skipping PendingTasks because some cfs is 
busy");
+} catch (Exception othere) {
+logger.error("Skipping PendingTasks because an unexpected 
exception", othere);
 }
-// add number of currently running compactions
-return n + compactions.size();
+
+executor.shutdownNow();
+return 21;
 }
 });
 completedTasks = 
Metrics.newGauge(factory.createMetricName("CompletedTasks"), new Gauge()
{noformat}


was (Author: ruoranwang):
I tired this following dumb fix, I applied similar change to 
ColumnFamilyMetrics where 
cfs.getCompactionStrategy().getEstimatedRemainingTasks(); is called. 
I hard coded to return 21 when getEstimatedRemainingTasks is taking too long. 
The graph shows when it's busy pendingCompaction shows 21, but now the 
graphite-reporter will continue to collect other metrics instead 

[jira] [Updated] (CASSANDRA-9625) GraphiteReporter not reporting

2016-04-13 Thread Ruoran Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruoran Wang updated CASSANDRA-9625:
---
Attachment: Screen Shot 2016-04-13 at 10.40.58 AM.png

I tried the following dumb fix: I applied a similar change to 
ColumnFamilyMetrics, where 
cfs.getCompactionStrategy().getEstimatedRemainingTasks() is called. 
I hard-coded it to return 21 when getEstimatedRemainingTasks is taking too long. 
The graph shows that when it's busy, pendingCompaction reads 21, but the 
graphite-reporter now continues to collect other metrics instead of being blocked.

{noformat}
diff --git a/src/java/org/apache/cassandra/metrics/CompactionMetrics.java 
b/src/java/org/apache/cassandra/metrics/CompactionMetrics.java
index f7a99e1..e2ac22b 100644
--- a/src/java/org/apache/cassandra/metrics/CompactionMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/CompactionMetrics.java
@@ -18,8 +18,13 @@
 package org.apache.cassandra.metrics;
 
 import java.util.*;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
 import java.util.concurrent.ThreadPoolExecutor;
 import java.util.concurrent.TimeUnit;
+import java.util.concurrent.TimeoutException;
 
 import com.yammer.metrics.Metrics;
 import com.yammer.metrics.core.Counter;
@@ -31,12 +36,17 @@ import org.apache.cassandra.db.ColumnFamilyStore;
 import org.apache.cassandra.db.Keyspace;
 import org.apache.cassandra.db.compaction.CompactionInfo;
 import org.apache.cassandra.db.compaction.CompactionManager;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 /**
  * Metrics for compaction.
  */
 public class CompactionMetrics implements 
CompactionManager.CompactionExecutorStatsCollector
 {
+
+private static final Logger logger = 
LoggerFactory.getLogger(CompactionMetrics.class);
+
 public static final MetricNameFactory factory = new 
DefaultNameFactory("Compaction");
 
 // a synchronized identity set of running tasks to their compaction info
@@ -57,15 +67,36 @@ public class CompactionMetrics implements 
CompactionManager.CompactionExecutorSt
 {
 public Integer value()
 {
-int n = 0;
-// add estimate number of compactions need to be done
-for (String keyspaceName : Schema.instance.getKeyspaces())
-{
-for (ColumnFamilyStore cfs : 
Keyspace.open(keyspaceName).getColumnFamilyStores())
-n += 
cfs.getCompactionStrategy().getEstimatedRemainingTasks();
+// The collector thread is likely to be blocked by compactions
+// This is a quick fix to avoid losing metrics
+ExecutorService executor = Executors.newSingleThreadExecutor();
+
+final Future<Integer> future = executor.submit(new Callable<Integer>() {
+@Override
+public Integer call() throws Exception {
+int n = 0;
+// add estimate number of compactions need to be done
+for (String keyspaceName : 
Schema.instance.getKeyspaces())
+{
+for (ColumnFamilyStore cfs : 
Keyspace.open(keyspaceName).getColumnFamilyStores())
+n += 
cfs.getCompactionStrategy().getEstimatedRemainingTasks();
+}
+// add number of currently running compactions
+return n + compactions.size();
+}
+});
+
+try {
+return future.get(20, TimeUnit.SECONDS);
+} catch (TimeoutException e) {
+future.cancel(true);
+logger.error("Skipping PendingTasks because some cfs is 
busy");
+} catch (Exception othere) {
+logger.error("Skipping PendingTasks because an unexpected 
exception", othere);
 }
-// add number of currently running compactions
-return n + compactions.size();
+
+executor.shutdownNow();
+return 21;
 }
 });
 completedTasks = 
Metrics.newGauge(factory.createMetricName("CompletedTasks"), new Gauge()
{noformat}
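
Stripped of the Cassandra and metrics-library specifics, the pattern in the patch above, computing a gauge value on a helper thread and falling back to a sentinel when it takes too long, can be sketched as self-contained Java (class and method names here are illustrative, not the actual Cassandra API):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedGauge
{
    // Sentinel returned when the real computation does not finish in time,
    // mirroring the hard-coded 21 in the patch above.
    static final int SENTINEL = 21;

    // Run a potentially slow computation on a helper thread and give up after
    // timeoutMs, so the caller (the metrics reporter thread) never blocks.
    static int boundedValue(Callable<Integer> slowComputation, long timeoutMs)
    {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try
        {
            Future<Integer> future = executor.submit(slowComputation);
            try
            {
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);
            }
            catch (TimeoutException e)
            {
                future.cancel(true); // interrupt the stuck computation
                return SENTINEL;
            }
            catch (Exception e)
            {
                return SENTINEL;
            }
        }
        finally
        {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args)
    {
        // Fast computation finishes within the deadline.
        System.out.println(boundedValue(() -> 5, 1000));
        // Stuck computation hits the deadline and yields the sentinel.
        System.out.println(boundedValue(() -> { Thread.sleep(5000); return 7; }, 100));
    }
}
```

The trade-off is the same as in the patch: a timed-out sample reports a fixed value rather than the truth, but one slow column family can no longer starve every other metric of the reporting cycle.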

> GraphiteReporter not reporting
> --
>
> Key: CASSANDRA-9625
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9625
> Project: Cassandra
>  Issue Type: Bug
> Environment: Debian Jessie, 7u79-2.5.5-1~deb8u1, Cassandra 2.1.3
>Reporter: Eric Evans
>Assignee: T Jake Luciani
> Attachments: Screen Shot 2016-04-13 at 10.40.58 AM.png, metrics.yaml, 
> thread-dump.log
>
>
> When upgrading from 2.1.3 to 2.1.6, the Graphite 

[jira] [Commented] (CASSANDRA-10818) Evaluate exposure of DataType instances from JavaUDF class

2016-04-13 Thread Tyler Hobbs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239676#comment-15239676
 ] 

Tyler Hobbs commented on CASSANDRA-10818:
-

+1

> Evaluate exposure of DataType instances from JavaUDF class
> --
>
> Key: CASSANDRA-10818
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10818
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Robert Stupp
>Assignee: Robert Stupp
>Priority: Minor
> Fix For: 3.x
>
>
> Currently UDF implementations cannot create new UDT instances.
> There's no way to create a new UDT instance without having the 
> {{com.datastax.driver.core.DataType}} to be able to call 
> {{com.datastax.driver.core.UserType.newValue()}}.
> From a quick look into the related code in {{JavaUDF}}, {{DataType}} and 
> {{UserType}} classes it looks fine to expose information about return and 
> argument types via {{JavaUDF}}.
> Have to find some solution for script UDFs - but feels doable, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (CASSANDRA-7190) Add schema to snapshot manifest

2016-04-13 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko reopened CASSANDRA-7190:
--

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Priority: Minor
>  Labels: lhf
>
> followup from CASSANDRA-6326



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-7190) Add schema to snapshot manifest

2016-04-13 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-7190:
-
Labels: lhf  (was: )

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Priority: Minor
>  Labels: lhf
>
> followup from CASSANDRA-6326



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7190) Add schema to snapshot manifest

2016-04-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239666#comment-15239666
 ] 

Aleksey Yeschenko commented on CASSANDRA-7190:
--

Wouldn't 'the latest schema at the time of the snapshot' just be the schema 
attached to the most recent sstable though, more or less?

As I see it, this ticket doesn't buy you much over CASSANDRA-9587. Though I 
don't really mind if you provide a patch and reopen.

bq. Since this is such a LHF, if we could somehow get it into 2.1 and 3.0

Would have to be 3.x only, I'm afraid. Sorry.

> Add schema to snapshot manifest
> ---
>
> Key: CASSANDRA-7190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Priority: Minor
>
> followup from CASSANDRA-6326



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10624) Support UDT in CQLSSTableWriter

2016-04-13 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-10624:
--
   Resolution: Fixed
Fix Version/s: (was: 3.x)
   3.6
   Status: Resolved  (was: Ready to Commit)

> Support UDT in CQLSSTableWriter
> ---
>
> Key: CASSANDRA-10624
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10624
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Sylvain Lebresne
>Assignee: Alex Petrov
> Fix For: 3.6
>
> Attachments: 0001-Add-support-for-UDTs-to-CQLSStableWriter.patch, 
> 0001-Support-UDTs-in-CQLSStableWriterV2.patch
>
>
> As far as I can tell, there is no way to use a UDT with {{CQLSSTableWriter}}, 
> since there is no way to declare it, and thus {{CQLSSTableWriter.Builder}} 
> knows of no UDTs when parsing the {{CREATE TABLE}} statement passed to it.
> In terms of API, I think the simplest would be to allow passing types to the 
> builder in the same way we pass the table definition. So something like:
> {noformat}
> String type = "CREATE TYPE myKs.vertex (x int, y int, z int)";
> String schema = "CREATE TABLE myKs.myTable ("
>   + "  k int PRIMARY KEY,"
>   + "  s set"
>   + ")";
> String insert = ...;
> CQLSSTableWriter writer = CQLSSTableWriter.builder()
>   .inDirectory("path/to/directory")
>   .withType(type)
>   .forTable(schema)
>   .using(insert).build();
> {noformat}
> I'll note that, implementation-wise, this might be a bit simpler after the 
> changes of CASSANDRA-10365 (as it makes it easy to pass specific types 
> during the preparation of the create statement).





[jira] [Commented] (CASSANDRA-10624) Support UDT in CQLSSTableWriter

2016-04-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239657#comment-15239657
 ] 

Aleksey Yeschenko commented on CASSANDRA-10624:
---

Committed as 
[cc90d0423cb64bcf61ad37126c32de85fbca22c6|https://github.com/apache/cassandra/commit/cc90d0423cb64bcf61ad37126c32de85fbca22c6]
 to trunk, thanks.

> Support UDT in CQLSSTableWriter
> ---
>
> Key: CASSANDRA-10624
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10624
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Sylvain Lebresne
>Assignee: Alex Petrov
> Fix For: 3.x
>
> Attachments: 0001-Add-support-for-UDTs-to-CQLSStableWriter.patch, 
> 0001-Support-UDTs-in-CQLSStableWriterV2.patch
>
>
> As far as I can tell, there is no way to use a UDT with {{CQLSSTableWriter}}, 
> since there is no way to declare it, and thus {{CQLSSTableWriter.Builder}} 
> knows of no UDTs when parsing the {{CREATE TABLE}} statement passed to it.
> In terms of API, I think the simplest would be to allow passing types to the 
> builder in the same way we pass the table definition. So something like:
> {noformat}
> String type = "CREATE TYPE myKs.vertex (x int, y int, z int)";
> String schema = "CREATE TABLE myKs.myTable ("
>   + "  k int PRIMARY KEY,"
>   + "  s set"
>   + ")";
> String insert = ...;
> CQLSSTableWriter writer = CQLSSTableWriter.builder()
>   .inDirectory("path/to/directory")
>   .withType(type)
>   .forTable(schema)
>   .using(insert).build();
> {noformat}
> I'll note that, implementation-wise, this might be a bit simpler after the 
> changes of CASSANDRA-10365 (as it makes it easy to pass specific types 
> during the preparation of the create statement).





cassandra git commit: Support UDTs in CQLSStableWriter

2016-04-13 Thread aleksey
Repository: cassandra
Updated Branches:
  refs/heads/trunk 6d43fc981 -> cc90d0423


Support UDTs in CQLSStableWriter

Patch by Alex Petrov and Stefania Alborghetti;
reviewed by Stefania Alborghetti and Aleksey Yeschenko for CASSANDRA-10624.


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/cc90d042
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/cc90d042
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/cc90d042

Branch: refs/heads/trunk
Commit: cc90d0423cb64bcf61ad37126c32de85fbca22c6
Parents: 6d43fc9
Author: Alex Petrov 
Authored: Tue Apr 5 10:50:59 2016 +0200
Committer: Aleksey Yeschenko 
Committed: Wed Apr 13 18:28:39 2016 +0100

--
 CHANGES.txt |   1 +
 .../org/apache/cassandra/cql3/CQL3Type.java |   5 +-
 .../cassandra/cql3/functions/UDHelper.java  |   4 +-
 .../cql3/statements/CreateTypeStatement.java|  11 +-
 .../cassandra/io/sstable/CQLSSTableWriter.java  | 278 +++
 .../io/sstable/CQLSSTableWriterTest.java| 188 +++--
 6 files changed, 340 insertions(+), 147 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc90d042/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 1576c24..8067962 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.6
+ * Support UDT in CQLSSTableWriter (CASSANDRA-10624)
  * Support for non-frozen user-defined types, updating
individual fields of user-defined types (CASSANDRA-7423)
  * Make LZ4 compression level configurable (CASSANDRA-11051)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc90d042/src/java/org/apache/cassandra/cql3/CQL3Type.java
--
diff --git a/src/java/org/apache/cassandra/cql3/CQL3Type.java 
b/src/java/org/apache/cassandra/cql3/CQL3Type.java
index d5dfeed..cf7e18a 100644
--- a/src/java/org/apache/cassandra/cql3/CQL3Type.java
+++ b/src/java/org/apache/cassandra/cql3/CQL3Type.java
@@ -773,7 +773,10 @@ public interface CQL3Type
 @Override
 public String toString()
 {
-return name.toString();
+if (frozen)
+return "frozen<" + name.toString() + '>';
+else
+return name.toString();
 }
 }
 

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc90d042/src/java/org/apache/cassandra/cql3/functions/UDHelper.java
--
diff --git a/src/java/org/apache/cassandra/cql3/functions/UDHelper.java 
b/src/java/org/apache/cassandra/cql3/functions/UDHelper.java
index 4effdc3..d1c6157 100644
--- a/src/java/org/apache/cassandra/cql3/functions/UDHelper.java
+++ b/src/java/org/apache/cassandra/cql3/functions/UDHelper.java
@@ -35,7 +35,7 @@ import org.apache.cassandra.db.marshal.AbstractType;
 import org.apache.cassandra.transport.Server;
 
 /**
- * Helper class for User Defined Functions + Aggregates.
+ * Helper class for User Defined Functions, Types and Aggregates.
  */
 public final class UDHelper
 {
@@ -66,7 +66,7 @@ public final class UDHelper
 return codecs;
 }
 
-static TypeCodec codecFor(DataType dataType)
+public static TypeCodec codecFor(DataType dataType)
 {
 return codecRegistry.codecFor(dataType);
 }

http://git-wip-us.apache.org/repos/asf/cassandra/blob/cc90d042/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
--
diff --git 
a/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java 
b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
index e134594..3268296 100644
--- a/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
+++ b/src/java/org/apache/cassandra/cql3/statements/CreateTypeStatement.java
@@ -19,6 +19,7 @@ package org.apache.cassandra.cql3.statements;
 
 import java.nio.ByteBuffer;
 import java.util.*;
+import java.util.stream.Collectors;
 
 import org.apache.cassandra.auth.Permission;
 import org.apache.cassandra.config.*;
@@ -28,6 +29,7 @@ import org.apache.cassandra.db.marshal.UTF8Type;
 import org.apache.cassandra.db.marshal.UserType;
 import org.apache.cassandra.exceptions.*;
 import org.apache.cassandra.schema.KeyspaceMetadata;
+import org.apache.cassandra.schema.Types;
 import org.apache.cassandra.service.ClientState;
 import org.apache.cassandra.service.MigrationManager;
 import org.apache.cassandra.transport.Event;
@@ -97,13 +99,20 @@ public class CreateTypeStatement extends 
SchemaAlteringStatement
 }
 }
 
+public 

[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-04-13 Thread Ruoran Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239639#comment-15239639
 ] 

Ruoran Wang commented on CASSANDRA-9935:


[~pauloricardomg] I have some news. I mentioned earlier that the two 
sstables returned from getsstables --hex-format always show one being the 
other's ancestor. So I looked at anticompaction, and I think the old sstable 
is not being removed due to a race condition. CASSANDRA-10831 moved 
'markCompactedSSTablesReplaced' out of a try/catch clause:
{noformat}
cfs.getDataTracker().markCompactedSSTablesReplaced(successfullyAntiCompactedSSTables,
 anticompactedSSTables, OperationType.ANTICOMPACTION);
{noformat}
When I added a try/catch around this, I found an AssertionError when the 
anticompaction process tries to remove old sstables:
{noformat}
java.lang.AssertionError: Expecting new size of 95, got 96 while replacing XXX 
by XXX
{noformat}
That is thrown from org.apache.cassandra.db.DataTracker.View#replace.

So I think this could be caused by unmarkCompacting being called before 
markCompactedSSTablesReplaced. Yesterday I created another ticket for 2.1.13, 
and I attached my proposed patch there:
https://issues.apache.org/jira/browse/CASSANDRA-11548
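
The suspected ordering bug can be illustrated with a small, self-contained sketch (toy names only; this is not Cassandra's actual DataTracker code): if the live-sstable view is reconciled after another operation has already removed one of the "old" sstables, the size check in the replace step fails with exactly the kind of off-by-one error quoted above.

```java
import java.util.HashSet;
import java.util.Set;

public class ViewReplaceSketch {
    // Toy stand-in for DataTracker.View#replace: swap 'olds' for 'fresh'
    // in the live set and assert the live set changed by exactly the
    // expected amount.
    static Set<String> replace(Set<String> live, Set<String> olds, Set<String> fresh) {
        Set<String> next = new HashSet<>(live);
        next.removeAll(olds);
        next.addAll(fresh);
        int expected = live.size() - olds.size() + fresh.size();
        if (next.size() != expected)
            throw new AssertionError("Expecting new size of " + expected
                                     + ", got " + next.size());
        return next;
    }

    public static void main(String[] args) {
        Set<String> live = new HashSet<>(Set.of("a", "b", "c"));
        // Normal anticompaction: old sstable "a" replaced by "a1".
        live = replace(live, Set.of("a"), Set.of("a1"));
        boolean failed = false;
        try {
            // Race: "missing" was already dropped by a concurrent operation
            // before the replace ran, so the bookkeeping is off by one.
            replace(live, Set.of("b", "missing"), Set.of("b1"));
        } catch (AssertionError e) {
            failed = true; // mirrors "Expecting new size of 95, got 96"
        }
        System.out.println(failed); // true
    }
}
```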

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: 9935.patch, db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a 

[jira] [Comment Edited] (CASSANDRA-9935) Repair fails with RuntimeException

2016-04-13 Thread Ruoran Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239639#comment-15239639
 ] 

Ruoran Wang edited comment on CASSANDRA-9935 at 4/13/16 5:12 PM:
-

[~pauloricardomg] I have some news. I mentioned earlier that the two 
sstables returned from getsstables --hex-format always show one being the 
other's ancestor. So I looked at anticompaction, and I think the old sstable 
is not being removed due to a race condition. CASSANDRA-10831 moved 
'markCompactedSSTablesReplaced' out of a try/catch clause:
{noformat}
cfs.getDataTracker().markCompactedSSTablesReplaced(successfullyAntiCompactedSSTables,
 anticompactedSSTables, OperationType.ANTICOMPACTION);
{noformat}
When I added a try/catch around this, I found an AssertionError when the 
anticompaction process tries to remove old sstables:
{noformat}
java.lang.AssertionError: Expecting new size of 95, got 96 while replacing XXX 
by XXX
{noformat}
That is thrown from org.apache.cassandra.db.DataTracker.View#replace.

So I think this could be caused by unmarkCompacting being called before 
markCompactedSSTablesReplaced. Yesterday I created another ticket for 2.1.13, 
and I attached my proposed patch there:
https://issues.apache.org/jira/browse/CASSANDRA-11548


was (Author: ruoranwang):
[~pauloricardomg] I have some news. I mentioned earlier I found the two 
sstables returned from getsstables --hex-format always shows one is another's 
ancestor. So I looked at anticompaction, and I think it's the old sstable not 
being removed due to a race condition. CASSANDRA-10831 moved 
'markCompactedSSTablesReplaced' out of a try catch clause. 
{notformat}
cfs.getDataTracker().markCompactedSSTablesReplaced(successfullyAntiCompactedSSTables,
 anticompactedSSTables, OperationType.ANTICOMPACTION);
{notformat}
When I added try catch around this, I found an AssertError when the 
anticompaction process tries remove old sstables.
{notformat}
java.lang.AssertionError: Expecting new size of 95, got 96 while replacing XXX 
by XXX
{notformat}
That is thrown from org.apache.cassandra.db.DataTracker.View#replace

So I think this could be caused by unmarkCompacting called before 
markCompactedSSTablesReplaced. Yesterday I created another ticket for 2.1.13, I 
also attached my proposed patch there.
https://issues.apache.org/jira/browse/CASSANDRA-11548

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: 9935.patch, db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 

[jira] [Comment Edited] (CASSANDRA-9935) Repair fails with RuntimeException

2016-04-13 Thread Ruoran Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239639#comment-15239639
 ] 

Ruoran Wang edited comment on CASSANDRA-9935 at 4/13/16 5:13 PM:
-

[~pauloricardomg] I have some news. I mentioned earlier that the two 
sstables returned from getsstables --hex-format always show one being the 
other's ancestor. So I looked at anticompaction, and I think the old sstable 
is not being removed due to a race condition. CASSANDRA-10831 moved 
'markCompactedSSTablesReplaced' out of a try/catch clause:
{noformat}
cfs.getDataTracker().markCompactedSSTablesReplaced(successfullyAntiCompactedSSTables,
 anticompactedSSTables, OperationType.ANTICOMPACTION);
{noformat}
When I added a try/catch around this, I found an AssertionError when the 
anticompaction process tries to remove old sstables:
{noformat}
java.lang.AssertionError: Expecting new size of 95, got 96 while replacing XXX 
by XXX
{noformat}
That is thrown from org.apache.cassandra.db.DataTracker.View#replace.

So I think this could be caused by unmarkCompacting being called before 
markCompactedSSTablesReplaced. Yesterday I created another ticket for 2.1.13, 
and I attached my proposed patch there:
https://issues.apache.org/jira/browse/CASSANDRA-11548


was (Author: ruoranwang):
[~pauloricardomg] I have some news. I mentioned earlier I found the two 
sstables returned from getsstables --hex-format always shows one is another's 
ancestor. So I looked at anticompaction, and I think it's the old sstable not 
being removed due to a race condition. CASSANDRA-10831 moved 
'markCompactedSSTablesReplaced' out of a try catch clause. 
{noformat}
cfs.getDataTracker().markCompactedSSTablesReplaced(successfullyAntiCompactedSSTables,
 anticompactedSSTables, OperationType.ANTICOMPACTION);
{noformat}
When I added try catch around this, I found an AssertError when the 
anticompaction process tries remove old sstables.
{notformat}
java.lang.AssertionError: Expecting new size of 95, got 96 while replacing XXX 
by XXX
{notformat}
That is thrown from org.apache.cassandra.db.DataTracker.View#replace

So I think this could be caused by unmarkCompacting called before 
markCompactedSSTablesReplaced. Yesterday I created another ticket for 2.1.13, I 
also attached my proposed patch there.
https://issues.apache.org/jira/browse/CASSANDRA-11548

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: 9935.patch, db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 

[jira] [Commented] (CASSANDRA-11559) Enhance node representation

2016-04-13 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239602#comment-15239602
 ] 

Jason Brown commented on CASSANDRA-11559:
-

On the whole, +1 on this idea in general.

bq. if we need to wait until 4.0 ...

I'm pretty sure we need to wait until 4.0, especially if public interfaces will 
be affected. Also, depending on the scope/size of the refactor, it might be 
safer for a 4.0 release.

bq. encapsulated in a {{VirtualNode}} interface

I'd prefer a different name, as "virtual node", at least to me, implies a token 
in addition to the other data points (maybe I'm wrong here). But this is early 
bikeshedding ;)

One idea to throw out, although it might be premature at this stage, is to 
switch from {{InetAddress}} to {{InetSocketAddress}}. This would open the door 
to allowing different port bindings for each peer in the cluster. In no way 
would/should that effort, in its entirety, be part of this ticket, but since 
we're starting down the path of factoring away from passing the {{InetAddress}} 
around everywhere as the node identifier, this might be a nice place to put in 
the initial groundwork.
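
For illustration, a minimal example of the difference using plain JDK types (the hosts and ports below are made up): {{InetAddress}} identifies only a host, while {{InetSocketAddress}} carries a port as well, so two peers sharing one host remain distinct identities.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

public class PeerPorts {
    // With InetAddress alone, two peers on the same host are
    // indistinguishable. InetSocketAddress carries (host, port), so each
    // peer can bind its own port.
    public static Map<String, InetSocketAddress> examplePeers() {
        Map<String, InetSocketAddress> peers = new HashMap<>();
        // createUnresolved avoids DNS lookups; addresses are hypothetical.
        peers.put("node1", InetSocketAddress.createUnresolved("10.0.0.1", 7000));
        peers.put("node2", InetSocketAddress.createUnresolved("10.0.0.1", 7001));
        return peers;
    }

    public static void main(String[] args) {
        Map<String, InetSocketAddress> peers = examplePeers();
        // Same host, distinct identities thanks to the port component.
        System.out.println(!peers.get("node1").equals(peers.get("node2"))); // true
    }
}
```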




> Enhance node representation
> ---
>
> Key: CASSANDRA-11559
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11559
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Distributed Metadata
>Reporter: Paulo Motta
>Priority: Minor
>
> We currently represent nodes as {{InetAddress}} objects on {{TokenMetadata}}, 
> what causes difficulties when replacing a node with the same address (see 
> CASSANDRA-8523 and CASSANDRA-9244).
> Since CASSANDRA-4120 we index hosts by {{UUID}} in gossip, so I think it's 
> time to move that representation to {{TokenMetadata}}.
> I propose representing nodes as {{InetAddress, UUID}} pairs on 
> {{TokenMetadata}}, encapsulated in a {{VirtualNode}} interface, so it will 
> backward compatible with the current representation, while still allowing us 
> to enhance it in the future with additional metadata (and improved vnode 
> handling) if needed.
> This change will probably affect interfaces of internal classes like 
> {{TokenMetadata}} and {{AbstractReplicationStrategy}}, so I'd like to hear 
> from integrators and other developers if it's possible to change these 
> without major hassle or if we need to wait until 4.0.
> Besides updating {{TokenMetadata}} and {{AbstractReplicationStrategy}} (and 
> subclasses),  we will also need to replace all {{InetAddress}} uses with 
> {{VirtualNode.getEndpoint()}} calls on {{StorageService}} and related classes 
> and tests. We would probably already be able to replace some 
> {{TokenMetadata.getHostId(InetAddress endpoint)}} calls with 
> {{VirtualNode.getHostId()}}.
> While we will still be dealing with {{InetAddress}} on {{StorageService}} in 
> this initial stage, in the future I think we should pass {{VirtualNode}} 
> instances around and only translate from {{VirtualNode}} to {{InetAddress}} 
> in the network layer.
> Public interfaces like {{IEndpointSnitch}} will not be affected by this.





[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2016-04-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239568#comment-15239568
 ] 

Robert Stupp commented on CASSANDRA-10783:
--

CI looks good. Two tests failed on cassci (timeout + a driver error) but pass 
locally. Triggered another testall run.

> Allow literal value as parameter of UDF & UDA
> -
>
> Key: CASSANDRA-10783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10783
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: CQL3, UDF, client-impacting, doc-impacting
> Fix For: 3.x
>
>
> I have defined the following UDF
> {code:sql}
> CREATE OR REPLACE FUNCTION  maxOf(current int, testValue int) RETURNS NULL ON 
> NULL INPUT 
> RETURNS int 
> LANGUAGE java 
> AS  'return Math.max(current,testValue);'
> CREATE TABLE maxValue(id int primary key, val int);
> INSERT INTO maxValue(id, val) VALUES(1, 100);
> SELECT maxOf(val, 101) FROM maxValue WHERE id=1;
> {code}
> I got the following error message:
> {code}
> SyntaxException:  message="line 1:19 no viable alternative at input '101' (SELECT maxOf(val1, 
> [101]...)">
> {code}
>  It would be nice to allow literal values as parameters of UDFs and UDAs too.
>  I was thinking about a use case for a UDA groupBy() function where the end 
> user can *inject* at runtime a literal value to select which aggregation they 
> want to display, something similar to GROUP BY ... HAVING 





[jira] [Commented] (CASSANDRA-9935) Repair fails with RuntimeException

2016-04-13 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239551#comment-15239551
 ] 

Paulo Motta commented on CASSANDRA-9935:


Hey [~ruoranwang], thanks for the report and helping troubleshoot the issue.

Do you have any update on this? While your patch might work, it might come at 
the expense of performance, because the default {{getScanner}} implementation 
creates an {{IScanner}} instance for each sstable, while CASSANDRA-4142 improved 
this for LCS to have one scanner per level, making iteration faster.

I think what might be happening is a race condition: an sstable is added to or 
removed from a level by a compaction during validation, but the 
{{LeveledScanner}} is created assuming there are no overlaps within each level, 
so we get the {{received out of order}} {{AssertionError}}.

I created a 
[patch|https://github.com/apache/cassandra/commit/a8c573547677f97b875583b8992155e7333659c3]
 that might solve this by verifying that each sstable's level corresponds to its 
level in the current manifest, so we can guarantee the level is non-overlapping. 
Otherwise the sstable was added or removed recently, so we create an exclusive 
scanner for it, and it will be merged correctly during validation.

Are you able to build a custom jar with that patch and check whether it solves 
the issue? I'm attaching a .patch file to this ticket so you can apply it in 
your custom branch.
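
A rough, self-contained sketch of that grouping idea (hypothetical names, not the actual patch code): sstables whose recorded level agrees with the manifest can safely share one scanner per level, while any sstable whose level disagrees — because it was added or removed concurrently — gets its own exclusive scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ScannerGrouping {
    // Group sstables for scanning: matching levels share one group per level
    // (guaranteed non-overlapping); mismatches each get an exclusive group.
    static List<List<String>> groupForScanning(Map<String, Integer> recordedLevel,
                                               Map<String, Integer> manifestLevel) {
        Map<Integer, List<String>> perLevel = new TreeMap<>();
        List<List<String>> groups = new ArrayList<>();
        for (Map.Entry<String, Integer> e : recordedLevel.entrySet()) {
            Integer manifest = manifestLevel.get(e.getKey());
            if (manifest != null && manifest.equals(e.getValue()))
                perLevel.computeIfAbsent(manifest, k -> new ArrayList<>()).add(e.getKey());
            else
                groups.add(List.of(e.getKey())); // exclusive scanner
        }
        groups.addAll(perLevel.values()); // one shared scanner per level
        return groups;
    }

    public static void main(String[] args) {
        Map<String, Integer> recorded = Map.of("s1", 1, "s2", 1, "s3", 2);
        Map<String, Integer> manifest = Map.of("s1", 1, "s2", 1, "s3", 1); // s3 moved
        // s3's recorded level disagrees with the manifest, so it scans alone.
        System.out.println(groupForScanning(recorded, manifest));
    }
}
```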

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: 9935.patch, db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702) but after upgrade 
> to 2.1.8 it started to work faster but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished

[jira] [Updated] (CASSANDRA-9935) Repair fails with RuntimeException

2016-04-13 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-9935:
---
Attachment: 9935.patch

> Repair fails with RuntimeException
> --
>
> Key: CASSANDRA-9935
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9935
> Project: Cassandra
>  Issue Type: Bug
> Environment: C* 2.1.8, Debian Wheezy
>Reporter: mlowicki
>Assignee: Yuki Morishita
> Fix For: 2.1.x
>
> Attachments: 9935.patch, db1.sync.lati.osa.cassandra.log, 
> db5.sync.lati.osa.cassandra.log, system.log.10.210.3.117, 
> system.log.10.210.3.221, system.log.10.210.3.230
>
>
> We had problems with slow repair in 2.1.7 (CASSANDRA-9702); after upgrading 
> to 2.1.8 it works faster, but now it fails with:
> {code}
> ...
> [2015-07-29 20:44:03,956] Repair session 23a811b0-3632-11e5-a93e-4963524a8bde 
> for range (-5474076923322749342,-5468600594078911162] finished
> [2015-07-29 20:44:03,957] Repair session 336f8740-3632-11e5-a93e-4963524a8bde 
> for range (-8631877858109464676,-8624040066373718932] finished
> [2015-07-29 20:44:03,957] Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde 
> for range (-5372806541854279315,-5369354119480076785] finished
> [2015-07-29 20:44:03,957] Repair session 59f129f0-3632-11e5-a93e-4963524a8bde 
> for range (8166489034383821955,8168408930184216281] finished
> [2015-07-29 20:44:03,957] Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde 
> for range (6084602890817326921,6088328703025510057] finished
> [2015-07-29 20:44:03,957] Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde 
> for range (-781874602493000830,-781745173070807746] finished
> [2015-07-29 20:44:03,957] Repair command #4 finished
> error: nodetool failed, check server logs
> -- StackTrace --
> java.lang.RuntimeException: nodetool failed, check server logs
> at 
> org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:290)
> at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:202)
> {code}
> After running:
> {code}
> nodetool repair --partitioner-range --parallel --in-local-dc sync
> {code}
> Last records in logs regarding repair are:
> {code}
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 09ff9e40-3632-11e5-a93e-4963524a8bde for range 
> (-7695808664784761779,-7693529816291585568] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 17d8d860-3632-11e5-a93e-4963524a8bde for range 
> (806371695398849,8065203836608925992] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 23a811b0-3632-11e5-a93e-4963524a8bde for range 
> (-5474076923322749342,-5468600594078911162] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,956 StorageService.java:2952 - 
> Repair session 336f8740-3632-11e5-a93e-4963524a8bde for range 
> (-8631877858109464676,-8624040066373718932] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 4ccd8430-3632-11e5-a93e-4963524a8bde for range 
> (-5372806541854279315,-5369354119480076785] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 59f129f0-3632-11e5-a93e-4963524a8bde for range 
> (8166489034383821955,8168408930184216281] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 6ae7a9a0-3632-11e5-a93e-4963524a8bde for range 
> (6084602890817326921,6088328703025510057] finished
> INFO  [Thread-173887] 2015-07-29 20:44:03,957 StorageService.java:2952 - 
> Repair session 8938e4a0-3632-11e5-a93e-4963524a8bde for range 
> (-781874602493000830,-781745173070807746] finished
> {code}
> but a bit above I see (at least two times in attached log):
> {code}
> ERROR [Thread-173887] 2015-07-29 20:44:03,853 StorageService.java:2959 - 
> Repair session 1b07ea50-3608-11e5-a93e-4963524a8bde for range 
> (5765414319217852786,5781018794516851576] failed with error 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> org.apache.cassandra.exceptions.RepairException: [repair 
> #1b07ea50-3608-11e5-a93e-4963524a8bde on sync/entity_by_id2, 
> (5765414319217852786,5781018794516851576]] Validation failed in /10.195.15.162
> at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.get(FutureTask.java:188) 
> [na:1.7.0_80]
> at 
> org.apache.cassandra.service.StorageService$4.runMayThrow(StorageService.java:2950)
>  ~[apache-cassandra-2.1.8.jar:2.1.8]
>   

[jira] [Commented] (CASSANDRA-11556) PER PARTITION LIMIT does not work properly for multi-partition query with ORDER BY

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239519#comment-15239519
 ] 

Alex Petrov commented on CASSANDRA-11556:
-

Thank you for the review! 

To summarise:
  * I've added an {{InvalidRequestException}} on aggregate queries with {{PER 
PARTITION LIMIT}}
  * got rid of {{assertRowsIgnoringOrder}} and {{assertRowsCount}} (there are 
just two places where {{assertRowsCount}} was used; it was mostly for simplicity, 
as we don't know which partitions are returned, and these tests are covered 
in dtest)
  * added test with {{ORDER}}, {{LIMIT}} and {{PER PARTITION LIMIT}}
  * moved the tests to {{SelectLimitTest}}

|[trunk|https://github.com/ifesdjeen/cassandra/tree/11556-trunk]|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11556-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11556-trunk-dtest/]|

Both dtests are failing on trunk too; issues are filed 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11560] and 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11127]. Unit tests are 
passing.


> PER PARTITION LIMIT does not work properly for multi-partition query with 
> ORDER BY
> --
>
> Key: CASSANDRA-11556
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11556
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Benjamin Lerer
>Assignee: Alex Petrov
> Fix For: 3.6
>
>
> Multi-partition queries with {{PER PARTITION LIMIT}} with {{ORDER BY}} do not 
> respect the {{PER PARTITION LIMIT}}.
> The problem can be reproduced with the following unit test:
> {code}
> @Test
> public void testPerPartitionLimitWithMultiPartitionQueryAndOrderBy() 
> throws Throwable
> {
> createTable("CREATE TABLE %s (a int, b int, c int, PRIMARY KEY (a, 
> b))");
> for (int i = 0; i < 5; i++)
> {
> for (int j = 0; j < 5; j++)
> {
> execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", i, j, j);
> }
> }
> assertRows(execute("SELECT * FROM %s WHERE a IN (2, 3) ORDER BY b 
> DESC PER PARTITION LIMIT ?", 2),
> row(2, 4, 4),
> row(3, 4, 4),
> row(2, 3, 3),
> row(3, 3, 3));
> }
> {code} 
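For illustration only, the intended {{PER PARTITION LIMIT}} semantics can be sketched outside Cassandra: rows arrive already sorted by the {{ORDER BY}} clause, and the limit caps how many rows each partition may contribute while preserving that order. The function below is a hypothetical re-implementation, not Cassandra code:

```python
from collections import defaultdict

def per_partition_limit(rows, limit):
    """Keep at most `limit` rows per partition key (rows[i] = (a, b, c)).

    Rows are assumed already sorted by the ORDER BY clause; the cap is
    applied per partition while preserving that order.
    """
    seen = defaultdict(int)
    out = []
    for row in rows:
        pk = row[0]  # partition key is column `a`
        if seen[pk] < limit:
            seen[pk] += 1
            out.append(row)
    return out

# Rows for partitions a=2 and a=3, ordered by b DESC (as in the unit test).
rows = [(a, b, b) for b in range(4, -1, -1) for a in (2, 3)]
print(per_partition_limit(rows, 2))
# [(2, 4, 4), (3, 4, 4), (2, 3, 3), (3, 3, 3)]
```

The output matches the rows the {{assertRows}} call in the reproducer expects.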



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-10988) ClassCastException in SelectStatement

2016-04-13 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-10988:

Fix Version/s: 2.2.x

> ClassCastException in SelectStatement
> -
>
> Key: CASSANDRA-10988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10988
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Vassil Hristov
>Assignee: Alex Petrov
> Fix For: 2.2.x
>
>
> After we've upgraded our cluster to version 2.1.11, we started getting the 
> below exceptions for some of our queries. The issue seems very similar to 
> CASSANDRA-7284.
> Code to reproduce:
> {code:java}
> createTable("CREATE TABLE %s (" +
> "a text," +
> "b int," +
> "PRIMARY KEY (a, b)" +
> ") WITH COMPACT STORAGE" +
> "AND CLUSTERING ORDER BY (b DESC)");
> execute("insert into %s (a, b) values ('a', 2)");
> execute("SELECT * FROM %s WHERE a = 'a' AND b > 0");
> {code}
> {code:java}
> java.lang.ClassCastException: 
> org.apache.cassandra.db.composites.Composites$EmptyComposite cannot be cast 
> to org.apache.cassandra.db.composites.CellName
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractSimpleCellNameType.makeCellName(AbstractSimpleCellNameType.java:125)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.makeCellName(AbstractCellNameType.java:254)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.makeExclusiveSliceBound(SelectStatement.java:1197)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.applySliceRestriction(SelectStatement.java:1205)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1283)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1250)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:299)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:276)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:67)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:493)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:138)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_66]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-2.1.11.jar:2.1.11]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
> {code}





[jira] [Commented] (CASSANDRA-10988) ClassCastException in SelectStatement

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239513#comment-15239513
 ] 

Alex Petrov commented on CASSANDRA-10988:
-

I've tested it with the {{2.2}} branch and was able to reproduce it with unit tests 
(summary updated). {{3.x}} is unaffected.

> ClassCastException in SelectStatement
> -
>
> Key: CASSANDRA-10988
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10988
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Vassil Hristov
>Assignee: Alex Petrov
>
> After we've upgraded our cluster to version 2.1.11, we started getting the 
> below exceptions for some of our queries. The issue seems very similar to 
> CASSANDRA-7284.
> Code to reproduce:
> {code:java}
> createTable("CREATE TABLE %s (" +
> "a text," +
> "b int," +
> "PRIMARY KEY (a, b)" +
> ") WITH COMPACT STORAGE" +
> "AND CLUSTERING ORDER BY (b DESC)");
> execute("insert into %s (a, b) values ('a', 2)");
> execute("SELECT * FROM %s WHERE a = 'a' AND b > 0");
> {code}
> {code:java}
> java.lang.ClassCastException: 
> org.apache.cassandra.db.composites.Composites$EmptyComposite cannot be cast 
> to org.apache.cassandra.db.composites.CellName
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractSimpleCellNameType.makeCellName(AbstractSimpleCellNameType.java:125)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.db.composites.AbstractCellNameType.makeCellName(AbstractCellNameType.java:254)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.makeExclusiveSliceBound(SelectStatement.java:1197)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.applySliceRestriction(SelectStatement.java:1205)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1283)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1250)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:299)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:276)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:67)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:493)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:138)
>  ~[apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_66]
> at 
> org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
>  [apache-cassandra-2.1.11.jar:2.1.11]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
> [apache-cassandra-2.1.11.jar:2.1.11]
> at 

[jira] [Updated] (CASSANDRA-10624) Support UDT in CQLSSTableWriter

2016-04-13 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-10624:

Status: Ready to Commit  (was: Patch Available)

> Support UDT in CQLSSTableWriter
> ---
>
> Key: CASSANDRA-10624
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10624
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Sylvain Lebresne
>Assignee: Alex Petrov
> Fix For: 3.x
>
> Attachments: 0001-Add-support-for-UDTs-to-CQLSStableWriter.patch, 
> 0001-Support-UDTs-in-CQLSStableWriterV2.patch
>
>
> As far as I can tell, there is no way to use a UDT with {{CQLSSTableWriter}} 
> since there is no way to declare it and thus {{CQLSSTableWriter.Builder}} 
> knows of no UDT when parsing the {{CREATE TABLE}} statement passed.
> In terms of API, I think the simplest would be to allow passing types to the 
> builder in the same way we pass the table definition. So something like:
> {noformat}
> String type = "CREATE TYPE myKs.vertex (x int, y int, z int)";
> String schema = "CREATE TABLE myKs.myTable ("
>   + "  k int PRIMARY KEY,"
>   + "  s set"
>   + ")";
> String insert = ...;
> CQLSSTableWriter writer = CQLSSTableWriter.builder()
>   .inDirectory("path/to/directory")
>   .withType(type)
>   .forTable(schema)
>   .using(insert).build();
> {noformat}
> I'll note that, implementation wise, this might be a bit simpler after the 
> changes of CASSANDRA-10365 (as it makes it easy to pass specific types 
> during the preparation of the create statement).





[jira] [Commented] (CASSANDRA-10624) Support UDT in CQLSSTableWriter

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239505#comment-15239505
 ] 

Alex Petrov commented on CASSANDRA-10624:
-

Both dtests are failing on trunk too; issues are filed 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11560] and 
[here|https://issues.apache.org/jira/browse/CASSANDRA-11127].
The unit tests are all passing locally.

> Support UDT in CQLSSTableWriter
> ---
>
> Key: CASSANDRA-10624
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10624
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Sylvain Lebresne
>Assignee: Alex Petrov
> Fix For: 3.x
>
> Attachments: 0001-Add-support-for-UDTs-to-CQLSStableWriter.patch, 
> 0001-Support-UDTs-in-CQLSStableWriterV2.patch
>
>
> As far as I can tell, there is no way to use a UDT with {{CQLSSTableWriter}} 
> since there is no way to declare it and thus {{CQLSSTableWriter.Builder}} 
> knows of no UDT when parsing the {{CREATE TABLE}} statement passed.
> In terms of API, I think the simplest would be to allow passing types to the 
> builder in the same way we pass the table definition. So something like:
> {noformat}
> String type = "CREATE TYPE myKs.vertex (x int, y int, z int)";
> String schema = "CREATE TABLE myKs.myTable ("
>   + "  k int PRIMARY KEY,"
>   + "  s set"
>   + ")";
> String insert = ...;
> CQLSSTableWriter writer = CQLSSTableWriter.builder()
>   .inDirectory("path/to/directory")
>   .withType(type)
>   .forTable(schema)
>   .using(insert).build();
> {noformat}
> I'll note that, implementation wise, this might be a bit simpler after the 
> changes of CASSANDRA-10365 (as it makes it easy to pass specific types 
> during the preparation of the create statement).





[jira] [Created] (CASSANDRA-11566) read time out when do count(*)

2016-04-13 Thread nizar (JIRA)
nizar created CASSANDRA-11566:
-

 Summary: read time out when do count(*)
 Key: CASSANDRA-11566
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11566
 Project: Cassandra
  Issue Type: Bug
 Environment: staging
Reporter: nizar
 Fix For: 3.3


Hello, I am using Cassandra (DataStax) 3.3 and I keep getting read timeouts even 
if I set the limit to 1. A timeout would make sense with a high limit, but timing 
out with only LIMIT 1 seems odd.

[cqlsh 5.0.1 | Cassandra 3.3 | CQL spec 3.4.0 | Native protocol v4]
cqlsh:test> select count(*) from test.my_view where s_id=? and flag=false limit 
1;
OperationTimedOut: errors={}, last_host=
My view definition looks like this:
CREATE MATERIALIZED VIEW test.my_view AS
  SELECT *
  FROM table_name
  WHERE id IS NOT NULL AND processed IS NOT NULL AND time IS NOT NULL AND id 
IS NOT NULL
  PRIMARY KEY ( ( s_id, flag ), time, id )
  WITH CLUSTERING ORDER BY ( time ASC );

 I have 5 nodes with replication factor 3:
CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
'dc': '3'}  AND durable_writes = true;

Below is the output of nodetool cfstats:

Keyspace: test
Read Count: 128770
Read Latency: 1.42208769123243 ms.
Write Count: 0
Write Latency: NaN ms.
Pending Flushes: 0
Table: tableName
SSTable count: 3
Space used (live): 280777032
Space used (total): 280777032
Space used by snapshots (total): 0
Off heap memory used (total): 2850227
SSTable Compression Ratio: 0.24706731995327527
Number of keys (estimate): 1277211
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 3
Local read latency: 0.396 ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 1589848
Bloom filter off heap memory used: 1589824
Index summary off heap memory used: 1195691
Compression metadata off heap memory used: 64712
Compacted partition minimum bytes: 311
Compacted partition maximum bytes: 535
Compacted partition mean bytes: 458
Average live cells per slice (last five minutes): 102.92671205446536
Maximum live cells per slice (last five minutes): 103
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1


Table: my_view
SSTable count: 4
Space used (live): 126114270
Space used (total): 126114270
Space used by snapshots (total): 0
Off heap memory used (total): 91588
SSTable Compression Ratio: 0.1652453778228639
Number of keys (estimate): 8
Memtable cell count: 0
Memtable data size: 0
Memtable off heap memory used: 0
Memtable switch count: 0
Local read count: 128767
Local read latency: 1.590 ms
Local write count: 0
Local write latency: NaN ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 96
Bloom filter off heap memory used: 64
Index summary off heap memory used: 140
Compression metadata off heap memory used: 91384
Compacted partition minimum bytes: 3974
Compacted partition maximum bytes: 386857368
Compacted partition mean bytes: 26034715
Average live cells per slice (last five minutes): 102.99462595230145
Maximum live cells per slice (last five minutes): 103
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1

Thank you.
Nizar
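For context on why LIMIT 1 does not help here: LIMIT truncates the result set, and {{SELECT count(*)}} returns a single row, so the aggregate must still read every matching row in the (very large, per the cfstats above) partition before the limit applies. A toy sketch of that evaluation order (illustrative Python, not Cassandra code; names are hypothetical):

```python
def select_count_with_limit(partition_rows, limit):
    """Evaluate SELECT count(*) ... LIMIT n over one partition's rows.

    The aggregate consumes every row first; LIMIT then truncates the
    aggregated result, which is already a single row.
    """
    scanned = 0
    for _ in partition_rows:       # full scan: cost grows with partition size
        scanned += 1
    result_rows = [(scanned,)]     # count(*) yields exactly one row
    return result_rows[:limit], scanned

rows, scanned = select_count_with_limit(range(1_000_000), 1)
print(rows, scanned)
# [(1000000,)] 1000000  -- LIMIT 1 did not reduce the number of rows scanned
```

So the read timeout tracks partition size, not the LIMIT value.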






[jira] [Commented] (CASSANDRA-11562) "Could not retrieve endpoint ranges" for sstableloader

2016-04-13 Thread Jens Rantil (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239488#comment-15239488
 ] 

Jens Rantil commented on CASSANDRA-11562:
-

Tested moving "manifest.json" out of the directory. Still getting the same 
error message.

> "Could not retrieve endpoint ranges" for sstableloader
> --
>
> Key: CASSANDRA-11562
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11562
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: $ uname -a
> Linux bigdb-100 3.2.0-99-virtual #139-Ubuntu SMP Mon Feb 1 23:52:21 UTC 2016 
> x86_64 x86_64 x86_64 GNU/Linux
> I am using Datastax Enterprise 4.7.5-1 which is based on 2.1.11.
>Reporter: Jens Rantil
>
> I am setting up a second datacenter and have a very slow and shaky VPN 
> connection to my old datacenter. To speed up import process I am trying to 
> seed the new datacenter with a backup (that has been transferred encrypted 
> out of bands from the VPN). When this is done I will issue a final 
> clusterwide repair.
> However...sstableloader crashes with the following:
> {noformat}
> sstableloader -v --nodes XXX --username MYUSERNAME --password MYPASSWORD 
> --ignore YYY,ZZZ ./backupdir/MYKEYSPACE/MYTABLE/
> Could not retrieve endpoint ranges:
> java.lang.IllegalArgumentException
> java.lang.RuntimeException: Could not retrieve endpoint ranges:
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
> at 
> org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
> Caused by: java.lang.IllegalArgumentException
> at java.nio.Buffer.limit(Buffer.java:267)
> at 
> org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
> at 
> org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
> at 
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
> at 
> org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
> at 
> org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
> at 
> org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
> at 
> org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
> at 
> org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
> at 
> org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
> at 
> org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
> ... 2 more
> {noformat}
> (where YYY,ZZZ are nodes in the old DC)
> The files in ./backupdir/MYKEYSPACE/MYTABLE/ are an exact copy of a snapshot 
> from the older datacenter that has been taken with the exact same version of 
> Datastax Enterprise/Cassandra. The backup was taken 2-3 days ago.
> Question: ./backupdir/MYKEYSPACE/MYTABLE/ contains the non-"*.db" file  
> "manifest.json". Is that an issue?
> My workaround for my quest will probably be to copy the snapshot directories 
> out to the nodes of the new datacenter and do a DC-local repair+cleanup.
> Let me know if I can assist in debugging this further.
> References:
>  * This _might_ be a duplicate of 
> https://issues.apache.org/jira/browse/CASSANDRA-10629.
>  * http://stackoverflow.com/q/34757922/260805. 
> http://stackoverflow.com/a/35213418/260805 claims this could happen when 
> dropping a column, but I don't think I've ever dropped a column from this 
> table.
>  * http://stackoverflow.com/q/28632555/260805
>  * http://stackoverflow.com/q/34487567/260805





[jira] [Created] (CASSANDRA-11565) dtest failure in cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_non_prepared_statements

2016-04-13 Thread Michael Shuler (JIRA)
Michael Shuler created CASSANDRA-11565:
--

 Summary: dtest failure in 
cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest.test_bulk_round_trip_non_prepared_statements
 Key: CASSANDRA-11565
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11565
 Project: Cassandra
  Issue Type: Test
Reporter: Michael Shuler
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/cassandra-2.1_offheap_dtest/329/testReport/cqlsh_tests.cqlsh_copy_tests/CqlshCopyTest/test_bulk_round_trip_non_prepared_statements

Failed on CassCI build cassandra-2.1_offheap_dtest #329

{noformat}
Error Message

'int' object has no attribute 'on_read_timeout'
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-LNfFyy
dtest: DEBUG: Custom init_config not found. Setting defaults.
dtest: DEBUG: Done setting configuration options:
{   'initial_token': None,
'memtable_allocation_type': 'offheap_objects',
'num_tokens': '32',
'phi_convict_threshold': 5,
'range_request_timeout_in_ms': 1,
'read_request_timeout_in_ms': 1,
'request_timeout_in_ms': 1,
'truncate_request_timeout_in_ms': 1,
'write_request_timeout_in_ms': 1}
dtest: DEBUG: Running stress without any user profile
- >> end captured logging << -
Stacktrace

  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File "/home/automaton/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py", line 
2325, in test_bulk_round_trip_non_prepared_statements
copy_from_options={'PREPAREDSTATEMENTS': False})
  File "/home/automaton/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py", line 
2283, in _test_bulk_round_trip
num_records = create_records()
  File "/home/automaton/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py", line 
2258, in create_records
ret = rows_to_list(self.session.execute(count_statement))[0][0]
  File "cassandra/cluster.py", line 1581, in cassandra.cluster.Session.execute 
(cassandra/cluster.c:27046)
return self.execute_async(query, parameters, trace, custom_payload, 
timeout).result()
  File "cassandra/cluster.py", line 3145, in 
cassandra.cluster.ResponseFuture.result (cassandra/cluster.c:59905)
raise self._final_exception
"'int' object has no attribute 'on_read_timeout'\n >> begin 
captured logging << \ndtest: DEBUG: cluster ccm directory: 
/mnt/tmp/dtest-LNfFyy\ndtest: DEBUG: Custom init_config not found. Setting 
defaults.\ndtest: DEBUG: Done setting configuration options:\n{   
'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n
'num_tokens': '32',\n'phi_convict_threshold': 5,\n
'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
1,\n'request_timeout_in_ms': 1,\n
'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
1}\ndtest: DEBUG: Running stress without any user 
profile\n- >> end captured logging << -"
{noformat}





[jira] [Created] (CASSANDRA-11564) dtest failure in secondary_indexes_test.TestSecondaryIndexes.test_query_indexes_with_vnodes

2016-04-13 Thread Michael Shuler (JIRA)
Michael Shuler created CASSANDRA-11564:
--

 Summary: dtest failure in 
secondary_indexes_test.TestSecondaryIndexes.test_query_indexes_with_vnodes
 Key: CASSANDRA-11564
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11564
 Project: Cassandra
  Issue Type: Test
Reporter: Michael Shuler
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/trunk_novnode_dtest/344/testReport/secondary_indexes_test/TestSecondaryIndexes/test_query_indexes_with_vnodes

Failed on CassCI build trunk_novnode_dtest #344

Test does not appear to configure a single-token cluster correctly:
{noformat}
Error Message

Error starting node1.
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-4pEIhy
dtest: DEBUG: Custom init_config not found. Setting defaults.
dtest: DEBUG: Done setting configuration options:
{   'num_tokens': None,
'phi_convict_threshold': 5,
'range_request_timeout_in_ms': 1,
'read_request_timeout_in_ms': 1,
'request_timeout_in_ms': 1,
'truncate_request_timeout_in_ms': 1,
'write_request_timeout_in_ms': 1}
- >> end captured logging << -
Stacktrace

  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File "/home/automaton/cassandra-dtest/secondary_indexes_test.py", line 436, 
in test_query_indexes_with_vnodes
cluster.populate(2, use_vnodes=True).start()
  File "/home/automaton/ccm/ccmlib/cluster.py", line 360, in start
raise NodeError("Error starting {0}.".format(node.name), p)
"Error starting node1.\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
/mnt/tmp/dtest-4pEIhy\ndtest: DEBUG: Custom init_config not found. Setting 
defaults.\ndtest: DEBUG: Done setting configuration options:\n{   'num_tokens': 
None,\n'phi_convict_threshold': 5,\n'range_request_timeout_in_ms': 
1,\n'read_request_timeout_in_ms': 1,\n'request_timeout_in_ms': 
1,\n'truncate_request_timeout_in_ms': 1,\n
'write_request_timeout_in_ms': 1}\n- >> end captured 
logging << -"
Standard Output

[node1 ERROR] Invalid yaml. Those properties [num_tokens] are not valid
[node2 ERROR] Invalid yaml. Those properties [num_tokens] are not valid
{noformat}
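The failure above stems from a `None` value being serialized into cassandra.yaml, where it is rejected as an invalid property. A minimal sketch of the guard the test harness appears to need — a hypothetical helper, not ccm's actual config writer:

```python
def clean_config(options):
    """Drop options whose value is None so they never reach cassandra.yaml.

    Hypothetical helper: ccm's real config writer differs, but the guard
    the failing single-token tests appear to need looks like this.
    """
    return {k: v for k, v in options.items() if v is not None}

opts = {'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
print(clean_config(opts))
# {'phi_convict_threshold': 5, 'start_rpc': 'true'}
```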





[jira] [Created] (CASSANDRA-11563) dtest failure in repair_tests.incremental_repair_test.TestIncRepair.sstable_marking_test_not_intersecting_all_ranges

2016-04-13 Thread Michael Shuler (JIRA)
Michael Shuler created CASSANDRA-11563:
--

 Summary: dtest failure in 
repair_tests.incremental_repair_test.TestIncRepair.sstable_marking_test_not_intersecting_all_ranges
 Key: CASSANDRA-11563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11563
 Project: Cassandra
  Issue Type: Test
Reporter: Michael Shuler
Assignee: DS Test Eng


example failure:

http://cassci.datastax.com/job/trunk_novnode_dtest/344/testReport/repair_tests.incremental_repair_test/TestIncRepair/sstable_marking_test_not_intersecting_all_ranges

Failed on CassCI build trunk_novnode_dtest #344

Test does not appear to deal with single-token cluster testing correctly:
{noformat}
Error Message

Error starting node1.
 >> begin captured logging << 
dtest: DEBUG: cluster ccm directory: /mnt/tmp/dtest-I164Fa
dtest: DEBUG: Custom init_config not found. Setting defaults.
dtest: DEBUG: Done setting configuration options:
{   'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
- >> end captured logging << -
Stacktrace

  File "/usr/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
  File 
"/home/automaton/cassandra-dtest/repair_tests/incremental_repair_test.py", line 
369, in sstable_marking_test_not_intersecting_all_ranges
cluster.populate(4, use_vnodes=True).start()
  File "/home/automaton/ccm/ccmlib/cluster.py", line 360, in start
raise NodeError("Error starting {0}.".format(node.name), p)
"Error starting node1.\n >> begin captured logging << 
\ndtest: DEBUG: cluster ccm directory: 
/mnt/tmp/dtest-I164Fa\ndtest: DEBUG: Custom init_config not found. Setting 
defaults.\ndtest: DEBUG: Done setting configuration options:\n{   'num_tokens': 
None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}\n- 
>> end captured logging << -"
Standard Output

[node1 ERROR] Invalid yaml. Those properties [num_tokens] are not valid
[node3 ERROR] Invalid yaml. Those properties [num_tokens] are not valid
[node2 ERROR] Invalid yaml. Those properties [num_tokens] are not valid
[node4 ERROR] Invalid yaml. Those properties [num_tokens] are not valid
{noformat}





[jira] [Created] (CASSANDRA-11562) "Could not retrieve endpoint ranges" for sstableloader

2016-04-13 Thread Jens Rantil (JIRA)
Jens Rantil created CASSANDRA-11562:
---

 Summary: "Could not retrieve endpoint ranges" for sstableloader
 Key: CASSANDRA-11562
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11562
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: $ uname -a
Linux bigdb-100 3.2.0-99-virtual #139-Ubuntu SMP Mon Feb 1 23:52:21 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux

I am using Datastax Enterprise 4.7.5-1 which is based on 2.1.11.
Reporter: Jens Rantil


I am setting up a second datacenter and have a very slow and shaky VPN 
connection to my old datacenter. To speed up the import process I am trying to 
seed the new datacenter with a backup (transferred encrypted, out of band from 
the VPN). Once this is done I will issue a final cluster-wide repair.

However, sstableloader crashes with the following:

{noformat}
sstableloader -v --nodes XXX --username MYUSERNAME --password MYPASSWORD 
--ignore YYY,ZZZ ./backupdir/MYKEYSPACE/MYTABLE/
Could not retrieve endpoint ranges:
java.lang.IllegalArgumentException
java.lang.RuntimeException: Could not retrieve endpoint ranges:
at 
org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:338)
at 
org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:156)
at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:106)
Caused by: java.lang.IllegalArgumentException
at java.nio.Buffer.limit(Buffer.java:267)
at 
org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:543)
at 
org.apache.cassandra.serializers.CollectionSerializer.readValue(CollectionSerializer.java:124)
at 
org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:101)
at 
org.apache.cassandra.serializers.MapSerializer.deserializeForNativeProtocol(MapSerializer.java:30)
at 
org.apache.cassandra.serializers.CollectionSerializer.deserialize(CollectionSerializer.java:50)
at 
org.apache.cassandra.db.marshal.AbstractType.compose(AbstractType.java:68)
at 
org.apache.cassandra.cql3.UntypedResultSet$Row.getMap(UntypedResultSet.java:287)
at 
org.apache.cassandra.config.CFMetaData.fromSchemaNoTriggers(CFMetaData.java:1833)
at 
org.apache.cassandra.config.CFMetaData.fromThriftCqlRow(CFMetaData.java:1126)
at 
org.apache.cassandra.tools.BulkLoader$ExternalClient.init(BulkLoader.java:330)
... 2 more
{noformat}
(where YYY,ZZZ are nodes in the old DC)

The files in ./backupdir/MYKEYSPACE/MYTABLE/ are an exact copy of a snapshot 
from the older datacenter that has been taken with the exact same version of 
Datastax Enterprise/Cassandra. The backup was taken 2-3 days ago.

Question: ./backupdir/MYKEYSPACE/MYTABLE/ contains a non-"*.db" file, 
"manifest.json". Could that be an issue?

My workaround will probably be to copy the snapshot directories out to the 
nodes of the new datacenter and run a DC-local repair+cleanup.

Let me know if I can assist in debugging this further.

References:
 * This _might_ be a duplicate of 
https://issues.apache.org/jira/browse/CASSANDRA-10629.
 * http://stackoverflow.com/q/34757922/260805. 
http://stackoverflow.com/a/35213418/260805 claims this could happen when 
dropping a column, but I don't think I've ever dropped a column from this table.
 * http://stackoverflow.com/q/28632555/260805
 * http://stackoverflow.com/q/34487567/260805
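For reference, the crash originates while slicing a length-prefixed value out of a serialized collection: Buffer.limit throws IllegalArgumentException when the prefix points past the end of the buffer, which is what a schema/data mismatch (e.g. after a dropped column) can produce. An illustrative length-prefixed reader — not Cassandra's actual deserializer — showing the same failure shape:

```python
import struct

def read_value(buf, offset):
    # Each element is stored as a 4-byte big-endian length prefix followed
    # by that many bytes; a stale or mismatched prefix over-runs the buffer,
    # the moral equivalent of Buffer.limit() rejecting an out-of-range limit.
    (length,) = struct.unpack_from(">i", buf, offset)
    offset += 4
    if length < 0 or offset + length > len(buf):
        raise ValueError("length prefix %d overruns %d-byte buffer"
                         % (length, len(buf)))
    return buf[offset:offset + length], offset + length

good = struct.pack(">i", 3) + b"abc"
print(read_value(good, 0)[0])  # b'abc'

bad = struct.pack(">i", 100) + b"abc"  # prefix claims 100 bytes, only 3 present
try:
    read_value(bad, 0)
except ValueError as e:
    print(e)
```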





[jira] [Updated] (CASSANDRA-11553) hadoop.cql3.CqlRecordWriter does not close cluster on reconnect

2016-04-13 Thread Jeremiah Jordan (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremiah Jordan updated CASSANDRA-11553:

Fix Version/s: (was: 3.0.6)
   (was: 3.6)
   (was: 2.2.6)
   3.x
   3.0.x
   2.2.x

> hadoop.cql3.CqlRecordWriter does not close cluster on reconnect
> ---
>
> Key: CASSANDRA-11553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11553
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Artem Aliev
>Assignee: Artem Aliev
> Fix For: 2.2.x, 3.0.x, 3.x
>
> Attachments: CASSANDRA-11553-2.2.txt
>
>
> CASSANDRA-10058 added session and cluster close calls to all places in hadoop 
> except one, on reconnection.
> The writer uses one connection per new cluster, so I added a cluster.close() 
> call to the sessionClose() method.





[jira] [Updated] (CASSANDRA-10988) ClassCastException in SelectStatement

2016-04-13 Thread Alex Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-10988:

Description: 
After we upgraded our cluster to version 2.1.11, we started getting the 
exceptions below for some of our queries. The issue seems very similar to 
CASSANDRA-7284.

Code to reproduce:

{code:java}
createTable("CREATE TABLE %s (" +
"a text," +
"b int," +
"PRIMARY KEY (a, b)" +
") WITH COMPACT STORAGE" +
"AND CLUSTERING ORDER BY (b DESC)");

execute("insert into %s (a, b) values ('a', 2)");
execute("SELECT * FROM %s WHERE a = 'a' AND b > 0");
{code}

{code:java}
java.lang.ClassCastException: 
org.apache.cassandra.db.composites.Composites$EmptyComposite cannot be cast to 
org.apache.cassandra.db.composites.CellName
at 
org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.composites.AbstractSimpleCellNameType.makeCellName(AbstractSimpleCellNameType.java:125)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.composites.AbstractCellNameType.makeCellName(AbstractCellNameType.java:254)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.makeExclusiveSliceBound(SelectStatement.java:1197)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.applySliceRestriction(SelectStatement.java:1205)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processColumnFamily(SelectStatement.java:1283)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:1250)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:299)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:276)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:224)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:67)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:493)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:138)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:439)
 [apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335)
 [apache-cassandra-2.1.11.jar:2.1.11]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324)
 [netty-all-4.0.23.Final.jar:4.0.23.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_66]
at 
org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164)
 [apache-cassandra-2.1.11.jar:2.1.11]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) 
[apache-cassandra-2.1.11.jar:2.1.11]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
{code}

  was:
After we've upgraded our cluster to version 2.1.11, we started getting the 
bellow exceptions for some of our queries. Issue seems to be very similar to 
CASSANDRA-7284.

{code:java}
java.lang.ClassCastException: 
org.apache.cassandra.db.composites.Composites$EmptyComposite cannot be cast to 
org.apache.cassandra.db.composites.CellName
at 
org.apache.cassandra.db.composites.AbstractCellNameType.cellFromByteBuffer(AbstractCellNameType.java:188)
 ~[apache-cassandra-2.1.11.jar:2.1.11]
at 
org.apache.cassandra.db.composites.AbstractSimpleCellNameType.makeCellName(AbstractSimpleCellNameType.java:125)
 

[jira] [Updated] (CASSANDRA-11553) hadoop.cql3.CqlRecordWriter does not close cluster on reconnect

2016-04-13 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11553:
---
Fix Version/s: (was: 3.5)

> hadoop.cql3.CqlRecordWriter does not close cluster on reconnect
> ---
>
> Key: CASSANDRA-11553
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11553
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Artem Aliev
>Assignee: Artem Aliev
> Fix For: 2.2.6, 3.6, 3.0.6
>
> Attachments: CASSANDRA-11553-2.2.txt
>
>
> CASSANDRA-10058 added session and cluster close calls to all places in hadoop 
> except one, on reconnection.
> The writer uses one connection per new cluster, so I added a cluster.close() 
> call to the sessionClose() method.





[jira] [Updated] (CASSANDRA-11541) correct the java documentation for SlabAllocator and NativeAllocator

2016-04-13 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-11541:
---
Fix Version/s: (was: 3.5)
   3.6

> correct the java documentation for SlabAllocator and NativeAllocator
> 
>
> Key: CASSANDRA-11541
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11541
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Trivial
>  Labels: document, lhf
> Fix For: 2.1.14, 3.6
>
> Attachments: CASSANDRA-11541.patch
>
>
> I have heard a lot that Cassandra uses a 2MB slab allocation strategy for the 
> memtable to improve its memory efficiency, but in fact the slab size has been 
> changed from 2MB to 1MB. 
> In NativeAllocator, allocation sizes grow logarithmically from 8KB to 1MB, so 
> a large number of tables may not cause too much trouble in terms of memtable 
> memory.
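An illustrative doubling schedule consistent with the "logarithmically from 8KB to 1MB" behavior described above — an assumed shape for illustration, not the actual NativeAllocator code:

```python
def region_sizes(start=8 * 1024, end=1024 * 1024):
    # Allocation region sizes double from 8KB up to a 1MB cap, so the
    # number of distinct sizes grows logarithmically with the cap.
    size = start
    while size <= end:
        yield size
        size *= 2

print([s // 1024 for s in region_sizes()])
# [8, 16, 32, 64, 128, 256, 512, 1024]
```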





svn commit: r13235 - in /release/cassandra: 3.0.3/ 3.0.4/ 3.5/ debian/dists/35x/ debian/dists/35x/main/ debian/dists/35x/main/binary-amd64/ debian/dists/35x/main/binary-i386/ debian/dists/35x/main/sou

2016-04-13 Thread jake
Author: jake
Date: Wed Apr 13 14:57:25 2016
New Revision: 13235

Log:
3.5 rel

Added:
release/cassandra/3.5/
release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz   (with props)
release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc
release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.md5
release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.sha1
release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.md5
release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.sha1
release/cassandra/3.5/apache-cassandra-3.5-src.tar.gz   (with props)
release/cassandra/3.5/apache-cassandra-3.5-src.tar.gz.asc
release/cassandra/3.5/apache-cassandra-3.5-src.tar.gz.asc.md5
release/cassandra/3.5/apache-cassandra-3.5-src.tar.gz.asc.sha1
release/cassandra/3.5/apache-cassandra-3.5-src.tar.gz.md5
release/cassandra/3.5/apache-cassandra-3.5-src.tar.gz.sha1
release/cassandra/debian/dists/35x/
release/cassandra/debian/dists/35x/InRelease
release/cassandra/debian/dists/35x/Release
release/cassandra/debian/dists/35x/Release.gpg
release/cassandra/debian/dists/35x/main/
release/cassandra/debian/dists/35x/main/binary-amd64/
release/cassandra/debian/dists/35x/main/binary-amd64/Packages
release/cassandra/debian/dists/35x/main/binary-amd64/Packages.gz   (with 
props)
release/cassandra/debian/dists/35x/main/binary-amd64/Release
release/cassandra/debian/dists/35x/main/binary-i386/
release/cassandra/debian/dists/35x/main/binary-i386/Packages
release/cassandra/debian/dists/35x/main/binary-i386/Packages.gz   (with 
props)
release/cassandra/debian/dists/35x/main/binary-i386/Release
release/cassandra/debian/dists/35x/main/source/
release/cassandra/debian/dists/35x/main/source/Release
release/cassandra/debian/dists/35x/main/source/Sources.gz   (with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra-tools_3.5_all.deb  
 (with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.5.diff.gz   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.5.dsc
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.5.orig.tar.gz   
(with props)
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.5.orig.tar.gz.asc
release/cassandra/debian/pool/main/c/cassandra/cassandra_3.5_all.deb   
(with props)
Removed:
release/cassandra/3.0.3/
release/cassandra/3.0.4/

Added: release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz
==
Binary file - no diff available.

Propchange: release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz
--
svn:mime-type = application/octet-stream

Added: release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc
==
--- release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc (added)
+++ release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc Wed Apr 13 
14:57:25 2016
@@ -0,0 +1,17 @@
+-BEGIN PGP SIGNATURE-
+Version: GnuPG v1
+
+iQIcBAABAgAGBQJXCQ28AAoJEHSdbuwDU7EsM2kP/iaUTOGpdLoF2gGzqTWcW962
+Cki5N3K+a4LDKFHU0zr9RacKQ21g10GQ+rRWL7hI1KbD+TO2iuCcXLBfBoqocKjY
+OCclkrHOes83Hgk4oK7qYP1p/+ipwJsUBSDDzWNoiyNrTiiPlOhjlnK1d6osbGWG
+RMEV7bavlMK30HyZkffriB6BIhpm+/3iG0DK7VPwsd+/ojhu2CdkaZoSE7kYTMJr
+4XdHyTISoDObT3XJ6EYn/tikjPY3FmABW/a8DwYYUa5uRBFgUDmWGOXMKuuwNars
+SdVI7meJ1MfKelDQ2xAxsHWr3mZpRuEJS+VPlrHPc4n2VGOejm2XlyJ+3iE3LBFy
++0VDOH7mjYhhuY2oV4q/grQB+ywbxFJZuq8qrpcN47SlMLQhu5o9F9abum2Lohe2
+LWKMrkDbZMzSqeIBfCQfaUku+NjZzduci+Ro57LWiTUMDbcRmJYZ1lw4dEIwNpQH
+2IqO00fYAnow5xgRRU77Trz/r1DXso46lCyYrn1x2laHtJ6ub5VTxK+Kx363vNhM
++6SQhFR3lyrmTbsFzARSGZp4dMnuNMWAMjr/qwkRkOtFPt8Dxi2clNE4kbvV9xNV
+4wBCtMrfvPfivRSXGodVrdolBP8Mm3dNrlXie+mht30OSw0rM6mEAI44kkFJOzF3
+QcKfbEJdQIaxRS/lANTN
+=Fd7E
+-END PGP SIGNATURE-

Added: release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.md5
==
--- release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.md5 (added)
+++ release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.md5 Wed Apr 13 
14:57:25 2016
@@ -0,0 +1 @@
+e0ab14577b95c973004070bf406483ac
\ No newline at end of file

Added: release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.sha1
==
--- release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.sha1 (added)
+++ release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.asc.sha1 Wed Apr 13 
14:57:25 2016
@@ -0,0 +1 @@
+88b11cbf4a07f3705e0228602eaf4152cfa2e8fa
\ No newline at end of file

Added: release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.md5
==
--- release/cassandra/3.5/apache-cassandra-3.5-bin.tar.gz.md5 (added)
+++ 

[cassandra] Git Push Summary

2016-04-13 Thread jake
Repository: cassandra
Updated Tags:  refs/tags/3.5-tentative [deleted] 020dd2d10


[cassandra] Git Push Summary

2016-04-13 Thread jake
Repository: cassandra
Updated Tags:  refs/tags/cassandra-3.5 [created] 6a201b169


[jira] [Commented] (CASSANDRA-9766) Bootstrap outgoing streaming speeds are much slower than during repair

2016-04-13 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239330#comment-15239330
 ] 

T Jake Luciani commented on CASSANDRA-9766:
---

 I've taken a look at the streaming path and have been able to improve 
Streaming performance 25% with the following:

* CompressedStreamReader had only implemented reading a single byte at a time. 
I added the read(byte[], int, int) method.

* When writing to an sstable, rather than calculating the size of the row and 
then writing the row (which costs 2x the CPU), serialize the row to memory, 
then write the size of the memory buffer and copy the buffer to disk.

* Added object recycling of the largest garbage sources, namely 
BTreeSearchIterator and DataOutputBuffer (for the above fix). There are still a 
few more places recycling would help, like the Object[] in BTree.Builder.

* Changed all ThreadLocals to use FastThreadLocal from netty, and subsequently 
adding FastThreadLocalThreads for all internal threads.

There are still more things to do here; for example, we generate tons of 
garbage boxing types for StreamingHistogram.

I've added a long test to stream and compact a large sstable.

Branch is https://github.com/tjake/cassandra/tree/faster-streaming

3.5 test:
Finished Streaming in 25 seconds: 18.92 Mb/sec

branch test:
Finished Streaming in 19 seconds: 25.04 Mb/sec
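The second bullet — serialize once into memory, then emit the size and the buffer — can be sketched as follows (illustrative Python, not the actual Cassandra write path):

```python
import io
import struct

def write_sized(out, serialize):
    # Serialize the row once into a memory buffer, then emit the size
    # prefix followed by the buffer, instead of computing the size with a
    # dry-run serialization pass and then serializing again to the output.
    buf = io.BytesIO()
    serialize(buf)
    payload = buf.getvalue()
    out.write(struct.pack(">i", len(payload)))
    out.write(payload)

sink = io.BytesIO()
write_sized(sink, lambda b: b.write(b"row-bytes"))
print(sink.getvalue())  # b'\x00\x00\x00\trow-bytes'
```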

> Bootstrap outgoing streaming speeds are much slower than during repair
> --
>
> Key: CASSANDRA-9766
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9766
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: Cassandra 2.1.2. more details in the pdf attached 
>Reporter: Alexei K
>Assignee: T Jake Luciani
>  Labels: performance
> Fix For: 2.1.x
>
> Attachments: problem.pdf
>
>
> I have a cluster in Amazon cloud , its described in detail in the attachment. 
> What I've noticed is that we during bootstrap we never go above 12MB/sec 
> transmission speeds and also those speeds flat line almost like we're hitting 
> some sort of a limit ( this remains true for other tests that I've ran) 
> however during the repair we see much higher,variable sending rates. I've 
> provided network charts in the attachment as well . Is there an explanation 
> for this? Is something wrong with my configuration, or is it a possible bug?





[jira] [Resolved] (CASSANDRA-11561) Improve Streaming Performance

2016-04-13 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani resolved CASSANDRA-11561.

Resolution: Duplicate

> Improve Streaming Performance
> -
>
> Key: CASSANDRA-11561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11561
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 3.x
>
>
> Inspired by CASSANDRA-9766 I've taken a look at the streaming path and have 
> been able to improve Streaming performance 25% with the following:
> * CompressedStreamReader had only implemented reading a single byte at a 
> time. I added the read(byte[], int, int) method.
> * When writing to an sstable, rather than calculating the size of the row and 
> then writing the row (which costs 2x the CPU), serialize the row to memory, 
> then write the size of the memory buffer and copy the buffer to disk.
> * Added object recycling of the largest garbage sources, namely 
> BTreeSearchIterator and DataOutputBuffer (for the above fix). There are still 
> a few more places recycling would help, like the Object[] in BTree.Builder.
> * Changed all ThreadLocals to use FastThreadLocal from netty, and 
> subsequently adding FastThreadLocalThreads for all internal threads.
> There are still more things to do here; for example, we generate tons of 
> garbage boxing types for StreamingHistogram.
> I've added a long test to stream and compact a large sstable.





[jira] [Updated] (CASSANDRA-11516) Make max number of streams configurable

2016-04-13 Thread Aleksey Yeschenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-11516:
--
Labels: lhf  (was: )

> Make max number of streams configurable
> ---
>
> Key: CASSANDRA-11516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11516
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Sebastian Estevez
>  Labels: lhf
>
> Today we default to num cores. In large boxes (many cores), this is 
> suboptimal as it can generate huge amounts of garbage that GC can't keep up 
> with.
> Usually we tackle issues like this with the streaming throughput levers but 
> in this case the problem is CPU consumption by StreamReceiverTasks 
> specifically in the IntervalTree build -- 
> https://github.com/apache/cassandra/blob/cassandra-2.1.12/src/java/org/apache/cassandra/utils/IntervalTree.java#L257
> We need a max number of parallel streams lever to handle this.
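A lever like the one requested can be sketched as a simple bound on concurrent stream tasks. This is illustrative only; the class and parameter names are hypothetical, not Cassandra's API:

```python
import threading

class BoundedStreamPool:
    """Cap concurrent stream-receive tasks at max_streams instead of
    defaulting to the number of cores (hypothetical illustration)."""

    def __init__(self, max_streams):
        self._sem = threading.Semaphore(max_streams)

    def run(self, task):
        # Blocks until one of the max_streams slots is free, then runs
        # the task and releases the slot.
        with self._sem:
            return task()

pool = BoundedStreamPool(max_streams=4)
print(pool.run(lambda: "stream done"))  # stream done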





[jira] [Commented] (CASSANDRA-11561) Improve Streaming Performance

2016-04-13 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239320#comment-15239320
 ] 

T Jake Luciani commented on CASSANDRA-11561:


Branch is https://github.com/tjake/cassandra/tree/faster-streaming

3.5 test:
Finished Streaming in 25 seconds: 18.92 Mb/sec

branch test:
Finished Streaming in 19 seconds: 25.04 Mb/sec

> Improve Streaming Performance
> -
>
> Key: CASSANDRA-11561
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11561
> Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
>Reporter: T Jake Luciani
>Assignee: T Jake Luciani
> Fix For: 3.x
>
>
> Inspired by CASSANDRA-9766 I've taken a look at the streaming path and have 
> been able to improve Streaming performance 25% with the following:
> * CompressedStreamReader had only implemented reading a single byte at a 
> time. I added the read(byte[], int, int) method.
> * When writing to sstable, rather than calculate size of row, write the size 
> then write the row (which causes 2x the cpu). Serialize the row to memory 
> then write size of memory buffer and copy buffer to disk.
> * Added object recycling of the largest garbage sources.  Namely, 
> BTreeSearchIterator, and DataOutputBuffer (for above fix). There are still a 
> few more places recycling would help/ like the Object[] in BTree.Builder
> * Changed all ThreadLocals to use FastThreadLocal from netty, and 
> subsequently adding FastThreadLocalThreads for all internal threads.
> There are still more things todo here, like we generate tons of garbage 
> boxing types for StreamingHistogram.
> I've added a long test to stream and compact a large sstable.





[jira] [Created] (CASSANDRA-11561) Improve Streaming Performance

2016-04-13 Thread T Jake Luciani (JIRA)
T Jake Luciani created CASSANDRA-11561:
--

 Summary: Improve Streaming Performance
 Key: CASSANDRA-11561
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11561
 Project: Cassandra
  Issue Type: Bug
  Components: Streaming and Messaging
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 3.x


Inspired by CASSANDRA-9766 I've taken a look at the streaming path and have 
been able to improve Streaming performance 25% with the following:

* CompressedStreamReader had only implemented reading a single byte at a time. 
I added the read(byte[], int, int) method.

* When writing to an sstable, rather than calculating the size of the row and 
then writing the row (which costs 2x the CPU), serialize the row to memory, 
then write the size of the memory buffer and copy the buffer to disk.

* Added object recycling of the largest garbage sources, namely 
BTreeSearchIterator and DataOutputBuffer (for the above fix). There are still a 
few more places recycling would help, like the Object[] in BTree.Builder.

* Changed all ThreadLocals to use FastThreadLocal from netty, and subsequently 
adding FastThreadLocalThreads for all internal threads.

There are still more things to do here; for example, we generate tons of 
garbage boxing types for StreamingHistogram.

I've added a long test to stream and compact a large sstable.









[jira] [Comment Edited] (CASSANDRA-5870) CQLSH not showing milliseconds in timestamps

2016-04-13 Thread Evan Prothro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239299#comment-15239299
 ] 

Evan Prothro edited comment on CASSANDRA-5870 at 4/13/16 2:09 PM:
--

Thanks for the update, Tyler!


was (Author: eprothro):
Fixed in 10428.

> CQLSH not showing milliseconds in timestamps
> 
>
> Key: CASSANDRA-5870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5870
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Linux
>Reporter: Ben Boule
>Priority: Minor
> Fix For: 1.2.9
>
>
> CQLSH does not include the milliseconds portion of the timestamp when 
> outputting query results.  For example on my system a time might be displayed 
> like this:
> "2013-08-09 10:55:58-0400" for a time stored in cassandra as: 1376060158267
> We've found this extremely annoying when dealing with time series data as it 
> will make records which occurred at different times appear to occur at the 
> same time.
> I'm submitting a patch. The existing formatting code already handles some 
> versions of Python that do not support formatting time zones; I'm not sure 
> which versions of Python can format seconds+milliseconds, so I attempted to 
> supply something that works with any time_format string and does not depend 
> on the system library.
> The above time with the patch will format like this:
> "2013-08-09 10:55:58.267-0400"





[jira] [Commented] (CASSANDRA-5870) CQLSH not showing milliseconds in timestamps

2016-04-13 Thread Evan Prothro (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239299#comment-15239299
 ] 

Evan Prothro commented on CASSANDRA-5870:
-

Fixed in 10428.

> CQLSH not showing milliseconds in timestamps
> 
>
> Key: CASSANDRA-5870
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5870
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: Linux
>Reporter: Ben Boule
>Priority: Minor
> Fix For: 1.2.9
>
>
> CQLSH does not include the milliseconds portion of the timestamp when 
> outputting query results.  For example on my system a time might be displayed 
> like this:
> "2013-08-09 10:55:58-0400" for a time stored in cassandra as: 1376060158267
> We've found this extremely annoying when dealing with time series data as it 
> will make records which occurred at different times appear to occur at the 
> same time.
> I'm submitting a patch. The existing formatting code already handles some 
> versions of Python that do not support formatting time zones; I'm not sure 
> which versions of Python can format seconds+milliseconds, so I attempted to 
> supply something that works with any time_format string and does not depend 
> on the system library.
> The above time with the patch will format like this:
> "2013-08-09 10:55:58.267-0400"





[jira] [Comment Edited] (CASSANDRA-10756) Timeout failures in NativeTransportService.testConcurrentDestroys unit test

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239212#comment-15239212
 ] 

Alex Petrov edited comment on CASSANDRA-10756 at 4/13/16 1:22 PM:
--

For the sake of experiment, I've changed the order (made a shutdown of 
{{workerGroup}} and {{eventExecutorGroup}} before calling {{destroy}}); it 
seems unrelated, as there are different executors being shut down. 

Same with the {{emptyList}} replacement: since the call to {{Server::stop}} is 
synchronous, at least one thread would succeed.

So it seems that some other group being shut down is causing this error. Maybe 
the root cause is that the test itself times out...


was (Author: ifesdjeen):
For the sake of experiment, I've changed the order (shutting down 
{{workerGroup}} and {{eventExecutorGroup}} before calling {{destroy}}); it 
seems unrelated, as different executors are being shut down. 
Also, there's an assert for {{allTerminated.get()}}, which confirms that the 
worker group and {{eventExecutorGroup}} are unrelated.

Same with the {{emptyList}} replacement: since the call to {{Server::stop}} is 
synchronous, at least one thread would succeed.

So it seems that some other group being shut down is causing this 
error. Maybe the root cause lies in the fact that the test times out...

> Timeout failures in NativeTransportService.testConcurrentDestroys unit test
> ---
>
> Key: CASSANDRA-10756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10756
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joel Knighton
>Assignee: Alex Petrov
>
> History of test on trunk 
> [here|http://cassci.datastax.com/job/trunk_testall/lastCompletedBuild/testReport/org.apache.cassandra.service/NativeTransportServiceTest/testConcurrentDestroys/history/].
> I've seen these failures across 3.0/trunk for a while. I ran the test looping 
> locally for a while and the timeout is fairly easy to reproduce. The timeout 
> appears to be an indefinite hang and not a timing issue.
> When the timeout occurs, the following stack trace is at the end of the logs 
> for the unit test.
> {code}
> ERROR [ForkJoinPool.commonPool-worker-1] 2015-11-22 21:30:53,635 Failed to 
> submit a listener notification task. Event loop shut down?
> java.util.concurrent.RejectedExecutionException: event executor terminated
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:745)
>  ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:322)
>  ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:728)
>  ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.DefaultPromise.execute(DefaultPromise.java:671) 
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.DefaultPromise.notifyLateListener(DefaultPromise.java:641)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:138) 
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.group.DefaultChannelGroupFuture.&lt;init&gt;(DefaultChannelGroupFuture.java:116)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.group.DefaultChannelGroup.close(DefaultChannelGroup.java:275)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.group.DefaultChannelGroup.close(DefaultChannelGroup.java:167)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> org.apache.cassandra.transport.Server$ConnectionTracker.closeAll(Server.java:277)
>  [main/:na]
>   at org.apache.cassandra.transport.Server.close(Server.java:180) 
> [main/:na]
>   at org.apache.cassandra.transport.Server.stop(Server.java:116) 
> [main/:na]
>   at java.util.Collections$SingletonSet.forEach(Collections.java:4767) 
> ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.service.NativeTransportService.stop(NativeTransportService.java:136)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.NativeTransportService.destroy(NativeTransportService.java:144)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.NativeTransportServiceTest.lambda$withService$102(NativeTransportServiceTest.java:201)
>  ~[classes/:na]
>   at 

[jira] [Commented] (CASSANDRA-10756) Timeout failures in NativeTransportService.testConcurrentDestroys unit test

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239212#comment-15239212
 ] 

Alex Petrov commented on CASSANDRA-10756:
-

For the sake of experiment, I've changed the order (shutting down 
{{workerGroup}} and {{eventExecutorGroup}} before calling {{destroy}}); it 
seems unrelated, as different executors are being shut down. 
Also, there's an assert for {{allTerminated.get()}}, which confirms that the 
worker group and {{eventExecutorGroup}} are unrelated.

Same with the {{emptyList}} replacement: since the call to {{Server::stop}} is 
synchronous, at least one thread would succeed.

So it seems that some other group being shut down is causing this 
error. Maybe the root cause lies in the fact that the test times out...
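The failure mode in the stack trace below — a listener notification submitted to an already-terminated event executor — can be reproduced with plain {{java.util.concurrent}}. This is a minimal sketch of the race, not Netty code; the single-thread executor here merely stands in for Netty's event executor group:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

public class LateListener {
    public static void main(String[] args) throws Exception {
        ExecutorService eventExecutor = Executors.newSingleThreadExecutor();
        // A listener notification delivered while the executor is alive succeeds.
        eventExecutor.submit(() -> System.out.println("listener notified")).get();

        // Terminate the executor, as destroy() terminates the event executor group.
        eventExecutor.shutdown();
        eventExecutor.awaitTermination(1, TimeUnit.SECONDS);

        try {
            // A close() racing with destroy() tries to register one more listener;
            // the terminated executor rejects it, as in the test's error log.
            eventExecutor.submit(() -> System.out.println("late listener"));
        } catch (RejectedExecutionException e) {
            System.out.println("rejected after termination");
        }
    }
}
```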

> Timeout failures in NativeTransportService.testConcurrentDestroys unit test
> ---
>
> Key: CASSANDRA-10756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10756
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Joel Knighton
>Assignee: Alex Petrov
>
> History of test on trunk 
> [here|http://cassci.datastax.com/job/trunk_testall/lastCompletedBuild/testReport/org.apache.cassandra.service/NativeTransportServiceTest/testConcurrentDestroys/history/].
> I've seen these failures across 3.0/trunk for a while. I ran the test looping 
> locally for a while and the timeout is fairly easy to reproduce. The timeout 
> appears to be an indefinite hang and not a timing issue.
> When the timeout occurs, the following stack trace is at the end of the logs 
> for the unit test.
> {code}
> ERROR [ForkJoinPool.commonPool-worker-1] 2015-11-22 21:30:53,635 Failed to 
> submit a listener notification task. Event loop shut down?
> java.util.concurrent.RejectedExecutionException: event executor terminated
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.reject(SingleThreadEventExecutor.java:745)
>  ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.addTask(SingleThreadEventExecutor.java:322)
>  ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.SingleThreadEventExecutor.execute(SingleThreadEventExecutor.java:728)
>  ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.DefaultPromise.execute(DefaultPromise.java:671) 
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.DefaultPromise.notifyLateListener(DefaultPromise.java:641)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.util.concurrent.DefaultPromise.addListener(DefaultPromise.java:138) 
> [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:93)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.DefaultChannelPromise.addListener(DefaultChannelPromise.java:28)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.group.DefaultChannelGroupFuture.&lt;init&gt;(DefaultChannelGroupFuture.java:116)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.group.DefaultChannelGroup.close(DefaultChannelGroup.java:275)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> io.netty.channel.group.DefaultChannelGroup.close(DefaultChannelGroup.java:167)
>  [netty-all-4.0.23.Final.jar:4.0.23.Final]
>   at 
> org.apache.cassandra.transport.Server$ConnectionTracker.closeAll(Server.java:277)
>  [main/:na]
>   at org.apache.cassandra.transport.Server.close(Server.java:180) 
> [main/:na]
>   at org.apache.cassandra.transport.Server.stop(Server.java:116) 
> [main/:na]
>   at java.util.Collections$SingletonSet.forEach(Collections.java:4767) 
> ~[na:1.8.0_60]
>   at 
> org.apache.cassandra.service.NativeTransportService.stop(NativeTransportService.java:136)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.NativeTransportService.destroy(NativeTransportService.java:144)
>  ~[main/:na]
>   at 
> org.apache.cassandra.service.NativeTransportServiceTest.lambda$withService$102(NativeTransportServiceTest.java:201)
>  ~[classes/:na]
>   at java.util.stream.IntPipeline$3$1.accept(IntPipeline.java:233) 
> ~[na:1.8.0_60]
>   at 
> java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110)
>  ~[na:1.8.0_60]
>   at java.util.Spliterator$OfInt.forEachRemaining(Spliterator.java:693) 
> ~[na:1.8.0_60]
>   at 
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) 
> ~[na:1.8.0_60]
>   at 
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) 
> ~[na:1.8.0_60]
>   at java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:747) 
> ~[na:1.8.0_60]
>   at 

[jira] [Commented] (CASSANDRA-11339) WHERE clause in SELECT DISTINCT can be ignored

2016-04-13 Thread Alex Petrov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15239126#comment-15239126
 ] 

Alex Petrov commented on CASSANDRA-11339:
-

One more change in {{trunk}}: restrictions on static columns should actually be 
supported. I've closed the dtest pull request and updated {{trunk}} accordingly.
{{2.2}} doesn't require this change, as restrictions on static columns (or on 
non-primary key columns without a 2i) are not supported there. I've also 
renamed an imprecise method in {{SelectStatement}}.

|| |2.2|trunk|
||code|[2.2|https://github.com/ifesdjeen/cassandra/tree/11339-2.2]|[trunk|https://github.com/ifesdjeen/cassandra/tree/11339-trunk]|
||utest|[2.2|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11339-2.2-testall/]|[trunk|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11339-trunk-testall/]|
||dtest|[2.2|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11339-2.2-dtest]|[trunk|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-11339-trunk-dtest]|

Unfortunately, CI seems to be down at the moment. I'll schedule a new build as 
soon as it's back up.
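As a sketch, the validation in question amounts to rejecting any {{SELECT DISTINCT}} whose restrictions touch a non-partition-key column. The names below are illustrative stand-ins, not Cassandra's actual {{SelectStatement}} API:

```java
import java.util.List;
import java.util.Set;

public class DistinctValidation {
    /**
     * Hypothetical validation: SELECT DISTINCT may only be restricted on
     * partition key columns; anything else must be rejected, not silently
     * dropped (the bug this ticket fixes).
     */
    static void validateDistinct(Set<String> partitionKeyColumns,
                                 List<String> restrictedColumns) {
        for (String column : restrictedColumns)
            if (!partitionKeyColumns.contains(column))
                throw new IllegalArgumentException(
                    "SELECT DISTINCT only supports restrictions on partition key"
                    + " columns (got a restriction on " + column + ")");
    }

    public static void main(String[] args) {
        // PRIMARY KEY (id, v): restricting on id is fine...
        validateDistinct(Set.of("id"), List.of("id"));
        try {
            // ...but "WHERE v > X" must now fail instead of being ignored.
            validateDistinct(Set.of("id"), List.of("v"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```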

> WHERE clause in SELECT DISTINCT can be ignored
> --
>
> Key: CASSANDRA-11339
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11339
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Philip Thompson
>Assignee: Alex Petrov
> Fix For: 2.2.x, 3.x
>
> Attachments: 
> 0001-Add-validation-for-distinct-queries-disallowing-quer.patch
>
>
> I've tested this out on 2.1-head. I'm not sure if it's the same behavior on 
> newer versions.
> For a given table t, with {{PRIMARY KEY (id, v)}} the following two queries 
> return the same result:
> {{SELECT DISTINCT id FROM t WHERE v > X ALLOW FILTERING}}
> {{SELECT DISTINCT id FROM t}}
> The WHERE clause in the former is silently ignored, and all id are returned, 
> regardless of the value of v in any row. 
> It seems like this has been a known issue for a while:
> http://stackoverflow.com/questions/26548788/select-distinct-cql-ignores-where-clause
> However, if we don't support filtering on anything but the partition key, we 
> should reject the query, rather than silently dropping the where clause



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-04-13 Thread vincent.poncet (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238992#comment-15238992
 ] 

vincent.poncet edited comment on CASSANDRA-9259 at 4/13/16 10:15 AM:
-

In data warehouse / analytics use cases, you are mostly doing full scans 
(hopefully with some predicate pushdown, both projection and filtering), 
reading a big number of rows and then doing aggregations. That takes time. So 
by definition, an analytic query on an OLTP database is always "wrong", in the 
sense that while the query is running, data is changed, deleted, updated, and 
inserted. So operational analytics is always approximate.
The only way to get an exact result from an analytic query on an OLTP database 
would be the ability to query a snapshot, so the query sees coherent data and 
is not affected by changes made while it runs. That's like MVCC in an RDBMS; 
it has a performance cost and is relaxed for most analytics workloads.

So, my point is that in operational analytics, CL=1 will be perfectly fine.


was (Author: vincent.pon...@gmail.com):
In data warehouse / analytics use cases, you are mostly doing full scans 
(hopefully with some predicate pushdown, both projection and filtering), 
reading a big number of rows and then doing aggregations. That takes time. So 
by definition, an analytic query on an OLTP database is always "wrong", in the 
sense that while the query is running, data is changed, deleted, updated, and 
inserted. So operational analytics is always approximate.

So, my point is that in operational analytics, CL=1 will be perfectly fine.

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz
>
>
> This ticket is following on from the 2015 NGCC.  This ticket is designed to 
> be a place for discussing and designing an approach to bulk reading.
> The goal is to have a bulk reading path for Cassandra.  That is, a path 
> optimized to grab a large portion of the data for a table (potentially all of 
> it).  This is a core element in the Spark integration with Cassandra, and the 
> speed at which Cassandra can deliver bulk data to Spark is limiting the 
> performance of Spark-plus-Cassandra operations.  This is especially of 
> importance as Cassandra will (likely) leverage Spark for internal operations 
> (for example CASSANDRA-8234).
> The core CQL to consider is the following:
> SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND 
> Token(partitionKey) <= Y
> Here, we choose X and Y to be contained within one token range (perhaps 
> considering the primary range of a node without vnodes, for example).  This 
> query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk 
> operations via Spark (or other processing frameworks - ETL, etc).  There are 
> a few causes (e.g., inefficient paging).
> There are a few approaches that could be considered.  First, we consider a 
> new "Streaming Compaction" approach.  The key observation here is that a bulk 
> read from Cassandra is a lot like a major compaction, though instead of 
> outputting a new SSTable we would output CQL rows to a stream/socket/etc.  
> This would be similar to a CompactionTask, but would strip out some 
> unnecessary things in there (e.g., some of the indexing, etc). Predicates and 
> projections could also be encapsulated in this new "StreamingCompactionTask", 
> for example.
> Another approach would be an alternate storage format.  For example, we might 
> employ Parquet (just as an example) to store the same data as in the primary 
> Cassandra storage (aka SSTables).  This is akin to Global Indexes (an 
> alternate storage of the same data optimized for a particular query).  Then, 
> Cassandra can choose to leverage this alternate storage for particular CQL 
> queries (e.g., range scans).
> These are just 2 suggestions to get the conversation going.
> One thing to note is that it will be useful to have this storage segregated 
> by token range so that when you extract via these mechanisms you do not get 
> replications-factor numbers of copies of the data.  That will certainly be an 
> issue for some Spark operations (e.g., counting).  Thus, we will want 
> per-token-range storage (even for single disks), so this will likely leverage 
> CASSANDRA-6696 (though, 

[jira] [Commented] (CASSANDRA-9259) Bulk Reading from Cassandra

2016-04-13 Thread vincent.poncet (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238992#comment-15238992
 ] 

vincent.poncet commented on CASSANDRA-9259:
---

In data warehouse / analytics use cases, you are mostly doing full scans 
(hopefully with some predicate pushdown, both projection and filtering), 
reading a big number of rows and then doing aggregations. That takes time. So 
by definition, an analytic query on an OLTP database is always "wrong", in the 
sense that while the query is running, data is changed, deleted, updated, and 
inserted. So operational analytics is always approximate.

So, my point is that in operational analytics, CL=1 will be perfectly fine.
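The token-range scan in the ticket's example CQL is typically parallelized client-side by splitting the full ring into contiguous sub-ranges, one query per range. A minimal sketch for the Murmur3 partitioner follows; the table and column names come from the ticket, while the splitting itself is a common connector-side technique, not a Cassandra server API:

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRanges {
    // Murmur3 partitioner token bounds (full signed 64-bit range).
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Split the full ring into n contiguous (start, end] sub-ranges. */
    static List<BigInteger[]> split(int n) {
        BigInteger span = MAX.subtract(MIN).divide(BigInteger.valueOf(n));
        List<BigInteger[]> ranges = new ArrayList<>();
        BigInteger start = MIN;
        for (int i = 0; i < n; i++) {
            // Force the last range to end exactly at MAX to cover the whole ring.
            BigInteger end = (i == n - 1) ? MAX : start.add(span);
            ranges.add(new BigInteger[] { start, end });
            start = end;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Emit one bulk-read query per sub-range, as in the ticket's example CQL.
        for (BigInteger[] r : split(4))
            System.out.printf(
                "SELECT a, b, c FROM myKs.myTable WHERE token(partitionKey) > %s"
                + " AND token(partitionKey) <= %s%n", r[0], r[1]);
    }
}
```

Because the sub-ranges partition the ring, running them in parallel reads every row exactly once per replica set, which is why per-token-range storage avoids replication-factor duplicates.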

> Bulk Reading from Cassandra
> ---
>
> Key: CASSANDRA-9259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Compaction, CQL, Local Write-Read Paths, Streaming and 
> Messaging, Testing
>Reporter:  Brian Hess
>Assignee: Stefania
>Priority: Critical
> Fix For: 3.x
>
> Attachments: bulk-read-benchmark.1.html, 
> bulk-read-jfr-profiles.1.tar.gz, bulk-read-jfr-profiles.2.tar.gz
>
>
> This ticket is following on from the 2015 NGCC.  This ticket is designed to 
> be a place for discussing and designing an approach to bulk reading.
> The goal is to have a bulk reading path for Cassandra.  That is, a path 
> optimized to grab a large portion of the data for a table (potentially all of 
> it).  This is a core element in the Spark integration with Cassandra, and the 
> speed at which Cassandra can deliver bulk data to Spark is limiting the 
> performance of Spark-plus-Cassandra operations.  This is especially of 
> importance as Cassandra will (likely) leverage Spark for internal operations 
> (for example CASSANDRA-8234).
> The core CQL to consider is the following:
> SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND 
> Token(partitionKey) <= Y
> Here, we choose X and Y to be contained within one token range (perhaps 
> considering the primary range of a node without vnodes, for example).  This 
> query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk 
> operations via Spark (or other processing frameworks - ETL, etc).  There are 
> a few causes (e.g., inefficient paging).
> There are a few approaches that could be considered.  First, we consider a 
> new "Streaming Compaction" approach.  The key observation here is that a bulk 
> read from Cassandra is a lot like a major compaction, though instead of 
> outputting a new SSTable we would output CQL rows to a stream/socket/etc.  
> This would be similar to a CompactionTask, but would strip out some 
> unnecessary things in there (e.g., some of the indexing, etc). Predicates and 
> projections could also be encapsulated in this new "StreamingCompactionTask", 
> for example.
> Another approach would be an alternate storage format.  For example, we might 
> employ Parquet (just as an example) to store the same data as in the primary 
> Cassandra storage (aka SSTables).  This is akin to Global Indexes (an 
> alternate storage of the same data optimized for a particular query).  Then, 
> Cassandra can choose to leverage this alternate storage for particular CQL 
> queries (e.g., range scans).
> These are just 2 suggestions to get the conversation going.
> One thing to note is that it will be useful to have this storage segregated 
> by token range so that when you extract via these mechanisms you do not get 
> replications-factor numbers of copies of the data.  That will certainly be an 
> issue for some Spark operations (e.g., counting).  Thus, we will want 
> per-token-range storage (even for single disks), so this will likely leverage 
> CASSANDRA-6696 (though, we'll want to also consider the single disk case).
> It is also worth discussing what the success criteria is here.  It is 
> unlikely to be as fast as EDW or HDFS performance (though, that is still a 
> good goal), but being within some percentage of that performance should be 
> set as success.  For example, 2x as long as doing bulk operations on HDFS 
> with similar node count/size/etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2016-04-13 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238980#comment-15238980
 ] 

Robert Stupp commented on CASSANDRA-10783:
--

Sure, rebased + CI triggered.

> Allow literal value as parameter of UDF & UDA
> -
>
> Key: CASSANDRA-10783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10783
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: CQL3, UDF, client-impacting, doc-impacting
> Fix For: 3.x
>
>
> I have defined the following UDF
> {code:sql}
> CREATE OR REPLACE FUNCTION  maxOf(current int, testValue int) RETURNS NULL ON 
> NULL INPUT 
> RETURNS int 
> LANGUAGE java 
> AS  'return Math.max(current,testValue);'
> CREATE TABLE maxValue(id int primary key, val int);
> INSERT INTO maxValue(id, val) VALUES(1, 100);
> SELECT maxOf(val, 101) FROM maxValue WHERE id=1;
> {code}
> I got the following error message:
> {code}
> SyntaxException:  message="line 1:19 no viable alternative at input '101' (SELECT maxOf(val1, 
> [101]...)">
> {code}
>  It would be nice to allow literal values as parameters of UDFs and UDAs too.
>  I was thinking about a use case for a UDA groupBy() function where the end 
> user can *inject* at runtime a literal value to select which aggregation he 
> wants to display, something similar to GROUP BY ... HAVING 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2016-04-13 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238966#comment-15238966
 ] 

Benjamin Lerer commented on CASSANDRA-10783:


I will try to review it this week or next.
[~snazy] Could you rebase it?

> Allow literal value as parameter of UDF & UDA
> -
>
> Key: CASSANDRA-10783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10783
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: CQL3, UDF, client-impacting, doc-impacting
> Fix For: 3.x
>
>
> I have defined the following UDF
> {code:sql}
> CREATE OR REPLACE FUNCTION  maxOf(current int, testValue int) RETURNS NULL ON 
> NULL INPUT 
> RETURNS int 
> LANGUAGE java 
> AS  'return Math.max(current,testValue);'
> CREATE TABLE maxValue(id int primary key, val int);
> INSERT INTO maxValue(id, val) VALUES(1, 100);
> SELECT maxOf(val, 101) FROM maxValue WHERE id=1;
> {code}
> I got the following error message:
> {code}
> SyntaxException:  message="line 1:19 no viable alternative at input '101' (SELECT maxOf(val1, 
> [101]...)">
> {code}
>  It would be nice to allow literal values as parameters of UDFs and UDAs too.
>  I was thinking about a use case for a UDA groupBy() function where the end 
> user can *inject* at runtime a literal value to select which aggregation he 
> wants to display, something similar to GROUP BY ... HAVING 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-11522) batch_size_fail_threshold_in_kb shouldn't only apply to batch

2016-04-13 Thread Giampaolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giampaolo reassigned CASSANDRA-11522:
-

Assignee: Giampaolo

> batch_size_fail_threshold_in_kb shouldn't only apply to batch
> -
>
> Key: CASSANDRA-11522
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11522
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sylvain Lebresne
>Assignee: Giampaolo
>Priority: Minor
>  Labels: lhf
>
> I can buy that C* is not good at dealing with large (in bytes) inserts and 
> that it makes sense to provide a user configurable protection against inserts 
> larger than a certain size, but it doesn't make sense to limit this to 
> batches. It's absolutely possible to insert a single very large row and 
> internally a batch with a single statement is exactly the same than a single 
> similar insert, so rejecting the former and not the later is confusing and 
> well, wrong.
> Note that I get that batches are more likely to get big and that's where the 
> protection is most often useful, but limiting the option to batch is still 
> less useful (it's a hole in the protection) and it's going to confuse users 
> in thinking that batches to a single partition are different from single 
> inserts.
> Of course that also mean that we should rename that option to 
> {{write_size_fail_threshold_in_kb}}. Which means we probably want to add this 
> new option and just deprecate {{batch_size_fail_threshold_in_kb}} for now 
> (with removal in 4.0).
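
The unified check proposed above can be sketched in a few lines. The names ({{WRITE_SIZE_FAIL_THRESHOLD_IN_KB}}, {{validateSize}}) are hypothetical, illustrating a guard applied to every write rather than only to batches:

```java
public class WriteSizeGuard {
    // Hypothetical unified threshold, analogous to batch_size_fail_threshold_in_kb.
    static final long WRITE_SIZE_FAIL_THRESHOLD_IN_KB = 50;

    /** Reject any mutation - batched or single - above the configured size. */
    static void validateSize(long mutationSizeBytes) {
        long thresholdBytes = WRITE_SIZE_FAIL_THRESHOLD_IN_KB * 1024;
        if (mutationSizeBytes > thresholdBytes)
            throw new IllegalStateException(
                "Mutation of " + mutationSizeBytes + " bytes exceeds the "
                + WRITE_SIZE_FAIL_THRESHOLD_IN_KB + "kb threshold");
    }

    public static void main(String[] args) {
        validateSize(10 * 1024); // small write: accepted
        try {
            // An oversized write is rejected whether or not it is a batch,
            // closing the hole the ticket describes.
            validateSize(200 * 1024);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```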



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

