[jira] [Commented] (CASSANDRA-9683) Get much higher load and latencies after upgrading from 2.1.6 to Cassandra 2.1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631408#comment-14631408 ]

Loic Lambiel commented on CASSANDRA-9683:
-----------------------------------------

Those stats come from OpsCenter. We're using durable_writes = false in the blobstore keyspace, where most data is written. This may explain the low write latency.

I'm going to try to reproduce this on a new single-node setup, as I don't want to kill this cluster. I'll do it in the coming days, as soon as I have the bandwidth.

We're using Cassandra as a backend for our object storage service, based on our Pithos (http://pithos.io) API frontend. Data can then be uploaded using any S3-compatible tool, such as s3cmd. I don't know if you want to go into such a setup (we could help or do it remotely if needed), and I don't know how to reproduce the data pattern and usage without it.

> Get much higher load and latencies after upgrading from 2.1.6 to Cassandra 2.1.7
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9683
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9683
>             Project: Cassandra
>          Issue Type: Bug
>        Environment: Ubuntu 12.04 (3.13 kernel) x 3
>                     JDK: Oracle JDK 7
>                     RAM: 32 GB
>                     Cores: 4 (+4 HT)
>           Reporter: Loic Lambiel
>           Assignee: Ariel Weisberg
>            Fix For: 2.1.x
>        Attachments: cassandra-env.sh, cassandra.yaml, cfstats.txt, os_load.png, pending_compactions.png, read_latency.png, schema.txt, system.log, write_latency.png
>
> After upgrading our Cassandra staging cluster from 2.1.6 to 2.1.7, the average load grew from 0.1-0.3 to 1.8. Latencies increased as well, and we see an increase in pending compactions, probably due to CASSANDRA-9592. This cluster has almost no workload (staging environment).

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631410#comment-14631410 ]

Jonathan Ellis commented on CASSANDRA-6477:
-------------------------------------------

bq. From a user's perspective, I agree with Sylvain that the MV should respect the CL. I wouldn't expect to do a write at ALL, then do a read and get an old record back.

But the other side of that coin is, we're effectively promoting all operations to at least QUORUM regardless of what the user asked for...

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
>                 Key: CASSANDRA-6477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
>             Project: Cassandra
>          Issue Type: New Feature
>         Components: API, Core
>           Reporter: Jonathan Ellis
>           Assignee: Carl Yeksigian
>             Labels: cql
>            Fix For: 3.0 beta 1
>        Attachments: test-view-data.sh, users.yaml
>
> Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned.
[jira] [Commented] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631417#comment-14631417 ]

Philip Thompson commented on CASSANDRA-9838:
--------------------------------------------

I'm getting {{InvalidRequest: code=2200 [Invalid query] message=Attempted to set an element on a list which is null}} instead, on the same operations. [~thobbs], are these operations valid?

> Unable to update an element in a static list
> --------------------------------------------
>
>                 Key: CASSANDRA-9838
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9838
>             Project: Cassandra
>          Issue Type: Bug
>         Components: Core
>        Environment: Cassandra 2.1.5 on Linux
>           Reporter: Mahesh Datt
>            Fix For: 2.1.x
>
> I created a table in Cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized sizes_list with one element:
> {{UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01}}
> Now I'm trying to update the element at index 0 with statements like this:
> {code}
> insert into my_table (my_id, is_deleted, col_id1, col_id2) values (0x01, False, 0x00, 0x00);
> UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01;
> {code}
> and I see this error:
> {{InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0}}
> If I change my list to a non-static list, it works fine!
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631420#comment-14631420 ]

Sylvain Lebresne commented on CASSANDRA-6477:
---------------------------------------------

bq. But the other side of that coin is, we're effectively promoting all operations to at least QUORUM regardless of what the user asked for...

We're not. In the description I made above, we need to wait on QUORUM responses to remove the entry from the batchlog, but we don't need to wait on QUORUM to respond to the user. Unless my reasoning is broken, we respect the CL levels exactly as we should.
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631422#comment-14631422 ]

Jonathan Ellis commented on CASSANDRA-6477:
-------------------------------------------

1. Paired replica? What?
2. Under what conditions does the replica BL save you from replaying the coordinator BL?
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631426#comment-14631426 ]

Jonathan Ellis commented on CASSANDRA-6477:
-------------------------------------------

Pedantically you are correct. Which is why I said "effectively" and not "literally". :)
[jira] [Commented] (CASSANDRA-9669) Commit Log Replay is Broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631236#comment-14631236 ]

Benedict commented on CASSANDRA-9669:
-------------------------------------

Ick. Thinking about it from a 2.0 perspective, this is even more of a problem for counters, since CL replay of a counter that has already been persisted causes a double count. The question is: do we care? If we do, we should probably stick with the solution I already posted for 2.0. For 2.1+ I think a ledger is a better route.

> Commit Log Replay is Broken
> ---------------------------
>
>                 Key: CASSANDRA-9669
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669
>             Project: Cassandra
>          Issue Type: Bug
>         Components: Core
>           Reporter: Benedict
>           Assignee: Benedict
>           Priority: Critical
>             Labels: correctness
>            Fix For: 3.x, 2.1.x, 2.2.x, 3.0.x
>
> While {{postFlushExecutor}} ensures it never expires CL entries out of order, on restart we simply take the maximum replay position of any sstable on disk and ignore anything prior. It is quite possible for two flushes to be triggered for a given table, and for the second to finish first by virtue of containing a much smaller quantity of live data (or perhaps its disk is just under less pressure). If we crash before the first sstable has been written, then on restart the data it would have represented will disappear, since we will not replay its CL records. This looks to be a bug present since time immemorial, and also seems pretty serious.
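The failure mode described in the ticket can be sketched as a toy model (all names below are illustrative, not Cassandra code): replaying from the maximum persisted commit-log position silently drops records belonging to an earlier flush that never hit disk, while a safe bound only skips the contiguous prefix of positions actually covered by persisted sstables.

```java
import java.util.List;

public class ReplayBound {
    // Unsafe rule the ticket describes: start replay after the highest
    // position persisted by any sstable on disk. Each long[] is an
    // interval of commit-log positions [start, end) covered by one sstable.
    static long unsafeReplayFrom(List<long[]> persistedRanges) {
        long max = 0;
        for (long[] r : persistedRanges)
            max = Math.max(max, r[1]);
        return max;
    }

    // A safe lower bound: only discard the contiguous prefix of positions
    // that is actually covered by persisted sstables.
    static long safeReplayFrom(List<long[]> persistedRanges) {
        long covered = 0;
        boolean progress = true;
        while (progress) {
            progress = false;
            for (long[] r : persistedRanges) {
                if (r[0] <= covered && r[1] > covered) {
                    covered = r[1];
                    progress = true;
                }
            }
        }
        return covered;
    }

    public static void main(String[] args) {
        // Two flushes were in flight; only the second (positions 100..200)
        // made it to disk before the crash.
        List<long[]> onDisk = List.of(new long[]{100, 200});
        System.out.println(unsafeReplayFrom(onDisk)); // 200: records in 0..100 are lost
        System.out.println(safeReplayFrom(onDisk));   // 0: records in 0..100 get replayed
    }
}
```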
[jira] [Commented] (CASSANDRA-9723) UDF / UDA execution time in trace
[ https://issues.apache.org/jira/browse/CASSANDRA-9723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631224#comment-14631224 ]

Christopher Batey commented on CASSANDRA-9723:
----------------------------------------------

Ready for review: https://github.com/chbatey/cassandra-1/tree/udf-trace

> UDF / UDA execution time in trace
> ---------------------------------
>
>                 Key: CASSANDRA-9723
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9723
>             Project: Cassandra
>          Issue Type: Improvement
>         Components: Core
>           Reporter: Christopher Batey
>           Assignee: Christopher Batey
>           Priority: Minor
>            Fix For: 2.2.x
>
> I'd like to see how long my UDF/UDAs take in the trace. I checked in 2.2-rc1 and it doesn't appear to be mentioned.
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631226#comment-14631226 ]

Jon Haddad commented on CASSANDRA-6477:
---------------------------------------

From a user's perspective, I agree with Sylvain that the MV should respect the CL. I wouldn't expect to do a write at ALL, then do a read and get an old record back.
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631267#comment-14631267 ]

Brian Hess commented on CASSANDRA-6477:
---------------------------------------

+1. I think that is the promise of the MV.
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631274#comment-14631274 ]

Ryan McGuire commented on CASSANDRA-8894:
-----------------------------------------

Yep, I can add that to the GUI. I'll probably just add a section for extra cstar_perf settings and document what can be put in there, rather than calling out blockdevice readahead explicitly in the interface (with a help link next to it to make it easy).

> Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8894
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894
>             Project: Cassandra
>          Issue Type: Improvement
>         Components: Core
>           Reporter: Benedict
>           Assignee: Stefania
>             Labels: benedict-to-commit
>            Fix For: 3.x
>        Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml
>
> A large contributor to buffered reads being slower than mmapped ones is likely that we read a full 64KB at once, when average record sizes may be as low as 140 bytes in our stress tests. The TLB has only 128 entries on a modern core, and each read touches 32 of them, meaning we are almost never hitting the TLB and are incurring at least 30 unnecessary misses each time (as well as the other costs of larger-than-necessary accesses). When working with an SSD there is little to no benefit in reading more than 4KB at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4KB), so that we expect to fetch each record in one read. I also propose that we create a pool of these buffers up front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4KB of expected record size.
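The sizing rule proposed in the ticket (next power of two at or above the average record size, floored at 4KB) is easy to sketch; the class and method names here are mine, not Cassandra's:

```java
public class BufferSize {
    static final int MIN_BUFFER = 4096; // 4KB floor: little benefit reading less, per the ticket

    // Next power of two >= avgRecordSize, but never below the 4KB minimum.
    static int bufferSizeFor(int avgRecordSize) {
        if (avgRecordSize <= MIN_BUFFER)
            return MIN_BUFFER;
        return Integer.highestOneBit(avgRecordSize - 1) << 1;
    }

    public static void main(String[] args) {
        System.out.println(bufferSizeFor(140));  // 4096: tiny records still read one 4KB page
        System.out.println(bufferSizeFor(4096)); // 4096
        System.out.println(bufferSizeFor(5000)); // 8192
    }
}
```

With 140-byte average records this reads one 4KB page per record instead of a 64KB span, touching 1 TLB entry rather than 32.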
[jira] [Updated] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9838:
---------------------------------------
    Description:
I created a table in Cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized sizes_list with one element:
{{UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01}}
Now I'm trying to update the element at index 0 with statements like this:
{code}
insert into my_table (my_id, is_deleted, col_id1, col_id2) values (0x01, False, 0x00, 0x00);
UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01;
{code}
and I see this error:
{{InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0}}
If I change my list to a non-static list, it works fine!

  was:
I created a table in cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized the list sizes_list as having one element.
UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01
Now I m trying to update the element at index '0' with a statement like this
insert into my_table (my_id, is_deleted , col_id1, col_id2) values (0x01, False, 0x00, 0x00);
UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01 ;
Now I see an error like this:
InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0
If I change my list to a non-static list, it works fine!
[jira] [Updated] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9838:
---------------------------------------
    Reproduced In: 2.1.5
    Fix Version/s: 2.1.x
[jira] [Updated] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-9519:
----------------------------------------
    Fix Version/s: 2.0.17

> CASSANDRA-8448 doesn't seem to be fixed
> ---------------------------------------
>
>                 Key: CASSANDRA-9519
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519
>             Project: Cassandra
>          Issue Type: Bug
>         Components: Core
>           Reporter: Jeremiah Jordan
>           Assignee: Sylvain Lebresne
>            Fix For: 2.1.9, 2.0.17, 2.2.0
>        Attachments: 9519.txt
>
> Still seeing "Comparison method violates its general contract!" in 2.1.5:
> {code}
> java.lang.IllegalArgumentException: Comparison method violates its general contract!
> 	at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
> 	at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
> 	at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
> 	at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
> 	at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
> 	at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
> 	at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
> 	at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> 	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
> {code}
[3/5] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
	CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2d462c04
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2d462c04
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2d462c04
Branch: refs/heads/cassandra-2.2
Commit: 2d462c04973a15e84ca550ce3913d08d7c5ee8c8
Parents: 1eda7cb 0ef1888
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Fri Jul 17 15:36:24 2015 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Fri Jul 17 15:36:24 2015 +0200

----------------------------------------------------------------------
 CHANGES.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2d462c04/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 49cc850,f20fad8..c6774c2
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,32 -1,7 +1,32 @@@
-2.0.17
+2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
- * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
+ * Handle corrupt files on startup (CASSANDRA-9686)
+ * Fix clientutil jar and tests (CASSANDRA-9760)
+ * (cqlsh) Allow the SSL protocol version to be specified through the
+   config file or environment variables (CASSANDRA-9544)
+Merged from 2.0:
+ * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Don't include auth credentials in debug log (CASSANDRA-9682)
  * Can't transition from write survey to normal mode (CASSANDRA-9740)
+ * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591)
+ * Fix growing pending background compaction (CASSANDRA-9662)
+
+
+2.1.8
+ * (cqlsh) Fix bad check for CQL compatibility when DESCRIBE'ing
+   COMPACT STORAGE tables with no clustering columns
+ * Warn when an extra-large partition is compacted (CASSANDRA-9643)
+ * Eliminate strong self-reference chains in sstable ref tidiers (CASSANDRA-9656)
+ * Ensure StreamSession uses canonical sstable reader instances (CASSANDRA-9700)
+ * Ensure memtable book keeping is not corrupted in the event we shrink usage (CASSANDRA-9681)
+ * Update internal python driver for cqlsh (CASSANDRA-9064)
+ * Fix IndexOutOfBoundsException when inserting tuple with too many
+   elements using the string literal notation (CASSANDRA-9559)
+ * Allow JMX over SSL directly from nodetool (CASSANDRA-9090)
+ * Fix incorrect result for IN queries where column not found (CASSANDRA-9540)
+ * Enable describe on indices (CASSANDRA-7814)
+ * ColumnFamilyStore.selectAndReference may block during compaction (CASSANDRA-9637)
+Merged from 2.0:
  * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727)
  * Add listen_address to system.local (CASSANDRA-9603)
  * Bug fixes to resultset metadata construction (CASSANDRA-9636)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631354#comment-14631354 ]

T Jake Luciani commented on CASSANDRA-6477:
-------------------------------------------

bq. I've actually never understood why we do a batchlog update on the base table replicas (and so I think we should remove it, even though that's likely not the most costly one). Why do we need it? If my reasoning above is correct, the coordinator batchlog is enough to guarantee durability and eventual consistency, because we will replay the whole mutation until a QUORUM of replicas acknowledges success.

Yes, if we error out when the base is unable to replicate to the view, then the second BL is redundant. However, there are a few reasons why we did what we did:

1. Your availability is cut in half when you use an MV with these guarantees. Say I have a 5-node cluster with RF=3 and I want to write at CL.ONE. If I have an MV, I can no longer handle two down nodes, since the paired view replica for the one live base node might be down.

2. The cost of replaying the coordinator BL is much higher than replaying the base-to-replica BL, since the latter is 1:1.

I do agree there is a disconnect in terms of consistency level when using an MV, but the batchlog feature was written to handle this. We could support both approaches with a new flag? Or are we willing to take a hit on availability?
[1/5] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.2 f74419cd2 -> f60e4ad42

Fix comparison contract violation in the dynamic snitch sorting

patch by slebresne; reviewed by benedict for CASSANDRA-9519

Conflicts:
	CHANGES.txt

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627
Branch: refs/heads/cassandra-2.2
Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64
Parents: 1d54fc3
Author: Sylvain Lebresne <sylv...@datastax.com>
Authored: Thu Jul 9 13:28:38 2015 +0200
Committer: Sylvain Lebresne <sylv...@datastax.com>
Committed: Fri Jul 17 15:35:07 2015 +0200

----------------------------------------------------------------------
 CHANGES.txt                                 |  1 +
 .../locator/DynamicEndpointSnitch.java      | 34 ++--
 .../locator/DynamicEndpointSnitchTest.java  | 69 +++-
 3 files changed, 95 insertions(+), 9 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index a755cb9..f20fad8 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.0.17
+ * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Don't include auth credentials in debug log (CASSANDRA-9682)
  * Can't transition from write survey to normal mode (CASSANDRA-9740)
  * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
index 3469847..f226989 100644
--- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
+++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java
@@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber
     private static final double ALPHA = 0.75; // set to 0.75 to make EDS more biased to towards the newer values
     private static final int WINDOW_SIZE = 100;
 
-    private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval();
-    private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval();
-    private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold();
+    private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval();
+    private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval();
+    private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold();
 
     // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to
     // warrant not merging two ranges into a single range
@@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber
     private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses)
     {
-        super.sortByProximity(address, addresses);
+        // Scores can change concurrently from a call to this method. But Collections.sort() expects
+        // its comparator to be stable, that is 2 endpoints should compare the same way for the duration
+        // of the sort() call. As we copy the scores map on write, it is thus enough to alias the current
+        // version of it during this call.
+        final HashMap<InetAddress, Double> scores = this.scores;
+        Collections.sort(addresses, new Comparator<InetAddress>()
+        {
+            public int compare(InetAddress a1, InetAddress a2)
+            {
+                return compareEndpoints(address, a1, a2, scores);
+            }
+        });
     }
 
     private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses)
@@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber
             return;
 
         subsnitch.sortByProximity(address, addresses);
+        HashMap<InetAddress, Double> scores = this.scores; // Make sure the score don't change in the middle of the loop below
+                                                           // (which wouldn't really matter here but its cleaner that way).
         ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size());
         for (InetAddress inet : addresses)
         {
@@ -189,7 +202,8 @@
-    public int
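The fix above works by snapshotting a copy-on-write map so the comparator sees one consistent version for the whole sort, keeping it transitive and avoiding TimSort's "Comparison method violates its general contract!" error. A generic, self-contained sketch of that technique (endpoint names and scores here are made up; this is not the snitch's actual API):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SnapshotSort {
    // In the real snitch the score map is replaced wholesale on each update
    // (copy-on-write), so aliasing the field once yields an immutable view.
    static volatile HashMap<String, Double> scores = new HashMap<>();

    static void sortByScore(List<String> endpoints) {
        // Capture the current version once; even if `scores` is swapped
        // mid-sort, the comparator keeps seeing a consistent map.
        final Map<String, Double> snapshot = scores;
        endpoints.sort(Comparator.comparingDouble(e -> snapshot.getOrDefault(e, 0.0)));
    }

    public static void main(String[] args) {
        HashMap<String, Double> m = new HashMap<>();
        m.put("10.0.0.1", 0.9);
        m.put("10.0.0.2", 0.1);
        m.put("10.0.0.3", 0.5);
        scores = m;

        List<String> nodes = new ArrayList<>(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"));
        sortByScore(nodes);
        System.out.println(nodes); // [10.0.0.2, 10.0.0.3, 10.0.0.1]
    }
}
```

Sorting directly against the mutable field would let a concurrent score update make `compare(a, b)` disagree with an earlier `compare(b, a)` during the same sort, which is exactly the contract violation CASSANDRA-9519 reports.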
[5/5] cassandra git commit: Don't wrap byte arrays in SequentialWriter
Don't wrap byte arrays in SequentialWriter patch by slebresne; reviewed by snazy benedict for CASSANDRA-9797 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f60e4ad4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f60e4ad4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f60e4ad4 Branch: refs/heads/cassandra-2.2 Commit: f60e4ad4298725dac57c36da8427d992be19eb8a Parents: 22c97bc Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:39:32 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:39:32 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/io/util/SequentialWriter.java | 22 ++-- 2 files changed, 21 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9a262dc..47d1db5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.0-rc3 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797) * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) Merged from 2.1: http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/src/java/org/apache/cassandra/io/util/SequentialWriter.java -- diff --git a/src/java/org/apache/cassandra/io/util/SequentialWriter.java b/src/java/org/apache/cassandra/io/util/SequentialWriter.java index f3268a2..915133f 100644 --- a/src/java/org/apache/cassandra/io/util/SequentialWriter.java +++ b/src/java/org/apache/cassandra/io/util/SequentialWriter.java @@ -185,12 +185,30 @@ public class SequentialWriter extends OutputStream implements WritableByteChanne public void write(byte[] buffer) throws IOException { -write(ByteBuffer.wrap(buffer, 0, buffer.length)); +write(buffer, 0, buffer.length); } public void write(byte[] data, int offset, int length) throws IOException { 
-write(ByteBuffer.wrap(data, offset, length)); +if (buffer == null) +throw new ClosedChannelException(); + +int position = offset; +int remaining = length; +while (remaining > 0) +{ +if (!buffer.hasRemaining()) +reBuffer(); + +int toCopy = Math.min(remaining, buffer.remaining()); +buffer.put(data, position, toCopy); + +remaining -= toCopy; +position += toCopy; + +isDirty = true; +syncNeeded = true; +} } public int write(ByteBuffer src) throws IOException
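The fix replaces the per-call `ByteBuffer.wrap` allocation with a loop that copies the input array directly into the writer's existing internal buffer, refilling when it runs out of room. The same pattern in a self-contained form (class and method names here are illustrative stand-ins, not Cassandra's actual `SequentialWriter`/`reBuffer` internals):

```java
import java.nio.ByteBuffer;

// Illustrative buffered writer: copies byte[] input into one fixed internal
// buffer chunk by chunk, "flushing" (here: just counting bytes) whenever the
// buffer fills, instead of allocating a wrapping ByteBuffer on every write().
public class ChunkedWriter {
    private final ByteBuffer buffer;
    private long flushed; // bytes already handed off downstream

    public ChunkedWriter(int bufferSize) {
        this.buffer = ByteBuffer.allocate(bufferSize);
    }

    public void write(byte[] data, int offset, int length) {
        int position = offset;
        int remaining = length;
        while (remaining > 0) {
            if (!buffer.hasRemaining())
                flush(); // stands in for reBuffer() in the real code
            int toCopy = Math.min(remaining, buffer.remaining());
            buffer.put(data, position, toCopy);
            remaining -= toCopy;
            position += toCopy;
        }
    }

    private void flush() {
        flushed += buffer.position();
        buffer.clear();
    }

    public long bytesSeen() {
        return flushed + buffer.position();
    }

    public static void main(String[] args) {
        ChunkedWriter w = new ChunkedWriter(8);
        w.write(new byte[20], 0, 20); // forces two internal flushes
        System.out.println(w.bytesSeen()); // prints 20
    }
}
```

The point of the change is that `write(byte[])` sits on a hot path, and wrapping the array created a short-lived `ByteBuffer` per call; copying into the long-lived buffer produces no garbage.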
[2/5] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/cassandra-2.2 Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
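The test's shape — one `ScoreUpdater` thread continuously mutating scores while the main thread sorts a large host list — can be reduced to a small runnable harness. This sketch (illustrative names, not the actual Cassandra test) shows the copy-on-write/snapshot discipline that lets repeated sorts complete without TimSort's "Comparison method violates its general contract!" error:

```java
import java.util.*;
import java.util.concurrent.ThreadLocalRandom;

// Minimal harness in the spirit of the long test: a background thread keeps
// replacing a copy-on-write scores map while the main thread sorts. Each sort
// aliases a single snapshot of the map, so the comparator stays stable for
// the duration of that sort and the comparison contract is never violated.
public class SortUnderMutation {
    static volatile Map<Integer, Double> scores = new HashMap<>();
    static volatile boolean stopped;

    static List<Integer> sortRepeatedly(int hostCount, int iterations) throws InterruptedException {
        List<Integer> hosts = new ArrayList<>();
        for (int i = 0; i < hostCount; i++)
            hosts.add(i);

        Thread updater = new Thread(() -> {
            while (!stopped) {
                // Copy-on-write: mutate a private copy, then publish it.
                Map<Integer, Double> copy = new HashMap<>(scores);
                copy.put(ThreadLocalRandom.current().nextInt(hostCount),
                         ThreadLocalRandom.current().nextDouble());
                scores = copy;
            }
        });
        updater.start();

        for (int i = 0; i < iterations; i++) {
            final Map<Integer, Double> snapshot = scores; // one stable alias per sort
            hosts.sort(Comparator.comparingDouble(h -> snapshot.getOrDefault(h, 0.0)));
        }

        stopped = true;
        updater.join();
        return hosts;
    }

    public static void main(String[] args) throws InterruptedException {
        // Completes without an IllegalArgumentException from TimSort.
        System.out.println(sortRepeatedly(1000, 200).size()); // prints 1000
    }
}
```

Without the snapshot (i.e. reading the live, changing scores inside the comparator), the same loop can intermittently throw, which is exactly what the long test is designed to catch.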
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631353#comment-14631353 ] Benedict commented on CASSANDRA-6477: - bq. I've actually never understood why we do a batchlog update on the base table replicas (and so I think we should remove it, even though that's likely not the most costly one). Why do we need it? The thing is, the coordinator-level batchlog write is quite expensive. It seems we've paired each node with one MV node, but here's an idea: why not also pair it with RF-2 (or 1, and only support RF=3 for now) partners, to whom it requires the first write to be propagated, without which it does not acknowledge? This could be done with a specialised batchlog write, that goes to the local node _and_ the paired MV node. That way, most importantly, we do not have to wait synchronously for the batchlog records to be written: if they're lost, then the corruption caused by their loss is also lost. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[4/6] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22c97bc5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22c97bc5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22c97bc5 Branch: refs/heads/trunk Commit: 22c97bc5ef5017663a40d25bd5b7283c09e25dd5 Parents: f74419c 2d462c0 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:49 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:49 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/22c97bc5/CHANGES.txt -- diff --cc CHANGES.txt index e6c093d,c6774c2..9a262dc --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,17 -1,14 +1,17 @@@ -2.1.9 +2.2.0-rc3 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) +Merged from 2.1: * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Handle corrupt files on startup (CASSANDRA-9686) * Fix clientutil jar and tests (CASSANDRA-9760) * (cqlsh) Allow the SSL protocol version to be specified through the config file or environment variables (CASSANDRA-9544) Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) - * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591) * Fix growing pending background compaction (CASSANDRA-9662)
[1/6] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra Updated Branches: refs/heads/trunk 412e8743d -> 05a5fb4f8 Fix comparison contract violation in the dynamic snitch sorting patch by slebresne; reviewed by benedict for CASSANDRA-9519 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627 Branch: refs/heads/trunk Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64 Parents: 1d54fc3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 9 13:28:38 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:07 2015 +0200 -- CHANGES.txt | 1 + .../locator/DynamicEndpointSnitch.java | 34 -- .../locator/DynamicEndpointSnitchTest.java | 69 +++- 3 files changed, 95 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a755cb9..f20fad8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.17 + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 3469847..f226989 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber private static final double ALPHA = 0.75; // set to 0.75 to make EDS 
more biased towards the newer values private static final int WINDOW_SIZE = 100; -private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); -private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); -private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); +private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); +private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); +private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to // warrant not merging two ranges into a single range @@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses) { -super.sortByProximity(address, addresses); +// Scores can change concurrently from a call to this method. But Collections.sort() expects +// its comparator to be stable, that is, 2 endpoints should compare the same way for the duration +// of the sort() call. As we copy the scores map on write, it is thus enough to alias the current +// version of it during this call. +final HashMap<InetAddress, Double> scores = this.scores; +Collections.sort(addresses, new Comparator<InetAddress>() +{ +public int compare(InetAddress a1, InetAddress a2) +{ +return compareEndpoints(address, a1, a2, scores); +} +}); } private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses) @@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber return; subsnitch.sortByProximity(address, addresses); +HashMap<InetAddress, Double> scores = this.scores; // Make sure the scores don't change in the middle of the loop below + // (which wouldn't really matter here but it's cleaner that way). 
ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size()); for (InetAddress inet : addresses) { @@ -189,7 +202,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILatencySubscriber } } -public int compareEndpoints(InetAddress target,
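The heart of the fix is aliasing the copy-on-write scores map once, so every comparison within a single `Collections.sort()` call reads the same values. A simplified, self-contained sketch of that pattern (class and method names are illustrative, not Cassandra's actual snitch API):

```java
import java.util.*;

// Copy-on-write scores: writers replace the whole map; readers alias the
// current reference once and use that stable snapshot for the entire sort.
public class SnapshotSorter {
    private volatile Map<String, Double> scores = new HashMap<>();

    public void updateScore(String host, double score) {
        Map<String, Double> copy = new HashMap<>(scores); // copy-on-write
        copy.put(host, score);
        scores = copy; // atomic reference swap publishes the new map
    }

    public void sortByScore(List<String> hosts) {
        // Alias once: even if updateScore() swaps `scores` mid-sort, this
        // comparator keeps reading the same never-again-mutated snapshot,
        // so two hosts always compare the same way during one sort() call.
        final Map<String, Double> snapshot = scores;
        hosts.sort(Comparator.comparingDouble(h -> snapshot.getOrDefault(h, 0.0)));
    }

    public static void main(String[] args) {
        SnapshotSorter s = new SnapshotSorter();
        s.updateScore("a", 3.0);
        s.updateScore("b", 1.0);
        s.updateScore("c", 2.0);
        List<String> hosts = new ArrayList<>(List.of("a", "b", "c"));
        s.sortByScore(hosts);
        System.out.println(hosts); // prints [b, c, a]
    }
}
```

Re-reading the live field inside the comparator instead would let scores change between comparisons, which is precisely the contract violation CASSANDRA-9519 fixes.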
[2/6] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/trunk Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
[6/6] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/05a5fb4f Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/05a5fb4f Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/05a5fb4f Branch: refs/heads/trunk Commit: 05a5fb4f8b7bf76be0d95196a3231c5be61ee978 Parents: 412e874 f60e4ad Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:40:40 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:40:40 2015 +0200 -- CHANGES.txt | 4 +++- .../cassandra/io/util/SequentialWriter.java | 22 ++-- 2 files changed, 23 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/05a5fb4f/CHANGES.txt -- diff --cc CHANGES.txt index db306ea,47d1db5..b2abd10 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,34 -1,15 +1,36 @@@ +3.0 + * Metrics should use up to date nomenclature (CASSANDRA-9448) + * Change CREATE/ALTER TABLE syntax for compression (CASSANDRA-8384) + * Cleanup crc and adler code for java 8 (CASSANDRA-9650) + * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808, 9825) + * Update Guava to 18.0 (CASSANDRA-9653) + * Bloom filter false positive ratio is not honoured (CASSANDRA-8413) + * New option for cassandra-stress to leave a ratio of columns null (CASSANDRA-9522) + * Change hinted_handoff_enabled yaml setting, JMX (CASSANDRA-9035) + * Add algorithmic token allocation (CASSANDRA-7032) + * Add nodetool command to replay batchlog (CASSANDRA-9547) + * Make file buffer cache independent of paths being read (CASSANDRA-8897) + * Remove deprecated legacy Hadoop code (CASSANDRA-9353) + * Decommissioned nodes will not rejoin the cluster (CASSANDRA-8801) + * Change gossip stabilization to use endpoit size (CASSANDRA-9401) + * Change default garbage collector to G1 (CASSANDRA-7486) + * Populate TokenMetadata early during startup (CASSANDRA-9317) + 
* undeprecate cache recentHitRate (CASSANDRA-6591) + * Add support for selectively varint encoding fields (CASSANDRA-9499) + + 2.2.0-rc3 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797) + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) Merged from 2.1: * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) * Handle corrupt files on startup (CASSANDRA-9686) * Fix clientutil jar and tests (CASSANDRA-9760) * (cqlsh) Allow the SSL protocol version to be specified through the - config file or environment variables (CASSANDRA-9544) +config file or environment variables (CASSANDRA-9544) Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591) http://git-wip-us.apache.org/repos/asf/cassandra/blob/05a5fb4f/src/java/org/apache/cassandra/io/util/SequentialWriter.java --
[3/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2d462c04 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2d462c04 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2d462c04 Branch: refs/heads/trunk Commit: 2d462c04973a15e84ca550ce3913d08d7c5ee8c8 Parents: 1eda7cb 0ef1888 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:24 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:24 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2d462c04/CHANGES.txt -- diff --cc CHANGES.txt index 49cc850,f20fad8..c6774c2 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,32 -1,7 +1,32 @@@ -2.0.17 +2.1.9 + * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) + * Handle corrupt files on startup (CASSANDRA-9686) + * Fix clientutil jar and tests (CASSANDRA-9760) + * (cqlsh) Allow the SSL protocol version to be specified through the + config file or environment variables (CASSANDRA-9544) +Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) + * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Fix growing pending background compaction (CASSANDRA-9662) + + +2.1.8 + * (cqlsh) Fix bad check for CQL compatibility when DESCRIBE'ing + COMPACT STORAGE tables with no clustering columns + * Warn when an extra-large partition is compacted (CASSANDRA-9643) + * Eliminate strong self-reference chains in sstable ref tidiers (CASSANDRA-9656) + * Ensure StreamSession uses canonical sstable reader instances (CASSANDRA-9700) + * Ensure memtable book keeping is not 
corrupted in the event we shrink usage (CASSANDRA-9681) + * Update internal python driver for cqlsh (CASSANDRA-9064) + * Fix IndexOutOfBoundsException when inserting tuple with too many + elements using the string literal notation (CASSANDRA-9559) + * Allow JMX over SSL directly from nodetool (CASSANDRA-9090) + * Fix incorrect result for IN queries where column not found (CASSANDRA-9540) + * Enable describe on indices (CASSANDRA-7814) + * ColumnFamilyStore.selectAndReference may block during compaction (CASSANDRA-9637) +Merged from 2.0: * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) * Add listen_address to system.local (CASSANDRA-9603) * Bug fixes to resultset metadata construction (CASSANDRA-9636)
[4/5] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/22c97bc5 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/22c97bc5 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/22c97bc5 Branch: refs/heads/cassandra-2.2 Commit: 22c97bc5ef5017663a40d25bd5b7283c09e25dd5 Parents: f74419c 2d462c0 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:49 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:49 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/22c97bc5/CHANGES.txt -- diff --cc CHANGES.txt index e6c093d,c6774c2..9a262dc --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,17 -1,14 +1,17 @@@ -2.1.9 +2.2.0-rc3 + * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) + * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) +Merged from 2.1: * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Handle corrupt files on startup (CASSANDRA-9686) * Fix clientutil jar and tests (CASSANDRA-9760) * (cqlsh) Allow the SSL protocol version to be specified through the config file or environment variables (CASSANDRA-9544) Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) - * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Scrub (recover) sstables even when -Index.db is missing (CASSANDRA-9591) * Fix growing pending background compaction (CASSANDRA-9662)
[jira] [Updated] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9798: --- Reproduced In: 2.1.6 Fix Version/s: 2.1.x Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798 URL: https://issues.apache.org/jira/browse/CASSANDRA-9798 Project: Cassandra Issue Type: Bug Components: Core Environment: 4x HP Gen9 dl 360 servers 2x8 cpu each (Intel(R) Xeon E5-2667 v3 @ 3.20GHz) 6x900GB 10kRPM disk for data 1x900GB 10kRPM disk for commitlog 64GB ram ETH: 10Gb/s Red Hat Enterprise Linux Server release 6.6 (Santiago) 2.6.32-504.el6.x86_64 java build 1.8.0_45-b14 (openjdk) (tested on oracle java 8 too) Reporter: Łukasz Mrożkiewicz Fix For: 2.1.x Attachments: cassandra.log, cassandra.yaml, gc.log.0.current Hi, We noticed a problem with dropped MutationStages. Usually on one random node the situation is: MutationStage active is full, pending is increasing, completed is stalled. MemtableFlushWriter active 6, pending: 25, completed: stalled. MemtablePostFlush active is 1, pending 29, completed: stalled. After some time (30s-10min) pending mutations are dropped and everything works again. When it happened: 1. CPU idle is ~95% 2. no long GC pauses or extra activity. 3. memory usage 3.5GB from 8GB 4. only writes are processed by cassandra 5. when LOAD > 400GB/node problems appeared 6. 
cassandra 2.1.6 There is a gap in the logs: INFO 08:47:01 Timed out replaying hints to /192.168.100.83; aborting (0 delivered) INFO 08:47:01 Enqueuing flush of hints: 7870567 (0%) on-heap, 0 (0%) off-heap INFO 08:47:30 Enqueuing flush of table1: 95301807 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 60462632 (3%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table2: 76973746 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 84290135 (4%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table3: 56926652 (3%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table1: 85124218 (4%) on-heap, 0 (0%) off-heap INFO 08:47:33 Enqueuing flush of table2: 95663415 (4%) on-heap, 0 (0%) off-heap INFO 08:47:58 CompactionManager 239 INFO 08:47:58 Writing Memtable-table2@1767938721(13843064 serialized bytes, 162359 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-hints@1433125911(478703 serialized bytes, 424 ops, 0%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table2@1318583275(11783615 serialized bytes, 137378 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Enqueuing flush of compactions_in_progress: 969 (0%) on-heap, 0 (0%) off-heap INFO 08:47:58 Writing Memtable-table1@541175113(17221327 serialized bytes, 180792 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table1@1361154669(27138519 serialized bytes, 273472 ops, 6%/0% of on/off-heap limit) INFO 08:48:03 2176 MUTATION messages dropped in last 5000ms use case: 100% write - 100Mb/s, a couple of CFs, ~10 columns each. max cell size 100B CMS and G1GC tested - no difference -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631360#comment-14631360 ] Benedict commented on CASSANDRA-6477: - bq. We could support both approaches in terms of a new flag? I think _permitting_ faster operation with only eventual consistency guarantees is a good thing, since most users doing their own denormalisation probably get no better than that. A flag on construction? Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[5/6] cassandra git commit: Don't wrap byte arrays in SequentialWriter
Don't wrap byte arrays in SequentialWriter patch by slebresne; reviewed by snazy & benedict for CASSANDRA-9797 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f60e4ad4 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f60e4ad4 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f60e4ad4 Branch: refs/heads/trunk Commit: f60e4ad4298725dac57c36da8427d992be19eb8a Parents: 22c97bc Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:39:32 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:39:32 2015 +0200 -- CHANGES.txt | 1 + .../cassandra/io/util/SequentialWriter.java | 22 ++-- 2 files changed, 21 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9a262dc..47d1db5 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.2.0-rc3 + * Don't wrap byte arrays in SequentialWriter (CASSANDRA-9797) * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671) * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771) Merged from 2.1: http://git-wip-us.apache.org/repos/asf/cassandra/blob/f60e4ad4/src/java/org/apache/cassandra/io/util/SequentialWriter.java -- diff --git a/src/java/org/apache/cassandra/io/util/SequentialWriter.java b/src/java/org/apache/cassandra/io/util/SequentialWriter.java index f3268a2..915133f 100644 --- a/src/java/org/apache/cassandra/io/util/SequentialWriter.java +++ b/src/java/org/apache/cassandra/io/util/SequentialWriter.java @@ -185,12 +185,30 @@ public class SequentialWriter extends OutputStream implements WritableByteChannel public void write(byte[] buffer) throws IOException { -write(ByteBuffer.wrap(buffer, 0, buffer.length)); +write(buffer, 0, buffer.length); } public void write(byte[] data, int offset, int length) throws IOException { -write(ByteBuffer.wrap(data, 
offset, length)); +if (buffer == null) +throw new ClosedChannelException(); + +int position = offset; +int remaining = length; +while (remaining > 0) +{ +if (!buffer.hasRemaining()) +reBuffer(); + +int toCopy = Math.min(remaining, buffer.remaining()); +buffer.put(data, position, toCopy); + +remaining -= toCopy; +position += toCopy; + +isDirty = true; +syncNeeded = true; +} } public int write(ByteBuffer src) throws IOException
[jira] [Commented] (CASSANDRA-9795) Fix cqlsh dtests on windows
[ https://issues.apache.org/jira/browse/CASSANDRA-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631366#comment-14631366 ] Joshua McKenzie commented on CASSANDRA-9795: +1 on C* changes, ccm changes look to already be merged (not showing diff on github), and my only concern: on the dtests, could we manually delete the NamedTemporaryFile when we're done with it rather than leaving them littered around? Fix cqlsh dtests on windows --- Key: CASSANDRA-9795 URL: https://issues.apache.org/jira/browse/CASSANDRA-9795 Project: Cassandra Issue Type: Sub-task Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 2.2.x There are a number of portability problems with python on win32 as I've learned over the past few days. * Our use of multiprocessing is broken in cqlsh for windows. https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming The code was passing self to the sub-process, which on windows must be pickleable (it's not). So I refactored it to be a class which is initialized in the parent. Also, when the windows process starts it needs to load our cqlsh as a module. So I moved cqlsh -> cqlsh.py and added a tiny wrapper for bin/cqlsh * Our use of strftime is broken on windows The default timezone information %z in strftime isn't valid on windows. I added code to the date format parser in C* to support windows timezone labels. * We have a number of file access issues in dtest * csv import/export is broken on windows and requires all files be opened with mode 'wb' or 'rb' http://stackoverflow.com/questions/1170214/pythons-csv-writer-produces-wrong-line-terminator/1170297#1170297 * CCM's use of popen required the universal_newlines=True flag to work on windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9839) Move crc_check_chance out of compressions options
[ https://issues.apache.org/jira/browse/CASSANDRA-9839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-9839: - Labels: client-impacting docs-impacting (was: ) Move crc_check_chance out of compressions options - Key: CASSANDRA-9839 URL: https://issues.apache.org/jira/browse/CASSANDRA-9839 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Priority: Minor Labels: client-impacting, docs-impacting Fix For: 3.0.0 rc1 Follow up to CASSANDRA-8384. The option doesn't belong to compression params - it doesn't affect compression itself, and isn't passed to compressors upon initialization. While it's true that it is (currently) only being honored when reading compressed sstables, it still doesn't belong to compression params (and is causing CASSANDRA-7978-like issues). [~tjake] suggested we should make it an option of its own, and I think we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[2/2] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/cassandra-2.0 Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<InetAddress>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
[jira] [Created] (CASSANDRA-9839) Move crc_check_chance out of compressions options
Aleksey Yeschenko created CASSANDRA-9839: Summary: Move crc_check_chance out of compressions options Key: CASSANDRA-9839 URL: https://issues.apache.org/jira/browse/CASSANDRA-9839 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Priority: Minor Fix For: 3.0.0 rc1 Follow up to CASSANDRA-8384. The option doesn't belong to compression params - it doesn't affect compression itself, and isn't passed to compressors upon initialization. While it's true that it is (currently) only being honored when reading compressed sstables, it still doesn't belong to compression params (and is causing CASSANDRA-7978-like issues). [~tjake] suggested we should make it an option of its own, and I think we should. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9831) Hanging dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-9831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Shuler updated CASSANDRA-9831: -- Description: This is the current list of dtests over the last week that have completely hung a test server, ending in an aborted job and incomplete test results in jenkins. I'll be excluding these tests in the cassandra-dtest repo with {{@require}} annotations (or fully excluding paging_test in trunk, I think) *2.0* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.1* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.2* metadata_reset_while_compact_test (metadata_tests.TestMetadata) test_ttl_deletions (paging_test.TestPagingWithDeletions) test_network_topology_strategy (consistency_test.TestAvailability) test_network_topology_strategy_counters (consistency_test.TestAccuracy) *trunk* metadata_reset_while_compact_test (metadata_tests.TestMetadata) putget_2dc_rf2_test (multidc_putget_test.TestMultiDCPutGet) test_network_topology_strategy_users (consistency_test.TestAccuracy) test_network_topology_strategy (consistency_test.TestAvailability) test_single_row_deletions (paging_test.TestPagingWithDeletions) test_with_more_results_than_page_size (paging_test.TestPagingSize) test_query_isolation (paging_test.TestPagingQueryIsolation) test_node_unavailabe_during_paging (paging_test.TestPagingDatasetChanges) test_with_no_results (paging_test.TestPagingSize) test_data_change_impacting_later_page (paging_test.TestPagingDatasetChanges) test_multiple_partition_deletions (paging_test.TestPagingWithDeletions) sstableloader_compression_snappy_to_none_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) (edit: new hung test nodes 07/16-17) *2.2* dc_repair_test (repair_test.TestRepair) wide_row_test (putget_test.TestPutGet) putget_snappy_test (putget_test.TestPutGet) *trunk* force_repair_async_1_test (deprecated_repair_test.TestDeprecatedRepairAPI) test_nested_user_types 
(user_types_test.TestUserTypes) sstableloader_compression_snappy_to_deflate_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) test_paging_across_multi_wide_rows (paging_test.TestPagingData) resumable_replace_test (replace_address_test.TestReplaceAddress) was: This is the current list of dtests over the last week that have completely hung a test server, ending in an aborted job and incomplete test results in jenkins. I'll be excluding these tests in the cassandra-dtest repo with {{@require}} annotations (or fully excluding paging_test in trunk, I think) *2.0* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.1* metadata_reset_while_compact_test (metadata_tests.TestMetadata) *2.2* metadata_reset_while_compact_test (metadata_tests.TestMetadata) test_ttl_deletions (paging_test.TestPagingWithDeletions) test_network_topology_strategy (consistency_test.TestAvailability) test_network_topology_strategy_counters (consistency_test.TestAccuracy) *trunk* metadata_reset_while_compact_test (metadata_tests.TestMetadata) putget_2dc_rf2_test (multidc_putget_test.TestMultiDCPutGet) test_network_topology_strategy_users (consistency_test.TestAccuracy) test_network_topology_strategy (consistency_test.TestAvailability) test_single_row_deletions (paging_test.TestPagingWithDeletions) test_with_more_results_than_page_size (paging_test.TestPagingSize) test_query_isolation (paging_test.TestPagingQueryIsolation) test_node_unavailabe_during_paging (paging_test.TestPagingDatasetChanges) test_with_no_results (paging_test.TestPagingSize) test_data_change_impacting_later_page (paging_test.TestPagingDatasetChanges) test_multiple_partition_deletions (paging_test.TestPagingWithDeletions) sstableloader_compression_snappy_to_none_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) (edit: new hung test nodes 07/16) *2.2* dc_repair_test (repair_test.TestRepair) *trunk* force_repair_async_1_test (deprecated_repair_test.TestDeprecatedRepairAPI) 
test_nested_user_types (user_types_test.TestUserTypes) sstableloader_compression_snappy_to_deflate_test (sstable_generation_loading_test.TestSSTableGenerationAndLoading) test_paging_across_multi_wide_rows (paging_test.TestPagingData) Hanging dtests -- Key: CASSANDRA-9831 URL: https://issues.apache.org/jira/browse/CASSANDRA-9831 Project: Cassandra Issue Type: Bug Components: Tests Reporter: Michael Shuler Assignee: Michael Shuler Labels: test-failure This is the current list of dtests over the last week that have completely hung a test server, ending in an aborted job and incomplete test results in jenkins. I'll be excluding these tests in the cassandra-dtest repo with {{@require}} annotations (or fully excluding paging_test in trunk, I think) *2.0* metadata_reset_while_compact_test
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631295#comment-14631295 ] Benedict commented on CASSANDRA-8894: - Sounds perfect, thanks. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to slower buffered reads than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we will almost never hit the TLB, and will incur at least 30 unnecessary misses each time (as well as the other costs of larger-than-necessary accesses). When working with an SSD there is little to no benefit to reading more than 4Kb at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we expect to satisfy each read in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
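The sizing rule proposed above (round the average record size up to a power of two, floored at 4Kb) can be sketched as follows. `bufferSizeFor` is a hypothetical helper for illustration, not an actual Cassandra API, and treating an exact power of two as already large enough is one plausible reading of "next larger power of 2":

```java
public class BufferSizeSketch {
    static final int MIN_BUFFER_SIZE = 4096; // 4Kb floor, per the proposal

    // Round avgRecordSize up to the nearest power of two, never below 4Kb.
    static int bufferSizeFor(int avgRecordSize) {
        int size = Math.max(avgRecordSize, MIN_BUFFER_SIZE);
        // Integer.highestOneBit returns the largest power of two <= size;
        // if size is not already a power of two, go one step higher.
        int pow = Integer.highestOneBit(size);
        return (pow == size) ? pow : pow << 1;
    }

    public static void main(String[] args) {
        // 140-byte records (the stress-test average quoted above) floor at 4Kb.
        System.out.println(bufferSizeFor(140));
        System.out.println(bufferSizeFor(5000));
    }
}
```

With 140-byte records this yields a 4Kb buffer instead of the 64Kb default, so a single read touches one virtual page rather than 32.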
[1/2] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra Updated Branches: refs/heads/cassandra-2.0 1d54fc339 - 0ef188869 Fix comparison contract violation in the dynamic snitch sorting patch by slebresne; reviewed by benedict for CASSANDRA-9519 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627 Branch: refs/heads/cassandra-2.0 Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64 Parents: 1d54fc3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 9 13:28:38 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:07 2015 +0200 -- CHANGES.txt | 1 + .../locator/DynamicEndpointSnitch.java | 34 -- .../locator/DynamicEndpointSnitchTest.java | 69 +++- 3 files changed, 95 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a755cb9..f20fad8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.17 + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 3469847..f226989 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private static final double ALPHA = 0.75; // set to 0.75 
to make EDS more biased towards the newer values private static final int WINDOW_SIZE = 100; -private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); -private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); -private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); +private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); +private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); +private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to // warrant not merging two ranges into a single range @@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses) { -super.sortByProximity(address, addresses); +// Scores can change concurrently from a call to this method. But Collections.sort() expects +// its comparator to be stable, that is, 2 endpoints should compare the same way for the duration +// of the sort() call. As we copy the scores map on write, it is thus enough to alias the current +// version of it during this call. 
+final HashMap<InetAddress, Double> scores = this.scores; +Collections.sort(addresses, new Comparator<InetAddress>() +{ +public int compare(InetAddress a1, InetAddress a2) +{ +return compareEndpoints(address, a1, a2, scores); +} +}); } private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses) @@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa return; subsnitch.sortByProximity(address, addresses); +HashMap<InetAddress, Double> scores = this.scores; // Make sure the scores don't change in the middle of the loop below + // (which wouldn't really matter here but it's cleaner that way). ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size()); for (InetAddress inet : addresses) { @@ -189,7 +202,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa } } -public int
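The aliasing technique in the patch generalizes to any copy-on-write map: snapshot the reference once before sorting, and the comparator stays stable for the entire `Collections.sort()` call even if a writer publishes a new map mid-sort. A minimal standalone sketch, using String keys instead of InetAddress and hypothetical names:

```java
import java.util.*;

public class SnapshotSortSketch {
    // Copy-on-write publication: writers replace the whole map rather than mutating it.
    public static volatile Map<String, Double> scores = new HashMap<>();

    public static void sortByScore(List<String> endpoints) {
        // Alias the current version once; every comparison during this sort
        // then sees the same scores, satisfying the comparator contract.
        final Map<String, Double> snapshot = scores;
        Collections.sort(endpoints, new Comparator<String>() {
            public int compare(String a, String b) {
                return Double.compare(snapshot.getOrDefault(a, 0.0),
                                      snapshot.getOrDefault(b, 0.0));
            }
        });
    }

    public static void main(String[] args) {
        Map<String, Double> m = new HashMap<>();
        m.put("a", 2.0);
        m.put("b", 1.0);
        m.put("c", 3.0);
        scores = m;
        List<String> hosts = new ArrayList<>(Arrays.asList("a", "b", "c"));
        sortByScore(hosts);
        System.out.println(hosts); // sorted ascending by score: [b, a, c]
    }
}
```

Reading `scores` through a field on each comparison, by contrast, can observe two different maps within one sort and trigger the "Comparison method violates its general contract!" failure that CASSANDRA-9519 fixes.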
[1/3] cassandra git commit: Fix comparison contract violation in the dynamic snitch sorting
Repository: cassandra Updated Branches: refs/heads/cassandra-2.1 1eda7cb55 - 2d462c049 Fix comparison contract violation in the dynamic snitch sorting patch by slebresne; reviewed by benedict for CASSANDRA-9519 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a9b9e627 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a9b9e627 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a9b9e627 Branch: refs/heads/cassandra-2.1 Commit: a9b9e627b0256a7b55dbfefa6960e1e5b8379e64 Parents: 1d54fc3 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 9 13:28:38 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:07 2015 +0200 -- CHANGES.txt | 1 + .../locator/DynamicEndpointSnitch.java | 34 -- .../locator/DynamicEndpointSnitchTest.java | 69 +++- 3 files changed, 95 insertions(+), 9 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index a755cb9..f20fad8 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 2.0.17 + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) http://git-wip-us.apache.org/repos/asf/cassandra/blob/a9b9e627/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java -- diff --git a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java index 3469847..f226989 100644 --- a/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java +++ b/src/java/org/apache/cassandra/locator/DynamicEndpointSnitch.java @@ -42,9 +42,9 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private static final double ALPHA = 0.75; // set to 0.75 
to make EDS more biased towards the newer values private static final int WINDOW_SIZE = 100; -private int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); -private int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); -private double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); +private final int UPDATE_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicUpdateInterval(); +private final int RESET_INTERVAL_IN_MS = DatabaseDescriptor.getDynamicResetInterval(); +private final double BADNESS_THRESHOLD = DatabaseDescriptor.getDynamicBadnessThreshold(); // the score for a merged set of endpoints must be this much worse than the score for separate endpoints to // warrant not merging two ranges into a single range @@ -154,7 +154,18 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa private void sortByProximityWithScore(final InetAddress address, List<InetAddress> addresses) { -super.sortByProximity(address, addresses); +// Scores can change concurrently from a call to this method. But Collections.sort() expects +// its comparator to be stable, that is, 2 endpoints should compare the same way for the duration +// of the sort() call. As we copy the scores map on write, it is thus enough to alias the current +// version of it during this call. 
+final HashMap<InetAddress, Double> scores = this.scores; +Collections.sort(addresses, new Comparator<InetAddress>() +{ +public int compare(InetAddress a1, InetAddress a2) +{ +return compareEndpoints(address, a1, a2, scores); +} +}); } private void sortByProximityWithBadness(final InetAddress address, List<InetAddress> addresses) @@ -163,6 +174,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa return; subsnitch.sortByProximity(address, addresses); +HashMap<InetAddress, Double> scores = this.scores; // Make sure the scores don't change in the middle of the loop below + // (which wouldn't really matter here but it's cleaner that way). ArrayList<Double> subsnitchOrderedScores = new ArrayList<>(addresses.size()); for (InetAddress inet : addresses) { @@ -189,7 +202,8 @@ public class DynamicEndpointSnitch extends AbstractEndpointSnitch implements ILa } } -public int
[3/3] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1
Merge branch 'cassandra-2.0' into cassandra-2.1 Conflicts: CHANGES.txt Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2d462c04 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2d462c04 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2d462c04 Branch: refs/heads/cassandra-2.1 Commit: 2d462c04973a15e84ca550ce3913d08d7c5ee8c8 Parents: 1eda7cb 0ef1888 Author: Sylvain Lebresne sylv...@datastax.com Authored: Fri Jul 17 15:36:24 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:36:24 2015 +0200 -- CHANGES.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/2d462c04/CHANGES.txt -- diff --cc CHANGES.txt index 49cc850,f20fad8..c6774c2 --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,32 -1,7 +1,32 @@@ -2.0.17 +2.1.9 + * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837) - * Complete CASSANDRA-8448 fix (CASSANDRA-9519) + * Handle corrupt files on startup (CASSANDRA-9686) + * Fix clientutil jar and tests (CASSANDRA-9760) + * (cqlsh) Allow the SSL protocol version to be specified through the + config file or environment variables (CASSANDRA-9544) +Merged from 2.0: + * Complete CASSANDRA-8448 fix (CASSANDRA-9519) * Don't include auth credentials in debug log (CASSANDRA-9682) * Can't transition from write survey to normal mode (CASSANDRA-9740) + * Scrub (recover) sstables even when -Index.db is missing, (CASSANDRA-9591) + * Fix growing pending background compaction (CASSANDRA-9662) + + +2.1.8 + * (cqlsh) Fix bad check for CQL compatibility when DESCRIBE'ing + COMPACT STORAGE tables with no clustering columns + * Warn when an extra-large partition is compacted (CASSANDRA-9643) + * Eliminate strong self-reference chains in sstable ref tidiers (CASSANDRA-9656) + * Ensure StreamSession uses canonical sstable reader instances (CASSANDRA-9700) + * Ensure memtable book keeping is not 
corrupted in the event we shrink usage (CASSANDRA-9681) + * Update internal python driver for cqlsh (CASSANDRA-9064) + * Fix IndexOutOfBoundsException when inserting tuple with too many + elements using the string literal notation (CASSANDRA-9559) + * Allow JMX over SSL directly from nodetool (CASSANDRA-9090) + * Fix incorrect result for IN queries where column not found (CASSANDRA-9540) + * Enable describe on indices (CASSANDRA-7814) + * ColumnFamilyStore.selectAndReference may block during compaction (CASSANDRA-9637) +Merged from 2.0: * Avoid NPE in AuthSuccess#decode (CASSANDRA-9727) * Add listen_address to system.local (CASSANDRA-9603) * Bug fixes to resultset metadata construction (CASSANDRA-9636)
[2/3] cassandra git commit: Move CASSANDRA-9519 test in long tests (and reduce the size of the list used)
Move CASSANDRA-9519 test in long tests (and reduce the size of the list used) Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ef18886 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ef18886 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ef18886 Branch: refs/heads/cassandra-2.1 Commit: 0ef188869049ec6233d115f7a46c25f492e8fa42 Parents: a9b9e62 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:14:54 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 15:35:24 2015 +0200 -- .../locator/DynamicEndpointSnitchLongTest.java | 104 +++ .../locator/DynamicEndpointSnitchTest.java | 64 2 files changed, 104 insertions(+), 64 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/0ef18886/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java -- diff --git a/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java new file mode 100644 index 000..1c628fa --- /dev/null +++ b/test/long/org/apache/cassandra/locator/DynamicEndpointSnitchLongTest.java @@ -0,0 +1,104 @@ +/* +* Licensed to the Apache Software Foundation (ASF) under one +* or more contributor license agreements. See the NOTICE file +* distributed with this work for additional information +* regarding copyright ownership. The ASF licenses this file +* to you under the Apache License, Version 2.0 (the +* License); you may not use this file except in compliance +* with the License. You may obtain a copy of the License at +* +*http://www.apache.org/licenses/LICENSE-2.0 +* +* Unless required by applicable law or agreed to in writing, +* software distributed under the License is distributed on an +* AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +* KIND, either express or implied. 
See the License for the +* specific language governing permissions and limitations +* under the License. +*/ + +package org.apache.cassandra.locator; + +import java.io.IOException; +import java.net.InetAddress; +import java.util.*; + +import org.apache.cassandra.config.DatabaseDescriptor; +import org.apache.cassandra.exceptions.ConfigurationException; +import org.apache.cassandra.service.StorageService; +import org.junit.Test; + +import org.apache.cassandra.utils.FBUtilities; + +import static org.junit.Assert.assertEquals; + +public class DynamicEndpointSnitchLongTest +{ +@Test +public void testConcurrency() throws InterruptedException, IOException, ConfigurationException +{ +// The goal of this test is to check for CASSANDRA-8448/CASSANDRA-9519 +double badness = DatabaseDescriptor.getDynamicBadnessThreshold(); +DatabaseDescriptor.setDynamicBadnessThreshold(0.0); + +try +{ +final int ITERATIONS = 1; + +// do this because SS needs to be initialized before DES can work properly. +StorageService.instance.initClient(0); +SimpleSnitch ss = new SimpleSnitch(); +DynamicEndpointSnitch dsnitch = new DynamicEndpointSnitch(ss, String.valueOf(ss.hashCode())); +InetAddress self = FBUtilities.getBroadcastAddress(); + +List<InetAddress> hosts = new ArrayList<InetAddress>(); +// We want a big list of hosts so sorting takes time, making it much more likely to reproduce the +// problem we're looking for. 
+for (int i = 0; i < 100; i++) +for (int j = 0; j < 256; j++) +hosts.add(InetAddress.getByAddress(new byte[]{127, 0, (byte)i, (byte)j})); + +ScoreUpdater updater = new ScoreUpdater(dsnitch, hosts); +updater.start(); + +List<InetAddress> result = null; +for (int i = 0; i < ITERATIONS; i++) +result = dsnitch.getSortedListByProximity(self, hosts); + +updater.stopped = true; +updater.join(); +} +finally +{ +DatabaseDescriptor.setDynamicBadnessThreshold(badness); +} +} + +public static class ScoreUpdater extends Thread +{ +private static final int SCORE_RANGE = 100; + +public volatile boolean stopped; + +private final DynamicEndpointSnitch dsnitch; +private final List<InetAddress> hosts; +private final Random random = new Random(); + +public ScoreUpdater(DynamicEndpointSnitch dsnitch, List<InetAddress> hosts) +{ +this.dsnitch = dsnitch; +this.hosts = hosts; +} + +public void run() +{ +while (!stopped) +{ +
[jira] [Commented] (CASSANDRA-9838) Unable to update an element in a static list
[ https://issues.apache.org/jira/browse/CASSANDRA-9838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631429#comment-14631429 ] Tyler Hobbs commented on CASSANDRA-9838: I think we recently changed the error message around that, which would explain why you are seeing a slightly different error. Regardless, this should be working (especially since the same thing works for non-static columns), so it's definitely a bug. Unable to update an element in a static list Key: CASSANDRA-9838 URL: https://issues.apache.org/jira/browse/CASSANDRA-9838 Project: Cassandra Issue Type: Bug Components: Core Environment: Cassandra 2.1.5 on Linux Reporter: Mahesh Datt Fix For: 2.1.x I created a table in cassandra (my_table) which has a static list column sizes_list. I created a new row and initialized the list sizes_list as having one element. {{UPDATE my_table SET sizes_list = sizes_list + [0] WHERE view_id = 0x01}} Now I'm trying to update the element at index '0' with a statement like this {code}insert into my_table (my_id, is_deleted , col_id1, col_id2) values (0x01, False, 0x00, 0x00); UPDATE my_table SET sizes_list[0] = 100 WHERE my_id = 0x01 ; {code} Now I see an error like this: {{InvalidRequest: code=2200 [Invalid query] message=List index 0 out of bound, list has size 0}} If I change my list to a non-static list, it works fine! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9798: --- Description: Hi, We noticed some problem with dropped mutationstages. Usually on one random node there is a situation that: MutationStage active is full, pending is increasing, completed is stalled. MemtableFlushWriter active 6, pending: 25 completed: stalled MemtablePostFlush active is 1, pending 29 completed: stalled after some time (30s-10min) pending mutations are dropped and everything is working. When it happened: 1. CPU idle is ~95% 2. no gc long pauses or more activity. 3. memory usage 3.5GB from 8GB 4. only writes are processed by cassandra 5. when LOAD > 400GB/node problems appeared 6. cassandra 2.1.6 There is a gap in the logs: {code} INFO 08:47:01 Timed out replaying hints to /192.168.100.83; aborting (0 delivered) INFO 08:47:01 Enqueuing flush of hints: 7870567 (0%) on-heap, 0 (0%) off-heap INFO 08:47:30 Enqueuing flush of table1: 95301807 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 60462632 (3%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table2: 76973746 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 84290135 (4%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table3: 56926652 (3%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table1: 85124218 (4%) on-heap, 0 (0%) off-heap INFO 08:47:33 Enqueuing flush of table2: 95663415 (4%) on-heap, 0 (0%) off-heap INFO 08:47:58 CompactionManager 239 INFO 08:47:58 Writing Memtable-table2@1767938721(13843064 serialized bytes, 162359 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-hints@1433125911(478703 serialized bytes, 424 ops, 0%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table2@1318583275(11783615 serialized bytes, 137378 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Enqueuing flush of compactions_in_progress: 969 (0%) on-heap, 0 (0%)
off-heap INFO 08:47:58 Writing Memtable-table1@541175113(17221327 serialized bytes, 180792 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table1@1361154669(27138519 serialized bytes, 273472 ops, 6%/0% of on/off-hea p limit) INFO 08:48:03 2176 MUTATION messages dropped in last 5000ms {code} use case: 100% write - 100Mb/s, couples of CF ~10column each. max cell size 100B CMS and G1GC tested - no difference was: Hi, We noticed some problem with dropped mutationstages. Usually on one random node there is a situation that: MutationStage active is full, pending is increasing completed is stalled. MemtableFlushWriter active 6, pending: 25 completed: stalled MemtablePostFlush active is 1, pending 29 completed: stalled after a some time (30s-10min) pending mutations are dropped and everything is working. When it happened: 1. Cpu idle is ~95% 2. no gc long pauses or more activity. 3. memory usage 3.5GB form 8GB 4. only writes is processed by cassandra 5. when LOAD 400GB/node problems appeared 6. 
cassandra 2.1.6 There is gap in logs: INFO 08:47:01 Timed out replaying hints to /192.168.100.83; aborting (0 delivered) INFO 08:47:01 Enqueuing flush of hints: 7870567 (0%) on-heap, 0 (0%) off-heap INFO 08:47:30 Enqueuing flush of table1: 95301807 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 60462632 (3%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table2: 76973746 (4%) on-heap, 0 (0%) off-heap INFO 08:47:31 Enqueuing flush of table1: 84290135 (4%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table3: 56926652 (3%) on-heap, 0 (0%) off-heap INFO 08:47:32 Enqueuing flush of table1: 85124218 (4%) on-heap, 0 (0%) off-heap INFO 08:47:33 Enqueuing flush of table2: 95663415 (4%) on-heap, 0 (0%) off-heap INFO 08:47:58 CompactionManager 239 INFO 08:47:58 Writing Memtable-table2@1767938721(13843064 serialized bytes, 162359 ops, 4%/0% of on/off-heap l imit) INFO 08:47:58 Writing Memtable-hints@1433125911(478703 serialized bytes, 424 ops, 0%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table2@1318583275(11783615 serialized bytes, 137378 ops, 4%/0% of on/off-heap l imit) INFO 08:47:58 Enqueuing flush of compactions_in_progress: 969 (0%) on-heap, 0 (0%) off-heap INFO 08:47:58 Writing Memtable-table1@541175113(17221327 serialized bytes, 180792 ops, 4%/0% of on/off-heap limit) INFO 08:47:58 Writing Memtable-table1@1361154669(27138519 serialized bytes, 273472 ops, 6%/0% of on/off-hea p limit) INFO 08:48:03 2176 MUTATION messages dropped in last 5000ms use case: 100% write - 100Mb/s, couples of CF ~10column each. max cell size 100B CMS and G1GC tested - no difference Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631435#comment-14631435 ] T Jake Luciani commented on CASSANDRA-6477: --- 1. This goes way back to Benedict's main concept https://issues.apache.org/jira/browse/CASSANDRA-6477?focusedCommentId=14039757page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14039757 We have each replica on the base table send the MV update to a single MV replica, so replicas are paired 1:1. 2. Since the coordinator issues a batchlog (BL) write against a QUORUM of all base replicas, which will always send to the MV replicas, we have a lot more work to do than only sending a failed base-to-view update. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
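The 1:1 pairing described above can be sketched as follows. This is an illustrative sketch only, not Cassandra's actual implementation; the class and method names (ReplicaPairing, pairedViewReplica) are invented for the example.

```java
import java.util.List;

// Hypothetical sketch of the 1:1 base-to-view replica pairing discussed above:
// the i-th replica in the base table's replica list is made responsible for
// forwarding the MV update to the i-th replica in the view's replica list.
public class ReplicaPairing {
    static String pairedViewReplica(String self, List<String> baseReplicas,
                                    List<String> viewReplicas) {
        int i = baseReplicas.indexOf(self);
        if (i < 0)
            throw new IllegalArgumentException("not a base replica: " + self);
        // wrap around in case the lists differ in length
        return viewReplicas.get(i % viewReplicas.size());
    }

    public static void main(String[] args) {
        List<String> base = List.of("10.0.0.1", "10.0.0.2", "10.0.0.3");
        List<String> view = List.of("10.0.1.1", "10.0.1.2", "10.0.1.3");
        // each base replica forwards the view update to exactly one peer
        System.out.println(pairedViewReplica("10.0.0.2", base, view));
    }
}
```

Because each base replica forwards to exactly one view replica, the per-write fan-out stays constant instead of growing with the view's replication factor, which is the write-amplification point made later in the thread.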
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631436#comment-14631436 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. Pedantically you are correct. Which is why I said effectively and not literally. Well, I mean, CL has always been more about when we answer the client than about how much work we do internally. Every write is always sent to every replica, for instance; the CL is just a matter of how long we wait before answering the client. I'm arguing this is exactly the case here too. Anyway, your "other side of that coin" made it sound like we were doing something unusual regarding the CL, something that may not be desirable, and I don't understand what that would be if that's the case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631440#comment-14631440 ] Sylvain Lebresne commented on CASSANDRA-6477: - I don't really think the cost of replaying the coordinator BL matters that much. We'll only replay it if less than a quorum of nodes answered a particular query, which should be pretty rare unless you have bigger problems with your cluster. And given that the local BL has a cost on every write, even if small, I don't think that from a performance perspective a local BL is a win. That said, I hadn't seen that we'd decided to go with pairing of base replica to MV replica. Doing so does justify a local BL (another option has always been to fan out to every MV replica, and since this ticket desperately misses a good description of what exact algorithm is actually implemented, I wasn't sure which option we went with). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631442#comment-14631442 ] Philip Thompson commented on CASSANDRA-9798: Have you encountered the error described in CASSANDRA-7275 in your logs at all? Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798 URL: https://issues.apache.org/jira/browse/CASSANDRA-9798 Project: Cassandra Issue Type: Bug Components: Core Environment: 4x HP Gen9 dl 360 servers 2x8 cpu each (Intel(R) Xeon E5-2667 v3 @ 3.20GHz) 6x900GB 10kRPM disk for data 1x900GB 10kRPM disk for commitlog 64GB ram ETH: 10Gb/s Red Hat Enterprise Linux Server release 6.6 (Santiago) 2.6.32-504.el6.x86_64 java build 1.8.0_45-b14 (openjdk) (tested on oracle java 8 too) Reporter: Łukasz Mrożkiewicz Fix For: 2.1.x Attachments: cassandra.log, cassandra.yaml, gc.log.0.current -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9798) Cassandra seems to have deadlocks during flush operations
[ https://issues.apache.org/jira/browse/CASSANDRA-9798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631444#comment-14631444 ] Benedict commented on CASSANDRA-9798: - [http://www.infoq.com/news/2015/05/redhat-futex] There is a serious bug in some recent kernels, more commonly encountered on certain CPUs. The result is lost thread signals, and hence stalled threads. GC activity may result in these wakeups finally being received, which could explain why the recovery coincides with the dropped-mutation message (which occurs under GC load). Cassandra seems to have deadlocks during flush operations - Key: CASSANDRA-9798 URL: https://issues.apache.org/jira/browse/CASSANDRA-9798 Project: Cassandra Issue Type: Bug Components: Core Environment: 4x HP Gen9 dl 360 servers 2x8 cpu each (Intel(R) Xeon E5-2667 v3 @ 3.20GHz) 6x900GB 10kRPM disk for data 1x900GB 10kRPM disk for commitlog 64GB ram ETH: 10Gb/s Red Hat Enterprise Linux Server release 6.6 (Santiago) 2.6.32-504.el6.x86_64 java build 1.8.0_45-b14 (openjdk) (tested on oracle java 8 too) Reporter: Łukasz Mrożkiewicz Fix For: 2.1.x Attachments: cassandra.log, cassandra.yaml, gc.log.0.current -- This message was sent by Atlassian JIRA (v6.3.4#6332)
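The failure mode Benedict describes (a lost futex wakeup leaving a thread parked indefinitely) can be illustrated with a small sketch. This is not Cassandra code; it only shows why a bounded wait recovers from a lost signal where an unbounded park() would hang. The simulation below deliberately sets the flag without ever calling unpark(), standing in for a lost wakeup.

```java
import java.util.concurrent.locks.LockSupport;

// Illustrative sketch (not Cassandra code): a consumer that uses a timed park
// instead of an indefinite one, so a lost wakeup signal only stalls it for one
// timeout interval before it re-checks its condition.
public class TimedParkDemo {
    private static volatile boolean work;

    static String runOnce() throws InterruptedException {
        work = false;
        StringBuilder result = new StringBuilder();
        Thread consumer = new Thread(() -> {
            while (!work) {
                // Bounded wait: even if the wakeup is lost, we resume within
                // ~10ms and re-check the flag instead of hanging forever.
                LockSupport.parkNanos(10_000_000L);
            }
            result.append("recovered");
        });
        consumer.start();
        Thread.sleep(50);
        work = true; // note: we never unpark() -- simulating a lost wakeup
        consumer.join();
        return result.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runOnce());
    }
}
```

With an unbounded `LockSupport.park()` in the loop, the same lost signal would leave the consumer stalled until some unrelated wakeup (such as the GC-related activity mentioned above) happened to arrive.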
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631446#comment-14631446 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. why not also pair it with RF-2 (or 1, and only support RF=3 for now) partners, to whom it requires the first write to be propagated, without which it does not acknowledge? This could be done with a specialised batchlog write, that goes to the local node and the paired MV node. I _think_ I get a vague idea of what you mean but I'm not fully sure (and I'm not fully sure it's practical). So let's first make sure I understand. Is the suggestion that, to guarantee that if a base-table replica applies an update then RF/2 other ones also do, we'd send the update to all base table replicas normally (without a coordinator batchlog), but each replica would 1) write the update to a local-only batchlog and 2) forward the update to RF/2 other base table replicas? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631459#comment-14631459 ] T Jake Luciani commented on CASSANDRA-6477: --- bq. We'll only replay it if less than a quorum of nodes answered a particular query, which should be pretty rare unless you have bigger problems with your cluster. bq. That said, I hadn't seen that we'd decided to go with pairing of base replica to MV replica. If we replicate to every MV replica from every base replica, the write amplification gets much worse, causing more timeouts. So it makes sense to have replication paired. I do think making the MV updates synchronous will cause a lot more timeouts and write latency (on top of what we have now). But if it's optional, then people can choose. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631466#comment-14631466 ] Jonathan Ellis commented on CASSANDRA-6477: --- No, you're right. Synchronous MV updates are a terrible idea, which is more obvious when considering the case of more than one MV. In the extreme case you could touch every node in the cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631472#comment-14631472 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. So it makes sense to have replication paired. Sure, I didn't imply otherwise; I just wasn't aware we were doing it. bq. I do think making the MV updates synchronous will cause a lot more timeouts and write latency (on top of what we have now). But if it's optional then people can choose. Frankly, I'm pretty negative on adding such an option. I think there are some basic guarantees that shouldn't be optional, and the CL ones are amongst those. Making it optional will have people shoot themselves in the foot all the time. At the very least, I would ask that we don't include such an option on this ticket (there is enough stuff to deal with) and open a separate ticket to discuss it (one on which we'd actually benchmark things before assuming there will be timeouts). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631471#comment-14631471 ] Jonathan Ellis commented on CASSANDRA-6477: --- If there are multiple MVs being updated, do they get merged into a single set of batchlogs? (I.e., just one on the coordinator and one on each base replica, instead of one per MV.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630846#comment-14630846 ] Robert Stupp commented on CASSANDRA-9837: - [~mckibben] thanks for the patch! Will review it soon. Bad logging interpolation string in Memtable -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs: "Completed flushing %s." Looking at the source, it appears to be a mix of logback-style {} substitution and String.format-style %s substitution. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
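The bug class reported above is easy to reproduce: an SLF4J/logback-style logger substitutes {} placeholders, so a String.format-style %s passes through verbatim and the arguments land in the wrong (or no) placeholder. The toy format method below mimics the {}-substitution; it is not the real SLF4J implementation, and the message strings are simplified versions of the one in Memtable.

```java
// Minimal illustration of the bug class: an SLF4J-style logger substitutes
// "{}" placeholders, so a String.format-style "%s" survives verbatim and the
// first argument fills a later "{}" instead.
public class PlaceholderDemo {
    // toy stand-in for slf4j's parameterized message formatting
    static String format(String msg, Object... args) {
        StringBuilder sb = new StringBuilder();
        int argIdx = 0, i = 0;
        while (i < msg.length()) {
            if (argIdx < args.length && msg.startsWith("{}", i)) {
                sb.append(args[argIdx++]);
                i += 2;
            } else {
                sb.append(msg.charAt(i++));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // buggy: "%s" is not a placeholder; the filename fills the wrong "{}"
        System.out.println(format("Completed flushing %s. Position was {}", "file.db", 42));
        // fixed: both placeholders are "{}", so both arguments land correctly
        System.out.println(format("Completed flushing {}. Position was {}", "file.db", 42));
    }
}
```

In the buggy variant, "%s" is printed literally and the filename consumes the placeholder meant for the position, which is exactly the shape of the "Completed flushing %s" entries the reporter saw.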
[3/6] cassandra git commit: Fix broken logging for empty flushes in Memtable
Fix broken logging for empty flushes in Memtable

patch by Michael McKibben; reviewed by Robert Stupp for CASSANDRA-9837

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1eda7cb5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1eda7cb5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1eda7cb5
Branch: refs/heads/trunk
Commit: 1eda7cb55c5876046cbc3f4ace3c7812ca032f69
Parents: 4fcd7d4
Author: Michael McKibben mikemckib...@gmail.com
Authored: Fri Jul 17 08:36:00 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:36:12 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index e950f3b..49cc850 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/src/java/org/apache/cassandra/db/Memtable.java

diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java
index 9f6cf9b..375195f 100644
--- a/src/java/org/apache/cassandra/db/Memtable.java
+++ b/src/java/org/apache/cassandra/db/Memtable.java
@@ -390,7 +390,7 @@ public class Memtable
         }
         else
         {
-            logger.info("Completed flushing %s; nothing needed to be retained. Commitlog position was {}",
+            logger.info("Completed flushing {}; nothing needed to be retained. Commitlog position was {}",
                         writer.getFilename(), context);
             writer.abort();
             ssTable = null;
[6/6] cassandra git commit: Merge branch 'cassandra-2.2' into trunk
Merge branch 'cassandra-2.2' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f5f3ae1d
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f5f3ae1d
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f5f3ae1d
Branch: refs/heads/trunk
Commit: f5f3ae1da45d633c5eb03b3fe760b4e866dca9d7
Parents: 689582c f74419c
Author: Robert Stupp sn...@snazy.de
Authored: Fri Jul 17 08:37:49 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:37:49 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5f3ae1d/CHANGES.txt

diff --cc CHANGES.txt
index 4e2c22e,e6c093d..76d6e92
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,27 -1,8 +1,28 @@@
+3.0
+ * Metrics should use up to date nomenclature (CASSANDRA-9448)
+ * Change CREATE/ALTER TABLE syntax for compression (CASSANDRA-8384)
+ * Cleanup crc and adler code for java 8 (CASSANDRA-9650)
+ * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808)
+ * Update Guava to 18.0 (CASSANDRA-9653)
+ * Bloom filter false positive ratio is not honoured (CASSANDRA-8413)
+ * New option for cassandra-stress to leave a ratio of columns null (CASSANDRA-9522)
+ * Change hinted_handoff_enabled yaml setting, JMX (CASSANDRA-9035)
+ * Add algorithmic token allocation (CASSANDRA-7032)
+ * Add nodetool command to replay batchlog (CASSANDRA-9547)
+ * Make file buffer cache independent of paths being read (CASSANDRA-8897)
+ * Remove deprecated legacy Hadoop code (CASSANDRA-9353)
+ * Decommissioned nodes will not rejoin the cluster (CASSANDRA-8801)
+ * Change gossip stabilization to use endpoit size (CASSANDRA-9401)
+ * Change default garbage collector to G1 (CASSANDRA-7486)
+ * Populate TokenMetadata early during startup (CASSANDRA-9317)
+ * undeprecate cache recentHitRate (CASSANDRA-6591)
+ * Add support for selectively varint encoding fields (CASSANDRA-9499)
+
+ 2.2.0-rc3
- * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
  * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
 Merged from 2.1:
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f5f3ae1d/src/java/org/apache/cassandra/db/Memtable.java --
[5/6] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f74419cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f74419cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f74419cd
Branch: refs/heads/cassandra-2.2
Commit: f74419cd2b13c3c8fe01d09df16f7edae583fe35
Parents: 2b99b5d 1eda7cb
Author: Robert Stupp sn...@snazy.de
Authored: Fri Jul 17 08:37:37 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:37:37 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/CHANGES.txt

diff --cc CHANGES.txt
index b4ea4b4,49cc850..e6c093d
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,7 -1,5 +1,8 @@@
-2.1.9
+2.2.0-rc3
+ * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
+ * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
+Merged from 2.1:
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/src/java/org/apache/cassandra/db/Memtable.java --
[1/6] cassandra git commit: Fix broken logging for empty flushes in Memtable
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 4fcd7d4d3 -> 1eda7cb55
  refs/heads/cassandra-2.2 2b99b5d35 -> f74419cd2
  refs/heads/trunk 689582c04 -> f5f3ae1da

Fix broken logging for empty flushes in Memtable

patch by Michael McKibben; reviewed by Robert Stupp for CASSANDRA-9837

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1eda7cb5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1eda7cb5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1eda7cb5
Branch: refs/heads/cassandra-2.1
Commit: 1eda7cb55c5876046cbc3f4ace3c7812ca032f69
Parents: 4fcd7d4
Author: Michael McKibben mikemckib...@gmail.com
Authored: Fri Jul 17 08:36:00 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:36:12 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index e950f3b..49cc850 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/src/java/org/apache/cassandra/db/Memtable.java

diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java
index 9f6cf9b..375195f 100644
--- a/src/java/org/apache/cassandra/db/Memtable.java
+++ b/src/java/org/apache/cassandra/db/Memtable.java
@@ -390,7 +390,7 @@ public class Memtable
         }
         else
         {
-            logger.info("Completed flushing %s; nothing needed to be retained. Commitlog position was {}",
+            logger.info("Completed flushing {}; nothing needed to be retained. Commitlog position was {}",
                         writer.getFilename(), context);
             writer.abort();
             ssTable = null;
[4/6] cassandra git commit: Merge branch 'cassandra-2.1' into cassandra-2.2
Merge branch 'cassandra-2.1' into cassandra-2.2

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f74419cd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f74419cd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f74419cd
Branch: refs/heads/trunk
Commit: f74419cd2b13c3c8fe01d09df16f7edae583fe35
Parents: 2b99b5d 1eda7cb
Author: Robert Stupp sn...@snazy.de
Authored: Fri Jul 17 08:37:37 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:37:37 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/CHANGES.txt

diff --cc CHANGES.txt
index b4ea4b4,49cc850..e6c093d
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,7 -1,5 +1,8 @@@
-2.1.9
+2.2.0-rc3
+ * sum() and avg() functions missing for smallint and tinyint types (CASSANDRA-9671)
+ * Revert CASSANDRA-9542 (allow native functions in UDA) (CASSANDRA-9771)
+Merged from 2.1:
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f74419cd/src/java/org/apache/cassandra/db/Memtable.java --
[2/6] cassandra git commit: Fix broken logging for empty flushes in Memtable
Fix broken logging for empty flushes in Memtable

patch by Michael McKibben; reviewed by Robert Stupp for CASSANDRA-9837

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1eda7cb5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1eda7cb5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1eda7cb5
Branch: refs/heads/cassandra-2.2
Commit: 1eda7cb55c5876046cbc3f4ace3c7812ca032f69
Parents: 4fcd7d4
Author: Michael McKibben mikemckib...@gmail.com
Authored: Fri Jul 17 08:36:00 2015 +0200
Committer: Robert Stupp sn...@snazy.de
Committed: Fri Jul 17 08:36:12 2015 +0200

 CHANGES.txt                                    | 1 +
 src/java/org/apache/cassandra/db/Memtable.java | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index e950f3b..49cc850 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.9
+ * Fix broken logging for empty flushes in Memtable (CASSANDRA-9837)
  * Complete CASSANDRA-8448 fix (CASSANDRA-9519)
  * Handle corrupt files on startup (CASSANDRA-9686)
  * Fix clientutil jar and tests (CASSANDRA-9760)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1eda7cb5/src/java/org/apache/cassandra/db/Memtable.java

diff --git a/src/java/org/apache/cassandra/db/Memtable.java b/src/java/org/apache/cassandra/db/Memtable.java
index 9f6cf9b..375195f 100644
--- a/src/java/org/apache/cassandra/db/Memtable.java
+++ b/src/java/org/apache/cassandra/db/Memtable.java
@@ -390,7 +390,7 @@ public class Memtable
         }
         else
         {
-            logger.info("Completed flushing %s; nothing needed to be retained. Commitlog position was {}",
+            logger.info("Completed flushing {}; nothing needed to be retained. Commitlog position was {}",
                         writer.getFilename(), context);
             writer.abort();
             ssTable = null;
[jira] [Commented] (CASSANDRA-9797) Don't wrap byte arrays in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631017#comment-14631017 ] Sylvain Lebresne commented on CASSANDRA-9797: - bq. I mean reintroduce the write(byte[]...) implementation as we had it then, since it likely introduces fewer risks to restore behaviour as it was than to rewrite it. That would make sense, though looking at it more closely, there have been enough changes to {{SequentialWriter}} that just copy-pasting the old version doesn't work at all (that old version uses {{bufferCursor()}}, which doesn't exist anymore, it sets {{current}} even though that's now a method, not a field, and it sets {{validBufferBytes}}, which also doesn't exist anymore). So we would need to revert a little bit more than just that one method, or modify it to fit the new code, but both of those look a lot more risky than the very simple version attached. That said, I'm also not the most familiar with the various evolutions of {{SequentialWriter}}, so if someone feels more confident with one of those two previous options, I'm happy to let you have a shot at it. Don't wrap byte arrays in SequentialWriter -- Key: CASSANDRA-9797 URL: https://issues.apache.org/jira/browse/CASSANDRA-9797 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: performance Fix For: 3.x Attachments: 9797.txt While profiling a simple stress write run ({{cassandra-stress write n=200 -rate threads=50}} to be precise) with Mission Control, I noticed that a non-trivial amount of heap pressure was due to the {{ByteBuffer.wrap()}} call in {{SequentialWriter.write(byte[])}}. Basically, when writing a byte array, we wrap it in a ByteBuffer to reuse the {{SequentialWriter.write(ByteBuffer)}} method. One could have hoped this wrapping would be stack allocated, but if Mission Control isn't lying (and I was told it's fairly honest on that front), it's not. 
And we do use that {{write(byte[])}} method quite a bit, especially with the new vint encodings, since they use a {{byte[]}} thread-local buffer and call that method. Anyway, it sounds very simple to me to have a more direct {{write(byte[])}} method, so I'm attaching a patch to do that. A very quick local benchmark seems to show a little less allocation and a slight edge for the branch with this patch (on top of CASSANDRA-9705, I must add), but that local bench was far from scientific, so I'm happy if someone who knows how to use our perf service wants to give that patch a shot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
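A hedged sketch of the idea discussed above — a direct {{write(byte[], off, len)}} that copies into the writer's internal buffer with {{System.arraycopy}} instead of allocating a {{ByteBuffer.wrap()}} per call. The class and method names below are illustrative, not the actual {{SequentialWriter}} API, and a ByteArrayOutputStream stands in for the file channel:

```java
import java.io.ByteArrayOutputStream;

// Illustrative mini-writer: write(byte[]) copies directly into the internal
// buffer, so no per-call ByteBuffer wrapper object is ever allocated.
class MiniSequentialWriter {
    private final byte[] buffer;
    private int position;
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the file

    MiniSequentialWriter(int bufferSize) { buffer = new byte[bufferSize]; }

    void write(byte[] src, int off, int len) {
        while (len > 0) {
            int toCopy = Math.min(len, buffer.length - position);
            System.arraycopy(src, off, buffer, position, toCopy); // no ByteBuffer.wrap()
            position += toCopy;
            off += toCopy;
            len -= toCopy;
            if (position == buffer.length)
                flushInternal(); // buffer full: push to the sink and reset
        }
    }

    private void flushInternal() {
        sink.write(buffer, 0, position);
        position = 0;
    }

    byte[] finish() {
        flushInternal();
        return sink.toByteArray();
    }
}
```

The loop handles arrays larger than the internal buffer by flushing between chunks, which is the behaviour a direct byte-array path has to preserve.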
[jira] [Updated] (CASSANDRA-9837) Fix broken logging for empty flushes in Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9837: Summary: Fix broken logging for empty flushes in Memtable (was: Bad logging interpolation string in Memtable: ) Fix broken logging for empty flushes in Memtable -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630910#comment-14630910 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. Why do we need this at all? Since replicas are in charge of updating MV then normal hints should perform the same function as batchlog except without the performance hit in the normal case. Allow me to sum up how we deal with consistency guarantees, why we do it this way and why I don't think hints work. I'm sorry if this response is a bit verbose, but as this is the most important thing of this ticket imo, I think it bears repeating and making sure we're all on the same page. The main guarantee we have to provide here is that MVs are eventually consistent with their base table. In other words, whatever failure scenarios we run into, we should never have an inconsistency that never gets resolved. The canonical example of why this is not a given: we have a column {{c = 2}} in the base table that is also in a MV PK, and we have 2 concurrent updates A (sets {{c = 3}}) and B (sets {{c = 4}}). Without any kind of protection, we could end up with the MV permanently having 2 entries, one for A and one for B, which is incorrect (the MV should eventually converge to the update that has the biggest timestamp, since that's what the base table will keep). To the best of my knowledge, there are 2 fundamental components to avoiding such permanent inconsistency in the currently written patch/approach: # On each replica, we synchronize/serialize the read-before-write done on the base table. This guarantees that we won't have A and B racing on a single base-table replica. Or, in other words, *if* the same replica sees both updates (where sees means does the read-before-write-and-update-MV-accordingly dance), then it will properly update the MV. 
And since each base-table replica updates all MV-table replicas, it's enough that a single base-table replica sees both updates to guarantee eventual consistency of the MV. But we do need to guarantee that _at least_ one such base-table replica sees both updates, and that's the 2nd component. # To provide that latter guarantee, we first put each base-table update that includes MV updates in the batchlog on the coordinator, and we only remove it from the batchlog once a _QUORUM_ of replicas has acknowledged the write (this is importantly not dependent on the CL; eventual consistency must be guaranteed whatever CL you use). That guarantees us that until a QUORUM of replicas has seen the update, we'll keep replaying it, which in turn guarantees that for any 2 updates, at least one replica will have seen them both. Now, the latter guarantee cannot be provided by hints, because we can't guarantee hint delivery in the face of failures. Typically, if I write a hint on a node and that node dies in a fire before that hint is delivered, it will never be delivered. We need a distributed hint mechanism if you will, and that's what the batchlog gives us. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
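The batchlog rule in the second component above — keep replaying the entry until a QUORUM of base-table replicas (not the user's CL) has acknowledged — can be sketched as a tiny coordinator-side bookkeeping class. All names here are illustrative, not Cassandra's batchlog implementation:

```java
// Illustrative sketch of the coordinator-side retention rule described above:
// a batchlog entry may only be removed once a majority (quorum) of the
// base-table replicas have acked, regardless of the consistency level the
// client wrote at. Until then the coordinator keeps replaying the mutation.
class BatchlogEntry {
    private final int replicationFactor;
    private int acks;

    BatchlogEntry(int replicationFactor) { this.replicationFactor = replicationFactor; }

    void onReplicaAck() { acks++; }

    // Quorum = floor(RF / 2) + 1; deliberately independent of the user CL.
    boolean canRemove() { return acks >= replicationFactor / 2 + 1; }
}
```

With RF=3, the entry stays replayable after one ack and becomes removable only after the second, which is exactly what guarantees that any two updates share at least one witnessing replica.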
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630928#comment-14630928 ] Stefania commented on CASSANDRA-8894: - [~benedict] I went ahead and implemented the latest suggested optimization in this commit [here|https://github.com/stef1927/cassandra/commit/ad6712cdc12380ef0529a13ed6e9bd1c5cecebad]. I've also attached tentative stress yaml profiles, which I intend to run like this: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(insert=1,\) n=10 -rate threads=50 user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(singleblob=1,\) n=10 -rate threads=50 {code} Can you confirm the profiles are what you intended, basically a partition id and a blob column with the size distributed as you previously indicated? I'm not sure if there is anything else I should do to ensure reads mostly hit disk - other than spreading the partition id across a big interval? I created these additional branches: - trunk-pre-8099 - 8894-pre-8099 - 8894-pre-8099-first-optim - 8894-first-optim The names are self-describing, except for first-optim, which means before implementing the latest optimization. A tag would have been enough, but cstar perf does not support it. Unfortunately cstar perf has been giving me more problems than just tags, cc [~enigmacurry]: * The old trunk branches pre 8099 fail due to the schema tables changes (http://cstar.datastax.com/tests/id/e134ee7e-2c46-11e5-a180-42010af0688f) : InvalidQueryException: Keyspace system_schema does not exist. However I think if we fake version 2.2 in build.xml we should be OK. * The new branches either fail because of a nodetool failure (http://cstar.datastax.com/tests/id/86abc144-2c55-11e5-87b9-42010af0688f) or the graphs are wrong (http://cstar.datastax.com/tests/id/11fe9c5a-2c45-11e5-9760-42010af0688f). 
Here is the nodetool failure: {code} [10.200.241.104] Executing task 'ensure_running' [10.200.241.104] run: JAVA_HOME=~/fab/jvms/jdk1.8.0_45 ~/fab/cassandra/bin/nodetool ring [10.200.241.104] out: error: null [10.200.241.104] out: -- StackTrace -- [10.200.241.104] out: java.util.NoSuchElementException [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:506) [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:494) [10.200.241.104] out: at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) [10.200.241.104] out: at java.util.Collections.max(Collections.java:708) [10.200.241.104] out: at org.apache.cassandra.tools.nodetool.Ring.execute(Ring.java:63) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:240) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:154) [10.200.241.104] out: [10.200.241.104] out: {code} I'll resume the performance tests once cstar perf is stable again. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to slower buffered reads than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we are unlikely to almost ever be hitting the TLB, and will be incurring at least 30 unnecessary misses each time (as well as the other costs of larger than necessary accesses). 
When working with an SSD there is little to no benefit in reading more than 4Kb at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we can expect to read each record in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
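The sizing rule proposed in the ticket description — the smallest power of two at or above the mean record size, floored at 4Kb — is a one-liner. A sketch (class and method names are illustrative, not the Cassandra API):

```java
// Illustrative sketch of the proposed buffer sizing rule: pick the smallest
// power of two that is >= the mean record size, with a 4KiB minimum, so a
// typical record is read in a single page-aligned operation.
class BufferSizeChooser {
    static final int MIN_BUFFER_SIZE = 4096; // 4KiB floor from the proposal

    static int chooseBufferSize(long meanRecordSize) {
        int size = MIN_BUFFER_SIZE;
        while (size < meanRecordSize)
            size <<= 1; // next power of two
        return size;
    }
}
```

For the 140-byte average records mentioned above this yields 4096, i.e. a single virtual page instead of the 16 pages a 64Kb read touches.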
[jira] [Commented] (CASSANDRA-6492) Have server pick query page size by default
[ https://issues.apache.org/jira/browse/CASSANDRA-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630977#comment-14630977 ] Sylvain Lebresne commented on CASSANDRA-6492: - bq. I'm just worried about not being able to meet user expectations when we first expose a page size in bytes. I understand, and it's a valid concern. But I don't know, I'm just not a fan of hard-coded magic constants. Even if we hide that bytes target from view, we might still be really off on our stats and miss it, which can still have user-visible consequences, so I'm not sure this ultimately helps users' comprehension of what is going on. The other aspect is that if we do that (just have a default mode), users for which the default doesn't work are still stuck with providing the page size in number of rows, which still requires them to guess-estimate their average row size, which is annoying to do when we can probably do a pretty good job of guess-estimating server-side automatically. But I totally agree we should be very clear initially that this is a very soft target. And maybe we can experiment a bit to get a better sense of how bad that estimate will be in practice. That is, we can try different schemas and workloads (even actively try to game the estimate), and if it proves very easy to get an estimate that is very off, then I can agree that exposing the size is probably not a good idea (though if that's the case, it will also be worth asking ourselves if even a default is going to help more than it hurts). If it's quite hard, however, to get an estimate that is far off reality, then we'll still warn users that it's not precise, but that's probably good enough in practice. 
Have server pick query page size by default --- Key: CASSANDRA-6492 URL: https://issues.apache.org/jira/browse/CASSANDRA-6492 Project: Cassandra Issue Type: New Feature Components: API Reporter: Jonathan Ellis Assignee: Benjamin Lerer Priority: Minor Labels: client-impacting We're almost always going to do a better job picking a page size based on sstable stats, than users will guesstimating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
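The server-side guess-estimation discussed above amounts to dividing a (soft) byte target by the mean row size taken from sstable stats. A hedged sketch — the class, method, and fallback value are all illustrative assumptions, not Cassandra's paging code:

```java
// Illustrative sketch only: derive a row-count page size from a soft byte
// target and the mean row size estimated from sstable statistics.
class PageSizeEstimator {
    static int rowsPerPage(long targetBytes, long meanRowSizeBytes) {
        if (meanRowSizeBytes <= 0)
            return 100; // arbitrary fallback when no stats are available (assumption)
        // At least one row per page, otherwise target / mean row size.
        return (int) Math.max(1, targetBytes / meanRowSizeBytes);
    }
}
```

Because the mean row size is only an estimate, the resulting page size is exactly the "very soft target" the comment warns about: a badly skewed estimate directly inflates or shrinks the bytes actually returned per page.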
cassandra git commit: Fix handling of thrift non-string comparators
Repository: cassandra Updated Branches: refs/heads/trunk f5f3ae1da - 412e8743d Fix handling of thrift non-string comparators patch by slebresne; reviewed by iamaleksey for CASSANDRA-9825 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/412e8743 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/412e8743 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/412e8743 Branch: refs/heads/trunk Commit: 412e8743d7e933e5b3008242f74007f7ddd435cb Parents: f5f3ae1 Author: Sylvain Lebresne sylv...@datastax.com Authored: Thu Jul 16 15:38:02 2015 +0200 Committer: Sylvain Lebresne sylv...@datastax.com Committed: Fri Jul 17 10:39:10 2015 +0200 -- CHANGES.txt | 2 +- src/java/org/apache/cassandra/config/CFMetaData.java | 5 +++-- 2 files changed, 4 insertions(+), 3 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/412e8743/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 76d6e92..db306ea 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -2,7 +2,7 @@ * Metrics should use up to date nomenclature (CASSANDRA-9448) * Change CREATE/ALTER TABLE syntax for compression (CASSANDRA-8384) * Cleanup crc and adler code for java 8 (CASSANDRA-9650) - * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808) + * Storage engine refactor (CASSANDRA-8099, 9743, 9746, 9759, 9781, 9808, 9825) * Update Guava to 18.0 (CASSANDRA-9653) * Bloom filter false positive ratio is not honoured (CASSANDRA-8413) * New option for cassandra-stress to leave a ratio of columns null (CASSANDRA-9522) http://git-wip-us.apache.org/repos/asf/cassandra/blob/412e8743/src/java/org/apache/cassandra/config/CFMetaData.java -- diff --git a/src/java/org/apache/cassandra/config/CFMetaData.java b/src/java/org/apache/cassandra/config/CFMetaData.java index 84639dc..ee1ed25 100644 --- a/src/java/org/apache/cassandra/config/CFMetaData.java +++ b/src/java/org/apache/cassandra/config/CFMetaData.java @@ 
-1117,7 +1117,7 @@ public final class CFMetaData interval (%d).", maxIndexInterval, minIndexInterval)); } -// The comparator to validate the definition name. +// The comparator to validate the definition name with thrift. public AbstractType<?> thriftColumnNameType() { if (isSuper()) @@ -1127,7 +1127,8 @@ public final class CFMetaData return ((MapType)def.type).nameComparator(); } -return UTF8Type.instance; +assert isStaticCompactTable(); +return clusteringColumns.get(0).type; } public CFMetaData addAllColumnDefinitions(Collection<ColumnDefinition> defs)
[jira] [Updated] (CASSANDRA-9797) Don't wrap byte arrays in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-9797: Fix Version/s: 2.2.x Don't wrap byte arrays in SequentialWriter -- Key: CASSANDRA-9797 URL: https://issues.apache.org/jira/browse/CASSANDRA-9797 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: performance Fix For: 3.x, 2.2.x Attachments: 9797.txt While profiling a simple stress write run ({{cassandra-stress write n=200 -rate threads=50}} to be precise) with Mission Control, I noticed that a non trivial amount of heap pressure was due to the {{ByteBuffer.wrap()}} call in {{SequentialWriter.write(byte[])}}. Basically, when writing a byte array, we wrap it in a ByteBuffer to reuse the {{SequentialWriter.write(ByteBuffer)}} method. One could have hoped this wrapping would be stack allocated, but if Mission Control isn't lying (and I was told it's fairly honest on that front), it's not. And we do use that {{write(byte[])}} method quite a bit, especially with the new vint encodings since they use a {{byte[]}} thread local buffer and call that method. Anyway, it sounds very simple to me to have a more direct {{write(byte[])}} method, so attaching a patch to do that. A very quick local benchmark seems to show a little bit less allocation and a slight edge for the branch with this patch (on top of CASSANDRA-9705 I must add), but that local bench was far from scientific so happy if someone that knows how to use our perf service want to give that patch a shot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631066#comment-14631066 ] Stefania commented on CASSANDRA-7066: - bq. Also, I'd like to propose we hide TransactionLogs a little, by making its class constructor package-private, and ensuring it only ever exists as part of a LifecycleTransaction. [~benedict], we can make it package-private, but I think there is one legitimate case where we need the transaction logs without a lifecycle transaction, and that is when we only have a writer, like for {{SSTableTxnWriter}}. Do you think we should extend the lifecycle transaction to handle the case of no existing readers and no cfs, but only one new writer? It seems kind of heavy to me and I would prefer to just move SSTableTxnWriter to the lifecycle package, perhaps with a better name? Also, the transaction logs must be created before the writer, because it must register the new file in its constructor before creating it. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. 
I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
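The startup rule proposed above — delete every sstable that appears in the union of all ancestor sets — can be sketched over generation numbers. Everything here is an illustrative model, not Cassandra's sstable metadata API:

```java
import java.util.*;

// Illustrative sketch of the proposed cleanup: each live sstable (keyed by
// generation number) records its direct ancestors; any on-disk generation
// found in the union of those ancestor sets was superseded by a finished
// compaction and is a leftover that can be deleted at startup.
class LeftoverCleanup {
    static Set<Integer> leftovers(Map<Integer, Set<Integer>> ancestorsByGeneration) {
        Set<Integer> union = new HashSet<>();
        for (Set<Integer> ancestors : ancestorsByGeneration.values())
            union.addAll(ancestors);
        // Only generations actually present on disk can be deleted.
        union.retainAll(ancestorsByGeneration.keySet());
        return union;
    }
}
```

For example, if sstable 3 lists ancestors {1, 2} and sstables 1 and 2 are still on disk, both are identified as leftovers as soon as 3's write is complete — no system-table bookkeeping required.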
[jira] [Updated] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9837: Reviewer: Robert Stupp Bad logging interpolation string in Memtable: -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630926#comment-14630926 ] Sylvain Lebresne commented on CASSANDRA-6477: - bq. Made the base -> view mutations async. Once we write to the local batchlog, we don't care if the actual mutations are sent, it's best effort. So we can fire and forget these and update the base memtable. That's correct from an eventual consistency point of view, but I'm pretty sure this breaks the CL guarantees for the user. What we want is that if I write the base table at {{CL.QUORUM}}, and then read my MV at {{CL.QUORUM}}, then I'm guaranteed to see my previous update. But that requires that each replica does synchronous updates to the MV, and with the user CL. Writing to a local batchlog is not enough, in particular since it doesn't give any kind of guarantee of the visibility of the update. See my next comment though on that local batchlog. bq. Made the Base -> View batchlog update local only I've actually never understood why we do a batchlog update on the base table replicas (and so I think we should remove it, even though that's likely not the most costly one). Why do we need it? If my reasoning above is correct, the coordinator batchlog is enough to guarantee durability and eventual consistency, because we will replay the whole mutation until a QUORUM of replicas acknowledges success. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
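The read-your-writes argument in the comment above rests on standard quorum-overlap arithmetic: reads at R and writes at W intersect on at least one replica whenever R + W > N. A minimal sketch (illustrative names, not Cassandra code):

```java
// Illustrative quorum arithmetic behind the CL.QUORUM discussion above:
// with replication factor n, a read at CL r and a write at CL w are
// guaranteed to overlap on at least one replica iff r + w > n.
class QuorumMath {
    static int quorum(int n) { return n / 2 + 1; }

    static boolean readsSeeWrites(int n, int r, int w) { return r + w > n; }
}
```

With RF=3, QUORUM is 2, and QUORUM writes plus QUORUM reads (2 + 2 > 3) overlap — which is exactly the guarantee that fire-and-forget view updates would break.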
[jira] [Commented] (CASSANDRA-9785) dtests-offheap: leak detected
[ https://issues.apache.org/jira/browse/CASSANDRA-9785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630866#comment-14630866 ] Robert Stupp commented on CASSANDRA-9785: - [~benedict] out of curiosity: do you know whether this _LEAK DETECTED_ is already handled in another ticket? dtests-offheap: leak detected - Key: CASSANDRA-9785 URL: https://issues.apache.org/jira/browse/CASSANDRA-9785 Project: Cassandra Issue Type: Bug Reporter: Robert Stupp Following dtests fail with LEAK DETECTED with {{OFFHEAP_MEMTABLES=yes}}: * repair_test.py:TestRepair.dc_repair_test * repair_test.TestRepair.simple_sequential_repair_test -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14630928#comment-14630928 ] Stefania edited comment on CASSANDRA-8894 at 7/17/15 7:55 AM: -- [~benedict] I went ahead and implemented the latest suggested optimization in this commit [here|https://github.com/stef1927/cassandra/commit/ad6712cdc12380ef0529a13ed6e9bd1c5cecebad]. I've also attached tentative stress yaml profiles, which I intend to run like this: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(insert=1,\) n=10 -rate threads=50 user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(singleblob=1,\) n=10 -rate threads=50 {code} Can you confirm the profiles are what you intended, basically a partition id and a blob column with the size distributed as you previously indicated. I'm not sure if there is anything else I should do to ensure reads mostly hit disk - other than spreading the partition id across a big interval? I created these additional branches: - trunk-pre-8099 - 8894-pre-8099 - 8894-pre-8099-first-optim - 8894-first-optim The names are self describing except for first-optim which means before implementing the latest optimization. A tag would have been enough but cstar perf does not support it. Unfortunately cstar perf has been giving me more problems other than tags, cc [~enigmacurry]: * The old trunk branches pre 8099 fail due to the schema tables changes (http://cstar.datastax.com/tests/id/e134ee7e-2c46-11e5-a180-42010af0688f) : InvalidQueryException: Keyspace system_schema does not exist. However I think if we fake version 2.2 in build.xml we should be OK. * The new branches either fail because of a nodetool failure (http://cstar.datastax.com/tests/id/86abc144-2c55-11e5-87b9-42010af0688f) or the graphs are wrong (http://cstar.datastax.com/tests/id/11fe9c5a-2c45-11e5-9760-42010af0688f). 
Here is the nodetool failure: {code} [10.200.241.104] Executing task 'ensure_running' [10.200.241.104] run: JAVA_HOME=~/fab/jvms/jdk1.8.0_45 ~/fab/cassandra/bin/nodetool ring [10.200.241.104] out: error: null [10.200.241.104] out: -- StackTrace -- [10.200.241.104] out: java.util.NoSuchElementException [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:506) [10.200.241.104] out: at com.google.common.collect.LinkedHashMultimap$1.next(LinkedHashMultimap.java:494) [10.200.241.104] out: at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) [10.200.241.104] out: at java.util.Collections.max(Collections.java:708) [10.200.241.104] out: at org.apache.cassandra.tools.nodetool.Ring.execute(Ring.java:63) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:240) [10.200.241.104] out: at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:154) [10.200.241.104] out: [10.200.241.104] out: {code} I'll resume the performance tests once cstar perf is stable again. was (Author: stefania): [~benedict] I went ahead and implemented the latest suggested optimization in this commit [here|https://github.com/stef1927/cassandra/commit/ad6712cdc12380ef0529a13ed6e9bd1c5cecebad]. I've also attached tentative stress yaml profiles, which I intend to run like this: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(insert=1,\) n=10 -rate threads=50 user profile=https://dl.dropboxusercontent.com/u/15683245/8894_tiny.yaml ops\(singleblob=1,\) n=10 -rate threads=50 {code} Can you confirm the profiles are what you intended, basically a partition id and a blob column with the size distributed as you previously indicated. I'm not sure if there is anything else I should do to ensure reads mostly hit disk - other than spreading the partition id across a bit interval? 
I created these additional branches: - trunk-pre-8099 - 8894-pre-8099 - 8894-pre-8099-first-optim - 8894-first-optim The names are self describing except for first-optim which means before implementing the latest optimization. A tag would have been enough but cstar perf does not support it. Unfortunately cstar perf has been giving me more problems other than tags, cc [~enigmacurry]: * The old trunk branches pre 8099 fail due to the schema tables changes (http://cstar.datastax.com/tests/id/e134ee7e-2c46-11e5-a180-42010af0688f) : InvalidQueryException: Keyspace system_schema does not exist. However I think if we fake version 2.2 in build.xml we should be OK. * The new branches either fail because of a nodetool failure (http://cstar.datastax.com/tests/id/86abc144-2c55-11e5-87b9-42010af0688f) or the graphs are wrong (http://cstar.datastax.com/tests/id/11fe9c5a-2c45-11e5-9760-42010af0688f). Here is the nodetool failure: {code} [10.200.241.104] Executing task
[jira] [Updated] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp updated CASSANDRA-9837: Assignee: (was: Robert Stupp) Bad logging interpolation string in Memtable: -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (CASSANDRA-9837) Bad logging interpolation string in Memtable:
[ https://issues.apache.org/jira/browse/CASSANDRA-9837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp reassigned CASSANDRA-9837: --- Assignee: Robert Stupp Bad logging interpolation string in Memtable: -- Key: CASSANDRA-9837 URL: https://issues.apache.org/jira/browse/CASSANDRA-9837 Project: Cassandra Issue Type: Bug Components: Core Reporter: Michael McKibben Assignee: Robert Stupp Priority: Trivial Attachments: trunk-9837.txt Notice the following non-interpolated log entry showing up in our logs Completed flushing %s. Looking at the source it appears to be a mix between logback style {} substitution vs String.format %s style. Attaching a trivial patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9670) Cannot run CQL scripts on Windows AND having error Ubuntu Linux
[ https://issues.apache.org/jira/browse/CASSANDRA-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631485#comment-14631485 ] Philip Thompson commented on CASSANDRA-9670: Sorry, [~bholya], I was away from the office for a few weeks. I'm looking at these scripts now, and while the problem is definitely in the importing of one of these characters, the sp_setup.cql file you attached does not define the schema for the cities table, which is the one with the failing imports from the attached .csv files. Could you please give me that schema so I can repro the issue? Cannot run CQL scripts on Windows AND having error Ubuntu Linux --- Key: CASSANDRA-9670 URL: https://issues.apache.org/jira/browse/CASSANDRA-9670 Project: Cassandra Issue Type: Bug Components: Core Environment: DataStax Community Edition on Windows 7, 64 Bit and Ubuntu Reporter: Sanjay Patel Assignee: Philip Thompson Labels: cqlsh Fix For: 2.1.x Attachments: cities.cql, germany_cities.cql, germany_cities.cql, india_cities.csv, india_states.csv, sp_setup.cql After installation of 2.1.6 and 2.1.7 it is not possible to execute CQL scripts that were earlier executed successfully on Windows and Linux environments. I have tried installing the latest Python 2 version and executing again, but I get the same error. Attaching cities.cql for reference. 
--- {code} cqlsh source 'shoppoint_setup.cql' ; shoppoint_setup.cql:16:InvalidRequest: code=2200 [Invalid query] message=Keyspace 'shopping' does not exist shoppoint_setup.cql:647:'ascii' codec can't decode byte 0xc3 in position 57: ordinal not in range(128) cities.cql:9:'ascii' codec can't decode byte 0xc3 in position 51: ordinal not in range(128) cities.cql:14: Error starting import process: cities.cql:14:Can't pickle type 'thread.lock': it's not found as thread.lock cities.cql:14:can only join a started process cities.cql:16: Error starting import process: cities.cql:16:Can't pickle type 'thread.lock': it's not found as thread.lock cities.cql:16:can only join a started process Traceback (most recent call last): File string, line 1, in module File I:\programm\python2710\lib\multiprocessing\forking.py, line 380, in main prepare(preparation_data) File I:\programm\python2710\lib\multiprocessing\forking.py, line 489, in prepare Traceback (most recent call last): File string, line 1, in module file, path_name, etc = imp.find_module(main_name, dirs) ImportError: No module named cqlsh File I:\programm\python2710\lib\multiprocessing\forking.py, line 380, in main prepare(preparation_data) File I:\programm\python2710\lib\multiprocessing\forking.py, line 489, in prepare file, path_name, etc = imp.find_module(main_name, dirs) ImportError: No module named cqlsh shoppoint_setup.cql:663:'ascii' codec can't decode byte 0xc3 in position 18: ordinal not in range(128) ipcache.cql:28:ServerError: ErrorMessage code= [Server error] message=java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: I:\var\lib\cassandra\data\syste m\schema_columns-296e9c049bec3085827dc17d3df2122a\system-schema_columns-ka-300-Data.db (The process cannot access the file because it is being used by another process) ccavn_bulkupdate.cql:75:ServerError: ErrorMessage code= [Server error] message=java.lang.RuntimeException: 
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.FileNotFoundException: I:\var\lib\cassandra\d ata\system\schema_columns-296e9c049bec3085827dc17d3df2122a\system-schema_columns-tmplink-ka-339-Data.db (The process cannot access the file because it is being used by another process) shoppoint_setup.cql:680:'ascii' codec can't decode byte 0xe2 in position 14: ordinal not in range(128){code} - In one of Ubuntu development environment we have similar errors. - {code} shoppoint_setup.cql:647:'ascii' codec can't decode byte 0xc3 in position 57: ordinal not in range(128) cities.cql:9:'ascii' codec can't decode byte 0xc3 in position 51: ordinal not in range(128) (corresponding line) COPY cities (city,country_code,state,isactive) FROM 'testdata/india_cities.csv' ; [19:53:18] j.basu: shoppoint_setup.cql:663:'ascii' codec can't decode byte 0xc3 in position 18: ordinal not in range(128) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
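The 'ascii' codec errors above are the classic default-encoding problem: cqlsh decodes the script bytes as ASCII, and the first multi-byte UTF-8 character aborts the import (0xc3 is the lead byte of UTF-8 characters such as 'ü' or 'ö'). A small Python sketch with a hypothetical cities CSV row:

```python
# A hypothetical row from a cities CSV containing a non-ASCII name.
line = b"M\xc3\xbcnchen,DE,Bavaria,true"

try:
    line.decode("ascii")
    raised = False
except UnicodeDecodeError as exc:
    raised = True
    assert line[exc.start] == 0xc3  # the byte called out in the errors above

assert raised
# Decoding with the file's real encoding works fine.
assert line.decode("utf-8") == "M\u00fcnchen,DE,Bavaria,true"
```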
[jira] [Commented] (CASSANDRA-9795) Fix cqlsh dtests on windows
[ https://issues.apache.org/jira/browse/CASSANDRA-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631532#comment-14631532 ] T Jake Luciani commented on CASSANDRA-9795: --- Yeah, but the fix for windows here was to set (delete=False), since it was keeping other processes from opening the file Fix cqlsh dtests on windows --- Key: CASSANDRA-9795 URL: https://issues.apache.org/jira/browse/CASSANDRA-9795 Project: Cassandra Issue Type: Sub-task Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 2.2.x There are a number of portability problems with python on win32 as I've learned over the past few days. * Our use of multiprocessing is broken in cqlsh for windows. https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming The code was passing self to the sub-process, which on windows must be pickleable (it's not). So I refactored it to be a class which is initialized in the parent. Also, when the windows process starts it needs to load our cqlsh as a module. So I moved cqlsh to cqlsh.py and added a tiny wrapper for bin/cqlsh * Our use of strftime is broken on windows The default timezone information %z in strftime isn't valid on windows. I added code to the date format parser in C* to support windows timezone labels. * We have a number of file access issues in dtest * csv import/export is broken on windows and requires all files be opened with mode 'wb' or 'rb' http://stackoverflow.com/questions/1170214/pythons-csv-writer-produces-wrong-line-terminator/1170297#1170297 * CCM's use of popen required the universal_newlines=True flag to work on windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
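The pickling constraint is the crux of the multiprocessing fix: on Windows, workers are spawned as fresh interpreters and their arguments are shipped over by pickling. A minimal Python sketch of the broken and refactored shapes (class names are hypothetical, not the actual cqlsh code):

```python
import pickle
import threading

# What broke: the object handed to the worker held unpicklable state
# (compare the "Can't pickle type 'thread.lock'" traceback above).
class BadImporter:
    def __init__(self):
        self.lock = threading.Lock()

# The fix pattern: keep only plain, picklable data on the object that
# crosses the process boundary; create locks/connections in the child.
class GoodImporter:
    def __init__(self, csv_path, columns):
        self.csv_path = csv_path
        self.columns = columns

def can_pickle(obj):
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False

assert not can_pickle(BadImporter())
assert can_pickle(GoodImporter("india_cities.csv", ["city", "state"]))
```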
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631496#comment-14631496 ] Jack Krupansky commented on CASSANDRA-6477: --- bq. multiple MVs being updated It would be good to get a handle on what the scalability of MVs per base table is in terms of recommended best practice. Hundreds? Thousands? A few dozen? Maybe just a handful, like 5 or 10 or a dozen? I hate it when a feature like this gets implemented without scalability in mind and then some poor/idiot user comes along and tries a use case which is way out of line with the implemented architecture but we provide no guidance as to what the practical limits really are (e.g., number of tables - thousands vs. hundreds.) It seems to me that the primary use case is for query tables, where an app might typically have a handful of queries and probably not more than a small number of dozens in even extreme cases. In any case, it would be great to be clear about the design limit for number of MVs per base table - and to make sure some testing gets done to assure that the number is practical. And by design limit I don't mean a hard limit where more will cause an explicit error, but where performance is considered acceptable. Are the MV updates occurring in parallel with each other, or are they serial? How many MVs could a base table have before the MV updates effectively become serialized? Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. 
However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631500#comment-14631500 ] Tupshin Harper commented on CASSANDRA-6477: --- Just a reminder (since it was a loong time ago in this ticket), that we were going to target immediate consistency once we could leverage RAMP, and not before. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631506#comment-14631506 ] Carl Yeksigian commented on CASSANDRA-6477: --- [~tupshin] Because we are no longer implementing this as a non-denormalized global index, we don't have multiple partitions to read, so RAMP unfortunately won't solve problems in a materialized view. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9683) Get much higher load and latencies after upgrading from 2.1.6 to cassandra 2.1.7
[ https://issues.apache.org/jira/browse/CASSANDRA-9683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631513#comment-14631513 ] Ariel Weisberg commented on CASSANDRA-9683: --- OK, so that might explain it. The metric you are looking at for writes might not include replication. These are screenshots from OpsCenter? I will look into how those metrics are collected. Disabling durable writes doesn't impact replication much, since replication still has to occur; all it does is disable writing to the commit log at each node. I will try again with multiple nodes and keep an eye on the OpsCenter metrics. Get much higher load and latencies after upgrading from 2.1.6 to cassandra 2.1.7 -- Key: CASSANDRA-9683 URL: https://issues.apache.org/jira/browse/CASSANDRA-9683 Project: Cassandra Issue Type: Bug Environment: Ubuntu 12.04 (3.13 Kernel) * 3 JDK: Oracle JDK 7 RAM: 32GB Cores 4 (+4 HT) Reporter: Loic Lambiel Assignee: Ariel Weisberg Fix For: 2.1.x Attachments: cassandra-env.sh, cassandra.yaml, cfstats.txt, os_load.png, pending_compactions.png, read_latency.png, schema.txt, system.log, write_latency.png After upgrading our cassandra staging cluster version from 2.1.6 to 2.1.7, the average load grows from 0.1-0.3 to 1.8. Latencies did increase as well. We see an increase of pending compactions, probably due to CASSANDRA-9592. This cluster has almost no workload (staging environment) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631507#comment-14631507 ] T Jake Luciani commented on CASSANDRA-6477: --- bq. Frankly, I'm pretty negative on adding such option. But then why do we even offer the batchlog at all? Hand-rolled materialized views use them. And if you feel we should guarantee a consistency level, then you would never use a batchlog, since any timeout would mean you didn't achieve your consistency level and must retry. If you are talking about just the UE, then we could check the MV replica UP/Down status in the coordinator as well as the base. Materialized Views (was: Global Indexes) Key: CASSANDRA-6477 URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 Project: Cassandra Issue Type: New Feature Components: API, Core Reporter: Jonathan Ellis Assignee: Carl Yeksigian Labels: cql Fix For: 3.0 beta 1 Attachments: test-view-data.sh, users.yaml Local indexes are suitable for low-cardinality data, where spreading the index across the cluster is a Good Thing. However, for high-cardinality data, local indexes require querying most nodes in the cluster even if only a handful of rows is returned. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9795) Fix cqlsh dtests on windows
[ https://issues.apache.org/jira/browse/CASSANDRA-9795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631527#comment-14631527 ] Philip Thompson commented on CASSANDRA-9795: [~JoshuaMcKenzie], [~tjake], NamedTemporaryFiles are automatically deleted at the end of tests already, so there is no need to explicitly remove them. See https://docs.python.org/2/library/tempfile.html#tempfile.NamedTemporaryFile Fix cqlsh dtests on windows --- Key: CASSANDRA-9795 URL: https://issues.apache.org/jira/browse/CASSANDRA-9795 Project: Cassandra Issue Type: Sub-task Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 2.2.x There are a number of portability problems with python on win32 as I've learned over the past few days. * Our use of multiprocessing is broken in cqlsh for windows. https://docs.python.org/2/library/multiprocessing.html#multiprocessing-programming The code was passing self to the sub-process, which on windows must be pickleable (it's not). So I refactored it to be a class which is initialized in the parent. Also, when the windows process starts it needs to load our cqlsh as a module. So I moved cqlsh to cqlsh.py and added a tiny wrapper for bin/cqlsh * Our use of strftime is broken on windows The default timezone information %z in strftime isn't valid on windows. I added code to the date format parser in C* to support windows timezone labels. * We have a number of file access issues in dtest * csv import/export is broken on windows and requires all files be opened with mode 'wb' or 'rb' http://stackoverflow.com/questions/1170214/pythons-csv-writer-produces-wrong-line-terminator/1170297#1170297 * CCM's use of popen required the universal_newlines=True flag to work on windows -- This message was sent by Atlassian JIRA (v6.3.4#6332)
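A short Python sketch of the two delete modes under discussion (run on a POSIX system here; the Windows file-sharing restriction is why delete=False was needed there):

```python
import os
import tempfile

# delete=True (the default): the file goes away when closed -- which is
# why explicit removal in the dtests is redundant.
f = tempfile.NamedTemporaryFile(mode="w", delete=True)
name = f.name
f.close()
assert not os.path.exists(name)

# delete=False: the file survives close(), so another process (e.g. a
# cqlsh COPY on Windows, where an already-open file can't be reopened)
# can use it -- but now someone must unlink it explicitly.
f = tempfile.NamedTemporaryFile(mode="w", delete=False)
name = f.name
f.close()
assert os.path.exists(name)
os.unlink(name)
```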
[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631083#comment-14631083 ] Sylvain Lebresne commented on CASSANDRA-9519: - I'll admit I just went with the 'fix version' of that issue. If CASSANDRA-8448 wasn't committed to 2.0 as the fix version implies, we would need to fix that too since it has part of the fix of the actual problem, but I don't know what the rationale was for not committing it to 2.0 in the first place (if it was deemed not worth the risk, no reason for that to have changed). CASSANDRA-8448 Doesn't seem to be fixed --- Key: CASSANDRA-9519 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Sylvain Lebresne Fix For: 2.1.9, 2.2.0 Attachments: 9519.txt Still seeing the "Comparison method violates its general contract!" error in 2.1.5 {code} java.lang.IllegalArgumentException: Comparison method violates its general contract! 
at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45] at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45] at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45] at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45] at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45] at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45] at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45] at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] at 
org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469] {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
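One classic way a double-valued comparator violates the total-order contract that TimSort checks for is NaN handling (whether that is the exact cause here isn't established above; CASSANDRA-8448 involved snitch score comparisons). With raw less-than/greater-than tests, NaN compares "equal" to everything, breaking transitivity. A Python sketch:

```python
def score_cmp(a, b):
    # Mimics comparing doubles with raw < / > tests; with NaN in play,
    # both tests are False, so NaN compares "equal" to everything.
    if a < b:
        return -1
    if a > b:
        return 1
    return 0

nan = float("nan")

# The contract requires consistency: if cmp(x, y) == 0 and
# cmp(y, z) == 0, then cmp(x, z) must be 0. NaN breaks it:
assert score_cmp(1.0, nan) == 0
assert score_cmp(nan, 2.0) == 0
assert score_cmp(1.0, 2.0) == -1  # contradiction; Java's TimSort throws
```

CPython's sort silently tolerates such a comparator (it may just mis-order elements), whereas Java's TimSort detects the inconsistency and throws the IllegalArgumentException shown above.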
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631112#comment-14631112 ] Sylvain Lebresne commented on CASSANDRA-9472: - I'll note that unless I'm forgetting something, re-introducing this is not terribly complex (especially post-CASSANDRA-9705). CASSANDRA-8099 hasn't touched the off-heap memtable machinery much, so I think all we need is to implement the {{NativeAllocator.rowBuilder}} method (referencing CASSANDRA-9705 patch here). Which in turn mainly means writing a {{NativeRow}} implementation that is the counterpart of our previous {{NativeCell}} implementations (and we can likely salvage some of the code of said {{NativeCell}}). This thus requires coming up with a reasonable serialization format for offheap rows, but that's hardly rocket science. Note that I'm just talking here of doing what this ticket is actually about, which is to re-introduce off-heap memtables in a way that is as close as possible to what we had pre-CASSANDRA-8099. And I don't think we should wait on Java 9 or anything to do that. Of course we will improve on all this in the future, but let's please leave that to some future ticket. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631119#comment-14631119 ] Stefania commented on CASSANDRA-7066: - Then I need to add a new offline method to pass the metadata to the transaction logs and that's all? Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
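The proposed startup cleanup can be sketched in a few lines (Python used for illustration rather than the actual Java, and the file names are hypothetical): every sstable records its direct ancestors, and anything appearing in the union of all ancestor sets is a superseded leftover.

```python
def leftovers(sstables):
    """sstables: dict mapping sstable name -> set of direct ancestor names.
    Returns the on-disk sstables that some surviving sstable supersedes."""
    superseded = set().union(*sstables.values()) if sstables else set()
    return superseded & set(sstables)  # only delete files actually present

live = {
    "ka-1-Data.db": set(),
    "ka-2-Data.db": set(),
    "ka-3-Data.db": {"ka-1-Data.db", "ka-2-Data.db"},  # compacted from 1+2
}
# 1 and 2 were compacted into 3 but never deleted -> clean them up on startup.
assert leftovers(live) == {"ka-1-Data.db", "ka-2-Data.db"}
```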
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631147#comment-14631147 ] Sylvain Lebresne commented on CASSANDRA-9472: - bq. however if we're shooting for a beta release ASAP, it will not likely be done in time. Totally agreed. I did not mean to imply that we should attempt it for 3.0. Since, as you said, it hasn't been taken out of experimental status, we've agreed that re-introducing it in 3.1/3.2 is good enough and I think we should stick to that. Though I also agree that since it is still experimental _and_ the fix here is totally isolated, if beta1 goes surprisingly well and someone finds time to get this ready for the RC, then I wouldn't personally oppose a late inclusion. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631203#comment-14631203 ] Benedict commented on CASSANDRA-8894: - I may aim to integrate this work prior to our running full performance tests, as I would like to see this safely hit pre-3.0, and we know it is effective already for in-memory workloads. The question now is more the tuning parameters, and how we might yet tweak them, and that's something that can be done much closer to release if we need to. Some quick feedback: * crossingProbability is always zero, I think? Need to use {{ / 4096d}} * disk_optimization_record_size_percentile: disk_optimization_estimate_percentile? * disk_optimization_crossing_chance: disk_optimization_page_cross_chance? No super strong feelings about the names, though. Just suggestions; not 100% certain they're even better from my POV, nor that it's important. Otherwise this all LGTM, and I'm keen to commit. When performance testing, we should figure out how (via cstar) we can tweak read ahead settings on the machine. [~enigmacurry]: is there any way we could have that as a GUI option? This new code should make read ahead a bad idea for SSD clusters, and disabling it will likely see standard mode become a superior option to mmap, since we can predict exactly how much we should read better than the OS can. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to buffered reads being slower than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. 
The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we are almost never hitting the TLB, and will be incurring at least 30 unnecessary misses each time (as well as the other costs of larger than necessary accesses). When working with an SSD there is little to no benefit reading more than 4Kb at once, and in either case reading more data than we need is wasteful. So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we can expect to read each record in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
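The proposed sizing rule (the next power of two at or above the average record size, floored at 4Kb) can be sketched as follows (a Python illustration with a hypothetical helper, not the actual implementation):

```python
def buffer_size(avg_record_size, minimum=4096):
    """Next power of two >= avg_record_size, floored at 4 KiB, so a
    typical record is fetched in a single read of one or more whole
    virtual pages."""
    size = minimum
    while size < avg_record_size:
        size *= 2
    return size

assert buffer_size(140) == 4096        # tiny records: one 4 KiB page
assert buffer_size(4096) == 4096
assert buffer_size(5000) == 8192
assert buffer_size(70_000) == 131072   # large records still round up
```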
[jira] [Commented] (CASSANDRA-9835) Restore collectTimeOrderedData behaviour post-8099
[ https://issues.apache.org/jira/browse/CASSANDRA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631123#comment-14631123 ] Sylvain Lebresne commented on CASSANDRA-9835: - What makes you think that? We do still do this, it's now in {{SinglePartitionNamesCommand}} (in the {{queryMemtableAndDiskInternal}} method to be precise). In fact, if anything, CASSANDRA-8099 makes this a lot more useful since pre-CASSANDRA-8099, {{collectTimeOrderedData}} is _never_ used for non-compact tables (we just never create names queries for non-compact tables for reasons explained at length in CASSANDRA-7085 and related issues). Restore collectTimeOrderedData behaviour post-8099 -- Key: CASSANDRA-9835 URL: https://issues.apache.org/jira/browse/CASSANDRA-9835 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x AFAICT, we no longer prune the sstables we iterate once we know we've satisfied a query. This is not only still possible, but possible in more scenarios (since we can do it for any single CQL-row lookup). Affected workloads may have noticeably degraded behaviour, and this will impact CASSANDRA-6477. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631129#comment-14631129 ] Stefania commented on CASSANDRA-7066: - Sounds good. :) Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way as soon as we finish writing we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (CASSANDRA-9835) Restore collectTimeOrderedData behaviour post-8099
[ https://issues.apache.org/jira/browse/CASSANDRA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-9835. - Resolution: Invalid Restore collectTimeOrderedData behaviour post-8099 -- Key: CASSANDRA-9835 URL: https://issues.apache.org/jira/browse/CASSANDRA-9835 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x AFAICT, we no longer prune the sstables we iterate once we know we've satisfied a query. This is not only still possible, but possible in more scenarios (since we can do it for any single CQL-row lookup). Affected workloads may have noticeably degraded behaviour, and this will impact CASSANDRA-6477. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9835) Restore collectTimeOrderedData behaviour post-8099
[ https://issues.apache.org/jira/browse/CASSANDRA-9835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631143#comment-14631143 ] Benedict commented on CASSANDRA-9835: - Hmm. I guess a between-keyboard-and-chair error. I only found SinglePartitionReadCommand when finding implementations of queryMemtableAndDisk. Looking at it, it could perhaps do with a little TLC still. Restore collectTimeOrderedData behaviour post-8099 -- Key: CASSANDRA-9835 URL: https://issues.apache.org/jira/browse/CASSANDRA-9835 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.0.x AFAICT, we no longer prune the sstables we iterate once we know we've satisfied a query. This is not only still possible, but possible in more scenarios (since we can do it for any single CQL-row lookup). Affected workloads may have noticeably degraded behaviour, and this will impact CASSANDRA-6477. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-6237) Allow range deletions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631193#comment-14631193 ] Benjamin Lerer commented on CASSANDRA-6237: --- {quote}It's not clear to me why some user function tests changed, e.g. here and here. Are these in scope for this ticket?{quote} Statements are processed in 2 phases: they are prepared and then executed. We try to perform as much of the validation as possible in the preparation phase. That was not always the case for Insert/Update/Delete statements; I fixed that in my patch. Some of the unit tests were using invalid statements, but as they were only testing the preparation phase, no errors were thrown. After my patch that was no longer the case, so I had to make sure that the statements were valid. Allow range deletions in CQL Key: CASSANDRA-6237 URL: https://issues.apache.org/jira/browse/CASSANDRA-6237 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benjamin Lerer Priority: Minor Labels: cql, docs Fix For: 3.0.0 rc1 Attachments: CASSANDRA-6237.txt We use RangeTombstones internally in a number of places, but we could expose them more directly too. Typically, given a table like: {noformat} CREATE TABLE events ( id text, created_at timestamp, content text, PRIMARY KEY (id, created_at) ) {noformat} we could allow queries like: {noformat} DELETE FROM events WHERE id='someEvent' AND created_at < 'Jan 3, 2013'; {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-8894) Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size
[ https://issues.apache.org/jira/browse/CASSANDRA-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631108#comment-14631108 ] Benedict commented on CASSANDRA-8894: - A few comments on the stress testing: * The blob_id population doesn't need to be constrained (it defaults to something like 1..100B) * To perform the inserts, we want to ensure we construct a dataset large enough to spill to disk, i.e. we probably want to insert at least 100M items (perhaps 200M+) if they're only ~50 bytes each. * We probably want to run with slightly more threads, say 300 The graphs that were produced don't actually appear to be broken: the stress run was simply extremely brief, since it only operated over 100K items :) At risk of sounding like a broken record to everyone, it can help to use K, M, B syntax for your numbers in the profile/command line. Our default buffer size for (uncompressed) buffered reads should be smaller, and based on the expected record size -- Key: CASSANDRA-8894 URL: https://issues.apache.org/jira/browse/CASSANDRA-8894 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Labels: benedict-to-commit Fix For: 3.x Attachments: 8894_25pct.yaml, 8894_5pct.yaml, 8894_tiny.yaml A large contributor to buffered reads being slower than mmapped is likely that we read a full 64Kb at once, when average record sizes may be as low as 140 bytes on our stress tests. The TLB has only 128 entries on a modern core, and each read will touch 32 of these, meaning we are almost never hitting the TLB, and will be incurring at least 30 unnecessary misses each time (as well as the other costs of larger than necessary accesses). When working with an SSD there is little to no benefit reading more than 4Kb at once, and in either case reading more data than we need is wasteful. 
So, I propose selecting a buffer size that is the next larger power of 2 than our average record size (with a minimum of 4Kb), so that we expect to complete each read in one operation. I also propose that we create a pool of these buffers up-front, and that we ensure they are all exactly aligned to a virtual page, so that the source and target operations each touch exactly one virtual page per 4Kb of expected record size. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
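The proposed sizing rule above can be sketched as follows. This is a hypothetical helper, not Cassandra's actual code; true page alignment also needs the buffer's native address (e.g. via sun.misc.Unsafe), which this sketch omits.

```java
import java.nio.ByteBuffer;

// Illustrative sketch of the proposal: pick the next power of two >= the
// average record size, floored at one 4Kb virtual page, so a typical read
// completes in a single operation.
public class ReadBufferSizing
{
    static final int PAGE_SIZE = 4096;

    // Next power of two >= avgRecordSize, but never below one page.
    public static int bufferSize(int avgRecordSize)
    {
        if (avgRecordSize <= PAGE_SIZE)
            return PAGE_SIZE;
        return Integer.highestOneBit(avgRecordSize - 1) << 1;
    }

    // The proposal also pools these buffers up-front and page-aligns them;
    // this sketch only allocates, omitting the alignment machinery.
    public static ByteBuffer allocate(int avgRecordSize)
    {
        return ByteBuffer.allocateDirect(bufferSize(avgRecordSize));
    }
}
```

For a 140-byte average record this yields the 4Kb minimum, while a 5Kb record rounds up to 8Kb rather than the old fixed 64Kb.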
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631117#comment-14631117 ] Aleksey Yeschenko commented on CASSANDRA-9472: -- [~slebresne] I was referring to the 'we should make memtables completely off-heap' comment. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9472) Reintroduce off heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-9472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631121#comment-14631121 ] Benedict commented on CASSANDRA-9472: - bq. I don't think we should wait on Java 9 or anything to do that. Of course we will improve on all this in the future, but let's please leave that to some future ticket. Yes, I think we're all on the same page there. bq. re-introducing this is not terribly complex Also agreed; however, if we're shooting for a beta release ASAP, it will likely not be done in time. We could perhaps sneak it in before RC or at .0 if we're willing to do that, of course. Perhaps since it was never taken out of experimental status that would be acceptable. But there is still a lot of other follow-on work from 8099 to get through. Reintroduce off heap memtables -- Key: CASSANDRA-9472 URL: https://issues.apache.org/jira/browse/CASSANDRA-9472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Fix For: 3.x CASSANDRA-8099 removes off heap memtables. We should reintroduce them ASAP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631124#comment-14631124 ] Benedict commented on CASSANDRA-7066: - Should be. In fact we'll need a metadata-accepting method for the approach I've taken in CASSANDRA-9669, which is to use a {{LifecycleTransaction}} for memtable flush; this means it needs to be constructed empty (but is an online operation). So there is synergy :) Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so they can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way, as soon as we finish writing, we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
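The startup cleanup described in the ticket — delete any sstable whose generation appears in the union of all ancestor sets — can be sketched as below. The class and field names are illustrative, not Cassandra's real API.

```java
import java.util.*;

// Hypothetical sketch: every sstable records the generations of its direct
// ancestors; any on-disk sstable whose generation appears in the union of
// all ancestor sets was superseded by a completed compaction and is a
// leftover to delete on startup.
public class LeftoverCleanup
{
    public static final class SSTable
    {
        public final int generation;
        public final Set<Integer> ancestors;

        public SSTable(int generation, Set<Integer> ancestors)
        {
            this.generation = generation;
            this.ancestors = ancestors;
        }
    }

    // Returns the sstables that should be deleted on startup.
    public static List<SSTable> findLeftovers(Collection<SSTable> onDisk)
    {
        // Union of all ancestor sets across live sstables.
        Set<Integer> superseded = new HashSet<>();
        for (SSTable t : onDisk)
            superseded.addAll(t.ancestors);

        List<SSTable> toDelete = new ArrayList<>();
        for (SSTable t : onDisk)
            if (superseded.contains(t.generation))
                toDelete.add(t);
        return toDelete;
    }
}
```

For example, if generation 3 lists ancestors {1, 2} and both 1 and 2 are still on disk, both get cleaned up, regardless of which compaction type produced generation 3.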
[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631184#comment-14631184 ] Jeremiah Jordan commented on CASSANDRA-9519: 8448 went into 2.0.13 per the fix version? CASSANDRA-8448 Doesn't seem to be fixed --- Key: CASSANDRA-9519 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Sylvain Lebresne Fix For: 2.1.9, 2.2.0 Attachments: 9519.txt Still seeing the "Comparison method violates its general contract!" error in 2.1.5
{code}
java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
	at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
	at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
	at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
	at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
	at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
	at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
	at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
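TimSort's "Comparison method violates its general contract!" failure is characteristic of sorting with a comparator whose ordering can change mid-sort — for instance, a snitch comparator that reads a live score map while another thread updates it. The sketch below illustrates the unsafe pattern and the snapshot-based fix; the names are mine, not the actual DynamicEndpointSnitch code.

```java
import java.util.*;

// Illustrative only: sorting endpoints by a score map.
public class SnitchSortSketch
{
    // Unsafe: consults the live, mutable score map on every comparison.
    // If another thread updates a score mid-sort, TimSort may observe an
    // inconsistent ordering and throw IllegalArgumentException.
    public static void sortUnsafe(List<String> endpoints, Map<String, Double> liveScores)
    {
        endpoints.sort(Comparator.comparingDouble((String e) -> liveScores.get(e)));
    }

    // Safe: snapshot the scores once before sorting, so the ordering the
    // comparator exposes cannot change for the duration of the sort.
    public static void sortSafe(List<String> endpoints, Map<String, Double> liveScores)
    {
        final Map<String, Double> snapshot = new HashMap<>(liveScores);
        endpoints.sort(Comparator.comparingDouble((String e) -> snapshot.get(e)));
    }
}
```

The snapshot costs one map copy per sort but guarantees the comparator satisfies TimSort's consistency requirement.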
[jira] [Commented] (CASSANDRA-9797) Don't wrap byte arrays in SequentialWriter
[ https://issues.apache.org/jira/browse/CASSANDRA-9797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631095#comment-14631095 ] Benedict commented on CASSANDRA-9797: - Nah. If CI is happy, patch LGTM Don't wrap byte arrays in SequentialWriter -- Key: CASSANDRA-9797 URL: https://issues.apache.org/jira/browse/CASSANDRA-9797 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Labels: performance Fix For: 3.x, 2.2.x Attachments: 9797.txt While profiling a simple stress write run ({{cassandra-stress write n=200 -rate threads=50}} to be precise) with Mission Control, I noticed that a non-trivial amount of heap pressure was due to the {{ByteBuffer.wrap()}} call in {{SequentialWriter.write(byte[])}}. Basically, when writing a byte array, we wrap it in a ByteBuffer to reuse the {{SequentialWriter.write(ByteBuffer)}} method. One could have hoped this wrapping would be stack allocated, but if Mission Control isn't lying (and I was told it's fairly honest on that front), it's not. And we do use that {{write(byte[])}} method quite a bit, especially with the new vint encodings since they use a {{byte[]}} thread-local buffer and call that method. Anyway, it sounds very simple to me to have a more direct {{write(byte[])}} method, so attaching a patch to do that. A very quick local benchmark seems to show a little bit less allocation and a slight edge for the branch with this patch (on top of CASSANDRA-9705 I must add), but that local bench was far from scientific, so I'd be happy if someone who knows how to use our perf service wants to give that patch a shot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
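The idea of a direct {{write(byte[])}} that avoids a per-call {{ByteBuffer.wrap()}} can be sketched as follows. This is a simplified stand-in, not the actual SequentialWriter code; the field and method names are illustrative.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: instead of wrapping the caller's byte[] in a new
// ByteBuffer (one allocation per call), copy it straight into the writer's
// long-lived internal buffer, flushing whenever the buffer fills.
public class DirectByteArrayWriter
{
    private final ByteBuffer buffer;
    private long position = 0;

    public DirectByteArrayWriter(int bufferCapacity)
    {
        this.buffer = ByteBuffer.allocate(bufferCapacity);
    }

    // Direct path: no ByteBuffer.wrap(b), so no per-call garbage.
    public void write(byte[] b, int off, int len)
    {
        while (len > 0)
        {
            int n = Math.min(len, buffer.remaining());
            buffer.put(b, off, n);
            off += n;
            len -= n;
            position += n;
            if (!buffer.hasRemaining())
                flushInternal();
        }
    }

    public long position()
    {
        return position;
    }

    protected void flushInternal()
    {
        buffer.clear(); // a real writer would write the buffer to disk first
    }
}
```

The bulk {{ByteBuffer.put(byte[], int, int)}} copy used here is the same primitive the wrapping path ultimately performs, just without the intermediate wrapper object.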
[jira] [Commented] (CASSANDRA-7066) Simplify (and unify) cleanup of compaction leftovers
[ https://issues.apache.org/jira/browse/CASSANDRA-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631109#comment-14631109 ] Benedict commented on CASSANDRA-7066: - LifecycleTransaction already supports this, by constructing it via the {{offline}} method call. Simplify (and unify) cleanup of compaction leftovers Key: CASSANDRA-7066 URL: https://issues.apache.org/jira/browse/CASSANDRA-7066 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Benedict Assignee: Stefania Priority: Minor Labels: benedict-to-commit, compaction Fix For: 3.x Attachments: 7066.txt Currently we manage a list of in-progress compactions in a system table, which we use to clean up incomplete compactions when we're done. The problem with this is that 1) it's a bit clunky (and leaves us in positions where we can unnecessarily clean up completed files, or conversely not clean up files that have been superseded); and 2) it's only used for a regular compaction - no other compaction types are guarded in the same way, so they can result in duplication if we fail before deleting the replacements. I'd like to see each sstable store in its metadata its direct ancestors, and on startup we simply delete any sstables that occur in the union of all ancestor sets. This way, as soon as we finish writing, we're capable of cleaning up any leftovers, so we never get duplication. It's also much easier to reason about. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (CASSANDRA-9669) Commit Log Replay is Broken
[ https://issues.apache.org/jira/browse/CASSANDRA-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631186#comment-14631186 ] Benedict commented on CASSANDRA-9669: - So, I am liking this approach less and less. It may be the least effort, but it has too many sharp edges, in critical portions of the system. It's also literally a custom endeavour for 2.0, 2.1, 2.2 _and_ 3.0. I think I will introduce a new commit log expiration ledger, and just write to it whenever we perform a {{discardCompletedSegments()}} call. This is then replayed prior to CL replay, to build the state of what records we consider replayable. Initially, I will limit this to a simple statement of the latest replay position we can be certain to have replayed to, since this is uniform behaviour for 2.0+. 2.1+ easily supports ranges, which can be implemented when we deliver CASSANDRA-8496. Commit Log Replay is Broken --- Key: CASSANDRA-9669 URL: https://issues.apache.org/jira/browse/CASSANDRA-9669 Project: Cassandra Issue Type: Bug Components: Core Reporter: Benedict Assignee: Benedict Priority: Critical Labels: correctness Fix For: 3.x, 2.1.x, 2.2.x, 3.0.x While {{postFlushExecutor}} ensures it never expires CL entries out-of-order, on restart we simply take the maximum replay position of any sstable on disk, and ignore anything prior. It is quite possible for there to be two flushes triggered for a given table, and for the second to finish first by virtue of containing a much smaller quantity of live data (or perhaps the disk is just under less pressure). If we crash before the first sstable has been written, then on restart the data it would have represented will disappear, since we will not replay the CL records. This looks to be a bug present since time immemorial, and also seems pretty serious. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
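A minimal sketch of how such an expiration ledger might look, under the "simple statement of the latest replay position" variant described above. All types and names here are hypothetical, not the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every discardCompletedSegments() call appends the
// replay position it is certain has been durably flushed; on restart the
// ledger is read first, and only commit log records strictly after the
// last recorded position are re-applied.
public class ExpirationLedger
{
    // (segmentId, positionInSegment) pairs, appended in discard order.
    private final List<long[]> entries = new ArrayList<>();

    public void recordDiscard(long segmentId, long position)
    {
        entries.add(new long[]{ segmentId, position });
    }

    // The latest position we are certain was flushed: the last entry,
    // since discards are recorded in order.
    public long[] replayableFrom()
    {
        return entries.isEmpty() ? new long[]{ -1, -1 }
                                 : entries.get(entries.size() - 1);
    }

    // A record must be replayed only if it lies after the ledger position.
    public boolean shouldReplay(long segmentId, long position)
    {
        long[] from = replayableFrom();
        return segmentId > from[0] || (segmentId == from[0] && position > from[1]);
    }
}
```

The point of the ledger is that it is ordered by the flush machinery itself, so unlike the per-sstable maximum it cannot be confused by a later-started flush completing first.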
[jira] [Commented] (CASSANDRA-9519) CASSANDRA-8448 Doesn't seem to be fixed
[ https://issues.apache.org/jira/browse/CASSANDRA-9519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631190#comment-14631190 ] Sylvain Lebresne commented on CASSANDRA-9519: - Hum, indeed. Who does put 2.0.13 _after_ 2.1.3?! That's outrageous, I can't work in these conditions! I'll commit this to 2.0. CASSANDRA-8448 Doesn't seem to be fixed --- Key: CASSANDRA-9519 URL: https://issues.apache.org/jira/browse/CASSANDRA-9519 Project: Cassandra Issue Type: Bug Components: Core Reporter: Jeremiah Jordan Assignee: Sylvain Lebresne Fix For: 2.1.9, 2.2.0 Attachments: 9519.txt Still seeing the Comparison method violates its general contract! in 2.1.5
{code}
java.lang.IllegalArgumentException: Comparison method violates its general contract!
	at java.util.TimSort.mergeHi(TimSort.java:895) ~[na:1.8.0_45]
	at java.util.TimSort.mergeAt(TimSort.java:512) ~[na:1.8.0_45]
	at java.util.TimSort.mergeCollapse(TimSort.java:437) ~[na:1.8.0_45]
	at java.util.TimSort.sort(TimSort.java:241) ~[na:1.8.0_45]
	at java.util.Arrays.sort(Arrays.java:1512) ~[na:1.8.0_45]
	at java.util.ArrayList.sort(ArrayList.java:1454) ~[na:1.8.0_45]
	at java.util.Collections.sort(Collections.java:175) ~[na:1.8.0_45]
	at org.apache.cassandra.locator.AbstractEndpointSnitch.sortByProximity(AbstractEndpointSnitch.java:49) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithScore(DynamicEndpointSnitch.java:158) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximityWithBadness(DynamicEndpointSnitch.java:187) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.locator.DynamicEndpointSnitch.sortByProximity(DynamicEndpointSnitch.java:152) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getLiveSortedEndpoints(StorageProxy.java:1530) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1688) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:256) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:209) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:63) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:238) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:260) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
	at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:272) ~[cassandra-all-2.1.5.469.jar:2.1.5.469]
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (CASSANDRA-6237) Allow range deletions in CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-6237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14631193#comment-14631193 ] Benjamin Lerer edited comment on CASSANDRA-6237 at 7/17/15 11:07 AM: - {quote}It's not clear to me why some user function tests changed, e.g. here and here. Are these in scope for this ticket?{quote} Statements are processed in 2 phases. They are prepared and then executed. We try to perform the validation, as much as possible, in the preparation phase. This was not always the case for Insert/Update/Delete statements. I fixed that in my patch. Some of the unit tests were using invalid statements, but as they were only testing the preparation phase, no errors were thrown. After my patch it was no longer the case. So, I had to make sure that the statements were valid. was (Author: blerer): {quote}It's not clear to me why some user function tests changed, e.g. here and here. Are these in scope for this ticket?{quote} Statements are processed in 2 phases. They are prepared and then executed. We try to perform the validation as much as possible in the preparation phase. It was not always the case for Insert/Update/Delete statements. I fixed that in my patch. Some of the unit tests were using invalid statements but as they were only testing the preparation phase no errors were thrown. After my patch it was no longer the case. So I had to make sure that the statements were valid. Allow range deletions in CQL Key: CASSANDRA-6237 URL: https://issues.apache.org/jira/browse/CASSANDRA-6237 Project: Cassandra Issue Type: Improvement Reporter: Sylvain Lebresne Assignee: Benjamin Lerer Priority: Minor Labels: cql, docs Fix For: 3.0.0 rc1 Attachments: CASSANDRA-6237.txt We use RangeTombstones internally in a number of places, but we could expose them more directly too.
Typically, given a table like:
{noformat}
CREATE TABLE events (
    id text,
    created_at timestamp,
    content text,
    PRIMARY KEY (id, created_at)
)
{noformat}
we could allow queries like:
{noformat}
DELETE FROM events WHERE id='someEvent' AND created_at < 'Jan 3, 2013';
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)