[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289485#comment-14289485
 ] 

Abhishek Gupta edited comment on CASSANDRA-8638 at 1/23/15 4:35 PM:


Attached patch for the fix.


was (Author: abhish_gl):
Patch for the fix

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.
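For context, a minimal Python sketch of the failure mode and one way to sidestep it (the file name echoes the report; the use of the 'utf-8-sig' codec is an illustration, not cqlsh's actual code):

{code:python}
import codecs

# Write a script the way a Windows editor might: UTF-8 with a leading BOM.
with open("setup.cql", "wb") as f:
    f.write(codecs.BOM_UTF8 + b"CREATE KEYSPACE test WITH replication = "
            b"{'class': 'SimpleStrategy', 'replication_factor': '3'};\n")

# Naive read: the BOM (\xef\xbb\xbf) stays glued to the first statement, so a
# CQL parser sees garbage before CREATE and reports "Invalid syntax at char 1".
with open("setup.cql", "rb") as f:
    print(repr(f.readline()[:40]))

# Reading through the 'utf-8-sig' codec strips a leading BOM transparently.
with codecs.open("setup.cql", encoding="utf-8-sig") as f:
    print(repr(f.readline()[:40]))
{code}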



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Gupta updated CASSANDRA-8638:
--
Attachment: 0001-bug-CASSANDRA-8638.patch

Patch for the fix

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289482#comment-14289482
 ] 

Abhishek Gupta edited comment on CASSANDRA-8638 at 1/23/15 4:36 PM:


[~s_delima] I have created a patch with a fix that checks for BOM characters and 
replaces them if they are present. Please review it and apply it if it looks good.

A further enhancement could be to first check whether the file actually contains 
BOM characters and only replace them in that case.


was (Author: abhish_gl):
I have created a patch with a fix that checks for BOM characters and replaces 
them if they are present.

A further enhancement could be to first check whether the file actually contains 
BOM characters and only replace them in that case.

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8638:
---
Reviewer: Philip Thompson  (was: Sotirios Delimanolis)

I don't think we need to respect the usage of a BOM as a possible word joiner, 
since that usage was deprecated in Unicode 3.2 (the current version of Unicode is 7.0).

I do think this patch should include the BOM for UTF-16 LE in addition to 
UTF-16 BE.

Running multiple string.replace() calls feels inefficient. [~thobbs], what do 
you think?
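For illustration, a minimal sketch of stripping only a leading BOM with a single check, covering UTF-16 LE as well as BE (and UTF-32), instead of running several replace() passes. This is a sketch of the idea under discussion, not the attached patch:

{code:python}
import codecs

# Longest markers first so a UTF-32 BOM isn't mistaken for a UTF-16 one.
BOMS = (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE,
        codecs.BOM_UTF8, codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)

def strip_leading_bom(data):
    """Return `data` without a leading byte order mark, if one is present."""
    for bom in BOMS:
        if data.startswith(bom):
            return data[len(bom):]
    return data
{code}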

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Gupta updated CASSANDRA-8638:
--
Comment: was deleted

(was: Attached patch for the fix.)

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289482#comment-14289482
 ] 

Abhishek Gupta edited comment on CASSANDRA-8638 at 1/23/15 4:38 PM:


[~s_delima] I have created a patch with a fix that checks for BOM characters and, 
if they are present, replaces them with an empty string. Please review it and 
apply it if it looks good.

Attached patch file: 0001-bug-CASSANDRA-8638.patch

A further enhancement could be to first check whether the file actually contains 
BOM characters and only replace them in that case.


was (Author: abhish_gl):
[~s_delima] I have created a patch with a fix that checks for BOM characters and, 
if they are present, replaces them with an empty string. Please review it and 
apply it if it looks good.

A further enhancement could be to first check whether the file actually contains 
BOM characters and only replace them in that case.

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8670) Large columns + NIO memory pooling causes excessive direct memory usage

2015-01-23 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8670:
--
Description: 
If you provide a large byte array to NIO and ask it to populate the byte array 
from a socket, it will allocate a thread-local byte buffer that is the size of 
the requested read, no matter how large it is. Old IO wraps new IO for sockets 
(but not files), so old IO is affected as well.

Even if you are using Buffered{Input | Output}Stream you can end up passing a 
large byte array to NIO. The byte array read method will pass the array to NIO 
directly if it is larger than the internal buffer.

Passing large cells between nodes as part of intra-cluster messaging can cause 
the NIO pooled buffers to quickly reach a high watermark and stay there. This 
ends up costing 2x the largest cell size because there is a buffer for input 
and output since they are different threads. This is further multiplied by the 
number of nodes in the cluster - 1 since each has a dedicated thread pair with 
separate thread locals.

Anecdotally it appears that the cost is doubled beyond that although it isn't 
clear why. Possibly the control connections or possibly there is some way in 
which multiple 

Need a workload in CI that tests the advertised limits of cells on a cluster. 
It would be reasonable to ratchet down the max direct memory for the test to 
trigger failures if a memory pooling issue is introduced. I don't think we need 
to test concurrently pulling in a lot of them, but it should at least work 
serially.

The obvious fix to address this issue would be to read in smaller chunks when 
dealing with large values. I think small should still be relatively large (4 
megabytes) so that code that is reading from a disk can amortize the cost of a 
seek. It can be hard to tell what the underlying thing being read from is going 
to be in some of the contexts where we might choose to implement switching to 
reading chunks.
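As a language-neutral sketch of the chunked-read idea described above (the real fix would live in Cassandra's Java I/O path; the 4 MB chunk size is only the figure suggested here, and the helper name is illustrative):

{code:python}
import socket

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB: small enough to bound buffers, large enough to amortize a seek

def read_exactly(sock: socket.socket, length: int) -> bytes:
    """Read `length` bytes from a socket in bounded chunks.

    Reading in CHUNK_SIZE pieces keeps any per-thread buffer the I/O layer
    allocates bounded by CHUNK_SIZE, instead of by the size of the largest
    value ever transferred.
    """
    parts = []
    remaining = length
    while remaining > 0:
        chunk = sock.recv(min(CHUNK_SIZE, remaining))
        if not chunk:
            raise EOFError("connection closed before %d bytes were read" % length)
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
{code}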

  was:
If you provide a large byte array to NIO and ask it to populate the byte array 
from a socket, it will allocate a thread-local byte buffer that is the size of 
the requested read, no matter how large it is. Old IO wraps new IO for sockets 
(but not files), so old IO is affected as well.

Even if you are using Buffered{Input | Output}Stream you can end up passing a 
large byte array to NIO. The byte array read method will pass the array to NIO 
directly if it is larger than the internal buffer.

Passing large cells between nodes as part of intra-cluster messaging can cause 
the NIO pooled buffers to quickly reach a high watermark and stay there. This 
ends up costing 2x the largest cell size because there is a buffer for input 
and output since they are different threads. This is further multiplied by the 
number of nodes in the cluster - 1 since each has a dedicated thread pair with 
separate thread locals.

Anecdotally it appears that the cost is doubled beyond that although it isn't 
clear why. Possibly the control connections or possibly there is some way in 
which multiple 

Need a workload in CI that tests the advertised limits of cells on a cluster. 
It would be reasonable to ratchet down the max direct memory for the test to 
trigger failures if a memory pooling issue is introduced. I don't think we need 
to test concurrently pulling in a lot of them, but it should at least work 
serially.


 Large columns + NIO memory pooling causes excessive direct memory usage
 ---

 Key: CASSANDRA-8670
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8670
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg

 If you provide a large byte array to NIO and ask it to populate the byte 
 array from a socket, it will allocate a thread-local byte buffer that is the 
 size of the requested read, no matter how large it is. Old IO wraps new IO for 
 sockets (but not files), so old IO is affected as well.
 Even if you are using Buffered{Input | Output}Stream you can end up passing a 
 large byte array to NIO. The byte array read method will pass the array to 
 NIO directly if it is larger than the internal buffer.
 Passing large cells between nodes as part of intra-cluster messaging can 
 cause the NIO pooled buffers to quickly reach a high watermark and stay 
 there. This ends up costing 2x the largest cell size because there is a 
 buffer for input and output since they are different threads. This is further 
 multiplied by the number of nodes in the cluster - 1 since each has a 
 dedicated thread pair with separate thread locals.
 Anecdotally it appears that the cost is doubled beyond that although it isn't 
 clear why. Possibly the control connections or possibly there is some way in 

[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8675:
---
Tester: Philip Thompson

 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cqlsh
 Fix For: 2.1.3

 Attachments: copytest.csv


 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 {code:sql}
 cqlsh:test> create table if not exists copytest (id int primary key, t text);
 cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline
         ... character');
 cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " 
 character');
 cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test> select * from copytest;
  id | t
 ----+---------------------------------------------------------
   1 |                          This has a newline\ncharacter
   2 |                            This has a quote " character
   3 | This has a fake tab \t character (entered slash-t text)
 (3 rows)
 cqlsh:test> copy copytest to '/tmp/copytest.csv';
 3 rows exported in 0.034 seconds.
 cqlsh:test> copy copytest from '/tmp/copytest.csv';
 3 rows imported in 0.005 seconds.
 cqlsh:test> select * from copytest;
  id | t
 ----+--------------------------------------------------------
   1 |                           This has a newlinencharacter
   2 |                            This has a quote " character
   3 | This has a fake tab \t character (typed backslash, t)
 (3 rows)
 {code}
 I tried replacing \n in the CSV file with \\n, which just expands to \n in 
 the table; and with an actual newline character, which fails with an error since 
 it prematurely terminates the record.
 It seems backslashes are only used to take the following character as a 
 literal.
 Until this is fixed, what would be the best way to refactor an old table with 
 a new, incompatible structure while maintaining its content and name, since we 
 can't rename tables?
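For comparison, a small Python csv-module sketch (not cqlsh's actual COPY code) showing that a quoted CSV field can round-trip an embedded newline and an embedded quote when written and read with consistent settings:

{code:python}
import csv
import io

rows = [[1, "This has a newline\ncharacter"],
        [2, 'This has a quote " character']]

buf = io.StringIO()
writer = csv.writer(buf)   # QUOTE_MINIMAL quotes fields containing a newline or a quote
writer.writerows(rows)

buf.seek(0)
for row in csv.reader(buf):
    print(row)             # the newline and the quote survive the round trip
{code}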



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8675:
---
Reproduced In: 2.1.2
Fix Version/s: 2.1.3
   Labels: cqlsh  (was: cql)

 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cqlsh
 Fix For: 2.1.3

 Attachments: copytest.csv


 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 {code:sql}
 cqlsh:test> create table if not exists copytest (id int primary key, t text);
 cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline
         ... character');
 cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " 
 character');
 cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test> select * from copytest;
  id | t
 ----+---------------------------------------------------------
   1 |                          This has a newline\ncharacter
   2 |                            This has a quote " character
   3 | This has a fake tab \t character (entered slash-t text)
 (3 rows)
 cqlsh:test> copy copytest to '/tmp/copytest.csv';
 3 rows exported in 0.034 seconds.
 cqlsh:test> copy copytest from '/tmp/copytest.csv';
 3 rows imported in 0.005 seconds.
 cqlsh:test> select * from copytest;
  id | t
 ----+--------------------------------------------------------
   1 |                           This has a newlinencharacter
   2 |                            This has a quote " character
   3 | This has a fake tab \t character (typed backslash, t)
 (3 rows)
 {code}
 I tried replacing \n in the CSV file with \\n, which just expands to \n in 
 the table; and with an actual newline character, which fails with an error since 
 it prematurely terminates the record.
 It seems backslashes are only used to take the following character as a 
 literal.
 Until this is fixed, what would be the best way to refactor an old table with 
 a new, incompatible structure while maintaining its content and name, since we 
 can't rename tables?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-8622) All of pig-test is failing in trunk

2015-01-23 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-8622.
-
Resolution: Fixed

Confirmed that pig-test runs normally now, as does EmbeddedCassandraServerTest, 
so I'm resolving this since there's no point in trying to fix actual pig errors 
before CASSANDRA-8358.

 All of pig-test is failing in trunk
 ---

 Key: CASSANDRA-8622
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8622
 Project: Cassandra
  Issue Type: Test
  Components: Hadoop
Reporter: Philip Thompson
Assignee: Brandon Williams
 Fix For: 3.0


 See http://cassci.datastax.com/job/trunk_pigtest/330/testReport/
 Every test in the ant target {{ant pig-test}} has been failing on trunk for a 
 while now.
 {code}
 java.lang.ExceptionInInitializerError
   at org.apache.log4j.Logger.getLogger(Logger.java:40)
   at org.hyperic.sigar.SigarLog.getLogger(SigarLog.java:48)
   at org.hyperic.sigar.SigarLog.getLogger(SigarLog.java:44)
   at org.hyperic.sigar.SigarLog.debug(SigarLog.java:60)
   at org.hyperic.sigar.Sigar.<clinit>(Sigar.java:108)
   at org.apache.cassandra.utils.SigarLibrary.init(SigarLibrary.java:45)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:206)
   at 
 org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:408)
   at 
 org.apache.cassandra.service.EmbeddedCassandraService.start(EmbeddedCassandraService.java:52)
   at 
 org.apache.cassandra.pig.PigTestBase.startCassandra(PigTestBase.java:96)
   at 
 org.apache.cassandra.pig.CqlRecordReaderTest.setup(CqlRecordReaderTest.java:63)
   at 
 org.apache.log4j.Log4jLoggerFactory.<clinit>(Log4jLoggerFactory.java:50)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289482#comment-14289482
 ] 

Abhishek Gupta edited comment on CASSANDRA-8638 at 1/23/15 4:37 PM:


[~s_delima] I have created a patch with a fix that checks for BOM characters and, 
if they are present, replaces them with an empty string. Please review it and 
apply it if it looks good.

A further enhancement could be to first check whether the file actually contains 
BOM characters and only replace them in that case.


was (Author: abhish_gl):
[~s_delima] I have created a patch with a fix that checks for BOM characters and 
replaces them if they are present. Please review it and apply it if it looks good.

A further enhancement could be to first check whether the file actually contains 
BOM characters and only replace them in that case.

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7306) Support edge dcs with more flexible gossip

2015-01-23 Thread Michael Nelson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289895#comment-14289895
 ] 

Michael Nelson commented on CASSANDRA-7306:
---

Any movement on this?

 Support edge dcs with more flexible gossip
 

 Key: CASSANDRA-7306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
  Labels: ponies

 As Cassandra clusters get bigger and bigger, and their topology becomes more 
 complex, there is more and more need for a notion of hub and spoke 
 datacenters.
 One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
 is the assumption that all dcs need to talk to each other (and be connected 
 all the time).
 This ticket is a vague placeholder with the goals of achieving:
 1) better behavioral support for occasionally disconnected datacenters
 2) explicit support for custom dc to dc routing. A simple approach would be 
 an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8677) rcp_interface and listen_interface generate NPE on startup when specified interface doesn't exist

2015-01-23 Thread Ariel Weisberg (JIRA)
Ariel Weisberg created CASSANDRA-8677:
-

 Summary: rcp_interface and listen_interface generate NPE on 
startup when specified interface doesn't exist
 Key: CASSANDRA-8677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8677
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg


This is just a buggy UI bit.

Initially the error I got was the following, which is redundant and not well formatted.
{noformat}
ERROR 20:12:55 Exception encountered during startup
java.lang.ExceptionInInitializerError: null
Fatal configuration error; unable to start. See log for stacktrace.
at 
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:108)
 ~[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
[main/:na]
java.lang.ExceptionInInitializerError: null
Fatal configuration error; unable to start. See log for stacktrace.
at 
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:108)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571)
Exception encountered during startup: null
Fatal configuration error; unable to start. See log for stacktrace.
ERROR 20:12:55 Exception encountered during startup
java.lang.ExceptionInInitializerError: null
Fatal configuration error; unable to start. See log for stacktrace.
at 
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:108)
 ~[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
[main/:na]
{noformat}

This has no description of the error that occurred. After logging the exception:

{noformat}
java.lang.NullPointerException: null
at 
org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:347)
 ~[main/:na]
at 
org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:102)
 ~[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479) 
[main/:na]
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
[main/:na]
{noformat}

Exceptions thrown in the DatabaseDescriptor should log in a useful way.

This particular error should generate a message without a stack trace since it 
is easily recognized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (CASSANDRA-8651) Add support for running on Apache Mesos

2015-01-23 Thread Albert P Tobey (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Albert P Tobey reassigned CASSANDRA-8651:
-

Assignee: Albert P Tobey

 Add support for running on Apache Mesos
 ---

 Key: CASSANDRA-8651
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8651
 Project: Cassandra
  Issue Type: Task
Reporter: Ben Whitehead
Assignee: Albert P Tobey
Priority: Minor
 Fix For: 3.0


 As a user of Apache Mesos I would like to be able to run Cassandra on my 
 Mesos cluster. This would entail integration of Cassandra on Mesos through 
 the creation of a production level Mesos framework. This would enable me to 
 avoid static partitioning and inefficiencies and run Cassandra as part of my 
 data center infrastructure.
 http://mesos.apache.org/documentation/latest/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289940#comment-14289940
 ] 

Abhishek Gupta edited comment on CASSANDRA-8638 at 1/23/15 8:47 PM:


As an alternate solution, here is what I propose to do instead of using a 
replace:
1. Check the first 4 bytes of the file to see if they contain a BOM.
2. If a BOM is present, look up the BOM marker to get the encoding and BOM size 
(from a static dictionary); when no BOM is found, default to ASCII.
3. Open the file in the encoding returned in step 2.
4. Move the file pointer ahead by the BOM size returned in step 2.
5. Everything else remains the same.

Comments?
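A minimal sketch of the proposed detection (the table contents, helper name, and ASCII default are illustrative, not the final patch):

{code:python}
import codecs

# BOM marker -> codec name; longest markers listed first so UTF-32 wins over UTF-16.
BOM_TABLE = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8,     "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def open_without_bom(path, default_encoding="ascii"):
    """Open `path`, detect a leading BOM, and position the stream just past it."""
    with open(path, "rb") as raw:
        head = raw.read(4)                    # step 1: first 4 bytes
    encoding, bom_size = default_encoding, 0  # step 2: default when no BOM
    for bom, name in BOM_TABLE:
        if head.startswith(bom):
            encoding, bom_size = name, len(bom)
            break
    f = codecs.open(path, encoding=encoding)  # step 3: open with detected encoding
    f.seek(bom_size)                          # step 4: skip past the BOM
    return f                                  # step 5: read as usual from here
{code}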


was (Author: abhish_gl):
As an alternate solution, here is what I propose to do instead of using a 
replace:
1. Check the first 4 bytes of the file to see if they contain a BOM.
2. If a BOM is present, look up the BOM marker to get the encoding and BOM size 
(from a static dictionary); when no BOM is found, default to ASCII.
3. Open the file in the encoding returned in step 2.
4. Move the file pointer ahead by the BOM size returned in step 2.
5. Everything else remains the same.

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


cassandra git commit: fix ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/trunk 230d884fa -> feda54f04


fix ArrayIndexOutOfBoundsException in nodetool cfhistograms

Patch by Benjamin Lerer, reviewed by jbellis for CASSANDRA-8514


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/feda54f0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/feda54f0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/feda54f0

Branch: refs/heads/trunk
Commit: feda54f04911373d6b6148dadbd843894767548c
Parents: 230d884
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Jan 23 16:33:59 2015 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Jan 23 16:33:59 2015 -0600

--
 CHANGES.txt |  1 +
 .../org/apache/cassandra/tools/NodeTool.java| 74 
 2 files changed, 47 insertions(+), 28 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/feda54f0/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index cdcb5cc..5bfb29c 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0
+ * Fix ArrayIndexOutOfBoundsException in nodetool cfhistograms (CASSANDRA-8514)
  * Serializing Row cache alternative, fully off heap (CASSANDRA-7438)
  * Duplicate rows returned when in clause has repeated values (CASSANDRA-6707)
  * Make CassandraException unchecked, extend RuntimeException (CASSANDRA-8560)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/feda54f0/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index 24772d7..b67dff9 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -31,7 +31,9 @@ import javax.management.openmbean.*;
 
 import com.google.common.base.Joiner;
 import com.google.common.base.Throwables;
+
 import com.google.common.collect.*;
+
 import com.yammer.metrics.reporting.JmxReporter;
 
 import io.airlift.command.*;
@@ -44,7 +46,9 @@ import 
org.apache.cassandra.db.compaction.CompactionManagerMBean;
 import org.apache.cassandra.db.compaction.OperationType;
 import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.locator.EndpointSnitchInfoMBean;
+
 import org.apache.cassandra.metrics.ColumnFamilyMetrics.Sampler;
+
 import org.apache.cassandra.net.MessagingServiceMBean;
 import org.apache.cassandra.repair.messages.RepairOption;
 import org.apache.cassandra.repair.RepairParallelism;
@@ -64,6 +68,7 @@ import static com.google.common.collect.Lists.newArrayList;
 import static java.lang.Integer.parseInt;
 import static java.lang.String.format;
 import static org.apache.commons.lang3.ArrayUtils.EMPTY_STRING_ARRAY;
+import static org.apache.commons.lang3.ArrayUtils.isEmpty;
 import static org.apache.commons.lang3.StringUtils.*;
 
 public class NodeTool
@@ -1023,46 +1028,59 @@ public class NodeTool
 long[] estimatedRowSize = (long[]) 
probe.getColumnFamilyMetric(keyspace, cfname, "EstimatedRowSizeHistogram");
 long[] estimatedColumnCount = (long[]) 
probe.getColumnFamilyMetric(keyspace, cfname, "EstimatedColumnCountHistogram");
 
-long[] rowSizeBucketOffsets = new 
EstimatedHistogram(estimatedRowSize.length).getBucketOffsets();
-long[] columnCountBucketOffsets = new 
EstimatedHistogram(estimatedColumnCount.length).getBucketOffsets();
-EstimatedHistogram rowSizeHist = new 
EstimatedHistogram(rowSizeBucketOffsets, estimatedRowSize);
-EstimatedHistogram columnCountHist = new 
EstimatedHistogram(columnCountBucketOffsets, estimatedColumnCount);
-
 // build arrays to store percentile values
 double[] estimatedRowSizePercentiles = new double[7];
 double[] estimatedColumnCountPercentiles = new double[7];
 double[] offsetPercentiles = new double[]{0.5, 0.75, 0.95, 0.98, 
0.99};
 
-if (rowSizeHist.isOverflowed())
+if (isEmpty(estimatedRowSize) || isEmpty(estimatedColumnCount))
 {
-System.err.println(String.format("Row sizes are larger than 
%s, unable to calculate percentiles", 
rowSizeBucketOffsets[rowSizeBucketOffsets.length - 1]));
-for (int i = 0; i < offsetPercentiles.length; i++)
-estimatedRowSizePercentiles[i] = Double.NaN;
-}
-else
-{
-for (int i = 0; i < offsetPercentiles.length; i++)
-estimatedRowSizePercentiles[i] = 
rowSizeHist.percentile(offsetPercentiles[i]);
-}
+

[jira] [Updated] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-8638:
---
Reviewer: Tyler Hobbs  (was: Philip Thompson)

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289998#comment-14289998
 ] 

Philip Thompson commented on CASSANDRA-8638:


That seems okay, but I'm not sure whether ASCII or UTF-8 is the better default. I will 
change the reviewer to Tyler and he will give better feedback.

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289998#comment-14289998
 ] 

Philip Thompson edited comment on CASSANDRA-8638 at 1/23/15 9:21 PM:
-

That seems okay, but I'm not sure whether ASCII or UTF-8 is the better default. I will 
change the reviewer to Tyler and he will give better feedback.

He is out until next Thursday though, so there will be some delay in committing 
the patch once you make the suggested changes.


was (Author: philipthompson):
That seems okay, but I'm not sure whether ASCII or UTF-8 is the better default. I will 
change the reviewer to Tyler and he will give better feedback.

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell into the byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained this simple statement (plus a BOM):
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/3] cassandra git commit: fix ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread brandonwilliams
fix ArrayIndexOutOfBoundsException in nodetool cfhistograms

Patch by Benjamin Lerer, reviewed by jbellis for CASSANDRA-8514


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c468c8b4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c468c8b4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c468c8b4

Branch: refs/heads/trunk
Commit: c468c8b4369c76612a1fc821e8e77fe1da8b8011
Parents: 3a5f79e
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Jan 23 15:59:30 2015 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Jan 23 15:59:30 2015 -0600

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/tools/NodeProbe.java | 12 +++-
 src/java/org/apache/cassandra/tools/NodeTool.java  | 14 --
 3 files changed, 24 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c468c8b4/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 474bfbe..7673a3b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.3
+ * Fix ArrayIndexOutOfBoundsException in nodetool cfhistograms (CASSANDRA-8514)
  * Switch from yammer metrics for nodetool cf/proxy histograms (CASSANDRA-8662)
  * Make sure we don't add tmplink files to the compaction
strategy (CASSANDRA-8580)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c468c8b4/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java 
b/src/java/org/apache/cassandra/tools/NodeProbe.java
index 155236f..f124589 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -54,8 +54,11 @@ import org.apache.cassandra.utils.concurrent.SimpleCondition;
 import com.google.common.base.Function;
 import com.google.common.collect.*;
 import com.google.common.util.concurrent.Uninterruptibles;
+
 import com.yammer.metrics.reporting.JmxReporter;
 
+import static org.apache.commons.lang3.ArrayUtils.isEmpty;
+
 /**
  * JMX client operations for Cassandra.
  */
@@ -1152,10 +1155,17 @@ public class NodeProbe implements AutoCloseable
 
 public double[] metricPercentilesAsArray(long[] counts)
 {
+double[] result = new double[7];
+
+if (isEmpty(counts))
+{
+Arrays.fill(result, Double.NaN);
+return result;
+}
+
 double[] offsetPercentiles = new double[] { 0.5, 0.75, 0.95, 0.98, 
0.99 };
 long[] offsets = new 
EstimatedHistogram(counts.length).getBucketOffsets();
 EstimatedHistogram metric = new EstimatedHistogram(offsets, counts);
-double[] result = new double[7];
 
 if (metric.isOverflowed())
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c468c8b4/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index 8de4fff..c2146c6 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -32,6 +32,7 @@ import javax.management.openmbean.*;
 import com.google.common.base.Joiner;
 import com.google.common.base.Throwables;
 import com.google.common.collect.*;
+
 import com.yammer.metrics.reporting.JmxReporter;
 
 import io.airlift.command.*;
@@ -62,6 +63,7 @@ import static com.google.common.collect.Lists.newArrayList;
 import static java.lang.Integer.parseInt;
 import static java.lang.String.format;
 import static org.apache.commons.lang3.ArrayUtils.EMPTY_STRING_ARRAY;
+import static org.apache.commons.lang3.ArrayUtils.isEmpty;
 import static org.apache.commons.lang3.StringUtils.*;
 
 public class NodeTool
@@ -1014,12 +1016,20 @@ public class NodeTool
 
 ColumnFamilyStoreMBean store = probe.getCfsProxy(keyspace, cfname);
 
+long[] estimatedRowSizeHistogram = 
store.getEstimatedRowSizeHistogram();
+long[] estimatedColumnCountHistogram = 
store.getEstimatedColumnCountHistogram();
+
+if (isEmpty(estimatedRowSizeHistogram) || 
isEmpty(estimatedColumnCountHistogram))
+{
+System.err.println("No SSTables exists, unable to calculate 
'Partition Size' and 'Cell Count' percentiles");
+}
+
 // calculate percentile of row size and column count
 String[] percentiles = new String[]{"50%", "75%", "95%", "98%", 
"99%", "Min", "Max"};
 double[] readLatency = 
probe.metricPercentilesAsArray(store.getRecentReadLatencyHistogramMicros());

[1/3] cassandra git commit: fix ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread brandonwilliams
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 3a5f79eb5 -> c468c8b43
  refs/heads/trunk 27ad2db02 -> 230d884fa


fix ArrayIndexOutOfBoundsException in nodetool cfhistograms

Patch by Benjamin Lerer, reviewed by jbellis for CASSANDRA-8514


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/c468c8b4
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/c468c8b4
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/c468c8b4

Branch: refs/heads/cassandra-2.1
Commit: c468c8b4369c76612a1fc821e8e77fe1da8b8011
Parents: 3a5f79e
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Jan 23 15:59:30 2015 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Jan 23 15:59:30 2015 -0600

--
 CHANGES.txt|  1 +
 src/java/org/apache/cassandra/tools/NodeProbe.java | 12 +++-
 src/java/org/apache/cassandra/tools/NodeTool.java  | 14 --
 3 files changed, 24 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/c468c8b4/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 474bfbe..7673a3b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 2.1.3
+ * Fix ArrayIndexOutOfBoundsException in nodetool cfhistograms (CASSANDRA-8514)
  * Switch from yammer metrics for nodetool cf/proxy histograms (CASSANDRA-8662)
  * Make sure we don't add tmplink files to the compaction
strategy (CASSANDRA-8580)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c468c8b4/src/java/org/apache/cassandra/tools/NodeProbe.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeProbe.java 
b/src/java/org/apache/cassandra/tools/NodeProbe.java
index 155236f..f124589 100644
--- a/src/java/org/apache/cassandra/tools/NodeProbe.java
+++ b/src/java/org/apache/cassandra/tools/NodeProbe.java
@@ -54,8 +54,11 @@ import org.apache.cassandra.utils.concurrent.SimpleCondition;
 import com.google.common.base.Function;
 import com.google.common.collect.*;
 import com.google.common.util.concurrent.Uninterruptibles;
+
 import com.yammer.metrics.reporting.JmxReporter;
 
+import static org.apache.commons.lang3.ArrayUtils.isEmpty;
+
 /**
  * JMX client operations for Cassandra.
  */
@@ -1152,10 +1155,17 @@ public class NodeProbe implements AutoCloseable
 
 public double[] metricPercentilesAsArray(long[] counts)
 {
+double[] result = new double[7];
+
+if (isEmpty(counts))
+{
+Arrays.fill(result, Double.NaN);
+return result;
+}
+
 double[] offsetPercentiles = new double[] { 0.5, 0.75, 0.95, 0.98, 
0.99 };
 long[] offsets = new 
EstimatedHistogram(counts.length).getBucketOffsets();
 EstimatedHistogram metric = new EstimatedHistogram(offsets, counts);
-double[] result = new double[7];
 
 if (metric.isOverflowed())
 {

http://git-wip-us.apache.org/repos/asf/cassandra/blob/c468c8b4/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index 8de4fff..c2146c6 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -32,6 +32,7 @@ import javax.management.openmbean.*;
 import com.google.common.base.Joiner;
 import com.google.common.base.Throwables;
 import com.google.common.collect.*;
+
 import com.yammer.metrics.reporting.JmxReporter;
 
 import io.airlift.command.*;
@@ -62,6 +63,7 @@ import static com.google.common.collect.Lists.newArrayList;
 import static java.lang.Integer.parseInt;
 import static java.lang.String.format;
 import static org.apache.commons.lang3.ArrayUtils.EMPTY_STRING_ARRAY;
+import static org.apache.commons.lang3.ArrayUtils.isEmpty;
 import static org.apache.commons.lang3.StringUtils.*;
 
 public class NodeTool
@@ -1014,12 +1016,20 @@ public class NodeTool
 
 ColumnFamilyStoreMBean store = probe.getCfsProxy(keyspace, cfname);
 
+long[] estimatedRowSizeHistogram = 
store.getEstimatedRowSizeHistogram();
+long[] estimatedColumnCountHistogram = 
store.getEstimatedColumnCountHistogram();
+
+if (isEmpty(estimatedRowSizeHistogram) || 
isEmpty(estimatedColumnCountHistogram))
+{
+System.err.println("No SSTables exists, unable to calculate 
'Partition Size' and 'Cell Count' percentiles");
+}
+
 // calculate percentile of row size and column count
 String[] percentiles = new String[]{"50%", "75%", 

[jira] [Updated] (CASSANDRA-8514) ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-8514:
--
Attachment: CASSANDRA-8514-trunk.txt

Patch for trunk

 ArrayIndexOutOfBoundsException in nodetool cfhistograms
 ---

 Key: CASSANDRA-8514
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8514
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: OSX
Reporter: Philip Thompson
Assignee: Benjamin Lerer
 Fix For: 2.1.3

 Attachments: CASSANDRA-8514-V2.txt, CASSANDRA-8514-V3.txt, 
 CASSANDRA-8514-trunk.txt, cassandra-2.1-8514-1.txt


 When running nodetool cfhistograms on 2.1-HEAD, I am seeing the following 
 exception:
 {code}
 04:02 PM:~/cstar/cassandra[cassandra-2.1*]$ bin/nodetool cfhistograms 
 keyspace1 standard1
 objc[58738]: Class JavaLaunchHelper is implemented in both 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/bin/java and 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre/lib/libinstrument.dylib.
  One of the two will be used. Which one is undefined.
 error: 0
 -- StackTrace --
 java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.cassandra.utils.EstimatedHistogram.newOffsets(EstimatedHistogram.java:75)
   at 
 org.apache.cassandra.utils.EstimatedHistogram.&lt;init&gt;(EstimatedHistogram.java:60)
   at 
 org.apache.cassandra.tools.NodeTool$CfHistograms.execute(NodeTool.java:946)
   at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:250)
   at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:164){code}
 I can reproduce this with these simple steps:
 Start a new C* 2.1-HEAD node
 Run {{cassandra-stress write n=1}}
 Run {{nodetool cfhistograms keyspace1 standard1}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8677) rpc_interface and listen_interface generate NPE on startup when specified interface doesn't exist

2015-01-23 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8677:
--
Summary: rpc_interface and listen_interface generate NPE on startup when 
specified interface doesn't exist  (was: rcp_interface and listen_interface 
generate NPE on startup when specified interface doesn't exist)

 rpc_interface and listen_interface generate NPE on startup when specified 
 interface doesn't exist
 -

 Key: CASSANDRA-8677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8677
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg

 This is just a buggy UI bit.
 Initially the error I got was this which is redundant and not well formatted.
 {noformat}
 ERROR 20:12:55 Exception encountered during startup
 java.lang.ExceptionInInitializerError: null
 Fatal configuration error; unable to start. See log for stacktrace.
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:108)
  ~[main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
 [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
  [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
 [main/:na]
 java.lang.ExceptionInInitializerError: null
 Fatal configuration error; unable to start. See log for stacktrace.
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:108)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122)
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571)
 Exception encountered during startup: null
 Fatal configuration error; unable to start. See log for stacktrace.
 ERROR 20:12:55 Exception encountered during startup
 java.lang.ExceptionInInitializerError: null
 Fatal configuration error; unable to start. See log for stacktrace.
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:108)
  ~[main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
 [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
  [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
 [main/:na]
 {noformat}
 This has no description of the error that occurred. After logging the 
 exception.
 {noformat}
 java.lang.NullPointerException: null
   at 
 org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:347)
  ~[main/:na]
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:102)
  ~[main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
 [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
  [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
 [main/:na]
 {noformat}
 Exceptions thrown in the DatabaseDescriptor should log in a useful way.
 This particular error should generate a message without a stack trace since 
 it is easily recognized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8514) ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread Benjamin Lerer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289977#comment-14289977
 ] 

Benjamin Lerer commented on CASSANDRA-8514:
---

[~jbellis] I need another review when you have time.


 ArrayIndexOutOfBoundsException in nodetool cfhistograms
 ---

 Key: CASSANDRA-8514
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8514
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: OSX
Reporter: Philip Thompson
Assignee: Benjamin Lerer
 Fix For: 2.1.3

 Attachments: CASSANDRA-8514-V2.txt, CASSANDRA-8514-V3.txt, 
 cassandra-2.1-8514-1.txt


 When running nodetool cfhistograms on 2.1-HEAD, I am seeing the following 
 exception:
 {code}
 04:02 PM:~/cstar/cassandra[cassandra-2.1*]$ bin/nodetool cfhistograms 
 keyspace1 standard1
 objc[58738]: Class JavaLaunchHelper is implemented in both 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/bin/java and 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre/lib/libinstrument.dylib.
  One of the two will be used. Which one is undefined.
 error: 0
 -- StackTrace --
 java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.cassandra.utils.EstimatedHistogram.newOffsets(EstimatedHistogram.java:75)
   at 
 org.apache.cassandra.utils.EstimatedHistogram.&lt;init&gt;(EstimatedHistogram.java:60)
   at 
 org.apache.cassandra.tools.NodeTool$CfHistograms.execute(NodeTool.java:946)
   at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:250)
   at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:164){code}
 I can reproduce this with these simple steps:
 Start a new C* 2.1-HEAD node
 Run {{cassandra-stress write n=1}}
 Run {{nodetool cfhistograms keyspace1 standard1}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8677) rpc_interface and listen_interface generate NPE on startup when specified interface doesn't exist

2015-01-23 Thread Ariel Weisberg (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ariel Weisberg updated CASSANDRA-8677:
--
Attachment: 8677.patch

ConfigurationException now has a boolean field indicating whether the stack 
trace should be displayed and I opted in a few places where suppressing stack 
traces looks harmless. I am very conservative about suppressing stack traces 
because not getting a stack trace when you need one in the field can be a 
disaster and ConfigurationException is widely used.
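
As a rough sketch of that idea (all names below are made up and may not match 
8677.patch): the exception carries an opt-out flag, and only the top-level 
startup handler consults it, so the default behaviour is still to print the 
full stack trace.

{code:java}
// Hypothetical names; illustrative only, not the code in 8677.patch.
class ConfigurationException extends RuntimeException
{
    final boolean logStackTrace;

    ConfigurationException(String message, boolean logStackTrace)
    {
        super(message);
        this.logStackTrace = logStackTrace;
    }
}

final class StartupErrorHandler
{
    static void handle(ConfigurationException e)
    {
        if (e.logStackTrace)
            e.printStackTrace(System.err);       // unrecognized cause: keep the full trace
        else
            System.err.println(e.getMessage());  // easily recognized misconfiguration: message only
    }
}
{code}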

I also fixed an issue where a format error in the YAML would cause the parser 
to fail, but the parser's error was being overridden by an NPE in 
JVMStabilityInspector, caused by the stability inspector trying to access 
config that the failed parse never produced.

CassandraDaemon now unwraps the exceptions that propagate out of 
DatabaseDescriptor's static initializer. It also avoids printing the stack 
trace an extra time to both the log and stderr/stdout although it still prints 
to stderr and stdout once. 

I took a look at 2.1 and 2.0 and it looks like things are working ok. I tested 
2.1 and the JVMStabilityInspector circular dependency doesn't occur.

 rpc_interface and listen_interface generate NPE on startup when specified 
 interface doesn't exist
 -

 Key: CASSANDRA-8677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8677
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ariel Weisberg
Assignee: Ariel Weisberg
 Attachments: 8677.patch


 This is just a buggy UI bit.
 Initially the error I got was this which is redundant and not well formatted.
 {noformat}
 ERROR 20:12:55 Exception encountered during startup
 java.lang.ExceptionInInitializerError: null
 Fatal configuration error; unable to start. See log for stacktrace.
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:108)
  ~[main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
 [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
  [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
 [main/:na]
 java.lang.ExceptionInInitializerError: null
 Fatal configuration error; unable to start. See log for stacktrace.
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:108)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122)
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571)
 Exception encountered during startup: null
 Fatal configuration error; unable to start. See log for stacktrace.
 ERROR 20:12:55 Exception encountered during startup
 java.lang.ExceptionInInitializerError: null
 Fatal configuration error; unable to start. See log for stacktrace.
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:108)
  ~[main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
 [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
  [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
 [main/:na]
 {noformat}
 This has no description of the error that occurred. After logging the 
 exception.
 {noformat}
 java.lang.NullPointerException: null
   at 
 org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:347)
  ~[main/:na]
   at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:102)
  ~[main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:122) 
 [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:479)
  [main/:na]
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:571) 
 [main/:na]
 {noformat}
 Exceptions thrown in the DatabaseDescriptor should log in a useful way.
 This particular error should generate a message without a stack trace since 
 it is easily recognized.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8514) ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread Benjamin Lerer (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-8514:
--
Attachment: CASSANDRA-8514-V3.txt

Modified version of the patch to adapt to the changes made by CASSANDRA-8662

 ArrayIndexOutOfBoundsException in nodetool cfhistograms
 ---

 Key: CASSANDRA-8514
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8514
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: OSX
Reporter: Philip Thompson
Assignee: Benjamin Lerer
 Fix For: 2.1.3

 Attachments: CASSANDRA-8514-V2.txt, CASSANDRA-8514-V3.txt, 
 cassandra-2.1-8514-1.txt


 When running nodetool cfhistograms on 2.1-HEAD, I am seeing the following 
 exception:
 {code}
 04:02 PM:~/cstar/cassandra[cassandra-2.1*]$ bin/nodetool cfhistograms 
 keyspace1 standard1
 objc[58738]: Class JavaLaunchHelper is implemented in both 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/bin/java and 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre/lib/libinstrument.dylib.
  One of the two will be used. Which one is undefined.
 error: 0
 -- StackTrace --
 java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.cassandra.utils.EstimatedHistogram.newOffsets(EstimatedHistogram.java:75)
   at 
 org.apache.cassandra.utils.EstimatedHistogram.&lt;init&gt;(EstimatedHistogram.java:60)
   at 
 org.apache.cassandra.tools.NodeTool$CfHistograms.execute(NodeTool.java:946)
   at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:250)
   at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:164){code}
 I can reproduce this with these simple steps:
 Start a new C* 2.1-HEAD node
 Run {{cassandra-stress write n=1}}
 Run {{nodetool cfhistograms keyspace1 standard1}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7306) Support edge dcs with more flexible gossip

2015-01-23 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289956#comment-14289956
 ] 

Jeremiah Jordan commented on CASSANDRA-7306:


Have you tried to do this with things as they are now? If you set up keyspaces 
such that your spoke DCs don't need to replicate with other spokes, does it 
matter that they think all the nodes in another spoke are down? Besides a bunch 
of log messages whining about nodes being down, I don't see why this wouldn't 
already work, at least in theory.

 Support edge dcs with more flexible gossip
 

 Key: CASSANDRA-7306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7306
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tupshin Harper
  Labels: ponies

 As Cassandra clusters get bigger and bigger, and their topology becomes more 
 complex, there is more and more need for a notion of hub and spoke 
 datacenters.
 One of the big obstacles to supporting hundreds (or thousands) of remote dcs, 
 is the assumption that all dcs need to talk to each other (and be connected 
 all the time).
 This ticket is a vague placeholder with the goals of achieving:
 1) better behavioral support for occasionally disconnected datacenters
 2) explicit support for custom dc to dc routing. A simple approach would be 
 an optional per-dc annotation of which other DCs that DC could gossip with.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289940#comment-14289940
 ] 

Abhishek Gupta edited comment on CASSANDRA-8638 at 1/23/15 8:46 PM:


As an alternative solution, here is what I propose to do instead of using a 
replace:
1. check the first 4 bytes of the file to see if they contain a BOM
2. if they do, look up the BOM marker to get the encoding and the BOM size 
(from a static dictionary); by default it will return ASCII
3. open the file with the encoding returned in step 2
4. move the file pointer ahead by the BOM size returned in step 2
5. everything else will remain the same


was (Author: abhish_gl):
As an alternate solution, here is what I propose to do as opposed to using a 
replace:
1. check the first 4 bytes and see if contains a BOM
2. if it contains a BOM, i will lookup the bom marker and further get the 
encoding and bom size(from a static dictionary)
3. open the file in the encoding returned in step-2
4. move file pointer ahead by bom size(based on bom size returned in step-2)
5. everything else will remain same

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell in byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained the simple (plus BOM)
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[3/3] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2015-01-23 Thread brandonwilliams
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/230d884f
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/230d884f
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/230d884f

Branch: refs/heads/trunk
Commit: 230d884fa3a9505ec21b95ed84e7176aa1db4f9f
Parents: 27ad2db c468c8b
Author: Brandon Williams brandonwilli...@apache.org
Authored: Fri Jan 23 16:09:54 2015 -0600
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Fri Jan 23 16:09:54 2015 -0600

--

--




[jira] [Reopened] (CASSANDRA-8616) sstable2json may result in commit log segments be written

2015-01-23 Thread Russ Hatch (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Russ Hatch reopened CASSANDRA-8616:
---

Seems to be working fine in 2.0 latest, but latest 2.1 (c468c8b436) is still 
writing commitlog files when sstable2json is called.

Similar to before, this happens whether the db file argument is valid or not, 
and each time 1 new commitlog file is written.

 sstable2json may result in commit log segments be written
 -

 Key: CASSANDRA-8616
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8616
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Yuki Morishita
 Fix For: 2.1.3, 2.0.13

 Attachments: 8161-2.0.txt


 There was a report of sstable2json causing commitlog segments to be written 
 out when run.  I haven't attempted to reproduce this yet, so that's all I 
 know for now.  Since sstable2json loads the conf and schema, I'm thinking 
 that it may inadvertently be triggering the commitlog code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8638) CQLSH -f option should ignore BOM in files

2015-01-23 Thread Abhishek Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289940#comment-14289940
 ] 

Abhishek Gupta commented on CASSANDRA-8638:
---

As an alternative solution, here is what I propose to do instead of using a 
replace:
1. check the first 4 bytes of the file to see if they contain a BOM
2. if they do, look up the BOM marker to get the encoding and the BOM size 
(from a static dictionary)
3. open the file with the encoding returned in step 2
4. move the file pointer ahead by the BOM size returned in step 2
5. everything else will remain the same (a rough sketch of this lookup 
approach follows)
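
Purely as an illustration of the lookup-table idea (cqlsh itself is Python and 
the attached patch may do this differently; the class, constants and the ASCII 
default below are assumptions, not the patch), a minimal Java sketch could look 
like this:

{code:java}
import java.io.*;
import java.nio.charset.Charset;

public class BomAwareReader
{
    // Known BOMs, longest first so a UTF-32 BOM is not mistaken for UTF-16.
    private static final Object[][] BOMS = {
        { new byte[]{ (byte) 0x00, (byte) 0x00, (byte) 0xFE, (byte) 0xFF }, "UTF-32BE" },
        { new byte[]{ (byte) 0xFF, (byte) 0xFE, (byte) 0x00, (byte) 0x00 }, "UTF-32LE" },
        { new byte[]{ (byte) 0xEF, (byte) 0xBB, (byte) 0xBF },              "UTF-8"    },
        { new byte[]{ (byte) 0xFE, (byte) 0xFF },                           "UTF-16BE" },
        { new byte[]{ (byte) 0xFF, (byte) 0xFE },                           "UTF-16LE" },
    };

    public static BufferedReader open(File file) throws IOException
    {
        // step 1: peek at the first 4 bytes
        byte[] head = new byte[4];
        int read;
        try (InputStream peek = new FileInputStream(file))
        {
            read = peek.read(head);
        }

        // step 2: look up the marker to get encoding and BOM size, ASCII by default
        String encoding = "US-ASCII";
        int bomLength = 0;
        for (Object[] bom : BOMS)
        {
            byte[] marker = (byte[]) bom[0];
            if (read >= marker.length && startsWith(head, marker))
            {
                encoding = (String) bom[1];
                bomLength = marker.length;
                break;
            }
        }

        // steps 3 and 4: reopen with that encoding and skip past the BOM
        InputStream in = new FileInputStream(file);
        if (in.skip(bomLength) != bomLength)
            throw new IOException("Could not skip BOM in " + file);
        return new BufferedReader(new InputStreamReader(in, Charset.forName(encoding)));
    }

    private static boolean startsWith(byte[] data, byte[] prefix)
    {
        for (int i = 0; i < prefix.length; i++)
            if (data[i] != prefix[i])
                return false;
        return true;
    }
}
{code}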

 CQLSH -f option should ignore BOM in files
 --

 Key: CASSANDRA-8638
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8638
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
 Environment: Red Hat linux
Reporter: Sotirios Delimanolis
Priority: Trivial
  Labels: cqlsh, lhf
 Fix For: 2.1.3

 Attachments: 0001-bug-CASSANDRA-8638.patch


 I fell in byte order mark trap trying to execute a CQL script through CQLSH. 
 The file contained the simple (plus BOM)
 {noformat}
 CREATE KEYSPACE IF NOT EXISTS xobni WITH replication = {'class': 
 'SimpleStrategy', 'replication_factor': '3'}  AND durable_writes = true; 
 -- and another CREATE TABLE bucket_flags query
 {noformat}
 I executed the script
 {noformat}
 [~]$ cqlsh --file /home/selimanolis/Schema/patches/setup.cql 
 /home/selimanolis/Schema/patches/setup.cql:2:Invalid syntax at char 1
 /home/selimanolis/Schema/patches/setup.cql:2:  CREATE KEYSPACE IF NOT EXISTS 
 test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 
 '3'}  AND durable_writes = true; 
 /home/selimanolis/Schema/patches/setup.cql:2:  ^
 /home/selimanolis/Schema/patches/setup.cql:22:ConfigurationException: 
 ErrorMessage code=2300 [Query invalid because of configuration issue] 
 message=Cannot add column family 'bucket_flags' to non existing keyspace 
 'test'.
 {noformat}
 I realized much later that the file had a BOM which was seemingly screwing 
 with how CQLSH parsed the file.
 It would be nice to have CQLSH ignore the BOM when processing files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8514) ArrayIndexOutOfBoundsException in nodetool cfhistograms

2015-01-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14290014#comment-14290014
 ] 

Jonathan Ellis commented on CASSANDRA-8514:
---

+1

 ArrayIndexOutOfBoundsException in nodetool cfhistograms
 ---

 Key: CASSANDRA-8514
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8514
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: OSX
Reporter: Philip Thompson
Assignee: Benjamin Lerer
 Fix For: 2.1.3

 Attachments: CASSANDRA-8514-V2.txt, CASSANDRA-8514-V3.txt, 
 cassandra-2.1-8514-1.txt


 When running nodetool cfhistograms on 2.1-HEAD, I am seeing the following 
 exception:
 {code}
 04:02 PM:~/cstar/cassandra[cassandra-2.1*]$ bin/nodetool cfhistograms 
 keyspace1 standard1
 objc[58738]: Class JavaLaunchHelper is implemented in both 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/bin/java and 
 /Library/Java/JavaVirtualMachines/jdk1.7.0_67.jdk/Contents/Home/jre/lib/libinstrument.dylib.
  One of the two will be used. Which one is undefined.
 error: 0
 -- StackTrace --
 java.lang.ArrayIndexOutOfBoundsException: 0
   at 
 org.apache.cassandra.utils.EstimatedHistogram.newOffsets(EstimatedHistogram.java:75)
   at 
 org.apache.cassandra.utils.EstimatedHistogram.&lt;init&gt;(EstimatedHistogram.java:60)
   at 
 org.apache.cassandra.tools.NodeTool$CfHistograms.execute(NodeTool.java:946)
   at 
 org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:250)
   at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:164){code}
 I can reproduce this with these simple steps:
 Start a new C* 2.1-HEAD node
 Run {{cassandra-stress write n=1}}
 Run {{nodetool cfhistograms keyspace1 standard1}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8348) allow takeColumnFamilySnapshot to take a list of ColumnFamilies

2015-01-23 Thread Sachin Janani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sachin Janani updated CASSANDRA-8348:
-
Attachment: Patch-8348.patch

Patch for Cassandra-8348

 allow takeColumnFamilySnapshot to take a list of ColumnFamilies
 ---

 Key: CASSANDRA-8348
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8348
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Halliday
Priority: Minor
 Fix For: 3.0, 2.1.3

 Attachments: Patch-8348.patch


 Within StorageServiceMBean.java the function takeSnapshot allows for a list 
 of keyspaces to snapshot.  However, the function takeColumnFamilySnapshot 
 only allows for a single ColumnFamily to snapshot.  This should allow for 
 multiple ColumnFamilies within the same Keyspace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6809) Compressed Commit Log

2015-01-23 Thread Branimir Lambov (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289213#comment-14289213
 ] 

Branimir Lambov commented on CASSANDRA-6809:


Thank you, I did not realise you are interested in parallelism between segments 
only. Of course, what you suggest is the right solution if we are limited to 
that; I approached the problem with the assumption that we need shorter 
sections (of the same segment) that are to progress in parallel. I can see that 
this should work well enough with large sync periods, including the 10s default.

I am happy to continue with either approach, or without multithreaded 
compression altogether. I am now going back to addressing the individual issues 
Ariel raised.


 Compressed Commit Log
 -

 Key: CASSANDRA-6809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Branimir Lambov
Priority: Minor
  Labels: performance
 Fix For: 3.0

 Attachments: ComitLogStress.java, logtest.txt


 It seems an unnecessary oversight that we don't compress the commit log. 
 Doing so should improve throughput, but some care will need to be taken to 
 ensure we use as much of a segment as possible. I propose decoupling the 
 writing of the records from the segments. Basically write into a (queue of) 
 DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
 MB written to the CL (where X is ordinarily CLS size), and then pack as many 
 of the compressed chunks into a CLS as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8674) Improve CL write latency under saturation

2015-01-23 Thread Benedict (JIRA)
Benedict created CASSANDRA-8674:
---

 Summary: Improve CL write latency under saturation
 Key: CASSANDRA-8674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8674
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
 Fix For: 3.0


At the moment we must flush the entire backlog of segments before delayed 
writes can continue, but we could update our progress as we flush individual 
segments. This may permit us to resume progress ahead of total completion of 
our backlog.
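
A minimal sketch of what publishing per-segment progress could look like 
(names such as FlushProgress are invented here; the real commit log code is 
more involved): writers block on a counter that the flusher bumps after each 
individual segment instead of after the whole backlog.

{code:java}
import java.util.concurrent.atomic.AtomicLong;

final class FlushProgress
{
    private final AtomicLong flushedSegments = new AtomicLong();
    private final Object signal = new Object();

    // Flusher thread: called after EACH segment is flushed, not once per backlog.
    void segmentFlushed()
    {
        flushedSegments.incrementAndGet();
        synchronized (signal)
        {
            signal.notifyAll();
        }
    }

    // Writer thread: resumes as soon as enough segments have been reclaimed.
    void awaitSegments(long required) throws InterruptedException
    {
        synchronized (signal)
        {
            while (flushedSegments.get() < required)
                signal.wait();
        }
    }
}
{code}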



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8616) sstable2json may result in commit log segments be written

2015-01-23 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14290380#comment-14290380
 ] 

Yuki Morishita commented on CASSANDRA-8616:
---

You are right.
In 2.1 and trunk, accessing the schema creates a new Memtable instance for 
schema_keyspace, and [the Memtable touches the CommitLog 
singleton|https://github.com/apache/cassandra/blob/cassandra-2.1.2/src/java/org/apache/cassandra/db/Memtable.java#L66]
 which [creates one commit log file when it is 
initialized|https://github.com/apache/cassandra/blob/cassandra-2.1.2/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L70].

Looks like more work needs to be done...

 sstable2json may result in commit log segments be written
 -

 Key: CASSANDRA-8616
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8616
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Reporter: Tyler Hobbs
Assignee: Yuki Morishita
 Fix For: 2.1.3, 2.0.13

 Attachments: 8161-2.0.txt


 There was a report of sstable2json causing commitlog segments to be written 
 out when run.  I haven't attempted to reproduce this yet, so that's all I 
 know for now.  Since sstable2json loads the conf and schema, I'm thinking 
 that it may inadvertently be triggering the commitlog code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] cassandra git commit: java8 bug? disambiguate (ArrayUtils|StringUtils).isEmpty by not using static imports error: no suitable method found for isEmpty(String)

2015-01-23 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/trunk feda54f04 -> 7b533d067


java8 bug? disambiguate (ArrayUtils|StringUtils).isEmpty by not using static 
imports
error: no suitable method found for isEmpty(String)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1bb0c149
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1bb0c149
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1bb0c149

Branch: refs/heads/trunk
Commit: 1bb0c149eb9657be1dc4c488156ced617b622ceb
Parents: c468c8b
Author: Dave Brosius dbros...@mebigfatguy.com
Authored: Fri Jan 23 20:59:41 2015 -0500
Committer: Dave Brosius dbros...@mebigfatguy.com
Committed: Fri Jan 23 20:59:41 2015 -0500

--
 src/java/org/apache/cassandra/tools/NodeTool.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1bb0c149/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index c2146c6..674b346 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -55,6 +55,7 @@ import org.apache.cassandra.streaming.StreamState;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.JVMStabilityInspector;
 
+import org.apache.commons.lang3.ArrayUtils;
 import static com.google.common.base.Preconditions.checkArgument;
 import static com.google.common.base.Preconditions.checkState;
 import static com.google.common.base.Throwables.getStackTraceAsString;
@@ -63,7 +64,6 @@ import static com.google.common.collect.Lists.newArrayList;
 import static java.lang.Integer.parseInt;
 import static java.lang.String.format;
 import static org.apache.commons.lang3.ArrayUtils.EMPTY_STRING_ARRAY;
-import static org.apache.commons.lang3.ArrayUtils.isEmpty;
 import static org.apache.commons.lang3.StringUtils.*;
 
 public class NodeTool
@@ -1019,7 +1019,7 @@ public class NodeTool
 long[] estimatedRowSizeHistogram = 
store.getEstimatedRowSizeHistogram();
 long[] estimatedColumnCountHistogram = 
store.getEstimatedColumnCountHistogram();
 
-if (isEmpty(estimatedRowSizeHistogram) || 
isEmpty(estimatedColumnCountHistogram))
+if (ArrayUtils.isEmpty(estimatedRowSizeHistogram) || 
ArrayUtils.isEmpty(estimatedColumnCountHistogram))
 {
 System.err.println("No SSTables exists, unable to calculate 
'Partition Size' and 'Cell Count' percentiles");
 }



[2/2] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2015-01-23 Thread dbrosius
Merge branch 'cassandra-2.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7b533d06
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7b533d06
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7b533d06

Branch: refs/heads/trunk
Commit: 7b533d0677fcbd93b45ae442e60ab9a302f57d3d
Parents: feda54f 1bb0c14
Author: Dave Brosius dbros...@mebigfatguy.com
Authored: Fri Jan 23 21:04:59 2015 -0500
Committer: Dave Brosius dbros...@mebigfatguy.com
Committed: Fri Jan 23 21:04:59 2015 -0500

--
 src/java/org/apache/cassandra/tools/NodeTool.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/7b533d06/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --cc src/java/org/apache/cassandra/tools/NodeTool.java
index b67dff9,674b346..18feac7
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@@ -1024,68 -1014,23 +1024,68 @@@ public class NodeToo
  String keyspace = args.get(0);
  String cfname = args.get(1);
  
 -ColumnFamilyStoreMBean store = probe.getCfsProxy(keyspace, 
cfname);
 +// calculate percentile of row size and column count
 +long[] estimatedRowSize = (long[]) 
probe.getColumnFamilyMetric(keyspace, cfname, "EstimatedRowSizeHistogram");
 +long[] estimatedColumnCount = (long[]) 
probe.getColumnFamilyMetric(keyspace, cfname, "EstimatedColumnCountHistogram");
  
 -long[] estimatedRowSizeHistogram = 
store.getEstimatedRowSizeHistogram();
 -long[] estimatedColumnCountHistogram = 
store.getEstimatedColumnCountHistogram();
 +// build arrays to store percentile values
 +double[] estimatedRowSizePercentiles = new double[7];
 +double[] estimatedColumnCountPercentiles = new double[7];
 +double[] offsetPercentiles = new double[]{0.5, 0.75, 0.95, 0.98, 
0.99};
  
- if (isEmpty(estimatedRowSize) || isEmpty(estimatedColumnCount))
 -if (ArrayUtils.isEmpty(estimatedRowSizeHistogram) || 
ArrayUtils.isEmpty(estimatedColumnCountHistogram))
++if (ArrayUtils.isEmpty(estimatedRowSize) || 
ArrayUtils.isEmpty(estimatedColumnCount))
  {
  System.err.println("No SSTables exists, unable to calculate 
'Partition Size' and 'Cell Count' percentiles");
 +
 +for (int i = 0; i < 7; i++)
 +{
 +estimatedRowSizePercentiles[i] = Double.NaN;
 +estimatedColumnCountPercentiles[i] = Double.NaN;
 +}
 +}
 +else
 +{
 +long[] rowSizeBucketOffsets = new 
EstimatedHistogram(estimatedRowSize.length).getBucketOffsets();
 +long[] columnCountBucketOffsets = new 
EstimatedHistogram(estimatedColumnCount.length).getBucketOffsets();
 +EstimatedHistogram rowSizeHist = new 
EstimatedHistogram(rowSizeBucketOffsets, estimatedRowSize);
 +EstimatedHistogram columnCountHist = new 
EstimatedHistogram(columnCountBucketOffsets, estimatedColumnCount);
 +
 +if (rowSizeHist.isOverflowed())
 +{
 +System.err.println(String.format("Row sizes are larger 
than %s, unable to calculate percentiles", 
rowSizeBucketOffsets[rowSizeBucketOffsets.length - 1]));
 +for (int i = 0; i < offsetPercentiles.length; i++)
 +estimatedRowSizePercentiles[i] = Double.NaN;
 +}
 +else
 +{
 +for (int i = 0; i < offsetPercentiles.length; i++)
 +estimatedRowSizePercentiles[i] = 
rowSizeHist.percentile(offsetPercentiles[i]);
 +}
 +
 +if (columnCountHist.isOverflowed())
 +{
 +System.err.println(String.format("Column counts are 
larger than %s, unable to calculate percentiles", 
columnCountBucketOffsets[columnCountBucketOffsets.length - 1]));
 +for (int i = 0; i < 
estimatedColumnCountPercentiles.length; i++)
 +estimatedColumnCountPercentiles[i] = Double.NaN;
 +}
 +else
 +{
 +for (int i = 0; i < offsetPercentiles.length; i++)
 +estimatedColumnCountPercentiles[i] = 
columnCountHist.percentile(offsetPercentiles[i]);
 +}
 +
 +// min value
 +estimatedRowSizePercentiles[5] = rowSizeHist.min();
 +estimatedColumnCountPercentiles[5] = 

cassandra git commit: java8 bug? disambiguate (ArrayUtils|StringUtils).isEmpty by not using static imports error: no suitable method found for isEmpty(String)

2015-01-23 Thread dbrosius
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.1 c468c8b43 -> 1bb0c149e


java8 bug? disambiguate (ArrayUtils|StringUtils).isEmpty by not using static 
imports
error: no suitable method found for isEmpty(String)


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1bb0c149
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1bb0c149
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1bb0c149

Branch: refs/heads/cassandra-2.1
Commit: 1bb0c149eb9657be1dc4c488156ced617b622ceb
Parents: c468c8b
Author: Dave Brosius dbros...@mebigfatguy.com
Authored: Fri Jan 23 20:59:41 2015 -0500
Committer: Dave Brosius dbros...@mebigfatguy.com
Committed: Fri Jan 23 20:59:41 2015 -0500

--
 src/java/org/apache/cassandra/tools/NodeTool.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/1bb0c149/src/java/org/apache/cassandra/tools/NodeTool.java
--
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java 
b/src/java/org/apache/cassandra/tools/NodeTool.java
index c2146c6..674b346 100644
--- a/src/java/org/apache/cassandra/tools/NodeTool.java
+++ b/src/java/org/apache/cassandra/tools/NodeTool.java
@@ -55,6 +55,7 @@ import org.apache.cassandra.streaming.StreamState;
 import org.apache.cassandra.utils.FBUtilities;
 import org.apache.cassandra.utils.JVMStabilityInspector;
 
+import org.apache.commons.lang3.ArrayUtils;
 import static com.google.common.base.Preconditions.checkArgument;
 import static com.google.common.base.Preconditions.checkState;
 import static com.google.common.base.Throwables.getStackTraceAsString;
@@ -63,7 +64,6 @@ import static com.google.common.collect.Lists.newArrayList;
 import static java.lang.Integer.parseInt;
 import static java.lang.String.format;
 import static org.apache.commons.lang3.ArrayUtils.EMPTY_STRING_ARRAY;
-import static org.apache.commons.lang3.ArrayUtils.isEmpty;
 import static org.apache.commons.lang3.StringUtils.*;
 
 public class NodeTool
@@ -1019,7 +1019,7 @@ public class NodeTool
 long[] estimatedRowSizeHistogram = 
store.getEstimatedRowSizeHistogram();
 long[] estimatedColumnCountHistogram = 
store.getEstimatedColumnCountHistogram();
 
-if (isEmpty(estimatedRowSizeHistogram) || 
isEmpty(estimatedColumnCountHistogram))
+if (ArrayUtils.isEmpty(estimatedRowSizeHistogram) || 
ArrayUtils.isEmpty(estimatedColumnCountHistogram))
 {
 System.err.println("No SSTables exists, unable to calculate 
'Partition Size' and 'Cell Count' percentiles");
 }



[jira] [Created] (CASSANDRA-8672) Ambiguous WriteTimeoutException while completing pending CAS commits

2015-01-23 Thread Stefan Podkowinski (JIRA)
Stefan Podkowinski created CASSANDRA-8672:
-

 Summary: Ambiguous WriteTimeoutException while completing pending 
CAS commits
 Key: CASSANDRA-8672
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8672
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Stefan Podkowinski
Priority: Minor


Any CAS update has a chance to trigger a pending/stalled commit of a 
previously agreed-on CAS update. After completing the pending commit, the CAS 
operation resumes to execute the actual update and possibly creates a new 
commit of its own. See StorageProxy.cas().

There are two possible execution paths that might end up throwing a 
WriteTimeoutException:
cas() -> beginAndRepairPaxos() -> commitPaxos()
cas() -> commitPaxos()

Unfortunately, clients catching a WriteTimeoutException cannot tell at which 
stage the commit failed. My guess is that most developers are not aware that 
beginAndRepairPaxos() can also trigger a write, and assume that a write 
timeout refers to a timeout while writing the actual CAS update. It is 
therefore not safe to assume that successive CAS or SERIAL read operations 
will cause a (write-)timed-out CAS operation to eventually get applied, 
although some [best-practices 
advice|http://www.datastax.com/dev/blog/cassandra-error-handling-done-right] 
claims otherwise.

At this point the safest bet is probably to retry the complete business 
transaction in case of a WriteTimeoutException. However, since there is a 
chance that the timeout occurred while writing the actual CAS operation, 
another write could potentially complete it, and our CAS condition may get a 
different result upon retry.
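
To make the ambiguity concrete, here is a hedged client-side sketch; the 
CasClient interface and every name in it are hypothetical (this is not a real 
driver API), and it only illustrates why a SERIAL re-read plus a re-check of 
the observed value is safer than assuming either outcome. The caveat from the 
last paragraph still applies: a concurrent writer could also have produced the 
value we observe.

{code:java}
// Hypothetical types for illustration only.
interface CasClient
{
    // Returns true if the CAS was applied; throws on a write timeout.
    boolean compareAndSet(String key, String expected, String newValue) throws WriteTimeoutException;

    // Reads at SERIAL consistency, which also completes any stalled Paxos commit.
    String readSerial(String key);
}

class WriteTimeoutException extends Exception {}

final class CasRetry
{
    // Returns true only when the row now holds the value we tried to write.
    static boolean casWithRecheck(CasClient client, String key, String expected, String newValue)
    {
        try
        {
            return client.compareAndSet(key, expected, newValue);
        }
        catch (WriteTimeoutException e)
        {
            // The timeout may have come from beginAndRepairPaxos() (re-committing a
            // stalled earlier round) or from commitPaxos() (our own update), so neither
            // "applied" nor "not applied" can be assumed. A SERIAL read settles any
            // in-flight commit; re-check the value instead of retrying blindly.
            return newValue.equals(client.readSerial(key));
        }
    }
}
{code}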



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8673) Row cache follow-ups

2015-01-23 Thread Robert Stupp (JIRA)
Robert Stupp created CASSANDRA-8673:
---

 Summary: Row cache follow-ups
 Key: CASSANDRA-8673
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8673
 Project: Cassandra
  Issue Type: New Feature
Reporter: Robert Stupp
 Fix For: 3.0, 3.1


We (Benedict, Ariel and I) had some offline discussion about the next steps to 
further improve the row cache committed for CASSANDRA-7438 and identified the 
following points.
This ticket is basically a note not to forget these topics. The individual 
points should be handled in separate (sub) tickets.

# Permit access to off-heap data without deserialization. This should be the 
biggest win to improve reads - effectively no more deserialization of the whole 
cached value from off-heap. [OHC issue #2|https://github.com/snazy/ohc/issues/2]
# Per-table-knob that decides whether changes are updated in the row cache on 
writes or not. Could be a win if you have a workload with frequent reads 
against a few hot partitions but write to many other partitions. Otherwise 
the row cache would fill up with useless data and effectively reduce cache hit 
ratio.
# Update {{cassandra.sh}} to preload jemalloc using {{LD_PRELOAD}} / 
{{DYLD_INSERT_LIBRARIES}} and use {{Unsafe}} for memory allocation. This 
removes JNA from the call stack. Additionally we should do this change in 
existing C* code for the same reason. (Note: JNA adds some overhead and has a 
synchronized block in each call going to be fixed in a future version - but 
it's not for free.) Feels like a LHF.
# Investigate whether key cache and counter cache can also use OHC. We could 
iterate towards a single cache implementation and maybe remove some code and 
decrease the potential number of configurations that can be run.
# Investigate whether _RowCacheSentinel_ can be replaced with something better 
/ more native. RowCacheSentinel's reason seems to be to avoid races with 
other update operations that would invalidate the row before it is inserted 
into the cache. It's a workaround for it not being write-through.
# Implement efficient off-heap memory allocator. (see below)

Not big wins:
* Allow serialization of hot keys during auto save. Since saving of cached keys 
is a task that only runs infrequently (if at all), the win would not be great. 
It feels like LHF, but the win is low IMO.
* Use another replacement strategy. We had some discussion about using something 
else instead of LRU (timestamp, 2Q, LIRS, LRU+random). But either the overhead 
to manage these strategies overwhelms the benefit, or the win would be too low.

LHFs (should be fixed in the next few days)
* don't use row cache in unit tests (currently enabled in 
test/conf/cassandra.yaml)
* don't print the whole class path when jemalloc is not available (prints a 40k 
class path on cassci for each unit test, since jemalloc is not available there 
- related to the previous point)

bq. As to incorporating memory management, I think we can actually do this very 
simply by merging it with our eviction strategy. If we allocate S arenas of 1/S 
(where S is the number of Segments), and partition each arena into pages of 
size K, we can make our eviction strategy operate over whole pages, instead of 
individual items. This probably won't have any significant impact on eviction, 
especially with small-ish pages. The only slight complexity is dealing with 
allocations spanning multiple pages, but that shouldn't be too tricky. The nice 
thing about this approach is that, like our other decisions, it is very easily 
made obviously correct. It also gives us great locality for operations, with a 
high likelihood of cache presence for each allocation.
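
As a toy, on-heap sketch of the page/arena idea in the quoted paragraph, purely 
to show eviction operating over whole pages instead of individual items: all 
names and sizes are invented, allocations spanning multiple pages and updates 
of existing keys are omitted, and this is not how OHC is actually implemented.

{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PagedCache
{
    static final int PAGE_SIZE = 4096;

    static final class Page
    {
        final byte[] slab = new byte[PAGE_SIZE];     // stand-in for a slice of an off-heap arena
        final List<String> keys = new ArrayList<>(); // every entry living on this page
        int used = 0;
    }

    static final class Location
    {
        final Page page; final int offset; final int length;
        Location(Page page, int offset, int length) { this.page = page; this.offset = offset; this.length = length; }
    }

    private final Map<String, Location> index = new HashMap<>();
    private final Deque<Page> pages = new ArrayDeque<>(); // oldest page first
    private final int maxPages;
    private Page current;

    PagedCache(int maxPages)
    {
        this.maxPages = maxPages;
        this.current = newPage();
    }

    void put(String key, byte[] value)
    {
        if (value.length > PAGE_SIZE)
            return; // allocations spanning multiple pages are omitted for brevity
        if (current.used + value.length > PAGE_SIZE)
            current = newPage();
        System.arraycopy(value, 0, current.slab, current.used, value.length);
        current.keys.add(key);
        index.put(key, new Location(current, current.used, value.length));
        current.used += value.length;
    }

    byte[] get(String key)
    {
        Location loc = index.get(key);
        if (loc == null)
            return null;
        byte[] copy = new byte[loc.length];
        System.arraycopy(loc.page.slab, loc.offset, copy, 0, loc.length);
        return copy;
    }

    private Page newPage()
    {
        if (pages.size() >= maxPages)
        {
            Page victim = pages.removeFirst();   // evict the oldest page wholesale
            for (String key : victim.keys)
                index.remove(key);               // all of its entries disappear together
        }
        Page page = new Page();
        pages.addLast(page);
        return page;
    }
}
{code}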



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8671) Give compaction strategy more control over where sstables are created, including for flushing and streaming.

2015-01-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14288972#comment-14288972
 ] 

Marcus Eriksson commented on CASSANDRA-8671:


We get halfway there in #7272 (the compaction writer interface)

 Give compaction strategy more control over where sstables are created, 
 including for flushing and streaming.
 

 Key: CASSANDRA-8671
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8671
 Project: Cassandra
  Issue Type: Improvement
Reporter: Blake Eggleston
 Fix For: 3.0


 This would enable routing different partitions to different disks based on 
 some user defined parameters.
 My initial take on how to do this would be to make an interface from 
 SSTableWriter, and have a table's compaction strategy do all SSTableWriter 
 instantiation. Compaction strategies could then implement their own 
 SSTableWriter implementations (which basically wrap one or more normal 
 sstablewriters) for compaction, flushing, and streaming. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6809) Compressed Commit Log

2015-01-23 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289242#comment-14289242
 ] 

Benedict commented on CASSANDRA-6809:
-

bq. Thank you, I did not realise you are interested in parallelism between 
segments only.

Well, I considered that a natural extension, i.e. a follow-up ticket. One I 
still consider reasonably straightforward to add: a mutator thread can 
partition the commit range once it's processed ~1Mb, and simply append the 
Callable to a shared queue. The sync thread can then drain this when it decides 
to initiate a sync.
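
A small, hedged sketch of that hand-off (names are invented and the real commit 
log code is far more careful about ordering and buffer ownership): mutator 
threads queue one compression task per ~1MB section, and the sync thread drains 
and runs them when it decides to sync.

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SectionHandoff
{
    static final int SECTION_BYTES = 1 << 20; // ~1MB of appended records per section

    private final Queue<Callable<ByteBuffer>> pending = new ConcurrentLinkedQueue<>();
    private final ExecutorService compressors = Executors.newFixedThreadPool(2);

    // Mutator thread: called once it has appended ~SECTION_BYTES to the segment buffer.
    void sectionReady(final ByteBuffer section)
    {
        pending.add(new Callable<ByteBuffer>()
        {
            public ByteBuffer call()
            {
                return compress(section); // placeholder for the actual compressor
            }
        });
    }

    // Sync thread: called when it decides to initiate a sync.
    List<ByteBuffer> drainAndCompress() throws Exception
    {
        List<Future<ByteBuffer>> futures = new ArrayList<>();
        for (Callable<ByteBuffer> task; (task = pending.poll()) != null; )
            futures.add(compressors.submit(task)); // sections compress in parallel
        List<ByteBuffer> compressed = new ArrayList<>();
        for (Future<ByteBuffer> f : futures)
            compressed.add(f.get());               // collected in submission order
        return compressed;
    }

    private static ByteBuffer compress(ByteBuffer raw)
    {
        return raw.duplicate(); // stand-in: real code would run LZ4/Snappy here
    }
}
{code}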

bq. I can see that this should work well enough with large sync periods, 
including the 10s default.

I'm reasonably confident this will work as well or better for all sync periods. 
In particular it better guarantees honouring the sync periods, and is less 
likely to encourage random write behaviour. Of course, the main benefit is its 
simplicity.

 Compressed Commit Log
 -

 Key: CASSANDRA-6809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
 Project: Cassandra
  Issue Type: Improvement
Reporter: Benedict
Assignee: Branimir Lambov
Priority: Minor
  Labels: performance
 Fix For: 3.0

 Attachments: ComitLogStress.java, logtest.txt


 It seems an unnecessary oversight that we don't compress the commit log. 
 Doing so should improve throughput, but some care will need to be taken to 
 ensure we use as much of a segment as possible. I propose decoupling the 
 writing of the records from the segments. Basically write into a (queue of) 
 DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X 
 MB written to the CL (where X is ordinarily CLS size), and then pack as many 
 of the compressed chunks into a CLS as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7560) 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession

2015-01-23 Thread Nick Bailey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289291#comment-14289291
 ] 

Nick Bailey commented on CASSANDRA-7560:


Is this a bug only for snapshot repair?

 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession
 --

 Key: CASSANDRA-7560
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7560
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Vladimir Avram
Assignee: Yuki Morishita
 Fix For: 2.0.10

 Attachments: 0001-backport-CASSANDRA-6747.patch, 
 0001-partial-backport-3569.patch, cassandra_daemon.log, 
 cassandra_daemon_rep1.log, cassandra_daemon_rep2.log, nodetool_command.log


 Running {{nodetool repair -pr}} will sometimes hang on one of the resulting 
 AntiEntropySessions.
 The system logs will show the repair command starting
 {noformat}
  INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) 
 Starting repair command #1, repairing 256 ranges for keyspace x
 {noformat}
 You can then see a few AntiEntropySessions completing with:
 {noformat}
 INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 
 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed 
 successfully
 {noformat}
 Finally we reach an AntiEntropySession at some point that hangs just before 
 requesting the merkle trees for the next column family in line for repair. So 
 we first see the previous CF being finished and the whole repair sessions 
 hangs here with no visible progress or errors on this or any of the related 
 nodes.
 {noformat}
 INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 
 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully 
 synced
 {noformat}
 Notes:
 * Single DC 6 node cluster with an average load of 86 GB per node.
 * This appears to be random; it does not always happen on the same CF or on 
 the same session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Lex Lythius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lex Lythius updated CASSANDRA-8675:
---
Description: 
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{code:sql}
cqlsh:test> create table if not exists copytest (id int primary key, t text);
cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " 
character');
cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test> select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote " character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test> copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test> copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test> select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote " character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{code}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with an error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal.

Until this is fixed, what would be the best way to rebuild an old table with a 
new, incompatible structure while maintaining its content and name, given that 
we can't rename tables?


  was:
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{code:sql}
cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{code}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name, since we can't 
rename tables?



 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cql
 Attachments: copytest.csv


 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 {code:sql}
 cqlsh:test> create table if not exists copytest (id int primary key, t text);
 cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline
 ... character');
 cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " 
 character');
 cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test> select * from copytest;
  id | t
 +-
   1 |   This has a newline\ncharacter
   2 |  

[1/6] cassandra git commit: Do not write commitlog from standalone tools

2015-01-23 Thread yukim
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-2.0 cc5fb19e5 -> 2bf63f61e
  refs/heads/cassandra-2.1 136abcc7a -> 3a5f79eb5
  refs/heads/trunk d4b23b059 -> 27ad2db02


Do not write commitlog from standalone tools

patch by yukim; reviewed by Tyler Hobbs for CASSANDRA-8616


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2bf63f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2bf63f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2bf63f61

Branch: refs/heads/cassandra-2.0
Commit: 2bf63f61e33587a8ba94fce3a1433e8cd866d1f0
Parents: cc5fb19
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jan 23 08:58:47 2015 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jan 23 08:58:47 2015 -0600

--
 .../cassandra/config/DatabaseDescriptor.java  | 18 --
 .../org/apache/cassandra/tools/SSTableExport.java |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java |  2 +-
 .../cassandra/tools/StandaloneScrubber.java   |  2 +-
 .../cassandra/tools/StandaloneSplitter.java   |  2 +-
 .../cassandra/tools/StandaloneUpgrader.java   |  2 +-
 6 files changed, 21 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 2bfdb16..286014e 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -516,9 +516,22 @@ public class DatabaseDescriptor
 return conf.dynamic_snitch ? new DynamicEndpointSnitch(snitch) : 
snitch;
 }
 
-/** load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances. */
+/**
+ * load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances.
+ * Schema version may be updated as the result.
+ */
 public static void loadSchemas()
 {
+loadSchemas(true);
+}
+
+/**
+ * Load schema definitions.
+ *
+ * @param updateVersion true if schema version needs to be updated
+ */
+public static void loadSchemas(boolean updateVersion)
+{
 ColumnFamilyStore schemaCFS = 
SystemKeyspace.schemaCFS(SystemKeyspace.SCHEMA_KEYSPACES_CF);
 
 // if keyspace with definitions is empty try loading the old way
@@ -536,7 +549,8 @@ public class DatabaseDescriptor
 Schema.instance.load(DefsTables.loadFromKeyspace());
 }
 
-Schema.instance.updateVersion();
+if (updateVersion)
+Schema.instance.updateVersion();
 }
 
 private static boolean hasExistingNoSystemTables()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java 
b/src/java/org/apache/cassandra/tools/SSTableExport.java
index f8b85c3..0b96924 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -448,7 +448,7 @@ public class SSTableExport
 String[] excludes = cmd.getOptionValues(EXCLUDEKEY_OPTION);
 String ssTableFileName = new File(cmd.getArgs()[0]).getAbsolutePath();
 
-DatabaseDescriptor.loadSchemas();
+DatabaseDescriptor.loadSchemas(false);
 Descriptor descriptor = Descriptor.fromFilename(ssTableFileName);
 
 // Start by validating keyspace name

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/SSTableImport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableImport.java 
b/src/java/org/apache/cassandra/tools/SSTableImport.java
index 11bfc81..3135fe6 100644
--- a/src/java/org/apache/cassandra/tools/SSTableImport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableImport.java
@@ -545,7 +545,7 @@ public class SSTableImport
 oldSCFormat = true;
 }
 
-DatabaseDescriptor.loadSchemas();
+DatabaseDescriptor.loadSchemas(false);
  if (Schema.instance.getNonSystemKeyspaces().size() < 1)
  {
  String msg = "no non-system keyspaces are defined";

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
--
diff --git 

[4/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2015-01-23 Thread yukim
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
src/java/org/apache/cassandra/tools/SSTableImport.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3a5f79eb
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3a5f79eb
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3a5f79eb

Branch: refs/heads/trunk
Commit: 3a5f79eb5856c714a94a05c17409221fd3ace1b0
Parents: 136abcc 2bf63f6
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jan 23 09:15:03 2015 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jan 23 09:15:03 2015 -0600

--
 .../cassandra/config/DatabaseDescriptor.java  | 18 --
 .../org/apache/cassandra/tools/SSTableExport.java |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java |  2 +-
 .../cassandra/tools/StandaloneScrubber.java   |  2 +-
 .../cassandra/tools/StandaloneSplitter.java   |  2 +-
 .../cassandra/tools/StandaloneUpgrader.java   |  2 +-
 6 files changed, 21 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/SSTableExport.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/SSTableImport.java
--
diff --cc src/java/org/apache/cassandra/tools/SSTableImport.java
index bdbebc1,3135fe6..87d52be
--- a/src/java/org/apache/cassandra/tools/SSTableImport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableImport.java
@@@ -501,7 -540,12 +501,7 @@@ public class SSTableImpor
  isSorted = true;
  }
  
- DatabaseDescriptor.loadSchemas();
 -if (cmd.hasOption(OLD_SC_FORMAT_OPTION))
 -{
 -oldSCFormat = true;
 -}
 -
+ DatabaseDescriptor.loadSchemas(false);
  if (Schema.instance.getNonSystemKeyspaces().size() < 1)
  {
  String msg = "no non-system keyspaces are defined";

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/StandaloneSplitter.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
--



[3/6] cassandra git commit: Do not write commitlog from standalone tools

2015-01-23 Thread yukim
Do not write commitlog from standalone tools

patch by yukim; reviewed by Tyler Hobbs for CASSANDRA-8616


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2bf63f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2bf63f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2bf63f61

Branch: refs/heads/trunk
Commit: 2bf63f61e33587a8ba94fce3a1433e8cd866d1f0
Parents: cc5fb19
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jan 23 08:58:47 2015 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jan 23 08:58:47 2015 -0600

--
 .../cassandra/config/DatabaseDescriptor.java  | 18 --
 .../org/apache/cassandra/tools/SSTableExport.java |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java |  2 +-
 .../cassandra/tools/StandaloneScrubber.java   |  2 +-
 .../cassandra/tools/StandaloneSplitter.java   |  2 +-
 .../cassandra/tools/StandaloneUpgrader.java   |  2 +-
 6 files changed, 21 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 2bfdb16..286014e 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -516,9 +516,22 @@ public class DatabaseDescriptor
 return conf.dynamic_snitch ? new DynamicEndpointSnitch(snitch) : 
snitch;
 }
 
-/** load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances. */
+/**
+ * load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances.
+ * Schema version may be updated as the result.
+ */
 public static void loadSchemas()
 {
+loadSchemas(true);
+}
+
+/**
+ * Load schema definitions.
+ *
+ * @param updateVersion true if schema version needs to be updated
+ */
+public static void loadSchemas(boolean updateVersion)
+{
 ColumnFamilyStore schemaCFS = 
SystemKeyspace.schemaCFS(SystemKeyspace.SCHEMA_KEYSPACES_CF);
 
 // if keyspace with definitions is empty try loading the old way
@@ -536,7 +549,8 @@ public class DatabaseDescriptor
 Schema.instance.load(DefsTables.loadFromKeyspace());
 }
 
-Schema.instance.updateVersion();
+if (updateVersion)
+Schema.instance.updateVersion();
 }
 
 private static boolean hasExistingNoSystemTables()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java 
b/src/java/org/apache/cassandra/tools/SSTableExport.java
index f8b85c3..0b96924 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -448,7 +448,7 @@ public class SSTableExport
 String[] excludes = cmd.getOptionValues(EXCLUDEKEY_OPTION);
 String ssTableFileName = new File(cmd.getArgs()[0]).getAbsolutePath();
 
-DatabaseDescriptor.loadSchemas();
+DatabaseDescriptor.loadSchemas(false);
 Descriptor descriptor = Descriptor.fromFilename(ssTableFileName);
 
 // Start by validating keyspace name

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/SSTableImport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableImport.java 
b/src/java/org/apache/cassandra/tools/SSTableImport.java
index 11bfc81..3135fe6 100644
--- a/src/java/org/apache/cassandra/tools/SSTableImport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableImport.java
@@ -545,7 +545,7 @@ public class SSTableImport
 oldSCFormat = true;
 }
 
-DatabaseDescriptor.loadSchemas();
+DatabaseDescriptor.loadSchemas(false);
  if (Schema.instance.getNonSystemKeyspaces().size() < 1)
  {
  String msg = "no non-system keyspaces are defined";

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
--
diff --git a/src/java/org/apache/cassandra/tools/StandaloneScrubber.java 
b/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
index 315e4e1..81dfdc3 100644
--- a/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
+++ 

[5/6] cassandra git commit: Merge branch 'cassandra-2.0' into cassandra-2.1

2015-01-23 Thread yukim
Merge branch 'cassandra-2.0' into cassandra-2.1

Conflicts:
src/java/org/apache/cassandra/tools/SSTableImport.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3a5f79eb
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3a5f79eb
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3a5f79eb

Branch: refs/heads/cassandra-2.1
Commit: 3a5f79eb5856c714a94a05c17409221fd3ace1b0
Parents: 136abcc 2bf63f6
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jan 23 09:15:03 2015 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jan 23 09:15:03 2015 -0600

--
 .../cassandra/config/DatabaseDescriptor.java  | 18 --
 .../org/apache/cassandra/tools/SSTableExport.java |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java |  2 +-
 .../cassandra/tools/StandaloneScrubber.java   |  2 +-
 .../cassandra/tools/StandaloneSplitter.java   |  2 +-
 .../cassandra/tools/StandaloneUpgrader.java   |  2 +-
 6 files changed, 21 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/SSTableExport.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/SSTableImport.java
--
diff --cc src/java/org/apache/cassandra/tools/SSTableImport.java
index bdbebc1,3135fe6..87d52be
--- a/src/java/org/apache/cassandra/tools/SSTableImport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableImport.java
@@@ -501,7 -540,12 +501,7 @@@ public class SSTableImpor
  isSorted = true;
  }
  
- DatabaseDescriptor.loadSchemas();
 -if (cmd.hasOption(OLD_SC_FORMAT_OPTION))
 -{
 -oldSCFormat = true;
 -}
 -
+ DatabaseDescriptor.loadSchemas(false);
  if (Schema.instance.getNonSystemKeyspaces().size() < 1)
  {
  String msg = "no non-system keyspaces are defined";

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/StandaloneSplitter.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/3a5f79eb/src/java/org/apache/cassandra/tools/StandaloneUpgrader.java
--



[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Lex Lythius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lex Lythius updated CASSANDRA-8675:
---
Description: 
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{code:sql}
cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{code}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name?


  was:
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)


I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name?



 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cql

 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 {code:sql}
 cqlsh:test create table if not exists copytest (id int primary key, t text);
 cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
 ... character');
 cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
 character');
 cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test select * from copytest;
  id | t
 +-
   1 |   This has a newline\ncharacter
   2 |This has a 

[jira] [Comment Edited] (CASSANDRA-7276) Include keyspace and table names in logs where possible

2015-01-23 Thread Philip Thompson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14287979#comment-14287979
 ] 

Philip Thompson edited comment on CASSANDRA-7276 at 1/23/15 3:50 PM:
-

[~nitzanv], sorry this ticket has taken so long to get reviewed. Just a small 
nit: please place all *{* and *}* characters on new lines, as indicated by 
http://wiki.apache.org/cassandra/CodeStyle.

Also, please include all of your changes in one patch file.
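
For reference, an illustrative (hypothetical) snippet in the brace style the CodeStyle page asks for, with opening and closing braces on their own lines:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BraceStyleExample
{
    private static final Logger logger = LoggerFactory.getLogger(BraceStyleExample.class);

    public void logWithKeyspace(String keyspace, String table)
    {
        if (logger.isDebugEnabled())
        {
            logger.debug("operating on {}.{}", keyspace, table);
        }
    }
}
{code}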


was (Author: philipthompson):
[~nitzanv], sorry this ticket has taken so long to get reviewed. Just a small 
nit, please place all *{* and *}* characters on new lines as indicated by 
http://wiki.apache.org/cassandra/CodeStyle .

 Include keyspace and table names in logs where possible
 ---

 Key: CASSANDRA-7276
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7276
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Tyler Hobbs
Assignee: Nitzan Volman
Priority: Minor
  Labels: bootcamp, lhf
 Fix For: 2.1.3

 Attachments: 2.1-CASSANDRA-7276-v1.txt, 
 cassandra-2.1-7276-compaction.txt, cassandra-2.1-7276.txt


 Most error messages and stacktraces give you no clue as to what keyspace or 
 table was causing the problem.  For example:
 {noformat}
 ERROR [MutationStage:61648] 2014-05-20 12:05:45,145 CassandraDaemon.java 
 (line 198) Exception in thread Thread[MutationStage:61648,5,main]
 java.lang.IllegalArgumentException
 at java.nio.Buffer.limit(Unknown Source)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.getBytes(AbstractCompositeType.java:63)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.getWithShortLength(AbstractCompositeType.java:72)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:98)
 at 
 org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35)
 at 
 edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538)
 at 
 edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108)
 at 
 edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059)
 at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023)
 at 
 edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985)
 at 
 org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:328)
 at 
 org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:200)
 at org.apache.cassandra.db.Memtable.resolve(Memtable.java:226)
 at org.apache.cassandra.db.Memtable.put(Memtable.java:173)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:893)
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:368)
 at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:333)
 at org.apache.cassandra.db.RowMutation.apply(RowMutation.java:206)
 at 
 org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:56)
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:60)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
 at java.lang.Thread.run(Unknown Source)
 {noformat}
 We should try to include info on the keyspace and column family in the error 
 messages or logs whenever possible.  This includes reads, writes, 
 compactions, flushes, repairs, and probably more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8676) commitlog_periodic_queue_size should not exist in 2.1+

2015-01-23 Thread Benedict (JIRA)
Benedict created CASSANDRA-8676:
---

 Summary: commitlog_periodic_queue_size should not exist in 2.1+
 Key: CASSANDRA-8676
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8676
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
Priority: Trivial
 Fix For: 2.1.3


This property was erroneously left in the yaml for 2.1. I will remove it from 
the yaml, and mark it deprecated in Config.java. If it could also be removed 
from the documentation, that would be great.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8366) Repair grows data on nodes, causes load to become unbalanced

2015-01-23 Thread Alan Boudreault (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Boudreault updated CASSANDRA-8366:
---
Attachment: run2_no_compact_before_repair.log
run1_with_compact_before_repair.log
run3_no_compact_before_repair.log
testv2.sh

[~krummas] I'm attaching a new version of the test script (testv2.sh). This 
one has some improvements and gives more details after each operation (it 
shows sstable sizes, waits properly for all compaction tasks to finish, displays 
streaming status, flushes nodes, cleans nodes, etc.).

I've run the script 3 times to see the differences.

* run1 is the only real successful result. The reason is that I compact all 
nodes right after the cassandra-stress operation. Apparently, this removed the 
need to repair, so everything is fine and at the end of the script all nodes 
are at the proper size (1.43G).

* run2 doesn't compact after the stress. The repair is then run and we only see 
the "Did not get a positive answer" message until the end of the node2 repair. So we 
can see that the keyspace r1 has been successfully repaired for node1 and 
node2. The repair for node3 failed, but it seems that the 2 other repairs have 
taken care of repairing things, so everything is OK at the end of the script (node 
size ~1.43G).

* run3 doesn't compact after the stress. This time, the repair fails at the 
beginning (the node1 repair call). This makes the node2 and node3 repairs fail 
too. After flushing + cleaning + compacting, all nodes have an extra 1G of 
data, and I don't know what it is. There is no streaming, all compaction 
is done, and it looks like I cannot get rid of it. This is not in the log, but I 
restarted my cluster again, then retried a full repair sequentially on all nodes, 
then re-cleaned and re-compacted, and nothing changed. I let the cluster run all 
night to be sure. I have not deleted this cluster, so if you need more 
information I just have to restart it.

Do you see anything wrong in my tests? Ping me on IRC if you want to discuss 
this ticket further.




 Repair grows data on nodes, causes load to become unbalanced
 

 Key: CASSANDRA-8366
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8366
 Project: Cassandra
  Issue Type: Bug
 Environment: 4 node cluster
 2.1.2 Cassandra
 Inserts and reads are done with CQL driver
Reporter: Jan Karlsson
Assignee: Alan Boudreault
 Attachments: results-1750_inc_repair.txt, 
 results-500_1_inc_repairs.txt, results-500_2_inc_repairs.txt, 
 results-500_full_repair_then_inc_repairs.txt, 
 results-500_inc_repairs_not_parallel.txt, 
 run1_with_compact_before_repair.log, run2_no_compact_before_repair.log, 
 run3_no_compact_before_repair.log, test.sh, testv2.sh


 There seems to be something weird going on when repairing data.
 I have a program that runs 2 hours which inserts 250 random numbers and reads 
 250 times per second. It creates 2 keyspaces with SimpleStrategy and RF of 3. 
 I use size-tiered compaction for my cluster. 
 After those 2 hours I run a repair and the load of all nodes goes up. If I 
 run incremental repair the load goes up alot more. I saw the load shoot up 8 
 times the original size multiple times with incremental repair. (from 2G to 
 16G)
 with node 9 8 7 and 6 the repro procedure looked like this:
 (Note that running full repair first is not a requirement to reproduce.)
 {noformat}
 After 2 hours of 250 reads + 250 writes per second:
 UN  9  583.39 MB  256 ?   28220962-26ae-4eeb-8027-99f96e377406  rack1
 UN  8  584.01 MB  256 ?   f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
 UN  7  583.72 MB  256 ?   2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
 UN  6  583.84 MB  256 ?   b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1
 Repair -pr -par on all nodes sequentially
 UN  9  746.29 MB  256 ?   28220962-26ae-4eeb-8027-99f96e377406  rack1
 UN  8  751.02 MB  256 ?   f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
 UN  7  748.89 MB  256 ?   2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
 UN  6  758.34 MB  256 ?   b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1
 repair -inc -par on all nodes sequentially
 UN  9  2.41 GB256 ?   28220962-26ae-4eeb-8027-99f96e377406  rack1
 UN  8  2.53 GB256 ?   f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
 UN  7  2.6 GB 256 ?   2b6b5d66-13c8-43d8-855c-290c0f3c3a0b  rack1
 UN  6  2.17 GB256 ?   b8bd67f1-a816-46ff-b4a4-136ad5af6d4b  rack1
 after rolling restart
 UN  9  1.47 GB256 ?   28220962-26ae-4eeb-8027-99f96e377406  rack1
 UN  8  1.5 GB 256 ?   f2de6ea1-de88-4056-8fde-42f9c476a090  rack1
 UN  7  2.46 GB256 ?   

[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Lex Lythius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lex Lythius updated CASSANDRA-8675:
---
Description: 
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{code:sql}
cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{code}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name, since we can't 
rename tables?


  was:
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{code:sql}
cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{code}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name?



 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cql
 Attachments: copytest.csv


 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 {code:sql}
 cqlsh:test create table if not exists copytest (id int primary key, t text);
 cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
 ... character');
 cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
 character');
 cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test select * from copytest;
  id | t
 +-
   1 |   

[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Lex Lythius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lex Lythius updated CASSANDRA-8675:
---
Attachment: copytest.csv

 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cql
 Attachments: copytest.csv


 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 {code:sql}
 cqlsh:test create table if not exists copytest (id int primary key, t text);
 cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
 ... character');
 cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
 character');
 cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test select * from copytest;
  id | t
 +-
   1 |   This has a newline\ncharacter
   2 |This has a quote  character
   3 | This has a fake tab \t character (entered slash-t text)
 (3 rows)
 cqlsh:test copy copytest to '/tmp/copytest.csv';
 3 rows exported in 0.034 seconds.
 cqlsh:test 
 cqlsh:test copy copytest to '/tmp/copytest.csv';
 3 rows exported in 0.034 seconds.
 cqlsh:test copy copytest from '/tmp/copytest.csv';
 3 rows imported in 0.005 seconds.
 cqlsh:test select * from copytest;
  id | t
 +---
   1 |  This has a newlinencharacter
   2 |  This has a quote  character
   3 | This has a fake tab \t character (typed backslash, t)
 (3 rows)
 {code}
 I tried replacing \n in the CSV file with \\n, which just expands to \n in 
 the table; and with an actual newline character, which fails with error since 
 it prematurely terminates the record.
 It seems backslashes are only used to take the following character as a 
 literal
 Until this is fixed, what would be the best way to refactor an old table with 
 a new, incompatible structure maintaining its content and name?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7560) 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession

2015-01-23 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289326#comment-14289326
 ] 

Yuki Morishita commented on CASSANDRA-7560:
---

For 2.0, yes, I believe.

 'nodetool repair -pr' leads to indefinitely hanging AntiEntropySession
 --

 Key: CASSANDRA-7560
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7560
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Vladimir Avram
Assignee: Yuki Morishita
 Fix For: 2.0.10

 Attachments: 0001-backport-CASSANDRA-6747.patch, 
 0001-partial-backport-3569.patch, cassandra_daemon.log, 
 cassandra_daemon_rep1.log, cassandra_daemon_rep2.log, nodetool_command.log


 Running {{nodetool repair -pr}} will sometimes hang on one of the resulting 
 AntiEntropySessions.
 The system logs will show the repair command starting
 {noformat}
  INFO [Thread-3079] 2014-07-15 02:22:56,514 StorageService.java (line 2569) 
 Starting repair command #1, repairing 256 ranges for keyspace x
 {noformat}
 You can then see a few AntiEntropySessions completing with:
 {noformat}
 INFO [AntiEntropySessions:2] 2014-07-15 02:28:12,766 RepairSession.java (line 
 282) [repair #eefb3c30-0bc6-11e4-83f7-a378978d0c49] session completed 
 successfully
 {noformat}
 Finally we reach an AntiEntropySession at some point that hangs just before 
 requesting the merkle trees for the next column family in line for repair. So 
 we first see the previous CF being finished and the whole repair sessions 
 hangs here with no visible progress or errors on this or any of the related 
 nodes.
 {noformat}
 INFO [AntiEntropyStage:1] 2014-07-15 02:38:20,325 RepairSession.java (line 
 221) [repair #8f85c1b0-0bc8-11e4-83f7-a378978d0c49] previous_cf is fully 
 synced
 {noformat}
 Notes:
 * Single DC 6 node cluster with an average load of 86 GB per node.
 * This appears to be random; it does not always happen on the same CF or on 
 the same session.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8668) We don't enforce offheap memory constraints; regression introduced by 7882

2015-01-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289339#comment-14289339
 ] 

Jonathan Ellis commented on CASSANDRA-8668:
---

+1

 We don't enforce offheap memory constraints; regression introduced by 7882
 --

 Key: CASSANDRA-8668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8668
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 2.1.3

 Attachments: 8668.txt


 Very simple mistake, not sure how it was introduced (looks like accidental 
 delete, or possibly a half-rolled back change). Introducing a unit test to 
 ensure basic functionality here is covered to catch such mistakes in future.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8561) Tombstone log warning does not log partition key

2015-01-23 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289357#comment-14289357
 ] 

Jonathan Ellis commented on CASSANDRA-8561:
---

It's not, because if you query for N rows of M columns each you can have a 
maximum of N * M cells shadowed.  (And no extra memory is used.)  Tombstones 
are evil because there's no upper limit to how many you'd have to scan, but 
they all have to be sent to the coordinator for read repair, consuming memory 
on the read thread.
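
For example, a slice query over 100 rows of 20 columns each can shadow at most 100 * 20 = 2,000 cells, a bound known in advance from the query itself, whereas nothing bounds how many tombstones a single read may have to scan and ship to the coordinator.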

 Tombstone log warning does not log partition key
 

 Key: CASSANDRA-8561
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8561
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Datastax DSE 4.5
Reporter: Jens Rantil
  Labels: logging
 Fix For: 2.1.3, 2.0.13


 AFAIK, the tombstone warning in system.log does not contain the primary key. 
 See: https://gist.github.com/JensRantil/44204676f4dbea79ea3a
 Including it would help a lot in diagnosing why the (CQL) row has so many 
 tombstones.
 Let me know if I have misunderstood something.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Lex Lythius (JIRA)
Lex Lythius created CASSANDRA-8675:
--

 Summary: COPY TO/FROM broken for newline characters
 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
protocol v3]
Ubuntu 14.04 64-bit
Reporter: Lex Lythius


Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{{code}}
cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{{/code}}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name?
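
For comparison, standard CSV handles embedded newlines by quoting the field (and doubling any embedded quotes) rather than backslash-escaping it, so a round trip should be able to preserve the text. A minimal, illustrative Java sketch of that quoting convention (not cqlsh's COPY implementation):

{code:java}
public class CsvQuotingDemo
{
    // RFC-4180 style: wrap the field in double quotes when it contains a
    // comma, quote or newline, and double any embedded quotes.
    static String quote(String field)
    {
        if (field.contains(",") || field.contains("\"") || field.contains("\n"))
            return "\"" + field.replace("\"", "\"\"") + "\"";
        return field;
    }

    public static void main(String[] args)
    {
        // Prints: 1,"This has a newline
        // character"  -- the newline survives inside the quoted field.
        System.out.println(quote("1") + "," + quote("This has a newline\ncharacter"));
    }
}
{code}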




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8348) allow takeColumnFamilySnapshot to take a list of ColumnFamilies

2015-01-23 Thread Sachin Janani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289198#comment-14289198
 ] 

Sachin Janani commented on CASSANDRA-8348:
--

Going ahead, I have implemented approach 2 and uploaded the patch. So nodetool now 
has a separate option on the snapshot command (--kc-list) where we can 
provide a list of column families from different keyspaces in the form ks.cf. But 
currently this does not support regular expressions in column family names like 
ks.c*, ks.*, etc.
Also [~hoangelos] [~nickmbailey], I have added a separate method 
takeMultipleColumnFamilySnapshot in StorageServiceBean.java for this 
option. Please let me know if any changes need to be made.
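
For illustration, a rough sketch (hypothetical helper, not the actual patch) of how ks.cf entries from such a --kc-list option could be grouped by keyspace before snapshotting; wildcards such as ks.c* are rejected, matching the limitation noted above:

{code:java}
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KcListSketch
{
    // ["ks1.cf1", "ks1.cf2", "ks2.cfA"] -> {ks1=[cf1, cf2], ks2=[cfA]}
    static Map<String, List<String>> groupByKeyspace(String... kcList)
    {
        Map<String, List<String>> byKeyspace = new LinkedHashMap<>();
        for (String entry : kcList)
        {
            int dot = entry.indexOf('.');
            if (dot <= 0 || dot == entry.length() - 1 || entry.contains("*"))
                throw new IllegalArgumentException("expected ks.cf, got: " + entry);
            byKeyspace.computeIfAbsent(entry.substring(0, dot), k -> new ArrayList<>())
                      .add(entry.substring(dot + 1));
        }
        return byKeyspace;
    }

    public static void main(String[] args)
    {
        System.out.println(groupByKeyspace("ks1.cf1", "ks1.cf2", "ks2.cfA"));
    }
}
{code}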

 allow takeColumnFamilySnapshot to take a list of ColumnFamilies
 ---

 Key: CASSANDRA-8348
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8348
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Halliday
Priority: Minor
 Fix For: 3.0, 2.1.3


 Within StorageServiceMBean.java the function takeSnapshot allows for a list 
 of keyspaces to snapshot.  However, the function takeColumnFamilySnapshot 
 only allows for a single ColumnFamily to snapshot.  This should allow for 
 multiple ColumnFamilies within the same Keyspace.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/6] cassandra git commit: Do not write commitlog from standalone tools

2015-01-23 Thread yukim
Do not write commitlog from standalone tools

patch by yukim; reviewed by Tyler Hobbs for CASSANDRA-8616


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/2bf63f61
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/2bf63f61
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/2bf63f61

Branch: refs/heads/cassandra-2.1
Commit: 2bf63f61e33587a8ba94fce3a1433e8cd866d1f0
Parents: cc5fb19
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jan 23 08:58:47 2015 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jan 23 08:58:47 2015 -0600

--
 .../cassandra/config/DatabaseDescriptor.java  | 18 --
 .../org/apache/cassandra/tools/SSTableExport.java |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java |  2 +-
 .../cassandra/tools/StandaloneScrubber.java   |  2 +-
 .../cassandra/tools/StandaloneSplitter.java   |  2 +-
 .../cassandra/tools/StandaloneUpgrader.java   |  2 +-
 6 files changed, 21 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
--
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 2bfdb16..286014e 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -516,9 +516,22 @@ public class DatabaseDescriptor
 return conf.dynamic_snitch ? new DynamicEndpointSnitch(snitch) : 
snitch;
 }
 
-/** load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances. */
+/**
+ * load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances.
+ * Schema version may be updated as the result.
+ */
 public static void loadSchemas()
 {
+loadSchemas(true);
+}
+
+/**
+ * Load schema definitions.
+ *
+ * @param updateVersion true if schema version needs to be updated
+ */
+public static void loadSchemas(boolean updateVersion)
+{
 ColumnFamilyStore schemaCFS = 
SystemKeyspace.schemaCFS(SystemKeyspace.SCHEMA_KEYSPACES_CF);
 
 // if keyspace with definitions is empty try loading the old way
@@ -536,7 +549,8 @@ public class DatabaseDescriptor
 Schema.instance.load(DefsTables.loadFromKeyspace());
 }
 
-Schema.instance.updateVersion();
+if (updateVersion)
+Schema.instance.updateVersion();
 }
 
 private static boolean hasExistingNoSystemTables()

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableExport.java 
b/src/java/org/apache/cassandra/tools/SSTableExport.java
index f8b85c3..0b96924 100644
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@ -448,7 +448,7 @@ public class SSTableExport
 String[] excludes = cmd.getOptionValues(EXCLUDEKEY_OPTION);
 String ssTableFileName = new File(cmd.getArgs()[0]).getAbsolutePath();
 
-DatabaseDescriptor.loadSchemas();
+DatabaseDescriptor.loadSchemas(false);
 Descriptor descriptor = Descriptor.fromFilename(ssTableFileName);
 
 // Start by validating keyspace name

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/SSTableImport.java
--
diff --git a/src/java/org/apache/cassandra/tools/SSTableImport.java 
b/src/java/org/apache/cassandra/tools/SSTableImport.java
index 11bfc81..3135fe6 100644
--- a/src/java/org/apache/cassandra/tools/SSTableImport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableImport.java
@@ -545,7 +545,7 @@ public class SSTableImport
 oldSCFormat = true;
 }
 
-DatabaseDescriptor.loadSchemas();
+DatabaseDescriptor.loadSchemas(false);
  if (Schema.instance.getNonSystemKeyspaces().size() < 1)
  {
  String msg = "no non-system keyspaces are defined";

http://git-wip-us.apache.org/repos/asf/cassandra/blob/2bf63f61/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
--
diff --git a/src/java/org/apache/cassandra/tools/StandaloneScrubber.java 
b/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
index 315e4e1..81dfdc3 100644
--- a/src/java/org/apache/cassandra/tools/StandaloneScrubber.java
+++ 

[6/6] cassandra git commit: Merge branch 'cassandra-2.1' into trunk

2015-01-23 Thread yukim
Merge branch 'cassandra-2.1' into trunk

Conflicts:
src/java/org/apache/cassandra/config/DatabaseDescriptor.java
src/java/org/apache/cassandra/tools/SSTableExport.java
src/java/org/apache/cassandra/tools/SSTableImport.java
src/java/org/apache/cassandra/tools/StandaloneScrubber.java
src/java/org/apache/cassandra/tools/StandaloneSplitter.java
src/java/org/apache/cassandra/tools/StandaloneUpgrader.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/27ad2db0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/27ad2db0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/27ad2db0

Branch: refs/heads/trunk
Commit: 27ad2db02f5ae0b3c59e817a7eb82163a4695f95
Parents: d4b23b0 3a5f79e
Author: Yuki Morishita yu...@apache.org
Authored: Fri Jan 23 09:27:53 2015 -0600
Committer: Yuki Morishita yu...@apache.org
Committed: Fri Jan 23 09:27:53 2015 -0600

--
 src/java/org/apache/cassandra/config/Schema.java  | 18 --
 .../org/apache/cassandra/tools/SSTableExport.java |  2 +-
 .../org/apache/cassandra/tools/SSTableImport.java |  2 +-
 .../cassandra/tools/SSTableLevelResetter.java |  2 +-
 .../cassandra/tools/StandaloneScrubber.java   |  2 +-
 .../cassandra/tools/StandaloneSplitter.java   |  2 +-
 .../cassandra/tools/StandaloneUpgrader.java   |  2 +-
 7 files changed, 22 insertions(+), 8 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/27ad2db0/src/java/org/apache/cassandra/config/Schema.java
--
diff --cc src/java/org/apache/cassandra/config/Schema.java
index 694c05c,8e9802f..af1b502
--- a/src/java/org/apache/cassandra/config/Schema.java
+++ b/src/java/org/apache/cassandra/config/Schema.java
@@@ -82,20 -80,10 +82,34 @@@ public class Schem
  }
  
  /**
 - * Initialize empty schema object
 + * Initialize empty schema object and load the hardcoded system tables
   */
  public Schema()
 -{}
 +{
 +load(SystemKeyspace.definition());
 +}
 +
- /** load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances. */
++/**
++ * load keyspace (keyspace) definitions, but do not initialize the 
keyspace instances.
++ * Schema version may be updated as the result.
++ */
 +public Schema loadFromDisk()
 +{
++return loadFromDisk(true);
++}
++
++/**
++ * Load schema definitions from disk.
++ *
++ * @param updateVersion true if schema version needs to be updated
++ */
++public Schema loadFromDisk(boolean updateVersion)
++{
 +load(LegacySchemaTables.readSchemaFromSystemTables());
- updateVersion();
++if (updateVersion)
++updateVersion();
 +return this;
 +}
  
  /**
   * Load up non-system keyspaces

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27ad2db0/src/java/org/apache/cassandra/tools/SSTableExport.java
--
diff --cc src/java/org/apache/cassandra/tools/SSTableExport.java
index 76bfa3b,a90f405..b62f516
--- a/src/java/org/apache/cassandra/tools/SSTableExport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableExport.java
@@@ -418,7 -419,7 +418,7 @@@ public class SSTableExpor
  String[] excludes = cmd.getOptionValues(EXCLUDEKEY_OPTION);
  String ssTableFileName = new File(cmd.getArgs()[0]).getAbsolutePath();
  
- Schema.instance.loadFromDisk();
 -DatabaseDescriptor.loadSchemas(false);
++Schema.instance.loadFromDisk(false);
  Descriptor descriptor = Descriptor.fromFilename(ssTableFileName);
  
  // Start by validating keyspace name

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27ad2db0/src/java/org/apache/cassandra/tools/SSTableImport.java
--
diff --cc src/java/org/apache/cassandra/tools/SSTableImport.java
index ee6bf59,87d52be..84613e9
--- a/src/java/org/apache/cassandra/tools/SSTableImport.java
+++ b/src/java/org/apache/cassandra/tools/SSTableImport.java
@@@ -501,7 -501,7 +501,7 @@@ public class SSTableImpor
  isSorted = true;
  }
  
- Schema.instance.loadFromDisk();
 -DatabaseDescriptor.loadSchemas(false);
++Schema.instance.loadFromDisk(false);
  if (Schema.instance.getNonSystemKeyspaces().size() < 1)
  {
  String msg = "no non-system keyspaces are defined";

http://git-wip-us.apache.org/repos/asf/cassandra/blob/27ad2db0/src/java/org/apache/cassandra/tools/SSTableLevelResetter.java
--
diff --cc 

[jira] [Commented] (CASSANDRA-7705) Safer Resource Management

2015-01-23 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289393#comment-14289393
 ] 

Marcus Eriksson commented on CASSANDRA-7705:


This LGTM for 2.1 inclusion - we have had so many issues related to the ref 
counting lately that we really need this.

Comments:

* There is an AbstractRefCounted class in the patch which is not used; I guess it 
was replaced by doing the implementation in the RefCounted interface instead.
* In SSTableLoader the Ref.sharedRef() is passed to the constructor in 
SSTableStreamingSections, which feels wrong - shouldn't we acquire a new Ref and 
have SSTSS release that once it is done with the sstable?
* Manager.extant - why a Map here? Could we use a Set?
* In RefState we directly access the CLQ in RefCountedState - could we 
encapsulate this and give the methods better names? Adding 'this' to a 
'refs' collection does not tell me much when everything is called ref* :)
* In general, I think it would be nicer not to put all the classes inside the 
RefCounted interface (at least the public ones); breaking them out into their 
own files would make it a bit easier to follow (but that might just be my personal 
preference).
* A few comments on the methods in RefCounted.* would help - especially Refs, as it is 
the most publicly visible.

Pushed a branch with a few nits fixed here: 
https://github.com/krummas/cassandra/commits/bes/7705-2.1
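
A minimal sketch of the scheme the ticket describes (illustrative names only, not the API in the patch): each acquisition gets its own handle, a handle can be released exactly once, and a second release fails immediately at the offending call site instead of silently corrupting the shared count:

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class SharedResource
{
    private final AtomicInteger count = new AtomicInteger(1);

    // Each caller receives its own single-use handle onto the shared count.
    Handle acquire()
    {
        count.incrementAndGet();
        return new Handle();
    }

    class Handle
    {
        private final AtomicBoolean released = new AtomicBoolean(false);

        void release()
        {
            // Fail at the second release of this handle rather than letting the
            // shared count be decremented twice.
            if (!released.compareAndSet(false, true))
                throw new IllegalStateException("reference released twice");
            if (count.decrementAndGet() == 0)
            {
                // free the underlying resource here
            }
        }
    }
}
{code}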

 Safer Resource Management
 -

 Key: CASSANDRA-7705
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7705
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Benedict
Assignee: Benedict
 Fix For: 3.0


 We've had a spate of bugs recently with bad reference counting. these can 
 have potentially dire consequences, generally either randomly deleting data 
 or giving us infinite loops. 
 Since in 2.1 we only reference count resources that are relatively expensive 
 and infrequently managed (or in places where this safety is probably not as 
 necessary, e.g. SerializingCache), we could without any negative consequences 
 (and only slight code complexity) introduce a safer resource management 
 scheme for these more expensive/infrequent actions.
 Basically, I propose when we want to acquire a resource we allocate an object 
 that manages the reference. This can only be released once; if it is released 
 twice, we fail immediately at the second release, reporting where the bug is 
 (rather than letting it continue fine until the next correct release corrupts 
 the count). The reference counter remains the same, but we obtain guarantees 
 that the reference count itself is never badly maintained, although code 
 using it could mistakenly release its own handle early (typically this is 
 only an issue when cleaning up after a failure, in which case under the new 
 scheme this would be an innocuous error)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-8675) COPY TO/FROM broken for newline characters

2015-01-23 Thread Lex Lythius (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lex Lythius updated CASSANDRA-8675:
---
Description: 
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)


I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name?


  was:
Exporting/importing does not preserve contents when texts containing newline 
(and possibly other) characters are involved:

{{code}}
cqlsh:test create table if not exists copytest (id int primary key, t text);
cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
... character');
cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
character');
cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
character (typed backslash, t)');
cqlsh:test select * from copytest;

 id | t
+-
  1 |   This has a newline\ncharacter
  2 |This has a quote  character
  3 | This has a fake tab \t character (entered slash-t text)

(3 rows)

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test 

cqlsh:test copy copytest to '/tmp/copytest.csv';

3 rows exported in 0.034 seconds.
cqlsh:test copy copytest from '/tmp/copytest.csv';

3 rows imported in 0.005 seconds.
cqlsh:test select * from copytest;

 id | t
+---
  1 |  This has a newlinencharacter
  2 |  This has a quote  character
  3 | This has a fake tab \t character (typed backslash, t)

(3 rows)
{{/code}}

I tried replacing \n in the CSV file with \\n, which just expands to \n in the 
table; and with an actual newline character, which fails with error since it 
prematurely terminates the record.

It seems backslashes are only used to take the following character as a literal

Until this is fixed, what would be the best way to refactor an old table with a 
new, incompatible structure maintaining its content and name?



 COPY TO/FROM broken for newline characters
 --

 Key: CASSANDRA-8675
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native 
 protocol v3]
 Ubuntu 14.04 64-bit
Reporter: Lex Lythius
  Labels: cql

 Exporting/importing does not preserve contents when texts containing newline 
 (and possibly other) characters are involved:
 cqlsh:test create table if not exists copytest (id int primary key, t text);
 cqlsh:test insert into copytest (id, t) values (1, 'This has a newline
 ... character');
 cqlsh:test insert into copytest (id, t) values (2, 'This has a quote  
 character');
 cqlsh:test insert into copytest (id, t) values (3, 'This has a fake tab \t 
 character (typed backslash, t)');
 cqlsh:test select * from copytest;
  id | t
 +-
   1 |   This has a newline\ncharacter
   2 |This has a quote  

[jira] [Comment Edited] (CASSANDRA-8366) Repair grows data on nodes, causes load to become unbalanced

2015-01-23 Thread Alan Boudreault (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14289415#comment-14289415
 ] 

Alan Boudreault edited comment on CASSANDRA-8366 at 1/23/15 3:51 PM:
-

[~krummas] I'm attaching a new version of the test script (testv2.sh). This 
one has some improvements and gives more details after each operation (it 
shows sstable sizes, waits properly for all compaction tasks to finish, displays 
streaming status, flushes nodes, cleans nodes, etc.).

I've run the script 3 times to see the differences.

* run1 is the only real successful result. The reason is that I compact all 
nodes right after the cassandra-stress operation. Apparently, this removed the 
need to repair, so everything is fine and at the end of the script all nodes 
are at the proper size (1.43G).

* run2 doesn't compact after the stress. The repair is then run and we only see 
the "Did not get a positive answer" message until the end of the node2 repair. So we 
can see that the keyspace r1 has been successfully repaired for node1 and 
node2. The repair for node3 failed, but it seems that the 2 other repairs have 
taken care of repairing things, so everything is OK at the end of the script (node 
size ~1.43G). However, this run grows node size significantly: 1.4G -> 9G 
(after the repair).

* run3 doesn't compact after the stress. This time, the repair fails at the 
beginning (the node1 repair call). This makes the node2 and node3 repairs fail 
too. After flushing + cleaning + compacting, all nodes have an extra 1G of 
data, and I don't know what it is. There is no streaming, all compaction 
is done, and it looks like I cannot get rid of it. This is not in the log, but I 
restarted my cluster again, then retried a full repair sequentially on all nodes, 
then re-cleaned and re-compacted, and nothing changed. I let the cluster run all 
night to be sure. I have not deleted this cluster, so if you need more 
information I just have to restart it.

Do you see anything wrong in my tests? Ping me on IRC if you want to discuss 
this ticket further.





was (Author: aboudreault):
[~krummas] I'm attaching a new version of the test script. (testv2.sh). This 
one has some improvements and gives more details after each operations (it 
shows sstable size, wait properly that all compaction tasks finish, display  
streaming status, it flushes nodes, it cleans nodes etc.).

I've run  3 times the script to see the differences. 

* run1 is the only real successful result. The reason is that I compact all 
nodes right after the cassandra-stress operation. Apparently, this removed the 
need to repair, so everything is fine and at the end of the script all nodes 
are at the proper size (1.43G).

* run2 doesn't compact after the stress. The repair is then ran and we only see 
the Did not get a positive answer until the end of the node2 repair. So we 
can see that the keyspace r1 has been successfully repaired for node1 and 
node2. The repair for node3 failed but it seems that the 2 other repairs have 
taken care to repair things so everything is OK at the end of the script. (node 
size ~1.43G)

* run3 doesn't compact after the stress. This time, the repair fails at the 
beginning (node1 repair call). This makes the node2 and node2 repairs fails 
too. After flushing + cleaning + compacting, all nodes have an extra 1G of 
data, which I don't know what they are. There is no streaming, all compaction 
is done and looks like I cannot get rid of them. This is not in the log, but I 
restarted my cluster again, then retried to full repair sequentially all nodes 
then re-cleaning, re-compacting and nothing changed. I let the cluster ran all 
night long to be sure. I have not deleted this cluster so if you need more 
information, I just have to restart it.

Do you see anything wrong in my tests? Ping me on IRC if you want to discuss 
more about this ticket. 




 Repair grows data on nodes, causes load to become unbalanced
 

 Key: CASSANDRA-8366
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8366
 Project: Cassandra
  Issue Type: Bug
 Environment: 4 node cluster
 2.1.2 Cassandra
 Inserts and reads are done with CQL driver
Reporter: Jan Karlsson
Assignee: Alan Boudreault
 Attachments: results-1750_inc_repair.txt, 
 results-500_1_inc_repairs.txt, results-500_2_inc_repairs.txt, 
 results-500_full_repair_then_inc_repairs.txt, 
 results-500_inc_repairs_not_parallel.txt, 
 run1_with_compact_before_repair.log, run2_no_compact_before_repair.log, 
 run3_no_compact_before_repair.log, test.sh, testv2.sh


 There seems to be something weird going on when repairing data.
 I have a program that runs 2 hours which inserts 250 random numbers and reads 
 250 times