date:20120821

[
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438475#comment-13438475
]

Jonathan Ellis commented on CASSANDRA-1123:
---

Dug into this pretty hard today. Thanks for the huge effort, David!

The good news is I think we can simplify this a great deal. The bad news is
that it's a lot of code churn. Sorry about that! We probably could have used
a little more back-and-forth before fleshing it out so completely.

What I saw as I tried to add trace points was that the TraceEvent/Builder api
was fairly cumbersome. What I really wanted to do was instrument all the debug
logging that we've accumulated, most of which has the battle scars of being
added to solve a particular tricky situation.

So I started banging on the API and realized that I was ending up with
something that resembled a logging appender. So why not hook into our existing
logging api? Like Aaron's original idea, but without any copy and pasting; we
can just implement a log4j LoggingAppender and life is good.

This does mean we don't have payload maps anywhere but in the session
initialization, which gets special-cased, but I'm fine with that.

I've pushed a sketch of this approach to
https://github.com/jbellis/cassandra/tree/1123-4. I can't devote another day
to finish it, so I'm going to have to throw it back to David, but I've
considerately left the parts that need work in a non-compiling state so it's
clear where to start. :)

(The main drawback is we have to be careful about where we log on the write
path to avoid an infinite loop, but this was a problem with the old approach as
well. At least infinite loops will be fairly obvious and easy to fix.)
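
A minimal sketch of the appender idea and of the infinite-loop hazard, using Python's standard logging module as a stand-in for log4j. Everything here is illustrative: `TraceHandler`, the thread-local `_state`, and the in-memory `events` list are hypothetical names, not code from the 1123-4 branch. The handler records a message only while a trace session is active on the current thread, and a re-entrancy flag drops any message logged by the storage step itself.

```python
import logging
import threading

# Hypothetical per-thread tracing state; not Cassandra's actual API.
_state = threading.local()


class TraceHandler(logging.Handler):
    """Collects log messages as trace events while a session is active."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.events = []  # stand-in for a write to a trace table

    def emit(self, record):
        session = getattr(_state, "session", None)
        if session is None or getattr(_state, "in_emit", False):
            return  # not tracing, or already inside emit: drop to avoid a loop
        _state.in_emit = True
        try:
            self.store(session, record.getMessage())
        finally:
            _state.in_emit = False

    def store(self, session, activity):
        self.events.append((session, activity))
        # A write path that logs would recurse here without the guard above.
        logging.getLogger("demo.write").debug("stored trace event")


def demo_trace():
    logger = logging.getLogger("demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep the demo away from the root logger
    handler = TraceHandler()
    logger.addHandler(handler)
    try:
        logger.debug("ignored: no session is active yet")
        _state.session = "session-1"
        logger.debug("row cache lookup")
        logger.debug("merging data from sstables")
    finally:
        _state.session = None
        logger.removeHandler(handler)
    return handler.events
```

The guard is deliberately crude: it silently drops nested messages rather than queueing them, which keeps a logging call on the write path from recursing.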

Some other points:
- I'm violently opposed to serializing thrift objects into the trace. This
just pushes the job of making them human-readable out to each consumer. Let's
solve this once at the server level instead. One approach is given (but not
completed) in my tree. Note that I'm okay with being *barely* human-readable;
the main use case we're concerned with is tracing queries interactively, in
which case we already know the parameters and the logging is a formality. I do
want it to be *possible* to reconstruct a problematic query detected by
probabilistic sampling, but it doesn't have to be easy. (That said, turning
hex back into cli or cqlsh is at least possible without writing a Thrift
deserializer first, so it has that much of an advantage over the initial patch.)
- I'm fine with giving up some structure in exchange for ease of use. A single
activity column, with some metadata, is enough.
- I'm also fine with giving up any or all of the pretty printer, summary by
request type, and tests, which I've made no effort to port. Would strongly
prefer getting a bare bones implementation finished, then adding more
functionality later. (I do note that the pretty printer probably makes more
sense to grow vertically rather than horizontally.)
- Enabling tracing-by-probability should be in JMX rather than Thrift. Unsure
if max-sessions-to-trace is useful.
- I think you will like the improvements around DTPE and the Stage.
- Wiring up appenders at different levels (we'd want the file appender at INFO,
the tracing one at DEBUG) is kind of a bitch. Apologies in advance. Possibly
useful:
http://stackoverflow.com/questions/2154539/log4j-log-level-per-appender-for-a-single-logger
http://stackoverflow.com/questions/751431/how-can-i-direct-log4j-output-so-that-different-log-levels-go-to-different-appen
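
The per-appender level wiring from the last point can be sketched as follows, again with Python's standard logging module standing in for log4j (an assumption made for illustration; the log4j configuration itself is what the links above cover). The logger is left wide open at DEBUG and each handler applies its own threshold, so the "file" sink sees INFO and above while the "tracing" sink sees everything. The logger name and the in-memory streams are hypothetical.

```python
import io
import logging


def build_logger(file_stream, trace_stream):
    logger = logging.getLogger("levels-demo")
    logger.setLevel(logging.DEBUG)  # the logger itself filters nothing out
    logger.propagate = False
    logger.handlers.clear()

    file_handler = logging.StreamHandler(file_stream)
    file_handler.setLevel(logging.INFO)  # "file appender at INFO"

    trace_handler = logging.StreamHandler(trace_stream)
    trace_handler.setLevel(logging.DEBUG)  # "tracing one at DEBUG"

    for handler in (file_handler, trace_handler):
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def demo_levels():
    file_out, trace_out = io.StringIO(), io.StringIO()
    logger = build_logger(file_out, trace_out)
    logger.debug("reading key from sstable")
    logger.info("flush complete")
    return file_out.getvalue().splitlines(), trace_out.getvalue().splitlines()
```

The design point is that the threshold lives on the sink rather than on the logger, so one logging call can feed both destinations.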

Think that about covers it, but it's late and I could have missed something.

Allow tracing query details
---

Key: CASSANDRA-1123
URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
Fix For: 1.2.0

Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch

In the spirit of CASSANDRA-511, it would be useful to enable tracing on queries to
see where latency is coming from: how long did row cache lookup take? key
search in the index? merging the data from the sstables? etc.
The main difference vs setting debug logging is that debug logging is too big
of a hammer; by turning on the flood of logging for everyone, you actually
distort the information you're looking for. This would be something you
could set per-query (or more likely per connection).
We don't need to be as sophisticated as the techniques discussed in the
following papers but they are interesting reading:
http://research.google.com/pubs/pub36356.html
http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
http://www.usenix.org/event/nsdi07/tech/fonseca.html


[jira] [Created] (CASSANDRA-4561) update column family fails

2012-08-21 Thread Zenek Kraweznik (JIRA)

Zenek Kraweznik created CASSANDRA-4561:
--

 Summary: update column family fails
 Key: CASSANDRA-4561
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4561
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.3, 1.1.2, 1.1.1, 1.1.0, 1.1.4
Reporter: Zenek Kraweznik
Priority: Blocker


[default@test] show schema;
create column family Messages
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 2
  and max_compaction_threshold = 4
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compaction_strategy_options = {'sstable_size_in_mb' : '1024'}
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.DeflateCompressor'};


[default@test] update column family Messages with min_compaction_threshold = 4 and max_compaction_threshold = 32;
a5b7544e-1ef5-3bfd-8770-c09594e37ec2
Waiting for schema agreement...
... schemas agree across the cluster

[default@test] show schema;
create column family Messages
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 2
  and max_compaction_threshold = 4
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and compaction_strategy_options = {'sstable_size_in_mb' : '1024'}
  and compression_options = {'chunk_length_kb' : '64', 'sstable_compression' : 'org.apache.cassandra.io.compress.DeflateCompressor'};

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-4561) update column family fails

2012-08-21 Thread Zenek Kraweznik (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438558#comment-13438558
 ] 

Zenek Kraweznik commented on CASSANDRA-4561:


In the logfile I see only this:
 INFO [MigrationStage:1] 2012-08-21 11:27:55,560 ColumnFamilyStore.java (line 659) Enqueuing flush of Memtable-schema_columnfamilies@970905946(1266/1582 serialized/live bytes, 20 ops)
 INFO [FlushWriter:5] 2012-08-21 11:27:55,561 Memtable.java (line 264) Writing Memtable-schema_columnfamilies@970905946(1266/1582 serialized/live bytes, 20 ops)
 INFO [FlushWriter:5] 2012-08-21 11:27:55,587 Memtable.java (line 305) Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-he-196-Data.db (1336 bytes) for commitlog position ReplayPosition(segmentId=4914817711083622, position=333055)


[jira] [Commented] (CASSANDRA-4527) Issue with CQL and ALTER TABLE DROP

2012-08-21 Thread Marcos Dione (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438580#comment-13438580
 ] 

Marcos Dione commented on CASSANDRA-4527:
-

this is a dupe of CASSANDRA-4526

 Issue with CQL and ALTER TABLE DROP
 ---

 Key: CASSANDRA-4527
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4527
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.2
 Environment: Ubuntu 12.04
 java version 1.6.0_33
 Java(TM) SE Runtime Environment (build 1.6.0_33-b04)
 Java HotSpot(TM) 64-Bit Server VM (build 20.8-b03, mixed mode)
Reporter: Victor Penela
  Labels: cql

 Creating a CF in cqlsh -3
 CREATE COLUMNFAMILY ads_config (c_id uuid, ct_id uuid, pt_id uuid, c_type 
 int, creat blob, start timestamp, end timestamp, total int, pending int, 
 PRIMARY KEY ( campaign_id, creat_id, placement_id));
 INSERT INTO works fine. SELECT * works fine.
 ALTER TABLE ads_config add cost int;
 SELECT * works fine, new field is null.
 ALTER TABLE ads_config drop cost;
 Gives a:
 TSocket read 0 bytes
 After that the connection seems to die, giving me the same error a couple of 
 times and then a broken pipe:
 Traceback (most recent call last):
   File "/usr/bin/cqlsh", line 1008, in perform_statement
     self.cursor.execute(statement, decoder=decoder)
   File "/usr/share/cassandra/lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cursor.py", line 117, in execute
     response = self.handle_cql_execution_errors(doquery, prepared_q, compress)
   File "/usr/share/cassandra/lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cursor.py", line 132, in handle_cql_execution_errors
     return executor(*args, **kwargs)
   File "/usr/share/cassandra/lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cassandra/Cassandra.py", line 1583, in execute_cql_query
     self.send_execute_cql_query(query, compression)
   File "/usr/share/cassandra/lib/cql-internal-only-1.0.10.zip/cql-1.0.10/cql/cassandra/Cassandra.py", line 1593, in send_execute_cql_query
     self._oprot.trans.flush()
   File "/usr/share/cassandra/lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TTransport.py", line 293, in flush
     self.__trans.write(buf)
   File "/usr/share/cassandra/lib/thrift-python-internal-only-0.7.0.zip/thrift/transport/TSocket.py", line 117, in write
     plus = self.handle.send(buff)
 error: [Errno 32] Broken pipe
 Closing and reopening cql works fine. The column key exists and can't be 
 dropped.
 Error log:
  INFO 12:08:40,632 Enqueuing flush of Memtable-schema_columnfamilies@445620464(1428/1785 serialized/live bytes, 20 ops)
  INFO 12:08:40,633 Writing Memtable-schema_columnfamilies@445620464(1428/1785 serialized/live bytes, 20 ops)
  INFO 12:08:40,696 Completed flushing /var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-4-Data.db (1487 bytes) for commitlog position ReplayPosition(segmentId=180414134565599, position=11928)
  INFO 12:08:40,697 Enqueuing flush of Memtable-schema_columns@1158801519(222/277 serialized/live bytes, 4 ops)
  INFO 12:08:40,697 Writing Memtable-schema_columns@1158801519(222/277 serialized/live bytes, 4 ops)
  INFO 12:08:40,704 Compacting [SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-1-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-3-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-2-Data.db'), SSTableReader(path='/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-4-Data.db')]
  INFO 12:08:40,731 Completed flushing /var/lib/cassandra/data/system/schema_columns/system-schema_columns-hd-4-Data.db (273 bytes) for commitlog position ReplayPosition(segmentId=180414134565599, position=11928)
  INFO 12:08:40,761 Compacted to [/var/lib/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-hd-5-Data.db,].  5,892 to 2,855 (~48% of original) bytes for 1 keys at 0.050421MB/s.  Time: 54ms.
 ERROR 12:08:40,780 Exception in thread Thread[MigrationStage:1,5,main]
 java.lang.NullPointerException
   at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:167)
   at org.apache.cassandra.utils.ByteBufferUtil.string(ByteBufferUtil.java:124)
   at org.apache.cassandra.cql.jdbc.JdbcUTF8.getString(JdbcUTF8.java:77)
   at org.apache.cassandra.cql.jdbc.JdbcUTF8.compose(JdbcUTF8.java:97)
   at org.apache.cassandra.db.marshal.UTF8Type.compose(UTF8Type.java:35)
   at org.apache.cassandra.cql3.UntypedResultSet$Row.getString(UntypedResultSet.java:87)
   at

[jira] [Commented] (CASSANDRA-4527) Issue with CQL and ALTER TABLE DROP

2012-08-21 Thread Victor Penela (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438591#comment-13438591
 ] 

Victor Penela commented on CASSANDRA-4527:
--

Sorry, there was an issue with Jira and it seems that it did post the issue 
even though it gave me an error :/

Thanks!


[jira] [Resolved] (CASSANDRA-4527) Issue with CQL and ALTER TABLE DROP

2012-08-21 Thread Victor Penela (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Penela resolved CASSANDRA-4527.
--

Resolution: Duplicate

There was an issue with Jira when creating the issue. Sorry for the dupe!


[jira] [Updated] (CASSANDRA-3763) compactionstats throws ArithmeticException: / by zero

2012-08-21 Thread Zenek Kraweznik (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zenek Kraweznik updated CASSANDRA-3763:
---

Affects Version/s: 1.1.4
   1.1.3

 compactionstats throws ArithmeticException: / by zero
 -

 Key: CASSANDRA-3763
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3763
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Affects Versions: 1.0.7, 1.0.8, 1.0.9, 1.0.10, 1.1.0, 1.1.1, 1.1.2, 1.1.3, 
 1.1.4
 Environment: debian linux - openvz kernel, oracle java 1.6.0.26
Reporter: Zenek Kraweznik
Priority: Trivial

 compactionstats looks like this:
 # nodetool -h localhost compactionstats
 Exception in thread "main" java.lang.ArithmeticException: / by zero
     at org.apache.cassandra.db.compaction.LeveledManifest.getEstimatedTasks(LeveledManifest.java:435)
     at org.apache.cassandra.db.compaction.LeveledCompactionStrategy.getEstimatedRemainingTasks(LeveledCompactionStrategy.java:128)
     at org.apache.cassandra.db.compaction.CompactionManager.getPendingTasks(CompactionManager.java:1060)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
     at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
     at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
     at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
     at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265)
     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360)
     at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:600)
     at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
     at sun.rmi.transport.Transport$1.run(Transport.java:159)
     at java.security.AccessController.doPrivileged(Native Method)
     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     at java.lang.Thread.run(Thread.java:662)
 #
 nodetool is working fine in other actions:
 # nodetool -h localhost netstats
 Mode: NORMAL
 Not sending any streams.
 Not receiving any streams.
 Pool Name                    Active   Pending      Completed
 Commands                        n/a         0              2
 Responses                       n/a         0           1810
 #


[jira] [Commented] (CASSANDRA-4245) Provide a UT8Type (case insensitive) comparator

2012-08-21 Thread JIRA


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13438694#comment-13438694
 ] 

André Cruz commented on CASSANDRA-4245:
---

I'm also interested in a UTF-8 comparator that orders columns alphabetically. 
In fact, I was expecting this to be the default behaviour in Cassandra until it 
bit me. For example, with 3 columns: André, Zeus and Ándré.

I was expecting:
André
Ándré
Zeus

The result was:
André
Zeus
Ándré

This is what's being discussed in this issue, right?
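
The ordering reported above is what a plain byte-wise comparison of UTF-8 encodings produces, and it can be reproduced in a few lines. The sketch below is illustrative only: `fold` is a hypothetical helper that strips accents and case so that accented and unaccented letters sort together. It is not a full locale-aware collation, and it is not the comparator proposed in this issue.

```python
import unicodedata

# Escapes keep the literals in composed (NFC) form regardless of editor.
ANDRE = "Andr\u00e9"          # André
ANDRE_ACUTE = "\u00c1ndr\u00e9"  # Ándré
ZEUS = "Zeus"


def fold(name):
    """Strip combining accents and case; a rough stand-in for collation."""
    decomposed = unicodedata.normalize("NFD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


names = [ZEUS, ANDRE_ACUTE, ANDRE]

# Byte order: 'A' (0x41) < 'Z' (0x5A) < first byte of 'Á' (0xC3).
byte_order = sorted(names, key=lambda s: s.encode("utf-8"))

# Folded order, with the original string as a tie-breaker.
folded_order = sorted(names, key=lambda s: (fold(s), s))
```

Here `byte_order` matches the observed result (André, Zeus, Ándré) and `folded_order` matches the expected one (André, Ándré, Zeus).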

 Provide a UT8Type (case insensitive) comparator
 ---

 Key: CASSANDRA-4245
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4245
 Project: Cassandra
  Issue Type: New Feature
Reporter: Ertio Lew
Assignee: Aaron Morton
Priority: Minor

 It is a common use case to use a bunch of entity names as column names & then
 use the row as a search index, using search by range. For such use cases &
 others, it is useful to have a UTF8 comparator that provides case insensitive
 ordering of columns.


[jira] [Created] (CASSANDRA-4562) Cli getting odd states for Currently building index

Jeremy Hanna created CASSANDRA-4562:
---

 Summary: Cli getting odd states for Currently building index
 Key: CASSANDRA-4562
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4562
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Reporter: Jeremy Hanna
Priority: Minor


Whenever the cli outputs keyspace/column family data, if it's building an 
index, it will show the status of that build at the bottom of the output.  It 
looks like it's sometimes getting into a bad state.  One person reported seeing:
Currently building index index_name, completed d != java.lang.String


[jira] [Created] (CASSANDRA-4563) Remove nodetool setcachecapcity

Jeremy Hanna created CASSANDRA-4563:
---

 Summary: Remove nodetool setcachecapcity
 Key: CASSANDRA-4563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4563
 Project: Cassandra
  Issue Type: Task
  Components: Core
Affects Versions: 1.1.3
Reporter: Jeremy Hanna
Priority: Minor


nodetool setcachecapacity is now obsolete so it should be removed as it 
confuses users.


[2/3] git commit: upgradesstables recommended for #4436

2012-08-21 Thread jbellis

upgradesstables recommended for #4436


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5655d972
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5655d972
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5655d972

Branch: refs/heads/trunk
Commit: 5655d97225b9873514a8b67da9a5d22357d05220
Parents: 7db46ef
Author: Jonathan Ellis jbel...@apache.org
Authored: Tue Aug 21 11:26:55 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Tue Aug 21 11:27:21 2012 -0500

--
 NEWS.txt |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5655d972/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index a393c2d..a127ead 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -14,8 +14,8 @@ by version X, but the inverse is not necessarily the case.)
 
 Upgrading
 -
-- Nothing specific to this release, but please see 1.1 if you are upgrading
-  from a previous version.
+- Running nodetool upgradesstables after upgrading is recommended
+  if you use Counter columnfamilies.
 
 Features

[1/3] git commit: Merge branch 'cassandra-1.1' into trunk

2012-08-21 Thread jbellis

Updated Branches:
  refs/heads/cassandra-1.1 7db46ef80 -> 5655d9722
  refs/heads/trunk 413a177df -> dafcaeb06


Merge branch 'cassandra-1.1' into trunk


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/dafcaeb0
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/dafcaeb0
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/dafcaeb0

Branch: refs/heads/trunk
Commit: dafcaeb06103eb6e86ada1798ef6ce5cc4e87dac
Parents: 413a177 5655d97
Author: Jonathan Ellis jbel...@apache.org
Authored: Tue Aug 21 11:27:47 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Tue Aug 21 11:27:47 2012 -0500

--
 NEWS.txt |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/dafcaeb0/NEWS.txt
--

[3/3] git commit: upgradesstables recommended for #4436

2012-08-21 Thread jbellis

upgradesstables recommended for #4436


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/5655d972
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/5655d972
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/5655d972

Branch: refs/heads/cassandra-1.1
Commit: 5655d97225b9873514a8b67da9a5d22357d05220
Parents: 7db46ef
Author: Jonathan Ellis jbel...@apache.org
Authored: Tue Aug 21 11:26:55 2012 -0500
Committer: Jonathan Ellis jbel...@apache.org
Committed: Tue Aug 21 11:27:21 2012 -0500

--
 NEWS.txt |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/5655d972/NEWS.txt
--
diff --git a/NEWS.txt b/NEWS.txt
index a393c2d..a127ead 100644
--- a/NEWS.txt
+++ b/NEWS.txt
@@ -14,8 +14,8 @@ by version X, but the inverse is not necessarily the case.)
 
 Upgrading
 -
-- Nothing specific to this release, but please see 1.1 if you are upgrading
-  from a previous version.
+- Running nodetool upgradesstables after upgrading is recommended
+  if you use Counter columnfamilies.
 
 Features

[jira] [Commented] (CASSANDRA-4562) Cli getting odd states for Currently building index


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438837#comment-13438837
 ] 

Jonathan Ellis commented on CASSANDRA-4562:
---

"Whenever" sounds like it's reproducible. Is that the case?

 Cli getting odd states for Currently building index
 -

 Key: CASSANDRA-4562
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4562
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Reporter: Jeremy Hanna
Priority: Minor

 Whenever the cli outputs keyspace/column family data, if it's building an 
 index, it will show the status of that build at the bottom of the output.  It 
 looks like it's sometimes getting into a bad state.  One person reported 
 seeing:
 Currently building index index_name, completed d != java.lang.String

[jira] [Updated] (CASSANDRA-4563) Remove nodetool setcachecapcity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4563:
--

 Reviewer: xedin
Affects Version/s: (was: 1.1.3)
   1.1.0
Fix Version/s: 1.1.5
 Assignee: Brandon Williams

 Remove nodetool setcachecapcity
 ---

 Key: CASSANDRA-4563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4563
 Project: Cassandra
  Issue Type: Task
  Components: Core
Affects Versions: 1.1.0
Reporter: Jeremy Hanna
Assignee: Brandon Williams
Priority: Minor
  Labels: nodetool
 Fix For: 1.1.5


 nodetool setcachecapacity is now obsolete so it should be removed as it 
 confuses users.

[jira] [Updated] (CASSANDRA-4498) Remove openjdk-6-jre Cassandra APT dependencies


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4498:
--

Fix Version/s: (was: 1.1.4)
   1.1.5

 Remove openjdk-6-jre Cassandra APT dependencies
 ---

 Key: CASSANDRA-4498
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4498
 Project: Cassandra
  Issue Type: Improvement
Reporter: Terrance Shepherd
Assignee: paul cannon
Priority: Minor
  Labels: debian
 Fix For: 1.1.5, 1.2.0

 Attachments: apache_cassandra_Packages.diff


 As is well known, the recommended JRE for Cassandra is Sun Java 1.6, but at 
 this point that package is no longer in the Debian or Ubuntu apt repos. In order 
 to run Cassandra with the Sun Java 1.6 JRE, it must be installed manually, 
 outside the repos. Because of this, when you install Cassandra via the Apache or 
 DataStax apt repos, openjdk-6-jre is also installed even though the Sun Java 
 1.6 JRE is already present.
 I would suggest that the Java apt dependencies be removed from the Depends 
 field in the package configuration and moved to either the Recommends or Suggests 
 field, so that openjdk is not downloaded when it is not necessary and does not 
 interfere with a pre-installed JRE.

[jira] [Updated] (CASSANDRA-4563) Remove nodetool setcachecapcity


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-4563:
---

Reviewer: brandon.williams  (was: xedin)
Assignee: Pavel Yaskevich  (was: Brandon Williams)

 Remove nodetool setcachecapcity
 ---

 Key: CASSANDRA-4563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4563
 Project: Cassandra
  Issue Type: Task
  Components: Core
Affects Versions: 1.1.0
Reporter: Jeremy Hanna
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: nodetool
 Fix For: 1.1.5


 nodetool setcachecapacity is now obsolete so it should be removed as it 
 confuses users.

[jira] [Commented] (CASSANDRA-4533) Multithreaded cache saving can skip caches

[
https://issues.apache.org/jira/browse/CASSANDRA-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438996#comment-13438996
]

Jonathan Ellis commented on CASSANDRA-4533:
---

Hmm, I don't think this quite works because it still means we can skip saving
cache for CF X when CF Y is being flushed.

I think the problem this code is trying to solve, over a basic executor +
queue, is multiple tasks for X getting queued up while (say) compaction is
sucking a lot of i/o, then firing off those cache-save tasks for X faster than
the defined saving period when it speeds up.

I guess we could make it a Pair<CF, CacheType>?

TBH this is probably premature optimization. If your cache period is so
frequent that multiple queued tasks are a problem, then you should just fix
that. I'd be okay with just ripping this out. Alternatively, we could have
the task check to see if the last-saved cache is older than M minutes before
overwriting it, similar to how normal background compaction submissions are a
no-op if it turns out there's nothing to do by the time we execute the task.
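
The last alternative can be sketched as follows. This is a minimal illustration, not Cassandra's code: CacheSaveTask, minIntervalMillis, and the injected clock are hypothetical names. Each queued task checks how long ago the cache was last saved and becomes a no-op if that save is still fresh, so a backlog of tasks for the same cache cannot write faster than the configured period.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch: a queued cache-save task that does nothing when the
// cache was already saved within the configured period.
final class CacheSaveTask implements Runnable {
    private final AtomicLong lastSavedAtMillis; // shared by all tasks for one cache
    private final long minIntervalMillis;       // the configured saving period
    private final LongSupplier clock;           // injected so the check is testable
    private final Runnable writeCache;          // the actual (expensive) save

    CacheSaveTask(AtomicLong lastSavedAtMillis, long minIntervalMillis,
                  LongSupplier clock, Runnable writeCache) {
        this.lastSavedAtMillis = lastSavedAtMillis;
        this.minIntervalMillis = minIntervalMillis;
        this.clock = clock;
        this.writeCache = writeCache;
    }

    @Override
    public void run() {
        long now = clock.getAsLong();
        long last = lastSavedAtMillis.get();
        if (now - last < minIntervalMillis) {
            return; // a recent save exists, nothing to do
        }
        // compareAndSet lets only one of several racing tasks perform the write
        if (lastSavedAtMillis.compareAndSet(last, now)) {
            writeCache.run();
        }
    }
}
```

Because the timestamp is kept per cache, a slow save of one cache never suppresses the save of another.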

Multithreaded cache saving can skip caches
--

Key: CASSANDRA-4533
URL: https://issues.apache.org/jira/browse/CASSANDRA-4533
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0
Reporter: Zhu Han
Assignee: Yuki Morishita
Priority: Trivial
Fix For: 1.1.5

Attachments: 4533-1.1.txt

Cassandra flushes the key and row caches to disk periodically. It also uses an
atomic flag (flushInProgress) to enforce a single cache writer at any time.
However, cache-saving tasks can be submitted to CompactionManager
concurrently, as long as the number of worker threads in CompactionManager is
larger than 1.
Because of that atomic flag, only one cache will be written out to
disk. The other writers are cancelled while the flag is true.
I observed this in Cassandra 1.0. If nothing has changed, the problem
should remain in Cassandra 1.1 as well.
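
The failure mode described above can be sketched in isolation. The class below is a hypothetical illustration, not the Cassandra source: with a single shared flag, a save request for one cache is dropped while another cache is being written, whereas keying the flag by column family and cache type only rejects a concurrent save of the same cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch contrasting one global "in progress" flag with one flag per cache.
final class CacheSaveGuards {
    // One flag for every cache: a save of cache Y is refused while X is in progress.
    private final AtomicBoolean sharedFlag = new AtomicBoolean(false);

    // One flag per cache key, e.g. "Keyspace1.Standard1/KEY_CACHE" (naming is an assumption).
    private final ConcurrentMap<String, AtomicBoolean> perCacheFlags = new ConcurrentHashMap<>();

    boolean tryBeginShared() {
        return sharedFlag.compareAndSet(false, true);
    }

    void endShared() {
        sharedFlag.set(false);
    }

    boolean tryBeginPerCache(String cacheKey) {
        AtomicBoolean flag = perCacheFlags.computeIfAbsent(cacheKey, k -> new AtomicBoolean(false));
        return flag.compareAndSet(false, true);
    }

    void endPerCache(String cacheKey) {
        AtomicBoolean flag = perCacheFlags.get(cacheKey);
        if (flag != null) {
            flag.set(false);
        }
    }
}
```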

[jira] [Commented] (CASSANDRA-4562) Cli getting odd states for Currently building index


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439000#comment-13439000
 ] 

Jeremy Hanna commented on CASSANDRA-4562:
-

It was anecdotal but it did appear when the user was running upgradesstables, 
which is different than building the indexes from scratch.  Maybe there's a 
clue in that.

 Cli getting odd states for Currently building index
 -

 Key: CASSANDRA-4562
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4562
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Reporter: Jeremy Hanna
Priority: Minor

 Whenever the cli outputs keyspace/column family data, if it's building an 
 index, it will show the status of that build at the bottom of the output.  It 
 looks like it's sometimes getting into a bad state.  One person reported 
 seeing:
 Currently building index index_name, completed d != java.lang.String

[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439004#comment-13439004
 ] 

Jonathan Ellis commented on CASSANDRA-3772:
---

That sounds worth it to me.  Any downsides if we make MP the default for new 
clusters?

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.3

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.
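
For context, a hashed partitioner only needs the digest as a number to place a key on the ring. The sketch below derives a non-negative BigInteger token from an MD5 digest using only the JDK. The class and method names are hypothetical, and the exact derivation used by RandomPartitioner is assumed here rather than quoted from the source.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: turn an MD5 digest of the key into a non-negative token.
final class Md5TokenSketch {
    static BigInteger token(byte[] key) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(key); // always 16 bytes
            // Interpret the digest as a signed big-endian integer and drop the sign.
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

The cryptographic properties of MD5 are not used anywhere in this derivation, which is the point the description makes: only the distribution of the output matters.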

[jira] [Commented] (CASSANDRA-4562) Cli getting odd states for Currently building index


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439007#comment-13439007
 ] 

Jeremy Hanna commented on CASSANDRA-4562:
-

Heh, so I suppose I just meant to say that the building status is getting shown in 
the cli output even when that's not what is happening.  So it doesn't have 
correct data and looks bad.  However, since the CLI is deprecated, I'm not sure 
it's worth fixing.

 Cli getting odd states for Currently building index
 -

 Key: CASSANDRA-4562
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4562
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Reporter: Jeremy Hanna
Priority: Minor

 Whenever the cli outputs keyspace/column family data, if it's building an 
 index, it will show the status of that build at the bottom of the output.  It 
 looks like it's sometimes getting into a bad state.  One person reported 
 seeing:
 Currently building index index_name, completed d != java.lang.String

[jira] [Commented] (CASSANDRA-4338) Experiment with direct buffer in SequentialWriter


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439010#comment-13439010
 ] 

Jonathan Ellis commented on CASSANDRA-4338:
---

Any difference in cpu usage with the direct buffer patch?  If we're not maxing 
out CPU then it wouldn't necessarily run faster even if it's more efficient.

 Experiment with direct buffer in SequentialWriter
 -

 Key: CASSANDRA-4338
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4338
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.2.0

 Attachments: 4338-gc.tar.gz, gc-4338-patched.png, gc-trunk.png


 Using a direct buffer instead of a heap-based byte[] should let us avoid a 
 copy into native memory when we flush the buffer.
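
To illustrate the idea, the sketch below writes the same bytes through a FileChannel from either a heap buffer or a direct buffer. It is not the SequentialWriter code; the class name, the 64 KB buffer size, and the helpers are hypothetical. With a heap buffer the JDK generally has to copy the data into a temporary native buffer before the write call, while a direct buffer already lives in native memory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DirectBufferWriteSketch {
    // Streams `data` to `file` through `buffer`, flushing whenever the buffer fills.
    static long write(Path file, ByteBuffer buffer, byte[] data) throws IOException {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long written = 0;
            int offset = 0;
            while (offset < data.length) {
                int chunk = Math.min(buffer.capacity(), data.length - offset);
                buffer.clear();
                buffer.put(data, offset, chunk);
                buffer.flip();
                while (buffer.hasRemaining()) {
                    written += channel.write(buffer);
                }
                offset += chunk;
            }
            return written;
        }
    }

    // Writes `size` zero bytes to a temp file and returns the resulting file size.
    static long roundTrip(boolean direct, int size) {
        ByteBuffer buffer = direct
                ? ByteBuffer.allocateDirect(64 * 1024) // native memory, no extra copy on write
                : ByteBuffer.allocate(64 * 1024);      // heap memory, copied before the write
        try {
            Path file = Files.createTempFile("writer-sketch", ".bin");
            try {
                write(file, buffer, new byte[size]);
                return Files.size(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Both variants produce identical files; any difference would show up in CPU time and allocation behaviour rather than in the output.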

[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439012#comment-13439012
 ] 

Pavel Yaskevich commented on CASSANDRA-3772:


I don't see any, as it has both good collision resistance and distribution.

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.3

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439016#comment-13439016
 ] 

Jonathan Ellis commented on CASSANDRA-3772:
---

Let's do it.

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.3

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

[jira] [Commented] (CASSANDRA-4562) Cli getting odd states for Currently building index


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439014#comment-13439014
 ] 

Jonathan Ellis commented on CASSANDRA-4562:
---

I'd be happy to fix it, but I'm still not sure what sequence of actions is 
supposed to make this happen.

 Cli getting odd states for Currently building index
 -

 Key: CASSANDRA-4562
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4562
 Project: Cassandra
  Issue Type: Bug
  Components: Core, Tools
Reporter: Jeremy Hanna
Priority: Minor

 Whenever the cli outputs keyspace/column family data, if it's building an 
 index, it will show the status of that build at the bottom of the output.  It 
 looks like it's sometimes getting into a bad state.  One person reported 
 seeing:
 Currently building index index_name, completed d != java.lang.String

git commit: add Murmur3Partitioner and make it default for new installations patch by Dave Brosius and Pavel Yaskevich; reviewed by Vijay for CASSANDRA-3772

2012-08-21 Thread xedin

Updated Branches:
  refs/heads/trunk dafcaeb06 -> f41684fde


add Murmur3Partitioner and make it default for new installations
patch by Dave Brosius and Pavel Yaskevich; reviewed by Vijay for CASSANDRA-3772


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/f41684fd
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/f41684fd
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/f41684fd

Branch: refs/heads/trunk
Commit: f41684fdef7a9e8628cb40f66c13a88fcf7502e3
Parents: dafcaeb
Author: Pavel Yaskevich xe...@apache.org
Authored: Tue Aug 21 23:40:31 2012 +0300
Committer: Pavel Yaskevich xe...@apache.org
Committed: Tue Aug 21 23:40:31 2012 +0300

--
 CHANGES.txt|1 +
 conf/cassandra.yaml|4 +-
 .../cassandra/dht/AbstractHashedPartitioner.java   |  194 +++
 .../apache/cassandra/dht/Murmur3Partitioner.java   |   53 
 .../apache/cassandra/dht/RandomPartitioner.java|  160 +
 5 files changed, 255 insertions(+), 157 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/f41684fd/CHANGES.txt
--
diff --git a/CHANGES.txt b/CHANGES.txt
index 426ac7d..8fe1770 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -38,6 +38,7 @@
  * (cql3) Add support for 2ndary indexes (CASSANDRA-3680)
  * (cql3) fix defining more than one PK to be invalid (CASSANDRA-4477)
  * remove schema agreement checking from all external APIs (Thrift, CQL and CQL3) (CASSANDRA-4487)
+ * add Murmur3Partitioner and make it default for new installations (CASSANDRA-3772)
 
 
 1.1.5

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f41684fd/conf/cassandra.yaml
--
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index 1b89b2e..5e45961 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -70,6 +70,8 @@ authority: org.apache.cassandra.auth.AllowAllAuthority
 # 
 # - RandomPartitioner distributes rows across the cluster evenly by md5.
 #   When in doubt, this is the best option.
+# - Murmur3Partitioner is similar to RandomPartioner but uses Murmur3_128
+#   Hash Function instead of md5
 # - ByteOrderedPartitioner orders rows lexically by key bytes.  BOP allows
 #   scanning rows in key order, but the ordering can generate hot spots
 #   for sequential insertion workloads.
@@ -81,7 +83,7 @@ authority: org.apache.cassandra.auth.AllowAllAuthority
 #
 # See http://wiki.apache.org/cassandra/Operations for more on
 # partitioners and token selection.
-partitioner: org.apache.cassandra.dht.RandomPartitioner
+partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 
 # directories where Cassandra should store data on disk.
 data_file_directories:

http://git-wip-us.apache.org/repos/asf/cassandra/blob/f41684fd/src/java/org/apache/cassandra/dht/AbstractHashedPartitioner.java
--
diff --git a/src/java/org/apache/cassandra/dht/AbstractHashedPartitioner.java b/src/java/org/apache/cassandra/dht/AbstractHashedPartitioner.java
new file mode 100644
index 000..55dfb97
--- /dev/null
+++ b/src/java/org/apache/cassandra/dht/AbstractHashedPartitioner.java
@@ -0,0 +1,194 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an AS IS BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.cassandra.dht;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+import java.nio.ByteBuffer;
+import java.nio.charset.CharacterCodingException;
+import java.util.*;
+
+import org.apache.cassandra.config.ConfigurationException;
+import org.apache.cassandra.db.DecoratedKey;
+import org.apache.cassandra.utils.ByteBufferUtil;
+import org.apache.cassandra.utils.FBUtilities;
+import org.apache.cassandra.utils.GuidGenerator;
+import org.apache.cassandra.utils.Pair;
+
+/**
+ * This class is the super class of classes that generate a BigIntegerToken using hash function.
+ */
+public abstract class

[jira] [Updated] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-3772:
---

Fix Version/s: (was: 1.3)
   1.2.0

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439030#comment-13439030
 ] 

Jonathan Ellis commented on CASSANDRA-3772:
---

Sorry, didn't look at the code until commit...

Can you test making it hash to a Long or an 8-byte ByteBuffer?  A 16-byte 
BigInteger is overkill; all we need is a reasonable distribution (now that 
Tokens don't need to be unique), and 64 or even 32 bits is plenty for that.
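
A sketch of what that could look like, using only the JDK. MD5 stands in for a 128-bit hash here because Murmur3 is not part of the standard library; LongTokenSketch and token64 are hypothetical names. The first eight bytes of the digest are read as a long, which can be compared with primitive operations instead of BigInteger arithmetic.

```java
import java.nio.ByteBuffer;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: keep only 64 bits of a 128-bit hash as the token.
final class LongTokenSketch {
    static long token64(byte[] key) {
        try {
            // MD5 is a stand-in for any 128-bit hash with good distribution.
            byte[] digest = MessageDigest.getInstance("MD5").digest(key);
            // ByteBuffer is big-endian by default, so this reads digest[0..7].
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

For the empty key, whose MD5 is d41d8cd98f00b204e9800998ecf8427e, this yields the long 0xd41d8cd98f00b204L.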

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

[jira] [Reopened] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reopened CASSANDRA-3772:
---


 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.

[jira] [Commented] (CASSANDRA-4457) Find the cause for the need for a larger stack size with jdk 7

2012-08-21 Thread Eric Parusel (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439029#comment-13439029
 ] 

Eric Parusel commented on CASSANDRA-4457:
-

This happened to me too, when I killed one node in the cluster -- the other two 
nodes threw the following exception:

ERROR [WRITE-/10.40.12.67] 2012-08-21 20:35:29,512 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[WRITE-/10.40.12.67,5,main]
java.lang.StackOverflowError
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:156)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:126)

When I bring the 1st node back up the other nodes are unable to send messages 
to the first node due to the exception thrown on that thread.

After increasing the minimum stack size to 256k this doesn't occur.
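
For experimenting with how stack size affects recursion depth without restarting the JVM, the Thread constructor accepts a requested stack size. The sketch below is a hypothetical illustration, not a reproduction of the error above: it measures how deep a trivial recursion gets before StackOverflowError for a requested size. The JVM treats the value as a hint and may round it or ignore it, so the numbers vary by platform.

```java
// Hypothetical probe: how deep can a trivial recursion go for a requested stack size?
final class StackDepthProbe {
    // Recurses until the stack is exhausted, recording the deepest level reached.
    private static int recurse(int depth, int[] maxDepth) {
        maxDepth[0] = depth;
        return recurse(depth + 1, maxDepth) + 1;
    }

    static int maxDepth(long stackSizeBytes) {
        final int[] depth = {0};
        Thread probe = new Thread(null, () -> {
            try {
                recurse(1, depth);
            } catch (StackOverflowError expected) {
                // Reaching the limit is the purpose of the probe.
            }
        }, "stack-probe", stackSizeBytes);
        probe.start();
        try {
            probe.join(); // join() makes the thread's writes to depth[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return depth[0];
    }

    public static void main(String[] args) {
        System.out.println("160k: " + maxDepth(160 * 1024));
        System.out.println("256k: " + maxDepth(256 * 1024));
    }
}
```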

 Find the cause for the need for a larger stack size with jdk 7
 --

 Key: CASSANDRA-4457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4457
 Project: Cassandra
  Issue Type: Bug
Reporter: Jeremy Hanna
Priority: Minor

 Based on discussions post CASSANDRA-4275, it appears that on jdk 7 the 
 minimum stack size needs to be set to something higher than 160k.  That 
 shouldn't be necessary.

[jira] [Comment Edited] (CASSANDRA-4457) Find the cause for the need for a larger stack size with jdk 7

2012-08-21 Thread Eric Parusel (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439029#comment-13439029
 ] 

Eric Parusel edited comment on CASSANDRA-4457 at 8/22/12 7:53 AM:
--

This happened to me too, when I killed one node in the cluster -- the other two 
nodes threw the following exception:

ERROR [WRITE-/10.40.12.67] 2012-08-21 20:35:29,512 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[WRITE-/10.40.12.67,5,main]
java.lang.StackOverflowError
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at java.io.DataOutputStream.flush(DataOutputStream.java:123)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:156)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:126)

When I bring the 1st node back up the other nodes are unable to send messages 
to the first node due to the exceptions thrown by the other two nodes on that 
thread.

After increasing the minimum stack size to 256k this doesn't occur.

  was (Author: eparusel):
This happened to me too, when I killed one node in the cluster -- the other 
two nodes threw the following exception:

ERROR [WRITE-/10.40.12.67] 2012-08-21 20:35:29,512 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[WRITE-/10.40.12.67,5,main]
java.lang.StackOverflowError
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at java.io.DataOutputStream.flush(DataOutputStream.java:123)
    at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:156)
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:126)

When I bring the 1st node back up the other nodes are unable to send messages 
to the first node due to the exception thrown on that thread.

After increasing the minimum stack size to 256k this doesn't occur.
  
 Find the cause for the need for a larger stack size with jdk 7
 --

 Key: CASSANDRA-4457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4457
 Project: Cassandra
  Issue Type: Bug
Reporter: Jeremy Hanna
Priority: Minor

 Based on discussions post CASSANDRA-4275, it appears that on jdk 7 the 
 minimum stack size needs to be set to something higher than 160k.  That 
 shouldn't be necessary.


[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439031#comment-13439031
 ] 

Pavel Yaskevich commented on CASSANDRA-3772:


Sure

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.


[jira] [Updated] (CASSANDRA-4559) implement token relocation

2012-08-21 Thread Eric Evans (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-4559:
--

Description: 
Whatever the specifics of a _shuffle_ (see CASSANDRA-4443), it will be 
necessary to relocate a range from one node to another.

_Edit0: Linked in new patch containing tests._


h3. Patches
||Compare||Raw diff||Description||
|[010_refactor_range_move|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move]|[010_refactor_range_move.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move.diff]|No Description|
|[020_calculate_pending|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending]|[020_calculate_pending.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending.diff]|No Description|
|[030_relocate_token|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token]|[030_relocate_token.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token.diff]|No Description|
|[040_tests|https://github.com/acunu/cassandra/compare/top-bases/p/4443/040_tests...p/4443/040_tests]|[040_tests.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/040_tests...p/4443/040_tests.diff]|No Description|



_Note: These are branches managed with TopGit. If you are applying the patch 
output manually, you will either need to filter the TopGit metadata files (i.e. 
{{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove 
them afterward ({{rm .topmsg .topdeps}})._

  was:
Whatever the specifics of a _shuffle_ (see CASSANDRA-4443), it will be 
necessary to relocate a range from one node to another.



h3. Patches
||Compare||Raw diff||Description||
|[010_refactor_range_move|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move]|[010_refactor_range_move.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move.diff]|No Description|
|[020_calculate_pending|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending]|[020_calculate_pending.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending.diff]|No Description|
|[030_relocate_token|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token]|[030_relocate_token.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token.diff]|No Description|



_Note: These are branches managed with TopGit. If you are applying the patch 
output manually, you will either need to filter the TopGit metadata files (i.e. 
{{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove 
them afterward ({{rm .topmsg .topdeps}})._


 implement token relocation
 --

 Key: CASSANDRA-4559
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4559
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core, Tools
Reporter: Eric Evans
Assignee: Eric Evans
  Labels: vnodes

 Whatever the specifics of a _shuffle_ (see CASSANDRA-4443), it will be 
 necessary to relocate a range from one node to another.
 _Edit0: Linked in new patch containing tests._
 
 h3. Patches
 ||Compare||Raw diff||Description||
 |[010_refactor_range_move|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move]|[010_refactor_range_move.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move.diff]|No Description|
 |[020_calculate_pending|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending]|[020_calculate_pending.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending.diff]|No Description|
 |[030_relocate_token|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token]|[030_relocate_token.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token.diff]|No Description|
 |[040_tests|https://github.com/acunu/cassandra/compare/top-bases/p/4443/040_tests...p/4443/040_tests]|[040_tests.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/040_tests...p/4443/040_tests.diff]|No Description|

[jira] [Commented] (CASSANDRA-4457) Find the cause for the need for a larger stack size with jdk 7


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439076#comment-13439076
 ] 

Jonathan Ellis commented on CASSANDRA-4457:
---

socketWrite0 in SocketOutputStream.c declares an 8KB buffer on the stack, so 
that could push it over the edge if it were tight.

 Find the cause for the need for a larger stack size with jdk 7
 --

 Key: CASSANDRA-4457
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4457
 Project: Cassandra
  Issue Type: Bug
Reporter: Jeremy Hanna
Priority: Minor

 Based on discussions post CASSANDRA-4275, it appears that on jdk 7 the 
 minimum stack size needs to be set to something higher than 160k.  That 
 shouldn't be necessary.


[jira] [Updated] (CASSANDRA-3772) Evaluate Murmur3-based partitioner


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-3772:
---

Attachment: CASSANDRA-3772-v3.patch

Attached a patch that uses the first part of hash3_x64_128 (no copies into a 
byte array), which shows better results than hash2_64. This approach is ~18 op 
points better than the previous one.
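
To make "first part of hash3_x64_128" concrete, here is a minimal sketch of deriving a 64-bit token from the first half of a 128-bit Murmur3 hash. It follows the public MurmurHash3 x64_128 algorithm and is not the code in the attached patch or Cassandra's own MurmurHash class. The class name Murmur3TokenSketch and the method token are assumptions made for illustration, and results may differ from Cassandra's for keys whose tail bytes have the high bit set.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Illustrative only: Murmur3TokenSketch and token are hypothetical names.
final class Murmur3TokenSketch {
    private static final long C1 = 0x87c37b91114253d5L;
    private static final long C2 = 0x4cf5ad432745937fL;

    // Final avalanche step of MurmurHash3.
    private static long fmix64(long k) {
        k ^= k >>> 33;
        k *= 0xff51afd7ed558ccdL;
        k ^= k >>> 33;
        k *= 0xc4ceb9fe1a85ec53L;
        k ^= k >>> 33;
        return k;
    }

    // Returns the first 64 bits (h1) of MurmurHash3 x64_128 over the key.
    static long token(byte[] key, long seed) {
        long h1 = seed;
        long h2 = seed;
        ByteBuffer buf = ByteBuffer.wrap(key).order(ByteOrder.LITTLE_ENDIAN);
        int nblocks = key.length / 16;

        // Body: mix each full 16-byte block, read in place from the buffer.
        for (int i = 0; i < nblocks; i++) {
            long k1 = buf.getLong(i * 16);
            long k2 = buf.getLong(i * 16 + 8);
            k1 *= C1; k1 = Long.rotateLeft(k1, 31); k1 *= C2; h1 ^= k1;
            h1 = Long.rotateLeft(h1, 27); h1 += h2; h1 = h1 * 5 + 0x52dce729;
            k2 *= C2; k2 = Long.rotateLeft(k2, 33); k2 *= C1; h2 ^= k2;
            h2 = Long.rotateLeft(h2, 31); h2 += h1; h2 = h2 * 5 + 0x38495ab5;
        }

        // Tail: fold the remaining 0..15 bytes into k1 (bytes 0-7) and k2 (bytes 8-14).
        int offset = nblocks * 16;
        int rem = key.length & 15;
        long k1 = 0;
        long k2 = 0;
        for (int i = rem - 1; i >= 0; i--) {
            long b = key[offset + i] & 0xffL;
            if (i >= 8) {
                k2 ^= b << ((i - 8) * 8);
            } else {
                k1 ^= b << (i * 8);
            }
        }
        if (rem > 8) { k2 *= C2; k2 = Long.rotateLeft(k2, 33); k2 *= C1; h2 ^= k2; }
        if (rem > 0) { k1 *= C1; k1 = Long.rotateLeft(k1, 31); k1 *= C2; h1 ^= k1; }

        // Finalization.
        h1 ^= key.length;
        h2 ^= key.length;
        h1 += h2;
        h2 += h1;
        h1 = fmix64(h1);
        h2 = fmix64(h2);
        h1 += h2;
        return h1; // the second half (h2 + h1) is simply discarded
    }

    public static void main(String[] args) {
        byte[] key = "example-row-key".getBytes(StandardCharsets.UTF_8);
        System.out.println("token = " + token(key, 0L));
    }
}
```

Because the token is a primitive long, nothing has to be copied into a byte array or wrapped in a BigInteger, unlike an MD5-based token.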

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.2.0

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 CASSANDRA-3772-v3.patch, hashed_partitioner_3.diff, hashed_partitioner.diff, 
 MumPartitionerTest.docx, try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.


[jira] [Created] (CASSANDRA-4564) MoveTest madness

2012-08-21 Thread Eric Evans (JIRA)

Eric Evans created CASSANDRA-4564:
-

 Summary: MoveTest madness
 Key: CASSANDRA-4564
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4564
 Project: Cassandra
  Issue Type: Bug
  Components: Tests
Reporter: Eric Evans


I encountered what look like bugs in 
{{o.a.c.service.MoveTest.newTestWriteEndpointsDuringMove()}} while doing 
something else; here is a (poorly researched) ticket before I forget :)

* There are two loops over non-system tables, and the first is a no-op
* In the second loop, a set exactly {{replicationFactor}} in size is compared 
against {{tmd.getWriteEndpoints()}}, which should produce more than 
{{replicationFactor}} endpoints during a move (shouldn't it?). How does this 
pass?
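
The expectation in the second bullet can be sketched as follows. This is a toy model, not the real TokenMetadata API: the class WriteEndpointsSketch, its writeEndpoints method and the node names are all hypothetical. It assumes only that, during a move, writes go to the natural replicas plus any pending ones.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: a toy model of "natural + pending" write endpoints.
final class WriteEndpointsSketch {

    // Union of the natural replicas and the replicas pending for the range.
    static Set<String> writeEndpoints(Collection<String> natural, Collection<String> pending) {
        Set<String> all = new LinkedHashSet<>(natural);
        all.addAll(pending);
        return all;
    }

    public static void main(String[] args) {
        int replicationFactor = 3;
        List<String> natural = Arrays.asList("node1", "node2", "node3");
        List<String> pending = Arrays.asList("node4"); // node gaining the range

        Set<String> endpoints = writeEndpoints(natural, pending);
        // A set of exactly replicationFactor elements cannot equal this one.
        System.out.println(endpoints.size() + " endpoints for RF=" + replicationFactor);
    }
}
```

Under that assumption, an equality check against a set of exactly {{replicationFactor}} endpoints should fail while the move is in flight, which is why the passing test looks suspicious.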


[jira] [Created] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush

Edward Capriolo created CASSANDRA-4565:
--

 Summary: TTL columns with older then gcgrace do not need to flush
 Key: CASSANDRA-4565
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
 Project: Cassandra
  Issue Type: Improvement
Reporter: Edward Capriolo
Assignee: Edward Capriolo


With memcache, many people are willing to sacrifice durability for performance. 
Cassandra has a TimeToLive feature that can be used in caching scenarios with 
low values for gc_grace_seconds. However, from a code dive it seems that 
Cassandra will always write TTL columns to disk, even those that are beyond 
gc_grace_seconds. If a user has very large memtables, a small TTL, and a low 
gc_grace, it is possible that writing memtables can be skipped entirely in some 
scenarios.
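
The condition being proposed can be written as a small predicate. This is a sketch under stated assumptions, not the attached patch: the class ExpiredColumnSketch, the method skippableOnFlush and its parameter names are hypothetical, and all times are taken to be in seconds since the epoch.

```java
// Illustrative only: ExpiredColumnSketch and skippableOnFlush are hypothetical names.
final class ExpiredColumnSketch {

    // True when a TTL column expired longer ago than gc_grace_seconds, so
    // neither the value nor a tombstone for it would need to reach disk.
    static boolean skippableOnFlush(int localExpirationTime, int gcGraceSeconds, int nowInSeconds) {
        int gcBefore = nowInSeconds - gcGraceSeconds;
        return localExpirationTime < gcBefore;
    }

    public static void main(String[] args) {
        int now = 1_000_000;
        // Expired 100s ago with gc_grace of 10s: past the grace window.
        System.out.println(skippableOnFlush(now - 100, 10, now));
        // Expired 5s ago with gc_grace of 10s: still inside the grace window.
        System.out.println(skippableOnFlush(now - 5, 10, now));
    }
}
```

If every column in a memtable satisfies this check at flush time, there would be nothing left to write, which is the "skipped entirely" case described above.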


[jira] [Updated] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-4565:
---

Attachment: cassandra-4565.patch.1.txt

First attempt at a patch. Test passes, but we will likely refine this later.

 TTL columns with older then gcgrace do not need to flush
 

 Key: CASSANDRA-4565
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
 Project: Cassandra
  Issue Type: Improvement
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Attachments: cassandra-4565.patch.1.txt


 With memcache, many people are willing to sacrifice durability for 
 performance. Cassandra has a TimeToLive feature that can be used in caching 
 scenarios with low values for gc_grace_seconds. However, from a code dive it 
 seems that Cassandra will always write TTL columns to disk, even those that 
 are beyond gc_grace_seconds. If a user has very large memtables, a small TTL, 
 and a low gc_grace, it is possible that writing memtables can be skipped 
 entirely in some scenarios.


[jira] [Updated] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush

[
https://issues.apache.org/jira/browse/CASSANDRA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Edward Capriolo updated CASSANDRA-4565:
---

Description: With memcache, many people are willing to sacrifice durability
for performance. Cassandra has a TimeToLive feature that can be used in caching
scenarios with low values for gc_grace_seconds. However, from a code dive it
seems that Cassandra will always write TTL columns to disk, even those that are
beyond gc_grace_seconds. If a use case has very large memtables, a small TTL,
and a small gc_grace, it is possible that flushing these columns to disk can be
skipped entirely in some scenarios. (was: With memcache many people are willing to
sacrifice durability for performance. Cassandra has a TimeToLive feature that
can be used in caching scenarios with low values for gc_grace_seconds. However
from a code dive it seems that cassandra will always write TTL to disk, even
those that are beyond gc_grace_seconds. If a user very large memtables,small
ttl, and low gc_grace it is possible that writing memtables can be skipped
entirely in some scenarios.)

TTL columns with older then gcgrace do not need to flush

Key: CASSANDRA-4565
URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
Project: Cassandra
Issue Type: Improvement
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Attachments: cassandra-4565.patch.1.txt

With memcache, many people are willing to sacrifice durability for
performance. Cassandra has a TimeToLive feature that can be used in caching
scenarios with low values for gc_grace_seconds. However, from a code dive it
seems that Cassandra will always write TTL columns to disk, even those that
are beyond gc_grace_seconds. If a use case has very large memtables, a small
TTL, and a small gc_grace, it is possible that flushing these columns to disk
can be skipped entirely in some scenarios.

[jira] [Updated] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-4565:
---

Fix Version/s: 1.3

 TTL columns with older then gcgrace do not need to flush
 

 Key: CASSANDRA-4565
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
 Project: Cassandra
  Issue Type: Improvement
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Fix For: 1.3

 Attachments: cassandra-4565.patch.1.txt


 With memcache, many people are willing to sacrifice durability for 
 performance. Cassandra has a TimeToLive feature that can be used in caching 
 scenarios with low values for gc_grace_seconds. However, from a code dive it 
 seems that Cassandra will always write TTL columns to disk, even those that 
 are beyond gc_grace_seconds. If a use case has very large memtables, a small 
 TTL, and a small gc_grace, it is possible that flushing these columns to disk 
 can be skipped entirely in some scenarios.


[jira] [Commented] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439190#comment-13439190
 ] 

Edward Capriolo commented on CASSANDRA-4565:


Never mind. From a code dive I see that cf.maybeResetDeletionTimes(gcBefore) 
converts ExpiringColumns to DeletedColumns before this method.

 TTL columns with older then gcgrace do not need to flush
 

 Key: CASSANDRA-4565
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
 Project: Cassandra
  Issue Type: Improvement
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Fix For: 1.3

 Attachments: cassandra-4565.patch.1.txt


 With memcache, many people are willing to sacrifice durability for 
 performance. Cassandra has a TimeToLive feature that can be used in caching 
 scenarios with low values for gc_grace_seconds. However, from a code dive it 
 seems that Cassandra will always write TTL columns to disk, even those that 
 are beyond gc_grace_seconds. If a use case has very large memtables, a small 
 TTL, and a small gc_grace, it is possible that flushing these columns to disk 
 can be skipped entirely in some scenarios.


[jira] [Commented] (CASSANDRA-4533) Multithreaded cache saving can skip caches

2012-08-21 Thread Yuki Morishita (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439199#comment-13439199
]

Yuki Morishita commented on CASSANDRA-4533:
---

bq. Hmm, I don't think this quite works because it still means we can skip
saving cache for CF X when CF Y is being flushed.

As I understand it, since 1.1 C* stores key and row caches globally, so each
cache type is saved in a single pass that covers every CF.
AutoSavingCache$Writer writes all CFs for a given CacheType in one execution.

Multithreaded cache saving can skip caches
--

Attachments: 4533-1.1.txt

[jira] [Commented] (CASSANDRA-4565) TTL columns with older then gcgrace do not need to flush


[ 
https://issues.apache.org/jira/browse/CASSANDRA-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13439229#comment-13439229
 ] 

Jonathan Ellis commented on CASSANDRA-4565:
---

CASSANDRA-4542 calls for a generalization of this, btw.

 TTL columns with older then gcgrace do not need to flush
 

 Key: CASSANDRA-4565
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4565
 Project: Cassandra
  Issue Type: Improvement
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Fix For: 1.3

 Attachments: cassandra-4565.patch.1.txt


 With memcache, many people are willing to sacrifice durability for 
 performance. Cassandra has a TimeToLive feature that can be used in caching 
 scenarios with low values for gc_grace_seconds. However, from a code dive it 
 seems that Cassandra will always write TTL columns to disk, even those that 
 are beyond gc_grace_seconds. If a use case has very large memtables, a small 
 TTL, and a small gc_grace, it is possible that flushing these columns to disk 
 can be skipped entirely in some scenarios.


[jira] [Commented] (CASSANDRA-4533) Multithreaded cache saving can skip caches