[jira] [Commented] (CASSANDRA-3624) Hinted Handoff - related OOM

2011-12-23 Thread Radim Kolar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175332#comment-13175332
 ] 

Radim Kolar commented on CASSANDRA-3624:


I have this problem too, but I do not have large rows; I have a huge number of 
small rows (max 180 bytes serialized).

 Hinted Handoff - related OOM
 

 Key: CASSANDRA-3624
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3624
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Marcus Eriksson
Assignee: Jonathan Ellis
  Labels: hintedhandoff
 Fix For: 1.0.7

 Attachments: 3624.txt


 One of our nodes had collected a lot of hints for another node, so when the 
 dead node came back and the row mutations were read back from disk, the node 
 died with an OOM exception (and kept dying after restart, even with the heap 
 increased from 8G to 12G). The heap dump contained a lot of SuperColumns, and 
 our application does not use those (but HH does). 
 I'm guessing that each mutation is big, so that PAGE_SIZE*mutation_size does 
 not fit in memory (will check this tomorrow).
 A simple fix (if my assumption above is correct) would be to reduce the 
 PAGE_SIZE in HintedHandOffManager.java to something like 10 (or even 1?) to 
 reduce the memory pressure. The performance hit would be small since we are 
 doing the hinted handoff throttle delay sleep before sending every *mutation* 
 anyway (not every page). Thoughts?
 If anyone runs into the same problem, I got the node started again by simply 
 removing the HintsColumnFamily* files.
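
 The PAGE_SIZE*mutation_size estimate in the description can be sketched as 
 plain arithmetic. This is only an illustration: the class name, the page 
 sizes and the mutation size below are hypothetical and are not taken from 
 HintedHandOffManager.java, and real heap usage would be higher because of 
 per-object overhead.

```java
public class HintPageEstimate {
    // Rough lower bound on the heap needed to hold one page of hints,
    // ignoring JVM object overhead. Both inputs are caller-supplied guesses.
    static long pageBytes(int pageSize, long mutationBytes) {
        return (long) pageSize * mutationBytes;
    }

    public static void main(String[] args) {
        long mutationBytes = 64L * 1024 * 1024; // hypothetical 64 MiB mutation

        // A large (hypothetical) page size multiplies quickly into many GiB...
        System.out.println(pageBytes(1024, mutationBytes));
        // ...while a page size of 10, as proposed above, stays far smaller.
        System.out.println(pageBytes(10, mutationBytes));
    }
}
```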

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2749) fine-grained control over data directories

2011-12-23 Thread Sylvain Lebresne (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2749:


Attachment: (was: 0002-fix-unit-tests.patch)

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1

 Attachments: 
 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 
 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 
 2749.tar.gz, 2749_backwards_compatible_v1.patch, 
 2749_backwards_compatible_v2.patch, 2749_backwards_compatible_v3.patch, 
 2749_backwards_compatible_v4.patch, 
 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 
 2749_proper.tar.gz


 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.





[jira] [Updated] (CASSANDRA-2749) fine-grained control over data directories

2011-12-23 Thread Sylvain Lebresne (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2749:


Attachment: 0003-Fixes.patch
0002-fix-unit-tests.patch
0001-2749.patch

Attaching rebased patches, with a 3rd patch (0003-Fixes.patch) addressing 
Pavel's remarks. More specifically:

bq. o.a.c.db.Directories comment should be updated because it still uses 
SSTable file name without keyspace.

Fixed, thanks

bq. o.a.c.io.sstable.SSTableReaderTest won't compile

Sorry, I forgot to check the test after a last rebase, fixed too (this involved 
renaming a number of sstables from test/data/legacy-sstables/hb to include the 
keyspace name, so that specific change is in the 2nd 'fix unit tests' patch to 
avoid polluting the 3rd one).

bq. if you start with empty data directory you get following exception and 
process exits

Fixed. I've actually made two modifications: the migration checks the existence 
of the directory to avoid the NPE during listFiles(), but I've also modified 
the 'should we migrate' check to detect new nodes (checking if the system 
keyspace directory exists) and thus not print the migration message at all.
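
The NPE mentioned above is the usual java.io.File pitfall: listFiles() 
returns null, not an empty array, when the path does not exist or is not a 
directory. A minimal sketch of the guard follows; the class and method names 
are hypothetical and are not the ones used in the patch.

```java
import java.io.File;

public class ListFilesGuard {
    // Counts directory entries, treating a missing or non-directory path as empty.
    static int countEntries(File dir) {
        // listFiles() returns null if dir does not exist, is not a directory,
        // or an I/O error occurs, so iterating the result blindly can throw an NPE.
        File[] files = dir.listFiles();
        return files == null ? 0 : files.length;
    }

    public static void main(String[] args) {
        // Hypothetical path that is assumed not to exist on a fresh node.
        System.out.println(countEntries(new File("no-such-data-directory")));
    }
}
```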

bq. on snapshot doesn't create or move (from older schema) index SSTables 
related to CF

I'm not sure I see what this one is. Are we talking of the migration process?

In any case, you made me think about secondary indexes. Maybe it is more 
natural to have secondary index sstables in the same directory as the base 
cfs? Since the index names are not really something exposed (granted, you 
don't have to be a genius to figure them out), it feels like it would slightly 
simplify administration to not put them in a separate directory.

I've updated the patch to implement this last idea (so indexes are in the same 
directory as their base cf), but it would be nice to have multiple opinions 
on that move since we don't want to have to do a new migration in 6 months 
because we've changed our mind.

bq. shouldn't old snapshots directory be removed after move?

You're right, fixed (for backups too).


 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-2749.patch, 
 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 
 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 
 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 
 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 
 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 
 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 
 2749_proper.tar.gz


 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.





[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175390#comment-13175390
 ] 

Pavel Yaskevich commented on CASSANDRA-2749:


bq. I'm not sure I see what this one is. Are we talking of the migration 
process?

I was testing it like this : 

# run 1.1 *without* modifications
# ./tools/stress/bin/stress -n 5 -S 512 -x KEYS
# ./bin/nodetool -h localhost flush Keyspace1 Standard1
# ./bin/nodetool -h localhost snapshot Keyspace1
# made sure that Standard1.Idx-* SSTables are in the snapshots/timestamp 
directory
# run 1.1 *with* your patch applied
# checked whether the snapshots directory was moved and which files it 
included - it was lacking the Standard1.Idx-* files
# cleaned data directory
# repeated steps 1 - 5 but this time *with* your patch applied and it 
didn't include Standard1.Idx-* in the snapshot

bq. Maybe it is more natural to have secondary indexes sstables be in the 
same directory than the base cfs? 

+1


 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-2749.patch, 
 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 
 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 
 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 
 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 
 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 
 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 
 2749_proper.tar.gz


 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.





[jira] [Created] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered puts the node down

2011-12-23 Thread Viktor Jevdokimov (Created) (JIRA)
Changing compaction strategy from Leveled to SizeTiered puts the node down
--

 Key: CASSANDRA-3666
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.6
 Environment: Windows Server 2008 R2 64bit
Reporter: Viktor Jevdokimov


When a column family's compaction strategy is changed from Leveled to 
SizeTiered while Leveled compaction tasks are still pending, Cassandra floods 
the logs with thousands of messages per second:

Nothing to compact in ColumnFamily1.  Use forceUserDefinedCompaction if you 
wish to force compaction of single sstables (e.g. for tombstone collection)

As a result, the log disk fills up and the system goes down.





[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)

2011-12-23 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175428#comment-13175428
 ] 

Sylvain Lebresne commented on CASSANDRA-3143:
-

Alright, patch lgtm, +1. Great work Pavel.

Just a few minor details that would be nice to do before committing:
* As mentioned in the previous comments, currently when a row needs to be read 
to be put in the cache, CFS.cacheRow() decorates the key, which can be avoided 
just by making cacheRow take the DK and create the RowCacheKey internally.
* We should rename setRowCacheCapacity to setRowCacheCapacityMB to match the 
others
* It would be nice to move the cache stats from nodetool cfstats to nodetool 
info, rather than purely removing them
* The saveCaches method still does not respect the cacheKeysToSave options

And of course there is the question of disabling row caching on a per-cf basis 
which, as said previously, I think is a must-have before we release this 
(because any user that has at least one CF with wide rows (or one that just 
happens to be a bad candidate for caching) will need it). So it is ok to do 
that post-commit, but let's put it at the top of the todo list then.


 Global caches (key/row)
 ---

 Key: CASSANDRA-3143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: Core
 Fix For: 1.1

 Attachments: 0002-fixes.patch, CASSANDRA-3143-squashed.patch


 Caches are difficult to configure well as ColumnFamilies are added, similar 
 to how memtables were difficult pre-CASSANDRA-2006.





[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-12-23 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175447#comment-13175447
 ] 

Sylvain Lebresne commented on CASSANDRA-2749:
-

Weird. I just tried the same scenario and everything worked correctly. I should 
mention that when moving the snapshots/backups, the migration process renames 
them to the new filename convention, so they will be called 
Keyspace1-Standard1.Idx-*. Or maybe I fixed it with the last version of the 
patch without realizing it.

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-2749.patch, 
 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 
 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 
 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 
 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 
 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 
 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 
 2749_proper.tar.gz


 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.





[jira] [Updated] (CASSANDRA-3143) Global caches (key/row)

2011-12-23 Thread Pavel Yaskevich (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-3143:
---

Attachment: (was: 0002-fixes.patch)

 Global caches (key/row)
 ---

 Key: CASSANDRA-3143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: Core
 Fix For: 1.1


 Caches are difficult to configure well as ColumnFamilies are added, similar 
 to how memtables were difficult pre-CASSANDRA-2006.





[jira] [Updated] (CASSANDRA-3143) Global caches (key/row)

2011-12-23 Thread Pavel Yaskevich (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-3143:
---

Attachment: (was: CASSANDRA-3143-squashed.patch)

 Global caches (key/row)
 ---

 Key: CASSANDRA-3143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: Core
 Fix For: 1.1


 Caches are difficult to configure well as ColumnFamilies are added, similar 
 to how memtables were difficult pre-CASSANDRA-2006.





[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)

2011-12-23 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175450#comment-13175450
 ] 

Sylvain Lebresne commented on CASSANDRA-3143:
-

Last version lgtm, +1 (nit: I don't think the getCacheCapacityInBytes methods 
are really necessary when we already have the capacity in MB).

 Global caches (key/row)
 ---

 Key: CASSANDRA-3143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: Core
 Fix For: 1.1

 Attachments: 0001-CASSANDRA-3143-squashed.patch, 0002-fixes.patch, 
 0003-final-fixes.patch


 Caches are difficult to configure well as ColumnFamilies are added, similar 
 to how memtables were difficult pre-CASSANDRA-2006.





[jira] [Created] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.

2011-12-23 Thread Pavel Yaskevich (Created) (JIRA)
We need a way to deactivate row/key caching on a per-cf basis.
--

 Key: CASSANDRA-3667
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich


Initial idea would be to either have a boolean flag if we only want to allow 
disabling row cache, or some multi-value caches option that could be none, 
key_only, row_only or all.
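
The multi-value option described above could be modelled as a small enum. 
This is only a sketch of the idea: the type, field and method names are 
hypothetical and do not come from any patch attached to this issue.

```java
import java.util.Locale;

public class CachingOptionSketch {
    // Hypothetical per-CF setting with the four values proposed above.
    enum Caching {
        NONE(false, false),
        KEY_ONLY(true, false),
        ROW_ONLY(false, true),
        ALL(true, true);

        final boolean keyCache; // whether the key cache is used for this CF
        final boolean rowCache; // whether the row cache is used for this CF

        Caching(boolean keyCache, boolean rowCache) {
            this.keyCache = keyCache;
            this.rowCache = rowCache;
        }

        // Parses values such as "key_only"; throws IllegalArgumentException otherwise.
        static Caching fromString(String value) {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        Caching c = Caching.fromString("key_only");
        System.out.println(c + " keyCache=" + c.keyCache + " rowCache=" + c.rowCache);
    }
}
```

Compared with a single boolean flag, this form also lets the key cache be 
switched off for a column family without adding a second option later.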





[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175453#comment-13175453
 ] 

Pavel Yaskevich commented on CASSANDRA-3143:


Thanks, Sylvain! I have created CASSANDRA-3667, will get to it as soon as I 
commit this one.

 Global caches (key/row)
 ---

 Key: CASSANDRA-3143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: Core
 Fix For: 1.1

 Attachments: 0001-CASSANDRA-3143-squashed.patch, 0002-fixes.patch, 
 0003-final-fixes.patch


 Caches are difficult to configure well as ColumnFamilies are added, similar 
 to how memtables were difficult pre-CASSANDRA-2006.





[jira] [Commented] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.

2011-12-23 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175457#comment-13175457
 ] 

Sylvain Lebresne commented on CASSANDRA-3667:
-

I don't care a lot but I would personally slightly prefer the multi-values 
setting as it's probably not very much harder to implement. 

 We need a way to deactivate row/key caching on a per-cf basis.
 --

 Key: CASSANDRA-3667
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich

 Initial idea would be to either have a boolean flag if we only want to allow 
 disabling row cache, or some multi-value caches option that could be none, 
 key_only, row_only or all.





[jira] [Commented] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175458#comment-13175458
 ] 

Pavel Yaskevich commented on CASSANDRA-3667:


I agree.

 We need a way to deactivate row/key caching on a per-cf basis.
 --

 Key: CASSANDRA-3667
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich

 Initial idea would be to either have a boolean flag if we only want to allow 
 disabling row cache, or some multi-value caches option that could be none, 
 key_only, row_only or all.





[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175460#comment-13175460
 ] 

Pavel Yaskevich commented on CASSANDRA-2749:


That may be the case :) I will re-test as part of the review anyway.

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.1

 Attachments: 0001-2749.patch, 
 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 
 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 
 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 
 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 
 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 
 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 
 2749_proper.tar.gz


 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.





[jira] [Commented] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way

2011-12-23 Thread Yuki Morishita (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175462#comment-13175462
 ] 

Yuki Morishita commented on CASSANDRA-3497:
---

+1

 BloomFilter FP ratio should be configurable or size-restricted some other way
 -

 Key: CASSANDRA-3497
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Brandon Williams
Assignee: Yuki Morishita
Priority: Minor
 Fix For: 1.0.7

 Attachments: 3497-v3.txt, 3497-v4.txt, CASSANDRA-1.0-3497.txt


 When you have a live dc and a purely analytical dc, in many situations you can 
 have fewer nodes on the analytical side, but end up getting restricted by 
 having the BloomFilters in-memory, even though you have absolutely no use for 
 them.  It would be nice if you could reduce this memory requirement by tuning 
 the desired FP ratio, or even just disabling them altogether.
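
 To illustrate why tuning the FP ratio reduces memory, the textbook sizing 
 formula for an optimally configured Bloom filter is 
 bits per key = -ln(p) / (ln 2)^2, where p is the false-positive ratio. The 
 sketch below applies that generic formula only; it is an assumption that 
 Cassandra's own sizing follows it closely, and the class and method names 
 are hypothetical.

```java
public class BloomSizing {
    // Textbook bits-per-key for a Bloom filter with an optimal number of
    // hash functions and target false-positive ratio p, where 0 < p < 1.
    static double bitsPerKey(double p) {
        double ln2 = Math.log(2.0);
        return -Math.log(p) / (ln2 * ln2);
    }

    // Approximate filter size in bytes for the given number of keys.
    static long totalBytes(long keys, double p) {
        return (long) Math.ceil(keys * bitsPerKey(p) / 8.0);
    }

    public static void main(String[] args) {
        long keys = 1_000_000_000L; // hypothetical number of row keys on a node
        for (double p : new double[] {0.001, 0.01, 0.1}) {
            System.out.printf("p=%.3f bits/key=%.2f bytes=%d%n",
                    p, bitsPerKey(p), totalBytes(keys, p));
        }
    }
}
```

 Under this formula, each tenfold relaxation of the FP ratio saves roughly 
 4.8 bits per key.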





[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)

2011-12-23 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175470#comment-13175470
 ] 

Jonathan Ellis commented on CASSANDRA-3143:
---

+1

 Global caches (key/row)
 ---

 Key: CASSANDRA-3143
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: Core
 Fix For: 1.1

 Attachments: 0001-CASSANDRA-3143-squashed.patch, 0002-fixes.patch, 
 0003-final-fixes.patch


 Caches are difficult to configure well as ColumnFamilies are added, similar 
 to how memtables were difficult pre-CASSANDRA-2006.





svn commit: r1222715 [3/3] - in /cassandra/trunk: ./ conf/ doc/cql/ interface/ src/avro/ src/java/org/apache/cassandra/cache/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassandra/config/ s

2011-12-23 Thread xedin
Modified: cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java?rev=1222715&r1=1222714&r2=1222715&view=diff
==
--- cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java 
(original)
+++ cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java Fri Dec 
23 16:09:05 2011
@@ -20,12 +20,17 @@ package org.apache.cassandra.db;
  * 
  */
 
-
 import java.io.IOException;
 import java.util.HashMap;
 import java.util.Map;
 import java.util.concurrent.ExecutionException;
 
+import org.apache.cassandra.cache.KeyCacheKey;
+import org.apache.cassandra.db.filter.QueryFilter;
+import org.apache.cassandra.service.CacheService;
+import org.apache.cassandra.thrift.ColumnParent;
+
+import org.junit.AfterClass;
 import org.junit.Test;
 
 import org.apache.cassandra.CleanupHelper;
@@ -43,18 +48,11 @@ public class KeyCacheTest extends Cleanu
 private static final String TABLE1 = "KeyCacheSpace";
 private static final String COLUMN_FAMILY1 = "Standard1";
 private static final String COLUMN_FAMILY2 = "Standard2";
-private static final String COLUMN_FAMILY3 = "Standard3";
-
-@Test
-public void testKeyCache50() throws IOException, ExecutionException, 
InterruptedException
-{
-testKeyCache(COLUMN_FAMILY1, 64);
-}
 
-@Test
-public void testKeyCache100() throws IOException, ExecutionException, 
InterruptedException
+@AfterClass
+public static void cleanup()
 {
-testKeyCache(COLUMN_FAMILY2, 128);
+cleanupSavedCaches();
 }
 
 @Test
@@ -62,57 +60,48 @@ public class KeyCacheTest extends Cleanu
 {
 CompactionManager.instance.disableAutoCompaction();
 
-ColumnFamilyStore store = 
Table.open(TABLE1).getColumnFamilyStore(COLUMN_FAMILY3);
+ColumnFamilyStore store = 
Table.open(TABLE1).getColumnFamilyStore(COLUMN_FAMILY2);
 
 // empty the cache
-store.invalidateKeyCache();
-assert store.getKeyCacheSize() == 0;
+CacheService.instance.invalidateKeyCache();
+assert CacheService.instance.keyCache.size() == 0;
 
 // insert data and force to disk
-insertData(TABLE1, COLUMN_FAMILY3, 0, 100);
+insertData(TABLE1, COLUMN_FAMILY2, 0, 100);
 store.forceBlockingFlush();
 
 // populate the cache
-readData(TABLE1, COLUMN_FAMILY3, 0, 100);
-assertEquals(100, store.getKeyCacheSize());
+readData(TABLE1, COLUMN_FAMILY2, 0, 100);
+assertEquals(100, CacheService.instance.keyCache.size());
 
 // really? our caches don't implement the map interface? (hence no .addAll)
-Map<Pair<Descriptor, DecoratedKey>, Long> savedMap = new HashMap<Pair<Descriptor, DecoratedKey>, Long>();
-for (Pair<Descriptor, DecoratedKey> k : store.getKeyCache().getKeySet())
+Map<KeyCacheKey, Long> savedMap = new HashMap<KeyCacheKey, Long>();
+for (KeyCacheKey k : CacheService.instance.keyCache.getKeySet())
 {
-savedMap.put(k, store.getKeyCache().get(k));
+savedMap.put(k, CacheService.instance.keyCache.get(k));
 }
 
 // force the cache to disk
-store.keyCache.submitWrite(Integer.MAX_VALUE).get();
-
-// empty the cache again to make sure values came from disk
-store.invalidateKeyCache();
-assert store.getKeyCacheSize() == 0;
-
-// load the cache from disk.  unregister the old mbean so we can recreate a new CFS object.
-// but don't invalidate() the old CFS, which would nuke the data we want to try to load
-store.unregisterMBean();
-ColumnFamilyStore newStore = ColumnFamilyStore.createColumnFamilyStore(Table.open(TABLE1), COLUMN_FAMILY3);
-assertEquals(100, newStore.getKeyCacheSize());
+CacheService.instance.keyCache.submitWrite(Integer.MAX_VALUE).get();
 
-assertEquals(100, savedMap.size());
-for (Map.Entry<Pair<Descriptor, DecoratedKey>, Long> entry : savedMap.entrySet())
-{
-assert newStore.getKeyCache().get(entry.getKey()).equals(entry.getValue());
-}
+CacheService.instance.invalidateKeyCache();
+assert CacheService.instance.keyCache.size() == 0;
 }
 
-public void testKeyCache(String cfName, int expectedCacheSize) throws IOException, ExecutionException, InterruptedException
+@Test
+public void testKeyCache() throws IOException, ExecutionException, InterruptedException
 {
 CompactionManager.instance.disableAutoCompaction();
 
 Table table = Table.open(TABLE1);
-ColumnFamilyStore cfs = table.getColumnFamilyStore(cfName);
+ColumnFamilyStore cfs = table.getColumnFamilyStore(COLUMN_FAMILY1);
+
+// just to make sure that everything is clean
+CacheService.instance.invalidateKeyCache();
 
-

[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-12-23 Thread Sylvain Lebresne (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175481#comment-13175481
 ] 

Sylvain Lebresne commented on CASSANDRA-2988:
-

+1 (on 2988-2-v2) with 2 nits:
* It's probably worth caching the value of 
{{sstableMetadata.estimatedRowSize.count()}} to avoid the double computation 
most of the time.
* I think
{noformat}
long current = buckets.get(i);
if (current > 0)
    sum += current;
{noformat}
can be condensed to {{sum += buckets.get(i);}} (given current can't be 
negative).
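
As an illustration of the second nit, here is a minimal sketch. It assumes the bucket counts live in an AtomicLongArray and are never negative; the class and method names are hypothetical and are not taken from the patch.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical helper, for illustration only.
final class BucketSum
{
    // Guarded form, as quoted from the patch under review.
    static long guardedSum(AtomicLongArray buckets)
    {
        long sum = 0;
        for (int i = 0; i < buckets.length(); i++)
        {
            long current = buckets.get(i);
            if (current > 0)
                sum += current;
        }
        return sum;
    }

    // Condensed form: adding a zero bucket changes nothing, so the guard
    // is redundant as long as no bucket can be negative.
    static long plainSum(AtomicLongArray buckets)
    {
        long sum = 0;
        for (int i = 0; i < buckets.length(); i++)
            sum += buckets.get(i);
        return sum;
    }
}
```

Both forms return the same total whenever every bucket is zero or positive, which is the precondition the comment relies on.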

 Improve SSTableReader.load() when loading index files
 -

 Key: CASSANDRA-2988
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Melvin Wang
Assignee: Melvin Wang
Priority: Minor
 Fix For: 1.0.7

 Attachments: 2988-2-cleaned.txt, 2988-2-v2.txt, 2988-parallel-v2.txt, 
 c2988-2-v2, c2988-modified-buffer.patch, c2988-parallel-load-sstables.patch


 * When we create a BufferedRandomAccessFile, we pass skipCache=true. This 
 hurts read performance because we always process the index files 
 sequentially. A simple fix would be to set it to false.
 * Multiple index files of a single column family can be loaded in parallel. 
 This buys a lot when you have several very large index files.
 * We may also change how we buffer. With BufferedRandomAccessFile, every 
 read incurs a number of checks, such as:
   - do we need to rebuffer?
   - isEOF()?
   - assertions
   These can be simplified to some extent. We can blindly buffer the index 
 file in chunks and process the buffer until a key straddles a chunk 
 boundary. Then we rebuffer and start from the beginning of the partially read 
 key. Conceptually, this is the same as what BRAF does, but without the 
 overhead of the read*() methods in BRAF.
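
The chunked buffering idea in the last bullet could look roughly like the sketch below. It is not the attached patch: the entry layout (2-byte key length, key bytes, 8-byte position) and all names are assumptions made for illustration, and the real index format may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical entry layout (NOT the real index format):
// 2-byte key length, key bytes, 8-byte data position.
final class ChunkedIndexScan
{
    static List<String> readKeys(ReadableByteChannel in, int chunkSize) throws IOException
    {
        List<String> keys = new ArrayList<String>();
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        boolean eof = false;
        while (!eof)
        {
            // Blindly fill whatever space is left in the chunk.
            while (buf.hasRemaining())
            {
                if (in.read(buf) < 0)
                {
                    eof = true;
                    break;
                }
            }
            buf.flip();

            // Consume every entry that lies completely inside the chunk.
            while (buf.remaining() >= 2)
            {
                int start = buf.position();
                int keyLength = buf.getShort() & 0xFFFF;
                if (buf.remaining() < keyLength + 8)
                {
                    buf.position(start); // entry straddles the boundary: rewind to its start
                    break;
                }
                byte[] key = new byte[keyLength];
                buf.get(key);
                buf.getLong(); // data position, unused in this sketch
                keys.add(new String(key, StandardCharsets.UTF_8));
            }

            // Move the partial entry to the front so the next fill completes it.
            // Trailing bytes that never form a full entry are ignored at EOF.
            buf.compact();
            if (!eof && !buf.hasRemaining())
                throw new IOException("entry larger than chunk size " + chunkSize);
        }
        return keys;
    }

    // Convenience overload for in-memory data.
    static List<String> readKeys(byte[] data, int chunkSize)
    {
        try
        {
            return readKeys(Channels.newChannel(new ByteArrayInputStream(data)), chunkSize);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    // Builds sample data in the hypothetical layout above.
    static byte[] encode(String... keys)
    {
        int size = 0;
        for (String k : keys)
            size += 2 + k.getBytes(StandardCharsets.UTF_8).length + 8;
        ByteBuffer out = ByteBuffer.allocate(size);
        long position = 0;
        for (String k : keys)
        {
            byte[] b = k.getBytes(StandardCharsets.UTF_8);
            out.putShort((short) b.length);
            out.put(b);
            out.putLong(position++);
        }
        return out.array();
    }
}
```

The per-entry work is one bounds check against the chunk, and rebuffering happens only when an entry crosses a chunk boundary. An entry larger than the chunk is reported as an error rather than handled.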





svn commit: r1222728 - in /cassandra/trunk: CHANGES.txt src/java/org/apache/cassandra/db/Memtable.java src/java/org/apache/cassandra/db/RowIteratorFactory.java

2011-12-23 Thread slebresne
Author: slebresne
Date: Fri Dec 23 16:25:19 2011
New Revision: 1222728

URL: http://svn.apache.org/viewvc?rev=1222728&view=rev
Log:
Optimize memtable iteration during range scan
patch by slebresne; reviewed by jbellis for CASSANDRA-3638

Modified:
cassandra/trunk/CHANGES.txt
cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java
cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1222728&r1=1222727&r2=1222728&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Fri Dec 23 16:25:19 2011
@@ -29,6 +29,7 @@
  * fsync the directory after new sstable or commitlog segment are created 
(CASSANDRA-3250)
  * fix minor issues reported by FindBugs (CASSANDRA-3658)
  * global key/row caches (CASSANDRA-3143)
+ * optimize memtable iteration during range scan (CASSANDRA-3638)
 
 1.0.7
  * add nodetool setstreamthroughput (CASSANDRA-3571)

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java?rev=1222728&r1=1222727&r2=1222728&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java Fri Dec 23 
16:25:19 2011
@@ -309,11 +309,13 @@ public class Memtable
  * @param startWith Include data in the result from and including this key and to the end of the memtable
  * @return An iterator of entries with the data from the start key 
  */
-public Iterator<Map.Entry<DecoratedKey, ColumnFamily>> getEntryIterator(final RowPosition startWith)
+public Iterator<Map.Entry<DecoratedKey, ColumnFamily>> getEntryIterator(final RowPosition startWith, final RowPosition stopAt)
 {
 return new Iterator<Map.Entry<DecoratedKey, ColumnFamily>>()
 {
-private Iterator<Map.Entry<RowPosition, ColumnFamily>> iter = columnFamilies.tailMap(startWith).entrySet().iterator();
+private Iterator<Map.Entry<RowPosition, ColumnFamily>> iter = stopAt.isMinimum()
+? columnFamilies.tailMap(startWith).entrySet().iterator()
+: columnFamilies.subMap(startWith, true, stopAt, true).entrySet().iterator();
 
 public boolean hasNext()
 {
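
The bounded iteration in this hunk can be illustrated in isolation. The sketch below is an assumption-laden stand-in: Integer keys replace RowPosition, a null stopAt replaces stopAt.isMinimum(), and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical stand-in for the memtable's sorted map.
final class BoundedScan
{
    // stopAt == null means "no upper bound" (stopAt.isMinimum() in the diff).
    static <V> List<Integer> keysInRange(NavigableMap<Integer, V> rows, int startWith, Integer stopAt)
    {
        NavigableMap<Integer, V> view = stopAt == null
                                      ? rows.tailMap(startWith, true)               // open-ended scan
                                      : rows.subMap(startWith, true, stopAt, true); // both ends inclusive
        return new ArrayList<Integer>(view.keySet());
    }

    static NavigableMap<Integer, String> sample()
    {
        NavigableMap<Integer, String> rows = new ConcurrentSkipListMap<Integer, String>();
        for (int i = 1; i <= 5; i++)
            rows.put(i, "row" + i);
        return rows;
    }
}
```

Bounding the view at the map level means rows past stopAt are never visited, so a separate filtering step is not needed. Note that subMap throws IllegalArgumentException if startWith is greater than stopAt.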

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java?rev=1222728&r1=1222727&r2=1222728&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java 
(original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java 
Fri Dec 23 16:25:19 2011
@@ -65,21 +65,11 @@ public class RowIteratorFactory
 {
 // fetch data from current memtable, historical memtables, and SSTables in the correct order.
 final List<CloseableIterator<IColumnIterator>> iterators = new ArrayList<CloseableIterator<IColumnIterator>>();
-// we iterate through memtables with a priority queue to avoid more sorting than necessary.
-// this predicate throws out the rows before the start of our range.
-Predicate<IColumnIterator> p = new Predicate<IColumnIterator>()
-{
-public boolean apply(IColumnIterator row)
-{
-return startWith.compareTo(row.getKey()) <= 0
-&& (stopAt.isMinimum() || row.getKey().compareTo(stopAt) <= 0);
-}
-};
 
 // memtables
 for (Memtable memtable : memtables)
 {
-iterators.add(new ConvertToColumnIterator(filter, p, memtable.getEntryIterator(startWith)));
+iterators.add(new ConvertToColumnIterator(filter, memtable.getEntryIterator(startWith, stopAt)));
 }
 }
 
 for (SSTableReader sstable : sstables)
@@ -139,24 +129,20 @@ public class RowIteratorFactory
 private static class ConvertToColumnIterator extends AbstractIterator<IColumnIterator> implements CloseableIterator<IColumnIterator>
 {
 private final QueryFilter filter;
-private final Predicate<IColumnIterator> pred;
 private final Iterator<Map.Entry<DecoratedKey, ColumnFamily>> iter;
 
-public ConvertToColumnIterator(QueryFilter filter, Predicate<IColumnIterator> pred, Iterator<Map.Entry<DecoratedKey, ColumnFamily>> iter)
+public ConvertToColumnIterator(QueryFilter filter, Iterator<Map.Entry<DecoratedKey, ColumnFamily>> iter)
 {
 this.filter = filter;
-

svn commit: r1222738 - in /cassandra/branches/cassandra-1.0: ./ interface/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/avro/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassa

2011-12-23 Thread jbellis
Author: jbellis
Date: Fri Dec 23 16:39:01 2011
New Revision: 1222738

URL: http://svn.apache.org/viewvc?rev=1222738&view=rev
Log:
allow configuring bloom_filter_fp_chance
patch by yukim and jbellis for CASSANDRA-3497

Modified:
cassandra/branches/cassandra-1.0/CHANGES.txt
cassandra/branches/cassandra-1.0/interface/cassandra.thrift

cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java

cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java

cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Constants.java
cassandra/branches/cassandra-1.0/src/avro/internode.genavro

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/cli/CliClient.java

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/config/CFMetaData.java

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java

cassandra/branches/cassandra-1.0/src/resources/org/apache/cassandra/cli/CliHelp.yaml

Modified: cassandra/branches/cassandra-1.0/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/CHANGES.txt?rev=1222738&r1=1222737&r2=1222738&view=diff
==
--- cassandra/branches/cassandra-1.0/CHANGES.txt (original)
+++ cassandra/branches/cassandra-1.0/CHANGES.txt Fri Dec 23 16:39:01 2011
@@ -1,4 +1,5 @@
 1.0.7
+ * allow configuring bloom_filter_fp_chance (CASSANDRA-3497)
  * attempt hint delivery every ten minutes, or when failure detector
notifies us that a node is back up, whichever comes first.  hint
handoff throttle delay default changed to 1ms, from 50 (CASSANDRA-3554)

Modified: cassandra/branches/cassandra-1.0/interface/cassandra.thrift
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/interface/cassandra.thrift?rev=1222738&r1=1222737&r2=1222738&view=diff
==
--- cassandra/branches/cassandra-1.0/interface/cassandra.thrift (original)
+++ cassandra/branches/cassandra-1.0/interface/cassandra.thrift Fri Dec 23 
16:39:01 2011
@@ -46,7 +46,7 @@ namespace rb CassandraThrift
 #   for every edit that doesn't result in a change to major/minor.
 #
 # See the Semantic Versioning Specification (SemVer) http://semver.org.
-const string VERSION = "19.19.0"
+const string VERSION = "19.20.0"
 
 
 #
@@ -414,6 +414,7 @@ struct CfDef {
 30: optional map<string,string> compaction_strategy_options,
 31: optional i32 row_cache_keys_to_save,
 32: optional map<string,string> compression_options,
+33: optional double bloom_filter_fp_chance,
 }
 
 /* describes a keyspace. */
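
For context on what a false-positive chance implies, the textbook Bloom filter sizing formulas can be sketched as follows. This is the standard math only; it is an assumption that it approximates what the server does with bloom_filter_fp_chance, and the class name is hypothetical.

```java
// Textbook Bloom filter sizing; not taken from the Cassandra source.
final class BloomSizing
{
    // Bits per key for a target false-positive chance p: -ln(p) / (ln 2)^2.
    static double bitsPerKey(double fpChance)
    {
        if (!(fpChance > 0.0 && fpChance < 1.0))
            throw new IllegalArgumentException("fpChance must be in (0, 1): " + fpChance);
        return -Math.log(fpChance) / (Math.log(2) * Math.log(2));
    }

    // Optimal number of hash functions: (bits per key) * ln 2, at least 1.
    static int hashCount(double fpChance)
    {
        return Math.max(1, (int) Math.round(bitsPerKey(fpChance) * Math.log(2)));
    }
}
```

For example, a 1% false-positive chance works out to roughly 9.6 bits per key and 7 hash functions, so a looser setting trades memory for extra disk reads.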

Modified: 
cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java?rev=1222738&r1=1222737&r2=1222738&view=diff
==
--- 
cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
 (original)
+++ 
cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
 Fri Dec 23 16:39:01 2011
@@ -17041,6 +17041,8 @@ public class Cassandra {
 
 private void readObject(java.io.ObjectInputStream in) throws java.io.IOException, ClassNotFoundException {
   try {
+// it doesn't seem like you should have to do this, but java serialization is wacky, and doesn't call the default constructor.
+__isset_bit_vector = new BitSet(1);
 read(new org.apache.thrift.protocol.TCompactProtocol(new org.apache.thrift.transport.TIOStreamTransport(in)));
   } catch (org.apache.thrift.TException te) {
 throw new java.io.IOException(te);

Modified: 
cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java?rev=1222738&r1=1222737&r2=1222738&view=diff
==
--- 
cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java
 (original)
+++ 
cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java
 Fri Dec 23 16:39:01 2011
@@ -71,6 +71,7 @@ public class CfDef implements org.apache
   private static final org.apache.thrift.protocol.TField COMPACTION_STRATEGY_OPTIONS_FIELD_DESC = new org.apache.thrift.protocol.TField("compaction_strategy_options", org.apache.thrift.protocol.TType.MAP, (short)30);
   private static final org.apache.thrift.protocol.TField 
ROW_CACHE_KEYS_TO_SAVE_FIELD_DESC = new 

svn commit: r1222743 - in /cassandra/trunk: ./ conf/ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/io/compress/ src/ja

2011-12-23 Thread jbellis
Author: jbellis
Date: Fri Dec 23 16:44:47 2011
New Revision: 1222743

URL: http://svn.apache.org/viewvc?rev=1222743&view=rev
Log:
merge from 1.0

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/conf/cassandra.yaml
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStoreMBean.java
cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManager.java

cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java

cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressionParameters.java
cassandra/trunk/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
cassandra/trunk/src/java/org/apache/cassandra/service/GCInspector.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java

cassandra/trunk/src/java/org/apache/cassandra/service/StorageServiceMBean.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Fri Dec 23 16:44:47 2011
@@ -1,10 +1,10 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1211709
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1198724,1198726-1206097,1206099-1212854,1212938,1214916,1222372
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1198724,1198726-1206097,1206099-1220925,1220927-1222440
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
-/cassandra/branches/cassandra-1.0:1167085-1222420
+/cassandra/branches/cassandra-1.0:1167085-1222470
 
/cassandra/branches/cassandra-1.0.0:1167104-1167229,1167232-1181093,1181741,1181816,1181820,1182951,1183243
 /cassandra/branches/cassandra-1.0.5:1208016
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1222743&r1=1222742&r2=1222743&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Fri Dec 23 16:44:47 2011
@@ -32,6 +32,9 @@
  * optimize memtable iteration during range scan (CASSANDRA-3638)
 
 1.0.7
+ * attempt hint delivery every ten minutes, or when failure detector
+   notifies us that a node is back up, whichever comes first.  hint
+   handoff throttle delay default changed to 1ms, from 50 (CASSANDRA-3554)
  * add nodetool setstreamthroughput (CASSANDRA-3571)
  * fix assertion when dropping a columnfamily with no sstables (CASSANDRA-3614)
  * more efficient allocation of small bloom filters (CASSANDRA-3618)
@@ -40,6 +43,7 @@
  * stop thrift service in shutdown hook so we can quiesce MessagingService
(CASSANDRA-3335)
 Merged from 0.8:
+ * avoid logging (harmless) exception when GC takes < 1ms (CASSANDRA-3656)
  * prevent new nodes from thinking down nodes are up forever (CASSANDRA-3626)
  * Flush non-cfs backed secondary indexes (CASSANDRA-3659)
 

Modified: cassandra/trunk/conf/cassandra.yaml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/conf/cassandra.yaml?rev=1222743&r1=1222742&r2=1222743&view=diff
==
--- cassandra/trunk/conf/cassandra.yaml (original)
+++ cassandra/trunk/conf/cassandra.yaml Fri Dec 23 16:44:47 2011
@@ -26,8 +26,8 @@ hinted_handoff_enabled: true
 # this defines the maximum amount of time a dead host will have hints
 # generated.  After it has been dead this long, hints will be dropped.
 max_hint_window_in_ms: 3600000 # one hour
-# Sleep this long after delivering each row or row fragment
-hinted_handoff_throttle_delay_in_ms: 50
+# Sleep this long after delivering each hint
+hinted_handoff_throttle_delay_in_ms: 1
 
 # authentication backend, implementing IAuthenticator; used to identify users
 authenticator: org.apache.cassandra.auth.AllowAllAuthenticator

Propchange: cassandra/trunk/contrib/

[jira] [Commented] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.

2011-12-23 Thread Radim Kolar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175536#comment-13175536
 ] 

Radim Kolar commented on CASSANDRA-3667:


You can reuse the old cache settings for that purpose: if the number of cached 
rows/keys is nonzero, then use the new global cache.

 We need a way to deactivate row/key caching on a per-cf basis.
 --

 Key: CASSANDRA-3667
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667
 Project: Cassandra
  Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich

 Initial idea would be to either have a boolean flag if we only want to allow 
 disabling row cache, or some multi-value caches option that could be none, 
 key_only, row_only or all.
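
The multi-value option described above might be modelled along these lines. This is only a sketch of the idea: the enum name, its methods and the parsing rules are hypothetical, not a proposed API.

```java
import java.util.Locale;

// Hypothetical per-column-family caching option with the four values
// named in the description: none, key_only, row_only, all.
enum CachingMode
{
    NONE(false, false),
    KEY_ONLY(true, false),
    ROW_ONLY(false, true),
    ALL(true, true);

    private final boolean keys;
    private final boolean rows;

    CachingMode(boolean keys, boolean rows)
    {
        this.keys = keys;
        this.rows = rows;
    }

    boolean cachesKeys()
    {
        return keys;
    }

    boolean cachesRows()
    {
        return rows;
    }

    // Case-insensitive parse; unknown values throw IllegalArgumentException.
    static CachingMode fromString(String value)
    {
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

Compared with a single boolean flag, the enum covers the key cache and the row cache independently, at the cost of one more schema value to validate.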





svn commit: r1222806 - in /cassandra/branches/cassandra-1.0: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/index/ src/java/org/apache/cassandra/db/index/keys/

2011-12-23 Thread jake
Author: jake
Date: Fri Dec 23 19:10:54 2011
New Revision: 1222806

URL: http://svn.apache.org/viewvc?rev=1222806&view=rev
Log:
Secondary Indexes should report memory consumption
Patch by tjake; reviewed by jbellis for CASSANDRA-3155


Modified:
cassandra/branches/cassandra-1.0/CHANGES.txt

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java

cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/keys/KeysIndex.java

Modified: cassandra/branches/cassandra-1.0/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/CHANGES.txt?rev=1222806&r1=1222805&r2=1222806&view=diff
==
--- cassandra/branches/cassandra-1.0/CHANGES.txt (original)
+++ cassandra/branches/cassandra-1.0/CHANGES.txt Fri Dec 23 19:10:54 2011
@@ -14,7 +14,7 @@ Merged from 0.8:
  * avoid logging (harmless) exception when GC takes < 1ms (CASSANDRA-3656)
  * prevent new nodes from thinking down nodes are up forever (CASSANDRA-3626)
  * Flush non-cfs backed secondary indexes (CASSANDRA-3659)
-
+ * Secondary Indexes should report memory consumption (CASSANDRA-3155)
 
 1.0.6
  * (CQL) fix cqlsh support for replicate_on_write (CASSANDRA-3596)

Modified: 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1222806&r1=1222805&r2=1222806&view=diff
==
--- 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 (original)
+++ 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
 Fri Dec 23 19:10:54 2011
@@ -1028,10 +1028,7 @@ public class ColumnFamilyStore implement
 
 public long getTotalMemtableLiveSize()
 {
-long total = 0;
-for (ColumnFamilyStore cfs : concatWithIndexes())
-total += cfs.getMemtableThreadSafe().getLiveSize();
-return total;
+return getMemtableDataSize() + indexManager.getTotalLiveSize();
 }
 
 public int getMemtableSwitchCount()

Modified: 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java?rev=1222806&r1=1222805&r2=1222806&view=diff
==
--- 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java
 (original)
+++ 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java
 Fri Dec 23 19:10:54 2011
@@ -112,6 +112,11 @@ public abstract class SecondaryIndex
 public abstract void forceBlockingFlush() throws IOException;
 
 /**
+ * Get current amount of memory this index is consuming (in bytes)
+ */
+public abstract long getLiveSize();
+
+/**
  * Allow access to the underlying column family store if there is one
  * @return the underlying column family store or null
  */

Modified: 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java?rev=1222806&r1=1222805&r2=1222806&view=diff
==
--- 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
 (original)
+++ 
cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
 Fri Dec 23 19:10:54 2011
@@ -328,6 +328,27 @@ public class SecondaryIndexManager
 return indexList.keySet();
 }
 
+/**
+ * @return total current ram size of all indexes
+ */
+public long getTotalLiveSize()
+{
+long total = 0;
+
+// we use identity map because per row indexes use same instance
+// across many columns
+IdentityHashMap<SecondaryIndex, Object> indexList = new IdentityHashMap<SecondaryIndex, Object>();
+
+for (Map.Entry<ByteBuffer, SecondaryIndex> entry : indexesByColumn.entrySet())
+{
+SecondaryIndex index = entry.getValue();
+
+if (indexList.put(index, index) == null)
+total += index.getLiveSize();
+}
+
+return total;
+}
 
 /**
  * Removes obsolete index entries and creates new ones for the given row key
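
The identity-based de-duplication in getTotalLiveSize() can be tried out on its own. The sketch below is a simplified stand-in: the Sized interface replaces SecondaryIndex, and it uses Collections.newSetFromMap over an IdentityHashMap instead of the put(...) == null idiom in the commit. All names are hypothetical.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical stand-in for summing live sizes across shared index instances.
final class LiveSizeSum
{
    interface Sized
    {
        long getLiveSize();
    }

    // Counts each distinct instance once, even if it is listed under several columns.
    static long totalLiveSize(Collection<? extends Sized> indexes)
    {
        // Identity semantics: two references count as the same index only if
        // they point at the same object, regardless of equals()/hashCode().
        Set<Sized> seen = Collections.newSetFromMap(new IdentityHashMap<Sized, Boolean>());
        long total = 0;
        for (Sized index : indexes)
        {
            if (seen.add(index))
                total += index.getLiveSize();
        }
        return total;
    }
}
```

Without the identity set, an index shared across several columns would have its memory counted once per column and the reported total would be inflated.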

Modified: 

[jira] [Created] (CASSANDRA-3668) Performance of sstablloader is affected in 1.0.x

2011-12-23 Thread Manish Zope (Created) (JIRA)
Performance of sstablloader is affected in 1.0.x


 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
 Fix For: 1.0.7


One my colleague had reported the bug regarding the degraded performance of the 
sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
Due to above reported issue generator problem is solved but performance of the 
sstableloader is still issue.
Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status.
But the problem with sstableloader still exists.

So opening other issue so that sstbleloader problem should not go unnoticed.

FYI : We have tested the generator part with the patch given in 3589.Its 
Working fine.





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2011-12-23 Thread Manish Zope (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Zope updated CASSANDRA-3668:
---

Summary: Performance of sstableloader is affected in 1.0.x  (was: 
Performance of sstablloader is affected in 1.0.x)

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
 Fix For: 1.0.7

   Original Estimate: 96h
  Remaining Estimate: 96h

 One my colleague had reported the bug regarding the degraded performance of 
 the sstable generator and sstable loader.
 ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
 Due to above reported issue generator problem is solved but performance of 
 the sstableloader is still issue.
 Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status.
 But the problem with sstableloader still exists.
 So opening other issue so that sstbleloader problem should not go unnoticed.
 FYI : We have tested the generator part with the patch given in 3589.Its 
 Working fine.





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2011-12-23 Thread Manish Zope (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Zope updated CASSANDRA-3668:
---

Remaining Estimate: 48h  (was: 96h)
 Original Estimate: 48h  (was: 96h)

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
 Fix For: 1.0.7

   Original Estimate: 48h
  Remaining Estimate: 48h

 One my colleague had reported the bug regarding the degraded performance of 
 the sstable generator and sstable loader.
 ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
 Due to above reported issue generator problem is solved but performance of 
 the sstableloader is still issue.
 Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status.
 But the problem with sstableloader still exists.
 So opening other issue so that sstbleloader problem should not go unnoticed.
 FYI : We have tested the generator part with the patch given in 3589.Its 
 Working fine.





[jira] [Updated] (CASSANDRA-3669) [patch] Word count sample has a flawed addToMutationMap, fix

2011-12-23 Thread Dave Brosius (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Brosius updated CASSANDRA-3669:


Attachment: mutation_sample.diff

 [patch] Word count sample has a flawed addToMutationMap, fix
 

 Key: CASSANDRA-3669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3669
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.0.6
Reporter: Dave Brosius
Priority: Trivial
 Attachments: mutation_sample.diff


 The WordCount example shows how to use client.batch_mutate, and has a helper 
 method for building a mutation map. While the example works properly, its 
 addToMutationMap is flawed in that it won't allow adding multiple 
 columns to the same row, which is what is needed to perform a 'sql like insert' 
 operation, the most likely thing someone learning cassandra will 
 want to do. Fixed the sample addToMutationMap code so that it works correctly 
 for multi column inserts in one 'row'.
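
A minimal sketch of a helper that accumulates several columns under one row is shown below. It is not the attached mutation_sample.diff: plain strings stand in for the Thrift key and Mutation types, and the class name is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: row key -> column family -> list of mutations,
// mirroring the nested shape that batch_mutate expects.
final class MutationMapSketch
{
    static void addToMutationMap(Map<String, Map<String, List<String>>> mutationMap,
                                 String rowKey, String columnFamily, String mutation)
    {
        // Reuse the existing per-row map instead of replacing it, so earlier
        // columns for the same row are kept.
        Map<String, List<String>> byColumnFamily = mutationMap.get(rowKey);
        if (byColumnFamily == null)
        {
            byColumnFamily = new HashMap<String, List<String>>();
            mutationMap.put(rowKey, byColumnFamily);
        }

        List<String> mutations = byColumnFamily.get(columnFamily);
        if (mutations == null)
        {
            mutations = new ArrayList<String>();
            byColumnFamily.put(columnFamily, mutations);
        }

        mutations.add(mutation);
    }
}
```

Calling the helper twice with the same row key and column family leaves both mutations in the list, which is the multi-column insert case the description refers to.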





[jira] [Created] (CASSANDRA-3669) [patch] Word count sample has a flawed addToMutationMap, fix

2011-12-23 Thread Dave Brosius (Created) (JIRA)
[patch] Word count sample has a flawed addToMutationMap, fix


 Key: CASSANDRA-3669
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3669
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 1.0.6
Reporter: Dave Brosius
Priority: Trivial
 Attachments: mutation_sample.diff

The WordCount example shows how to use client.batch_mutate, and has a helper 
method for building a mutation map. While the example works properly, its 
addToMutationMap is flawed in that it won't allow adding multiple 
columns to the same row, which is what is needed to perform a 'sql like insert' 
operation, the most likely thing someone learning cassandra will 
want to do. Fixed the sample addToMutationMap code so that it works correctly 
for multi column inserts in one 'row'.





[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2011-12-23 Thread Manish Zope (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Zope updated CASSANDRA-3668:
---

Description: 
One of my colleagues had reported a bug regarding the degraded performance of 
the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
As stated in the above issue, generator performance has been rectified, but 
performance of the sstableloader is still an issue.

3589 is marked as a duplicate of 3618. Both issues show resolved status, but the 
problem with sstableloader still exists.

So I am opening another issue so that the sstableloader problem does not go unnoticed.

FYI: We have tested the generator part with the patch given in 3589. It is 
working fine.

Please let us know if you require further input from our side.

  was:
One my colleague had reported the bug regarding the degraded performance of the 
sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589 
Due to above reported issue generator problem is solved but performance of the 
sstableloader is still issue.
Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status.
But the problem with sstableloader still exists.

So opening other issue so that sstbleloader problem should not go unnoticed.

FYI : We have tested the generator part with the patch given in 3589.Its 
Working fine.


 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
 Fix For: 1.0.7

   Original Estimate: 48h
  Remaining Estimate: 48h






[jira] [Assigned] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2011-12-23 Thread Jonathan Ellis (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-3668:
-

Assignee: Yuki Morishita

Can you tell us how to reproduce?  What kind of degradation are you seeing?

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Assignee: Yuki Morishita
 Fix For: 1.0.7

   Original Estimate: 48h
  Remaining Estimate: 48h






[jira] [Commented] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x

2011-12-23 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175561#comment-13175561
 ] 

Jonathan Ellis commented on CASSANDRA-3668:
---

And to clarify: we're still talking about performance compared to 0.8.7, right?

 Performance of sstableloader is affected in 1.0.x
 -

 Key: CASSANDRA-3668
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Assignee: Yuki Morishita
 Fix For: 1.0.7

   Original Estimate: 48h
  Remaining Estimate: 48h






[jira] [Commented] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered puts the node down

2011-12-23 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175566#comment-13175566
 ] 

Jonathan Ellis commented on CASSANDRA-3666:
---

Were there any error messages logged?

 Changing compaction strategy from Leveled to SizeTiered puts the node down
 --

 Key: CASSANDRA-3666
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.6
 Environment: Windows Server 2008 R2 64bit
Reporter: Viktor Jevdokimov

 When a column family's compaction strategy is changed from Leveled to 
 SizeTiered while Leveled compaction tasks are pending, Cassandra starts 
 flooding the logs with thousands of messages per second:
 Nothing to compact in ColumnFamily1.  Use forceUserDefinedCompaction if you 
 wish to force compaction of single sstables (e.g. for tombstone collection)
 As a result, the log disk fills up and the system goes down.





[jira] [Updated] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered puts the node down

2011-12-23 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3666:
--

Attachment: 3666.txt

I bet the culprit is how LCS sets the min/max compaction threshold but STCS 
does not.  Can you try the attached patch?

 Changing compaction strategy from Leveled to SizeTiered puts the node down
 --

 Key: CASSANDRA-3666
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.6
 Environment: Windows Server 2008 R2 64bit
Reporter: Viktor Jevdokimov
  Labels: compaction
 Fix For: 1.0.7

 Attachments: 3666.txt







[jira] [Updated] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered logs millions of messages about nothing to compact

2011-12-23 Thread Jonathan Ellis (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3666:
--

Affects Version/s: (was: 1.0.6)
   1.0.0
  Summary: Changing compaction strategy from Leveled to SizeTiered 
logs millions of messages about nothing to compact  (was: Changing compaction 
strategy from Leveled to SizeTiered puts the node down)

 Changing compaction strategy from Leveled to SizeTiered logs millions of 
 messages about nothing to compact
 --

 Key: CASSANDRA-3666
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.0
 Environment: Windows Server 2008 R2 64bit
Reporter: Viktor Jevdokimov
Assignee: Jonathan Ellis
  Labels: compaction
 Fix For: 1.0.7

 Attachments: 3666.txt







[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers

2011-12-23 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175574#comment-13175574
 ] 

paul cannon commented on CASSANDRA-3507:


All default Linux-based OS installs include Python nowadays, and so does Mac OS 
X. A py2exe compilation of cqlsh, along with a GUI shell like IDLE, is a 
possibility for the Windows side. So no, there's no reason that switching to 
cqlsh should make the install or bootstrap processes more complicated.

Also, cqlsh is already written in Python, and includes a lot of features which 
would probably be overly difficult or time-consuming to rewrite on the JVM.

 Proposal: separate cqlsh from CQL drivers
 -

 Key: CASSANDRA-3507
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3507
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging, Tools
Affects Versions: 1.0.3
 Environment: Debian-based systems
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cql, cqlsh
 Fix For: 1.1


 Whereas:
 * It has been shown to be very desirable to decouple the release cycles of 
 Cassandra from the various client CQL drivers, and
 * It is also desirable to include a good interactive CQL client with releases 
 of Cassandra, and
 * It is not desirable for Cassandra releases to depend on 3rd-party software 
 which is neither bundled with Cassandra nor readily available for every 
 target platform, but
 * Any good interactive CQL client will require a CQL driver;
 Therefore, be it resolved that:
 * cqlsh will not use an official or supported CQL driver, but will include 
 its own private CQL driver, not intended for use by anything else, and
 * the Cassandra project will still recommend installing and using a proper 
 CQL driver for client software.
 To ease maintenance, the private CQL driver included with cqlsh may very well 
 be created by copying the python CQL driver from one directory into 
 another, but the user shouldn't rely on this. Maybe we even ought to take 
 some minor steps to discourage its use for other purposes.
 Thoughts?





[jira] [Commented] (CASSANDRA-3397) Problem markers don't show up in Eclipse

2011-12-23 Thread David Allsopp (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175577#comment-13175577
 ] 

David Allsopp commented on CASSANDRA-3397:
--

I was just interested in the error markers, thanks - I too found the Ant 
Builder too heavy!

 Problem markers don't show up in Eclipse
 

 Key: CASSANDRA-3397
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3397
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Affects Versions: 1.0.0
 Environment: Eclipse
Reporter: David Allsopp
Assignee: David Allsopp
Priority: Minor
  Labels: ant, eclipse, ide
 Fix For: 1.0.7

 Attachments: Cassandra-3397.patch


 The generated Eclipse files install an Ant Builder to build Cassandra within 
 Eclipse. This appears to mean that the default Java Builder is not present. 
 This means that no problem markers show up in the Problem view or the Package 
 Explorer etc when there are compiler errors or warnings  - you have to study 
 the console output, then navigate manually to the sources of the problems, 
 which is very tedious.
 It seems to be possible to re-install the default Java Builder in parallel 
 with the Ant Builder, getting the best of both worlds. I have documented this 
 on the wiki at http://wiki.apache.org/cassandra/RunningCassandraInEclipse
 I was wondering a) whether this can be done automatically by the 
 generate-eclipse-files Ant target, and b) whether using both Builders will be 
 a problem if one is working on any of the generated code (Thrift, CQL etc). If 
 so, the Java Builder can be temporarily disabled by unticking it under 
 Properties->Builders...
 See also https://issues.apache.org/jira/browse/CASSANDRA-2854





[jira] [Updated] (CASSANDRA-3633) update stress to support prepared statements

2011-12-23 Thread Eric Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-3633:
--

Attachment: v1-0003-support-for-server-side-prepared-statements.txt
v1-0002-wrap-Cassandra.Client-for-prepared-statement-storage.txt
v1-0001-CASSANDRA-3633-refactor-for-parametized-queries.txt

 update stress to support prepared statements
 

 Key: CASSANDRA-3633
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3633
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: 
 v1-0001-CASSANDRA-3633-refactor-for-parametized-queries.txt, 
 v1-0002-wrap-Cassandra.Client-for-prepared-statement-storage.txt, 
 v1-0003-support-for-server-side-prepared-statements.txt


 The {{stress}} utility needs to be updated for testing prepared statements.





[jira] [Updated] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Eric Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-3634:
--

Attachment: v1-0002-change-bind-parms-from-string-to-bytes.txt
v1-0001-CASSANDRA-3634-generated-thrift-code.txt

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Eric Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175581#comment-13175581
 ] 

Eric Evans commented on CASSANDRA-3634:
---

v1-0001-CASSANDRA-3634-generated-thrift-code.txt and 
v1-0002-change-bind-parms-from-string-to-bytes.txt convert string bind params 
to binary for purposes of performance testing.

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Updated] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Eric Evans (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-3634:
--

Attachment: stress-change-bind-parms-to-BB.patch

stress-change-bind-parms-to-BB.patch updates stress to use binary query 
parameters for prepared statements.

This patch only updates the operations used in testing (it would need more 
work before committing).

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: stress-change-bind-parms-to-BB.patch, 
 v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Eric Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175588#comment-13175588
 ] 

Eric Evans commented on CASSANDRA-3634:
---

Here is the performance comparison.  I stuck to the same tests I performed 
earlier (those earlier results can be found 
[here|http://www.acunu.com/blogs/eric-evans/cql-benchmarking]).  The patches to 
support binary query parameters for Cassandra and {{stress}} are attached to 
this issue, and the raw results can be found 
[here|http://people.apache.org/~eevans/3634].

_Note: Percentages listed are in relation to RPC performance._

h3. Inserts, 20M rows x 5 columns

!http://people.apache.org/~eevans/3634/insert_20mx5_noidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|23,681/s|1.1ms|
|CQL|21,128/s (-11%)|1.3ms (+11%)|
|CQL w/ Prepared statements|23,911/s|1.1ms|
|CQL w/ Prepared statements (binary parms)|24,919/s (+5%)|1.2ms (+5%)|


h3. Inserts, 10M rows x 5 columns, KEYS index

!http://people.apache.org/~eevans/3634/insert_10mx5_keysidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|10,054/s|5ms|
|CQL|9,326/s (-7%)|5.4ms (+8%)|
|CQL w/ Prepared statements|10,413/s (+3%)|4.8ms (-3%)|
|CQL w/ Prepared statements (binary parms)|10,299/s (+2%)|5ms|


h3. Counter increments, 10M rows x 5 columns

!http://people.apache.org/~eevans/3634/count_10mx5_noidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|22,075/s|1.2ms|
|CQL|20,645/s (-6%)|1.2ms (+2%)|
|CQL w/ Prepared statements|24,286/s (+9%)|1.2ms (-1%)|
|CQL w/ Prepared statements (binary parms)|23,359/s (+5%)|1.2ms|


h3. Reads, 20M rows x 5 columns

!http://people.apache.org/~eevans/3634/read_20mx5_noidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|22,285/s|2.1ms|
|CQL|20,080/s (-10%)|2.3ms (+9%)|
|CQL w/ Prepared statements|22,374/s|2.1ms (-1%)|
|CQL w/ Prepared statements (binary parms)|22,176/s|2.1ms|
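
For readers checking the tables above, the percentages can be reproduced 
approximately from the listed averages. A minimal sketch, assuming each 
percentage is the relative difference from the RPC figure, rounded to the 
nearest whole percent. {{pct_vs_rpc}} is a hypothetical helper, not part of 
{{stress}}. The posted percentages were presumably computed from the raw 
results, so recomputing from the rounded table values can be off by a point 
for some rows.

```python
def pct_vs_rpc(value, rpc_value):
    """Relative difference from the RPC baseline, in whole percent."""
    return round((value - rpc_value) / rpc_value * 100)


# Average OP rates from the "Inserts, 20M rows x 5 columns" table.
rpc_ops = 23681
cql_ops = 21128
binary_ops = 24919

print(pct_vs_rpc(cql_ops, rpc_ops))     # -11
print(pct_vs_rpc(binary_ops, rpc_ops))  # 5
```

A negative value on the OP rate column means fewer operations per second than 
RPC; on the latency column the sign reads the other way, since lower latency 
is better.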


 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: stress-change-bind-parms-to-BB.patch, 
 v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Updated] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay updated CASSANDRA-3623:
-

Attachment: 0002-tests-for-MMaped-Compression-segmented-file-v2.patch
0001-MMaped-Compression-segmented-file-v2.patch

The attached patch includes a memcpy optimization that the earlier one didn't.

Performance:
Current trunk: 400+ms Avg
Removing CRC (CASSANDRA-3611): 200+ms Avg
With this patch: 100+ms Avg



 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem 
 to use mmap, which leads to higher CPU usage on the nodes and higher read 
 latencies. 
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the buffer would be better.





[Cassandra Wiki] Update of Cassandra2474 by JonathanEllis

2011-12-23 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The Cassandra2474 page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/Cassandra2474?action=diff&rev1=2&rev2=3

Comment:
add Alpha, Beta, and Discussion Summary sections

  
  <<TableOfContents(100)>>
  
+ == Goals ==
+ 
+ Primary: provide a CQL syntax for updating and querying composite column 
families.
+ 
+ Secondary goal: proposed syntax should be implementable by the Hive driver 
with the minimum of changes from mainline Hive.  In particular, changes to the 
Hive parser are too difficult to maintain long-term and are Right Out.  We 
would prefer to avoid changes to the Hive metastore but this is doable if 
necessary.
+ 
+ Tertiary goal: it would be nice to also support supercolumns
+ 
+ == Non-goals ==
+ 
+ Supporting arbitrarily-and-non-uniformly nested document data is a 
non-goal.  https://issues.apache.org/jira/browse/CASSANDRA-3647 is created to 
follow up on this related problem.
+ 
  == Alpha ==
  
- Discussion starts 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13046834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046834|here]]
+ The short-lived first proposal envisioned adding the prefix from which to 
select a resultset to the table name in the FROM clause.  Discussion starts 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13046834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046834|here]]
  
- === Goals ===
+ {{{
+ SELECT x, y FROM foo:bar WHERE parent='columnA'
+ }}}
  
-  * FIXME: add goals
-  * FIXME: add goals
-  * FIXME: add goals
+ {{{
+ select a, b FROM foo:bar:columnA where subparent='x'
+ }}}
+ 
+ === Discussion Summary ===
+ 
+ Jonathan was thinking in terms of supercolumns for this early proposal.  It's 
not clear how to generalize this to composites where the subcolumns are not 
explicitly named in the CompositeType definition.
+ 
+ This proposal would require a Hive metastore change, but the nail in the 
coffin is that this means you cannot use WHERE clauses with the parent parts 
of the column.  So, no range queries (necessary for map/reduce) or even slices 
within the same row.
  
  == Beta ==
  
- Discussion starts 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13095626&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13095626|here]]
+ This proposal suggests the use of a keyword or hint to indicate that a query 
is transposed. Discussion starts 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13046937&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046937|here]]
  
- === Goals ===
+ The first part of the discussion is where to put the transposition marker:
  
-  * FIXME: add goals
-  * FIXME: add goals
-  * FIXME: add goals
+ {{{
+ select /*+TRANSPOSED*/ key, column, subcolumn, value from foo;
+ }}}
+ 
+ {{{
+ select key, column, subcolumn, value from foo TRANSPOSED;
+ }}}
+ 
+ {{{
+ select transposed(key, column, subcolumn, value) from foo;
+ }}}
+ 
+ Settling on table:transposed because that requires no Hive changes:
+ 
+ {{{
+ select key, column, subcolumn, value from foo:transposed;
+ }}}
+ 
+ The second part, starting 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13095626&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13095626|here]],
 digs into how to deal with destructuring the composite column name:
+ 
+ {{{
+ SELECT name AS (tweet_id, username), value AS body
+ FROM timeline:transposed
+ WHERE tweet_id = '95a789a' AND user_id = 'cscotta'
+ }}}
+ 
+ {{{
+ SELECT component1 AS tweet_id, component2 AS username, component3 location, 
value AS body
+ FROM timeline:transposed
+ WHERE user_id = '95a789a'
+ }}}
+ 
+ {{{
+ UPDATE tweets:transposed SET COMPOUND NAME ('2e1c3308', 'cscotta') = 'My 
motocycle...' WHERE KEY = key;
+ }}}
+ 
+ {{{
+ UPDATE tweets:transposed SET value = 'my motorcycle' WHERE KEY= key AND 
column = COMPOUND_NAME('2e1c3308', 'cscotta');
+ }}}
+ 
+ === Discussion Summary ===
+ 
+ There was general agreement that FROM foo:transposed is a reasonable 
syntax; however, neither the componentX syntax (where X is in range(1, number 
of components in the CompositeType)) nor the name AS (x, y) syntax met with 
approval: the name AS syntax requires patching the Hive parser, and the 
componentX syntax is ugly and repetitive to use.  The UPDATE syntaxes were 
also unsatisfactory.
  
  == Gamma ==
  
+ This proposal switches gears to dealing with transposition using DDL instead 
of 
+ 
  Discussion starts 
[[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13171304&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171304|here]]
  
- === Goals ===
- 
-  * FIXME: add goals
-  * FIXME: add 

[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175595#comment-13175595
 ] 

Vijay commented on CASSANDRA-3623:
--

Hot Methods before the patch (trunk):
Excl. User CPU    Name
    sec.      %
1480.474 100.00   Total
 756.717  51.11   crc32
 387.767  26.19   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
  54.814   3.70   org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
  46.676   3.15   org.apache.cassandra.io.util.RandomAccessReader.init(java.io.File, int, boolean)
  45.697   3.09   Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
  39.417   2.66   memcpy
  36.931   2.49   static@0xd8e9 (libpthread-2.5.so)
  23.272   1.57   CompactibleFreeListSpace::block_size(const HeapWord*) const
  22.766   1.54   SpinPause
  12.593   0.85   BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
   9.304   0.63   CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
   8.468   0.57   CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
   8.051   0.54   ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
   5.400   0.36   madvise
   4.619   0.31   CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
   1.584   0.11   CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
   1.551   0.10   SweepClosure::do_blk_careful(HeapWord*)


Hot Methods After the patch:
sec.  %
537.681 100.00   Total
529.719  98.52   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
4.168   0.78   memcpy
0.143   0.03   Unknown
0.121   0.02   send
0.121   0.02   sun.misc.Unsafe.park(boolean, long)
0.110   0.02   sun.misc.Unsafe.unpark(java.lang.Object)
0.088   0.02   Interpreter
0.077   0.01   org.apache.cassandra.utils.EstimatedHistogram.max()
0.077   0.01   recv
0.066   0.01   SpinPause
0.055   0.01   org.apache.cassandra.utils.EstimatedHistogram.mean()
0.044   0.01   java.lang.Object.wait(long)
0.044   0.01   org.apache.cassandra.utils.EstimatedHistogram.min()
0.044   0.01   __pthread_cond_signal
0.044   0.01   vtable stub
0.033   0.01   java.lang.Object.notify()
0.033   0.01   java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
0.033   0.01   org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
0.033   0.01   PhaseLive::compute(unsigned)
0.033   0.01   poll
0.022   0.00   Arena::contains(const void*) const
0.022   0.00   CompactibleFreeListSpace::free() const
0.022   0.00   I2C/C2I adapters
0.022   0.00   IndexSetIterator::advance_and_next()
0.022   0.00   java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader)
0.022   0.00   java.lang.Long.getChars(long, int, char[])
0.022   0.00   java.nio.Bits.swap(int)



Before this patch response times:
Epoch       Rds/s  RdLat     Wrts/s  WrtLat  %user  %sys  %idle  %iowait  %steal  md0r/s  w/s   rMB/s  wMB/s  NetRxKb  NetTxKb  Read 99th    Read 95th    Write 99th  Write 95th  Compacts
1324587443  15     186.305   0       0.000   27.85  0.02  71.83  0.24     0.05    3.89    0.00  0.12   0.00   41       45       545.791 ms   454.826 ms   0.00 ms     0.00 ms     Pen/0
1324587455  15     1142.712  0       0.000   39.55  0.13  57.61  2.50     0.21    118.30  0.30  2.20   0.00   34       36       8409.007 ms  8409.007 ms  0.00 ms     0.00 ms     Pen/0
1324587467  10     171.808   0       0.000   23.83  0.04  76.05  0.04     0.05    4.80    0.00  0.14   0.00   127      33       454.826 ms   315.852 ms   0.00 ms     0.00 ms     Pen/0
1324587478  10     182.775   0       0.000   20.43  0.04  79.47  0.01     0.05    1.60    0.40  0.04   0.00   30       37       379.022 ms   379.022 ms   0.00 ms     0.00 ms     Pen/0
1324587490  13     190.893   0       0.000   27.58  0.03  72.20  0.14     0.06    3.20    0.50  0.09   0.00   39       42       545.791 ms   379.022 ms   0.00 ms     0.00 ms     Pen/0
1324587503  28     358.719   0       0.000   52.24  0.08  46.20  1.40     0.09    159.40  0.00  3.16   0.00   196      71       3379.391 ms  943.127 ms   0.00 ms     0.00 ms     Pen/0
1324587517  13     194.281   0       0.000   16.68  0.02  83.23  0.04     0.02    2.40    0.30  0.07   0.00   38       41       785.939 ms   545.791 ms   0.00 ms     0.00 ms     Pen/0
1324587535  36     662.410   0       0.000   58.34  0.08  41.42  0.06     0.10    3.60    0.20  0.11   0.00   173      81       3379.391 ms

[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers

2011-12-23 Thread Jeremy Hanna (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175596#comment-13175596
 ] 

Jeremy Hanna commented on CASSANDRA-3507:
-

Makes sense.  I hadn't realized so much had gone into the Python-based shell.  
I also hadn't realized it could be made into an executable for Windows.

 Proposal: separate cqlsh from CQL drivers
 -

 Key: CASSANDRA-3507
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3507
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging, Tools
Affects Versions: 1.0.3
 Environment: Debian-based systems
Reporter: paul cannon
Assignee: paul cannon
Priority: Minor
  Labels: cql, cqlsh
 Fix For: 1.1


 Whereas:
 * It has been shown to be very desirable to decouple the release cycles of 
 Cassandra from the various client CQL drivers, and
 * It is also desirable to include a good interactive CQL client with releases 
 of Cassandra, and
 * It is not desirable for Cassandra releases to depend on 3rd-party software 
 which is neither bundled with Cassandra nor readily available for every 
 target platform, but
 * Any good interactive CQL client will require a CQL driver;
 Therefore, be it resolved that:
 * cqlsh will not use an official or supported CQL driver, but will include 
 its own private CQL driver, not intended for use by anything else, and
 * the Cassandra project will still recommend installing and using a proper 
 CQL driver for client software.
 To ease maintenance, the private CQL driver included with cqlsh may very well 
 be created by copying the python CQL driver from one directory into 
 another, but the user shouldn't rely on this. Maybe we even ought to take 
 some minor steps to discourage its use for other purposes.
 Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175598#comment-13175598
 ] 

Vijay commented on CASSANDRA-3623:
--

The above test was run on a 12-node cluster, but the response times and the hot 
methods were collected from one random node in the cluster. 
The test was executed on AWS M2.4xl instances with heap settings of 12/2.

 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to 
 use the mmap, which causes higher CPU usage on the nodes and higher latencies on 
 reads. 
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader:
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the buffer will be better.





[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175595#comment-13175595
 ] 

Vijay edited comment on CASSANDRA-3623 at 12/23/11 10:30 PM:
-

Hot Methods before the patch (trunk, without any patch):
Excl. User CPU   Name

    sec.      %
1480.474 100.00   Total
 756.717  51.11   crc32
 387.767  26.19   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
  54.814   3.70   org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
  46.676   3.15   org.apache.cassandra.io.util.RandomAccessReader.<init>(java.io.File, int, boolean)
  45.697   3.09   Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
  39.417   2.66   memcpy
  36.931   2.49   static@0xd8e9 (libpthread-2.5.so)
  23.272   1.57   CompactibleFreeListSpace::block_size(const HeapWord*) const
  22.766   1.54   SpinPause
  12.593   0.85   BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
   9.304   0.63   CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
   8.468   0.57   CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
   8.051   0.54   ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
   5.400   0.36   madvise
   4.619   0.31   CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
   1.584   0.11   CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
   1.551   0.10   SweepClosure::do_blk_careful(HeapWord*)


Hot Methods After the patch:
sec.  %
537.681 100.00   Total
529.719  98.52   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
4.168   0.78   memcpy
0.143   0.03   Unknown
0.121   0.02   send
0.121   0.02   sun.misc.Unsafe.park(boolean, long)
0.110   0.02   sun.misc.Unsafe.unpark(java.lang.Object)
0.088   0.02   Interpreter
0.077   0.01   org.apache.cassandra.utils.EstimatedHistogram.max()
0.077   0.01   recv
0.066   0.01   SpinPause
0.055   0.01   org.apache.cassandra.utils.EstimatedHistogram.mean()
0.044   0.01   java.lang.Object.wait(long)
0.044   0.01   org.apache.cassandra.utils.EstimatedHistogram.min()
0.044   0.01   __pthread_cond_signal
0.044   0.01   vtable stub
0.033   0.01   java.lang.Object.notify()
0.033   0.01   java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
0.033   0.01   org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
0.033   0.01   PhaseLive::compute(unsigned)
0.033   0.01   poll
0.022   0.00   Arena::contains(const void*) const
0.022   0.00   CompactibleFreeListSpace::free() const
0.022   0.00   I2C/C2I adapters
0.022   0.00   IndexSetIterator::advance_and_next()
0.022   0.00   java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader)
0.022   0.00   java.lang.Long.getChars(long, int, char[])
0.022   0.00   java.nio.Bits.swap(int)



Before this patch response times (With crc chance set to 0):
Epoch       Rds/s  RdLat     Wrts/s  WrtLat  %user  %sys  %idle  %iowait  %steal  md0r/s  w/s   rMB/s  wMB/s  NetRxKb  NetTxKb  Read 99th    Read 95th    Write 99th  Write 95th  Compacts
1324587443  15     186.305   0       0.000   27.85  0.02  71.83  0.24     0.05    3.89    0.00  0.12   0.00   41       45       545.791 ms   454.826 ms   0.00 ms     0.00 ms     Pen/0
1324587455  15     1142.712  0       0.000   39.55  0.13  57.61  2.50     0.21    118.30  0.30  2.20   0.00   34       36       8409.007 ms  8409.007 ms  0.00 ms     0.00 ms     Pen/0
1324587467  10     171.808   0       0.000   23.83  0.04  76.05  0.04     0.05    4.80    0.00  0.14   0.00   127      33       454.826 ms   315.852 ms   0.00 ms     0.00 ms     Pen/0
1324587478  10     182.775   0       0.000   20.43  0.04  79.47  0.01     0.05    1.60    0.40  0.04   0.00   30       37       379.022 ms   379.022 ms   0.00 ms     0.00 ms     Pen/0
1324587490  13     190.893   0       0.000   27.58  0.03  72.20  0.14     0.06    3.20    0.50  0.09   0.00   39       42       545.791 ms   379.022 ms   0.00 ms     0.00 ms     Pen/0
1324587503  28     358.719   0       0.000   52.24  0.08  46.20  1.40     0.09    159.40  0.00  3.16   0.00   196      71       3379.391 ms  943.127 ms   0.00 ms     0.00 ms     Pen/0
1324587517  13     194.281   0       0.000   16.68  0.02  83.23  0.04     0.02    2.40    0.30  0.07   0.00   38       41       785.939 ms   545.791 ms   0.00 ms     0.00 ms     Pen/0
1324587535  36     662.410   0       0.000   58.34  0.08

[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175601#comment-13175601
 ] 

Pavel Yaskevich commented on CASSANDRA-3623:


Can you please compare your version with trunk without crc32? Otherwise it doesn't 
seem to be a fair match; it would be nice to see the same statistics for hot 
methods and response times. The thing I hate about MappedByteBuffer is that if 
you duplicate it, as you do in reBuffer(), unmapping becomes impossible until 
every last duplicate is GC'ed, which implies that we won't be able to 
release old SSTables...





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175606#comment-13175606
 ] 

Vijay commented on CASSANDRA-3623:
--

I did it again, I confused everyone with my test data :)
The hot methods shown above are the only data taken from trunk; the rest is 
without CRC (hot methods without CRC and without this patch are as follows).


Excl. User CPU   Name

  sec.  %
629.460 100.00   Total
336.913  53.52   static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
 50.074   7.96   org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
 43.057   6.84   org.apache.cassandra.io.util.RandomAccessReader.<init>(java.io.File, int, boolean)
 35.623   5.66   memcpy
 33.555   5.33   static@0xd8e9 (libpthread-2.5.so)
 30.673   4.87   Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
 26.384   4.19   CompactibleFreeListSpace::block_size(const HeapWord*) const
 15.199   2.41   SpinPause
 11.966   1.90   BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
  8.479   1.35   CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
  8.007   1.27   CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
  5.169   0.82   madvise
  5.059   0.80   ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
  4.146   0.66   CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
  2.431   0.39   CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
  1.375   0.22   SweepClosure::do_blk_careful(HeapWord*)
  0.825   0.13   Par_PushOrMarkClosure::do_oop(oopDesc*)
  0.616   0.10   GenericTaskQueue<oopDesc*, 131072>::pop_local(oopDesc*)
  0.561   0.09   instanceKlass::oop_oop_iterate_nv(oopDesc*, Par_PushOrMarkClosure*)
  0.473   0.08   CardTableModRefBS::process_stride(Space*, MemRegion, int, int, DirtyCardToOopClosure*, MemRegionClosure*, bool, signed char**, unsigned long, unsigned long)
  0.374   0.06   Par_MarkFromRootsClosure::scan_oops_in_oop(HeapWord*)
  0.319   0.05   BitMap::par_at_put(unsigned long, bool)
  0.308   0.05   MemRegion::intersection(MemRegion) const
  0.275   0.04   munmap
  0.220   0.03   CardTableModRefBS::dirty_card_iterate(MemRegion, MemRegionClosure*)


Hope this makes sense.





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175607#comment-13175607
 ] 

Vijay commented on CASSANDRA-3623:
--

BTW: I can remove the duplicate() (I didn't realize the implications), if you 
think the rest is fine.





[jira] [Commented] (CASSANDRA-3374) CQL can't create column with compression or that use leveled compaction

2011-12-23 Thread paul cannon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175619#comment-13175619
 ] 

paul cannon commented on CASSANDRA-3374:


+1

 CQL can't create column with compression or that use leveled compaction
 ---

 Key: CASSANDRA-3374
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3374
 Project: Cassandra
  Issue Type: Bug
  Components: API
Affects Versions: 1.0.0
Reporter: Sylvain Lebresne
Assignee: Pavel Yaskevich
Priority: Minor
  Labels: cql
 Fix For: 1.0.7

 Attachments: CASSANDRA-3374.patch


 Looking at CreateColumnFamilyStatement.java, it doesn't seem CQL can create 
 compressed column families, nor define a compaction strategy.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Rick Shaw (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175623#comment-13175623
 ] 

Rick Shaw commented on CASSANDRA-3634:
--

+1

Looks like strings win in terms of performance. They offer the most 
flexibility in transformation as well. I think we have a winner.

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: stress-change-bind-parms-to-BB.patch, 
 v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175626#comment-13175626
 ] 

Pavel Yaskevich commented on CASSANDRA-3623:


The problem is that you can't remove duplicate() because the same segment can 
be requested concurrently by different reads and we don't want to limit 
concurrency with synchronisation over segment use.
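
As a rough illustration of the point about duplicate(): each duplicate of a mapped buffer shares the underlying bytes but keeps its own position and limit, so concurrent readers do not have to synchronise on one shared cursor. The sketch below is not taken from the patch; the class name, the temp file and the four sample bytes are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of Cassandra.
class DuplicateDemo {
    /**
     * Maps a tiny temp file, reads one byte through a duplicate, and returns
     * {position of the original mapping, position of the duplicate, byte read}.
     */
    static int[] readThroughDuplicate() {
        try {
            Path tmp = Files.createTempFile("segment", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {10, 20, 30, 40});
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                MappedByteBuffer segment = ch.map(FileChannel.MapMode.READ_ONLY, 0, 4);
                // The duplicate shares the mapped bytes but has its own position/limit.
                ByteBuffer view = segment.duplicate();
                view.position(2);
                int value = view.get(); // advances only the duplicate's position
                return new int[] {segment.position(), view.position(), value};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = readThroughDuplicate();
        System.out.println("segment position = " + r[0]
                + ", view position = " + r[1] + ", byte read = " + r[2]);
    }
}
```

The flip side is the one raised earlier in the thread: the mapping stays alive for as long as any duplicate is still reachable.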





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Eric Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175627#comment-13175627
 ] 

Eric Evans commented on CASSANDRA-3634:
---

At Brandon's suggestion, I'm rerunning the insert test with some higher column 
counts.  That should make any per-term performance costs/savings more obvious.  
I'll post those results when I have them.





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175628#comment-13175628
 ] 

Pavel Yaskevich commented on CASSANDRA-3623:


Hot reads show that if we remove the overhead of the CRAR and RAR initialization we 
would get numbers very close to mmap'ed I/O; also, as you can see, 
snappy takes ~1.6x the time with mmap'ed I/O.
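
For reference, the ~1.6x figure can presumably be checked from the snappy rows of the hot-method listings posted earlier in this thread: 336.913 s of 629.460 s without CRC and without the patch, against 529.719 s of 537.681 s with the patch. A small sketch of that arithmetic follows; the class and method names are only illustrative.

```java
// Hypothetical helper for re-checking the profile numbers quoted in this thread.
class SnappyShare {
    /** How many times larger a is than b. */
    static double ratio(double a, double b) {
        return a / b;
    }

    /** Share of total, in percent. */
    static double sharePercent(double part, double total) {
        return 100.0 * part / total;
    }

    public static void main(String[] args) {
        double snappyNoPatch = 336.913, totalNoPatch = 629.460; // no CRC, no patch
        double snappyPatched = 529.719, totalPatched = 537.681; // with the patch

        System.out.printf("snappy time ratio (patched / unpatched): %.2f%n",
                ratio(snappyPatched, snappyNoPatch));
        System.out.printf("snappy share without patch: %.2f%%%n",
                sharePercent(snappyNoPatch, totalNoPatch));
        System.out.printf("snappy share with patch: %.2f%%%n",
                sharePercent(snappyPatched, totalPatched));
    }
}
```

The ratio comes out at roughly 1.57, and the two shares match the 53.52% and 98.52% columns in the listings.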





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175630#comment-13175630
 ] 

Vijay commented on CASSANDRA-3623:
--

Regarding duplicates, I was thinking of creating the duplicates in CMSF and having a 
helper function to track them.

Regarding hot reads: (I tried that before; you have to access the FD, and caching 
the initialized object didn't help.) We do get something like 50% better 
latencies by doing MMap'ed without copying the data. Snappy is 1.6% more 
because there isn't anything else holding up or any other overhead. 

Currently with this patch we don't have to copy any uncompressed data, but the 
CRAR will copy because we don't handle the DirectBB to snappy, and that's made 
possible by using MMapped IO.





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Pavel Yaskevich (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175631#comment-13175631
 ] 

Pavel Yaskevich commented on CASSANDRA-3623:


bq. We do get something like 50% better latencies by doing MMap'ed without 
copying the data.

But hot methods show the opposite: the main thing that hurts performance in the 
normal read case is not memcpy but the reader class initialization overhead.

bq. Snappy is 1.6% more because there isn't anything else holding up or any 
other overhead.

I don't understand what you mean here, can you please elaborate? Slower snappy 
execution could, in my opinion, be caused by the additional expense of 
mapping data into user space while the page cache is migrating 
(the situation where the dataset does not fit in the page cache); mmap'ed I/O in that 
case makes the kernel do more work compared to syscalls (normal I/O).

bq. Currently with this patch we don't have to copy any uncompressed data, but 
the CRAR will copy because we don't handle the DirectBB to snappy, and that's 
made possible by using MMapped IO.

Did you mean compressed instead of uncompressed here?





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175638#comment-13175638
 ] 

Vijay commented on CASSANDRA-3623:
--

Pavel, it doesn't show the opposite; it actually shows that 98% of the time is spent in 
the snappy library and only 2% in the rest of the code, whereas in 
the earlier case we spent 58% of the time in snappy and the rest in other parts 
of the code. Snappy/decompression is definitely the bottleneck... all I am 
saying is that now we are more efficient and that's the only bottleneck.

bq. Did you mean compressed instead of uncompressed here?

Yes, I meant compressed.

Please try a test before and after the patch and you will see what I am talking about. 
I ran the test on the cluster for a long time (before and after; there isn't any other 
variable in play here), and after this patch it shows constant performance 
that doesn't vary a lot (response times after the patch).

 use MMapedBuffer in CompressedSegmentedFile.getSegment
 --

 Key: CASSANDRA-3623
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 1.1
Reporter: Vijay
Assignee: Vijay
  Labels: compression
 Fix For: 1.1

 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 
 0001-MMaped-Compression-segmented-file.patch, 
 0002-tests-for-MMaped-Compression-segmented-file-v2.patch


 CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem 
 to use the MMap, hence higher CPU on the nodes and higher latencies on 
 reads. 
 This ticket is to implement the TODO mentioned in CompressedRandomAccessReader
 // TODO refactor this to separate concept of buffer to avoid lots of read() 
 syscalls and compression buffer
 but I think a separate class for the Buffer will be better.
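
 As a rough illustration of the idea (not the attached patch), the sketch 
 below maps a file once and hands out segments as views of the mapping, so 
 reading a segment does not reopen the file or issue read() syscalls. The 
 class name MappedSegments and its segment() method are hypothetical, and 
 decompression of the returned bytes is left out.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; this is not Cassandra's CompressedSegmentedFile.
final class MappedSegments {
    private final MappedByteBuffer mapped;

    MappedSegments(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    // Returns a view of [offset, offset + length) backed by the mapping (no copy).
    ByteBuffer segment(int offset, int length) {
        ByteBuffer view = mapped.duplicate(); // independent position and limit
        view.position(offset);
        view.limit(offset + length);
        return view.slice();
    }
}
```

 A single mapping like this is limited to 2 GB, so a real implementation 
 would presumably need several mappings for large files.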





[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment

2011-12-23 Thread Vijay (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175639#comment-13175639
 ] 

Vijay commented on CASSANDRA-3623:
--

Constant performance = not a lot of difference between the 95th percentile and 
the average. Before the patch there was a huge swing between those. Data is 
shown above.

Please note I am not selling this patch ;) I am trying to find better 
performance for our use case, which needs compression... I am completely open 
to other options.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175646#comment-13175646
 ] 

Jonathan Ellis commented on CASSANDRA-3634:
---

Is the server on a separate machine from the client here?

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: stress-change-bind-parms-to-BB.patch, 
 v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Commented] (CASSANDRA-3603) CounterColumn and CounterContext use a log4j logger instead of using slf4j like the rest of the code base

2011-12-23 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175647#comment-13175647
 ] 

Peter Schuller commented on CASSANDRA-3603:
---

My apologies. Looks like I accidentally nuked projectCodeStyle.xml in the wc 
without realizing it.

 CounterColumn and CounterContext use a log4j logger instead of using slf4j 
 like the rest of the code base
 -

 Key: CASSANDRA-3603
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3603
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Fix For: 1.0.7

 Attachments: CASSANDRA-3603-trunk.txt


 (Will submit patch but not now, no time.)





[jira] [Updated] (CASSANDRA-3641) inconsistent/corrupt counters w/ broken shards never converge

2011-12-23 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3641:
--

Attachment: CASSANDRA-3641-trunk-nojmx.txt

New version attached. Rebased to current trunk, and no JMX. Otherwise identical.

 inconsistent/corrupt counters w/ broken shards never converge
 -

 Key: CASSANDRA-3641
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3641
 Project: Cassandra
  Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
 Attachments: 3641-0.8-internal-not-for-inclusion.txt, 3641-trunk.txt, 
 CASSANDRA-3641-trunk-nojmx.txt


 We ran into a case (which MIGHT be related to CASSANDRA-3070) whereby we had 
 counters that were corrupt (hopefully due to CASSANDRA-3178). The corruption 
 was that there would exist shards with the *same* node_id, *same* clock id, 
 but *different* counts.
 The counter column diffing and reconciliation code assumes that this never 
 happens, and ignores the count. The problem with this is that if there is an 
 inconsistency, the result of a reconciliation will depend on the order of the 
 shards.
 In our case for example, we would see the value of the counter randomly 
 fluctuating on a CL.ALL read, but we would get a consistent value (whatever 
 the node had) on CL.ONE (submitted to one of the nodes in the replica set for 
 the key).
 In addition, read repair would not work despite digest mismatches because the 
 diffing algorithm also did not care about the counts when determining the 
 differences to send.
 I'm attaching patches that fix this. The first patch is against our 0.8 
 branch, which is not terribly useful to people, but I include it because it 
 is the well-tested version that we have used on the production cluster which 
 was subject to this corruption.
 The other patch is against trunk, and contains the same change.
 What the patch does is:
 * On diffing, treat as DISJOINT if there is a count discrepancy.
 * On reconciliation, look at the count and *deterministically* pick the 
 higher one, and:
 ** log the fact that we detected a corrupt counter
 ** increment a JMX observable counter for monitoring purposes
 A cluster which is subject to such corruption and has this patch will fix 
 itself with an AES + compact (or just repeated compactions assuming the 
 replicate-on-compact is able to deliver correctly).
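
 The rule described above can be sketched roughly as follows. This is a 
 simplified illustration, not the attached patch: the Shard and 
 ShardReconciler classes are hypothetical (the real code works on packed 
 counter contexts), the "higher clock wins" step is an assumption about the 
 normal merge rule, and a plain AtomicLong stands in for the observable 
 counter.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified shard used only for this illustration.
final class Shard {
    final String nodeId;
    final long clock;
    final long count;

    Shard(String nodeId, long clock, long count) {
        this.nodeId = nodeId;
        this.clock = clock;
        this.count = count;
    }
}

final class ShardReconciler {
    enum Relationship { EQUAL, GREATER_THAN, LESS_THAN, DISJOINT }

    // Number of same-node, same-clock shard pairs whose counts disagreed.
    static final AtomicLong corruptShardsDetected = new AtomicLong();

    // Diffing: a count discrepancy at the same clock is treated as DISJOINT.
    static Relationship diff(Shard left, Shard right) {
        requireSameNode(left, right);
        if (left.clock != right.clock)
            return left.clock > right.clock ? Relationship.GREATER_THAN : Relationship.LESS_THAN;
        return left.count == right.count ? Relationship.EQUAL : Relationship.DISJOINT;
    }

    // Reconciliation: on a count discrepancy, deterministically keep the higher
    // count, so the result no longer depends on the order of the shards.
    static Shard reconcile(Shard left, Shard right) {
        requireSameNode(left, right);
        if (left.clock != right.clock)
            return left.clock > right.clock ? left : right; // assumed normal rule
        if (left.count != right.count) {
            corruptShardsDetected.incrementAndGet();
            // Stand-in for real logging.
            System.err.println("corrupt counter shard detected for node " + left.nodeId);
            return left.count > right.count ? left : right;
        }
        return left;
    }

    private static void requireSameNode(Shard left, Shard right) {
        if (!left.nodeId.equals(right.nodeId))
            throw new IllegalArgumentException("shards belong to different nodes");
    }
}
```

 With this rule, reconciling (n1, clock 5, count 10) with (n1, clock 5, 
 count 12) yields count 12 regardless of argument order.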





[jira] [Created] (CASSANDRA-3670) provide red flags JMX instrumentation

2011-12-23 Thread Peter Schuller (Created) (JIRA)
provide red flags JMX instrumentation
---

 Key: CASSANDRA-3670
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3670
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor


As discussed in CASSANDRA-3641, it would be nice to expose through JMX certain 
information which is almost without exception indicative of something being 
wrong with the node or cluster.

In the CASSANDRA-3641 case, it was the detection of corrupt counter shards. 
Other examples include:

* Number of times the selection of files to compact was adjusted due to disk 
space heuristics
* Number of times compaction has failed
* Any I/O error reading from or writing to disk (the work here is collecting, 
not exposing, so maybe not in an initial version)
* Any data skipped due to checksum mismatches (when checksumming is being 
used); e.g., number of skips.
* Any arbitrary exception at least in certain code paths (compaction, scrub, 
cleanup for starters)

Probably other things.

The motivation is that if we have clear and obvious indications that something 
truly is wrong, it seems suboptimal to just leave that information in the log 
somewhere, for someone to discover later when something else broke as a result 
and a human investigates. You might argue that one should use non-trivial log 
analysis to detect these things, but I highly doubt a lot of people do this and 
it seems very wasteful to require that in comparison to just providing the 
MBean.

It is important to note that the *lack* of a certain problem being advertised 
in this MBean is not supposed to be indicative of a *lack* of a problem. 
Rather, the point is that to the extent we can easily do so, it is nice to have 
a clear method of communicating to monitoring systems where there *is* a clear 
indication of something being wrong.

The main part of this ticket is not to cover everything under the sun, but 
rather to reach agreement on adding an MBean where these types of indicators 
can be collected. Individual counters can then be added over time as one thinks 
of them.

I propose:

* Create an org.apache.cassandra.db.RedFlags MBean
* Populate with a few things to begin with.

I'll submit the patch if there is agreement.
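
To make the proposal concrete, here is a minimal sketch of what such an MBean 
could look like. It is only an illustration under assumptions: the attribute 
names (CorruptCounterShards, FailedCompactions) and the increment methods are 
hypothetical, and registration with the platform MBeanServer is omitted 
(recent JDKs also expect the management interface to be public).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical management interface; attribute names are illustrative only.
interface RedFlagsMBean {
    long getCorruptCounterShards();
    long getFailedCompactions();
}

final class RedFlags implements RedFlagsMBean {
    static final RedFlags instance = new RedFlags();

    private final AtomicLong corruptCounterShards = new AtomicLong();
    private final AtomicLong failedCompactions = new AtomicLong();

    // Called from the code path that detects the problem.
    void incrementCorruptCounterShards() {
        corruptCounterShards.incrementAndGet();
    }

    void incrementFailedCompactions() {
        failedCompactions.incrementAndGet();
    }

    // Read side, as a monitoring system would see it.
    public long getCorruptCounterShards() {
        return corruptCounterShards.get();
    }

    public long getFailedCompactions() {
        return failedCompactions.get();
    }
}
```

Every counter starts at zero, so any non-zero attribute is a red flag; a zero 
value says nothing about the absence of problems.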






[jira] [Updated] (CASSANDRA-3670) provide red flags JMX instrumentation

2011-12-23 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3670:
--

Reviewer: slebresne





[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair

2011-12-23 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3483:
--

Attachment: CASSANDRA-3483-trunk-noredesign.txt

Attaching version rebased to trunk but not yet re-factored.

 Support bringing up a new datacenter to existing cluster without repair
 ---

 Key: CASSANDRA-3483
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3483
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.0.2
Reporter: Chris Goffinet
Assignee: Peter Schuller
 Attachments: CASSANDRA-3483-0.8-prelim.txt, CASSANDRA-3483-1.0.txt, 
 CASSANDRA-3483-trunk-noredesign.txt


 Was talking to Brandon in irc, and we ran into a case where we want to bring 
 up a new DC to an existing cluster. He relayed jbellis's suggestion that the 
 current way to do it is to set strategy options of dc2:0, then add the nodes. 
 After the nodes are up, change the RF of dc2, and run repair. 
 I'd like to avoid a repair as it runs AES and is a bit more intense than how 
 bootstrap works currently by just streaming ranges from the SSTables. Would 
 it be possible to improve on the proposed method for adding a new DC to an 
 existing cluster? We'd be happy to do a patch if we got some input on the 
 best way to go about it.





[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation

2011-12-23 Thread Brandon Williams (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175656#comment-13175656
 ] 

Brandon Williams commented on CASSANDRA-3670:
-

I almost feel bad to mention this here, but since the fixver is unset I'll do 
it :)

It seems like converting a lot of our one-off metrics to 
https://github.com/codahale/metrics would provide much more flexibility in the 
future, as well as giving us better metrics to gauge this sort of thing by.





[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation

2011-12-23 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175660#comment-13175660
 ] 

Peter Schuller commented on CASSANDRA-3670:
---

I have not used it, and only had a quick look. But provided that it does the 
job and has no significant downside, I'd be very +1 just from the mere fact 
alone that it natively supports exposing metrics through HTTP and JSON while 
still retaining JMX visibility, and from the fact that you avoid the 
ThingMBean+Thing acrobatics. The histogram support seems convenient.

The RedFlags stuff could be a good pilot case. If it causes problems, it 
doesn't break anything that people are used to working already.





[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation

2011-12-23 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175661#comment-13175661
 ] 

Peter Schuller commented on CASSANDRA-3670:
---

Also, the whole JMX bit is actually a pretty annoying little detail for many 
situations. There seems to exist no implementation outside of the JVM, and 
writing a trivial monitor along the lines of:

{code}
  warnings=$(curl http://localhost:XXX/bla/bla/redflags | egrep -v ': 0$' | wc -l)
{code}

Becomes a chore. From what I can tell everyone keeps using that magic .jar that 
no one knows where it comes from that e.g. cassandra-munin-plugins uses. It's a 
real hassle to be constantly launching a JVM just for metrics extraction.

Now granted, if you are fully JMX enabled in your infrastructure there is no 
issue, but I really think something like this goes a long way towards making 
Cassandra more operator-friendly - particularly to individuals and/or small 
organizations that want to monitor in some simple way and do not want to spend 
time on JMX issues.







[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation

2011-12-23 Thread Peter Schuller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175662#comment-13175662
 ] 

Peter Schuller commented on CASSANDRA-3670:
---

(For the record I'm not suggesting actually writing a monitor exactly like 
that; I'm not a fan of ad-hoc shell scripting for such things due to the 
potential for silent failures. But choose any arbitrary productive language and 
a HTTP+JSON interface is trivial to use in a clean way.)
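
As a sketch of what a non-silent version of that check could look like: the 
snippet below takes the response body and returns the names of all non-zero 
flags, throwing on anything it cannot parse instead of quietly ignoring it. 
The "name: value" line format is an assumption inferred from the egrep filter 
in the earlier comment (plain text rather than JSON, to stay within the 
standard library), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class RedFlagCheck {
    // Assumed payload: one "name: value" pair per line.
    static List<String> raised(String body) {
        List<String> raised = new ArrayList<>();
        for (String line : body.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty())
                continue;
            int separator = trimmed.lastIndexOf(": ");
            if (separator < 0)
                throw new IllegalArgumentException("unparseable line: " + trimmed);
            // Long.parseLong throws NumberFormatException on a non-numeric value,
            // so malformed input fails loudly rather than counting as "no flags".
            long value = Long.parseLong(trimmed.substring(separator + 2));
            if (value != 0)
                raised.add(trimmed.substring(0, separator));
        }
        return raised;
    }
}
```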





[jira] [Created] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes

2011-12-23 Thread Peter Schuller (Created) (JIRA)
provide JMX counters for unavailables/timeouts for reads and writes
---

 Key: CASSANDRA-3671
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor


Attaching patch against trunk.





[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes

2011-12-23 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3671:
--

Attachment: CASSANDRA-3671-trunk.txt

 provide JMX counters for unavailables/timeouts for reads and writes
 ---

 Key: CASSANDRA-3671
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Attachments: CASSANDRA-3671-trunk.txt


 Attaching patch against trunk.





[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes

2011-12-23 Thread Peter Schuller (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Schuller updated CASSANDRA-3671:
--

Attachment: CASSANDRA-3671-trunk-v2.txt

Accidentally attached old version of patch. v2 attached which doesn't fail to 
re-throw in one case.

 provide JMX counters for unavailables/timeouts for reads and writes
 ---

 Key: CASSANDRA-3671
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Attachments: CASSANDRA-3671-trunk-v2.txt, CASSANDRA-3671-trunk.txt


 Attaching patch against trunk.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Eric Evans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175674#comment-13175674
 ] 

Eric Evans commented on CASSANDRA-3634:
---

No, it's not

 compare string vs. binary prepared statement parameters
 ---

 Key: CASSANDRA-3634
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
  Labels: cql
 Fix For: 1.1

 Attachments: stress-change-bind-parms-to-BB.patch, 
 v1-0001-CASSANDRA-3634-generated-thrift-code.txt, 
 v1-0002-change-bind-parms-from-string-to-bytes.txt


 Perform benchmarks to compare the performance of string and pre-serialized 
 binary parameters to prepared statements.





[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters

2011-12-23 Thread Jonathan Ellis (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175690#comment-13175690
 ] 

Jonathan Ellis commented on CASSANDRA-3634:
---

Let's get Brandon to do some testing on our cluster with separate clients and 
servers.  If strings are testing faster than binary then either

# something is wrong with the code, because parsing String -> ByteBuffer can't 
possibly be faster than just using the ByteBuffer from Thrift (not to mention 
that Thrift's internal creation of the String object has more overhead than 
marking a ByteBuffer slice of the frame)
# the difference is negligible compared to other factors and the test noise
# the difference is hidden by environmental factors, e.g., String runs just as 
fast as BB but with X% more CPU used

Splitting out clients/servers will help determine if #3 is playing a role here.
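
The argument in the first point can be illustrated with a small sketch of the 
two server-side paths. This is not the Cassandra or Thrift code: the class 
BindParamSketch, both method names, and the choice of a long-typed parameter 
are assumptions made only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the actual Cassandra/Thrift code path.
public class BindParamSketch {

    /**
     * String path: copy the bytes out of the frame, decode them into a String,
     * parse the text, then serialize the value into a new ByteBuffer.
     */
    public static ByteBuffer fromStringParam(ByteBuffer frame, int offset, int length) {
        byte[] raw = new byte[length];
        ByteBuffer view = frame.duplicate();
        view.position(offset);
        view.get(raw);                                          // copy
        String text = new String(raw, StandardCharsets.UTF_8);  // decode + allocate
        long value = Long.parseLong(text);                      // parse
        ByteBuffer out = ByteBuffer.allocate(8);                // allocate again
        out.putLong(value);
        out.flip();
        return out;
    }

    /**
     * Binary path: the client already sent the serialized form, so the server
     * only marks a slice of the frame. No copy, decode, or parse is needed.
     */
    public static ByteBuffer fromBinaryParam(ByteBuffer frame, int offset, int length) {
        ByteBuffer view = frame.duplicate();
        view.position(offset);
        view.limit(offset + length);
        return view.slice();
    }
}
```

Both methods end with the same 8-byte value, but the String path does strictly 
more work to get there, which is why a benchmark showing strings ahead would 
point to a bug, to noise, or to a cost hidden elsewhere.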

