[jira] [Commented] (CASSANDRA-3624) Hinted Handoff - related OOM
[ https://issues.apache.org/jira/browse/CASSANDRA-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175332#comment-13175332 ]

Radim Kolar commented on CASSANDRA-3624:

I have this problem too, but I do not have large rows; I have a huge number of small rows (max 180 bytes serialized).

Hinted Handoff - related OOM
Key: CASSANDRA-3624
URL: https://issues.apache.org/jira/browse/CASSANDRA-3624
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Marcus Eriksson
Assignee: Jonathan Ellis
Labels: hintedhandoff
Fix For: 1.0.7
Attachments: 3624.txt

One of our nodes had collected a lot of hints for another node, so when the dead node came back and the row mutations were read back from disk, the node died with an OOM exception (and kept dying after restart, even with the heap increased from 8G to 12G). The heap dump contained a lot of SuperColumns, and our application does not use those (but HH does).

I'm guessing that each mutation is big, so that PAGE_SIZE*mutation_size does not fit in memory (will check this tomorrow).

A simple fix (if my assumption above is correct) would be to reduce the PAGE_SIZE in HintedHandOffManager.java to something like 10 (or even 1?) to reduce the memory pressure. The performance hit would be small, since we are doing the hinted handoff throttle delay sleep before sending every *mutation* anyway (not every page). Thoughts?

If anyone runs into the same problem, I got the node started again by simply removing the HintsColumnFamily* files.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
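The arithmetic behind the PAGE_SIZE suggestion can be sketched as follows. This is only an illustration, not code from HintedHandOffManager: the class and method names are hypothetical, and the page size and mutation size used in the note below are assumed values, since the ticket does not state them.

```java
// Hypothetical helper, not part of Cassandra: estimates how many bytes of
// serialized hints are held in memory when one page is read back from disk.
final class HintPageEstimate
{
    private HintPageEstimate() {}

    // pageSize: number of hint mutations fetched per page (assumed value)
    // mutationBytes: assumed average serialized size of one mutation
    static long bytesPerPage(int pageSize, long mutationBytes)
    {
        if (pageSize < 0 || mutationBytes < 0)
            throw new IllegalArgumentException("sizes must be non-negative");
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact((long) pageSize, mutationBytes);
    }

    // Largest page size whose serialized payload stays within the given budget.
    static int maxPageSize(long budgetBytes, long mutationBytes)
    {
        if (budgetBytes < 0 || mutationBytes <= 0)
            throw new IllegalArgumentException("budget must be >= 0 and mutation size > 0");
        return (int) Math.min(Integer.MAX_VALUE, budgetBytes / mutationBytes);
    }
}
```

With an assumed page size of 1024 and 10 MiB mutations, bytesPerPage(1024, 10L << 20) comes to 10 GiB of serialized data alone, before any object overhead; with a page size of 10 it is 100 MiB. Whether this explains the OOM depends on the real mutation sizes, which the reporter had not yet checked.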
[jira] [Updated] (CASSANDRA-2749) fine-grained control over data directories
[ https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2749: Attachment: (was: 0002-fix-unit-tests.patch) fine-grained control over data directories -- Key: CASSANDRA-2749 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Priority: Minor Fix For: 1.1 Attachments: 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 2749.tar.gz, 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 2749_proper.tar.gz Currently Cassandra supports multiple data directories but no way to control what sstables are placed where. Particularly for systems with mixed SSDs and rotational disks, it would be nice to pin frequently accessed columnfamilies to the SSDs. Postgresql does this with tablespaces (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we should probably avoid using that name because of confusing similarity to keyspaces. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2749) fine-grained control over data directories
[ https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2749:
Attachment: 0003-Fixes.patch 0002-fix-unit-tests.patch 0001-2749.patch

Attaching rebased patches, with a 3rd patch (0003-Fixes.patch) addressing Pavel's remarks. More specifically:

bq. o.a.c.db.Directories comment should be updated because it still uses SSTable file name without keyspace.
Fixed, thanks.

bq. o.a.c.io.sstable.SSTableReaderTest won't compile
Sorry, I forgot to check the tests after the last rebase; fixed too. (This involved renaming a number of sstables in test/data/legacy-sstables/hb to include the keyspace name, so that specific change is in the 2nd 'fix unit tests' patch to avoid polluting the 3rd one.)

bq. if you start with empty data directory you get following exception and process exits
Fixed. I've actually made two modifications: the migration now checks that the directory exists, to avoid the NPE during listFiles(), and I've also modified the 'should we migrate' check to detect new nodes (by checking whether the system keyspace directory exists) and thus not print the migration message at all.

bq. on snapshot doesn't create or move (from older schema) index SSTables related to CF
I'm not sure I see what this one is. Are we talking about the migration process? In any case, you made me think about secondary indexes. Maybe it is more natural to have secondary index sstables in the same directory as the base cfs? Since the index name is not really something exposed (granted, you don't have to be a genius to figure it out), it feels like it would slightly simplify administration not to put them in a separate directory. I've updated the patch to implement this last idea (so indexes are in the same directory as their base cf), but it would be nice to have multiple opinions on that move, since we don't want to have to do a new migration in 6 months because we've changed our mind.

bq. shouldn't old snapshots directory be removed after move?
You're right, fixed (for backups too).

fine-grained control over data directories
Key: CASSANDRA-2749
URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 1.1
Attachments: 0001-2749.patch, 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 2749_proper.tar.gz

Currently Cassandra supports multiple data directories but no way to control what sstables are placed where. Particularly for systems with mixed SSDs and rotational disks, it would be nice to pin frequently accessed columnfamilies to the SSDs. Postgresql does this with tablespaces (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we should probably avoid using that name because of confusing similarity to keyspaces.
[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories
[ https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175390#comment-13175390 ]

Pavel Yaskevich commented on CASSANDRA-2749:

bq. I'm not sure I see what this one is. Are we talking of the migration process?

I was testing it like this:
# ran 1.1 *without* modifications
# ./tools/stress/bin/stress -n 5 -S 512 -x KEYS
# ./bin/nodetool -h localhost flush Keyspace1 Standard1
# ./bin/nodetool -h localhost snapshot Keyspace1
# made sure that the Standard1.Idx-* SSTables were in the snapshots/timestamp directory
# ran 1.1 *with* your patch applied
# checked whether the snapshots directory was moved and which files it included: it was lacking the Standard1.Idx-* files
# cleaned the data directory
# repeated steps 1 - 5, but this time *with* your patch applied, and the snapshot didn't include Standard1.Idx-*

bq. Maybe it is more natural to have secondary indexes sstables be in the same directory than the base cfs?

+1

fine-grained control over data directories
Key: CASSANDRA-2749
URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 1.1
Attachments: 0001-2749.patch, 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 2749_proper.tar.gz

Currently Cassandra supports multiple data directories but no way to control what sstables are placed where. Particularly for systems with mixed SSDs and rotational disks, it would be nice to pin frequently accessed columnfamilies to the SSDs. Postgresql does this with tablespaces (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we should probably avoid using that name because of confusing similarity to keyspaces.
[jira] [Created] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered puts the node down
Changing compaction strategy from Leveled to SizeTiered puts the node down
Key: CASSANDRA-3666
URL: https://issues.apache.org/jira/browse/CASSANDRA-3666
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.6
Environment: Windows Server 2008 R2 64bit
Reporter: Viktor Jevdokimov

When a column family's compaction strategy is changed from Leveled to SizeTiered while Leveled compaction tasks are still pending, Cassandra starts flooding the logs with thousands of messages per second:

Nothing to compact in ColumnFamily1. Use forceUserDefinedCompaction if you wish to force compaction of single sstables (e.g. for tombstone collection)

As a result, the log disk fills up and the system goes down.
[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)
[ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175428#comment-13175428 ]

Sylvain Lebresne commented on CASSANDRA-3143:

Alright, patch lgtm, +1. Great work Pavel.

Just a few minor details that would be nice to do before committing:
* As mentioned in the previous comments, currently when a row needs to be read to be put in the cache, CFS.cacheRow() decorates the key, which can be avoided just by making cacheRow take the DK and create the RowCacheKey internally.
* We should rename setRowCacheCapacity to setRowCacheCapacityMB to match the others.
* It would be nice to move the cache stats from nodetool cfstats to nodetool info, rather than purely removing them.
* The saveCaches method still does not respect the cacheKeysToSave option.

And of course there is the question of disabling row caching on a per-cf basis which, as said previously, I think is a must-have before we release this (because any user that has at least one CF with wide rows, or one that just happens to be a bad candidate for caching, will need it). So it's ok to do that post-commit, but let's put it at the top of the todo list then.

Global caches (key/row)
Key: CASSANDRA-3143
URL: https://issues.apache.org/jira/browse/CASSANDRA-3143
Project: Cassandra
Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich
Priority: Minor
Labels: Core
Fix For: 1.1
Attachments: 0002-fixes.patch, CASSANDRA-3143-squashed.patch

Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006.
[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories
[ https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175447#comment-13175447 ]

Sylvain Lebresne commented on CASSANDRA-2749:

Weird. I just tried the same scenario and everything worked correctly. I should mention that when moving the snapshots/backups, the migration process renames them to the new filename convention, so they will be called Keyspace1-Standard1.Idx-*. Or maybe I fixed it with the last version of the patch without realizing it.

fine-grained control over data directories
Key: CASSANDRA-2749
URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 1.1
Attachments: 0001-2749.patch, 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 2749_proper.tar.gz

Currently Cassandra supports multiple data directories but no way to control what sstables are placed where. Particularly for systems with mixed SSDs and rotational disks, it would be nice to pin frequently accessed columnfamilies to the SSDs. Postgresql does this with tablespaces (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we should probably avoid using that name because of confusing similarity to keyspaces.
[jira] [Updated] (CASSANDRA-3143) Global caches (key/row)
[ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-3143: --- Attachment: (was: 0002-fixes.patch) Global caches (key/row) --- Key: CASSANDRA-3143 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Labels: Core Fix For: 1.1 Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3143) Global caches (key/row)
[ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-3143: --- Attachment: (was: CASSANDRA-3143-squashed.patch) Global caches (key/row) --- Key: CASSANDRA-3143 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Labels: Core Fix For: 1.1 Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)
[ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175450#comment-13175450 ] Sylvain Lebresne commented on CASSANDRA-3143: - Last version lgtm, +1 (nit: I don't think the getCacheCapacityInBytes methods are too necessary when we already have it in MB). Global caches (key/row) --- Key: CASSANDRA-3143 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Labels: Core Fix For: 1.1 Attachments: 0001-CASSANDRA-3143-squashed.patch, 0002-fixes.patch, 0003-final-fixes.patch Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.
We need a way to deactivate row/key caching on a per-cf basis. -- Key: CASSANDRA-3667 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Initial idea would be to either have a boolean flag if we only want to allow disabling row cache, or some multi-value caches option that could be none, key_only, row_only or all. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
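The multi-value option described above could be modelled roughly as follows. This is a minimal sketch, assuming a plain Java enum; the type name CachingOption, its fields, and fromString are hypothetical and are not taken from any attached patch. Only the four values none, key_only, row_only and all come from the ticket.

```java
import java.util.Locale;

// Hypothetical per-column-family caching option; the four constants mirror
// the values proposed in the ticket, everything else is illustrative.
enum CachingOption
{
    NONE(false, false),
    KEY_ONLY(true, false),
    ROW_ONLY(false, true),
    ALL(true, true);

    final boolean cacheKeys; // whether the key cache applies to this CF
    final boolean cacheRows; // whether the row cache applies to this CF

    CachingOption(boolean cacheKeys, boolean cacheRows)
    {
        this.cacheKeys = cacheKeys;
        this.cacheRows = cacheRows;
    }

    // Parses a user-supplied value such as "key_only" (case-insensitive).
    // Unknown values raise IllegalArgumentException via valueOf.
    static CachingOption fromString(String value)
    {
        return valueOf(value.trim().toUpperCase(Locale.ROOT));
    }
}
```

A boolean flag would only cover the ROW_ONLY/NONE distinction for the row cache, whereas the enum makes all four combinations explicit at little extra cost.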
[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)
[ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175453#comment-13175453 ] Pavel Yaskevich commented on CASSANDRA-3143: Thanks, Sylvain! I have created CASSANDRA-3667, will get to it as soon as I commit this one. Global caches (key/row) --- Key: CASSANDRA-3143 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Labels: Core Fix For: 1.1 Attachments: 0001-CASSANDRA-3143-squashed.patch, 0002-fixes.patch, 0003-final-fixes.patch Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.
[ https://issues.apache.org/jira/browse/CASSANDRA-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175457#comment-13175457 ] Sylvain Lebresne commented on CASSANDRA-3667: - I don't care a lot but I would personally slightly prefer the multi-values setting as it's probably not very much harder to implement. We need a way to deactivate row/key caching on a per-cf basis. -- Key: CASSANDRA-3667 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Initial idea would be to either have a boolean flag if we only want to allow disabling row cache, or some multi-value caches option that could be none, key_only, row_only or all. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.
[ https://issues.apache.org/jira/browse/CASSANDRA-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175458#comment-13175458 ] Pavel Yaskevich commented on CASSANDRA-3667: I agree. We need a way to deactivate row/key caching on a per-cf basis. -- Key: CASSANDRA-3667 URL: https://issues.apache.org/jira/browse/CASSANDRA-3667 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Initial idea would be to either have a boolean flag if we only want to allow disabling row cache, or some multi-value caches option that could be none, key_only, row_only or all. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories
[ https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175460#comment-13175460 ]

Pavel Yaskevich commented on CASSANDRA-2749:

That may be the case :) I will re-test as part of the review anyway.

fine-grained control over data directories
Key: CASSANDRA-2749
URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
Project: Cassandra
Issue Type: New Feature
Components: Core
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 1.1
Attachments: 0001-2749.patch, 0001-Make-it-possible-to-put-column-families-in-subdirect.patch, 0001-non-backwards-compatible-patch-for-2749-putting-cfs-.patch.gz, 0002-fix-unit-tests.patch, 0003-Fixes.patch, 2749.tar.gz, 2749_backwards_compatible_v1.patch, 2749_backwards_compatible_v2.patch, 2749_backwards_compatible_v3.patch, 2749_backwards_compatible_v4.patch, 2749_backwards_compatible_v4_rebase1.patch, 2749_not_backwards.tar.gz, 2749_proper.tar.gz

Currently Cassandra supports multiple data directories but no way to control what sstables are placed where. Particularly for systems with mixed SSDs and rotational disks, it would be nice to pin frequently accessed columnfamilies to the SSDs. Postgresql does this with tablespaces (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we should probably avoid using that name because of confusing similarity to keyspaces.
[jira] [Commented] (CASSANDRA-3497) BloomFilter FP ratio should be configurable or size-restricted some other way
[ https://issues.apache.org/jira/browse/CASSANDRA-3497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175462#comment-13175462 ] Yuki Morishita commented on CASSANDRA-3497: --- +1 BloomFilter FP ratio should be configurable or size-restricted some other way - Key: CASSANDRA-3497 URL: https://issues.apache.org/jira/browse/CASSANDRA-3497 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Brandon Williams Assignee: Yuki Morishita Priority: Minor Fix For: 1.0.7 Attachments: 3497-v3.txt, 3497-v4.txt, CASSANDRA-1.0-3497.txt When you have a live dc and purely analytical dc, in many situations you can have less nodes on the analytical side, but end up getting restricted by having the BloomFilters in-memory, even though you have absolutely no use for them. It would be nice if you could reduce this memory requirement by tuning the desired FP ratio, or even just disabling them altogether. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
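To see why tuning the FP ratio reduces memory, the textbook sizing formula for an optimally configured Bloom filter, bits per element = -ln(p) / (ln 2)^2, can be sketched as below. This is the standard formula only; Cassandra's own sizing code may round or bucket differently, and the class and method names here are hypothetical.

```java
// Hypothetical helper illustrating the textbook Bloom filter sizing formula.
// It is not Cassandra's implementation.
final class BloomFilterSizing
{
    private BloomFilterSizing() {}

    // Bits needed per element for a target false-positive ratio p, 0 < p < 1,
    // assuming the optimal number of hash functions is used.
    static double bitsPerElement(double fpRatio)
    {
        if (!(fpRatio > 0.0 && fpRatio < 1.0))
            throw new IllegalArgumentException("fpRatio must be in (0, 1)");
        double ln2 = Math.log(2.0);
        return -Math.log(fpRatio) / (ln2 * ln2);
    }

    // Approximate filter size in bytes for the given number of elements.
    static long totalBytes(long elements, double fpRatio)
    {
        if (elements < 0)
            throw new IllegalArgumentException("elements must be non-negative");
        return (long) Math.ceil(elements * bitsPerElement(fpRatio) / 8.0);
    }
}
```

Under this formula a 1% FP ratio needs about 9.6 bits per key, so one billion keys take roughly 1.2 GB; relaxing the target to 10% halves that to about 4.8 bits per key, roughly 0.6 GB.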
[jira] [Commented] (CASSANDRA-3143) Global caches (key/row)
[ https://issues.apache.org/jira/browse/CASSANDRA-3143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175470#comment-13175470 ] Jonathan Ellis commented on CASSANDRA-3143: --- +1 Global caches (key/row) --- Key: CASSANDRA-3143 URL: https://issues.apache.org/jira/browse/CASSANDRA-3143 Project: Cassandra Issue Type: Improvement Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Priority: Minor Labels: Core Fix For: 1.1 Attachments: 0001-CASSANDRA-3143-squashed.patch, 0002-fixes.patch, 0003-final-fixes.patch Caches are difficult to configure well as ColumnFamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1222715 [3/3] - in /cassandra/trunk: ./ conf/ doc/cql/ interface/ src/avro/ src/java/org/apache/cassandra/cache/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassandra/config/ s
Modified: cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java?rev=1222715&r1=1222714&r2=1222715&view=diff
==============================================================================
--- cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java (original)
+++ cassandra/trunk/test/unit/org/apache/cassandra/db/KeyCacheTest.java Fri Dec 23 16:09:05 2011
@@ -20,12 +20,17 @@ package org.apache.cassandra.db;
  *
  */
 
-
 import java.io.IOException;
 import java.util.HashMap;
 import java.util.Map;
 import java.util.concurrent.ExecutionException;
 
+import org.apache.cassandra.cache.KeyCacheKey;
+import org.apache.cassandra.db.filter.QueryFilter;
+import org.apache.cassandra.service.CacheService;
+import org.apache.cassandra.thrift.ColumnParent;
+
+import org.junit.AfterClass;
 import org.junit.Test;
 
 import org.apache.cassandra.CleanupHelper;
@@ -43,18 +48,11 @@ public class KeyCacheTest extends Cleanu
     private static final String TABLE1 = "KeyCacheSpace";
     private static final String COLUMN_FAMILY1 = "Standard1";
     private static final String COLUMN_FAMILY2 = "Standard2";
-    private static final String COLUMN_FAMILY3 = "Standard3";
-
-    @Test
-    public void testKeyCache50() throws IOException, ExecutionException, InterruptedException
-    {
-        testKeyCache(COLUMN_FAMILY1, 64);
-    }
 
-    @Test
-    public void testKeyCache100() throws IOException, ExecutionException, InterruptedException
+    @AfterClass
+    public static void cleanup()
     {
-        testKeyCache(COLUMN_FAMILY2, 128);
+        cleanupSavedCaches();
     }
 
     @Test
@@ -62,57 +60,48 @@ public class KeyCacheTest extends Cleanu
     {
         CompactionManager.instance.disableAutoCompaction();
 
-        ColumnFamilyStore store = Table.open(TABLE1).getColumnFamilyStore(COLUMN_FAMILY3);
+        ColumnFamilyStore store = Table.open(TABLE1).getColumnFamilyStore(COLUMN_FAMILY2);
 
         // empty the cache
-        store.invalidateKeyCache();
-        assert store.getKeyCacheSize() == 0;
+        CacheService.instance.invalidateKeyCache();
+        assert CacheService.instance.keyCache.size() == 0;
 
         // insert data and force to disk
-        insertData(TABLE1, COLUMN_FAMILY3, 0, 100);
+        insertData(TABLE1, COLUMN_FAMILY2, 0, 100);
         store.forceBlockingFlush();
 
         // populate the cache
-        readData(TABLE1, COLUMN_FAMILY3, 0, 100);
-        assertEquals(100, store.getKeyCacheSize());
+        readData(TABLE1, COLUMN_FAMILY2, 0, 100);
+        assertEquals(100, CacheService.instance.keyCache.size());
 
         // really? our caches don't implement the map interface? (hence no .addAll)
-        Map<Pair<Descriptor, DecoratedKey>, Long> savedMap = new HashMap<Pair<Descriptor, DecoratedKey>, Long>();
-        for (Pair<Descriptor, DecoratedKey> k : store.getKeyCache().getKeySet())
+        Map<KeyCacheKey, Long> savedMap = new HashMap<KeyCacheKey, Long>();
+        for (KeyCacheKey k : CacheService.instance.keyCache.getKeySet())
         {
-            savedMap.put(k, store.getKeyCache().get(k));
+            savedMap.put(k, CacheService.instance.keyCache.get(k));
         }
 
         // force the cache to disk
-        store.keyCache.submitWrite(Integer.MAX_VALUE).get();
-
-        // empty the cache again to make sure values came from disk
-        store.invalidateKeyCache();
-        assert store.getKeyCacheSize() == 0;
-
-        // load the cache from disk. unregister the old mbean so we can recreate a new CFS object.
-        // but don't invalidate() the old CFS, which would nuke the data we want to try to load
-        store.unregisterMBean();
-        ColumnFamilyStore newStore = ColumnFamilyStore.createColumnFamilyStore(Table.open(TABLE1), COLUMN_FAMILY3);
-        assertEquals(100, newStore.getKeyCacheSize());
+        CacheService.instance.keyCache.submitWrite(Integer.MAX_VALUE).get();
 
-        assertEquals(100, savedMap.size());
-        for (Map.Entry<Pair<Descriptor, DecoratedKey>, Long> entry : savedMap.entrySet())
-        {
-            assert newStore.getKeyCache().get(entry.getKey()).equals(entry.getValue());
-        }
+        CacheService.instance.invalidateKeyCache();
+        assert CacheService.instance.keyCache.size() == 0;
     }
 
-    public void testKeyCache(String cfName, int expectedCacheSize) throws IOException, ExecutionException, InterruptedException
+    @Test
+    public void testKeyCache() throws IOException, ExecutionException, InterruptedException
     {
         CompactionManager.instance.disableAutoCompaction();
 
         Table table = Table.open(TABLE1);
-        ColumnFamilyStore cfs = table.getColumnFamilyStore(cfName);
+        ColumnFamilyStore cfs = table.getColumnFamilyStore(COLUMN_FAMILY1);
+
+        // just to make sure that everything is clean
+        CacheService.instance.invalidateKeyCache();
 
-
[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files
[ https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175481#comment-13175481 ]

Sylvain Lebresne commented on CASSANDRA-2988:
---------------------------------------------

+1 (on 2988-2-v2) with 2 nits:
* It's probably worth caching the value of {{sstableMetadata.estimatedRowSize.count()}} to avoid the double computation most of the time.
* I think
{noformat}
long current = buckets.get(i);
if (current > 0)
    sum += current;
{noformat}
can be condensed to {{sum += buckets.get\(i);}} (given current can't be negative).

Improve SSTableReader.load() when loading index files
-----------------------------------------------------

Key: CASSANDRA-2988
URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Melvin Wang
Assignee: Melvin Wang
Priority: Minor
Fix For: 1.0.7
Attachments: 2988-2-cleaned.txt, 2988-2-v2.txt, 2988-parallel-v2.txt, c2988-2-v2, c2988-modified-buffer.patch, c2988-parallel-load-sstables.patch

* when we create BufferredRandomAccessFile, we pass skipCache=true. This hurts the read performance because we always process the index files sequentially. Simple fix would be set it to false.
* multiple index files of a single column family can be loaded in parallel. This buys a lot when you have multiple super large index files.
* we may also change how we buffer. By using BufferredRandomAccessFile, for every read, we need bunch of checking like
  - do we need to rebuffer?
  - isEOF()?
  - assertions
  These can be simplified to some extent. We can blindly buffer the index file by chunks and process the buffer until a key lies across boundary of a chunk. Then we rebuffer and start from the beginning of the partially read key. Conceptually, this is same as what BRAF does but w/o the overhead in the read**() methods in BRAF.
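The chunk-buffering idea in the CASSANDRA-2988 description (process a chunk until a key straddles its boundary, then rebuffer starting from that partially read key) can be sketched roughly as follows. This is only an illustration, not the attached patch: the record layout (a 2-byte length followed by the key bytes), the class name `ChunkedKeyReaderSketch` and the method `readKeys` are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: reads [2-byte length][key bytes] records chunk by chunk.
final class ChunkedKeyReaderSketch {
    static List<String> readKeys(InputStream in, int chunkSize) throws IOException {
        List<String> keys = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        buf.limit(0);                 // start empty, in "read" mode
        boolean eof = false;

        while (true) {
            // Consume every complete record currently held in the chunk.
            while (buf.remaining() >= 2) {
                buf.mark();
                int len = buf.getShort() & 0xFFFF;
                if (buf.remaining() < len) {
                    buf.reset();      // key straddles the chunk boundary: back up to its start
                    break;
                }
                byte[] key = new byte[len];
                buf.get(key);
                keys.add(new String(key, StandardCharsets.UTF_8));
            }
            if (eof)
                break;

            // Rebuffer: move the partially read record to the front, then refill the rest.
            buf.compact();
            if (!buf.hasRemaining())
                throw new IOException("record larger than chunk size " + chunkSize);
            int n = in.read(buf.array(), buf.position(), buf.remaining());
            if (n < 0)
                eof = true;
            else
                buf.position(buf.position() + n);
            buf.flip();
        }
        if (buf.hasRemaining())
            throw new IOException("truncated record at end of stream");
        return keys;
    }
}
```

The inner loop does no per-read EOF or rebuffer checks beyond a single bounds comparison, which is the overhead the description hopes to avoid; whether that is measurably faster than BRAF would still need benchmarking.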
svn commit: r1222728 - in /cassandra/trunk: CHANGES.txt src/java/org/apache/cassandra/db/Memtable.java src/java/org/apache/cassandra/db/RowIteratorFactory.java
Author: slebresne
Date: Fri Dec 23 16:25:19 2011
New Revision: 1222728

URL: http://svn.apache.org/viewvc?rev=1222728&view=rev
Log:
Optimize memtable iteration during range scan
patch by slebresne; reviewed by jbellis for CASSANDRA-3638

Modified:
    cassandra/trunk/CHANGES.txt
    cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java
    cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java

Modified: cassandra/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1222728&r1=1222727&r2=1222728&view=diff
==============================================================================
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Fri Dec 23 16:25:19 2011
@@ -29,6 +29,7 @@
  * fsync the directory after new sstable or commitlog segment are created (CASSANDRA-3250)
  * fix minor issues reported by FindBugs (CASSANDRA-3658)
  * global key/row caches (CASSANDRA-3143)
+ * optimize memtable iteration during range scan (CASSANDRA-3638)

 1.0.7
  * add nodetool setstreamthroughput (CASSANDRA-3571)

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java?rev=1222728&r1=1222727&r2=1222728&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java Fri Dec 23 16:25:19 2011
@@ -309,11 +309,13 @@ public class Memtable
  * @param startWith Include data in the result from and including this key and to the end of the memtable
  * @return An iterator of entries with the data from the start key
  */
-public Iterator<Map.Entry<DecoratedKey, ColumnFamily>> getEntryIterator(final RowPosition startWith)
+public Iterator<Map.Entry<DecoratedKey, ColumnFamily>> getEntryIterator(final RowPosition startWith, final RowPosition stopAt)
 {
 return new Iterator<Map.Entry<DecoratedKey, ColumnFamily>>()
 {
-private Iterator<Map.Entry<RowPosition, ColumnFamily>> iter = columnFamilies.tailMap(startWith).entrySet().iterator();
+private Iterator<Map.Entry<RowPosition, ColumnFamily>> iter = stopAt.isMinimum()
+? columnFamilies.tailMap(startWith).entrySet().iterator()
+: columnFamilies.subMap(startWith, true, stopAt, true).entrySet().iterator();

 public boolean hasNext()
 {

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java?rev=1222728&r1=1222727&r2=1222728&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/RowIteratorFactory.java Fri Dec 23 16:25:19 2011
@@ -65,21 +65,11 @@ public class RowIteratorFactory
 {
 // fetch data from current memtable, historical memtables, and SSTables in the correct order.
 final List<CloseableIterator<IColumnIterator>> iterators = new ArrayList<CloseableIterator<IColumnIterator>>();
-// we iterate through memtables with a priority queue to avoid more sorting than necessary.
-// this predicate throws out the rows before the start of our range.
-Predicate<IColumnIterator> p = new Predicate<IColumnIterator>()
-{
-public boolean apply(IColumnIterator row)
-{
-return startWith.compareTo(row.getKey()) <= 0
-&& (stopAt.isMinimum() || row.getKey().compareTo(stopAt) <= 0);
-}
-};

 // memtables
 for (Memtable memtable : memtables)
 {
-iterators.add(new ConvertToColumnIterator(filter, p, memtable.getEntryIterator(startWith)));
+iterators.add(new ConvertToColumnIterator(filter, memtable.getEntryIterator(startWith, stopAt)));
 }

 for (SSTableReader sstable : sstables)
@@ -139,24 +129,20 @@ public class RowIteratorFactory
 private static class ConvertToColumnIterator extends AbstractIterator<IColumnIterator> implements CloseableIterator<IColumnIterator>
 {
 private final QueryFilter filter;
-private final Predicate<IColumnIterator> pred;
 private final Iterator<Map.Entry<DecoratedKey, ColumnFamily>> iter;

-public ConvertToColumnIterator(QueryFilter filter, Predicate<IColumnIterator> pred, Iterator<Map.Entry<DecoratedKey, ColumnFamily>> iter)
+public ConvertToColumnIterator(QueryFilter filter, Iterator<Map.Entry<DecoratedKey, ColumnFamily>> iter)
 {
 this.filter = filter;
-
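The change in r1222728 replaces a per-row predicate with a bounded view of the sorted map, so rows past the end of the range are never visited. A minimal standalone sketch of that pattern follows; it assumes integer keys and uses a hypothetical `NO_UPPER_BOUND` sentinel in place of `stopAt.isMinimum()`, and the class and method names are invented for the illustration rather than taken from Cassandra.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch of "bounded view instead of filtering predicate".
final class RangeScanSketch {
    // Stand-in for stopAt.isMinimum(): means "scan to the end of the map".
    static final int NO_UPPER_BOUND = Integer.MIN_VALUE;

    static Iterator<Map.Entry<Integer, String>> entryIterator(
            ConcurrentSkipListMap<Integer, String> rows, int startWith, int stopAt) {
        return stopAt == NO_UPPER_BOUND
             ? rows.tailMap(startWith).entrySet().iterator()                      // open-ended scan
             : rows.subMap(startWith, true, stopAt, true).entrySet().iterator();  // both ends inclusive
    }

    // Convenience helper for the example: collects the keys the iterator yields.
    static List<Integer> keysInRange(ConcurrentSkipListMap<Integer, String> rows, int startWith, int stopAt) {
        List<Integer> keys = new ArrayList<>();
        Iterator<Map.Entry<Integer, String>> it = entryIterator(rows, startWith, stopAt);
        while (it.hasNext())
            keys.add(it.next().getKey());
        return keys;
    }
}
```

One design point worth noting: `subMap` throws `IllegalArgumentException` when the start key sorts after the stop key, so a caller that can receive such ranges has to guard against them, whereas the old predicate would simply have matched nothing.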
svn commit: r1222738 - in /cassandra/branches/cassandra-1.0: ./ interface/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/avro/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassa
Author: jbellis
Date: Fri Dec 23 16:39:01 2011
New Revision: 1222738

URL: http://svn.apache.org/viewvc?rev=1222738&view=rev
Log:
allow configuring bloom_filter_fp_chance
patch by yukim and jbellis for CASSANDRA-3497

Modified:
    cassandra/branches/cassandra-1.0/CHANGES.txt
    cassandra/branches/cassandra-1.0/interface/cassandra.thrift
    cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
    cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java
    cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Constants.java
    cassandra/branches/cassandra-1.0/src/avro/internode.genavro
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/cli/CliClient.java
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/config/CFMetaData.java
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
    cassandra/branches/cassandra-1.0/src/resources/org/apache/cassandra/cli/CliHelp.yaml

Modified: cassandra/branches/cassandra-1.0/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/CHANGES.txt?rev=1222738&r1=1222737&r2=1222738&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/CHANGES.txt (original)
+++ cassandra/branches/cassandra-1.0/CHANGES.txt Fri Dec 23 16:39:01 2011
@@ -1,4 +1,5 @@
 1.0.7
+ * allow configuring bloom_filter_fp_chance (CASSANDRA-3497)
  * attempt hint delivery every ten minutes, or when failure detector
    notifies us that a node is back up, whichever comes first. hint
    handoff throttle delay default changed to 1ms, from 50 (CASSANDRA-3554)

Modified: cassandra/branches/cassandra-1.0/interface/cassandra.thrift
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/interface/cassandra.thrift?rev=1222738&r1=1222737&r2=1222738&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/interface/cassandra.thrift (original)
+++ cassandra/branches/cassandra-1.0/interface/cassandra.thrift Fri Dec 23 16:39:01 2011
@@ -46,7 +46,7 @@ namespace rb CassandraThrift
 # for every edit that doesn't result in a change to major/minor.
 #
 # See the Semantic Versioning Specification (SemVer) http://semver.org.
-const string VERSION = "19.19.0"
+const string VERSION = "19.20.0"

 #
@@ -414,6 +414,7 @@ struct CfDef {
 30: optional map<string,string> compaction_strategy_options,
 31: optional i32 row_cache_keys_to_save,
 32: optional map<string,string> compression_options,
+33: optional double bloom_filter_fp_chance,
 }

 /* describes a keyspace. */

Modified: cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java?rev=1222738&r1=1222737&r2=1222738&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (original)
+++ cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java Fri Dec 23 16:39:01 2011
@@ -17041,6 +17041,8 @@ public class Cassandra {

 private void readObject(java.io.ObjectInputStream in) throws java.io.IOException, ClassNotFoundException {
 try {
+// it doesn't seem like you should have to do this, but java serialization is wacky, and doesn't call the default constructor.
+__isset_bit_vector = new BitSet(1);
 read(new org.apache.thrift.protocol.TCompactProtocol(new org.apache.thrift.transport.TIOStreamTransport(in)));
 } catch (org.apache.thrift.TException te) {
 throw new java.io.IOException(te);

Modified: cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java?rev=1222738&r1=1222737&r2=1222738&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java (original)
+++ cassandra/branches/cassandra-1.0/interface/thrift/gen-java/org/apache/cassandra/thrift/CfDef.java Fri Dec 23 16:39:01 2011
@@ -71,6 +71,7 @@ public class CfDef implements org.apache
 private static final org.apache.thrift.protocol.TField COMPACTION_STRATEGY_OPTIONS_FIELD_DESC = new org.apache.thrift.protocol.TField("compaction_strategy_options", org.apache.thrift.protocol.TType.MAP, (short)30);
 private static final org.apache.thrift.protocol.TField ROW_CACHE_KEYS_TO_SAVE_FIELD_DESC = new
svn commit: r1222743 - in /cassandra/trunk: ./ conf/ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/io/compress/ src/ja
Author: jbellis
Date: Fri Dec 23 16:44:47 2011
New Revision: 1222743

URL: http://svn.apache.org/viewvc?rev=1222743&view=rev
Log:
merge from 1.0

Modified:
    cassandra/trunk/ (props changed)
    cassandra/trunk/CHANGES.txt
    cassandra/trunk/conf/cassandra.yaml
    cassandra/trunk/contrib/ (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed)
    cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed)
    cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
    cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStoreMBean.java
    cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManager.java
    cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java
    cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
    cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressionParameters.java
    cassandra/trunk/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
    cassandra/trunk/src/java/org/apache/cassandra/service/GCInspector.java
    cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
    cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
    cassandra/trunk/src/java/org/apache/cassandra/service/StorageServiceMBean.java

Propchange: cassandra/trunk/
------------------------------------------------------------------------------
--- svn:mergeinfo (original)
+++ svn:mergeinfo Fri Dec 23 16:44:47 2011
@@ -1,10 +1,10 @@
 /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1211709
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1198724,1198726-1206097,1206099-1212854,1212938,1214916,1222372
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1198724,1198726-1206097,1206099-1220925,1220927-1222440
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
-/cassandra/branches/cassandra-1.0:1167085-1222420
+/cassandra/branches/cassandra-1.0:1167085-1222470
 /cassandra/branches/cassandra-1.0.0:1167104-1167229,1167232-1181093,1181741,1181816,1181820,1182951,1183243
 /cassandra/branches/cassandra-1.0.5:1208016
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1222743&r1=1222742&r2=1222743&view=diff
==============================================================================
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Fri Dec 23 16:44:47 2011
@@ -32,6 +32,9 @@
  * optimize memtable iteration during range scan (CASSANDRA-3638)

 1.0.7
+ * attempt hint delivery every ten minutes, or when failure detector
+   notifies us that a node is back up, whichever comes first. hint
+   handoff throttle delay default changed to 1ms, from 50 (CASSANDRA-3554)
  * add nodetool setstreamthroughput (CASSANDRA-3571)
  * fix assertion when dropping a columnfamily with no sstables (CASSANDRA-3614)
  * more efficient allocation of small bloom filters (CASSANDRA-3618)
@@ -40,6 +43,7 @@
  * stop thrift service in shutdown hook so we can quiesce MessagingService (CASSANDRA-3335)
 Merged from 0.8:
+ * avoid logging (harmless) exception when GC takes < 1ms (CASSANDRA-3656)
  * prevent new nodes from thinking down nodes are up forever (CASSANDRA-3626)
  * Flush non-cfs backed secondary indexes (CASSANDRA-3659)

Modified: cassandra/trunk/conf/cassandra.yaml
URL: http://svn.apache.org/viewvc/cassandra/trunk/conf/cassandra.yaml?rev=1222743&r1=1222742&r2=1222743&view=diff
==============================================================================
--- cassandra/trunk/conf/cassandra.yaml (original)
+++ cassandra/trunk/conf/cassandra.yaml Fri Dec 23 16:44:47 2011
@@ -26,8 +26,8 @@ hinted_handoff_enabled: true
 # this defines the maximum amount of time a dead host will have hints
 # generated. After it has been dead this long, hints will be dropped.
 max_hint_window_in_ms: 3600000 # one hour
-# Sleep this long after delivering each row or row fragment
-hinted_handoff_throttle_delay_in_ms: 50
+# Sleep this long after delivering each hint
+hinted_handoff_throttle_delay_in_ms: 1

 # authentication backend, implementing IAuthenticator; used to identify users
 authenticator: org.apache.cassandra.auth.AllowAllAuthenticator

Propchange: cassandra/trunk/contrib/
[jira] [Commented] (CASSANDRA-3667) We need a way to deactivate row/key caching on a per-cf basis.
[ https://issues.apache.org/jira/browse/CASSANDRA-3667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175536#comment-13175536 ]

Radim Kolar commented on CASSANDRA-3667:
----------------------------------------

you can reuse old cache settings for that purpose. if number of cached rows/keys is nonzero then use new global cache

We need a way to deactivate row/key caching on a per-cf basis.
--------------------------------------------------------------

Key: CASSANDRA-3667
URL: https://issues.apache.org/jira/browse/CASSANDRA-3667
Project: Cassandra
Issue Type: Improvement
Reporter: Pavel Yaskevich
Assignee: Pavel Yaskevich

Initial idea would be to either have a boolean flag if we only want to allow disabling row cache, or some multi-value caches option that could be none, key_only, row_only or all.
svn commit: r1222806 - in /cassandra/branches/cassandra-1.0: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/index/ src/java/org/apache/cassandra/db/index/keys/
Author: jake
Date: Fri Dec 23 19:10:54 2011
New Revision: 1222806

URL: http://svn.apache.org/viewvc?rev=1222806&view=rev
Log:
Secondary Indexes should report memory consumption
Patch by tjake; reviewed by jbellis for CASSANDRA-3155

Modified:
    cassandra/branches/cassandra-1.0/CHANGES.txt
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
    cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/keys/KeysIndex.java

Modified: cassandra/branches/cassandra-1.0/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/CHANGES.txt?rev=1222806&r1=1222805&r2=1222806&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/CHANGES.txt (original)
+++ cassandra/branches/cassandra-1.0/CHANGES.txt Fri Dec 23 19:10:54 2011
@@ -14,7 +14,7 @@ Merged from 0.8:
  * avoid logging (harmless) exception when GC takes < 1ms (CASSANDRA-3656)
  * prevent new nodes from thinking down nodes are up forever (CASSANDRA-3626)
  * Flush non-cfs backed secondary indexes (CASSANDRA-3659)
-
+ * Secondary Indexes should report memory consumption (CASSANDRA-3155)

 1.0.6
  * (CQL) fix cqlsh support for replicate_on_write (CASSANDRA-3596)

Modified: cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1222806&r1=1222805&r2=1222806&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java (original)
+++ cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/ColumnFamilyStore.java Fri Dec 23 19:10:54 2011
@@ -1028,10 +1028,7 @@ public class ColumnFamilyStore implement

 public long getTotalMemtableLiveSize()
 {
-long total = 0;
-for (ColumnFamilyStore cfs : concatWithIndexes())
-total += cfs.getMemtableThreadSafe().getLiveSize();
-return total;
+return getMemtableDataSize() + indexManager.getTotalLiveSize();
 }

 public int getMemtableSwitchCount()

Modified: cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java?rev=1222806&r1=1222805&r2=1222806&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java (original)
+++ cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndex.java Fri Dec 23 19:10:54 2011
@@ -112,6 +112,11 @@ public abstract class SecondaryIndex
 public abstract void forceBlockingFlush() throws IOException;

 /**
+ * Get current amount of memory this index is consuming (in bytes)
+ */
+public abstract long getLiveSize();
+
+/**
  * Allow access to the underlying column family store if there is one
  * @return the underlying column family store or null
  */

Modified: cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java?rev=1222806&r1=1222805&r2=1222806&view=diff
==============================================================================
--- cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java (original)
+++ cassandra/branches/cassandra-1.0/src/java/org/apache/cassandra/db/index/SecondaryIndexManager.java Fri Dec 23 19:10:54 2011
@@ -328,6 +328,27 @@ public class SecondaryIndexManager
 return indexList.keySet();
 }

+/**
+ * @return total current ram size of all indexes
+ */
+public long getTotalLiveSize()
+{
+long total = 0;
+
+// we use identity map because per row indexes use same instance
+// across many columns
+IdentityHashMap<SecondaryIndex, Object> indexList = new IdentityHashMap<SecondaryIndex, Object>();
+
+for (Map.Entry<ByteBuffer, SecondaryIndex> entry : indexesByColumn.entrySet())
+{
+SecondaryIndex index = entry.getValue();
+
+if (indexList.put(index, index) == null)
+total += index.getLiveSize();
+}
+
+return total;
+}

 /**
  * Removes obsolete index entries and creates new ones for the given row key

Modified:
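The `getTotalLiveSize()` method added in r1222806 counts each index instance once even when several columns map to the same instance. A small standalone sketch of that identity-based de-duplication is below; the `Index` interface and the `LiveSizeSketch` class are hypothetical stand-ins for illustration, not Cassandra types, and the sketch uses an identity set rather than the map used in the patch.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: sum a size over distinct instances, compared by reference.
final class LiveSizeSketch {
    // Stand-in for SecondaryIndex; only the size accessor matters here.
    interface Index {
        long getLiveSize();
    }

    static long totalLiveSize(Map<String, Index> indexesByColumn) {
        // Identity semantics: two entries count as the same index only if they are the same object.
        Set<Index> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        long total = 0;
        for (Index index : indexesByColumn.values()) {
            if (seen.add(index))          // true only the first time this instance is met
                total += index.getLiveSize();
        }
        return total;
    }
}
```

Using reference identity rather than `equals()` avoids depending on how an index class defines equality, which matches the comment in the patch about per-row indexes sharing one instance across many columns.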
[jira] [Created] (CASSANDRA-3668) Performance of sstablloader is affected in 1.0.x
Performance of sstablloader is affected in 1.0.x
------------------------------------------------

Key: CASSANDRA-3668
URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Fix For: 1.0.7

One my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
Due to above reported issue generator problem is solved but performance of the sstableloader is still issue.
Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status. But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.
[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manish Zope updated CASSANDRA-3668:
-----------------------------------

Summary: Performance of sstableloader is affected in 1.0.x (was: Performance of sstablloader is affected in 1.0.x)

Performance of sstableloader is affected in 1.0.x
-------------------------------------------------

Key: CASSANDRA-3668
URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Fix For: 1.0.7
Original Estimate: 96h
Remaining Estimate: 96h

One my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
Due to above reported issue generator problem is solved but performance of the sstableloader is still issue.
Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status. But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.
[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manish Zope updated CASSANDRA-3668:
-----------------------------------

Remaining Estimate: 48h (was: 96h)
Original Estimate: 48h (was: 96h)

Performance of sstableloader is affected in 1.0.x
-------------------------------------------------

Key: CASSANDRA-3668
URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Fix For: 1.0.7
Original Estimate: 48h
Remaining Estimate: 48h

One my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
Due to above reported issue generator problem is solved but performance of the sstableloader is still issue.
Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status. But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.
[jira] [Updated] (CASSANDRA-3669) [patch] Word count sample has a flawed addToMutationMap, fix
[ https://issues.apache.org/jira/browse/CASSANDRA-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Brosius updated CASSANDRA-3669:
------------------------------------

Attachment: mutation_sample.diff

[patch] Word count sample has a flawed addToMutationMap, fix
------------------------------------------------------------

Key: CASSANDRA-3669
URL: https://issues.apache.org/jira/browse/CASSANDRA-3669
Project: Cassandra
Issue Type: Improvement
Affects Versions: 1.0.6
Reporter: Dave Brosius
Priority: Trivial
Attachments: mutation_sample.diff

The WordCount example shows how to use client.batch_mutate, and has a helper method for building a mutation map. While the example works properly, the example addToMutationMap is flawed in that it won't allow adding of multiple columns to the same row, as is what is needed to perform a 'sql like insert' operation, which is the most likely example someone learning cassandra will want to do.
Fixed the sample addToMutationMap code so that it works correctly for multi column inserts in one 'row'.
[jira] [Created] (CASSANDRA-3669) [patch] Word count sample has a flawed addToMutationMap, fix
[patch] Word count sample has a flawed addToMutationMap, fix
------------------------------------------------------------

Key: CASSANDRA-3669
URL: https://issues.apache.org/jira/browse/CASSANDRA-3669
Project: Cassandra
Issue Type: Improvement
Affects Versions: 1.0.6
Reporter: Dave Brosius
Priority: Trivial
Attachments: mutation_sample.diff

The WordCount example shows how to use client.batch_mutate, and has a helper method for building a mutation map. While the example works properly, the example addToMutationMap is flawed in that it won't allow adding of multiple columns to the same row, as is what is needed to perform a 'sql like insert' operation, which is the most likely example someone learning cassandra will want to do.
Fixed the sample addToMutationMap code so that it works correctly for multi column inserts in one 'row'.
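To illustrate the kind of helper CASSANDRA-3669 is asking for, here is a minimal sketch of a mutation-map builder that appends to an existing row entry instead of replacing it. It is not the attached mutation_sample.diff: plain strings stand in for the Thrift key and `Mutation` types, and the class name `MutationMapSketch` is invented for the example. How exactly the original helper fails is not shown in the report, so the "fresh entry on every call" failure mode mentioned in the comment is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: row key -> column family -> list of mutations.
final class MutationMapSketch {
    static void addToMutationMap(Map<String, Map<String, List<String>>> mutationMap,
                                 String rowKey, String columnFamily, String mutation) {
        // Reuse the existing nested map/list when present; a helper that always
        // put()s a fresh entry would silently drop earlier columns for the row.
        mutationMap.computeIfAbsent(rowKey, k -> new HashMap<>())
                   .computeIfAbsent(columnFamily, k -> new ArrayList<>())
                   .add(mutation);
    }
}
```

Calling the helper twice with the same row key and column family leaves two mutations in the list, which is the multi-column, 'sql like insert' case the report describes.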
[jira] [Updated] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manish Zope updated CASSANDRA-3668:
-----------------------------------

Description:
One of my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
As stated in above issue generator performance is rectified but performance of the sstableloader is still an issue.
3589 is marked as duplicate of 3618.Both issues shows resolved status.But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.
Please let us know if you guys require further inputs from our side.

was:
One my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
Due to above reported issue generator problem is solved but performance of the sstableloader is still issue.
Isuue 3589 is marked as duplicate of 3618.Both issues shows resolved status. But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.

Performance of sstableloader is affected in 1.0.x
-------------------------------------------------

Key: CASSANDRA-3668
URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Fix For: 1.0.7
Original Estimate: 48h
Remaining Estimate: 48h

One of my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
As stated in above issue generator performance is rectified but performance of the sstableloader is still an issue.
3589 is marked as duplicate of 3618.Both issues shows resolved status.But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.
Please let us know if you guys require further inputs from our side.
[jira] [Assigned] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis reassigned CASSANDRA-3668:
-----------------------------------------

Assignee: Yuki Morishita

Can you tell us how to reproduce? What kind of degradation are you seeing?

Performance of sstableloader is affected in 1.0.x
-------------------------------------------------

Key: CASSANDRA-3668
URL: https://issues.apache.org/jira/browse/CASSANDRA-3668
Project: Cassandra
Issue Type: Bug
Components: API
Affects Versions: 1.0.7
Reporter: Manish Zope
Assignee: Yuki Morishita
Fix For: 1.0.7
Original Estimate: 48h
Remaining Estimate: 48h

One of my colleague had reported the bug regarding the degraded performance of the sstable generator and sstable loader.
ISSUE :- https://issues.apache.org/jira/browse/CASSANDRA-3589
As stated in above issue generator performance is rectified but performance of the sstableloader is still an issue.
3589 is marked as duplicate of 3618.Both issues shows resolved status.But the problem with sstableloader still exists.
So opening other issue so that sstbleloader problem should not go unnoticed.
FYI : We have tested the generator part with the patch given in 3589.Its Working fine.
Please let us know if you guys require further inputs from our side.
[jira] [Commented] (CASSANDRA-3668) Performance of sstableloader is affected in 1.0.x
[ https://issues.apache.org/jira/browse/CASSANDRA-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175561#comment-13175561 ] Jonathan Ellis commented on CASSANDRA-3668: --- And to clarify: we're still talking about compared to 0.8.7 right? Performance of sstableloader is affected in 1.0.x - Key: CASSANDRA-3668 URL: https://issues.apache.org/jira/browse/CASSANDRA-3668 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.0.7 Reporter: Manish Zope Assignee: Yuki Morishita Fix For: 1.0.7 Original Estimate: 48h Remaining Estimate: 48h
[jira] [Commented] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered puts the node down
[ https://issues.apache.org/jira/browse/CASSANDRA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175566#comment-13175566 ] Jonathan Ellis commented on CASSANDRA-3666: --- Were there any error messages logged? Changing compaction strategy from Leveled to SizeTiered puts the node down -- Key: CASSANDRA-3666 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.6 Environment: Windows Server 2008 R2 64bit Reporter: Viktor Jevdokimov When the column family compaction strategy is changed from Leveled to SizeTiered while Leveled compaction tasks are pending, Cassandra starts flooding the logs with thousands of messages per second: Nothing to compact in ColumnFamily1. Use forceUserDefinedCompaction if you wish to force compaction of single sstables (e.g. for tombstone collection) As a result, the log disk fills up and the system goes down.
[jira] [Updated] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered puts the node down
[ https://issues.apache.org/jira/browse/CASSANDRA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3666: -- Attachment: 3666.txt I bet the culprit is how LCS sets the min/max compaction threshold but STCS does not. Can you try the attached patch? Changing compaction strategy from Leveled to SizeTiered puts the node down -- Key: CASSANDRA-3666 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.6 Environment: Windows Server 2008 R2 64bit Reporter: Viktor Jevdokimov Labels: compaction Fix For: 1.0.7 Attachments: 3666.txt
[jira] [Updated] (CASSANDRA-3666) Changing compaction strategy from Leveled to SizeTiered logs millions of messages about nothing to compact
[ https://issues.apache.org/jira/browse/CASSANDRA-3666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3666: -- Affects Version/s: (was: 1.0.6) 1.0.0 Summary: Changing compaction strategy from Leveled to SizeTiered logs millions of messages about nothing to compact (was: Changing compaction strategy from Leveled to SizeTiered puts the node down) Changing compaction strategy from Leveled to SizeTiered logs millions of messages about nothing to compact -- Key: CASSANDRA-3666 URL: https://issues.apache.org/jira/browse/CASSANDRA-3666 Project: Cassandra Issue Type: Bug Affects Versions: 1.0.0 Environment: Windows Server 2008 R2 64bit Reporter: Viktor Jevdokimov Assignee: Jonathan Ellis Labels: compaction Fix For: 1.0.7 Attachments: 3666.txt
[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers
[ https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175574#comment-13175574 ] paul cannon commented on CASSANDRA-3507: All default Linux-based OS installs include Python nowadays, and so does Mac OS X. A py2exe compilation of cqlsh, along with a gui shell like IDLE, is a possibility for the Windows side. So no, there's no reason that switching to cqlsh should make the install or bootstrap processes more complicated. Also, cqlsh is already written in python, and includes a lot of features which would probably be overly difficult or time-consuming to rewrite on the JVM. Proposal: separate cqlsh from CQL drivers - Key: CASSANDRA-3507 URL: https://issues.apache.org/jira/browse/CASSANDRA-3507 Project: Cassandra Issue Type: Improvement Components: Packaging, Tools Affects Versions: 1.0.3 Environment: Debian-based systems Reporter: paul cannon Assignee: paul cannon Priority: Minor Labels: cql, cqlsh Fix For: 1.1 Whereas: * It has been shown to be very desirable to decouple the release cycles of Cassandra from the various client CQL drivers, and * It is also desirable to include a good interactive CQL client with releases of Cassandra, and * It is not desirable for Cassandra releases to depend on 3rd-party software which is neither bundled with Cassandra nor readily available for every target platform, but * Any good interactive CQL client will require a CQL driver; Therefore, be it resolved that: * cqlsh will not use an official or supported CQL driver, but will include its own private CQL driver, not intended for use by anything else, and * the Cassandra project will still recommend installing and using a proper CQL driver for client software. To ease maintenance, the private CQL driver included with cqlsh may very well be created by copying the python CQL driver from one directory into another, but the user shouldn't rely on this. 
Maybe we even ought to take some minor steps to discourage its use for other purposes. Thoughts?
[jira] [Commented] (CASSANDRA-3397) Problem markers don't show up in Eclipse
[ https://issues.apache.org/jira/browse/CASSANDRA-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175577#comment-13175577 ] David Allsopp commented on CASSANDRA-3397: -- I was just interested in the error markers, thanks - I too found the Ant Builder too heavy! Problem markers don't show up in Eclipse Key: CASSANDRA-3397 URL: https://issues.apache.org/jira/browse/CASSANDRA-3397 Project: Cassandra Issue Type: Bug Components: Packaging Affects Versions: 1.0.0 Environment: Eclipse Reporter: David Allsopp Assignee: David Allsopp Priority: Minor Labels: ant, eclipse, ide Fix For: 1.0.7 Attachments: Cassandra-3397.patch The generated Eclipse files install an Ant Builder to build Cassandra within Eclipse. This appears to mean that the default Java Builder is not present. As a result, no problem markers show up in the Problem view or the Package Explorer etc. when there are compiler errors or warnings - you have to study the console output, then navigate manually to the sources of the problems, which is very tedious. It seems to be possible to re-install the default Java Builder in parallel with the Ant Builder, getting the best of both worlds. I have documented this on the wiki at http://wiki.apache.org/cassandra/RunningCassandraInEclipse I was wondering a) whether this can be done automatically by the generate-eclipse-files Ant target, and b) whether using both Builders will be a problem if one is working on any of the generated code (Thrift, CQL etc). If so, the Java Builder can be temporarily disabled by unticking it under Properties-Builders... See also https://issues.apache.org/jira/browse/CASSANDRA-2854
[jira] [Updated] (CASSANDRA-3633) update stress to support prepared statements
[ https://issues.apache.org/jira/browse/CASSANDRA-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-3633: -- Attachment: v1-0003-support-for-server-side-prepared-statements.txt v1-0002-wrap-Cassandra.Client-for-prepared-statement-storage.txt v1-0001-CASSANDRA-3633-refactor-for-parametized-queries.txt update stress to support prepared statements Key: CASSANDRA-3633 URL: https://issues.apache.org/jira/browse/CASSANDRA-3633 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Labels: cql Fix For: 1.1 Attachments: v1-0001-CASSANDRA-3633-refactor-for-parametized-queries.txt, v1-0002-wrap-Cassandra.Client-for-prepared-statement-storage.txt, v1-0003-support-for-server-side-prepared-statements.txt The {{stress}} utility needs to be updated for testing prepared statements.
[jira] [Updated] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-3634: -- Attachment: v1-0002-change-bind-parms-from-string-to-bytes.txt v1-0001-CASSANDRA-3634-generated-thrift-code.txt compare string vs. binary prepared statement parameters --- Key: CASSANDRA-3634 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Labels: cql Fix For: 1.1 Attachments: v1-0001-CASSANDRA-3634-generated-thrift-code.txt, v1-0002-change-bind-parms-from-string-to-bytes.txt Perform benchmarks to compare the performance of string and pre-serialized binary parameters to prepared statements.
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175581#comment-13175581 ] Eric Evans commented on CASSANDRA-3634: --- v1-0001-CASSANDRA-3634-generated-thrift-code.txt and v1-0002-change-bind-parms-from-string-to-bytes.txt convert string bind params to binary for purposes of performance testing. compare string vs. binary prepared statement parameters --- Key: CASSANDRA-3634 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Labels: cql Fix For: 1.1 Attachments: v1-0001-CASSANDRA-3634-generated-thrift-code.txt, v1-0002-change-bind-parms-from-string-to-bytes.txt Perform benchmarks to compare the performance of string and pre-serialized binary parameters to prepared statements.
[jira] [Updated] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-3634: -- Attachment: stress-change-bind-parms-to-BB.patch stress-change-bind-parms-to-BB.patch updates stress to use binary query parameters for prepared statements. This patch only updates the operations used in testing (it would need more work before committing). compare string vs. binary prepared statement parameters --- Key: CASSANDRA-3634 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Labels: cql Fix For: 1.1 Attachments: stress-change-bind-parms-to-BB.patch, v1-0001-CASSANDRA-3634-generated-thrift-code.txt, v1-0002-change-bind-parms-from-string-to-bytes.txt Perform benchmarks to compare the performance of string and pre-serialized binary parameters to prepared statements.
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175588#comment-13175588 ] Eric Evans commented on CASSANDRA-3634: ---

Here is the performance comparison. I stuck to the same tests I performed earlier (those earlier results can be found [here|http://www.acunu.com/blogs/eric-evans/cql-benchmarking]). The patches to support binary query parameters for Cassandra and {{stress}} are attached to this issue, and the raw results can be found [here|http://people.apache.org/~eevans/3634].

_Note: Percentages listed are in relation to RPC performance._

h3. Inserts, 20M rows x 5 columns

!http://people.apache.org/~eevans/3634/insert_20mx5_noidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|23,681/s|1.1ms|
|CQL|21,128/s (-11%)|1.3ms (+11%)|
|CQL w/ Prepared statements|23,911/s|1.1ms|
|CQL w/ Prepared statements (binary parms)|24,919/s (+5%)|1.2ms (+5%)|

h3. Inserts, 10M rows x 5 columns, KEYS index

!http://people.apache.org/~eevans/3634/insert_10mx5_keysidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|10,054/s|5ms|
|CQL|9,326/s (-7%)|5.4ms (+8%)|
|CQL w/ Prepared statements|10,413/s (+3%)|4.8ms (-3%)|
|CQL w/ Prepared statements (binary parms)|10,299/s (+2%)|5ms|

h3. Counter increments, 10M rows x 5 columns

!http://people.apache.org/~eevans/3634/count_10mx5_noidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|22,075/s|1.2ms|
|CQL|20,645/s (-6%)|1.2ms (+2%)|
|CQL w/ Prepared statements|24,286/s (+9%)|1.2ms (-1%)|
|CQL w/ Prepared statements (binary parms)|23,359/s (+5%)|1.2ms|

h3. Reads, 20M rows x 5 columns

!http://people.apache.org/~eevans/3634/read_20mx5_noidx_t50_20111223.png|width=700!

|| ||Average OP rate||Average Latency||
|RPC|22,285/s|2.1ms|
|CQL|20,080/s (-10%)|2.3ms (+9%)|
|CQL w/ Prepared statements|22,374/s|2.1ms (-1%)|
|CQL w/ Prepared statements (binary parms)|22,176/s|2.1ms|

compare string vs. binary prepared statement parameters --- Key: CASSANDRA-3634 URL: https://issues.apache.org/jira/browse/CASSANDRA-3634 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Labels: cql Fix For: 1.1 Attachments: stress-change-bind-parms-to-BB.patch, v1-0001-CASSANDRA-3634-generated-thrift-code.txt, v1-0002-change-bind-parms-from-string-to-bytes.txt Perform benchmarks to compare the performance of string and pre-serialized binary parameters to prepared statements.
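For readers who want to see how an "in relation to RPC" percentage is derived, the arithmetic can be sketched as below. This is only an illustration: the `relative_change` helper is hypothetical, and it works from the rounded averages quoted in the comment for the 20M-row insert test. The posted percentages were presumably computed from the raw results, so other rows may differ from this calculation by a point.

```python
def relative_change(value, baseline):
    """Percent change of value relative to baseline, rounded to a whole percent."""
    return round((value - baseline) / baseline * 100)


# Average OP rates (operations per second) quoted for the 20M-row insert test.
rpc_rate = 23681
insert_rates = {
    "CQL": 21128,
    "CQL w/ Prepared statements (binary parms)": 24919,
}

for name, rate in insert_rates.items():
    # A negative result means fewer operations per second than RPC.
    print(f"{name}: {relative_change(rate, rpc_rate):+d}%")
```

The same helper applies to latencies, where a positive change means slower than RPC.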
[jira] [Updated] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-3623: - Attachment: 0002-tests-for-MMaped-Compression-segmented-file-v2.patch 0001-MMaped-Compression-segmented-file-v2.patch The attached patch has an optimization on memcpy which the earlier one didn't. Performance: Current trunk: 400+ms Avg; Removing CRC (CASSANDRA-3611): 200+ms Avg; With this patch: 100+ms Avg. use MMapedBuffer in CompressedSegmentedFile.getSegment -- Key: CASSANDRA-3623 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.1 Reporter: Vijay Assignee: Vijay Labels: compression Fix For: 1.1 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 0001-MMaped-Compression-segmented-file.patch, 0002-tests-for-MMaped-Compression-segmented-file-v2.patch CompressedSegmentedFile.getSegment seems to open a new file and doesn't seem to use the MMap, hence higher CPU usage on the nodes and higher latencies on reads. This ticket is to implement the TODO mentioned in CompressedRandomAccessReader // TODO refactor this to separate concept of buffer to avoid lots of read() syscalls and compression buffer but I think a separate class for the Buffer will be better.
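The general idea behind the ticket, mapping a compressed file once and decompressing chunks out of the mapping instead of opening the file and issuing read() calls per segment, can be sketched as follows. This is not the attached patch and not Cassandra code: it uses Python's `mmap` and `zlib` as stand-ins for Java's mapped buffers and Snappy, and the chunk layout, `write_chunks`, and `read_chunk_mapped` are hypothetical names invented for the illustration.

```python
import mmap
import os
import tempfile
import zlib


def write_chunks(path, chunks):
    """Write each chunk zlib-compressed; return (offset, length) per compressed chunk."""
    index = []
    with open(path, "wb") as f:
        for chunk in chunks:
            compressed = zlib.compress(chunk)
            index.append((f.tell(), len(compressed)))
            f.write(compressed)
    return index


def read_chunk_mapped(mapped, index, i):
    """Decompress chunk i from the mapped region, with no open() or read() call."""
    offset, length = index[i]
    return zlib.decompress(mapped[offset:offset + length])


# Hypothetical sample data: four repetitive chunks that compress well.
chunks = [(b"row-%d " % n) * 100 for n in range(4)]

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    index = write_chunks(path, chunks)
    with open(path, "rb") as f:
        # Map the file once; every later chunk lookup reuses this mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            restored = [read_chunk_mapped(mapped, index, i) for i in range(len(chunks))]
finally:
    os.remove(path)
```

In this sketch the per-chunk cost is a slice of the mapping plus decompression, which is the kind of saving the ticket attributes to avoiding a fresh file open per segment.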
[Cassandra Wiki] Update of Cassandra2474 by JonathanEllis
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The Cassandra2474 page has been changed by JonathanEllis: http://wiki.apache.org/cassandra/Cassandra2474?action=diff&rev1=2&rev2=3

Comment: add Alpha, Beta, and Discussion Summary sections

  TableOfContents(100)
+ == Goals ==
+
+ Primary: provide a CQL syntax for updating and querying composite column families.
+
+ Secondary goal: proposed syntax should be implementable by the Hive driver with the minimum of changes from mainline Hive. In particular, changes to the Hive parser are too difficult to maintain long-term and are Right Out. We would prefer to avoid changes to the Hive metastore but this is doable if necessary.
+
+ Tertiary goal: it would be nice to also support supercolumns
+
+ == Non-goals ==
+
+ Supporting arbitrarily-and-non-uniformly nested document data is a non-goal. https://issues.apache.org/jira/browse/CASSANDRA-3647 is created to follow up on this related problem.
+
  == Alpha ==
- Discussion starts [[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13046834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046834|here]]
+ The short-lived first proposal envisioned adding the prefix from which to select a resultset to the table name in the FROM clause. Discussion starts [[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13046834&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046834|here]]
- === Goals ===
+ {{{
+ SELECT x, y FROM foo:bar WHERE parent='columnA'
+ }}}
- * FIXME: add goals
- * FIXME: add goals
- * FIXME: add goals
+ {{{
+ select a, b FROM foo:bar:columnA where subparent='x'
+ }}}
+
+ === Discussion Summary ===
+
+ Jonathan was thinking in terms of supercolumns for this early proposal. It's not clear how to generalize this to composites where the subcolumns are not explicitly named in the CompositeType definition.
+
+ This proposal would require a Hive metastore change, but the nail in the coffin is that this means you cannot use WHERE clauses with the parent parts of the column. So, no range queries (necessary for map/reduce) or even slices within the same row.
  == Beta ==
- Discussion starts [[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13095626&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13095626|here]]
+ This proposal suggests the use of a keyword or hint to indicate that a query is transposed. Discussion starts [[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13046937&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046937|here]]
- === Goals ===
+ The first part of the discussion is where to put the transposition marker:
- * FIXME: add goals
- * FIXME: add goals
- * FIXME: add goals
+ {{{
+ select /*+TRANSPOSED*/ key, column, subcolumn, value from foo;
+ }}}
+
+ {{{
+ select key, column, subcolumn, value from foo TRANSPOSED;
+ }}}
+
+ {{{
+ select transposed(key, column, subcolumn, value) from foo;
+ }}}
+
+ Settling on table:transposed because that requires no Hive changes:
+
+ {{{
+ select key, column, subcolumn, value from foo:transposed;
+ }}}
+
+ The second part, starting [[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13095626&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13095626|here]], digs into how to deal with destructuring the composite column name:
+
+ {{{
+ SELECT name AS (tweet_id, username), value AS body
+ FROM timeline:transposed
+ WHERE tweet_id = '95a789a' AND user_id = 'cscotta'
+ }}}
+
+ {{{
+ SELECT component1 AS tweet_id, component2 AS username, component3 location, value AS body
+ FROM timeline:transposed
+ WHERE user_id = '95a789a'
+ }}}
+
+ {{{
+ UPDATE tweets:transposed SET COMPOUND NAME ('2e1c3308', 'cscotta') = 'My motocycle...' WHERE KEY = key;
+ }}}
+
+ {{{
+ UPDATE tweets:transposed SET value = 'my motorcycle' WHERE KEY= key AND column = COMPOUND_NAME('2e1c3308', 'cscotta');
+ }}}
+
+ === Discussion Summary ===
+
+ There was general agreement that FROM foo:transposed is a reasonable syntax. However, neither the componentX syntax (where X is in range(1, number of components in the compositetype)) nor the name AS (x, y) syntax met with approval: the name AS syntax requires patching the Hive parser, and the componentX syntax is ugly and repetitive to use. The UPDATE syntaxes were also unsatisfactory.
  == Gamma ==
+ This proposal switches gears to dealing with transposition using DDL instead of
+
  Discussion starts [[https://issues.apache.org/jira/browse/CASSANDRA-2474?focusedCommentId=13171304&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13171304|here]]
- === Goals ===
-
- * FIXME: add goals
- * FIXME: add
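To make the transposition idea in the Beta proposal concrete, here is a minimal Python sketch. It is not CQL and not part of any proposal on the wiki page: the in-memory dictionary layout, the `transpose` helper, and the second sample tweet are all hypothetical, chosen only to show how each composite column (tweet_id, username) in a wide row could surface as its own result row under a FROM timeline:transposed style query.

```python
# Hypothetical wide row: row key -> {composite column name: column value}.
timeline = {
    "cscotta": {
        ("2e1c3308", "cscotta"): "My motorcycle...",
        ("95a789a", "example_user"): "A second, made-up tweet",
    }
}


def transpose(column_family, component_names):
    """Yield one dict per stored column, naming each composite component."""
    for key, columns in column_family.items():
        # Sort by composite name, mimicking comparator-ordered columns in a row.
        for composite_name, value in sorted(columns.items()):
            row = {"key": key}
            row.update(zip(component_names, composite_name))
            row["value"] = value
            yield row


rows = list(transpose(timeline, ("tweet_id", "username")))
for row in rows:
    print(row)
```

Each stored column becomes one result row whose fields are the row key, the named components of the composite column name, and the column value, which is roughly what the name AS (tweet_id, username) examples are trying to express.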
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175595#comment-13175595 ] Vijay commented on CASSANDRA-3623: --

Hot Methods before the patch (trunk):

Excl. User CPU sec.  %  Name
1480.474 100.00 Total
756.717 51.11 crc32
387.767 26.19 static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
54.814 3.70 org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
46.676 3.15 org.apache.cassandra.io.util.RandomAccessReader.init(java.io.File, int, boolean)
45.697 3.09 Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
39.417 2.66 memcpy
36.931 2.49 static@0xd8e9 (libpthread-2.5.so)
23.272 1.57 CompactibleFreeListSpace::block_size(const HeapWord*) const
22.766 1.54 SpinPause
12.593 0.85 BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
9.304 0.63 CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
8.468 0.57 CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
8.051 0.54 ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
5.400 0.36 madvise
4.619 0.31 CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
1.584 0.11 CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
1.551 0.10 SweepClosure::do_blk_careful(HeapWord*)

Hot Methods After the patch:

sec.  %  Name
537.681 100.00 Total
529.719 98.52 static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
4.168 0.78 memcpy
0.143 0.03 Unknown
0.121 0.02 send
0.121 0.02 sun.misc.Unsafe.park(boolean, long)
0.110 0.02 sun.misc.Unsafe.unpark(java.lang.Object)
0.088 0.02 Interpreter
0.077 0.01 org.apache.cassandra.utils.EstimatedHistogram.max()
0.077 0.01 recv
0.066 0.01 SpinPause
0.055 0.01 org.apache.cassandra.utils.EstimatedHistogram.mean()
0.044 0.01 java.lang.Object.wait(long)
0.044 0.01 org.apache.cassandra.utils.EstimatedHistogram.min()
0.044 0.01 __pthread_cond_signal
0.044 0.01 vtable stub
0.033 0.01 java.lang.Object.notify()
0.033 0.01 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
0.033 0.01 org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
0.033 0.01 PhaseLive::compute(unsigned)
0.033 0.01 poll
0.022 0.00 Arena::contains(const void*) const
0.022 0.00 CompactibleFreeListSpace::free() const
0.022 0.00 I2C/C2I adapters
0.022 0.00 IndexSetIterator::advance_and_next()
0.022 0.00 java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader)
0.022 0.00 java.lang.Long.getChars(long, int, char[])
0.022 0.00 java.nio.Bits.swap(int)

Before this patch response times:

Epoch | Rds/s | RdLat | Wrts/s | WrtLat | %user | %sys | %idle | %iowait | %steal | md0 r/s | w/s | rMB/s | wMB/s | NetRxKb | NetTxKb | Read percentiles | Write percentiles | Compacts
1324587443 | 15 | 186.305 | 0 | 0.000 | 27.85 | 0.02 | 71.83 | 0.24 | 0.05 | 3.89 | 0.00 | 0.12 | 0.00 | 41 | 45 | 99th 545.791 ms, 95th 454.826 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587455 | 15 | 1142.712 | 0 | 0.000 | 39.55 | 0.13 | 57.61 | 2.50 | 0.21 | 118.30 | 0.30 | 2.20 | 0.00 | 34 | 36 | 99th 8409.007 ms, 95th 8409.007 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587467 | 10 | 171.808 | 0 | 0.000 | 23.83 | 0.04 | 76.05 | 0.04 | 0.05 | 4.80 | 0.00 | 0.14 | 0.00 | 127 | 33 | 99th 454.826 ms, 95th 315.852 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587478 | 10 | 182.775 | 0 | 0.000 | 20.43 | 0.04 | 79.47 | 0.01 | 0.05 | 1.60 | 0.40 | 0.04 | 0.00 | 30 | 37 | 99th 379.022 ms, 95th 379.022 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587490 | 13 | 190.893 | 0 | 0.000 | 27.58 | 0.03 | 72.20 | 0.14 | 0.06 | 3.20 | 0.50 | 0.09 | 0.00 | 39 | 42 | 99th 545.791 ms, 95th 379.022 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587503 | 28 | 358.719 | 0 | 0.000 | 52.24 | 0.08 | 46.20 | 1.40 | 0.09 | 159.40 | 0.00 | 3.16 | 0.00 | 196 | 71 | 99th 3379.391 ms, 95th 943.127 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587517 | 13 | 194.281 | 0 | 0.000 | 16.68 | 0.02 | 83.23 | 0.04 | 0.02 | 2.40 | 0.30 | 0.07 | 0.00 | 38 | 41 | 99th 785.939 ms, 95th 545.791 ms | 99th 0.00 ms, 95th 0.00 ms | Pen/0
1324587535 | 36 | 662.410 | 0 | 0.000 | 58.34 | 0.08 | 41.42 | 0.06 | 0.10 | 3.60 | 0.20 | 0.11 | 0.00 | 173 | 81 | 99th 3379.391 ms
[jira] [Commented] (CASSANDRA-3507) Proposal: separate cqlsh from CQL drivers
[ https://issues.apache.org/jira/browse/CASSANDRA-3507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175596#comment-13175596 ] Jeremy Hanna commented on CASSANDRA-3507: - Makes sense. I hadn't realized so much had gone into the python based shell. I also hadn't realized it could be made into an executable for windows. Proposal: separate cqlsh from CQL drivers - Key: CASSANDRA-3507 URL: https://issues.apache.org/jira/browse/CASSANDRA-3507 Project: Cassandra Issue Type: Improvement Components: Packaging, Tools Affects Versions: 1.0.3 Environment: Debian-based systems Reporter: paul cannon Assignee: paul cannon Priority: Minor Labels: cql, cqlsh Fix For: 1.1
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175598#comment-13175598 ] Vijay commented on CASSANDRA-3623: -- The above test was done on a 12-node cluster, but the response times and the hot methods were collected from one random node in the cluster. This test was executed on AWS M2.4xl's with heap settings of 12/2. use MMapedBuffer in CompressedSegmentedFile.getSegment -- Key: CASSANDRA-3623 URL: https://issues.apache.org/jira/browse/CASSANDRA-3623 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.1 Reporter: Vijay Assignee: Vijay Labels: compression Fix For: 1.1 Attachments: 0001-MMaped-Compression-segmented-file-v2.patch, 0001-MMaped-Compression-segmented-file.patch, 0002-tests-for-MMaped-Compression-segmented-file-v2.patch
[jira] [Issue Comment Edited] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175595#comment-13175595 ]

Vijay edited comment on CASSANDRA-3623 at 12/23/11 10:30 PM:

Hot Methods before the patch (trunk, without any patch):

Excl. User CPU
    sec.       %  Name
1480.474  100.00  Total
 756.717   51.11  crc32
 387.767   26.19  static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
  54.814    3.70  org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
  46.676    3.15  org.apache.cassandra.io.util.RandomAccessReader.init(java.io.File, int, boolean)
  45.697    3.09  Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
  39.417    2.66  memcpy
  36.931    2.49  static@0xd8e9 (libpthread-2.5.so)
  23.272    1.57  CompactibleFreeListSpace::block_size(const HeapWord*) const
  22.766    1.54  SpinPause
  12.593    0.85  BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
   9.304    0.63  CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
   8.468    0.57  CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
   8.051    0.54  ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
   5.400    0.36  madvise
   4.619    0.31  CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
   1.584    0.11  CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
   1.551    0.10  SweepClosure::do_blk_careful(HeapWord*)

Hot Methods After the patch:

    sec.       %  Name
 537.681  100.00  Total
 529.719   98.52  static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
   4.168    0.78  memcpy
   0.143    0.03  Unknown
   0.121    0.02  send
   0.121    0.02  sun.misc.Unsafe.park(boolean, long)
   0.110    0.02  sun.misc.Unsafe.unpark(java.lang.Object)
   0.088    0.02  Interpreter
   0.077    0.01  org.apache.cassandra.utils.EstimatedHistogram.max()
   0.077    0.01  recv
   0.066    0.01  SpinPause
   0.055    0.01  org.apache.cassandra.utils.EstimatedHistogram.mean()
   0.044    0.01  java.lang.Object.wait(long)
   0.044    0.01  org.apache.cassandra.utils.EstimatedHistogram.min()
   0.044    0.01  __pthread_cond_signal
   0.044    0.01  vtable stub
   0.033    0.01  java.lang.Object.notify()
   0.033    0.01  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(java.lang.Runnable)
   0.033    0.01  org.apache.cassandra.io.compress.CompressedMappedFileDataInput.read()
   0.033    0.01  PhaseLive::compute(unsigned)
   0.033    0.01  poll
   0.022    0.00  Arena::contains(const void*) const
   0.022    0.00  CompactibleFreeListSpace::free() const
   0.022    0.00  I2C/C2I adapters
   0.022    0.00  IndexSetIterator::advance_and_next()
   0.022    0.00  java.lang.Class.forName0(java.lang.String, boolean, java.lang.ClassLoader)
   0.022    0.00  java.lang.Long.getChars(long, int, char[])
   0.022    0.00  java.nio.Bits.swap(int)

Before this patch response times (With crc chance set to 0):

Epoch       Rds/s     RdLat  Wrts/s  WrtLat  %user  %sys  %idle  %iowait  %steal  md0r/s   w/s  rMB/s  wMB/s  NetRxKb  NetTxKb  Read percentiles                   Write percentiles          Compacts
1324587443     15   186.305       0   0.000  27.85  0.02  71.83     0.24    0.05    3.89  0.00   0.12   0.00       41       45  99th 545.791 ms 95th 454.826 ms    99th 0.00 ms 95th 0.00 ms  Pen/0
1324587455     15  1142.712       0   0.000  39.55  0.13  57.61     2.50    0.21  118.30  0.30   2.20   0.00       34       36  99th 8409.007 ms 95th 8409.007 ms  99th 0.00 ms 95th 0.00 ms  Pen/0
1324587467     10   171.808       0   0.000  23.83  0.04  76.05     0.04    0.05    4.80  0.00   0.14   0.00      127       33  99th 454.826 ms 95th 315.852 ms    99th 0.00 ms 95th 0.00 ms  Pen/0
1324587478     10   182.775       0   0.000  20.43  0.04  79.47     0.01    0.05    1.60  0.40   0.04   0.00       30       37  99th 379.022 ms 95th 379.022 ms    99th 0.00 ms 95th 0.00 ms  Pen/0
1324587490     13   190.893       0   0.000  27.58  0.03  72.20     0.14    0.06    3.20  0.50   0.09   0.00       39       42  99th 545.791 ms 95th 379.022 ms    99th 0.00 ms 95th 0.00 ms  Pen/0
1324587503     28   358.719       0   0.000  52.24  0.08  46.20     1.40    0.09  159.40  0.00   3.16   0.00      196       71  99th 3379.391 ms 95th 943.127 ms   99th 0.00 ms 95th 0.00 ms  Pen/0
1324587517     13   194.281       0   0.000  16.68  0.02  83.23     0.04    0.02    2.40  0.30   0.07   0.00       38       41  99th 785.939 ms 95th 545.791 ms    99th 0.00 ms 95th 0.00 ms  Pen/0
1324587535     36   662.410       0   0.000  58.34  0.08
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175601#comment-13175601 ]

Pavel Yaskevich commented on CASSANDRA-3623:

Can you please compare your version with trunk without crc32? As it stands, it doesn't seem to be a fair match; it would be nice to see the same statistics about hot methods and response time.

The thing that I hate about MappedByteBuffer is that if you duplicate it, like you do in reBuffer(), unmap becomes impossible until every last duplicate is GC'ed. This implies that we won't be able to release old SSTables...
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175606#comment-13175606 ]

Vijay commented on CASSANDRA-3623:

I did it again, I confused everyone with my test data :) The hot methods shown above are the only data from trunk; the rest is without CRC. Hot methods without CRC and without this patch are as follows:

Excl. User CPU
    sec.       %  Name
 629.460  100.00  Total
 336.913   53.52  static@0x54999 (snappy-1.0.4.1-libsnappyjava.so)
  50.074    7.96  org.apache.cassandra.io.compress.CompressedRandomAccessReader.init(java.lang.String, org.apache.cassandra.io.compress.CompressionMetadata, boolean)
  43.057    6.84  org.apache.cassandra.io.util.RandomAccessReader.init(java.io.File, int, boolean)
  35.623    5.66  memcpy
  33.555    5.33  static@0xd8e9 (libpthread-2.5.so)
  30.673    4.87  Copy::pd_disjoint_words(HeapWord*, HeapWord*, unsigned long)
  26.384    4.19  CompactibleFreeListSpace::block_size(const HeapWord*) const
  15.199    2.41  SpinPause
  11.966    1.90  BlockOffsetArrayNonContigSpace::block_start_unsafe(const void*) const
   8.479    1.35  CardTableModRefBSForCTRS::card_will_be_scanned(signed char)
   8.007    1.27  CardTableModRefBS::non_clean_card_iterate_work(MemRegion, MemRegionClosure*, bool)
   5.169    0.82  madvise
   5.059    0.80  ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
   4.146    0.66  CardTableModRefBS::process_chunk_boundaries(Space*, DirtyCardToOopClosure*, MemRegion, MemRegion, signed char**, unsigned long, unsigned long)
   2.431    0.39  CardTableModRefBS::dirty_card_range_after_reset(MemRegion, bool, int)
   1.375    0.22  SweepClosure::do_blk_careful(HeapWord*)
   0.825    0.13  Par_PushOrMarkClosure::do_oop(oopDesc*)
   0.616    0.10  GenericTaskQueue<oopDesc*, 131072>::pop_local(oopDesc*)
   0.561    0.09  instanceKlass::oop_oop_iterate_nv(oopDesc*, Par_PushOrMarkClosure*)
   0.473    0.08  CardTableModRefBS::process_stride(Space*, MemRegion, int, int, DirtyCardToOopClosure*, MemRegionClosure*, bool, signed char**, unsigned long, unsigned long)
   0.374    0.06  Par_MarkFromRootsClosure::scan_oops_in_oop(HeapWord*)
   0.319    0.05  BitMap::par_at_put(unsigned long, bool)
   0.308    0.05  MemRegion::intersection(MemRegion) const
   0.275    0.04  munmap
   0.220    0.03  CardTableModRefBS::dirty_card_iterate(MemRegion, MemRegionClosure*)

Hope this makes sense.
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175607#comment-13175607 ]

Vijay commented on CASSANDRA-3623:

BTW: I can remove the duplicate() (I didn't realize the implications), if you think the rest is fine.
[jira] [Commented] (CASSANDRA-3374) CQL can't create column with compression or that use leveled compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-3374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175619#comment-13175619 ] paul cannon commented on CASSANDRA-3374: +1 CQL can't create column with compression or that use leveled compaction --- Key: CASSANDRA-3374 URL: https://issues.apache.org/jira/browse/CASSANDRA-3374 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.0.0 Reporter: Sylvain Lebresne Assignee: Pavel Yaskevich Priority: Minor Labels: cql Fix For: 1.0.7 Attachments: CASSANDRA-3374.patch Looking at CreateColumnFamilyStatement.java, it doesn't seem CQL can create compressed column families, nor define a compaction strategy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175623#comment-13175623 ]

Rick Shaw commented on CASSANDRA-3634:

+1. Looks like Strings win in terms of performance. They offer the most flexibility in transformation as well. I think we have a winner.

compare string vs. binary prepared statement parameters

Key: CASSANDRA-3634
URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
Labels: cql
Fix For: 1.1
Attachments: stress-change-bind-parms-to-BB.patch, v1-0001-CASSANDRA-3634-generated-thrift-code.txt, v1-0002-change-bind-parms-from-string-to-bytes.txt

Perform benchmarks to compare the performance of string and pre-serialized binary parameters to prepared statements.
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175626#comment-13175626 ]

Pavel Yaskevich commented on CASSANDRA-3623:

The problem is that you can't remove duplicate() because the same segment can be requested concurrently by different reads and we don't want to limit concurrency with synchronisation over segment use.
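To illustrate the point about concurrent reads, here is a minimal Java sketch, not taken from the attached patches, of how duplicate() gives each reader its own position over one shared mapping. The names SharedSegmentDemo, mapReadOnly and readAt are hypothetical; only FileChannel.map and ByteBuffer.duplicate are standard JDK calls.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedSegmentDemo {
    /** Maps the whole file read-only; the mapping stays valid after the channel is closed. */
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Reads one byte at an absolute offset through a private view of the shared segment. */
    static byte readAt(ByteBuffer sharedSegment, int offset) {
        // duplicate() shares the mapped memory but has its own position and limit,
        // so concurrent readers do not need to synchronise on the segment.
        ByteBuffer view = sharedSegment.duplicate();
        view.position(offset);
        return view.get();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segment", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {10, 20, 30, 40});

        MappedByteBuffer segment = mapReadOnly(tmp);
        System.out.println(readAt(segment, 2));   // value read through the duplicate
        System.out.println(segment.position());   // the shared buffer's position is untouched
    }
}
```

Because every duplicate keeps the same mapped region reachable, the region can only be released once all duplicates have been garbage collected, which is the unmap concern raised earlier in this thread.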
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175627#comment-13175627 ]

Eric Evans commented on CASSANDRA-3634:

At Brandon's suggestion, I'm rerunning the insert test with some higher column counts. That should make any per-term performance costs/savings more obvious. I'll post those results when I have them.
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175628#comment-13175628 ]

Pavel Yaskevich commented on CASSANDRA-3623:

Hot reads show that if we remove the overhead of the CRAR and RAR initialization, we would get numbers very close to mmap'ed I/O. Also, as you can see, snappy takes ~1.6x the time with mmap'ed I/O.
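The ~1.6x figure can be checked against the two no-CRC profiles posted earlier in this thread. The sketch below only redoes that arithmetic; the class and method names are hypothetical, and the four constants are the "Total" and snappy rows copied from those profiles (629.460 s and 336.913 s without the patch, 537.681 s and 529.719 s with it).

```java
public class HotMethodShares {
    /** Percentage of the total CPU seconds spent in one method. */
    static double sharePercent(double methodSeconds, double totalSeconds) {
        return 100.0 * methodSeconds / totalSeconds;
    }

    /** How many times longer the patched run spent in a method than the unpatched run. */
    static double slowdown(double patchedSeconds, double unpatchedSeconds) {
        return patchedSeconds / unpatchedSeconds;
    }

    public static void main(String[] args) {
        double unpatchedTotal = 629.460, unpatchedSnappy = 336.913; // no CRC, no patch
        double patchedTotal = 537.681, patchedSnappy = 529.719;     // no CRC, with patch

        System.out.printf("snappy share without patch: %.2f%%%n",
                sharePercent(unpatchedSnappy, unpatchedTotal));
        System.out.printf("snappy share with patch:    %.2f%%%n",
                sharePercent(patchedSnappy, patchedTotal));
        System.out.printf("snappy time ratio:          %.2fx%n",
                slowdown(patchedSnappy, unpatchedSnappy));
        System.out.printf("non-snappy seconds:         %.3f -> %.3f%n",
                unpatchedTotal - unpatchedSnappy, patchedTotal - patchedSnappy);
    }
}
```

Read this way, the profiles show two effects at once: time outside snappy drops from roughly 293 s to roughly 8 s, while time inside snappy itself rises by a factor of about 1.57.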
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175630#comment-13175630 ]

Vijay commented on CASSANDRA-3623:

Regarding duplicates: I was thinking of creating the duplicates in CMSF and having a helper function to track them.

Regarding hot reads: (I tried this before; you have to access the FD, and caching the initialized object didn't help.) We do get something like 50% better latencies by doing MMap'ed without copying the data. Snappy is 1.6% more because there isn't anything else holding it up or any other overhead.

Currently with this patch we don't have to copy any uncompressed data, but the CRAR will copy because we don't handle the DirectBB to snappy, and that's made possible by using MMapped IO.
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175631#comment-13175631 ]

Pavel Yaskevich commented on CASSANDRA-3623:

bq. We do get something like 50% better latencies by doing MMap'ed without copying the data.

But the hot methods show the opposite: the main thing that hurts performance in the normal read case is not memcpy but the reader class initialization overhead.

bq. Snappy is 1.6% more because there isn't anything else holding it up or any other overhead.

I don't get what you mean here, can you please elaborate? Slower snappy execution could, in my opinion, be caused by the additional expense of mapping data into user space under the conditions of a migrating page cache (the situation when the dataset does not fit in the page cache); mmap'ed I/O in that case makes the kernel do more work compared to syscalls (normal I/O).

bq. Currently with this patch we don't have to copy any uncompressed data, but the CRAR will copy because we don't handle the DirectBB to snappy, and that's made possible by using MMapped IO.

Did you mean compressed instead of uncompressed here?
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175638#comment-13175638 ]

Vijay commented on CASSANDRA-3623:

Pavel, it doesn't show the opposite; it actually shows that 98% of the time is spent in the snappy library and only 2% in the remaining part of the code, whereas in the earlier case we spent 58% of the time in Snappy and the rest in the other parts of the code. Snappy/decompression is definitely the bottleneck... all I am saying is that now we are more efficient and that's the only bottleneck.

"Did you mean compressed instead of uncompressed here?" Yes, I meant compressed.

Please try a test before and after the patch and you will see what I am talking about. I ran the cluster test for a long time (before and after; there isn't any other variable in play here), and after this patch it shows constant performance that doesn't vary a lot (response times after the patch).
[jira] [Commented] (CASSANDRA-3623) use MMapedBuffer in CompressedSegmentedFile.getSegment
[ https://issues.apache.org/jira/browse/CASSANDRA-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175639#comment-13175639 ]

Vijay commented on CASSANDRA-3623:

Constant performance = not a lot of difference between the 95th percentile and the average. Before the patch there was a huge swing between those. Data is shown above.

Please note I am not selling this patch ;) I am trying to find better performance for our use case, which needs compression... I am completely open to other options.
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175646#comment-13175646 ]

Jonathan Ellis commented on CASSANDRA-3634:

Is the server on a separate machine from the client here?
[jira] [Commented] (CASSANDRA-3603) CounterColumn and CounterContext use a log4j logger instead of using slf4j like the rest of the code base
[ https://issues.apache.org/jira/browse/CASSANDRA-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175647#comment-13175647 ] Peter Schuller commented on CASSANDRA-3603: --- My apologies. Looks like I accidentally nuked projectCodeStyle.xml in the wc without realizing it. CounterColumn and CounterContext use a log4j logger instead of using slf4j like the rest of the code base - Key: CASSANDRA-3603 URL: https://issues.apache.org/jira/browse/CASSANDRA-3603 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: Peter Schuller Priority: Minor Fix For: 1.0.7 Attachments: CASSANDRA-3603-trunk.txt (Will submit patch but not now, no time.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3641) inconsistent/corrupt counters w/ broken shards never converge
[ https://issues.apache.org/jira/browse/CASSANDRA-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3641:

Attachment: CASSANDRA-3641-trunk-nojmx.txt

New version attached. Rebased to current trunk, and no JMX. Otherwise identical.

inconsistent/corrupt counters w/ broken shards never converge

Key: CASSANDRA-3641
URL: https://issues.apache.org/jira/browse/CASSANDRA-3641
Project: Cassandra
Issue Type: Bug
Reporter: Peter Schuller
Assignee: Peter Schuller
Attachments: 3641-0.8-internal-not-for-inclusion.txt, 3641-trunk.txt, CASSANDRA-3641-trunk-nojmx.txt

We ran into a case (which MIGHT be related to CASSANDRA-3070) whereby we had counters that were corrupt (hopefully due to CASSANDRA-3178). The corruption was that there would exist shards with the *same* node_id, *same* clock id, but *different* counts.

The counter column diffing and reconciliation code assumes that this never happens, and ignores the count. The problem with this is that if there is an inconsistency, the result of a reconciliation will depend on the order of the shards.

In our case for example, we would see the value of the counter randomly fluctuating on a CL.ALL read, but we would get a consistent value (whatever the node had) on CL.ONE (submitted to one of the nodes in the replica set for the key). In addition, read repair would not work despite digest mismatches because the diffing algorithm also did not care about the counts when determining the differences to send.

I'm attaching patches that fix this. The first patch is against our 0.8 branch, which is not terribly useful to people, but I include it because it is the well-tested version that we have used on the production cluster which was subject to this corruption. The other patch is against trunk, and contains the same change.

What the patch does is:

* On diffing, treat as DISJOINT if there is a count discrepancy.
* On reconciliation, look at the count and *deterministically* pick the higher one, and:
** log the fact that we detected a corrupt counter
** increment a JMX observable counter for monitoring purposes

A cluster which is subject to such corruption and has this patch will fix itself with an AES + compact (or just repeated compactions, assuming the replicate-on-compact is able to deliver correctly).
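The two rules described in this ticket can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the attached patches: the Shard class, the isDisjoint and reconcile helpers, and the corruptShardsDetected counter are all hypothetical stand-ins, and a plain AtomicLong takes the place of any JMX instrumentation.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ShardReconcileSketch {
    /** Hypothetical, simplified shard: one node's contribution to a counter. */
    static final class Shard {
        final String nodeId;
        final long clock;
        final long count;

        Shard(String nodeId, long clock, long count) {
            this.nodeId = nodeId;
            this.clock = clock;
            this.count = count;
        }
    }

    /** Stand-in for the monitoring counter mentioned in the ticket. */
    static final AtomicLong corruptShardsDetected = new AtomicLong();

    /** Diffing rule: same node and clock but different counts must not compare as equal. */
    static boolean isDisjoint(Shard a, Shard b) {
        return a.nodeId.equals(b.nodeId) && a.clock == b.clock && a.count != b.count;
    }

    /** Picks a winner for two shards of the same node, independent of argument order. */
    static Shard reconcile(Shard a, Shard b) {
        if (!a.nodeId.equals(b.nodeId)) {
            throw new IllegalArgumentException("shards belong to different nodes");
        }
        if (a.clock != b.clock) {
            return a.clock > b.clock ? a : b; // normal case: the newer clock wins
        }
        if (a.count != b.count) {
            // Corrupt case: log it, count it, and deterministically keep the higher count.
            corruptShardsDetected.incrementAndGet();
            System.err.println("corrupt counter shard for node " + a.nodeId + " at clock " + a.clock);
            return a.count > b.count ? a : b;
        }
        return a; // identical shards
    }

    public static void main(String[] args) {
        Shard x = new Shard("node-1", 42, 5);
        Shard y = new Shard("node-1", 42, 7);
        // Both argument orders yield the same winner, so replicas converge.
        System.out.println(reconcile(x, y).count + " " + reconcile(y, x).count);
    }
}
```

The point of picking the higher count is only that the choice is deterministic; any total order on the conflicting shards would make the result independent of shard ordering.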
[jira] [Created] (CASSANDRA-3670) provide red flags JMX instrumentation
provide red flags JMX instrumentation
---
Key: CASSANDRA-3670
URL: https://issues.apache.org/jira/browse/CASSANDRA-3670
Project: Cassandra
Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor

As discussed in CASSANDRA-3641, it would be nice to expose through JMX certain information which is almost without exception indicative of something being wrong with the node or cluster. In the CASSANDRA-3641 case, it was the detection of corrupt counter shards.

Other examples include:

* Number of times the selection of files to compact was adjusted due to disk space heuristics
* Number of times compaction has failed
* Any I/O error reading from or writing to disk (the work here is collecting, not exposing, so maybe not in an initial version)
* Any data skipped due to checksum mismatches (when checksumming is being used); e.g., number of skips.
* Any arbitrary exception at least in certain code paths (compaction, scrub, cleanup for starters)

Probably other things.

The motivation is that if we have clear and obvious indications that something truly is wrong, it seems suboptimal to just leave that information in the log somewhere, for someone to discover later when something else broke as a result and a human investigates. You might argue that one should use non-trivial log analysis to detect these things, but I highly doubt a lot of people do this and it seems very wasteful to require that in comparison to just providing the MBean.

It is important to note that the *lack* of a certain problem being advertised in this MBean is not supposed to be indicative of a *lack* of a problem. Rather, the point is that to the extent we can easily do so, it is nice to have a clear method of communicating to monitoring systems where there *is* a clear indication of something being wrong.

The main part of this ticket is not to cover everything under the sun, but rather to reach agreement on adding an MBean where these types of indicators can be collected. Individual counters can then be added over time as one thinks of them.

I propose:

* Create an org.apache.cassandra.db.RedFlags MBean
* Populate with a few things to begin with.

I'll submit the patch if there is agreement.
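To make the proposal concrete, here is a minimal sketch of what such an MBean could look like using the standard javax.management API. It is not the patch for this ticket: the class names, the two counters, and the ObjectName "org.apache.cassandra.db:type=RedFlags" are assumptions chosen for illustration, with the counters taken from the examples listed above.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class RedFlagsSketch {
    // Standard MBean interface: JMX exposes each getter as a read-only attribute.
    // The interface must be public and named <ImplementationClass>MBean.
    public interface RedFlagsMBean {
        long getCompactionFailures();
        long getChecksumMismatchSkips();
    }

    // Hypothetical implementation; counter names are illustrative only.
    public static class RedFlags implements RedFlagsMBean {
        private final AtomicLong compactionFailures = new AtomicLong();
        private final AtomicLong checksumMismatchSkips = new AtomicLong();

        // Called from the code path that detects the problem.
        public void compactionFailed() { compactionFailures.incrementAndGet(); }
        public void checksumMismatchSkipped() { checksumMismatchSkips.incrementAndGet(); }

        @Override public long getCompactionFailures() { return compactionFailures.get(); }
        @Override public long getChecksumMismatchSkips() { return checksumMismatchSkips.get(); }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed object name, following the domain:type=Name pattern.
        ObjectName name = new ObjectName("org.apache.cassandra.db:type=RedFlags");
        RedFlags flags = new RedFlags();
        server.registerMBean(flags, name);

        flags.compactionFailed();
        // Read the counter back the way a JMX client would.
        System.out.println(server.getAttribute(name, "CompactionFailures"));
    }
}
```

Every attribute is a monotonically increasing counter that starts at zero, so a monitoring system only has to alert on any non-zero value.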
[jira] [Updated] (CASSANDRA-3670) provide red flags JMX instrumentation
[ https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3670:
--
Reviewer: slebresne
[jira] [Updated] (CASSANDRA-3483) Support bringing up a new datacenter to existing cluster without repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3483:
--
Attachment: CASSANDRA-3483-trunk-noredesign.txt

Attaching version rebased to trunk but not yet re-factored.

Support bringing up a new datacenter to existing cluster without repair
---
Key: CASSANDRA-3483
URL: https://issues.apache.org/jira/browse/CASSANDRA-3483
Project: Cassandra
Issue Type: Bug
Affects Versions: 1.0.2
Reporter: Chris Goffinet
Assignee: Peter Schuller
Attachments: CASSANDRA-3483-0.8-prelim.txt, CASSANDRA-3483-1.0.txt, CASSANDRA-3483-trunk-noredesign.txt

I was talking to Brandon on IRC, and we ran into a case where we want to bring up a new DC in an existing cluster. He suggested (via jbellis) that the way to do it currently is to set strategy options of dc2:0, then add the nodes. After the nodes are up, change the RF of dc2 and run repair. I'd like to avoid a repair, as it runs AES and is a bit more intense than how bootstrap currently works by just streaming ranges from the SSTables. Would it be possible to improve on the proposed method for adding a new DC to an existing cluster? We'd be happy to do a patch if we got some input on the best way to go about it.
[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation
[ https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175656#comment-13175656 ]

Brandon Williams commented on CASSANDRA-3670:
-
I almost feel bad to mention this here, but since the fixver is unset I'll do it :) It seems like converting a lot of our one-off metrics to https://github.com/codahale/metrics would provide much more flexibility in the future, as well as giving us better metrics to gauge this sort of thing by.
[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation
[ https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175660#comment-13175660 ]

Peter Schuller commented on CASSANDRA-3670:
---
I have not used it, and only had a quick look. But provided that it does the job and has no significant downside, I'd be very +1 just from the mere fact alone that it natively supports exposing metrics through HTTP and JSON while still retaining JMX visibility, and from the fact that you avoid the ThingMBean+Thing acrobatics. The histogram support seems convenient.

The RedFlags stuff could be a good pilot case. If it causes problems, it doesn't break anything that people are used to working already.
[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation
[ https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175661#comment-13175661 ]

Peter Schuller commented on CASSANDRA-3670:
---
Also, the whole JMX bit is actually a pretty annoying little detail for many situations. There seems to exist no implementation outside of the JVM, and writing a trivial monitor along the lines of:

{code}
warnings=$(curl http://localhost:XXX/bla/bla/redflags | egrep -v ': 0$' | wc -l)
{code}

Becomes a chore. From what I can tell everyone keeps using that magic .jar that no one knows where it comes from that e.g. cassandra-munin-plugins uses. It's a real hassle to be constantly launching a JVM just for metrics extraction.

Now granted, if you are fully JMX enabled in your infrastructure there is no issue, but I really think something like this goes a long way towards making Cassandra more operator-friendly - particularly to individuals and/or small organizations that want to monitor in some simple way and do not want to spend time on JMX issues.
[jira] [Commented] (CASSANDRA-3670) provide red flags JMX instrumentation
[ https://issues.apache.org/jira/browse/CASSANDRA-3670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175662#comment-13175662 ]

Peter Schuller commented on CASSANDRA-3670:
---
(For the record I'm not suggesting actually writing a monitor exactly like that; I'm not a fan of ad-hoc shell scripting for such things due to the potential for silent failures. But choose any arbitrary productive language and a HTTP+JSON interface is trivial to use in a clean way.)
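As a rough illustration of the "no silent failures" point, the sketch below does the same counting as the egrep/wc pipeline in the earlier comment but throws on input it cannot parse. The "name: value" line format is inferred from that shell filter, the flag names in the sample are made up, and fetching the data over HTTP is left out because the endpoint in the thread is only a placeholder.

```java
import java.util.Arrays;
import java.util.List;

public class RedFlagCheck {
    /**
     * Counts flags with a non-zero value in "name: value" lines.
     * Unlike the shell pipeline, a malformed line raises an exception
     * instead of being silently counted as a warning.
     */
    public static int countRaised(List<String> lines) {
        int raised = 0;
        for (String line : lines) {
            int sep = line.lastIndexOf(": ");
            if (sep < 0) {
                throw new IllegalArgumentException("unparseable line: " + line);
            }
            // Long.parseLong throws NumberFormatException on non-numeric values.
            long value = Long.parseLong(line.substring(sep + 2).trim());
            if (value != 0) {
                raised++;
            }
        }
        return raised;
    }

    public static void main(String[] args) {
        // Hypothetical sample output from a red flags endpoint.
        List<String> sample = Arrays.asList(
                "compactionFailures: 0",
                "checksumMismatchSkips: 3");
        System.out.println(countRaised(sample));
    }
}
```

With a JSON endpoint the parsing step would be replaced by a JSON library call, but the loud-failure structure stays the same.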
[jira] [Created] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
provide JMX counters for unavailables/timeouts for reads and writes
---
Key: CASSANDRA-3671
URL: https://issues.apache.org/jira/browse/CASSANDRA-3671
Project: Cassandra
Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor

Attaching patch against trunk.
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3671:
--
Attachment: CASSANDRA-3671-trunk.txt
[jira] [Updated] (CASSANDRA-3671) provide JMX counters for unavailables/timeouts for reads and writes
[ https://issues.apache.org/jira/browse/CASSANDRA-3671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3671:
--
Attachment: CASSANDRA-3671-trunk-v2.txt

Accidentally attached old version of patch. v2 attached which doesn't fail to re-throw in one case.
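The attached patch is not reproduced in this thread, so the following is only a sketch of the general pattern the ticket title and the v2 note describe: count each unavailable or timeout failure, then re-throw it so the caller still sees the error. The class name, the counter names, and the local UnavailableException are hypothetical stand-ins, not Cassandra's own types.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;

public class RequestCounters {
    /** Hypothetical stand-in for a "not enough live replicas" error. */
    public static class UnavailableException extends Exception {}

    // Counters that a JMX getter could expose; the names are assumptions.
    public final AtomicLong readUnavailables = new AtomicLong();
    public final AtomicLong readTimeouts = new AtomicLong();

    /** Runs a read, counting failures by type and always re-throwing them. */
    public <T> T countedRead(Callable<T> read) throws Exception {
        try {
            return read.call();
        } catch (UnavailableException e) {
            readUnavailables.incrementAndGet();
            throw e; // counting must never swallow the error
        } catch (TimeoutException e) {
            readTimeouts.incrementAndGet();
            throw e;
        }
    }
}
```

A write path would follow the same shape with its own pair of counters.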
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175674#comment-13175674 ]

Eric Evans commented on CASSANDRA-3634:
---
No, it's not

compare string vs. binary prepared statement parameters
---
Key: CASSANDRA-3634
URL: https://issues.apache.org/jira/browse/CASSANDRA-3634
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
Labels: cql
Fix For: 1.1
Attachments: stress-change-bind-parms-to-BB.patch, v1-0001-CASSANDRA-3634-generated-thrift-code.txt, v1-0002-change-bind-parms-from-string-to-bytes.txt

Perform benchmarks to compare the performance of string and pre-serialized binary parameters to prepared statements.
[jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-3634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175690#comment-13175690 ]

Jonathan Ellis commented on CASSANDRA-3634:
---
Let's get Brandon to do some testing on our cluster with separate clients and servers.

If strings are testing faster than binary then either

# something is wrong with the code, because parsing String -> ByteBuffer can't possibly be faster than just using the ByteBuffer from Thrift (not to mention that Thrift's internal creation of the String object has more overhead than marking a ByteBuffer slice of the frame)
# the difference is negligible compared to other factors and the test noise
# the difference is hidden by environmental factors, e.g., String runs just as fast as BB but with X% more CPU used

Splitting out clients/servers will help determine if #3 is playing a role here.
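The first point can be pictured with a small sketch of the two server-side paths. This is neither Cassandra's nor Thrift's actual code, and it is not a benchmark. The long-typed parameter, the class name, and both method names are assumptions made for illustration.

```java
import java.nio.ByteBuffer;

public class BindParamSketch {
    /** String path: the server must parse the text, then serialize the value. */
    public static ByteBuffer fromString(String param) {
        long value = Long.parseLong(param);      // parsing work
        ByteBuffer out = ByteBuffer.allocate(8); // new allocation
        out.putLong(value);
        out.flip();
        return out;
    }

    /** Binary path: the client already serialized; the server only marks a slice. */
    public static ByteBuffer fromFrame(ByteBuffer frame, int offset, int length) {
        ByteBuffer view = frame.duplicate(); // shares the bytes, no copy
        view.limit(offset + length);
        view.position(offset);
        return view.slice();
    }
}
```

Both methods end with an 8-byte buffer holding the same value, but only the string path parses and allocates, which is why a string result that tests faster would point at the code or the test setup.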