[ https://issues.apache.org/jira/browse/CASSANDRA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Johan Oskarsson updated CASSANDRA-420:
--------------------------------------
Attachment: CASSANDRA-420.patch
This patch stores each decorated key as an object with a BigInteger and a
String as member variables, instead of packing both values into a single String.
This lets the comparator avoid most of its heavy lifting, namely the string
operations and object allocation it currently performs.
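To illustrate the idea, here is a minimal sketch. It is not the code in the attached patch. The class names, the fromString helper and the "<token>:<key>" layout of the combined String are assumptions made for the example.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch of a decorated key held as two fields.
// This is NOT the class from the attached patch.
final class DecoratedKeySketch implements Comparable<DecoratedKeySketch> {
    final BigInteger token; // parsed once, when the key is created
    final String key;       // the original row key

    DecoratedKeySketch(BigInteger token, String key) {
        this.token = token;
        this.key = key;
    }

    // Parses the combined form. The "<token>:<key>" layout is an assumption.
    static DecoratedKeySketch fromString(String decorated) {
        int sep = decorated.indexOf(':');
        BigInteger token = new BigInteger(decorated.substring(0, sep));
        return new DecoratedKeySketch(token, decorated.substring(sep + 1));
    }

    // Compares by token first, then by key. No parsing or allocation here.
    @Override
    public int compareTo(DecoratedKeySketch other) {
        int byToken = token.compareTo(other.token);
        return byToken != 0 ? byToken : key.compareTo(other.key);
    }
}

// Hypothetical String-based comparator, shown for contrast: it splits and
// parses both arguments again on every single comparison.
final class StringDecoratedKeyComparator implements Comparator<String> {
    @Override
    public int compare(String a, String b) {
        return DecoratedKeySketch.fromString(a)
                .compareTo(DecoratedKeySketch.fromString(b));
    }
}
```

With the object form, Arrays.sort(keys) on a DecoratedKeySketch[] parses each token once, when the key is built. The String form pays for two substring calls and a BigInteger allocation per argument on each of the O(n log n) comparisons.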
In a non-scientific mini benchmark, the sorting phase takes an order of
magnitude less time with the patch applied. I also see double the throughput per
node when loading data from Hadoop.
The patch needs a bit more work (comments etc.), but as per the IRC discussion I
am putting it up so others can weigh in. Should we start using the DecoratedKey
class, or a version thereof, more extensively instead of the String we use now?
> Improve performance of BinaryMemtable sort phase
> ------------------------------------------------
>
> Key: CASSANDRA-420
> URL: https://issues.apache.org/jira/browse/CASSANDRA-420
> Project: Cassandra
> Issue Type: Improvement
> Affects Versions: 0.5
> Reporter: Johan Oskarsson
> Priority: Minor
> Fix For: 0.5
>
> Attachments: CASSANDRA-420.patch
>
>
> The BinaryMemtable sorts an array of decorated keys. There are a lot of
> string operations and object allocations in the comparator that could be
> avoided to improve performance.