[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

Joshua McKenzie (JIRA) Fri, 26 Sep 2014 13:29:53 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-7282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14149929#comment-14149929
 ]


Joshua McKenzie commented on CASSANDRA-7282:
--------------------------------------------

(nits interleaved with the rest)

General:
* Could use a comment on LongToken.hashCode() explaining reasoning for shift / 
truncation.  While it makes sense in the context of this patch, there's little 
context within the class itself to explain why we're just lopping off half the 
bits.
* Consider more descriptive variable names rather than abbreviations in 
AtomicReferenceArrayUpdater (rather than V exp, V upd)
* AtomicReferenceArrayUpdater: [could use some 
comments|https://github.com/belliottsmith/cassandra/blob/7282-fastmemtablemap/src/java/org/apache/cassandra/utils/concurrent/AtomicReferenceArrayUpdater.java#L35-41]
 clarifying index calculation logic

NonBlockingHashOrderedMap.java:
* I think a diagram showing how the container works ([see 
CSLM|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/ConcurrentSkipListMap.java])
 would be quite helpful
* Document why INDEX_SHIFT == 18. Currently just a magic number
* Why do we bit shift on generation of index node instead of just using integer 
value?
      {code} private volatile Node<K, V>[][] index = new Node[1][1 << 10]; 
{code}
* It might help to reference the shalev whitepaper's title as a reference for 
some more formal reading on a similar data structure to what we're using here
* this.index aliasing in predecessor appears redundant
* [predecessor() is still very 
dense|https://github.com/belliottsmith/cassandra/blob/7282-fastmemtablemap/src/java/org/apache/cassandra/concurrent/NonBlockingHashOrderedMap.java#L182-203].
  Not a bad thing if necessary, but I'm wondering if there aren't opportunities 
to abstract out some of the guts of what's going on in there to segregate the 
algorithm from the implementation.
* The comments on indexHash and firstHashOfIndex are quite helpful.
* As non-blocking algorithms are still somewhat new, it might be worth 
documenting why we're using [lazy 
updates|http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6275329] for future 
readers of the code.  Might not.
* Would it be worth making maybeResize threshold configurable?  What's our 
reasoning for targeting 66%?
* 
[maybeResize|https://github.com/belliottsmith/cassandra/blob/7282-fastmemtablemap/src/java/org/apache/cassandra/concurrent/NonBlockingHashOrderedMap.java#L268-274]
 logic could use some more clarity on documentation.
* [Signal-to-noise ratio in resize() seems 
off|https://github.com/belliottsmith/cassandra/blob/7282-fastmemtablemap/src/java/org/apache/cassandra/concurrent/NonBlockingHashOrderedMap.java#L293-302].
  Might be worth factoring out all the INDEX_SHIFTING and Math.max/min to get 
implementation details out of the immediate visibility of the algorithm's 
scope, assuming we can do that without impacting performance.
* in putIfAbsent / maybeResize / resize: Just to confirm - are we relying on 
the executor to drop duplicate resize calls on the floor?  If so, it might be 
worth documenting.

LongNonBlockingHashOrderedMapTest.java
* Reinforces the feeling that the heavy bitwise operations could serve to be 
abstracted out or more heavily commented to help reduce complexity.  For 
example, 
[this|https://github.com/belliottsmith/cassandra/blob/7282-fastmemtablemap/test/long/org/apache/cassandra/concurrent/LongNonBlockingHashOrderedMapTest.java#L213-216]
 is opaque.  While it's parsable and appears correct, some trivial commenting 
could help smooth out the reading / maintaining process
* LongNonBlockingHashOrderedMapTest times out on default settings for ant 
long-test on my box
* And still times out at 10x timeout threshold.  :)  This may be a Windows 
thing - do they run to completion within the time on your setup?

In summary - the code looks good but in my opinion could use some more 
documentation and possibly some abstracting out of lower-level implementation 
details.  The logic of the container itself is definitely simpler than the CSLM 
and seems more tailored to our needs.

> Faster Memtable map
> -------------------
>
>                 Key: CASSANDRA-7282
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7282
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: jasobrown-sample-run.txt, profile.yaml, reads.svg, 
> run1.svg, writes.svg
>
>
> Currently we maintain a ConcurrentSkipLastMap of DecoratedKey -> Partition in 
> our memtables. Maintaining this is an O(lg(n)) operation; since the vast 
> majority of users use a hash partitioner, it occurs to me we could maintain a 
> hybrid ordered list / hash map. The list would impose the normal order on the 
> collection, but a hash index would live alongside as part of the same data 
> structure, simply mapping into the list and permitting O(1) lookups and 
> inserts.
> I've chosen to implement this initial version as a linked-list node per item, 
> but we can optimise this in future by storing fatter nodes that permit a 
> cache-line's worth of hashes to be checked at once,  further reducing the 
> constant factor costs for lookups.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-7282) Faster Memtable map

Reply via email to