Thomas,
I did not introduce this feature or looked at the code. But from a Hashing
standpoint, average-case of a properly implemented Hash map should be O(1). If
HashMap<FullPath, Node> is statistically slower than HashMap<name, Node> it
means the hasing algorithm used for the type FullPath is inefficient, causing
inefficiency. The solution would be to analyze FullPath to see how hashCode()
is implemented. In a perfect hashing scenario there are no collisions which
means it becomes a simple index lookup. When hashes start colliding it requires
an extra steps in a bucket to determine the correct object to return, through
equals(). You can imagine this is horribly inefficient.
Run your tests again with a modified FullPath that returns a hashCode() that is
simple to compute hash that is cached until data is changed and a equals()
method to appropriately compute equality.
Andrew
On Sep 8, 2011, at 11:35 AM, Thomas Koch wrote:
> Hi,
>
> I'm studying the ZK DataTree and wondered, whether the code overhead of a
> lookup HashMap for the nodes provides the expected speed benefit.
>
> Therefor I found japex[1], a microbenchmarking tool and wrote two different
> "tree" implementations: A simple "HashMap<FullPath, Node>" and a real tree
> where each Node contains "HashMap<name, Node> children".
>
> I ran benchmarks to lookup random nodes with trees of different depth ("l" =
> levels) and children count for each node. (see attachment,
> tps=transactions/sec) The benchmark was executed single threaded on my
> laptop.
> I don't dare to say, that it's accurate, but it may be good enough to make
> one
> wonder.
>
> So the HashMap based lookup seems to be faster for all configurations, but at
> most twice as fast. On the other hand the code is made more complicated, the
> memory consumption of every path in the DataTree is doubled and the lookup
> time may hardly be a bottleneck at all in a network application.
>
> I scanned the VCS logs of ZK to find informations when the HashMap lookup was
> introduced and how it was motivated, but it was already there when the
> project
> came out of Yahoo.
>
> Could sbd. please give more information, why the HashMap lookup has been
> introduced and whether benchmarks were made at that time?
>
> A slightly related topic: Does anybody run regulary benchmarks against ZK? Is
> there a benchmark suite? Would you recommend a tool to build benchmarks for
> ZK?
>
> [1] http://japex.java.net
>
> Thank you,
>
> Thomas Koch, http://www.koch.ro