Thanks for the benchmark, Mateusz!

Are you sure you're comparing the same thing? The std::map test uses a
map<int, map<int, int> >, while the Judy test's first-level index is a
static hash table, which would be faster than just about anything for
first-level indexing. A plain map<string, int> vs. JudySL comparison
would be more interesting (to me), since range-scan performance matters
for the cell cache, and that is something Judy arrays are supposed to
support well. OTOH, the current Judy array interface is too limiting to
implement a non-trivial key without major hacking on the code, which is
quite scary.
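
Something like this is what I have in mind for the string-key case,
including a range scan with JSLF/JSLN (an untested sketch against the
stock JudySL macros from Judy.h, error checking omitted; link with
-lJudy):

  #include <stdio.h>
  #include <string.h>
  #include <stdint.h>
  #include <Judy.h>

  int main(void)
  {
      Pvoid_t  judy = (Pvoid_t) NULL;  /* JudySL array, keyed by C strings */
      uint8_t  key[256];               /* buffer JSLF/JSLN fill with the current key */
      Word_t  *pvalue;
      int      i;

      const char *rows[] = { "row0001", "row0002", "row0010" };
      for (i = 0; i < 3; i++) {
          strcpy((char *) key, rows[i]);
          JSLI(pvalue, judy, key);     /* insert (or find) the key */
          *pvalue = i;                 /* value is stored in place */
      }

      /* point lookup */
      strcpy((char *) key, "row0002");
      JSLG(pvalue, judy, key);
      if (pvalue != NULL)
          printf("row0002 -> %lu\n", (unsigned long) *pvalue);

      /* range scan: walk keys in sorted order starting at "row0002" */
      strcpy((char *) key, "row0002");
      JSLF(pvalue, judy, key);         /* first key >= "row0002" */
      while (pvalue != NULL) {
          printf("%s -> %lu\n", (char *) key, (unsigned long) *pvalue);
          JSLN(pvalue, judy, key);     /* next key in sorted order */
      }

      {
          Word_t bytes;
          JSLFA(bytes, judy);          /* free the whole array */
      }
      return 0;
  }

Timing that against a plain map<string, int> doing the same inserts,
lookups, and an in-order walk would say a lot more about cell cache
suitability than the nested-int-map comparison.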

__Luke

On Feb 20, 3:45 pm, Mateusz Berezecki <[email protected]> wrote:
> So here are the numbers from a very basic benchmark.
>
> Hi everybody,
>
> As promised, here are the numbers. First let's look at Judy arrays
>
> user-89-108-219-194:Desktop m$ for i in 1 10 100 1000 10000 100000
> 1000000; do ./test $i; done
> Begin storing 1 random numbers in a Judy scalable hash array
> Insertion of 1 indexes took 59080.000 clocks per index
> Retrieval of 1 indexes took 11256.000 clocks per index
> Begin storing 10 random numbers in a Judy scalable hash array
> Insertion of 10 indexes took 6637.400 clocks per index
> Retrieval of 10 indexes took 1822.800 clocks per index
> Begin storing 100 random numbers in a Judy scalable hash array
> Insertion of 100 indexes took 1309.980 clocks per index
> Retrieval of 100 indexes took 1465.380 clocks per index
> Begin storing 1000 random numbers in a Judy scalable hash array
> Insertion of 1000 indexes took 613.242 clocks per index
> Retrieval of 1000 indexes took 134.204 clocks per index
> Begin storing 10000 random numbers in a Judy scalable hash array
> Insertion of 10000 indexes took 577.836 clocks per index
> Retrieval of 10000 indexes took 141.834 clocks per index
> Begin storing 100000 random numbers in a Judy scalable hash array
> Insertion of 100000 indexes took 600.942 clocks per index
> Retrieval of 100000 indexes took 177.093 clocks per index
> Begin storing 1000000 random numbers in a Judy scalable hash array
> Insertion of 1000000 indexes took 731.993 clocks per index
> Retrieval of 1000000 indexes took 369.726 clocks per index
>
> And for std::map
>
> user-89-108-219-194:Desktop m$ for i in 1 10 100 1000 10000 100000
> 1000000; do ./maptest $i; done
> Begin storing 1 random numbers in a std::map
> Insertion of 1 indexes took 58254.00000000 clocks per index.
> Retrieval of 1 indexes took 616.00000000 clocks per index.
> Begin storing 10 random numbers in a std::map
> Insertion of 10 indexes took 6253.80000000 clocks per index.
> Retrieval of 10 indexes took 238.00000000 clocks per index.
> Begin storing 100 random numbers in a std::map
> Insertion of 100 indexes took 2135.14000000 clocks per index.
> Retrieval of 100 indexes took 199.78000000 clocks per index.
> Begin storing 1000 random numbers in a std::map
> Insertion of 1000 indexes took 1361.17800000 clocks per index.
> Retrieval of 1000 indexes took 297.22000000 clocks per index.
> Begin storing 10000 random numbers in a std::map
> Insertion of 10000 indexes took 1750.86380000 clocks per index.
> Retrieval of 10000 indexes took 450.38280000 clocks per index.
> Begin storing 100000 random numbers in a std::map
> Insertion of 100000 indexes took 2238.61890000 clocks per index.
> Retrieval of 100000 indexes took 1404.19986000 clocks per index.
> Begin storing 1000000 random numbers in a std::map
> Insertion of 1000000 indexes took 3586.68496200 clocks per index.
> Retrieval of 1000000 indexes took 2162.24295000 clocks per index.
>
> Programs were compiled with -O3 -msse -msse3. Time was measured with
> the rdtsc instruction. You can find the test programs attached to this
> e-mail message.
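>
> For anyone who does not want to open the attachments, the measurement
> boils down to reading the time-stamp counter around the loop, roughly
> like this (simplified sketch; the attached programs are what actually
> produced the numbers above):
>
>   #include <stdio.h>
>   #include <stdint.h>
>
>   /* read the CPU time-stamp counter on x86 */
>   static inline uint64_t rdtsc(void)
>   {
>       uint32_t lo, hi;
>       __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
>       return ((uint64_t) hi << 32) | lo;
>   }
>
>   /* e.g. clocks per inserted index:
>    *   uint64_t t0 = rdtsc();
>    *   ... insert n keys ...
>    *   uint64_t t1 = rdtsc();
>    *   printf("Insertion of %d indexes took %.3f clocks per index\n",
>    *          n, (double) (t1 - t0) / n);
>    */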
>
> Sorry that I did not supply any Boost benchmarks; I have not had time
> to do that yet. I can post more numbers if I find some time over the
> weekend.
>
> Mateusz
>
> P.S.
> The judytest.c file is basically verbatim (with rdtsc timing added) from:
>
> Judy_hashing.pdf (http://judy.sourceforge.net/examples/Judy_hashing.pdf)
> How to use Judy to create a scalable hash table with outstanding
> performance and automatic scaling, while avoiding the complexity of
> dynamic hashing.
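>
> In short, the example keeps a fixed-size first-level table with one
> JudyL array per bucket and hashes the key to pick a bucket.
> Schematically (simplified, with illustrative put/get names; the PDF
> and the attached judytest.c have the exact code):
>
>   #include <Judy.h>
>
>   #define HASHSIZE 256                    /* static first-level table */
>
>   static Pvoid_t buckets[HASHSIZE];       /* one JudyL array per bucket */
>
>   static void put(Word_t key, Word_t value)
>   {
>       Word_t *pvalue;
>       JLI(pvalue, buckets[key % HASHSIZE], key);  /* insert/find in the bucket */
>       *pvalue = value;
>   }
>
>   static Word_t *get(Word_t key)
>   {
>       Word_t *pvalue;
>       JLG(pvalue, buckets[key % HASHSIZE], key);  /* NULL if absent */
>       return pvalue;
>   }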
>
> Attachments: judytest.c (3K), maptest.cc (2K)