Do your times take network latency into account? If latency and
transfer time are adding roughly 2000ms to every query, then subtracting
that delay from your numbers would give cache times that are much better
than your times without the cache. If your latency fluctuates a bit,
that could account for some of the differences in time. I could be
wrong, but it depends on the switch fabric between the code executing
the queries and the code processing them. Local traffic on a box is very
different from network traffic. A single switch can add 200ms delay.
This may not even apply to your situation, but it is the first thing
that came to mind.
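To make the arithmetic concrete, here is a rough sketch in Java. It takes the multi-node timings from the quoted mail, assumes they are in milliseconds, and subtracts the same fixed overhead from both averages. The 2000 figure is only the guess from above, not something measured, and the class and method names are made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical helper: shows how a fixed per-query overhead can hide a cache speedup.
class LatencyAdjust {
    // Multi-node timings copied from the quoted mail (assumed to be milliseconds).
    static final long[] WITHOUT_CACHE = {4816, 3137, 2921, 2525, 2698, 2330, 2464, 2568, 2524, 2537};
    static final long[] WITH_CACHE = {2228, 2243, 2584, 2509, 2163, 2094, 2069, 2105, 2124, 2213};

    // Arithmetic mean of the samples (NaN for an empty array).
    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Ratio of uncached to cached time after removing the same fixed overhead from both.
    // Only meaningful while the overhead is smaller than the cached time.
    static double speedup(double uncached, double cached, double overhead) {
        return (uncached - overhead) / (cached - overhead);
    }

    public static void main(String[] args) {
        double uncached = mean(WITHOUT_CACHE);
        double cached = mean(WITH_CACHE);
        double assumedOverhead = 2000.0; // ASSUMPTION: a guess, not a measurement

        System.out.printf("raw speedup:      %.2fx%n", speedup(uncached, cached, 0.0));
        System.out.printf("adjusted speedup: %.2fx%n", speedup(uncached, cached, assumedOverhead));
    }
}
```

The point is only that a constant cost on both sides pulls the ratio toward 1. The real overhead would have to be measured, for example by timing a trivial request over the same network path.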
Thanks,
Colton McInroy
Director of Security Engineering
On 10/31/2013 1:50 PM, Josh Clum wrote:
Hello,
I refactored out the HDFS directory implementation from Blur to use in my
own project and was surprised to see how it performed. I'm using both the
HDFSDirectory class and the BlockCacheDirectoryFactoryV2 class.
On my local machine there was a significant speedup when using the cache.
The index was small enough (12 docs) that each file making up the Lucene
index fit into one block inside the cache.
When running it on a multinode cluster on AWS, pulling back 1031 docs
with the cache was not much faster than without. According to my log
statements, the cache was being hit every time, but the difference
between this and my local run was that there were several blocks per
file.
When setting up the cache I used the default BlurConfiguration.
Any ideas on how to speed up performance? Should I change the block size?
Is there something that Blur does to put a wrapper around the cache?
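For reference on the block size question: the number of cache blocks a file spans is a ceiling division of the file length by the block size. A minimal sketch, where the block size and file lengths are made-up illustration values (not Blur's defaults and not measurements from this index), and the class name is hypothetical:

```java
// Hypothetical helper: how many fixed-size cache blocks a file of a given length spans.
class BlockCount {
    static long blocksFor(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        // Ceiling division without floating point.
        return (fileLength + blockSize - 1) / blockSize;
    }

    public static void main(String[] args) {
        long blockSize = 8 * 1024;                          // ASSUMPTION: illustrative only
        long[] fileLengths = {4_000, 100_000, 5_000_000};   // ASSUMPTION: made-up file sizes
        for (long len : fileLengths) {
            System.out.println(len + " bytes -> " + blocksFor(len, blockSize) + " block(s)");
        }
    }
}
```

A larger block size means fewer lookups per file but more data pulled in per miss, so whether it helps would depend on the access pattern.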
ON A MULTI NODE CLUSTER
Number of documents in directory[1031]
Without Cache ->
Try #1 -> Total execution time: 4816
Try #2 -> Total execution time: 3137
Try #3 -> Total execution time: 2921
Try #4 -> Total execution time: 2525
Try #5 -> Total execution time: 2698
Try #6 -> Total execution time: 2330
Try #7 -> Total execution time: 2464
Try #8 -> Total execution time: 2568
Try #9 -> Total execution time: 2524
Try #10 -> Total execution time: 2537
With Cache ->
Cached try #1 -> Total execution time: 2228
Cached try #2 -> Total execution time: 2243
Cached try #3 -> Total execution time: 2584
Cached try #4 -> Total execution time: 2509
Cached try #5 -> Total execution time: 2163
Cached try #6 -> Total execution time: 2094
Cached try #7 -> Total execution time: 2069
Cached try #8 -> Total execution time: 2105
Cached try #9 -> Total execution time: 2124
Cached try #10 -> Total execution time: 2213
ON MY LOCAL
Number of documents in directory[12]
Without Cache ->
Try #1 -> Total execution time: 599
Try #2 -> Total execution time: 639
Try #3 -> Total execution time: 461
Try #4 -> Total execution time: 544
Try #5 -> Total execution time: 424
Try #6 -> Total execution time: 381
Try #7 -> Total execution time: 487
Try #8 -> Total execution time: 368
Try #9 -> Total execution time: 311
Try #10 -> Total execution time: 411
With Cache ->
Cached try #1 -> Total execution time: 31
Cached try #2 -> Total execution time: 32
Cached try #3 -> Total execution time: 27
Cached try #4 -> Total execution time: 23
Cached try #5 -> Total execution time: 21
Cached try #6 -> Total execution time: 26
Cached try #7 -> Total execution time: 27
Cached try #8 -> Total execution time: 28
Cached try #9 -> Total execution time: 26
Cached try #10 -> Total execution time: 27
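To compare the two environments without the slow first run skewing things, the runs above can be summarized by their medians. A small sketch, with a hypothetical class name; the numbers are copied from the runs listed above:

```java
import java.util.Arrays;

// Hypothetical helper: median-based summary of the four timing series above.
class TimingSummary {
    static final long[] CLUSTER_UNCACHED = {4816, 3137, 2921, 2525, 2698, 2330, 2464, 2568, 2524, 2537};
    static final long[] CLUSTER_CACHED = {2228, 2243, 2584, 2509, 2163, 2094, 2069, 2105, 2124, 2213};
    static final long[] LOCAL_UNCACHED = {599, 639, 461, 544, 424, 381, 487, 368, 311, 411};
    static final long[] LOCAL_CACHED = {31, 32, 27, 23, 21, 26, 27, 28, 26, 27};

    // Median of the samples; the input array is left unmodified.
    static double median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2] : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        double clusterRatio = median(CLUSTER_UNCACHED) / median(CLUSTER_CACHED);
        double localRatio = median(LOCAL_UNCACHED) / median(LOCAL_CACHED);
        System.out.printf("cluster: %.1f -> %.1f (%.2fx)%n",
                median(CLUSTER_UNCACHED), median(CLUSTER_CACHED), clusterRatio);
        System.out.printf("local:   %.1f -> %.1f (%.2fx)%n",
                median(LOCAL_UNCACHED), median(LOCAL_CACHED), localRatio);
    }
}
```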
Thanks,
Josh