Bulkloading impacts to block locality (0.94.6)

Scott Kuehn Wed, 07 Aug 2013 13:21:03 -0700

I'd like to improve block locality on a system where nearly 100% of data
ingest is via bulkloading. Presently, I measure block locality by
monitoring the hdfsBlocksLocalityIndex metric. On a 10-node cluster with
a block replication factor of 3, the block locality index is about 30%,
which is what I'd expect to see from random block placement. Running a
major compaction does not significantly improve the locality.
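
For reference, the 30% expectation can be checked with a rough sketch. It assumes a simplified model in which each block's replicas land on distinct nodes chosen uniformly at random, which is not how HDFS actually places replicas (the writer's node and rack awareness influence placement). The function names are hypothetical and exist only for this illustration.

```python
import random
from fractions import Fraction


def expected_locality(nodes: int, replication: int) -> Fraction:
    """Chance that a given node holds a replica of a block, assuming the
    replicas sit on distinct nodes chosen uniformly at random."""
    return Fraction(min(replication, nodes), nodes)


def simulated_locality(nodes: int, replication: int,
                       blocks: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity under the same assumption."""
    rng = random.Random(seed)
    local = 0
    for _ in range(blocks):
        server = rng.randrange(nodes)                     # node serving the region
        replicas = rng.sample(range(nodes), replication)  # random distinct nodes
        if server in replicas:
            local += 1
    return local / blocks


if __name__ == "__main__":
    print(float(expected_locality(10, 3)))  # 0.3
    print(simulated_locality(10, 3))        # an estimate close to 0.3
```

Under this model the locality is simply replication / nodes, so 3 replicas on 10 nodes gives 30%, matching the measured index.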


How can I maximize block locality in a bulkloading-based system?
