On Wed, May 27, 2015 at 7:51 AM, Ravikumar Govindarajan < [email protected]> wrote:
> I was thinking on how blur can effectively use Mmap short-circuit-reads
> from hadoop. It's kind of long but please bear...
>
> Checked out hadoop-2.3.0 source. I am summarizing the logic found in the
> DFSInputStream, ClientMmap & ClientMmapManager source files...
>
> 1. A new method read(ByteBufferPool bufferPool, int maxLength,
> EnumSet<ReadOption> opts) is exposed for short-circuit Mmap reads
>
> 2. Local blocks are Mmapped and added to an LRU
>
> 3. A ref-count is maintained for every Mmapped block during reads
>
> 4. When the ref-count for a block drops to zero, it is unmapped. This
> happens when the incoming read-offset jumps to a block other than the
> current block.
>
> 5. The unmapping actually happens via a separate reaper thread...
>
> Step 4 is problematic, because we don't want hadoop to control "unmapping"
> blocks. Ideally blocks should be unmapped when the original IndexInput and
> all clones are closed from blur's side…
>
> If someone from the hadoop community can tell us whether such control is
> possible, I feel we can close any perceived perf-gaps between regular
> lucene *MmapDirectory* and blur's *HdfsDirectory*.
>
> It should be very trivial to change HdfsDirectory to use the Mmap read
> apis..

Is this the code for the legacy short circuit reads or the newer version
that uses named pipes?
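For reference, here is a minimal sketch of the zero-copy read path described
above, using the hadoop-2.3.0 FSDataInputStream API (the file path and read
size are made up, and this is not Blur's actual HdfsDirectory code). The
detail relevant to the unmapping concern is that releaseBuffer() is what
drops the block's ref-count, so an IndexInput that only releases its buffers
when it and all its clones are closed keeps that decision on the caller's
side:

```java
import java.nio.ByteBuffer;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.ReadOption;
import org.apache.hadoop.io.ByteBufferPool;
import org.apache.hadoop.io.ElasticByteBufferPool;

public class ZeroCopyReadSketch {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    ByteBufferPool pool = new ElasticByteBufferPool();

    // Hypothetical index file; any HDFS path would do.
    FSDataInputStream in = fs.open(new Path("/blur/tables/t1/shard-0/_0.cfs"));
    try {
      // With short-circuit mmap enabled, this returns a slice of the mmapped
      // local block instead of copying bytes through a socket. It may return
      // fewer bytes than requested, fall back to an ordinary buffer, or
      // return null at EOF.
      ByteBuffer buf = in.read(pool, 128 * 1024,
          EnumSet.of(ReadOption.SKIP_CHECKSUMS));
      if (buf != null) {
        try {
          // ... decode from buf, e.g. serve Lucene IndexInput reads from it ...
        } finally {
          // This is what decrements the block's ref-count; delaying it until
          // IndexInput.close() keeps unmapping under the caller's control.
          in.releaseBuffer(buf);
        }
      }
    } finally {
      in.close();
    }
  }
}
```

Whether the zero-copy path actually engages, rather than silently falling
back to an ordinary copying read, depends on the short-circuit settings on
both the client and the datanode.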
> --
> Ravi
>
> On Wed, May 27, 2015 at 12:55 PM, Ravikumar Govindarajan <
> [email protected]> wrote:
>
> >> My guess is that SSDs are only going to help when the blocks for the
> >> shard are local and short circuit reads are enabled.
> >
> > Yes, it's a good fit for such a use-case alone…
> >
> >> I would not recommend disabling the block cache. However you could
> >> likely lower the size of the cache and reduce the overall memory
> >> footprint of Blur.
> >
> > Fine. Can we also scale down the machine RAM itself? [Ex: Instead of
> > 128GB RAM, we can opt for a 64GB or 32GB RAM slot]
> >
> >> One interesting thought would be to try using the HDFS cache feature
> >> that is present in the most recent versions of HDFS. I haven't tried it
> >> yet but it would be interesting to try.
> >
> > I did try reading the HDFS cache code. I think it was written for the
> > Map-Reduce use-case, where blocks are loaded into memory [basically
> > "mmap" followed by "mlock" on data-nodes] just before computation begins
> > and unloaded once done.
> >
> > On the short-circuit reads, I found that the HDFS client offers 2 options
> > for block-reads:
> > 1. Domain Socket
> > 2. Mmap
> >
> > I think Mmap is superior and should have the same performance as lucene's
> > MmapDirectory…
> >
> > --
> > Ravi
> >
> > On Tue, May 26, 2015 at 8:00 PM, Aaron McCurry <[email protected]>
> > wrote:
> >
> >> On Fri, May 22, 2015 at 3:33 AM, Ravikumar Govindarajan <
> >> [email protected]> wrote:
> >>
> >> > Recently I am considering deploying SSDs on the search machines.
> >> >
> >> > Each machine runs data-nodes + shard-server and local reads of hadoop
> >> > are leveraged….
> >> >
> >> > SSDs are a great fit for general lucene/solr kind of setups. But for
> >> > blur, I need some help…
> >> >
> >> > 1. Is it a good idea to consider SSDs, especially when block-cache is
> >> > present?
> >>
> >> Possibly, I don't have any hard numbers for this type of setup. My guess
> >> is that SSDs are only going to help when the blocks for the shard are
> >> local and short circuit reads are enabled.
> >>
> >> > 2. Are there any grids running blur on SSDs, and how do they compare
> >> > to normal HDDs?
> >>
> >> I haven't run any at scale yet.
> >>
> >> > 3. Can we disable block-cache on SSDs, especially when local-reads are
> >> > enabled?
> >>
> >> I would not recommend disabling the block cache. However you could
> >> likely lower the size of the cache and reduce the overall memory
> >> footprint of Blur.
> >>
> >> > 4. Using SSDs, blur/lucene will surely be CPU bound. But I don't know
> >> > what over-heads hadoop local-reads bring to the table…
> >>
> >> If you are using short circuit reads I have seen performance of local
> >> accesses nearing that of native IO. However, if Blur is making remote
> >> HDFS calls, every call is like a cache miss. One interesting thought
> >> would be to try using the HDFS cache feature that is present in the most
> >> recent versions of HDFS. I haven't tried it yet but it would be
> >> interesting to try.
> >>
> >> > Any help is much appreciated, because I cannot find any info on the
> >> > web about this topic.
> >> >
> >> > --
> >> > Ravi
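On the two short-circuit options weighed above (domain socket vs. Mmap),
here is a rough sketch of the client-side settings involved. The property
names are as they appear in hadoop-2.x's hdfs-default.xml; the socket path
is only an example and has to match the datanode's own configuration:

```java
import org.apache.hadoop.conf.Configuration;

public class ShortCircuitClientConf {

  /** Returns a Configuration with the usual short-circuit read settings applied. */
  public static Configuration shortCircuitConf() {
    Configuration conf = new Configuration();

    // Newer, domain-socket based short-circuit reads; the legacy reader is a
    // separate switch (dfs.client.use.legacy.blockreader.local).
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    conf.set("dfs.domain.socket.path", "/var/lib/hadoop-hdfs/dn_socket");

    // Client-side cache of mmapped block regions used by the zero-copy reads;
    // values here are illustrative, not tuned recommendations.
    conf.setInt("dfs.client.mmap.cache.size", 256);
    conf.setLong("dfs.client.mmap.cache.timeout.ms", 60 * 60 * 1000L);

    return conf;
  }
}
```

The datanodes need the same short-circuit settings enabled for the
shard-server's local reads to actually take this path.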

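And for the HDFS cache feature suggested in Aaron's reply, a small sketch of
pinning a table directory into datanode memory through the centralized cache
API (the pool name and path are hypothetical; the API is the
DistributedFileSystem one available since hadoop 2.3):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.CacheDirectiveInfo;
import org.apache.hadoop.hdfs.protocol.CachePoolInfo;

public class HdfsCacheSketch {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes fs.defaultFS points at an HDFS cluster.
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // Pool and path are made-up examples.
    dfs.addCachePool(new CachePoolInfo("blur-hot"));
    dfs.addCacheDirective(new CacheDirectiveInfo.Builder()
        .setPath(new Path("/blur/tables/t1"))
        .setPool("blur-hot")
        .setReplication((short) 1)
        .build());
  }
}
```

The same thing can be driven from the command line with the hdfs cacheadmin
tool, and the datanodes need dfs.datanode.max.locked.memory set for the
mmap-plus-mlock step Ravi describes to succeed.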