Vladimir, any chance to run the same test with FAST_DIFF? J
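
For readers following the quoted thread below: Lars attributes the cost of prefix-style encoders to rematerializing each KV into a contiguous byte[]. The following is a minimal sketch of that idea only, under the assumption of a toy "shared prefix length + suffix" layout. It is not HBase's PREFIX_TREE or FAST_DIFF implementation, and the names `prefix_encode` and `scan_encoded` are hypothetical.

```python
# Illustrative sketch only: a toy prefix encoding, NOT HBase's actual encoders.
# It shows why a scan over prefix-encoded keys must copy bytes for every key
# when the consumer expects each key as one contiguous byte string.
from typing import Iterator, List, Tuple


def prefix_encode(sorted_keys: List[bytes]) -> List[Tuple[int, bytes]]:
    """Store each key as (bytes shared with the previous key, remaining suffix)."""
    encoded = []
    prev = b""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def scan_encoded(encoded: List[Tuple[int, bytes]]) -> Iterator[bytes]:
    """Yield full keys; each step builds a new contiguous bytes object."""
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix  # the per-key rematerialization copy
        yield key
        prev = key


# Hypothetical sample keys, sorted as they would be inside a block.
keys = [b"row0001/cf:a", b"row0001/cf:b", b"row0002/cf:a"]
encoded = prefix_encode(keys)
decoded = list(scan_encoded(encoded))
```

With encoding NONE the scanner can hand out slices of the block as they are, whereas here every key costs an allocation plus a copy, which is the kind of overhead a sequential scan from the block cache would expose.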

2013/10/20 Vladimir Rodionov <[email protected]>

> I wanted to try PREFIX_TREE because it is supposed to be fastest on
> seek/reseek.
>
>
> On Sat, Oct 19, 2013 at 9:12 PM, lars hofhansl <[email protected]> wrote:
>
> > I found FAST_DIFF to be the fastest of the block encoders.
> > (Prefix tree is in 0.96+ only as far as I know.)
> >
> > -- Lars
> >
> >
> > ----- Original Message -----
> > From: Vladimir Rodionov <[email protected]>
> > To: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>
> > Cc:
> > Sent: Saturday, October 19, 2013 9:08 PM
> > Subject: Re: Beware of PREFIX_TREE block encoding
> >
> > *Now, which encoder did you test specifically? I've seen a 20-40% slowdown
> > when everything is in the blockcache (which is the worst case scenario
> > here), certainly not a 10x slowdown.*
> >
> > I have 1.3M rows (very small - 48 bytes) in a block cache which I read
> > sequentially, using encoding NONE, PREFIX_TREE and
> > StoreScanner/StoreFileScanner (close to metal - block cache :)
> >
> > Time to read all 1.3M rows reported in ms.
> >
> > encoding = NONE,        scanner = StoreScanner;     time = 300 ms
> > encoding = PREFIX_TREE, scanner = StoreScanner;     time = 860 ms
> > encoding = NONE,        scanner = StoreFileScanner; time = 52 ms
> > encoding = PREFIX_TREE, scanner = StoreFileScanner; time = 545 ms
> >
> > -Vladimir
> >
> >
> > On Sat, Oct 19, 2013 at 8:50 PM, lars hofhansl <[email protected]> wrote:
> >
> > > That is (unfortunately) a known issue. The main problem is that HBase
> > > expects each KV to be backed by a contiguous byte[]. For any prefix
> > > encoding it is thus necessary to rematerialize the KV (i.e. copy all
> > > the partial bytes into a new location).
> > > That is inefficient. Nobody has taken on fixing this (we're 1/2 there
> > > with Cells in 0.96, though).
> > >
> > > There are JIRAs out there to fix this, like HBASE-7320 and more
> > > recently HBASE-9794.
> > >
> > > Now, which encoder did you test specifically? I've seen a 20-40%
> > > slowdown when everything is in the blockcache (which is the worst case
> > > scenario here), certainly not a 10x slowdown.
> > >
> > > Note that with block encoding the blocks are stored encoded in the
> > > blockcache, so more data fits into the cache, and (obviously) there's
> > > less IO when the data is not in the cache. So the extra CPU cycles and
> > > memory bandwidth used are offset by that.
> > >
> > > There are other problems too. I just filed an issue (HBASE-9807) where
> > > with block encoders we make a copy of the key portion of the KV on each
> > > reseek, just to compare it to the current scan key.
> > >
> > > -- Lars
> > > ________________________________
> > > From: Vladimir Rodionov <[email protected]>
> > > To: "[email protected]" <[email protected]>
> > > Sent: Saturday, October 19, 2013 7:34 PM
> > > Subject: RE: Beware of PREFIX_TREE block encoding
> > >
> > >
> > > What I wanted to say by this: HBase still does not have a block
> > > encoding that is optimal for both scan and seek (re-seek).
> > > I do not think these goals are mutually exclusive.
> > >
> > >
> > > Best regards,
> > > Vladimir Rodionov
> > > Principal Platform Engineer
> > > Carrier IQ, www.carrieriq.com
> > > e-mail: [email protected]
> > >
> > > ________________________________________
> > >
> > > From: Vladimir Rodionov [[email protected]]
> > > Sent: Saturday, October 19, 2013 7:32 PM
> > > To: [email protected]
> > > Subject: Beware of PREFIX_TREE block encoding
> > >
> > > The scan performance is bad: 10x slower in my tests than for blocks
> > > with NONE encoding. I scan data directly from the block cache through
> > > StoreFileScanner (bypassing all StoreScanner/KeyValueHeap stuff). It
> > > should be clearly stated that this encoding degrades overall
> > > performance significantly in favor of data size reduction and is
> > > suitable only for Gets - not for Scans.
> > >
> > > Best regards,
> > > -Vladimir Rodionov
