Re: Why HBase scan HFile first, before scan memstore?
Great explanation, and thank you for your patience!

Thanks,
Alex

In reply to ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>, 2016-10-16 23:34 GMT-07:00
Re: Why HBase scan HFile first, before scan memstore?
Yes, you are right. You can see this in the code that runs after the list of scanners is formed: they are all collected in a KeyValueHeap.

Please note that the memstore is not a cache. It is only a data structure where data is first written before it is subsequently flushed into files, so the data you read may or may not reside in the memstore. It is therefore always necessary to scan both the memstore and the files, and to keep returning keys in lexicographically sorted order, which is where the heap comes into play.

Regards,
Ram

In reply to Xi Yang, Mon, Oct 17, 2016 at 11:54 AM
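To illustrate the point that the memstore is a write buffer rather than a cache, here is a minimal toy sketch. It is not HBase code: `MemstoreFlushSketch` and all of its methods are hypothetical names, rows are plain strings, and versions and timestamps are ignored. For brevity the read path merges through a `TreeMap` instead of a heap. It only shows that after a flush, part of the data lives in a "file" and part in memory, so a scan has to consult both.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class MemstoreFlushSketch {
    // Hypothetical toy store: one sorted in-memory buffer plus immutable sorted "files".
    private TreeMap<String, String> memstore = new TreeMap<>();
    private final List<TreeMap<String, String>> files = new ArrayList<>();

    /** Writes always land in the in-memory buffer first. */
    void put(String row, String value) {
        memstore.put(row, value);
    }

    /** A flush turns the current buffer into a "file" and starts an empty buffer. */
    void flush() {
        if (memstore.isEmpty()) {
            return;
        }
        files.add(memstore);
        memstore = new TreeMap<>();
    }

    /** What a reader would see if it treated the memstore as a cache and stopped there. */
    List<String> scanMemstoreOnly() {
        return new ArrayList<>(memstore.keySet());
    }

    /** Reads every file and the memstore, returning row keys in sorted order. */
    List<String> scanAll() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : files) {
            merged.putAll(file);
        }
        merged.putAll(memstore);
        return new ArrayList<>(merged.keySet());
    }

    public static void main(String[] args) {
        MemstoreFlushSketch store = new MemstoreFlushSketch();
        store.put("row2", "b");
        store.put("row4", "d");
        store.flush();              // row2 and row4 now live only in a "file"
        store.put("row1", "a");
        store.put("row3", "c");     // row1 and row3 live only in the memstore

        System.out.println(store.scanMemstoreOnly()); // [row1, row3]
        System.out.println(store.scanAll());          // [row1, row2, row3, row4]
    }
}
```

Reading only the memstore misses the flushed rows, which is why every source has to take part in the scan.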
Re: Why HBase scan HFile first, before scan memstore?
Got it. So you mean the result HBase returns to the user actually comes from the heap, and the scanners' job is to feed data into that heap? So the order in which the HFile scanners and the memstore scanner are arranged is not a big deal?

Thanks,
Alex

In reply to Anoop John, 2016-10-16 23:04 GMT-07:00
Re: Why HBase scan HFile first, before scan memstore?
We create a heap over all of these scanners (see StoreScanner, where we make the KeyValueHeap), and cells come out of it in key order. That is, we open and seek all of the scanners and get the current cell from each; the cells then emerge from the heap according to the comparator result. So it is not the case that we scan the HFile scanners first and then scan the memstore. Make sense?

-Anoop-

In reply to Xi Yang, Mon, Oct 17, 2016 at 11:04 AM
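The heap idea described above can be sketched as a k-way merge over already-sorted sources. This is a simplified illustration, not the actual KeyValueHeap implementation: `HeapMergeSketch` and `SortedScanner` are hypothetical names, keys are plain strings, and the standard `java.util.PriorityQueue` stands in for HBase's heap and comparator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class HeapMergeSketch {

    /** Hypothetical stand-in for a scanner: a cursor over one already-sorted source. */
    static final class SortedScanner {
        private final Iterator<String> it;
        private String current; // key the scanner is positioned on, or null when exhausted

        SortedScanner(List<String> sortedKeys) {
            this.it = sortedKeys.iterator();
            advance();
        }

        String peek() {
            return current;
        }

        void advance() {
            current = it.hasNext() ? it.next() : null;
        }
    }

    /** Merges all scanners into one sorted list by always emitting the smallest current key. */
    static List<String> merge(List<SortedScanner> scanners) {
        // The heap orders scanners by the key each one is currently positioned on.
        PriorityQueue<SortedScanner> heap =
            new PriorityQueue<>((a, b) -> a.peek().compareTo(b.peek()));
        for (SortedScanner s : scanners) {
            if (s.peek() != null) {
                heap.add(s);
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            SortedScanner top = heap.poll();
            out.add(top.peek());
            top.advance();
            if (top.peek() != null) {
                heap.add(top); // re-insert under its new current key
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> hfile1 = Arrays.asList("row1", "row4", "row7");
        List<String> hfile2 = Arrays.asList("row2", "row5");
        List<String> memstore = Arrays.asList("row3", "row6");

        // Same sources, added to the scanner list in two different orders.
        List<String> filesFirst = merge(Arrays.asList(
            new SortedScanner(hfile1), new SortedScanner(hfile2), new SortedScanner(memstore)));
        List<String> memstoreFirst = merge(Arrays.asList(
            new SortedScanner(memstore), new SortedScanner(hfile1), new SortedScanner(hfile2)));

        System.out.println(filesFirst);    // [row1, row2, row3, row4, row5, row6, row7]
        System.out.println(memstoreFirst); // [row1, row2, row3, row4, row5, row6, row7]
    }
}
```

Both calls print the same sorted sequence, because the position of a scanner in the input list only affects the order of insertion into the heap, not the order in which keys come out.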
Why HBase scan HFile first, before scan memstore?
I found this code in HStore.java:

    List<StoreFileScanner> sfScanners =
        StoreFileScanner.getScannersForStoreFiles(files,
            cacheBlocks, usePread, isCompaction, false, matcher, readPt,
            isPrimaryReplicaStore());
    List<KeyValueScanner> scanners =
        new ArrayList<KeyValueScanner>(sfScanners.size() + 1);
    scanners.addAll(sfScanners);
    // Then the memstore scanners
    if (memStoreScanners != null) {
      scanners.addAll(memStoreScanners);
    }

Does this mean it scans the HFiles first, before scanning the memstore? Why not scan the memstore first, since memory is always faster than hard disk?

Thanks,
Alex