Once 0.94.7 is released and more users have tried this feature out, we can certainly consider turning it on by default (in 0.94.8).
Cheers

On Wed, Apr 10, 2013 at 4:02 PM, lars hofhansl <[email protected]> wrote:

> Fix is committed and will be in 0.94.7.
>
> I guess we should have a discussion at some point on whether we should
> always switch this feature on (it is disabled by default), as we now can no
> longer find any case where enabling it is slower.
>
> -- Lars
>
>
> ________________________________
> From: Anoop Sam John <[email protected]>
> To: "[email protected]" <[email protected]>; lars hofhansl <[email protected]>
> Sent: Tuesday, April 9, 2013 10:30 PM
> Subject: RE: Essential column family performance
>
> Good finding Lars & team :)
>
> -Anoop-
> ________________________________________
> From: lars hofhansl [[email protected]]
> Sent: Wednesday, April 10, 2013 9:46 AM
> To: [email protected]
> Subject: Re: Essential column family performance
>
> That part did not show up in the profiling session.
> It was just the unnecessary seek that slowed it all down.
>
> -- Lars
>
>
> ________________________________
> From: Ted Yu <[email protected]>
> To: [email protected]
> Sent: Tuesday, April 9, 2013 9:03 PM
> Subject: Re: Essential column family performance
>
> Looking at populateFromJoinedHeap():
>
>     KeyValue kv = populateResult(results, this.joinedHeap, limit,
>         joinedContinuationRow.getBuffer(), joinedContinuationRow.getRowOffset(),
>         joinedContinuationRow.getRowLength(), metric);
>
> ...
>
>     Collections.sort(results, comparator);
>
> Arrays.mergeSort() is used in the Collections.sort() call.
>
> There seems to be some optimization we can do above: we can record the size
> of results before calling populateResult(). Upon return, we can merge the
> two segments without resorting to Arrays.mergeSort() which is recursive.
>
>
> On Tue, Apr 9, 2013 at 6:21 PM, Ted Yu <[email protected]> wrote:
>
> > bq. with only 10000 rows that would all fit in the memstore.
> >
> > This aspect should be enhanced in the test.
> >
> > Cheers
> >
> > On Tue, Apr 9, 2013 at 6:17 PM, Lars Hofhansl <[email protected]> wrote:
> >
> >> Also the unittest tests with only 10000 rows that would all fit in the
> >> memstore. Seek vs reseek should make little difference for the memstore.
> >>
> >> We tested with 1m and 10m rows, and flushed the memstore and compacted
> >> the store.
> >>
> >> Will do some more verification later tonight.
> >>
> >> -- Lars
> >>
> >>
> >> Lars H <[email protected]> wrote:
> >>
> >> >Your slow scanner performance seems to vary as well. How come? Slow is
> >> >with the feature off.
> >> >
> >> >I don't see how reseek can be slower than seek in any scenario.
> >> >
> >> >-- Lars
> >> >
> >> >Ted Yu <[email protected]> wrote:
> >> >
> >> >>I tried using reseek() as suggested, along with my patch from HBASE-8306 (30%
> >> >>selection rate, random distribution and FAST_DIFF encoding on both column
> >> >>families).
> >> >>I got uneven results:
> >> >>
> >> >>2013-04-09 16:59:01,324 INFO [main] regionserver.TestJoinedScanners(167):
> >> >>Slow scanner finished in 7.529083 seconds, got 1546 rows
> >> >>
> >> >>2013-04-09 16:59:06,760 INFO [main] regionserver.TestJoinedScanners(167):
> >> >>Joined scanner finished in 5.43579 seconds, got 1546 rows
> >> >>...
> >> >>2013-04-09 16:59:12,711 INFO [main] regionserver.TestJoinedScanners(167):
> >> >>Slow scanner finished in 5.95016 seconds, got 1546 rows
> >> >>
> >> >>2013-04-09 16:59:20,240 INFO [main] regionserver.TestJoinedScanners(167):
> >> >>Joined scanner finished in 7.529044 seconds, got 1546 rows
> >> >>
> >> >>FYI
> >> >>
> >> >>On Tue, Apr 9, 2013 at 4:47 PM, lars hofhansl <[email protected]> wrote:
> >> >>
> >> >>> We did some tests here.
> >> >>> I ran this through the profiler against a local RegionServer and found the
> >> >>> part that causes the slowdown is a seek called here:
> >> >>>     boolean mayHaveData =
> >> >>>       (nextJoinedKv != null &&
> >> >>>        nextJoinedKv.matchingRow(currentRow, offset, length))
> >> >>>       ||
> >> >>>       (this.joinedHeap.seek(KeyValue.createFirstOnRow(currentRow, offset, length))
> >> >>>        && joinedHeap.peek() != null
> >> >>>        && joinedHeap.peek().matchingRow(currentRow, offset, length));
> >> >>>
> >> >>> Looking at the code, this is needed because the joinedHeap can fall
> >> >>> behind, and hence we have to catch it up.
> >> >>> The key observation, though, is that the joined heap can only ever be
> >> >>> behind, and hence we do not need a seek, but only a reseek.
> >> >>>
> >> >>> Deploying a RegionServer with the seek replaced with reseek we see an
> >> >>> improvement in *all* cases.
> >> >>>
> >> >>> I'll file a jira with a fix later.
> >> >>>
> >> >>> -- Lars
> >> >>>
> >> >>>
> >> >>> ________________________________
> >> >>> From: James Taylor <[email protected]>
> >> >>> To: [email protected]
> >> >>> Sent: Monday, April 8, 2013 6:53 PM
> >> >>> Subject: Re: Essential column family performance
> >> >>>
> >> >>> Good idea, Sergey. We'll rerun with larger non essential column family
> >> >>> values and see if there's a crossover point. One other difference for us
> >> >>> is that we're using FAST_DIFF encoding. We'll try with no encoding too.
> >> >>> Our table has 20 million rows across four region servers.
> >> >>>
> >> >>> Regarding the parallelization we do, we run multiple scans in parallel
> >> >>> instead of one single scan over the table. We use the region boundaries
> >> >>> of the table to divide up the work evenly, adding a start/stop key for
> >> >>> each scan that corresponds to the region boundaries. Our client then
> >> >>> does a final merge/aggregation step (i.e. adding up the count it gets
> >> >>> back from the scan for each region).
> >> >>>
> >> >>> On 04/08/2013 01:34 PM, Sergey Shelukhin wrote:
> >> >>> > IntegrationTestLazyCfLoading uses randomly distributed keys with the
> >> >>> > following condition for filtering:
> >> >>> > 1 == (Long.parseLong(Bytes.toString(rowKey, 0, 4), 16) & 1); where rowKey
> >> >>> > is hex string of MD5 key.
> >> >>> > Then, there are 2 "lazy" CFs, each of which has a value of 4-64k.
> >> >>> > This test also showed significant improvement IIRC, so random distribution
> >> >>> > and high %age of values selected should not be a problem as such.
> >> >>> >
> >> >>> > My hunch would be that the additional cost of seeks/merging the results
> >> >>> > from two CFs outweighs the benefit of lazy loading on such small values
> >> >>> > for the "lazy" CF with lots of data selected. This feature definitely makes
> >> >>> > no sense if you are selecting all values, because then extra work is being
> >> >>> > done for no benefit (everything is read anyway).
> >> >>> > So the use cases would be larger "lazy" CFs and/or a low percentage of
> >> >>> > values selected.
> >> >>> >
> >> >>> > Can you try to increase the 2nd CF values' size and rerun the test?
> >> >>> >
> >> >>> >
> >> >>> > On Mon, Apr 8, 2013 at 10:38 AM, James Taylor <[email protected]> wrote:
> >> >>> >
> >> >>> >> In the TestJoinedScanners.java, is the 40% randomly distributed or
> >> >>> >> sequential?
> >> >>> >>
> >> >>> >> In our test, the % is randomly distributed. Also, our custom filter does
> >> >>> >> the same thing that SingleColumnValueFilter does. On the client-side, we'd
> >> >>> >> execute the query in parallel, through multiple scans along the region
> >> >>> >> boundaries. Would that have a negative impact on performance for this
> >> >>> >> "essential column family" feature?
> >> >>> >>
> >> >>> >> Thanks,
> >> >>> >>
> >> >>> >> James
> >> >>> >>
> >> >>> >>
> >> >>> >> On 04/08/2013 10:10 AM, Anoop John wrote:
> >> >>> >>
> >> >>> >>> Agree here. The effectiveness depends on what % of data satisfies the
> >> >>> >>> condition and how it is distributed across HFile blocks. We will get a
> >> >>> >>> performance gain when we are able to skip some HFile blocks (from
> >> >>> >>> non essential CFs). Can you test with a different HFile block size
> >> >>> >>> (lower value)?
> >> >>> >>>
> >> >>> >>> -Anoop-
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> On Mon, Apr 8, 2013 at 8:19 PM, Ted Yu <[email protected]> wrote:
> >> >>> >>>
> >> >>> >>>> I made the following change in TestJoinedScanners.java:
> >> >>> >>>> - int flag_percent = 1;
> >> >>> >>>> + int flag_percent = 40;
> >> >>> >>>>
> >> >>> >>>> The test took longer but still favors joined scanner.
> >> >>> >>>> I got some new results:
> >> >>> >>>>
> >> >>> >>>> 2013-04-08 07:46:06,959 INFO [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Slow scanner finished in 7.424388 seconds, got 2050 rows
> >> >>> >>>> ...
> >> >>> >>>> 2013-04-08 07:46:12,010 INFO [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Joined scanner finished in 5.05063 seconds, got 2050 rows
> >> >>> >>>>
> >> >>> >>>> 2013-04-08 07:46:18,358 INFO [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Slow scanner finished in 6.348517 seconds, got 2050 rows
> >> >>> >>>> ...
> >> >>> >>>> 2013-04-08 07:46:22,946 INFO [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Joined scanner finished in 4.587545 seconds, got 2050 rows
> >> >>> >>>>
> >> >>> >>>> Looks like effectiveness of joined scanner is affected by distribution of
> >> >>> >>>> data.
> >> >>> >>>>
> >> >>> >>>> Cheers
> >> >>> >>>>
> >> >>> >>>> On Sun, Apr 7, 2013 at 8:52 PM, lars hofhansl <[email protected]> wrote:
> >> >>> >>>>
> >> >>> >>>>> Looking at the joined scanner test code, it sets it up such that 1% of
> >> >>> >>>>> the rows match, which would somewhat be in line with James' results.
> >> >>> >>>>>
> >> >>> >>>>> In my own testing a while ago I found a 100% improvement with 0% match.
> >> >>> >>>>>
> >> >>> >>>>>
> >> >>> >>>>> -- Lars
> >> >>> >>>>>
> >> >>> >>>>>
> >> >>> >>>>> ________________________________
> >> >>> >>>>> From: Ted Yu <[email protected]>
> >> >>> >>>>> To: [email protected]
> >> >>> >>>>> Sent: Sunday, April 7, 2013 4:13 PM
> >> >>> >>>>> Subject: Re: Essential column family performance
> >> >>> >>>>>
> >> >>> >>>>> I have attached 5416-TestJoinedScanners-0.94.txt to HBASE-5416 for your
> >> >>> >>>>> reference.
> >> >>> >>>>>
> >> >>> >>>>> On my MacBook, I got the following results from the test:
> >> >>> >>>>>
> >> >>> >>>>> 2013-04-07 16:08:17,474 INFO [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>>> Slow scanner finished in 7.973822 seconds, got 100 rows
> >> >>> >>>>> ...
> >> >>> >>>>> 2013-04-07 16:08:17,946 INFO [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>>> Joined scanner finished in 0.47235 seconds, got 100 rows
> >> >>> >>>>>
> >> >>> >>>>> Cheers
> >> >>> >>>>>
> >> >>> >>>>> On Sun, Apr 7, 2013 at 4:03 PM, Ted Yu <[email protected]> wrote:
> >> >>> >>>>>
> >> >>> >>>>>> Looking at
> >> >>> >>>>>> https://issues.apache.org/jira/secure/attachment/12564340/5416-0.94-v3.txt ,
> >> >>> >>>>>> I found that it didn't contain TestJoinedScanners which shows
> >> >>> >>>>>> difference in scanner performance:
> >> >>> >>>>>>
> >> >>> >>>>>>     LOG.info((slow ? "Slow" : "Joined") + " scanner finished in " +
> >> >>> >>>>>>         Double.toString(timeSec)
> >> >>> >>>>>>         + " seconds, got " + Long.toString(rows_count/2) + " rows");
> >> >>> >>>>>>
> >> >>> >>>>>> The test uses SingleColumnValueFilter:
> >> >>> >>>>>>
> >> >>> >>>>>>     SingleColumnValueFilter filter = new SingleColumnValueFilter(
> >> >>> >>>>>>         cf_essential, col_name, CompareFilter.CompareOp.EQUAL, flag_yes);
> >> >>> >>>>>>
> >> >>> >>>>>> It is possible that the custom filter you were using would exhibit
> >> >>> >>>>>> different access pattern compared to SingleColumnValueFilter. e.g. does
> >> >>> >>>>>> your filter utilize hint ?
> >> >>> >>>>>> It would be easier for me and other people to reproduce the issue you
> >> >>> >>>>>> experienced if you put your scenario in some test similar to
> >> >>> >>>>>> TestJoinedScanners.
> >> >>> >>>>>>
> >> >>> >>>>>> Will take a closer look at the code Monday.
> >> >>> >>>>>>
> >> >>> >>>>>> Cheers
> >> >>> >>>>>>
> >> >>> >>>>>> On Sun, Apr 7, 2013 at 11:37 AM, James Taylor <[email protected]> wrote:
> >> >>> >>>>>>
> >> >>> >>>>>>> Yes, on 0.94.6. We have our own custom filter derived from FilterBase, so
> >> >>> >>>>>>> filterIfMissing isn't the issue - the results of the scan are correct.
> >> >>> >>>>>>>
> >> >>> >>>>>>> I can see that if the essential column family has more data compared to
> >> >>> >>>>>>> the non essential column family that the results would eventually even
> >> >>> >>>>>>> out. I was hoping to always be able to enable the essential column family
> >> >>> >>>>>>> feature. Is there an inherent reason why performance would degrade like
> >> >>> >>>>>>> this? Does it boil down to a single sequential scan versus many seeks?
> >> >>> >>>>>>>
> >> >>> >>>>>>> Thanks,
> >> >>> >>>>>>>
> >> >>> >>>>>>> James
> >> >>> >>>>>>>
> >> >>> >>>>>>>
> >> >>> >>>>>>> On 04/07/2013 07:44 AM, Ted Yu wrote:
> >> >>> >>>>>>>
> >> >>> >>>>>>>> James:
> >> >>> >>>>>>>> Your test was based on 0.94.6.1, right ?
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> What Filter were you using ?
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> If you used SingleColumnValueFilter, have you seen my comment here ?
> >> >>> >>>>>>>> https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541229
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> BTW the use case Max Lapan tried to address has non essential column
> >> >>> >>>>>>>> family carrying considerably more data compared to essential column
> >> >>> >>>>>>>> family.
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> Cheers
> >> >>> >>>>>>>>
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> On Sat, Apr 6, 2013 at 11:05 PM, James Taylor <[email protected]> wrote:
> >> >>> >>>>>>>>
> >> >>> >>>>>>>>> Hello,
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> We're doing some performance testing of the essential column family
> >> >>> >>>>>>>>> feature, and we're seeing some performance degradation when comparing
> >> >>> >>>>>>>>> with and without the feature enabled:
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>                          Performance of scan relative
> >> >>> >>>>>>>>> % of rows selected       to not enabling the feature
> >> >>> >>>>>>>>> ---------------------    --------------------------------
> >> >>> >>>>>>>>> 100%                     1.0x
> >> >>> >>>>>>>>> 80%                      2.0x
> >> >>> >>>>>>>>> 60%                      2.3x
> >> >>> >>>>>>>>> 40%                      2.2x
> >> >>> >>>>>>>>> 20%                      1.5x
> >> >>> >>>>>>>>> 10%                      1.0x
> >> >>> >>>>>>>>> 5%                       0.67x
> >> >>> >>>>>>>>> 0%                       0.30x
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> In our scenario, we have two column families. The key value from the
> >> >>> >>>>>>>>> essential column family is used in the filter, while the key value from
> >> >>> >>>>>>>>> the other, non essential column family is returned by the scan. Each row
> >> >>> >>>>>>>>> contains values for both key values, with the values being relatively
> >> >>> >>>>>>>>> narrow (less than 50 bytes). In this scenario, the only time we're
> >> >>> >>>>>>>>> seeing a performance gain is when less than 10% of the rows are selected.
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> Is this a reasonable test? Has anyone else measured this?
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> Thanks,
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> James
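
For reference, here is a minimal sketch of the merge Ted Yu suggests in the thread for populateFromJoinedHeap(): record the size of results before populateResult() appends to it, then merge the two already-sorted segments in one linear pass instead of calling Collections.sort() on the whole list. This is not HBase code. The class name SegmentMerge, the method mergeSortedSegments and the use of plain Integer values in place of KeyValue are all assumptions made for illustration, and the sketch assumes a random-access list such as ArrayList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of HBase.
final class SegmentMerge {

    /**
     * Merges results[0, mid) and results[mid, size) in place.
     * Both segments must already be sorted according to cmp.
     * The merge is stable: on ties, elements of the first segment come first.
     */
    static <T> void mergeSortedSegments(List<T> results, int mid,
                                        Comparator<? super T> cmp) {
        if (mid <= 0 || mid >= results.size()) {
            return; // one of the segments is empty, nothing to merge
        }
        if (cmp.compare(results.get(mid - 1), results.get(mid)) <= 0) {
            return; // segments are already in order end to end
        }
        // Copy only the first segment; the second is consumed in place.
        List<T> left = new ArrayList<>(results.subList(0, mid));
        int i = 0;   // next unread element of the copied first segment
        int j = mid; // next unread element of the second segment
        int out = 0; // next write position; never passes j while left has elements
        while (i < left.size() && j < results.size()) {
            if (cmp.compare(left.get(i), results.get(j)) <= 0) {
                results.set(out++, left.get(i++));
            } else {
                results.set(out++, results.get(j++));
            }
        }
        while (i < left.size()) {
            results.set(out++, left.get(i++));
        }
        // Any remaining elements of the second segment are already in place.
    }

    public static void main(String[] args) {
        // First three entries stand in for the essential-CF results,
        // the last three for what the joined heap appended.
        List<Integer> results = new ArrayList<>(Arrays.asList(1, 4, 9, 2, 3, 10));
        int sizeBefore = 3; // size of results recorded before the append
        mergeSortedSegments(results, sizeBefore, Comparator.<Integer>naturalOrder());
        System.out.println(results); // prints [1, 2, 3, 4, 9, 10]
    }
}
```

The pass is linear in the size of results and copies only the first segment. As Lars notes above, the sort did not show up in the profiling session, so whether such a change is worthwhile in practice is an open question.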
