That part did not show up in the profiling session. It was just the unnecessary seek that slowed it all down.
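To make the seek/reseek difference concrete, here is a minimal sketch of a cursor that can only ever be behind its target. It is a toy model and not HBase code: ToyScanner, the 8-entry look-ahead window, and the comparison counter are all assumptions made for illustration, and counting comparisons only stands in for the real cost of repositioning a scanner.

```java
public class ReseekSketch {
    // Hypothetical toy scanner over sorted row ids; a model of the idea only, not HBase code.
    static final class ToyScanner {
        private final int[] rows;   // sorted ascending
        private int pos = 0;        // index of the current row
        long comparisons = 0;       // rough cost counter (assumption: comparisons ~ cost)

        ToyScanner(int[] sortedRows) { this.rows = sortedRows; }

        /** Returns the current row, or -1 when the scanner is exhausted. */
        int peek() { return pos < rows.length ? rows[pos] : -1; }

        /** seek: reposition from scratch with a binary search over all rows. */
        boolean seek(int row) {
            pos = lowerBound(0, rows.length, row);
            return pos < rows.length;
        }

        /** reseek: the caller guarantees row >= current row, so only look forward. */
        boolean reseek(int row) {
            // Assumed "nearby" window, standing in for data that is already at hand.
            int windowEnd = Math.min(rows.length, pos + 8);
            for (int i = pos; i < windowEnd; i++) {
                comparisons++;
                if (rows[i] >= row) { pos = i; return true; }
            }
            // Target is further ahead: search only the part after the window.
            pos = lowerBound(windowEnd, rows.length, row);
            return pos < rows.length;
        }

        /** First index in [lo, hi) whose row is >= the target, or hi if none. */
        private int lowerBound(int lo, int hi, int row) {
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                comparisons++;
                if (rows[mid] < row) lo = mid + 1; else hi = mid;
            }
            return lo;
        }
    }

    public static void main(String[] args) {
        int[] rows = new int[1_000_000];
        for (int i = 0; i < rows.length; i++) rows[i] = i * 2;  // even row ids

        ToyScanner seeking = new ToyScanner(rows);
        ToyScanner reseeking = new ToyScanner(rows);
        // The trailing scanner is asked to catch up by one row at a time.
        for (int row = 0; row < 200_000; row += 2) {
            seeking.seek(row);
            reseeking.reseek(row);
            if (seeking.peek() != reseeking.peek()) {
                throw new AssertionError("scanners diverged");
            }
        }
        System.out.println("seek comparisons:   " + seeking.comparisons);
        System.out.println("reseek comparisons: " + reseeking.comparisons);
    }
}
```

Both scanners land on the same row every time; the reseek variant simply never looks behind its current position, which is all the joined heap needs here.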
-- Lars

________________________________
From: Ted Yu <[email protected]>
To: [email protected]
Sent: Tuesday, April 9, 2013 9:03 PM
Subject: Re: Essential column family performance

Looking at populateFromJoinedHeap():

      KeyValue kv = populateResult(results, this.joinedHeap, limit,
          joinedContinuationRow.getBuffer(), joinedContinuationRow.getRowOffset(),
          joinedContinuationRow.getRowLength(), metric);
...
      Collections.sort(results, comparator);

Arrays.mergeSort() is used in the Collections.sort() call.
There seems to be some optimization we can do above: we can record the size of results before calling populateResult(). Upon return, we can merge the two segments without resorting to Arrays.mergeSort() which is recursive.

On Tue, Apr 9, 2013 at 6:21 PM, Ted Yu <[email protected]> wrote:

> bq. with only 10000 rows that would all fit in the memstore.
>
> This aspect should be enhanced in the test.
>
> Cheers
>
> On Tue, Apr 9, 2013 at 6:17 PM, Lars Hofhansl <[email protected]> wrote:
>
>> Also the unittest tests with only 10000 rows that would all fit in the
>> memstore. Seek vs reseek should make little difference for the memstore.
>>
>> We tested with 1m and 10m rows, and flushed the memstore and compacted
>> the store.
>>
>> Will do some more verification later tonight.
>>
>> -- Lars
>>
>>
>> Lars H <[email protected]> wrote:
>>
>> >Your slow scanner performance seems to vary as well. How come? Slow is with the feature off.
>> >
>> >I don't know how reseek can be slower than seek in any scenario.
>> >
>> >-- Lars
>> >
>> >Ted Yu <[email protected]> wrote:
>> >
>> >>I tried using reseek() as suggested, along with my patch from HBASE-8306 (30%
>> >>selection rate, random distribution and FAST_DIFF encoding on both column
>> >>families).
>> >>I got uneven results:
>> >>
>> >>2013-04-09 16:59:01,324 INFO [main] regionserver.TestJoinedScanners(167):
>> >>Slow scanner finished in 7.529083 seconds, got 1546 rows
>> >>
>> >>2013-04-09 16:59:06,760 INFO [main] regionserver.TestJoinedScanners(167):
>> >>Joined scanner finished in 5.43579 seconds, got 1546 rows
>> >>...
>> >>2013-04-09 16:59:12,711 INFO [main] regionserver.TestJoinedScanners(167):
>> >>Slow scanner finished in 5.95016 seconds, got 1546 rows
>> >>
>> >>2013-04-09 16:59:20,240 INFO [main] regionserver.TestJoinedScanners(167):
>> >>Joined scanner finished in 7.529044 seconds, got 1546 rows
>> >>
>> >>FYI
>> >>
>> >>On Tue, Apr 9, 2013 at 4:47 PM, lars hofhansl <[email protected]> wrote:
>> >>
>> >>> We did some tests here.
>> >>> I ran this through the profiler against a local RegionServer and found the
>> >>> part that causes the slowdown is a seek called here:
>> >>>      boolean mayHaveData =
>> >>>          (nextJoinedKv != null && nextJoinedKv.matchingRow(currentRow, offset, length))
>> >>>          ||
>> >>>          (this.joinedHeap.seek(KeyValue.createFirstOnRow(currentRow, offset, length))
>> >>>              && joinedHeap.peek() != null
>> >>>              && joinedHeap.peek().matchingRow(currentRow, offset, length));
>> >>>
>> >>> Looking at the code, this is needed because the joinedHeap can fall
>> >>> behind, and hence we have to catch it up.
>> >>> The key observation, though, is that the joined heap can only ever be
>> >>> behind, and hence we do not need a seek, but only a reseek.
>> >>>
>> >>> Deploying a RegionServer with the seek replaced with reseek we see an
>> >>> improvement in *all* cases.
>> >>>
>> >>> I'll file a jira with a fix later.
>> >>>
>> >>> -- Lars
>> >>>
>> >>>
>> >>>
>> >>> ________________________________
>> >>> From: James Taylor <[email protected]>
>> >>> To: [email protected]
>> >>> Sent: Monday, April 8, 2013 6:53 PM
>> >>> Subject: Re: Essential column family performance
>> >>>
>> >>> Good idea, Sergey.
>> >>> We'll rerun with larger non essential column family
>> >>> values and see if there's a crossover point. One other difference for us
>> >>> is that we're using FAST_DIFF encoding. We'll try with no encoding too.
>> >>> Our table has 20 million rows across four region servers.
>> >>>
>> >>> Regarding the parallelization we do, we run multiple scans in parallel
>> >>> instead of one single scan over the table. We use the region boundaries
>> >>> of the table to divide up the work evenly, adding a start/stop key for
>> >>> each scan that corresponds to the region boundaries. Our client then
>> >>> does a final merge/aggregation step (i.e. adding up the count it gets
>> >>> back from the scan for each region).
>> >>>
>> >>> On 04/08/2013 01:34 PM, Sergey Shelukhin wrote:
>> >>> > IntegrationTestLazyCfLoading uses randomly distributed keys with the
>> >>> > following condition for filtering:
>> >>> > 1 == (Long.parseLong(Bytes.toString(rowKey, 0, 4), 16) & 1); where rowKey
>> >>> > is hex string of MD5 key.
>> >>> > Then, there are 2 "lazy" CFs, each of which has a value of 4-64k.
>> >>> > This test also showed significant improvement IIRC, so random distribution
>> >>> > and high %age of values selected should not be a problem as such.
>> >>> >
>> >>> > My hunch would be that the additional cost of seeks/merging the results
>> >>> > from two CFs outweighs the benefit of lazy loading on such small values
>> >>> > for the "lazy" CF with lots of data selected. This feature definitely makes
>> >>> > no sense if you are selecting all values, because then extra work is being
>> >>> > done for no benefit (everything is read anyway).
>> >>> > So the use cases would be larger "lazy" CFs or/and low percentage of values
>> >>> > selected.
>> >>> >
>> >>> > Can you try to increase the 2nd CF values' size and rerun the test?
>> >>> >
>> >>> >
>> >>> > On Mon, Apr 8, 2013 at 10:38 AM, James Taylor <[email protected]> wrote:
>> >>> >
>> >>> >> In the TestJoinedScanners.java, is the 40% randomly distributed or
>> >>> >> sequential?
>> >>> >>
>> >>> >> In our test, the % is randomly distributed. Also, our custom filter does
>> >>> >> the same thing that SingleColumnValueFilter does. On the client-side, we'd
>> >>> >> execute the query in parallel, through multiple scans along the region
>> >>> >> boundaries. Would that have a negative impact on performance for this
>> >>> >> "essential column family" feature?
>> >>> >>
>> >>> >> Thanks,
>> >>> >>
>> >>> >> James
>> >>> >>
>> >>> >>
>> >>> >> On 04/08/2013 10:10 AM, Anoop John wrote:
>> >>> >>
>> >>> >>> Agree here. The effectiveness depends on what % of data satisfies the
>> >>> >>> condition, how it is distributed across HFile blocks. We will get
>> >>> >>> performance gain when we are able to skip some HFile blocks (from
>> >>> >>> non essential CFs). Can you test with a different HFile block size (lower
>> >>> >>> value)?
>> >>> >>>
>> >>> >>> -Anoop-
>> >>> >>>
>> >>> >>>
>> >>> >>> On Mon, Apr 8, 2013 at 8:19 PM, Ted Yu <[email protected]> wrote:
>> >>> >>>
>> >>> >>>> I made the following change in TestJoinedScanners.java:
>> >>> >>>> -      int flag_percent = 1;
>> >>> >>>> +      int flag_percent = 40;
>> >>> >>>>
>> >>> >>>> The test took longer but still favors joined scanner.
>> >>> >>>> I got some new results:
>> >>> >>>>
>> >>> >>>> 2013-04-08 07:46:06,959 INFO [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Slow scanner finished in 7.424388 seconds, got 2050 rows
>> >>> >>>> ...
>> >>> >>>> 2013-04-08 07:46:12,010 INFO [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Joined scanner finished in 5.05063 seconds, got 2050 rows
>> >>> >>>>
>> >>> >>>> 2013-04-08 07:46:18,358 INFO [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Slow scanner finished in 6.348517 seconds, got 2050 rows
>> >>> >>>> ...
>> >>> >>>> 2013-04-08 07:46:22,946 INFO [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Joined scanner finished in 4.587545 seconds, got 2050 rows
>> >>> >>>>
>> >>> >>>> Looks like effectiveness of joined scanner is affected by distribution of
>> >>> >>>> data.
>> >>> >>>>
>> >>> >>>> Cheers
>> >>> >>>>
>> >>> >>>> On Sun, Apr 7, 2013 at 8:52 PM, lars hofhansl <[email protected]> wrote:
>> >>> >>>>
>> >>> >>>>> Looking at the joined scanner test code, it sets it up such that 1% of the
>> >>> >>>>> rows match, which would somewhat be in line with James' results.
>> >>> >>>>>
>> >>> >>>>> In my own testing a while ago I found a 100% improvement with 0% match.
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>> -- Lars
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>> ________________________________
>> >>> >>>>> From: Ted Yu <[email protected]>
>> >>> >>>>> To: [email protected]
>> >>> >>>>> Sent: Sunday, April 7, 2013 4:13 PM
>> >>> >>>>> Subject: Re: Essential column family performance
>> >>> >>>>>
>> >>> >>>>> I have attached 5416-TestJoinedScanners-0.94.txt to HBASE-5416 for your
>> >>> >>>>> reference.
>> >>> >>>>>
>> >>> >>>>> On my MacBook, I got the following results from the test:
>> >>> >>>>>
>> >>> >>>>> 2013-04-07 16:08:17,474 INFO [main] regionserver.TestJoinedScanners(157):
>> >>> >>>>> Slow scanner finished in 7.973822 seconds, got 100 rows
>> >>> >>>>> ...
>> >>> >>>>> 2013-04-07 16:08:17,946 INFO [main] regionserver.TestJoinedScanners(157):
>> >>> >>>>> Joined scanner finished in 0.47235 seconds, got 100 rows
>> >>> >>>>>
>> >>> >>>>> Cheers
>> >>> >>>>>
>> >>> >>>>> On Sun, Apr 7, 2013 at 4:03 PM, Ted Yu <[email protected]> wrote:
>> >>> >>>>>
>> >>> >>>>>> Looking at
>> >>> >>>>>> https://issues.apache.org/jira/secure/attachment/12564340/5416-0.94-v3.txt,
>> >>> >>>>>> I found that it didn't contain TestJoinedScanners which shows
>> >>> >>>>>> difference in scanner performance:
>> >>> >>>>>>
>> >>> >>>>>>        LOG.info((slow ? "Slow" : "Joined") + " scanner finished in " +
>> >>> >>>>>>            Double.toString(timeSec)
>> >>> >>>>>>            + " seconds, got " + Long.toString(rows_count/2) + " rows");
>> >>> >>>>>>
>> >>> >>>>>> The test uses SingleColumnValueFilter:
>> >>> >>>>>>
>> >>> >>>>>>      SingleColumnValueFilter filter = new SingleColumnValueFilter(
>> >>> >>>>>>          cf_essential, col_name, CompareFilter.CompareOp.EQUAL, flag_yes);
>> >>> >>>>>>
>> >>> >>>>>> It is possible that the custom filter you were using would exhibit
>> >>> >>>>>> a different access pattern compared to SingleColumnValueFilter. e.g. does
>> >>> >>>>>> your filter utilize hint ?
>> >>> >>>>>> It would be easier for me and other people to reproduce the issue you
>> >>> >>>>>> experienced if you put your scenario in some test similar to
>> >>> >>>>>> TestJoinedScanners.
>> >>> >>>>>>
>> >>> >>>>>> Will take a closer look at the code Monday.
>> >>> >>>>>>
>> >>> >>>>>> Cheers
>> >>> >>>>>>
>> >>> >>>>>> On Sun, Apr 7, 2013 at 11:37 AM, James Taylor <[email protected]> wrote:
>> >>> >>>>>>
>> >>> >>>>>>> Yes, on 0.94.6.
>> >>> >>>>>>> We have our own custom filter derived from FilterBase, so
>> >>> >>>>>>> filterIfMissing isn't the issue - the results of the scan are correct.
>> >>> >>>>>>> I can see that if the essential column family has more data compared to
>> >>> >>>>>>> the non essential column family that the results would eventually even out.
>> >>> >>>>>>> I was hoping to always be able to enable the essential column family
>> >>> >>>>>>> feature. Is there an inherent reason why performance would degrade like
>> >>> >>>>>>> this? Does it boil down to a single sequential scan versus many seeks?
>> >>> >>>>>>>
>> >>> >>>>>>> Thanks,
>> >>> >>>>>>>
>> >>> >>>>>>> James
>> >>> >>>>>>>
>> >>> >>>>>>>
>> >>> >>>>>>> On 04/07/2013 07:44 AM, Ted Yu wrote:
>> >>> >>>>>>>
>> >>> >>>>>>>> James:
>> >>> >>>>>>>> Your test was based on 0.94.6.1, right ?
>> >>> >>>>>>>>
>> >>> >>>>>>>> What Filter were you using ?
>> >>> >>>>>>>>
>> >>> >>>>>>>> If you used SingleColumnValueFilter, have you seen my comment here ?
>> >>> >>>>>>>> https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541229
>> >>> >>>>>>>>
>> >>> >>>>>>>> BTW the use case Max Lapan tried to address has non essential column family
>> >>> >>>>>>>> carrying considerably more data compared to essential column family.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Cheers
>> >>> >>>>>>>>
>> >>> >>>>>>>>
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Sat, Apr 6, 2013 at 11:05 PM, James Taylor <[email protected]> wrote:
>> >>> >>>>>>>>
>> >>> >>>>>>>>> Hello,
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> We're doing some performance testing of the essential column family
>> >>> >>>>>>>>> feature, and we're seeing some performance degradation when comparing with
>> >>> >>>>>>>>> and without the feature enabled:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>                            Performance of scan relative
>> >>> >>>>>>>>> % of rows selected         to not enabling the feature
>> >>> >>>>>>>>> ---------------------      --------------------------------
>> >>> >>>>>>>>> 100%                       1.0x
>> >>> >>>>>>>>> 80%                        2.0x
>> >>> >>>>>>>>> 60%                        2.3x
>> >>> >>>>>>>>> 40%                        2.2x
>> >>> >>>>>>>>> 20%                        1.5x
>> >>> >>>>>>>>> 10%                        1.0x
>> >>> >>>>>>>>> 5%                         0.67x
>> >>> >>>>>>>>> 0%                         0.30x
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> In our scenario, we have two column
>> >>> >>>>>>>>> families. The key value from the
>> >>> >>>>>>>>> essential column family is used in the filter, while the key value from the
>> >>> >>>>>>>>> other, non essential column family is returned by the scan. Each row
>> >>> >>>>>>>>> contains values for both key values, with the values being relatively
>> >>> >>>>>>>>> narrow (less than 50 bytes). In this scenario, the only time we're seeing a
>> >>> >>>>>>>>> performance gain is when less than 10% of the rows are selected.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Is this a reasonable test? Has anyone else measured this?
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Thanks,
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> James
