Good finding Lars & team  :)

-Anoop-
________________________________________
From: lars hofhansl [[email protected]]
Sent: Wednesday, April 10, 2013 9:46 AM
To: [email protected]
Subject: Re: Essential column family performance

That part did not show up in the profiling session.
It was just the unnecessary seek that slowed it all down.

-- Lars
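
The seek versus reseek distinction in this thread can be pictured with a toy cursor over sorted keys. This is only a sketch under loud assumptions: ForwardCursor and its methods are hypothetical names, not HBase classes, and real scanners consult block indexes rather than stepping linearly. The one point it illustrates is that a reseek may reuse the current position when the target can only be at or after it.

```java
public class ForwardCursor {
    private final int[] keys;  // sorted ascending (toy stand-in for row keys)
    private int pos = 0;       // index of the current key
    long steps = 0;            // comparisons performed, for illustration only

    ForwardCursor(int[] sortedKeys) {
        this.keys = sortedKeys;
    }

    // seek: position at the first key >= target, starting over from the front.
    boolean seek(int target) {
        pos = 0;
        return advanceTo(target);
    }

    // reseek: same result, but only valid when the target is at or after the
    // current position, so it continues from where the cursor already is.
    boolean reseek(int target) {
        return advanceTo(target);
    }

    private boolean advanceTo(int target) {
        while (pos < keys.length) {
            steps++;
            if (keys[pos] >= target) {
                return true;
            }
            pos++;
        }
        return false;
    }

    int peek() {
        return keys[pos];
    }

    // Visits ascending targets with either seek or reseek and returns the work done.
    static long run(boolean useReseek) {
        int[] keys = new int[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = i;
        }
        ForwardCursor cursor = new ForwardCursor(keys);
        for (int target = 100; target <= 900; target += 100) {
            boolean found = useReseek ? cursor.reseek(target) : cursor.seek(target);
            if (!found || cursor.peek() != target) {
                throw new IllegalStateException("cursor landed on the wrong key");
            }
        }
        return cursor.steps;
    }

    public static void main(String[] args) {
        System.out.println("seek steps:   " + run(false));
        System.out.println("reseek steps: " + run(true));
    }
}
```

Because the targets only ever move forward, both variants land on the same keys, and the reseek variant does less work in this toy model.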



________________________________
 From: Ted Yu <[email protected]>
To: [email protected]
Sent: Tuesday, April 9, 2013 9:03 PM
Subject: Re: Essential column family performance

Looking at populateFromJoinedHeap():

      KeyValue kv = populateResult(results, this.joinedHeap, limit,
          joinedContinuationRow.getBuffer(), joinedContinuationRow.getRowOffset(),
          joinedContinuationRow.getRowLength(), metric);

...

      Collections.sort(results, comparator);

Arrays.mergeSort() is used in the Collections.sort() call.

There seems to be some optimization we can do above: we can record the size
of results before calling populateResult(). Upon return, we can merge the
two segments without resorting to Arrays.mergeSort() which is recursive.
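
A minimal sketch of the merge idea above, assuming that the entries already in results are sorted and that populateResult() appends its entries in sorted order (neither assumption is confirmed in this thread). SegmentMerge and mergeSegments are hypothetical names, and Integer stands in for KeyValue:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SegmentMerge {
    // Merges results[0, mid) and results[mid, size), each already sorted
    // by cmp, into one sorted run, writing the outcome back into results.
    static <T> void mergeSegments(List<T> results, int mid, Comparator<? super T> cmp) {
        List<T> merged = new ArrayList<>(results.size());
        int i = 0;
        int j = mid;
        while (i < mid && j < results.size()) {
            // "<=" takes from the first segment on ties, which keeps the merge stable.
            if (cmp.compare(results.get(i), results.get(j)) <= 0) {
                merged.add(results.get(i++));
            } else {
                merged.add(results.get(j++));
            }
        }
        while (i < mid) {
            merged.add(results.get(i++));
        }
        while (j < results.size()) {
            merged.add(results.get(j++));
        }
        for (int k = 0; k < merged.size(); k++) {
            results.set(k, merged.get(k));
        }
    }

    public static void main(String[] args) {
        List<Integer> results = new ArrayList<>(Arrays.asList(1, 4, 7));
        int sizeBefore = results.size();         // recorded before the append
        results.addAll(Arrays.asList(2, 3, 9));  // stands in for populateResult()
        mergeSegments(results, sizeBefore, Comparator.naturalOrder());
        System.out.println(results);             // prints [1, 2, 3, 4, 7, 9]
    }
}
```

This is a single linear pass over both segments. It still allocates a temporary list, so whether it beats Collections.sort() in practice would need measuring.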


On Tue, Apr 9, 2013 at 6:21 PM, Ted Yu <[email protected]> wrote:

> bq. with only 10000 rows that would all fit in the memstore.
>
> This aspect should be enhanced in the test.
>
> Cheers
>
> On Tue, Apr 9, 2013 at 6:17 PM, Lars Hofhansl <[email protected]> wrote:
>
>> Also the unittest tests with only 10000 rows that would all fit in the
>> memstore. Seek vs reseek should make little difference for the memstore.
>>
>> We tested with 1m and 10m rows, and flushed the memstore  and compacted
>> the store.
>>
>> Will do some more verification later tonight.
>>
>> -- Lars
>>
>>
>> Lars H <[email protected]> wrote:
>>
>> >Your slow scanner performance seems to vary as well. How come? Slow is
>> >with the feature off.
>> >
>> >I don't see how reseek can be slower than seek in any scenario.
>> >
>> >-- Lars
>> >
>> >Ted Yu <[email protected]> wrote:
>> >
>> >>I tried using reseek() as suggested, along with my patch from HBASE-8306
>> >>(30% selection rate, random distribution and FAST_DIFF encoding on both
>> >>column families).
>> >>I got uneven results:
>> >>
>> >>2013-04-09 16:59:01,324 INFO  [main] regionserver.TestJoinedScanners(167):
>> >>Slow scanner finished in 7.529083 seconds, got 1546 rows
>> >>
>> >>2013-04-09 16:59:06,760 INFO  [main] regionserver.TestJoinedScanners(167):
>> >>Joined scanner finished in 5.43579 seconds, got 1546 rows
>> >>...
>> >>2013-04-09 16:59:12,711 INFO  [main] regionserver.TestJoinedScanners(167):
>> >>Slow scanner finished in 5.95016 seconds, got 1546 rows
>> >>
>> >>2013-04-09 16:59:20,240 INFO  [main] regionserver.TestJoinedScanners(167):
>> >>Joined scanner finished in 7.529044 seconds, got 1546 rows
>> >>
>> >>FYI
>> >>
>> >>On Tue, Apr 9, 2013 at 4:47 PM, lars hofhansl <[email protected]> wrote:
>> >>
>> >>> We did some tests here.
>> >>> I ran this through the profiler against a local RegionServer and found
>> >>> the part that causes the slowdown is a seek called here:
>> >>>     boolean mayHaveData =
>> >>>         (nextJoinedKv != null
>> >>>             && nextJoinedKv.matchingRow(currentRow, offset, length))
>> >>>         || (this.joinedHeap.seek(
>> >>>                 KeyValue.createFirstOnRow(currentRow, offset, length))
>> >>>             && joinedHeap.peek() != null
>> >>>             && joinedHeap.peek().matchingRow(currentRow, offset, length));
>> >>>
>> >>> Looking at the code, this is needed because the joinedHeap can fall
>> >>> behind, and hence we have to catch it up.
>> >>> The key observation, though, is that the joined heap can only ever be
>> >>> behind, and hence we do not need a seek, but only a reseek.
>> >>>
>> >>> Deploying a RegionServer with the seek replaced with reseek we see an
>> >>> improvement in *all* cases.
>> >>>
>> >>> I'll file a jira with a fix later.
>> >>>
>> >>> -- Lars
>> >>>
>> >>>
>> >>>
>> >>> ________________________________
>> >>>  From: James Taylor <[email protected]>
>> >>> To: [email protected]
>> >>> Sent: Monday, April 8, 2013 6:53 PM
>> >>> Subject: Re: Essential column family performance
>> >>>
>> >>> Good idea, Sergey. We'll rerun with larger non essential column family
>> >>> values and see if there's a crossover point. One other difference for us
>> >>> is that we're using FAST_DIFF encoding. We'll try with no encoding too.
>> >>> Our table has 20 million rows across four region servers.
>> >>>
>> >>> Regarding the parallelization we do, we run multiple scans in parallel
>> >>> instead of one single scan over the table. We use the region boundaries
>> >>> of the table to divide up the work evenly, adding a start/stop key for
>> >>> each scan that corresponds to the region boundaries. Our client then
>> >>> does a final merge/aggregation step (i.e. adding up the count it gets
>> >>> back from the scan for each region).
>> >>>
>> >>> On 04/08/2013 01:34 PM, Sergey Shelukhin wrote:
>> >>> > IntegrationTestLazyCfLoading uses randomly distributed keys with the
>> >>> > following condition for filtering:
>> >>> > 1 == (Long.parseLong(Bytes.toString(rowKey, 0, 4), 16) & 1); where rowKey
>> >>> > is the hex string of the MD5 key.
>> >>> > Then, there are 2 "lazy" CFs, each of which has a value of 4-64k.
>> >>> > This test also showed significant improvement IIRC, so random distribution
>> >>> > and a high percentage of values selected should not be a problem as such.
>> >>> >
>> >>> > My hunch would be that the additional cost of seeks/merging the results
>> >>> > from two CFs outweighs the benefit of lazy loading on such small values
>> >>> > for the "lazy" CF with lots of data selected. This feature definitely makes
>> >>> > no sense if you are selecting all values, because then extra work is being
>> >>> > done for no benefit (everything is read anyway).
>> >>> > So the use cases would be larger "lazy" CFs and/or a low percentage of
>> >>> > values selected.
>> >>> >
>> >>> > Can you try to increase the 2nd CF values' size and rerun the test?
>> >>> >
>> >>> >
>> >>> > On Mon, Apr 8, 2013 at 10:38 AM, James Taylor <[email protected]> wrote:
>> >>> >
>> >>> >> In the TestJoinedScanners.java, is the 40% randomly distributed or
>> >>> >> sequential?
>> >>> >>
>> >>> >> In our test, the % is randomly distributed. Also, our custom filter does
>> >>> >> the same thing that SingleColumnValueFilter does.  On the client-side, we'd
>> >>> >> execute the query in parallel, through multiple scans along the region
>> >>> >> boundaries. Would that have a negative impact on performance for this
>> >>> >> "essential column family" feature?
>> >>> >>
>> >>> >> Thanks,
>> >>> >>
>> >>> >>      James
>> >>> >>
>> >>> >>
>> >>> >> On 04/08/2013 10:10 AM, Anoop John wrote:
>> >>> >>
>> >>> >>> Agree here. The effectiveness depends on what % of data satisfies the
>> >>> >>> condition and how it is distributed across HFile blocks. We will get a
>> >>> >>> performance gain when we are able to skip some HFile blocks (from
>> >>> >>> non essential CFs). Can you test with a different HFile block size
>> >>> >>> (lower value)?
>> >>> >>>
>> >>> >>> -Anoop-
>> >>> >>>
>> >>> >>>
>> >>> >>> On Mon, Apr 8, 2013 at 8:19 PM, Ted Yu <[email protected]> wrote:
>> >>> >>>
>> >>> >>>> I made the following change in TestJoinedScanners.java:
>> >>> >>>> -      int flag_percent = 1;
>> >>> >>>> +      int flag_percent = 40;
>> >>> >>>>
>> >>> >>>> The test took longer but still favors joined scanner.
>> >>> >>>> I got some new results:
>> >>> >>>>
>> >>> >>>> 2013-04-08 07:46:06,959 INFO  [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Slow scanner finished in 7.424388 seconds, got 2050 rows
>> >>> >>>> ...
>> >>> >>>> 2013-04-08 07:46:12,010 INFO  [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Joined scanner finished in 5.05063 seconds, got 2050 rows
>> >>> >>>>
>> >>> >>>> 2013-04-08 07:46:18,358 INFO  [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Slow scanner finished in 6.348517 seconds, got 2050 rows
>> >>> >>>> ...
>> >>> >>>> 2013-04-08 07:46:22,946 INFO  [main] regionserver.TestJoinedScanners(157):
>> >>> >>>> Joined scanner finished in 4.587545 seconds, got 2050 rows
>> >>> >>>>
>> >>> >>>> Looks like effectiveness of joined scanner is affected by distribution of
>> >>> >>>> data.
>> >>> >>>>
>> >>> >>>> Cheers
>> >>> >>>>
>> >>> >>>> On Sun, Apr 7, 2013 at 8:52 PM, lars hofhansl <[email protected]> wrote:
>> >>> >>>>
>> >>> >>>>> Looking at the joined scanner test code, it sets it up such that 1% of
>> >>> >>>>> the rows match, which would somewhat be in line with James' results.
>> >>> >>>>>
>> >>> >>>>> In my own testing a while ago I found a 100% improvement with 0% match.
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>> -- Lars
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>>
>> >>> >>>>> ________________________________
>> >>> >>>>>  From: Ted Yu <[email protected]>
>> >>> >>>>> To: [email protected]
>> >>> >>>>> Sent: Sunday, April 7, 2013 4:13 PM
>> >>> >>>>> Subject: Re: Essential column family performance
>> >>> >>>>>
>> >>> >>>>> I have attached 5416-TestJoinedScanners-0.94.txt to HBASE-5416 for your
>> >>> >>>>> reference.
>> >>> >>>>>
>> >>> >>>>> On my MacBook, I got the following results from the test:
>> >>> >>>>>
>> >>> >>>>> 2013-04-07 16:08:17,474 INFO  [main] regionserver.TestJoinedScanners(157):
>> >>> >>>>> Slow scanner finished in 7.973822 seconds, got 100 rows
>> >>> >>>>> ...
>> >>> >>>>> 2013-04-07 16:08:17,946 INFO  [main] regionserver.TestJoinedScanners(157):
>> >>> >>>>> Joined scanner finished in 0.47235 seconds, got 100 rows
>> >>> >>>>>
>> >>> >>>>> Cheers
>> >>> >>>>>
>> >>> >>>>> On Sun, Apr 7, 2013 at 4:03 PM, Ted Yu <[email protected]> wrote:
>> >>> >>>>>
>> >>> >>>>>> Looking at
>> >>> >>>>>> https://issues.apache.org/jira/secure/attachment/12564340/5416-0.94-v3.txt,
>> >>> >>>>>> I found that it didn't contain TestJoinedScanners which shows
>> >>> >>>>>> difference in scanner performance:
>> >>> >>>>>>
>> >>> >>>>>>      LOG.info((slow ? "Slow" : "Joined") + " scanner finished in "
>> >>> >>>>>>          + Double.toString(timeSec)
>> >>> >>>>>>          + " seconds, got " + Long.toString(rows_count/2) + " rows");
>> >>> >>>>>>
>> >>> >>>>>> The test uses SingleColumnValueFilter:
>> >>> >>>>>>
>> >>> >>>>>>      SingleColumnValueFilter filter = new SingleColumnValueFilter(
>> >>> >>>>>>          cf_essential, col_name, CompareFilter.CompareOp.EQUAL,
>> >>> >>>>>>          flag_yes);
>> >>> >>>>>>
>> >>> >>>>>> It is possible that the custom filter you were using would exhibit
>> >>> >>>>>> different access pattern compared to SingleColumnValueFilter. e.g. does
>> >>> >>>>>> your filter utilize hint ?
>> >>> >>>>>> It would be easier for me and other people to reproduce the issue you
>> >>> >>>>>> experienced if you put your scenario in some test similar to
>> >>> >>>>>> TestJoinedScanners.
>> >>> >>>>>>
>> >>> >>>>>> Will take a closer look at the code Monday.
>> >>> >>>>>>
>> >>> >>>>>> Cheers
>> >>> >>>>>>
>> >>> >>>>>> On Sun, Apr 7, 2013 at 11:37 AM, James Taylor <[email protected]>
>> >>> >>>>>> wrote:
>> >>> >>>>>>
>> >>> >>>>>>> Yes, on 0.94.6. We have our own custom filter derived from FilterBase, so
>> >>> >>>>>>> filterIfMissing isn't the issue - the results of the scan are correct.
>> >>> >>>>>>>
>> >>> >>>>>>> I can see that if the essential column family has more data compared to
>> >>> >>>>>>> the non essential column family that the results would eventually even
>> >>> >>>>>>> out. I was hoping to always be able to enable the essential column family
>> >>> >>>>>>> feature. Is there an inherent reason why performance would degrade like
>> >>> >>>>>>> this? Does it boil down to a single sequential scan versus many seeks?
>> >>> >>>>>>>
>> >>> >>>>>>> Thanks,
>> >>> >>>>>>>
>> >>> >>>>>>> James
>> >>> >>>>>>>
>> >>> >>>>>>>
>> >>> >>>>>>> On 04/07/2013 07:44 AM, Ted Yu wrote:
>> >>> >>>>>>>
>> >>> >>>>>>>> James:
>> >>> >>>>>>>> Your test was based on 0.94.6.1, right ?
>> >>> >>>>>>>>
>> >>> >>>>>>>> What Filter were you using ?
>> >>> >>>>>>>>
>> >>> >>>>>>>> If you used SingleColumnValueFilter, have you seen my comment here ?
>> >>> >>>>>>>> https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541229
>> >>> >>>>>>>>
>> >>> >>>>>>>> BTW the use case Max Lapan tried to address has non essential column
>> >>> >>>>>>>> family carrying considerably more data compared to essential column
>> >>> >>>>>>>> family.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Cheers
>> >>> >>>>>>>>
>> >>> >>>>>>>>
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Sat, Apr 6, 2013 at 11:05 PM, James Taylor <[email protected]>
>> >>> >>>>>>>> wrote:
>> >>> >>>>>>>>
>> >>> >>>>>>>>> Hello,
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> We're doing some performance testing of the essential column family
>> >>> >>>>>>>>> feature, and we're seeing some performance degradation when comparing
>> >>> >>>>>>>>> with and without the feature enabled:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>>                           Performance of scan relative
>> >>> >>>>>>>>> % of rows selected        to not enabling the feature
>> >>> >>>>>>>>> ------------------        ----------------------------
>> >>> >>>>>>>>> 100%                      1.0x
>> >>> >>>>>>>>>  80%                      2.0x
>> >>> >>>>>>>>>  60%                      2.3x
>> >>> >>>>>>>>>  40%                      2.2x
>> >>> >>>>>>>>>  20%                      1.5x
>> >>> >>>>>>>>>  10%                      1.0x
>> >>> >>>>>>>>>   5%                      0.67x
>> >>> >>>>>>>>>   0%                      0.30x
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> In our scenario, we have two column families. The key value from the
>> >>> >>>>>>>>> essential column family is used in the filter, while the key value from
>> >>> >>>>>>>>> the other, non essential column family is returned by the scan. Each row
>> >>> >>>>>>>>> contains values for both key values, with the values being relatively
>> >>> >>>>>>>>> narrow (less than 50 bytes). In this scenario, the only time we're seeing
>> >>> >>>>>>>>> a performance gain is when less than 10% of the rows are selected.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Is this a reasonable test? Has anyone else measured this?
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Thanks,
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> James
>> >>> >>>>>>>>>
