Re: Essential column family performance

Ted Yu Wed, 10 Apr 2013 16:05:35 -0700

Once 0.94.7 is released and more users try this feature out, we surely can
consider turning it on (in 0.94.8)


Cheers

On Wed, Apr 10, 2013 at 4:02 PM, lars hofhansl <[email protected]> wrote:

> Fix is committed and will be in 0.94.7.
>
> I guess we should have a discussion at some point on whether we should
> always switch this feature on (it is disabled by default), as we now can no
> longer find any case where enabling it is slower.
>
> -- Lars
>
>
>
> ________________________________
>  From: Anoop Sam John <[email protected]>
> To: "[email protected]" <[email protected]>; lars hofhansl <
> [email protected]>
> Sent: Tuesday, April 9, 2013 10:30 PM
> Subject: RE: Essential column family performance
>
> Good finding Lars & team  :)
>
> -Anoop-
> ________________________________________
> From: lars hofhansl [[email protected]]
> Sent: Wednesday, April 10, 2013 9:46 AM
> To: [email protected]
> Subject: Re: Essential column family performance
>
> That part did not show up in the profiling session.
> It was just the unnecessary seek that slowed it all down.
>
> -- Lars
>
>
>
> ________________________________
> From: Ted Yu <[email protected]>
> To: [email protected]
> Sent: Tuesday, April 9, 2013 9:03 PM
> Subject: Re: Essential column family performance
>
> Looking at populateFromJoinedHeap():
>
>       KeyValue kv = populateResult(results, this.joinedHeap, limit,
>           joinedContinuationRow.getBuffer(), joinedContinuationRow.getRowOffset(),
>           joinedContinuationRow.getRowLength(), metric);
>
> ...
>
>       Collections.sort(results, comparator);
>
> Arrays.mergeSort() is used in the Collections.sort() call.
>
> There seems to be some optimization we can do above: we can record the size
> of results before calling populateResult(). Upon return, we can merge the
> two segments without resorting to Arrays.mergeSort() which is recursive.
>
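
The merge idea above can be sketched roughly as follows. This is a toy model, not HBase code: a plain List<Integer> stands in for the KeyValue results, and the class and method names (SegmentMerge, mergeSortedSegments) are made up for illustration. It assumes both segments are already sorted by the same comparator, which is the premise of the suggestion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: "results" stands in for the List<KeyValue> in the thread.
final class SegmentMerge {
    private SegmentMerge() {}

    /**
     * Merges list[0, mid) and list[mid, size), each already sorted by cmp,
     * so that the whole list ends up sorted. On ties the element from the
     * first segment comes first.
     */
    static <T> void mergeSortedSegments(List<T> list, int mid, Comparator<? super T> cmp) {
        int n = list.size();
        if (mid <= 0 || mid >= n) {
            return; // one segment is empty, nothing to merge
        }
        // Fast path: the two segments are already in order end to end.
        if (cmp.compare(list.get(mid - 1), list.get(mid)) <= 0) {
            return;
        }
        List<T> merged = new ArrayList<T>(n);
        int i = 0;
        int j = mid;
        while (i < mid && j < n) {
            if (cmp.compare(list.get(i), list.get(j)) <= 0) {
                merged.add(list.get(i++));
            } else {
                merged.add(list.get(j++));
            }
        }
        while (i < mid) merged.add(list.get(i++));
        while (j < n) merged.add(list.get(j++));
        for (int k = 0; k < n; k++) {
            list.set(k, merged.get(k)); // copy back in place
        }
    }

    public static void main(String[] args) {
        List<Integer> results = new ArrayList<Integer>(Arrays.asList(1, 4, 9));
        int sizeBefore = results.size();          // recorded before appending
        results.addAll(Arrays.asList(2, 3, 10));  // stands in for populateResult()
        mergeSortedSegments(results, sizeBefore, Comparator.<Integer>naturalOrder());
        System.out.println(results);
    }
}
```

The single linear pass replaces the general sort, and the fast path skips all work when the appended segment already sorts after the existing one.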
>
> On Tue, Apr 9, 2013 at 6:21 PM, Ted Yu <[email protected]> wrote:
>
> > bq. with only 10000 rows that would all fit in the memstore.
> >
> > This aspect should be enhanced in the test.
> >
> > Cheers
> >
> > On Tue, Apr 9, 2013 at 6:17 PM, Lars Hofhansl <[email protected]>
> wrote:
> >
> >> Also the unittest tests with only 10000 rows that would all fit in the
> >> memstore. Seek vs reseek should make little difference for the memstore.
> >>
> >> We tested with 1m and 10m rows, and flushed the memstore  and compacted
> >> the store.
> >>
> >> Will do some more verification later tonight.
> >>
> >> -- Lars
> >>
> >>
> >> Lars H <[email protected]> wrote:
> >>
> >> >Your slow scanner performance seems to vary as well. How come? Slow is
> >> with the feature off.
> >> >
> >> >I don't see how reseek can be slower than seek in any scenario.
> >> >
> >> >-- Lars
> >> >
> >> >Ted Yu <[email protected]> wrote:
> >> >
> >> >>I tried using reseek() as suggested, along with my patch from HBASE-8306
> >> >>(30% selection rate, random distribution and FAST_DIFF encoding on both
> >> >>column families).
> >> >>I got uneven results:
> >> >>
> >> >>2013-04-09 16:59:01,324 INFO  [main] regionserver.TestJoinedScanners(167):
> >> >>Slow scanner finished in 7.529083 seconds, got 1546 rows
> >> >>
> >> >>2013-04-09 16:59:06,760 INFO  [main] regionserver.TestJoinedScanners(167):
> >> >>Joined scanner finished in 5.43579 seconds, got 1546 rows
> >> >>...
> >> >>2013-04-09 16:59:12,711 INFO  [main] regionserver.TestJoinedScanners(167):
> >> >>Slow scanner finished in 5.95016 seconds, got 1546 rows
> >> >>
> >> >>2013-04-09 16:59:20,240 INFO  [main] regionserver.TestJoinedScanners(167):
> >> >>Joined scanner finished in 7.529044 seconds, got 1546 rows
> >> >>
> >> >>FYI
> >> >>
> >> >>On Tue, Apr 9, 2013 at 4:47 PM, lars hofhansl <[email protected]>
> wrote:
> >> >>
> >> >>> We did some tests here.
> >> >>> I ran this through the profiler against a local RegionServer and found
> >> >>> the part that causes the slowdown is a seek called here:
> >> >>>     boolean mayHaveData =
> >> >>>         (nextJoinedKv != null && nextJoinedKv.matchingRow(currentRow, offset, length))
> >> >>>         || (this.joinedHeap.seek(KeyValue.createFirstOnRow(currentRow, offset, length))
> >> >>>             && joinedHeap.peek() != null
> >> >>>             && joinedHeap.peek().matchingRow(currentRow, offset, length));
> >> >>>
> >> >>> Looking at the code, this is needed because the joinedHeap can fall
> >> >>> behind, and hence we have to catch it up.
> >> >>> The key observation, though, is that the joined heap can only ever be
> >> >>> behind, and hence we do not need a seek, but only a reseek.
> >> >>>
> >> >>> Deploying a RegionServer with the seek replaced with reseek we see an
> >> >>> improvement in *all* cases.
> >> >>>
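
To make the seek/reseek distinction concrete, here is a minimal sketch. It is a toy model and not the HBase scanner API: ForwardCursor and its methods are hypothetical names, and a sorted long[] stands in for the joined heap. It only illustrates the precondition (the target is never before the current position), not the real cost difference between the two calls.

```java
// Toy model only: a cursor over a sorted long[] standing in for a scanner heap.
final class ForwardCursor {
    private final long[] keys; // sorted ascending
    private int pos = 0;       // index of the current key; keys.length when exhausted

    ForwardCursor(long[] sortedKeys) {
        this.keys = sortedKeys;
    }

    /** General seek: searches the whole array, so it can move in either direction. */
    boolean seek(long target) {
        int lo = 0;
        int hi = keys.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (keys[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        pos = lo; // first key >= target
        return pos < keys.length;
    }

    /** Forward-only reseek: correct only if target is at or after the current key. */
    boolean reseek(long target) {
        while (pos < keys.length && keys[pos] < target) {
            pos++; // step forward from where we already are
        }
        return pos < keys.length;
    }

    /** Current key, or null when the cursor is exhausted. */
    Long peek() {
        return pos < keys.length ? keys[pos] : null;
    }
}
```

As long as the targets arrive in ascending order, which is the "can only ever be behind" observation, both calls land on the same key, so the forward-only variant is a safe substitute.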
> >> >>> I'll file a jira with a fix later.
> >> >>>
> >> >>> -- Lars
> >> >>>
> >> >>>
> >> >>>
> >> >>> ________________________________
> >> >>>  From: James Taylor <[email protected]>
> >> >>> To: [email protected]
> >> >>> Sent: Monday, April 8, 2013 6:53 PM
> >> >>> Subject: Re: Essential column family performance
> >> >>>
> >> >>> Good idea, Sergey. We'll rerun with larger non essential column family
> >> >>> values and see if there's a crossover point. One other difference for us
> >> >>> is that we're using FAST_DIFF encoding. We'll try with no encoding too.
> >> >>> Our table has 20 million rows across four region servers.
> >> >>>
> >> >>> Regarding the parallelization we do, we run multiple scans in parallel
> >> >>> instead of one single scan over the table. We use the region boundaries
> >> >>> of the table to divide up the work evenly, adding a start/stop key for
> >> >>> each scan that corresponds to the region boundaries. Our client then
> >> >>> does a final merge/aggregation step (i.e. adding up the count it gets
> >> >>> back from the scan for each region).
> >> >>>
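
The parallelization described above might look roughly like this. It is only a sketch under stated assumptions: a TreeMap stands in for the table, the boundary list stands in for region start keys, and no HBase client API is used. ParallelRangeCount and countInParallel are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: an in-memory sorted map replaces the table and its regions.
final class ParallelRangeCount {
    private ParallelRangeCount() {}

    /**
     * Splits the key space at the given boundaries (ascending, standing in for
     * region start keys), counts matching rows per range on its own thread,
     * then sums the partial counts on the client side.
     */
    static long countInParallel(NavigableMap<String, Integer> table,
                                List<String> boundaries, int flagToMatch) {
        ExecutorService pool = Executors.newFixedThreadPool(boundaries.size() + 1);
        try {
            List<Future<Long>> partials = new ArrayList<Future<Long>>();
            String start = null; // null means "from the first row"
            for (int i = 0; i <= boundaries.size(); i++) {
                final String from = start;
                final String to = i < boundaries.size() ? boundaries.get(i) : null; // null = to the end
                Callable<Long> task = () -> {
                    NavigableMap<String, Integer> range = table;
                    if (from != null) range = range.tailMap(from, true); // start key inclusive
                    if (to != null) range = range.headMap(to, false);    // stop key exclusive
                    long n = 0;
                    for (int flag : range.values()) {
                        if (flag == flagToMatch) n++;
                    }
                    return n;
                };
                partials.add(pool.submit(task));
                start = to;
            }
            long total = 0;
            for (Future<Long> f : partials) {
                total += f.get(); // final merge/aggregation step
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("range count failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Each range uses an inclusive start key and an exclusive stop key, so adjacent ranges never count the same row twice.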
> >> >>> On 04/08/2013 01:34 PM, Sergey Shelukhin wrote:
> >> >>> > IntegrationTestLazyCfLoading uses randomly distributed keys with the
> >> >>> > following condition for filtering:
> >> >>> > 1 == (Long.parseLong(Bytes.toString(rowKey, 0, 4), 16) & 1); where rowKey
> >> >>> > is the hex string of the MD5 key.
> >> >>> > Then, there are 2 "lazy" CFs, each of which has a value of 4-64k.
> >> >>> > This test also showed significant improvement IIRC, so random distribution
> >> >>> > and a high percentage of values selected should not be a problem as such.
> >> >>> >
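
For reference, the filtering condition quoted above can be tried in isolation. This is a sketch that swaps HBase's Bytes helper and byte[] row keys for plain Strings; RowKeyParity, md5Hex and isSelected are hypothetical names, not part of IntegrationTestLazyCfLoading.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Toy model only: plain Strings replace the byte[] row keys used in the real test.
final class RowKeyParity {
    private RowKeyParity() {}

    /** Hex-encoded MD5 of the input, used here as a stand-in row key. */
    static String md5Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** True when the first four hex digits of the key form an odd number. */
    static boolean isSelected(String hexRowKey) {
        return 1 == (Long.parseLong(hexRowKey.substring(0, 4), 16) & 1);
    }
}
```

Because the predicate only looks at the lowest bit of the fourth hex digit, roughly half of uniformly hashed keys should pass, and the selected rows are scattered rather than contiguous.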
> >> >>> > My hunch would be that the additional cost of seeks/merging the results
> >> >>> > from two CFs outweighs the benefit of lazy loading on such small values
> >> >>> > for the "lazy" CF with lots of data selected. This feature definitely
> >> >>> > makes no sense if you are selecting all values, because then extra work
> >> >>> > is being done for no benefit (everything is read anyway).
> >> >>> > So the use cases would be larger "lazy" CFs and/or a low percentage of
> >> >>> > values selected.
> >> >>> >
> >> >>> > Can you try to increase the 2nd CF values' size and rerun the test?
> >> >>> >
> >> >>> >
> >> >>> > On Mon, Apr 8, 2013 at 10:38 AM, James Taylor <
> >> [email protected]
> >> >>> >wrote:
> >> >>> >
> >> >>> >> In the TestJoinedScanners.java, is the 40% randomly distributed or
> >> >>> >> sequential?
> >> >>> >>
> >> >>> >> In our test, the % is randomly distributed. Also, our custom filter does
> >> >>> >> the same thing that SingleColumnValueFilter does. On the client-side, we'd
> >> >>> >> execute the query in parallel, through multiple scans along the region
> >> >>> >> boundaries. Would that have a negative impact on performance for this
> >> >>> >> "essential column family" feature?
> >> >>> >>
> >> >>> >> Thanks,
> >> >>> >>
> >> >>> >>      James
> >> >>> >>
> >> >>> >>
> >> >>> >> On 04/08/2013 10:10 AM, Anoop John wrote:
> >> >>> >>
> >> >>> >>> Agree here. The effectiveness depends on what % of data satisfies the
> >> >>> >>> condition and how it is distributed across HFile blocks. We will get a
> >> >>> >>> performance gain when we are able to skip some HFile blocks (from
> >> >>> >>> non essential CFs). Can you test with a different (lower) HFile block
> >> >>> >>> size?
> >> >>> >>>
> >> >>> >>> -Anoop-
> >> >>> >>>
> >> >>> >>>
> >> >>> >>> On Mon, Apr 8, 2013 at 8:19 PM, Ted Yu <[email protected]>
> >> wrote:
> >> >>> >>>
> >> >>> >>>   I made the following change in TestJoinedScanners.java:
> >> >>> >>>> -      int flag_percent = 1;
> >> >>> >>>> +      int flag_percent = 40;
> >> >>> >>>>
> >> >>> >>>> The test took longer but still favors joined scanner.
> >> >>> >>>> I got some new results:
> >> >>> >>>>
> >> >>> >>>> 2013-04-08 07:46:06,959 INFO  [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Slow scanner finished in 7.424388 seconds, got 2050 rows
> >> >>> >>>> ...
> >> >>> >>>> 2013-04-08 07:46:12,010 INFO  [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Joined scanner finished in 5.05063 seconds, got 2050 rows
> >> >>> >>>>
> >> >>> >>>> 2013-04-08 07:46:18,358 INFO  [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Slow scanner finished in 6.348517 seconds, got 2050 rows
> >> >>> >>>> ...
> >> >>> >>>> 2013-04-08 07:46:22,946 INFO  [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>> Joined scanner finished in 4.587545 seconds, got 2050 rows
> >> >>> >>>>
> >> >>> >>>> Looks like effectiveness of joined scanner is affected by distribution
> >> >>> >>>> of data.
> >> >>> >>>>
> >> >>> >>>> Cheers
> >> >>> >>>>
> >> >>> >>>> On Sun, Apr 7, 2013 at 8:52 PM, lars hofhansl <
> [email protected]>
> >> >>> wrote:
> >> >>> >>>>
> >> >>> >>>>> Looking at the joined scanner test code, it sets it up such that 1% of
> >> >>> >>>>> the rows match, which would somewhat be in line with James' results.
> >> >>> >>>>>
> >> >>> >>>>> In my own testing a while ago I found a 100% improvement with 0% match.
> >> >>> >>>>>
> >> >>> >>>>>
> >> >>> >>>>> -- Lars
> >> >>> >>>>>
> >> >>> >>>>>
> >> >>> >>>>>
> >> >>> >>>>> ________________________________
> >> >>> >>>>>    From: Ted Yu <[email protected]>
> >> >>> >>>>> To: [email protected]
> >> >>> >>>>> Sent: Sunday, April 7, 2013 4:13 PM
> >> >>> >>>>> Subject: Re: Essential column family performance
> >> >>> >>>>>
> >> >>> >>>>> I have attached 5416-TestJoinedScanners-0.94.txt to HBASE-5416 for your
> >> >>> >>>>> reference.
> >> >>> >>>>>
> >> >>> >>>>> On my MacBook, I got the following results from the test:
> >> >>> >>>>>
> >> >>> >>>>> 2013-04-07 16:08:17,474 INFO  [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>>> Slow scanner finished in 7.973822 seconds, got 100 rows
> >> >>> >>>>> ...
> >> >>> >>>>> 2013-04-07 16:08:17,946 INFO  [main] regionserver.TestJoinedScanners(157):
> >> >>> >>>>> Joined scanner finished in 0.47235 seconds, got 100 rows
> >> >>> >>>>>
> >> >>> >>>>> Cheers
> >> >>> >>>>>
> >> >>> >>>>> On Sun, Apr 7, 2013 at 4:03 PM, Ted Yu <[email protected]>
> >> wrote:
> >> >>> >>>>>
> >> >>> >>>>>> Looking at
> >> >>> >>>>>> https://issues.apache.org/jira/secure/attachment/12564340/5416-0.94-v3.txt
> >> >>> >>>>>> I found that it didn't contain TestJoinedScanners which shows
> >> >>> >>>>>> difference in scanner performance:
> >> >>> >>>>>>
> >> >>> >>>>>>      LOG.info((slow ? "Slow" : "Joined") + " scanner finished in "
> >> >>> >>>>>>          + Double.toString(timeSec)
> >> >>> >>>>>>          + " seconds, got " + Long.toString(rows_count/2) + " rows");
> >> >>> >>>>>>
> >> >>> >>>>>> The test uses SingleColumnValueFilter:
> >> >>> >>>>>>
> >> >>> >>>>>>       SingleColumnValueFilter filter = new SingleColumnValueFilter(
> >> >>> >>>>>>           cf_essential, col_name, CompareFilter.CompareOp.EQUAL,
> >> >>> >>>>>>           flag_yes);
> >> >>> >>>>>>
> >> >>> >>>>>> It is possible that the custom filter you were using would exhibit
> >> >>> >>>>>> different access pattern compared to SingleColumnValueFilter. e.g. does
> >> >>> >>>>>> your filter utilize hint ?
> >> >>> >>>>>> It would be easier for me and other people to reproduce the issue you
> >> >>> >>>>>> experienced if you put your scenario in some test similar to
> >> >>> >>>>>> TestJoinedScanners.
> >> >>> >>>>>>
> >> >>> >>>>>> Will take a closer look at the code Monday.
> >> >>> >>>>>>
> >> >>> >>>>>> Cheers
> >> >>> >>>>>>
> >> >>> >>>>>> On Sun, Apr 7, 2013 at 11:37 AM, James Taylor <
> >> >>> [email protected]
> >> >>> >>>>>> wrote:
> >> >>> >>>>>>
> >> >>> >>>>>>> Yes, on 0.94.6. We have our own custom filter derived from FilterBase,
> >> >>> >>>>>>> so filterIfMissing isn't the issue - the results of the scan are
> >> >>> >>>>>>> correct.
> >> >>> >>>>>>>
> >> >>> >>>>>>> I can see that if the essential column family has more data compared
> >> >>> >>>>>>> to the non essential column family that the results would eventually
> >> >>> >>>>>>> even out. I was hoping to always be able to enable the essential column
> >> >>> >>>>>>> family feature. Is there an inherent reason why performance would
> >> >>> >>>>>>> degrade like this? Does it boil down to a single sequential scan versus
> >> >>> >>>>>>> many seeks?
> >> >>> >>>>>>>
> >> >>> >>>>>>> Thanks,
> >> >>> >>>>>>>
> >> >>> >>>>>>> James
> >> >>> >>>>>>>
> >> >>> >>>>>>>
> >> >>> >>>>>>> On 04/07/2013 07:44 AM, Ted Yu wrote:
> >> >>> >>>>>>>
> >> >>> >>>>>>>> James:
> >> >>> >>>>>>>> Your test was based on 0.94.6.1, right ?
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> What Filter were you using ?
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> If you used SingleColumnValueFilter, have you seen my comment here ?
> >> >>> >>>>>>>> https://issues.apache.org/jira/browse/HBASE-5416?focusedCommentId=13541229&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13541229
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> BTW the use case Max Lapan tried to address has non essential column
> >> >>> >>>>>>>> family carrying considerably more data compared to essential column
> >> >>> >>>>>>>> family.
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> Cheers
> >> >>> >>>>>>>>
> >> >>> >>>>>>>>
> >> >>> >>>>>>>>
> >> >>> >>>>>>>> On Sat, Apr 6, 2013 at 11:05 PM, James Taylor <[email protected]> wrote:
> >> >>> >>>>>>>>
> >> >>> >>>>>>>>> Hello,
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> We're doing some performance testing of the essential column family
> >> >>> >>>>>>>>> feature, and we're seeing some performance degradation when comparing
> >> >>> >>>>>>>>> with and without the feature enabled:
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>                          Performance of scan relative
> >> >>> >>>>>>>>> % of rows selected       to not enabling the feature
> >> >>> >>>>>>>>> ------------------       ----------------------------
> >> >>> >>>>>>>>>       100%                        1.0x
> >> >>> >>>>>>>>>        80%                        2.0x
> >> >>> >>>>>>>>>        60%                        2.3x
> >> >>> >>>>>>>>>        40%                        2.2x
> >> >>> >>>>>>>>>        20%                        1.5x
> >> >>> >>>>>>>>>        10%                        1.0x
> >> >>> >>>>>>>>>         5%                        0.67x
> >> >>> >>>>>>>>>         0%                        0.30x
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> In our scenario, we have two column families. The key value from the
> >> >>> >>>>>>>>> essential column family is used in the filter, while the key value
> >> >>> >>>>>>>>> from the other, non essential column family is returned by the scan.
> >> >>> >>>>>>>>> Each row contains values for both key values, with the values being
> >> >>> >>>>>>>>> relatively narrow (less than 50 bytes). In this scenario, the only
> >> >>> >>>>>>>>> time we're seeing a performance gain is when less than 10% of the
> >> >>> >>>>>>>>> rows are selected.
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> Is this a reasonable test? Has anyone else measured this?
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> Thanks,
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>> James
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>> >>>>>>>>>
> >> >>>
> >>
> >
> >
>
