Youngwoo - if I revert PHOENIX-1973, would you be able to build and deploy the latest on the master branch to confirm whether or not performance is back to what it was in 4.6?
Thanks,
James

On Fri, Feb 26, 2016 at 6:00 PM, James Taylor <[email protected]> wrote:

> Ok, I'll roll back PHOENIX-1973, but are you saying this isn't the root cause of the perf regression?
>
> On Friday, February 26, 2016, Sergey Soldatov <[email protected]> wrote:
>
>> James,
>> Well, it's not, but I have another problem with big loads when the input gets split across several mappers. The optimization that replaced the table name in TableRowKeyPair causes problems with output name generation, so the index name appears in the path instead of the table name. So let's just roll back the patch.
>>
>> Thanks,
>> Sergey
>>
>> On Fri, Feb 26, 2016 at 2:37 PM, James Taylor <[email protected]> wrote:
>>
>>> Thanks, Sergey. Were you able to confirm whether or not PHOENIX-1973 was the root cause of the regression?
>>>
>>> On Fri, Feb 26, 2016 at 11:55 AM, Sergey Soldatov <[email protected]> wrote:
>>>
>>>> Oops. Small update. We can revert PHOENIX-1973 (bulk load improvement), not PHOENIX-2649 (TableRowKeyPair comparator problem).
>>>>
>>>> On Fri, Feb 26, 2016 at 10:52 AM, Sergey Soldatov <[email protected]> wrote:
>>>>
>>>>> Well, that's how MR bulk load works. Mappers read all the rows from the input file and create the corresponding <rowkey, column value> pairs. The MR engine sorts those pairs by rowkey, the reducer sorts them by value and writes them to the HFile, and after that HBase bulk load loads the HFile into HBase. PHOENIX-2649 just reduces the amount of data sent between the mappers and the reducer: before, it was N rows * K columns; afterwards it is only N. Because of the bug I mentioned, the phase where the data is sorted by rowkey didn't work at all (the first reason 4.6 looked faster), all values were written with a single rowkey and arrived at the reducer all at once (the second reason), and during the HBase bulk load there was no reason to split because everything shared the same rowkey (the third reason).
>>>>> But of course we can revert PHOENIX-2649 and see whether it helps.
>>>>>
>>>>> Thanks,
>>>>> Sergey
>>>>>
>>>>> On Fri, Feb 26, 2016 at 9:02 AM, 김영우 (Youngwoo Kim) <[email protected]> wrote:
>>>>>
>>>>>> Exactly! Gabriel describes just what I observed.
>>>>>>
>>>>>> Many map and reduce tasks are launched, but only one or two tasks are still running at the end of the job. It looks like the workload is skewed onto a particular task.
>>>>>>
>>>>>> Thanks,
>>>>>> Youngwoo
>>>>>>
>>>>>> On Friday, February 26, 2016, Gabriel Reid <[email protected]> wrote:
>>>>>>
>>>>>>> I just did a quick test run on this, and it looks to me like something is definitely wrong.
>>>>>>>
>>>>>>> I ran a simple ingest test for a table with 5 regions, and it appears that only a single HFile is being created. This HFile then needs to be recursively split during the step of handing HFiles over to the region servers (hence the "xxx no longer fits inside a single region. Splitting..." log messages).
>>>>>>>
>>>>>>> This implies that only a single reducer is actually doing any processing, which would certainly account for a performance degradation. My assumption is that the underlying issue is in the partitioner (or the data being passed to the partitioner). I don't know if this was introduced as part of PHOENIX-2649 or not.
>>>>>>>
>>>>>>> Sergey, are you (or someone else) able to take a look at this? Unfortunately, I don't think there's any way I can get a serious look at this any more today.
>>>>>>>
>>>>>>> - Gabriel
>>>>>>>
>>>>>>> On Fri, Feb 26, 2016 at 11:21 AM, Sergey Soldatov <[email protected]> wrote:
>>>>>>>
>>>>>>>> I see. We will try to reproduce it. The degradation is possible because 4.6 had a problem described in PHOENIX-2649. In short, the comparator for rowkeys was working incorrectly and reported that all rowkeys were the same. If the input files are relatively small and the reducer has enough memory, all records are written in one step with the same single rowkey. That can be the reason why it was faster and there were no splits.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sergey
>>>>>>>>
>>>>>>>> On Fri, Feb 26, 2016 at 1:37 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Sergey,
>>>>>>>>>
>>>>>>>>> I can't access the cluster right now, so I'll post details and configurations next week. Important facts, as far as I remember:
>>>>>>>>> - 8-node dev cluster (Hadoop 2.7.1, HBase 1.1.3, Phoenix 4.7.0 RC2 and Zookeeper 3.4.6)
>>>>>>>>>   * 32 cores / 256 GB RAM per node, DataNode/NodeManager and RegionServer on the same node, 24 GB heap assigned to the region server
>>>>>>>>> - # of tables = 9
>>>>>>>>> - Salted with 5, 10 or 20 buckets
>>>>>>>>> - Compressed using the Snappy codec
>>>>>>>>> - Data ingestion: 30 ~ 40 GB / day using bulk loading
>>>>>>>>> - Schema
>>>>>>>>>   The table that I mentioned has 10 columns; 7 columns are varchar and the rest are varchar[].
>>>>>>>>>   I can see the performance degradation on bulk loads for the other tables as well.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Youngwoo
>>>>>>>>>
>>>>>>>>> On Fri, Feb 26, 2016 at 6:02 PM, Sergey Soldatov <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Youngwoo,
>>>>>>>>>> Could you provide a bit more information about the table structure (the DDL would be great)? Do you have indexes?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Sergey
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 23, 2016 at 10:18 PM, 김영우 (Youngwoo Kim) <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Gabriel,
>>>>>>>>>>>
>>>>>>>>>>> I'm using RC2.
>>>>>>>>>>>
>>>>>>>>>>> Youngwoo
>>>>>>>>>>>
>>>>>>>>>>> On Wednesday, February 24, 2016, Gabriel Reid <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Youngwoo,
>>>>>>>>>>>>
>>>>>>>>>>>> Which RC are you using for this? RC-1 or RC-2?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Gabriel
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 23, 2016 at 11:30 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm evaluating the 4.7.0 RC on my dev cluster. It looks like it works fine, but I ran into a performance degradation for MR-based bulk loading.
>>>>>>>>>>>>> I've been loading about a million rows per day into a Phoenix table. With the 4.7.0 RC, there are failed jobs with a '600 sec' timeout in the map or reduce stage. Logs as follows:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 16/02/22 18:03:45 INFO mapreduce.Job: Task Id : attempt_1456035298774_0066_m_000002_0, Status : FAILED
>>>>>>>>>>>>> AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs
>>>>>>>>>>>>>
>>>>>>>>>>>>> 16/02/22 18:05:14 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://fcbig/tmp/74da7ab1-a8ac-4ba8-9d43-0b70f08f8602/HYNIX.BIG_TRACE_SUMMARY/0/_tmp/_tmp/f305427aa8304cf98355bf01c1edb5ce.top no longer fits inside a single region. Splitting...
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had not seen these log messages before, and I'm seeing roughly a 5 ~ 10x performance degradation for bulk loading (about 10 min on 4.6.0, but 60+ min on the 4.7.0 RC). Furthermore, I can't find a clue in the MR logs as to why the tasks failed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I can see the HFile splitting after the reduce stage. Is that normal?
>>>>>>>>>>>>>
>>>>>>>>>>>>> My environment is:
>>>>>>>>>>>>> - Hadoop 2.7.1
>>>>>>>>>>>>> - HBase 1.1.3
>>>>>>>>>>>>> - Phoenix 4.7.0 RC
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Youngwoo
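
To make the mechanics discussed in the thread more concrete: Sergey describes the bulk-load MR job as mappers emitting <rowkey, column value> pairs that the framework sorts and routes to reducers, which write the HFiles that HBase then loads, and Gabriel suspects the partitioner because only one HFile was produced for a 5-region table. The sketch below is illustrative only: it is not Phoenix's actual partitioner, and the class and method names are invented for the example. It shows how such a job typically routes each row key to a reducer based on the target table's region start keys, and why a row-key comparison that treats all keys as equal would send every row to a single reducer and produce a single oversized HFile.

import java.util.Arrays;
import java.util.List;

/**
 * Illustrative sketch only -- not Phoenix's actual partitioner, and the class
 * and method names are invented for this example. It shows the general idea
 * behind HFile bulk loading: each output row key is routed to the reducer that
 * owns the target table's region containing that key, so every reducer writes
 * HFiles that fit neatly into one region.
 */
public class RegionBoundaryPartitionerSketch {

    // Region start keys of the target table, sorted with the same byte
    // comparison used below. The first region conventionally starts at an
    // empty key.
    private final byte[][] regionStartKeys;

    public RegionBoundaryPartitionerSketch(List<byte[]> sortedStartKeys) {
        this.regionStartKeys = sortedStartKeys.toArray(new byte[0][]);
    }

    /** Returns the index of the reducer/region whose key range contains rowKey. */
    public int getPartition(byte[] rowKey) {
        int idx = Arrays.binarySearch(regionStartKeys, rowKey,
                RegionBoundaryPartitionerSketch::compareRowKeys);
        // Exact hit: that region. Otherwise binarySearch returns
        // -(insertionPoint) - 1, and the owning region is insertionPoint - 1.
        int partition = idx >= 0 ? idx : -idx - 2;
        return Math.max(0, partition);
    }

    /**
     * Lexicographic, unsigned byte comparison of row keys. If this comparison
     * (or the one used during the shuffle sort) wrongly reports that all keys
     * are equal, every row lands in the same partition: one reducer does all
     * the work, one oversized HFile is produced, and that HFile then has to be
     * split recursively while being handed to the region servers -- matching
     * the "no longer fits inside a single region. Splitting..." messages above.
     */
    private static int compareRowKeys(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}

In a real HBase bulk load this routing is normally set up by HFileOutputFormat2.configureIncrementalLoad, which derives a total-order partition list from the table's region boundaries; the sketch only makes that routing logic explicit.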
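Similarly, here is a minimal, hypothetical sketch of the kind of comparator problem Sergey attributes to 4.6 in PHOENIX-2649, where all rowkeys were reported as equal. The assumption that the broken path compared only the serialized table-name prefix of a (table, rowkey) pair is purely illustrative; this is not the actual Phoenix code, only the shape of the bug.

import org.apache.hadoop.io.WritableComparator;

/**
 * Hypothetical illustration -- not the actual Phoenix class or the actual bug.
 * It only shows the shape of the problem described for PHOENIX-2649: a
 * comparator over serialized (table name, rowkey) pairs that effectively
 * reports all rowkeys of a table as equal.
 */
public class TableRowkeyPairComparatorSketch {

    /**
     * Broken variant: only the table-name prefix of each serialized pair is
     * compared, so every row of the same table compares as equal. The shuffle
     * then cannot order rows by key, and a single reducer receives them as one
     * giant group.
     */
    static int brokenCompare(byte[] a, int aTableLen, byte[] b, int bTableLen) {
        return WritableComparator.compareBytes(a, 0, aTableLen, b, 0, bTableLen);
    }

    /**
     * Fixed variant: compare the table name first, then the remaining rowkey
     * bytes, so rows sort correctly and are spread across reducers.
     */
    static int fixedCompare(byte[] a, int aTableLen, byte[] b, int bTableLen) {
        int cmp = WritableComparator.compareBytes(a, 0, aTableLen, b, 0, bTableLen);
        if (cmp != 0) {
            return cmp;
        }
        return WritableComparator.compareBytes(a, aTableLen, a.length - aTableLen,
                b, bTableLen, b.length - bTableLen);
    }
}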
