James, I haven't seen any performance problems during my testing. I tested with CSV files of up to 600 MB, and the load time was less than 10 minutes on a laptop. I will try tables with salted PKs tomorrow.
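For the salted-PK case, the kind of table I have in mind is something like this (a minimal sketch; the JDBC URL, schema, and column names are placeholders, not the actual test schema):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreateSaltedTable {
        public static void main(String[] args) throws Exception {
            // "localhost" stands in for the real ZooKeeper quorum.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
                 Statement stmt = conn.createStatement()) {
                // SALT_BUCKETS prefixes each row key with a one-byte hash and
                // pre-splits the table, which is exactly the row key shape the
                // bulk-load partitioning has to handle.
                stmt.execute("CREATE TABLE IF NOT EXISTS TEST.SALTED_LOAD ("
                        + " ID VARCHAR NOT NULL PRIMARY KEY,"
                        + " COL1 VARCHAR,"
                        + " COL2 VARCHAR[])"
                        + " SALT_BUCKETS = 10, COMPRESSION = 'SNAPPY'");
            }
        }
    }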
Thanks,
Sergey

On Fri, Feb 26, 2016 at 6:00 PM, James Taylor <[email protected]> wrote:

Ok, I'll roll back PHOENIX-1973, but are you saying this isn't the root cause of the perf regression?

On Friday, February 26, 2016, Sergey Soldatov <[email protected]> wrote:

James,
Well, it's not, but I have another problem with big loads when the input gets split across several mappers. The optimization that replaces the table name in TableRowkeyPair causes problems with output name generation, so the index name appears in the path instead of the table name. So let's just roll back the patch.

Thanks,
Sergey

On Fri, Feb 26, 2016 at 2:37 PM, James Taylor <[email protected]> wrote:

Thanks, Sergey. Were you able to confirm whether or not PHOENIX-1973 was the root cause of the regression?

On Fri, Feb 26, 2016 at 11:55 AM, Sergey Soldatov <[email protected]> wrote:

Oops, small update: we can revert PHOENIX-1973 (the bulk load improvement), not PHOENIX-2649 (the TableRowkeyPair comparator problem).

On Fri, Feb 26, 2016 at 10:52 AM, Sergey Soldatov <[email protected]> wrote:

Well, that's how MR bulk load works. Mappers read all rows from the file and create the corresponding <rowkey, column value> pairs. The MR engine sorts this output by rowkey, the reducer sorts it by value and writes it to the HFile, and after that the HBase bulk load loads it into HBase. PHOENIX-2649 just reduces the amount of data sent between mappers and reducers: before, it was N rows * K columns; afterwards it becomes N only. Because of the bug I mentioned before, the phase where the output is sorted by rowkey didn't work at all (the first reason why 4.6 performance looked better), all values were written under a single rowkey and were received all at once by the reducer (the second reason), and during the HBase bulk load there was no reason for splitting, because of that same single rowkey (the third reason).
But of course we can revert PHOENIX-2649 and see whether it helps.
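To make that concrete, here is a rough sketch of the mapper side of such a bulk load (simplified; this is not the actual Phoenix code, and the class name, CSV layout, and column family are made up):

    import java.io.IOException;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Each input line becomes one row key plus one KeyValue per column.
    // The shuffle must sort these pairs by row key so the reducer can
    // write them into HFiles in order.
    public class CsvToKeyValueMapperSketch
            extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {

        private static final byte[] FAMILY = Bytes.toBytes("0");

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",");
            byte[] rowKey = Bytes.toBytes(fields[0]); // first field is the PK
            ImmutableBytesWritable outKey = new ImmutableBytesWritable(rowKey);
            // One <rowkey, column value> pair per column: N rows * K columns
            // cross the shuffle, which is the volume PHOENIX-2649 set out to cut.
            for (int i = 1; i < fields.length; i++) {
                KeyValue kv = new KeyValue(rowKey, FAMILY,
                        Bytes.toBytes("C" + i), Bytes.toBytes(fields[i]));
                context.write(outKey, kv);
            }
        }
    }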
Thanks,
Sergey

On Fri, Feb 26, 2016 at 9:02 AM, 김영우 (Youngwoo Kim) <[email protected]> wrote:

Exactly! Gabriel describes the fact that I observed. Many map and reduce tasks are launched, but only one or two tasks are still running at the end of the job. It looks like the workload is skewed onto particular tasks.

Thanks,
Youngwoo

On Friday, February 26, 2016, Gabriel Reid <[email protected]> wrote:

I just did a quick test run on this, and it looks to me like something is definitely wrong.

I ran a simple ingest test for a table with 5 regions, and it appears that only a single HFile is being created. This HFile then needs to be recursively split during the step of handing HFiles over to the region servers (hence the "xxx no longer fits inside a single region. Splitting..." log messages).

This implies that only a single reducer is actually doing any processing, which would certainly account for a performance degradation. My assumption is that the underlying issue is in the partitioner (or the data being passed to the partitioner). I don't know if this was introduced as part of PHOENIX-2649 or not.

Sergey, are you (or someone else) able to take a look at this? Unfortunately, I don't think there's any way I can get a serious look at this any more today.

- Gabriel

On Fri, Feb 26, 2016 at 11:21 AM, Sergey Soldatov <[email protected]> wrote:

I see. We will try to reproduce it. The degradation is possible because 4.6 had a problem described in PHOENIX-2649: in two words, the comparator for rowkeys was working incorrectly and reported that all rowkeys are the same. If the input files are relatively small and the reducer has enough memory, all records will be written in one step under the same single rowkey. And that can be the reason why it was faster and there were no splits.
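A contrived illustration of that failure mode (this is not the real TableRowkeyPair code; it just shows what a comparator that treats all keys as equal does to the shuffle):

    import org.apache.hadoop.io.RawComparator;

    // If the shuffle's raw comparator claims every key is equal, the
    // sort-by-rowkey phase is effectively a no-op and the reducer
    // receives all values as one group under a single "rowkey".
    public class AlwaysEqualComparator implements RawComparator<byte[]> {

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return 0; // bug: every pair of row keys compares as equal
        }

        @Override
        public int compare(byte[] a, byte[] b) {
            return 0;
        }
    }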
Thanks,
Sergey

On Fri, Feb 26, 2016 at 1:37 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:

Sergey,

I can't access the cluster right now, so I'll post details and configurations next week. The important facts, as far as I remember:
- 8-node dev cluster (Hadoop 2.7.1, HBase 1.1.3, Phoenix 4.7.0 RC2 and ZooKeeper 3.4.6)
  * 32 cores / 256 GB RAM per node; DataNode/NodeManager and RegionServer on the same node; 24 GB of heap assigned to the region server
- Number of tables: 9
- Salted with 5, 10 or 20 buckets
- Compressed using the Snappy codec
- Data ingestion: 30 ~ 40 GB / day using bulk loading
- Schema: the table that I mentioned has 10 columns; 7 columns are VARCHAR and the rest are VARCHAR[].
I can see the performance degradation on bulk loads from the other tables as well.

Thanks,

Youngwoo

On Fri, Feb 26, 2016 at 6:02 PM, Sergey Soldatov <[email protected]> wrote:

Hi Youngwoo,
Could you provide a bit more information about the table structure (the DDL would be great)? Do you have indexes?

Thanks,
Sergey

On Tue, Feb 23, 2016 at 10:18 PM, 김영우 (Youngwoo Kim) <[email protected]> wrote:

Gabriel,

I'm using RC2.

Youngwoo

On Wednesday, February 24, 2016, Gabriel Reid <[email protected]> wrote:

Hi Youngwoo,

Which RC are you using for this? RC1 or RC2?

Thanks,

Gabriel

On Tue, Feb 23, 2016 at 11:30 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:

Hi,

I'm evaluating the 4.7.0 RC on my dev cluster. It looks like it works fine, but I ran into a performance degradation for MR-based bulk loading. I've been loading a million rows per day into Phoenix tables. With the 4.7.0 RC, there are failed jobs with a '600 sec' timeout in the map or reduce stage. Logs as follows:

16/02/22 18:03:45 INFO mapreduce.Job: Task Id : attempt_1456035298774_0066_m_000002_0, Status : FAILED
AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs

16/02/22 18:05:14 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://fcbig/tmp/74da7ab1-a8ac-4ba8-9d43-0b70f08f8602/HYNIX.BIG_TRACE_SUMMARY/0/_tmp/_tmp/f305427aa8304cf98355bf01c1edb5ce.top no longer fits inside a single region. Splitting...

These log messages were not seen before, and I'm facing about a 5~10x performance degradation for bulk loading (4.6.0: 10 min, but 60+ min with the 4.7.0 RC). Furthermore, I can't find a clue in the MR logs as to why the tasks failed.

Also, I can see the HFile splitting happening after the reduce stage. Is that normal?

My env:
- Hadoop 2.7.1
- HBase 1.1.3
- Phoenix 4.7.0 RC

Thanks,

Youngwoo
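For context on the "no longer fits inside a single region. Splitting..." messages above: they come from the HFile handoff step that runs after the MR job, where LoadIncrementalHFiles moves the generated HFiles into the table's regions and first splits any HFile that crosses a region boundary. A minimal sketch of that handoff, assuming the classic HBase 1.x API (the table name and output path are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

    public class BulkLoadHandoffSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            try (HTable table = new HTable(conf, TableName.valueOf("MY_TABLE"))) {
                // Moves the job's HFiles into the table's regions; an HFile
                // that crosses a region boundary is split here, producing the
                // "...no longer fits inside a single region. Splitting..." log.
                new LoadIncrementalHFiles(conf)
                        .doBulkLoad(new Path("/tmp/bulkload-output"), table);
            }
        }
    }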
