Oops. Small update. We can revert PHOENIX-1973 (bulk load improvement), not PHOENIX-2649 (TableRowKeyPair comparator problem).
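
To make the comparator issue concrete, below is a minimal sketch of the failure mode Sergey describes in the quoted thread. It is illustrative only (the class and method names are made up, and this is not the actual Phoenix TableRowKeyPair comparator), but it shows why a raw comparator that never inspects the serialized key bytes reports every pair of rowkeys as equal, so the shuffle's sort-by-rowkey phase collapses into a single group:

// Illustrative sketch only; not Phoenix's actual TableRowKeyPair code.
// A raw comparator that ignores the serialized key bytes makes every
// pair of rowkeys compare as equal, which collapses the MapReduce
// sort/partition phase into a single group.
import java.util.Arrays;

public class RowKeyComparatorSketch {

    // Correct behaviour: compare the serialized rowkeys byte by byte,
    // using unsigned byte order (the ordering HBase rowkeys use).
    static int compareRowKeys(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    // Failure mode: the comparator never looks at the key bytes,
    // so every rowkey is reported as equal to every other rowkey.
    static int brokenCompare(byte[] a, byte[] b) {
        return 0;
    }

    public static void main(String[] args) {
        byte[] k1 = "region-A-row-001".getBytes();
        byte[] k2 = "region-E-row-999".getBytes();

        System.out.println(compareRowKeys(k1, k2)); // negative: k1 sorts first
        System.out.println(brokenCompare(k1, k2));  // 0: "equal", one sort group
        System.out.println(Arrays.equals(k1, k2));  // false: keys really differ
    }
}

With the broken variant, the shuffle, the reducer grouping, and the final HFile handling all effectively see one key, which lines up with the single-rowkey behaviour Sergey outlines below.
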
On Fri, Feb 26, 2016 at 10:52 AM, Sergey Soldatov <[email protected]> wrote:
> Well, that's how MR bulk load works. Mappers read all rows from the
> input file and create the corresponding <rowkey, column value> pairs.
> The MR engine sorts them by rowkey, the reducer sorts the values and
> writes them to the HFile, and after that the HBase bulk load moves it
> into HBase. PHOENIX-2649 just reduces the amount of data sent between
> mappers and reducers: before it was N rows * K columns, after it is N
> only. Because of the bug I mentioned before, the phase where records
> are sorted by rowkey didn't work at all (the first reason why 4.6's
> performance was better), all values were written with a single rowkey
> and were received all at once by the reducer (the second reason), and
> during the HBase bulk load there was no reason for splitting because
> the rowkey was the same (the third reason).
> But of course we can revert PHOENIX-2649 and see whether it helps.
>
> Thanks,
> Sergey
>
> On Fri, Feb 26, 2016 at 9:02 AM, 김영우 (Youngwoo Kim) <[email protected]> wrote:
>> Exactly! Gabriel describes exactly what I observed.
>>
>> Many map and reduce tasks are launched, but only one or two tasks are
>> still running at the end of the job. It looks like the workload is
>> skewed onto a particular task.
>>
>> Thanks,
>> Youngwoo
>>
>> On Friday, February 26, 2016, Gabriel Reid <[email protected]> wrote:
>>
>>> I just did a quick test run on this, and it looks to me like something
>>> is definitely wrong.
>>>
>>> I ran a simple ingest test for a table with 5 regions, and it appears
>>> that only a single HFile is being created. This HFile then needs to be
>>> recursively split during the step of handing HFiles over to the region
>>> servers (hence the "xxx no longer fits inside a single region.
>>> Splitting..." log messages).
>>>
>>> This implies that only a single reducer is actually doing any
>>> processing, which would certainly account for a performance
>>> degradation. My assumption is that the underlying issue is in the
>>> partitioner (or the data being passed to the partitioner). I don't
>>> know whether this was introduced as part of PHOENIX-2649 or not.
>>>
>>> Sergey, are you (or someone else) able to take a look at this?
>>> Unfortunately, I don't think there's any way I can take a serious
>>> look at this anymore today.
>>>
>>> - Gabriel
>>>
>>>
>>> On Fri, Feb 26, 2016 at 11:21 AM, Sergey Soldatov
>>> <[email protected]> wrote:
>>> > I see. We will try to reproduce it. The degradation is possible
>>> > because 4.6 had the problem described in PHOENIX-2649. In short,
>>> > the comparator for rowkeys was working incorrectly and reported
>>> > that all rowkeys were the same. If the input files are relatively
>>> > small and the reducer has enough memory, all records are written in
>>> > one step with the same single rowkey. That can be the reason why it
>>> > was faster and there were no splits.
>>> >
>>> > Thanks,
>>> > Sergey
>>> >
>>> > On Fri, Feb 26, 2016 at 1:37 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:
>>> >> Sergey,
>>> >>
>>> >> I can't access the cluster right now, so I'll post details and
>>> >> configurations next week. Important facts, as far as I remember:
>>> >> - 8-node dev cluster (Hadoop 2.7.1, HBase 1.1.3, Phoenix 4.7.0 RC2
>>> >>   and ZooKeeper 3.4.6)
>>> >>   * 32 cores / 256 GB RAM; DataNode/NodeManager and RegionServer on
>>> >>     the same node; 24 GB heap assigned to the region server
>>> >> - # of tables = 9
>>> >> - Salted with 5, 10 or 20 buckets
>>> >> - Compressed using the Snappy codec
>>> >> - Data ingestion: 30 ~ 40 GB / day using bulk loading
>>> >> - Schema:
>>> >>   The table that I mentioned has 10 columns; 7 columns are varchar
>>> >>   and the rest are varchar[].
>>> >> I can see the performance degradation on bulk loads into other
>>> >> tables as well.
>>> >>
>>> >> Thanks,
>>> >>
>>> >> Youngwoo
>>> >>
>>> >> On Fri, Feb 26, 2016 at 6:02 PM, Sergey Soldatov <[email protected]> wrote:
>>> >>
>>> >>> Hi Youngwoo,
>>> >>> Could you provide a bit more information about the table structure
>>> >>> (the DDL would be great)? Do you have indexes?
>>> >>>
>>> >>> Thanks,
>>> >>> Sergey
>>> >>>
>>> >>> On Tue, Feb 23, 2016 at 10:18 PM, 김영우 (Youngwoo Kim)
>>> >>> <[email protected]> wrote:
>>> >>> > Gabriel,
>>> >>> >
>>> >>> > I'm using RC2.
>>> >>> >
>>> >>> > Youngwoo
>>> >>> >
>>> >>> > On Wednesday, February 24, 2016, Gabriel Reid <[email protected]> wrote:
>>> >>> >
>>> >>> >> Hi Youngwoo,
>>> >>> >>
>>> >>> >> Which RC are you using for this? RC-1 or RC-2?
>>> >>> >>
>>> >>> >> Thanks,
>>> >>> >>
>>> >>> >> Gabriel
>>> >>> >>
>>> >>> >> On Tue, Feb 23, 2016 at 11:30 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:
>>> >>> >> > Hi,
>>> >>> >> >
>>> >>> >> > I'm evaluating the 4.7.0 RC on my dev cluster. It looks like it
>>> >>> >> > works fine, but I'm running into performance degradation with
>>> >>> >> > MR-based bulk loading. I've been loading a million rows per day
>>> >>> >> > into a Phoenix table. With the 4.7.0 RC, there are failed jobs
>>> >>> >> > with a '600 sec' timeout in the map or reduce stage. Logs as
>>> >>> >> > follows:
>>> >>> >> >
>>> >>> >> > 16/02/22 18:03:45 INFO mapreduce.Job: Task Id :
>>> >>> >> > attempt_1456035298774_0066_m_000002_0, Status : FAILED
>>> >>> >> > AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs
>>> >>> >> >
>>> >>> >> > 16/02/22 18:05:14 INFO mapreduce.LoadIncrementalHFiles: HFile at
>>> >>> >> > hdfs://fcbig/tmp/74da7ab1-a8ac-4ba8-9d43-0b70f08f8602/HYNIX.BIG_TRACE_SUMMARY/0/_tmp/_tmp/f305427aa8304cf98355bf01c1edb5ce.top
>>> >>> >> > no longer fits inside a single region. Splitting...
>>> >>> >> >
>>> >>> >> > These logs were never seen before, and I'm facing about a 5 ~ 10x
>>> >>> >> > performance degradation for bulk loading (4.6.0: 10 min, but 60+
>>> >>> >> > min with the 4.7.0 RC). Furthermore, I can't find a clue in the
>>> >>> >> > MR logs as to why the tasks failed.
>>> >>> >> >
>>> >>> >> > Also, I can see the HFile splitting after the reduce stage. Is
>>> >>> >> > that normal?
>>> >>> >> >
>>> >>> >> > My environment:
>>> >>> >> > - Hadoop 2.7.1
>>> >>> >> > - HBase 1.1.3
>>> >>> >> > - Phoenix 4.7.0 RC
>>> >>> >> >
>>> >>> >> > Thanks,
>>> >>> >> >
>>> >>> >> > Youngwoo
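
As a footnote to Gabriel's single-HFile observation above, here is a rough model of the partitioning step. The class and method names are invented for illustration; this is not the actual Phoenix/HBase partitioner. Normally each reducer handles the key range of one region, derived from the region start keys, so each reducer writes an HFile that already fits its region. If every map output key compares as equal (the comparator bug), everything is routed to one reducer, which writes a single oversized HFile that then has to be split recursively at load time, matching the "no longer fits inside a single region. Splitting..." messages:

// Illustrative sketch only: invented names, not the actual Phoenix/HBase
// partitioner. It models how bulk-load reducers are normally chosen from
// the table's region start keys, one reducer (and one HFile) per region.
public class RegionPartitionerSketch {

    private final byte[][] regionStartKeys;  // sorted region start keys

    RegionPartitionerSketch(byte[][] regionStartKeys) {
        this.regionStartKeys = regionStartKeys;
    }

    // Pick the reducer whose region should contain this row key:
    // the last region whose start key is <= the row key.
    int getPartition(byte[] rowKey) {
        int partition = 0;
        for (int i = 0; i < regionStartKeys.length; i++) {
            if (compareUnsigned(rowKey, regionStartKeys[i]) >= 0) {
                partition = i;
            }
        }
        return partition;
    }

    // Unsigned lexicographic byte comparison, the ordering HBase row keys use.
    static int compareUnsigned(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Five regions, mirroring the 5-region test table in the thread.
        byte[][] splits = { "a".getBytes(), "f".getBytes(), "m".getBytes(),
                            "s".getBytes(), "x".getBytes() };
        RegionPartitionerSketch p = new RegionPartitionerSketch(splits);
        System.out.println(p.getPartition("c-row-001".getBytes()));  // 0
        System.out.println(p.getPartition("t-row-999".getBytes()));  // 3
        // If the shuffle treats all keys as equal (the comparator bug),
        // every record effectively lands on one reducer, and that reducer
        // writes a single HFile spanning all five regions.
    }
}
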
