4.7.0 RC, Bulk loading performance degradation and failed MR tasks

YoungWoo Kim Tue, 23 Feb 2016 02:31:03 -0800

Hi,

I'm evaluating 4.7.0 RC on my dev cluster. It appears to work fine, but I've
run into a performance degradation with MR-based bulk loading. I've been
loading a million rows per day into a Phoenix table. Since moving to 4.7.0 RC,
jobs fail with a '600 sec' timeout in the map or reduce stage. The logs are as
follows:


16/02/22 18:03:45 INFO mapreduce.Job: Task Id :
attempt_1456035298774_0066_m_000002_0, Status : FAILED
AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs

16/02/22 18:05:14 INFO mapreduce.LoadIncrementalHFiles: HFile at
hdfs://fcbig/tmp/74da7ab1-a8ac-4ba8-9d43-0b70f08f8602/HYNIX.BIG_TRACE_SUMMARY/0/_tmp/_tmp/f305427aa8304cf98355bf01c1edb5ce.top
no longer fits inside a single region. Splitting...

I have not seen these log messages before, and I'm now facing about a 5-10x
performance degradation for bulk loading (4.6.0: 10 min, but 60+ min with 4.7.0 RC).
Furthermore, I can't find a clue in the MR logs as to why the tasks failed.

Also, I can see HFile splitting after the reduce stage. Is that normal?
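
For anyone trying to reproduce this, here is a minimal sketch for pulling the timed-out attempts out of the job client output. It assumes the log format shown above; the names timed_out_attempts, TIMEOUT_RE and SAMPLE are made up for illustration and are not part of Hadoop or Phoenix. Note that 600 secs matches the Hadoop default for mapreduce.task.timeout (600000 ms), so the tasks are most likely being killed for not reporting progress rather than failing with an exception.

```python
import re

# Matches the "Timed out after N secs" line printed by the MR job client.
# The [mr] group in the attempt ID distinguishes map from reduce attempts.
TIMEOUT_RE = re.compile(
    r"AttemptID:(attempt_\d+_\d+_([mr])_\d+_\d+) Timed out after (\d+) secs"
)


def timed_out_attempts(log_text):
    """Return (attempt_id, stage, seconds) for each timed-out task attempt."""
    stages = {"m": "map", "r": "reduce"}
    return [
        (m.group(1), stages[m.group(2)], int(m.group(3)))
        for m in TIMEOUT_RE.finditer(log_text)
    ]


# Sample input copied from the log excerpt in this message.
SAMPLE = """16/02/22 18:03:45 INFO mapreduce.Job: Task Id :
attempt_1456035298774_0066_m_000002_0, Status : FAILED
AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs
"""

if __name__ == "__main__":
    for attempt in timed_out_attempts(SAMPLE):
        print(attempt)
```

Running this over the full client output should show whether the timeouts cluster in the map stage or the reduce stage, which narrows down which task logs to inspect.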

My environment:
- Hadoop 2.7.1
- HBase 1.1.3
- Phoenix 4.7.0 RC

Thanks,

Youngwoo
