Hi,

I'm evaluating the 4.7.0 RC on my dev cluster. It appears to work fine, but I'm running into performance degradation with MR-based bulk loading. I've been loading a million rows per day into a Phoenix table. Since the 4.7.0 RC, jobs have been failing with a '600 secs' timeout in the map or reduce stage. The logs are as follows:

16/02/22 18:03:45 INFO mapreduce.Job: Task Id : attempt_1456035298774_0066_m_000002_0, Status : FAILED
AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs

16/02/22 18:05:14 INFO mapreduce.LoadIncrementalHFiles: HFile at hdfs://fcbig/tmp/74da7ab1-a8ac-4ba8-9d43-0b70f08f8602/HYNIX.BIG_TRACE_SUMMARY/0/_tmp/_tmp/f305427aa8304cf98355bf01c1edb5ce.top no longer fits inside a single region. Splitting...

I have not seen these log messages before. As a result, I'm facing roughly 5-10x performance degradation for bulk loading (4.6.0: 10 min, but 60+ min with the 4.7.0 RC). Furthermore, I can't find any clue in the MR logs as to why the tasks failed.

Also, I can see HFile splitting after the reduce stage. Is that normal?

My environment:
- Hadoop 2.7.1
- HBase 1.1.3
- Phoenix 4.7.0 RC

Thanks,
Youngwoo
