Thanks, Gabriel. I filed PHOENIX-2716. Would you mind investigating, Sergey? Maybe a good first step would be to revert PHOENIX-2649 and see if performance goes back to what it was? We can roll a new RC without it and then get it back in for 4.8.
On Fri, Feb 26, 2016 at 3:01 AM, Gabriel Reid <[email protected]> wrote:
> I just did a quick test run on this, and it looks to me like something
> is definitely wrong.
>
> I ran a simple ingest test for a table with 5 regions, and it appears
> that only a single HFile is being created. This HFile then needs to be
> recursively split during the step of handing HFiles over to the region
> servers (hence the "xxx no longer fits inside a single region.
> Splitting..." log messages).
>
> This implies that only a single reducer is actually doing any
> processing, which would certainly account for a performance
> degradation. My assumption is that the underlying issue is in the
> partitioner (or the data being passed to the partitioner). I don't
> know if this was introduced as part of PHOENIX-2649 or not.
>
> Sergey, are you (or someone else) able to take a look at this?
> Unfortunately, I don't think there's any way I can get a serious look
> at this any more today.
>
> - Gabriel
>
>
> On Fri, Feb 26, 2016 at 11:21 AM, Sergey Soldatov
> <[email protected]> wrote:
> > I see. We will try to reproduce it. The degradation is possible
> > because 4.6 had a problem described in PHOENIX-2649. In short, the
> > comparator for rowkeys was working incorrectly and reported that
> > all rowkeys are the same. If the input files are relatively small and
> > the reducer has enough memory, all records will be written in one step
> > with the same single rowkey. That could be the reason why it was
> > faster and there were no splits.
> >
> > Thanks,
> > Sergey
> >
> > On Fri, Feb 26, 2016 at 1:37 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:
> >> Sergey,
> >>
> >> I can't access the cluster right now, so I'll post details and
> >> configurations next week. Important facts, as far as I remember:
> >> - 8-node dev cluster (Hadoop 2.7.1, HBase 1.1.3, Phoenix 4.7.0 RC2 and
> >>   Zookeeper 3.4.6)
> >>   * 32 cores / 256 GB RAM; DataNode/NodeManager and RegionServer on the
> >>     same node; 24 GB heap assigned to the region server
> >> - # of tables = 9
> >> - Salted with 5, 10 or 20 buckets
> >> - Compressed using the Snappy codec
> >> - Data ingestion: 30 ~ 40 GB / day using bulk loading
> >> - Schema: the table I mentioned has 10 columns; 7 are VARCHAR and the
> >>   rest are VARCHAR[]
> >> I can see performance degradation on bulk load for other tables as well.
> >>
> >> Thanks,
> >>
> >> Youngwoo
> >>
> >>
> >> On Fri, Feb 26, 2016 at 6:02 PM, Sergey Soldatov <[email protected]> wrote:
> >>
> >>> Hi Youngwoo,
> >>> Could you provide a bit more information about the table structure
> >>> (DDL would be great)? Do you have indexes?
> >>>
> >>> Thanks,
> >>> Sergey
> >>>
> >>> On Tue, Feb 23, 2016 at 10:18 PM, 김영우 (Youngwoo Kim)
> >>> <[email protected]> wrote:
> >>> > Gabriel,
> >>> >
> >>> > I'm using RC2.
> >>> >
> >>> > Youngwoo
> >>> >
> >>> > On Wednesday, February 24, 2016, Gabriel Reid <[email protected]> wrote:
> >>> >
> >>> >> Hi Youngwoo,
> >>> >>
> >>> >> Which RC are you using for this? RC-1 or RC-2?
> >>> >>
> >>> >> Thanks,
> >>> >>
> >>> >> Gabriel
> >>> >>
> >>> >> On Tue, Feb 23, 2016 at 11:30 AM, 김영우 (YoungWoo Kim) <[email protected]> wrote:
> >>> >> > Hi,
> >>> >> >
> >>> >> > I'm evaluating the 4.7.0 RC on my dev cluster. It looks like it works
> >>> >> > fine, but I ran into a performance degradation for MR-based bulk
> >>> >> > loading. I've been loading a million rows per day into a Phoenix
> >>> >> > table. With the 4.7.0 RC, there are failed jobs with a '600 sec'
> >>> >> > timeout in the map or reduce stage. Logs as follows:
> >>> >> >
> >>> >> > 16/02/22 18:03:45 INFO mapreduce.Job: Task Id :
> >>> >> > attempt_1456035298774_0066_m_000002_0, Status : FAILED
> >>> >> > AttemptID:attempt_1456035298774_0066_m_000002_0 Timed out after 600 secs
> >>> >> >
> >>> >> > 16/02/22 18:05:14 INFO mapreduce.LoadIncrementalHFiles: HFile at
> >>> >> > hdfs://fcbig/tmp/74da7ab1-a8ac-4ba8-9d43-0b70f08f8602/HYNIX.BIG_TRACE_SUMMARY/0/_tmp/_tmp/f305427aa8304cf98355bf01c1edb5ce.top
> >>> >> > no longer fits inside a single region. Splitting...
> >>> >> >
> >>> >> > I have not seen these logs before, and I'm seeing about a 5~10x
> >>> >> > performance degradation for bulk loading (4.6.0: 10 min, but 60+ min
> >>> >> > with the 4.7.0 RC). Furthermore, I can't find a clue in the MR logs
> >>> >> > as to why the tasks failed.
> >>> >> >
> >>> >> > Also, I can see the HFile splitting after the reduce stage. Is that
> >>> >> > normal?
> >>> >> >
> >>> >> > My environment:
> >>> >> > - Hadoop 2.7.1
> >>> >> > - HBase 1.1.3
> >>> >> > - Phoenix 4.7.0 RC
> >>> >> >
> >>> >> > Thanks,
> >>> >> >
> >>> >> > Youngwoo
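
For anyone following along, here is a minimal, self-contained Java sketch of
the failure mode Gabriel and Sergey describe above. It is NOT the actual
Phoenix or HBase code; the class name and the simplified partition() logic are
invented for illustration. The real job uses HFileOutputFormat2 with a
total-order partitioner and region start keys as split points, but the idea is
the same: each rowkey is compared against the split points to pick a reducer,
so a comparator that incorrectly reports all rowkeys as equal (the
PHOENIX-2649-style bug) routes every record to the same partition, and a single
reducer writes one oversized HFile that LoadIncrementalHFiles then has to split
recursively.

    import java.util.Arrays;
    import java.util.Comparator;

    // Illustrative sketch only -- not Phoenix/HBase code.
    public class SingleReducerSketch {

        // Pick a reducer the way a total-order partitioner does: binary-search
        // the rowkey against the sorted split points (region boundaries).
        static int partition(byte[] rowKey, byte[][] splitPoints, Comparator<byte[]> cmp) {
            int idx = Arrays.binarySearch(splitPoints, rowKey, cmp);
            return idx < 0 ? -(idx + 1) : idx + 1;
        }

        public static void main(String[] args) {
            // 4 split points -> 5 partitions, mirroring a 5-region table.
            byte[][] splitPoints = { "b".getBytes(), "d".getBytes(), "f".getBytes(), "h".getBytes() };
            byte[][] rowKeys = { "a".getBytes(), "c".getBytes(), "e".getBytes(), "g".getBytes(), "z".getBytes() };

            // Correct comparator: keys spread across all 5 partitions.
            Comparator<byte[]> good = (x, y) -> new String(x).compareTo(new String(y));
            // Broken comparator in the spirit of PHOENIX-2649: every key compares
            // as equal, so every row is routed to the same partition and a single
            // reducer ends up writing one huge HFile.
            Comparator<byte[]> broken = (x, y) -> 0;

            for (byte[] k : rowKeys) {
                System.out.printf("key=%s  goodPartition=%d  brokenPartition=%d%n",
                        new String(k), partition(k, splitPoints, good), partition(k, splitPoints, broken));
            }
        }
    }

Running it shows the keys spreading over partitions 0 through 4 with the
correct comparator, while the broken comparator sends every key to the same
partition, which matches the single-HFile / "no longer fits inside a single
region. Splitting..." symptom in the logs above.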
