I can't explain that. It seems like the new guidepost width is not taking effect. Maybe there's some "special" means of updating a region server config property in HDP? When you update this parameter, do you see fewer guideposts created after a major compaction occurs?
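One way to check is to compare the number of guidepost rows for your table before and after the compaction, along the lines of the count query Matt ran below (with your table's physical name substituted for the 'MY_TABLE' placeholder):

  select count(*) from system.stats where physical_name = 'MY_TABLE';

With the width raised to 10737418240 (your 10GB MAX_FILESIZE), that count should drop sharply after the next major compaction, to at most a row or two per region per column family; if it stays in the hundreds, the region servers are still running with the old width.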
On Mon, Dec 7, 2015 at 3:46 PM, Thangamani, Arun <[email protected]> wrote:

> I bounced the region servers with phoenix.stats.guidepost.width = 10737418240
> (which is the max file size set from Ambari).
>
> Like Matt, I am seeing entries created in the SYSTEM.STATS table as well.
> Any other suggestions, James?
>
> From: Matt Kowalczyk <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Monday, December 7, 2015 at 2:52 PM
> To: "[email protected]" <[email protected]>
> Subject: Re: system.catalog and system.stats entries slows down bulk MR
> inserts by 20-25X (Phoenix 4.4)
>
> I'm sorry I communicated poorly in the previous e-mail. I meant to provide
> a list of the things that I did: I bounced, then performed a major
> compaction, and then ran the select count(*) query.
>
> On Mon, Dec 7, 2015 at 2:49 PM, James Taylor <[email protected]> wrote:
>
>> You need to bounce the cluster *before* the major compaction, or the region
>> server will continue to use the old guidepost setting during the compaction.
>>
>> On Mon, Dec 7, 2015 at 2:45 PM, Matt Kowalczyk <[email protected]> wrote:
>>
>>> Bounced, just after a major compaction, with the setting as indicated
>>> above. I'm unable to disable the stats table.
>>>
>>> select count(*) from system.stats where physical_name = 'XXXXX';
>>> +------------------------------------------+
>>> |                 COUNT(1)                 |
>>> +------------------------------------------+
>>> | 653                                      |
>>> +------------------------------------------+
>>> 1 row selected (0.036 seconds)
>>>
>>> On Mon, Dec 7, 2015 at 2:41 PM, James Taylor <[email protected]> wrote:
>>>
>>>> Yes, setting that property is another way to disable stats. You'll need
>>>> to bounce your cluster after setting either of these, and stats won't be
>>>> updated until a major compaction occurs.
>>>>
>>>> On Monday, December 7, 2015, Matt Kowalczyk <[email protected]> wrote:
>>>>
>>>>> I've set phoenix.stats.guidepost.per.region to 1 and continue to see
>>>>> entries added to the system.stats table. I believe this should have the
>>>>> same effect? I'll try setting the guidepost width though.
>>>>>
>>>>> On Mon, Dec 7, 2015 at 12:11 PM, James Taylor <[email protected]> wrote:
>>>>>
>>>>>> You can disable stats by setting the phoenix.stats.guidepost.width
>>>>>> config parameter to a larger value in the server-side hbase-site.xml.
>>>>>> The default is 104857600 (100MB). If you set it to your MAX_FILESIZE
>>>>>> (the size you allow a region to grow to before it splits - default
>>>>>> 10GB), then you're essentially disabling stats. You could also try
>>>>>> something in between, maybe 5 or 10GB.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> On Mon, Dec 7, 2015 at 10:25 AM, Matt Kowalczyk <[email protected]> wrote:
>>>>>>
>>>>>>> We're also encountering slowdowns after bulk MR inserts. I've only
>>>>>>> measured slowdowns in the query path (since our bulk insert workloads
>>>>>>> vary in size, it hasn't been clear that we see slowdowns there, but I'll
>>>>>>> now measure this as well). The subject of my reported issue was "stats
>>>>>>> table causing slow queries".
>>>>>>>
>>>>>>> The stats table seems to be rebuilt during compactions, and I have to
>>>>>>> actively purge the table to regain sane query times. It would be sweet
>>>>>>> if the stats feature could be disabled.
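To put rough numbers on James's explanation above: with the default 104857600-byte (100MB) guidepost width and regions allowed to grow to 10GB, a full region accumulates roughly a hundred guideposts, which is how a table ends up with several hundred SYSTEM.STATS rows; raising the width to the region size leaves at most about one guidepost per region per column family. As a sketch (assuming a plain hbase-site.xml edited by hand; on HDP the same property would instead be added through Ambari as a custom hbase-site entry and pushed to the region servers on restart), the server-side setting being discussed would look like:

  <property>
    <name>phoenix.stats.guidepost.width</name>
    <value>10737418240</value>
  </property>

Stats for a table are only rewritten when that table next goes through a major compaction, which is why the restart has to come before the compaction.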
>>>>>>>
>>>>>>> On Mon, Dec 7, 2015 at 9:53 AM, Thangamani, Arun <[email protected]> wrote:
>>>>>>>
>>>>>>>> This is on hbase-1.1.1.2.3.0.0-2557, if that would make any
>>>>>>>> difference in the analysis. Thanks
>>>>>>>>
>>>>>>>> From: Arun Thangamani <[email protected]>
>>>>>>>> Date: Monday, December 7, 2015 at 12:13 AM
>>>>>>>> To: "[email protected]" <[email protected]>
>>>>>>>> Subject: system.catalog and system.stats entries slows down bulk MR
>>>>>>>> inserts by 20-25X (Phoenix 4.4)
>>>>>>>>
>>>>>>>> Hello, I noticed an issue with bulk inserts through MapReduce in
>>>>>>>> Phoenix 4.4.0.2.3.0.0-2557, using the outline of the code below.
>>>>>>>>
>>>>>>>> Normally the inserts of about 25 million rows complete in about 5
>>>>>>>> minutes; there are 5 region servers and the Phoenix table has 32 buckets.
>>>>>>>> But sometimes (maybe after major compactions or region movement?) the
>>>>>>>> writes slow down to 90 minutes. When I truncate the SYSTEM.STATS HBase
>>>>>>>> table, the inserts get a little faster (60 minutes), but when I truncate
>>>>>>>> both the SYSTEM.CATALOG and SYSTEM.STATS tables and recreate the Phoenix
>>>>>>>> table def(s), the inserts go back to 5 minutes. The workaround of
>>>>>>>> truncating the SYSTEM tables is not sustainable for long. Can someone
>>>>>>>> help and let me know if there is a patch available for this? Thanks in
>>>>>>>> advance.
>>>>>>>>
>>>>>>>> Job job = Job.getInstance(conf, NAME);
>>>>>>>> // Set the target Phoenix table and the columns
>>>>>>>> PhoenixMapReduceUtil.setOutput(job, tableName,
>>>>>>>>     "WEB_ID,WEB_PAGE_LABEL,DEVICE_TYPE," +
>>>>>>>>     "WIDGET_INSTANCE_ID,WIDGET_TYPE,WIDGET_VERSION,WIDGET_CONTEXT," +
>>>>>>>>     "TOTAL_CLICKS,TOTAL_CLICK_VIEWS,TOTAL_HOVER_TIME_MS,TOTAL_TIME_ON_PAGE_MS,TOTAL_VIEWABLE_TIME_MS," +
>>>>>>>>     "VIEW_COUNT,USER_SEGMENT,DIM_DATE_KEY,VIEW_DATE,VIEW_DATE_TIMESTAMP,ROW_NUMBER");
>>>>>>>> FileInputFormat.setInputPaths(job, inputPath);
>>>>>>>> job.setMapperClass(WidgetPhoenixMapper.class);
>>>>>>>> job.setMapOutputKeyClass(NullWritable.class);
>>>>>>>> job.setMapOutputValueClass(WidgetPagesStatsWritable.class);
>>>>>>>> job.setOutputFormatClass(PhoenixOutputFormat.class);
>>>>>>>> TableMapReduceUtil.addDependencyJars(job);
>>>>>>>> job.setNumReduceTasks(0);
>>>>>>>> job.waitForCompletion(true);
>>>>>>>>
>>>>>>>> public static class WidgetPhoenixMapper extends
>>>>>>>>     Mapper<LongWritable, Text, NullWritable, WidgetPagesStatsWritable> {
>>>>>>>>   @Override
>>>>>>>>   public void map(LongWritable longWritable, Text text, Context context)
>>>>>>>>       throws IOException, InterruptedException {
>>>>>>>>     Configuration conf = context.getConfiguration();
>>>>>>>>     String rundateString = conf.get("rundate");
>>>>>>>>     PagesSegmentWidgetLineParser parser = new PagesSegmentWidgetLineParser();
>>>>>>>>     try {
>>>>>>>>       PagesSegmentWidget pagesSegmentWidget = parser.parse(text.toString());
>>>>>>>>
>>>>>>>>       if (pagesSegmentWidget != null) {
>>>>>>>>         WidgetPagesStatsWritable widgetPagesStatsWritable = new WidgetPagesStatsWritable();
>>>>>>>>         WidgetPagesStats widgetPagesStats = new WidgetPagesStats();
>>>>>>>>
>>>>>>>>         widgetPagesStats.setWebId(pagesSegmentWidget.getWebId());
>>>>>>>>         widgetPagesStats.setWebPageLabel(pagesSegmentWidget.getWebPageLabel());
>>>>>>>>         widgetPagesStats.setWidgetInstanceId(pagesSegmentWidget.getWidgetInstanceId());
>>>>>>>>         ...
>>>>>>>>
>>>>>>>>         widgetPagesStatsWritable.setWidgetPagesStats(widgetPagesStats);
>>>>>>>>         context.write(NullWritable.get(), widgetPagesStatsWritable);
>>>>>>>>       }
>>>>>>>>     } catch (Exception e) {
>>>>>>>>       e.printStackTrace();
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>
>>>>>>>> public final class WidgetPagesStats {
>>>>>>>>   private String webId;
>>>>>>>>   private String webPageLabel;
>>>>>>>>   private long widgetInstanceId;
>>>>>>>>   private String widgetType;
>>>>>>>>   ...
>>>>>>>>
>>>>>>>>   @Override
>>>>>>>>   public boolean equals(Object o) {
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   @Override
>>>>>>>>   public int hashCode() {
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   @Override
>>>>>>>>   public String toString() {
>>>>>>>>     return "WidgetPhoenix{" + ... + '}';
>>>>>>>>   }
>>>>>>>> }
>>>>>>>>
>>>>>>>> public class WidgetPagesStatsWritable implements DBWritable, Writable {
>>>>>>>>
>>>>>>>>   private WidgetPagesStats widgetPagesStats;
>>>>>>>>
>>>>>>>>   public void readFields(DataInput input) throws IOException {
>>>>>>>>     widgetPagesStats.setWebId(input.readLine());
>>>>>>>>     widgetPagesStats.setWebPageLabel(input.readLine());
>>>>>>>>     widgetPagesStats.setWidgetInstanceId(input.readLong());
>>>>>>>>     widgetPagesStats.setWidgetType(input.readLine());
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public void write(DataOutput output) throws IOException {
>>>>>>>>     output.writeBytes(widgetPagesStats.getWebId());
>>>>>>>>     output.writeBytes(widgetPagesStats.getWebPageLabel());
>>>>>>>>     output.writeLong(widgetPagesStats.getWidgetInstanceId());
>>>>>>>>     output.writeBytes(widgetPagesStats.getWidgetType());
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public void readFields(ResultSet rs) throws SQLException {
>>>>>>>>     widgetPagesStats.setWebId(rs.getString("WEB_ID"));
>>>>>>>>     widgetPagesStats.setWebPageLabel(rs.getString("WEB_PAGE_LABEL"));
>>>>>>>>     widgetPagesStats.setWidgetInstanceId(rs.getLong("WIDGET_INSTANCE_ID"));
>>>>>>>>     widgetPagesStats.setWidgetType(rs.getString("WIDGET_TYPE"));
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public void write(PreparedStatement pstmt) throws SQLException {
>>>>>>>>     Connection connection = pstmt.getConnection();
>>>>>>>>     PhoenixConnection phoenixConnection = (PhoenixConnection) connection;
>>>>>>>>     // connection.getClientInfo().setProperty("scn",
>>>>>>>>     //     Long.toString(widgetPhoenix.getViewDateTimestamp()));
>>>>>>>>
>>>>>>>>     pstmt.setString(1, widgetPagesStats.getWebId());
>>>>>>>>     pstmt.setString(2, widgetPagesStats.getWebPageLabel());
>>>>>>>>     pstmt.setString(3, widgetPagesStats.getDeviceType());
>>>>>>>>     pstmt.setLong(4, widgetPagesStats.getWidgetInstanceId());
>>>>>>>>     ...
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public WidgetPagesStats getWidgetPagesStats() {
>>>>>>>>     return widgetPagesStats;
>>>>>>>>   }
>>>>>>>>
>>>>>>>>   public void setWidgetPagesStats(WidgetPagesStats widgetPagesStats) {
>>>>>>>>     this.widgetPagesStats = widgetPagesStats;
>>>>>>>>   }
>>>>>>>> }
