See http://phoenix.apache.org/update_statistics.html for more info.
Thanks,
James

On Thursday, January 15, 2015, [email protected] <[email protected]> wrote:
> Hi, James
> Really appreciated your detailed illustration. I issued UPDATE
> STATISTICS <table>, reran the count(*) query, and found that query
> performance improved. Does the statistics collection affect all
> queries, or only aggregate queries? How does that command improve
> query performance? It would be great if you could explain a little :)
>
> Our problem still occurs for now and we need to investigate the query
> more deeply. I will check the configurations you provided in the later
> tests.
>
> Thanks,
> Sun
>
> CertusNet
>
> From: James Taylor
> Date: 2015-01-15 17:46
> To: user
> Subject: Re: Re: rpc timeout when count on large table
> Those settings (one or the other - you wouldn't set both) drive the
> amount of parallelization done (i.e. the number or byte size of each
> parallel chunk).
>
> What do you get when you run the following queries?
> SELECT COUNT(*) FROM SYSTEM.STATS WHERE PHYSICAL_NAME = '<your full table name>';
> SELECT SUM(GUIDE_POSTS_COUNT) FROM SYSTEM.STATS WHERE PHYSICAL_NAME = '<your full table name>';
>
> As a test, try adding the following config parameter to
> hbase-site.xml on each region server:
> <property>
>   <name>phoenix.stats.guidepost.per.region</name>
>   <value>1</value>
> </property>
>
> After setting it, bounce your cluster and run the following to update
> your stats:
> UPDATE STATISTICS <your full table name>
>
> Then run your count(*) query again and see if there's any impact. Try
> setting phoenix.stats.guidepost.per.region successively higher to
> 2, 4, 8 (following the above steps) and see if it makes a difference
> in your query performance.
> Thanks,
> James
>
> On Thu, Jan 15, 2015 at 1:23 AM, [email protected] <[email protected]> wrote:
> > Hi, James
> > Yes, we are running 4.2.2.
> > Neither of these two configs is overridden. Do these configurations
> > only affect stats collection?
> > I had not checked the region server logs to see whether any major
> > compaction was running.
> >
> > Just curious about the query performance, since counting on that
> > table worked fine previously.
> >
> > Thanks,
> > Sun.
> >
> > CertusNet
> >
> > From: James Taylor
> > Date: 2015-01-15 17:10
> > To: user
> > Subject: Re: rpc timeout when count on large table
> > You're on 4.2.2, Sun? Have you overridden either of
> > phoenix.stats.guidepost.width or phoenix.stats.guidepost.per.region?
> > These control the size of each parallel scan. I assume you've run a
> > major compaction on the table at some point?
> >
> > Thanks,
> > James
> >
> > On Wed, Jan 14, 2015 at 7:06 PM, [email protected] <[email protected]> wrote:
> >> Hi, all
> >>
> >> When counting on a large table, we got the following exception:
> >> org.apache.hadoop.hbase.ipc.RpcClient$CallTimeoutException: Call id=,
> >> waitTime=69714 rpcTimeout=60000
> >>
> >> How can that be resolved? The table size comes to 17.3G according
> >> to hdfs dfs -du. The table has 90+ columns and only one column
> >> family, F. The compression codec is snappy.
> >>
> >> Thanks,
> >> Sun.
> >>
> >> CertusNet
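[Editor's note: separately from the guidepost tuning discussed above, the CallTimeoutException in the original report can also be worked around by raising the client-side timeouts. A minimal sketch of the relevant hbase-site.xml settings on the Phoenix client follows; the 600000 ms values are illustrative, not a recommendation.]

```xml
<!-- hbase-site.xml on the Phoenix client; values are illustrative -->
<property>
  <!-- HBase RPC timeout; the error above shows the 60000 ms default being hit -->
  <name>hbase.rpc.timeout</name>
  <value>600000</value>
</property>
<property>
  <!-- Overall Phoenix query timeout -->
  <name>phoenix.query.timeoutMs</name>
  <value>600000</value>
</property>
```

Raising timeouts only hides the slowness, of course; the guidepost tuning above addresses the underlying lack of parallelism.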
