Thanks for the debugging info, Sunil. I have uploaded a possible patch for the issue you are running into: https://issues.apache.org/jira/browse/PHOENIX-2207
Mind giving it a try?

On Tue, Aug 25, 2015 at 11:18 AM, Sunil B <[email protected]> wrote:
> Hi Samarth,
>
> Answers to your questions:
> 1) How many regions are there?
> Ans: Total regions: 21. Each region is 5 GB uncompressed (1.7 GB
> compressed). Total region servers: 7
>
> 2) Do you have Phoenix stats enabled?
> http://phoenix.apache.org/update_statistics.html
> Ans: The configuration is at its defaults. In my understanding, stats are
> enabled by default. While debugging on the client I did notice that the
> Phoenix client divides the query into ~1600 parallel scans using
> guideposts. Let me know if this doesn't answer your question.
>
> 3) Is the table salted?
> Ans: No
>
> 4) Do you have any overrides for scanner caching
> (hbase.client.scanner.caching) or result size
> (hbase.client.scanner.max.result.size) in your hbase-site.xml?
> Ans: No. Everything is at the default configuration.
>
>
> My analysis:
> ----------------
> As I said, this looks like a bug to me. Please let me know if you think
> otherwise.
> The chain of iterators on the client side, obtained from debugging:
> a RoundRobinResultIterator contains 1 ConcatResultIterator. The
> ConcatResultIterator contains ~1600 LookAheadResultIterators. Each
> LookAheadResultIterator contains one TableResultIterator.
>
> In the ParallelIterators.submitWork() method, iterator.peek() is called
> on each of the ~1600 LookAheadResultIterators. This peek() call
> fetches rows to cache from all ~1600 scanners. After that, the
> LookAheadResultIterators are added to a ConcatResultIterator in
> BaseResultIterators.getIterators(). Since the ConcatResultIterator
> walks through the rows serially, the purpose of peek() is defeated:
> by the time the ConcatResultIterator finishes with the first couple of
> LookAheadResultIterators, the scanners that were opened by peek()
> have timed out on the region servers.
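[Editor's note: the failure mode Sunil describes above can be illustrated with a minimal, hypothetical Java sketch. This is not Phoenix code; `Scanner`, the lease model, and all numbers are stand-ins. It only shows why opening every scanner eagerly and then draining them serially lets the later leases expire.]

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not Phoenix code): all scanners are opened eagerly
// by a peek()-like phase, then drained one at a time. Every lease starts
// at the same instant, so most of them expire before they are ever read.
public class EagerPeekSketch {

    // Stand-in for a region-server-side scanner lease.
    static class Scanner {
        final long openedAtMs;
        Scanner(long now) { this.openedAtMs = now; }
        // The server forgets the scanner once its lease has expired.
        boolean expired(long now, long leaseMs) { return now - openedAtMs > leaseMs; }
    }

    public static int countExpired(int scannerCount, long leaseMs, long perScannerDrainMs) {
        long clock = 0;
        List<Scanner> scanners = new ArrayList<>();
        // Eager phase: peek() opens every scanner up front (all leases start now).
        for (int i = 0; i < scannerCount; i++) {
            scanners.add(new Scanner(clock));
        }
        // Serial phase: a ConcatResultIterator-like loop drains one scanner at a time.
        int expired = 0;
        for (Scanner s : scanners) {
            if (s.expired(clock, leaseMs)) {
                expired++;                     // server would throw UnknownScannerException here
            } else {
                clock += perScannerDrainMs;    // time spent actually reading this scanner
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        // 1600 scanners, a 60 s lease, 1 s to drain each: only the first
        // ~60 scanners are read before the remaining leases have expired.
        System.out.println(countExpired(1600, 60_000, 1_000)); // prints 1539
    }
}
```

With a 60-second lease and one second of draining per scanner, only 61 of the 1600 scanners are consumed in time; the other 1539 hit the equivalent of UnknownScannerException, which matches the shape of the failure in the original report.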
>
> Workaround for now:
> ----------------------------
> I am trying the query with an "order by" clause added:
> select * from TABLE order by PRIMARY_KEY
> This modification seems to be working; it has been running for the
> past 5 hours now. I will update the thread with success or failure.
>
> Code analysis: the ScanPlan.newIterator method uses SerialIterators
> instead of ParallelIterators if there is an "order by" in the query.
>
> Thanks,
> Sunil
>
> On Mon, Aug 24, 2015 at 11:08 PM, Samarth Jain <[email protected]> wrote:
> > Sunil,
> >
> > Can you tell us a little bit more about the table?
> > 1) How many regions are there?
> >
> > 2) Do you have Phoenix stats enabled?
> > http://phoenix.apache.org/update_statistics.html
> >
> > 3) Is the table salted?
> >
> > 4) Do you have any overrides for scanner caching
> > (hbase.client.scanner.caching) or result size
> > (hbase.client.scanner.max.result.size) in your hbase-site.xml?
> >
> > Thanks,
> > Samarth
> >
> >
> > On Mon, Aug 24, 2015 at 2:03 PM, Sunil B <[email protected]> wrote:
> >>
> >> Hi,
> >>
> >> Phoenix version: 4.5.0-HBase-1.0
> >> Client: sqlline/JDBC driver
> >>
> >> I have a large table with around 100 GB of data. I am trying to
> >> execute a simple query, "select * from TABLE", which fails with a
> >> scanner timeout exception. Please let me know if there is a way to
> >> avoid this timeout without changing the server-side scanner timeout.
> >>
> >> Exception: WARN client.ScannerCallable: Ignore, probably already closed
> >> org.apache.hadoop.hbase.UnknownScannerException:
> >> org.apache.hadoop.hbase.UnknownScannerException: Name: 15791, already
> >> closed?
> >>
> >>
> >> The reason for the timeout is that Phoenix divides this query into
> >> multiple parallel scans and calls scanner.next() on each one of them
> >> at the start of query execution (because PeekingResultIterator.peek()
> >> is called in the submitWork() method of the ParallelIterators class).
> >>
> >> Is there a way I can force Phoenix to do a serial scan instead of a
> >> parallel scan with PeekingResultIterator?
> >>
> >> Thanks,
> >> Sunil
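[Editor's note: the serial behavior Sunil gets via SerialIterators can be sketched as a counterpart to the eager-open failure mode. Again, this is a hypothetical simulation, not Phoenix code: each scanner's lease starts only when it is about to be drained, so no lease can outlive its use as long as a single scanner drains within the lease period.]

```java
// Counterpart sketch (not Phoenix code): open each scanner lazily, right
// before draining it, as a serial plan would. A lease then only has to
// cover its own scanner's drain time, not the whole query's runtime.
public class LazyOpenSketch {

    public static int countExpired(int scannerCount, long leaseMs, long perScannerDrainMs) {
        long clock = 0;
        int expired = 0;
        for (int i = 0; i < scannerCount; i++) {
            long openedAt = clock;               // scanner opened just-in-time
            clock += perScannerDrainMs;          // drained immediately after opening
            if (clock - openedAt > leaseMs) {
                expired++;                       // only possible if one drain exceeds the lease
            }
        }
        return expired;
    }

    public static void main(String[] args) {
        // Same workload as before (1600 scanners, 60 s lease, 1 s drain each),
        // but with just-in-time opening nothing expires.
        System.out.println(countExpired(1600, 60_000, 1_000)); // prints 0
    }
}
```

Under the same assumed numbers that broke the eager variant, the lazy variant loses no scanners, which is consistent with Sunil's observation that the "order by" query (routed through SerialIterators) kept running where the parallel plan timed out.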
