Tanuj Khurana created PHOENIX-7117:
--------------------------------------

             Summary: Improve handling of scans that span a large number of table regions
                 Key: PHOENIX-7117
                 URL: https://issues.apache.org/jira/browse/PHOENIX-7117
             Project: Phoenix
          Issue Type: Improvement
    Affects Versions: 5.1.3, 5.2.0
            Reporter: Tanuj Khurana


Phoenix determines how many table regions a query's key range spans and creates that many scan objects, which are executed in parallel on the client's shared thread pool.
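
A minimal sketch of that one-scan-per-region split, written against the plain HBase client API (illustrative only, not Phoenix's actual planner code; the splitByRegion helper and its key-range arguments are assumptions):
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.RegionLocator;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionSplitSketch {

    // Build one Scan per region that intersects [startKey, stopKey).
    // The number of parallel scans is driven entirely by the table's region count.
    static List<Scan> splitByRegion(Connection conn, TableName table,
                                    byte[] startKey, byte[] stopKey) throws IOException {
        List<Scan> scans = new ArrayList<>();
        try (RegionLocator locator = conn.getRegionLocator(table)) {
            for (HRegionLocation loc : locator.getAllRegionLocations()) {
                byte[] regionStart = loc.getRegion().getStartKey();
                byte[] regionEnd = loc.getRegion().getEndKey();
                // Skip regions that end before the range starts or start after it ends.
                if (regionEnd.length > 0 && Bytes.compareTo(regionEnd, startKey) <= 0) continue;
                if (stopKey.length > 0 && Bytes.compareTo(regionStart, stopKey) >= 0) continue;
                // Clip the query range to this region's boundaries.
                byte[] scanStart = Bytes.compareTo(regionStart, startKey) > 0 ? regionStart : startKey;
                byte[] scanStop = (regionEnd.length == 0
                        || (stopKey.length > 0 && Bytes.compareTo(stopKey, regionEnd) < 0))
                        ? stopKey : regionEnd;
                scans.add(new Scan().withStartRow(scanStart).withStopRow(scanStop));
            }
        }
        return scans;
    }
}
{code}
For a table with many regions, a single statement can produce thousands of such scans, and each of them becomes a task on the client's task queue. The stack trace below shows what happens when those tasks overflow the queue: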

 
{code:java}
org.apache.phoenix.exception.PhoenixIOException: Task org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask@49c71704[Not completed, task = org.apache.phoenix.iterate.ParallelIterators$1@597a1aae] rejected from org.apache.phoenix.job.JobManager$1@7c04996f[Running, pool size = 20, active threads = 20, queued tasks = 5000, completed tasks = 645934]
Failed query: DELETE FROM TEST.EVENT WHERE OID = ? and KP = ? and EVENT_TS < ?. Stack trace: org.apache.phoenix.exception.PhoenixIOException: Task org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask@49c71704[Not completed, task = org.apache.phoenix.iterate.ParallelIterators$1@597a1aae] rejected from org.apache.phoenix.job.JobManager$1@7c04996f[Running, pool size = 20, active threads = 20, queued tasks = 5000, completed tasks = 645934]
        at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:146)
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1433)
        at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1297)
        at org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:52)
        at org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:107)
        at org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:127)
        at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
        at org.apache.phoenix.compile.DeleteCompiler$ServerSelectDeleteMutationPlan.execute(DeleteCompiler.java:815)
        at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:522)
        at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:488)
        at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:487)
        at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:475)
        at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:206)
 {code}
The configuration parameter *phoenix.query.queueSize*, which defaults to 5000, controls the size of the array backing the task queue. There are two issues here:
 * This query always fails in our production systems because its range scan spans more than 5000 table regions (see the sketch after this list).
 * Moreover, other queries running concurrently with it also fail with RejectedExecutionException, even though those queries don't create that many tasks themselves.
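
To make the failure mode concrete, here is a stand-alone sketch using a plain JDK ThreadPoolExecutor configured like the pool in the stack trace (20 threads, bounded queue of 5000). This is not Phoenix's JobManager code, just an illustration of how one task per region overflows a shared bounded queue:
{code:java}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueRejectionSketch {
    public static void main(String[] args) {
        // 20 threads and a bounded queue of 5000, mirroring "pool size = 20,
        // queued tasks = 5000" from the stack trace. The default AbortPolicy
        // throws RejectedExecutionException once the queue is full.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                20, 20, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(5000));
        int regionCount = 100_000; // one scan task per region of a huge table
        int submitted = 0;
        try {
            for (int i = 0; i < regionCount; i++) {
                pool.submit(() -> {
                    try { Thread.sleep(50); } catch (InterruptedException ignored) { }
                });
                submitted++;
            }
        } catch (RejectedExecutionException e) {
            // With all 20 threads busy, the queue fills after roughly 5020 submissions,
            // so a single query spanning > 5000 regions fails on its own, and any
            // concurrent query sharing the same pool gets rejected as well.
            System.out.println("Rejected after " + submitted + " tasks: " + e);
        } finally {
            pool.shutdownNow();
        }
    }
}
{code}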

I think blindly creating as many parallel scans as there are table regions doesn't scale for huge tables. In our production environment we have tables with more than 100,000 regions, so simply increasing the queue size is not a scalable solution. Moreover, a single query should not be able to monopolize the client JVM's resources (in this case, the task queue).
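
One possible direction, purely as an illustration and not a concrete proposal from this issue, is to cap how many scan tasks a single query may have in flight on the shared pool instead of submitting one task per region up front. The PerQueryThrottle class and its limit are assumptions, not Phoenix APIs:
{code:java}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;

public class PerQueryThrottle {
    private final ExecutorService sharedPool;
    private final Semaphore inFlight;

    public PerQueryThrottle(ExecutorService sharedPool, int maxInFlightPerQuery) {
        this.sharedPool = sharedPool;
        this.inFlight = new Semaphore(maxInFlightPerQuery);
    }

    // Submit each scan only once a permit is available, so a query spanning
    // 100,000 regions never occupies more than maxInFlightPerQuery queue slots.
    public <T> Future<T> submit(Callable<T> scanTask) throws InterruptedException {
        inFlight.acquire();
        try {
            return sharedPool.submit(() -> {
                try {
                    return scanTask.call();
                } finally {
                    inFlight.release();
                }
            });
        } catch (RejectedExecutionException rejected) {
            inFlight.release(); // give the permit back if the pool rejected the task
            throw rejected;
        }
    }
}
{code}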

 


