[
https://issues.apache.org/jira/browse/PHOENIX-7117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788302#comment-17788302
]
Istvan Toth commented on PHOENIX-7117:
--------------------------------------
Do you have a fix in mind?
We have run into similar issues; our solution was effectively disabling
guideposts for the table and increasing the queue size.
A simple improvement could be falling back to one scan per region if the number
of scans would exceed some configurable threshold.
However, for 100K regions that would still create 100K tasks.
Creating scans that span multiple regions could be a solution (sketched below),
but it would only reduce the queue size; it would still use the same amount of
HBase client and cluster resources (region scans). I am also not sure how well
that would fit the rest of the code.
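Purely to make that idea concrete, here is a minimal sketch of coalescing the
sorted region boundaries into wider scan ranges. Nothing here is an existing
Phoenix API; the class and method names are hypothetical:
{code:java}
import java.util.ArrayList;
import java.util.List;

public class ScanRangeCoalescer {
    /**
     * Hypothetical helper: given the sorted start keys of the regions a query
     * intersects, merge every groupSize consecutive regions into a single
     * [start, end) scan range, cutting the number of scan tasks by a factor
     * of groupSize. The final range is capped by the query's end key.
     */
    static List<byte[][]> coalesce(List<byte[]> regionStartKeys,
                                   byte[] queryEndKey, int groupSize) {
        List<byte[][]> ranges = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i += groupSize) {
            int next = i + groupSize;
            byte[] end = next < regionStartKeys.size()
                    ? regionStartKeys.get(next) : queryEndKey;
            ranges.add(new byte[][] { regionStartKeys.get(i), end });
        }
        return ranges;
    }
}
{code}
With groupSize = 20, 100K regions become 5K scan tasks, but each merged scan
still performs a region scan on every underlying region.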
Maybe we could have some kind of per-statement queue-like structure that stores
the scan ranges and dynamically generates tasks, so that only a limited number
of tasks per query are in the main queue at any time?
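Purely as a sketch of that queue-like structure (none of these classes exist in
Phoenix; a real version would also need to preserve iterator ordering and error
propagation):
{code:java}
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;

public class BoundedScanScheduler {
    private final ExecutorService pool;          // the shared Phoenix job pool
    private final Queue<byte[][]> pendingRanges; // this statement's scan ranges
    private final Consumer<byte[][]> scanRunner; // runs one scan range

    public BoundedScanScheduler(ExecutorService pool, Queue<byte[][]> ranges,
                                Consumer<byte[][]> scanRunner) {
        this.pool = pool;
        this.pendingRanges = ranges;
        this.scanRunner = scanRunner;
    }

    /** Seed the shared pool with at most maxInFlight tasks for this statement. */
    public void start(int maxInFlight) {
        for (int i = 0; i < maxInFlight; i++) {
            submitNext();
        }
    }

    /**
     * Each completed task pulls the next range off the per-statement queue,
     * so no more than maxInFlight tasks from this statement ever occupy the
     * shared queue, no matter how many regions the query spans.
     */
    private void submitNext() {
        byte[][] range = pendingRanges.poll();
        if (range == null) {
            return; // all ranges for this statement have been scheduled
        }
        pool.execute(() -> {
            try {
                scanRunner.accept(range);
            } finally {
                submitNext();
            }
        });
    }
}
{code}
A 100K-region query would then hold, say, 20 slots in the main queue instead of
100K, leaving room for concurrent statements.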
> Improve handling of scans that span large number of table regions
> -----------------------------------------------------------------
>
> Key: PHOENIX-7117
> URL: https://issues.apache.org/jira/browse/PHOENIX-7117
> Project: Phoenix
> Issue Type: Improvement
> Affects Versions: 5.2.0, 5.1.3
> Reporter: Tanuj Khurana
> Priority: Major
>
> Phoenix determines how many table regions a query will be sent to and
> creates that many scan objects, which are executed in parallel.
>
> {code:java}
> org.apache.phoenix.exception.PhoenixIOException: Task org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask@49c71704[Not completed, task = org.apache.phoenix.iterate.ParallelIterators$1@597a1aae] rejected from org.apache.phoenix.job.JobManager$1@7c04996f[Running, pool size = 20, active threads = 20, queued tasks = 5000, completed tasks = 645934]
> Failed query: DELETE FROM TEST.EVENT WHERE OID = ? and KP = ? and EVENT_TS < ?. Stack trace: org.apache.phoenix.exception.PhoenixIOException: Task org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask@49c71704[Not completed, task = org.apache.phoenix.iterate.ParallelIterators$1@597a1aae] rejected from org.apache.phoenix.job.JobManager$1@7c04996f[Running, pool size = 20, active threads = 20, queued tasks = 5000, completed tasks = 645934]
>     at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:146)
>     at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1433)
>     at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1297)
>     at org.apache.phoenix.iterate.ConcatResultIterator.getIterators(ConcatResultIterator.java:52)
>     at org.apache.phoenix.iterate.ConcatResultIterator.currentIterator(ConcatResultIterator.java:107)
>     at org.apache.phoenix.iterate.ConcatResultIterator.next(ConcatResultIterator.java:127)
>     at org.apache.phoenix.iterate.UngroupedAggregatingResultIterator.next(UngroupedAggregatingResultIterator.java:39)
>     at org.apache.phoenix.compile.DeleteCompiler$ServerSelectDeleteMutationPlan.execute(DeleteCompiler.java:815)
>     at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:522)
>     at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:488)
>     at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
>     at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:487)
>     at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:475)
>     at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeUpdate(PhoenixPreparedStatement.java:206)
> Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.phoenix.job.JobManager$InstrumentedJobFutureTask@49c71704[Not completed, task = org.apache.phoenix.iterate.ParallelIterators$1@597a1aae] rejected from org.apache.phoenix.job.JobManager$1@7c04996f[Running, pool size = 20, active threads = 20, queued tasks = 5000, completed tasks = 645934]
>     at org.apache.phoenix.job.JobManager$InstrumentedThreadPoolExecutor$1.rejectedExecution(JobManager.java:246)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:825)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1355)
>     at org.apache.phoenix.job.JobManager$InstrumentedThreadPoolExecutor.execute(JobManager.java:263)
>     at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:140)
>     at org.apache.phoenix.iterate.ParallelIterators.submitWork(ParallelIterators.java:132)
>     at org.apache.phoenix.iterate.BaseResultIterators.getIterators(BaseResultIterators.java:1329)
> {code}
> The configuration parameter *phoenix.query.queueSize*, which has a default
> value of 5000, controls the size of the array backing the task queue. There
> are two issues here (a workaround sketch follows this list):
> * This query always fails in our production systems because its range scan
> spans more than 5000 table regions.
> * Moreover, other queries running concurrently with this query also fail
> with RejectedExecutionException, even though those queries don't create that
> many tasks.
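> Raising the limit only defers the failure (see below), but for reference this
> is how the settings can be applied on the client. A minimal sketch; "zk-host"
> is a placeholder, and these client-side settings are read when the client's
> query services are initialized, so they have to be supplied before the first
> Phoenix connection (or set in hbase-site.xml):
> {code:java}
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.util.Properties;
>
> public class QueueSizeWorkaround {
>     public static void main(String[] args) throws Exception {
>         Properties props = new Properties();
>         // Client-side Phoenix settings; the defaults are 5000 and 20.
>         props.setProperty("phoenix.query.queueSize", "20000");
>         props.setProperty("phoenix.query.threadPoolSize", "64");
>         try (Connection conn = DriverManager.getConnection(
>                 "jdbc:phoenix:zk-host", props)) {
>             // Queries issued by this client now use the larger task queue.
>         }
>     }
> }
> {code}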
> I think blindly creating as many parallel scans as there are table regions
> doesn't scale for huge tables. In our production we have some tables with
> more than 100,000 regions. Simply increasing the queue size is not a scalable
> solution. Moreover, a single query should not be able to monopolize the
> entire client JVM's resources (in this case, the task queue).
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)