[
https://issues.apache.org/jira/browse/DRILL-7962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17373957#comment-17373957
]
Paul Rogers commented on DRILL-7962:
------------------------------------
Unless there is a bug, Drill will cancel the readers. If the readers continue
to run, then there is a bug.
Here is what I understand.
In the normal case, if you have a LIMIT 10, and, say, a dozen readers, all must
start and run. Why? Drill has no way to know which, if any, reader will produce
rows. If the query has a WHERE clause, Drill does not know which, if any, rows
will pass that filter. So, we have a dozen readers all busily reading files and
sending batches downstream.
Before we discuss LIMIT, let's talk about what happens if there is an error.
Suppose that the WHERE clause has WHERE 10 / a > 1, and you have a value for
{{a}} which is zero. The query will fail. At this point, the query fragment
will fail and alert the Foreman, which is running the query. The Foreman
will tell all other fragments to fail as well. All of this shows up as detailed
messages in the log file. I _think_ that the fragment terminates the reader by
raising an InterruptedException. The operators then fail upward until the
stack is popped to the root, at which time the query exits.
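That interrupt-driven shutdown can be sketched in miniature. This is an
illustrative loop only, not Drill's actual reader code; the class and method
names are hypothetical:

```java
// Hypothetical sketch of a reader that honors thread interruption.
// Not Drill's actual classes; illustrates the cancellation pattern only.
public class ReaderLoop {
    // Reads up to totalBatches batches, stopping early if the thread
    // has been interrupted (the stand-in for Foreman-driven cancellation).
    static int readUntilInterrupted(int totalBatches) {
        int sent = 0;
        for (int i = 0; i < totalBatches; i++) {
            if (Thread.currentThread().isInterrupted()) {
                break; // cancellation requested: stop reading immediately
            }
            sent++; // stand-in for "read a batch and send it downstream"
        }
        return sent;
    }

    public static void main(String[] args) {
        int all = readUntilInterrupted(5);  // no interrupt: all batches flow
        Thread.currentThread().interrupt(); // simulate a cancel request
        int none = readUntilInterrupted(5); // loop exits before reading
        Thread.interrupted();               // clear the flag again
        System.out.println(all + " " + none); // prints "5 0"
    }
}
```

A reader that never checks the interrupt flag (or swallows the exception)
would keep running, which is exactly the buggy behavior discussed below.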
So, that's how the query fails early on an error. Back to LIMIT.
At some point, a batch will arrive at the operator which implements LIMIT. I
can't recall which operator that is, but it is easy to check in the query
profile diagram in the web UI. The operator has to run in the root fragment,
the one labeled
SCREEN. Once that limit is reached, Drill should send a message to all
fragments to stop reading. I'm not sure of the details because I never worked
with this feature myself. I would expect that this event should also appear in
the log file, perhaps at the INFO level.
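The counting side of LIMIT can be shown with a toy sketch. This is not
Drill's actual limit operator, just the idea of truncating a stream of
incoming batches once enough rows have been seen:

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch of LIMIT semantics over a stream of batch sizes.
public class LimitSketch {
    // Returns how many rows pass a LIMIT over the incoming batches.
    static int applyLimit(List<Integer> batchSizes, int limit) {
        int seen = 0;
        Iterator<Integer> upstream = batchSizes.iterator();
        while (seen < limit && upstream.hasNext()) {
            int rows = upstream.next();
            // Take only as many rows from this batch as the limit allows.
            seen += Math.min(rows, limit - seen);
            // Once seen == limit, a real implementation would signal the
            // upstream fragments to cancel rather than draining them.
        }
        return seen;
    }

    public static void main(String[] args) {
        // Three batches of 4 rows against LIMIT 10: only 10 rows pass.
        System.out.println(applyLimit(List.of(4, 4, 4), 10)); // prints 10
    }
}
```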
I'm going to guess that each fragment, when it receives the "stop reading"
event (whatever it really is called), must also throw an Interrupted exception
to the fragment thread, just as for an error. (This needs to be checked.) As
before, the exception should unwind the stack.
Now, what could happen is that a reader chooses to ignore the exception,
perhaps because of a bug. If that happens, then the reader will keep working
even after the limit is reached. Let's suppose there is a bug. If so, then if
your query
fails with an error, the readers should also continue to run. If we have such a
bug, we should fix it.
Now, what might be going wrong? Here is one thing to check. When the exception
unwinds the stack, code in the Fragment will catch the error and will walk down
the operator tree to tell each operator to stop. In the past, some operators
wanted to "clear their inputs." They would keep reading batches from their
input until the data was exhausted. This was done to work around problems in
early versions of Drill around the network receiver. The modern code never
needs to do this; the cleanup steps will ensure memory is released, files are
closed, etc. So, perhaps there is some old code that is trying to clear its
input and should now be removed. Or, the problem could be something else
entirely.
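The difference between the legacy "clear the input" cleanup and the modern
close-and-release behavior can be shown in miniature; the names and structure
here are hypothetical, not Drill's actual operator code:

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical contrast of two operator-cleanup styles.
public class CleanupSketch {
    // Legacy style: consume every remaining batch before closing.
    static int legacyClose(Iterator<Integer> input) {
        int drained = 0;
        while (input.hasNext()) { input.next(); drained++; }
        return drained; // wasted work proportional to remaining input
    }

    // Modern style: release resources immediately, read nothing more.
    static int modernClose(Iterator<Integer> input) {
        return 0; // buffers freed, files closed; no extra reads needed
    }

    public static void main(String[] args) {
        List<Integer> remaining = List.of(1, 2, 3, 4, 5);
        System.out.println(legacyClose(remaining.iterator())); // prints 5
        System.out.println(modernClose(remaining.iterator())); // prints 0
    }
}
```

If a reader is still draining its input on cancel, the query's runtime after
the limit is hit grows with the size of the unscanned data, which matches the
symptom reported in this issue.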
In summary: the query should stop once the limit is reached. I remember, when I
first started on Drill, that the team had just done a bunch of work to ensure
that "LIMIT 0" worked quickly. So, if the query does not stop, it is a bug, and
we have to track down the problem.
> Stop query hbase region when other fragment have received more records than
> LIMIT records
> -----------------------------------------------------------------------------------------
>
> Key: DRILL-7962
> URL: https://issues.apache.org/jira/browse/DRILL-7962
> Project: Apache Drill
> Issue Type: Wish
> Components: Storage - HBase
> Affects Versions: 1.19.0
> Reporter: Hanoch Yang
> Priority: Blocker
>
> I find that when one fragment has received more records than the LIMIT,
> Drill does not cancel the other fragments, even though the work unit has
> received the message; this slows the query.
> Case: the data I want is located in a 2M region, but there is also a 100M
> region and two other 2M regions. I use LIMIT 10 in the query. Four fragments
> scan the four regions; when one fragment gets 10 records from the 2M region,
> the fragment scanning the 100M region does not stop. So the query must wait
> for that fragment to finish its scan, even though this is unnecessary.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)