[jira] [Commented] (CALCITE-849) Streams/Slow iterators dont close on statement close

Jesse Yates (JIRA) Tue, 17 Nov 2015 17:22:33 -0800

    [ 
https://issues.apache.org/jira/browse/CALCITE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009992#comment-15009992
 ]


Jesse Yates commented on CALCITE-849:
-------------------------------------

bq. Since the Filter removes all rows, Project.run never gets called

So then how do we interrupt the filter iteration? Or is its source 
enumerable also managed via push-based scheduling?

bq. The scheduler does the "build me a batch..." logic

So the client then submits a request in the form of a sink to the scheduler and 
the scheduler attempts to fulfill it when it gets a chance by making a request 
to the source, which pushes its state up to the scheduler when it has a row 
ready? Then, if the scheduler is just passing along the response to that 
request, how do you manage further down in the source the "1 second's worth of 
data" type of logic without a poll + timeout?
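To make the "poll + timeout" style concrete, here is a minimal Java sketch of collecting a time-bounded batch from a queue that a source pushes into. It is only an illustration under assumptions: {{TimedBatch}} and {{collect}} are hypothetical names, not Calcite APIs, and the queue stands in for whatever channel the source would push rows through.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Calcite): gathers rows until a time window closes. */
class TimedBatch {
  /** Returns the rows that arrive on the queue within windowMillis, possibly none. */
  static <T> List<T> collect(BlockingQueue<T> rows, long windowMillis) {
    List<T> batch = new ArrayList<>();
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(windowMillis);
    try {
      while (true) {
        long remaining = deadline - System.nanoTime();
        if (remaining <= 0) {
          break; // the window has closed
        }
        // Wait for the next pushed row, but never past the deadline.
        T row = rows.poll(remaining, TimeUnit.NANOSECONDS);
        if (row == null) {
          break; // timed out with no further rows
        }
        batch.add(row);
      }
    } catch (InterruptedException e) {
      // Preserve the interrupt so a caller that is cancelling can see it.
      Thread.currentThread().interrupt();
    }
    return batch;
  }
}
```

With a window of 1000 ms this gives the "1 second's worth of data" behaviour, and a filter that never matches simply yields an empty batch instead of blocking forever.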

I believe I ended up with a simple, single-row request, push-based scheduler 
when implementing my 
[TimerBasedCooperativePolicy|https://github.com/jyates/calcite/commit/cb3bc6f41f5ea8ff4a33c219f7673443e306e6d6#diff-57b5f055c65174026300ecc0a8285aa7R39]
 based on the same style I used above. The scheduler is running in the calling 
thread with the sink request executed directly and the "source" enumerator 
running in a new thread that pushes results to the 'scheduler' which is just 
returning the value up to the "sink". Note that it is lockless (except for 
access to the queue), but it still needs a timeout.
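The arrangement described above can be sketched roughly as follows. This is a simplified illustration and not the code in the linked commit: {{SingleRowHandoff}}, {{next}} and {{sourceRunning}} are hypothetical names, and it assumes the source iterator only blocks in places that respond to a thread interrupt.

```java
import java.util.Iterator;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical single-row, push-based handoff between a source thread and the caller. */
class SingleRowHandoff<T> implements AutoCloseable {
  // Capacity 1: the source can be at most one row ahead of the sink.
  private final BlockingQueue<T> queue = new ArrayBlockingQueue<>(1);
  private final Thread source;

  SingleRowHandoff(Iterator<T> rows) {
    source = new Thread(() -> {
      try {
        while (rows.hasNext()) {
          queue.put(rows.next()); // blocks until the sink has taken the previous row
        }
      } catch (InterruptedException e) {
        // close() was called: stop pushing and let the thread end.
      }
    });
    source.setDaemon(true);
    source.start();
  }

  /** Runs in the calling thread; returns the next row, or null if none arrives in time. */
  T next(long timeoutMillis) {
    try {
      return queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    }
  }

  /** Interrupts the source thread and waits briefly for it to stop. */
  @Override public void close() {
    source.interrupt();
    try {
      source.join(1000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  boolean sourceRunning() {
    return source.isAlive();
  }
}
```

The only shared state is the queue, and the timeout in {{next}} is what lets the caller give up (and then call {{close}}) when the source never produces a row.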

Am I getting anywhere close to what you are thinking? Sorry for the runaround 
on this; I just want to avoid writing a bunch of code that goes in completely 
the wrong direction from what you are thinking. 


> Streams/Slow iterators dont close on statement close
> ----------------------------------------------------
>
>                 Key: CALCITE-849
>                 URL: https://issues.apache.org/jira/browse/CALCITE-849
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Jesse Yates
>            Assignee: Julian Hyde
>             Fix For: 1.5.0
>
>         Attachments: calcite-849-bug.patch
>
>
> This is easily seen when querying an infinite stream with a clause that 
> cannot be matched
> {code}
> select stream PRODUCT from orders where PRODUCT LIKE 'noMatch';
> select stream * from orders where PRODUCT LIKE 'noMatch';
> {code}
> The issue arises when accessing the results in a multi-threaded context. Yes, 
> it's not a good idea (and things will break, like here). However, this case 
> feels like it ought to be an exception.
> Suppose you are accessing a stream and have a query that doesn't match 
> anything on the stream for a long time. Because of the way a ResultSet is 
> built, the call to executeQuery() will hang until the first matching result 
> is received. In that case, you might want to cancel the query because it's 
> taking so long. You also want the thing that's accessing the stream (the 
> StreamTable implementation) to cancel the querying/collection - via a call to 
> close on the passed iterator/enumerable.
> Since the first result was never generated, the ResultSet was never returned 
> to the caller. You can get around this by using a second thread and keeping a 
> handle to the creating statement. When you go to close that statement though, 
> you end up not closing the cursor (and the underlying iterables/enumerables) 
> because it never finished getting created.
> It gets even more problematic if you use select * as the iterable doesn't 
> finish getting created in the AvaticaResultSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
