[ 
https://issues.apache.org/jira/browse/CALCITE-849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009549#comment-15009549
 ] 

Jesse Yates commented on CALCITE-849:
-------------------------------------

When I was saying 'dashboard' I was thinking of a "topic" paradigm, where you 
subscribe to updates about some trait over time (i.e. a monitoring dashboard 
where you care about some metrics over time), which can be expressed as a 
projection + filtering + window; I think we are on the same page there. The 
"updating pie chart" model seems like a historical table joined to a stream, 
right?

The problem with using Source/Sink/Node is that there is no feedback between 
the source and sink when one is being overwhelmed. However, we could just 
implement the degenerate case where the source only ever sends a single row 
along to the interpreter and manage the backpressure in your plugin (e.g. use 
the reactive interface to manage the stream without affecting the Calcite 
engine).
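
To make the degenerate case concrete, here is a rough sketch in plain Java. 
None of these class names exist in Calcite; SingleRowHandoff and 
BackpressureSketch are made up for illustration, and a SynchronousQueue stands 
in for whatever the plugin would really use. The source can only hand over one 
row at a time, so it blocks until the consumer side has taken that row, which 
is the only "feedback" in this model:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.SynchronousQueue;

/** Hypothetical single-row handoff; not part of Calcite. */
final class SingleRowHandoff<T> {
  // A SynchronousQueue holds no elements: put() waits for a matching take().
  private final SynchronousQueue<T> slot = new SynchronousQueue<>();

  /** Called by the source; blocks until the consumer has taken the row. */
  void send(T row) throws InterruptedException {
    slot.put(row);
  }

  /** Called by the consumer side; blocks until a row arrives. */
  T receive() throws InterruptedException {
    return slot.take();
  }
}

/** Hypothetical driver that pushes rows through the handoff. */
final class BackpressureSketch {
  /** Returns the rows in the order the consumer received them. */
  static List<String> run(List<String> rows) {
    SingleRowHandoff<String> handoff = new SingleRowHandoff<>();
    Thread source = new Thread(() -> {
      try {
        for (String row : rows) {
          handoff.send(row); // blocks while the consumer is busy
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    source.setDaemon(true);
    source.start();

    List<String> seen = new ArrayList<>();
    try {
      for (int i = 0; i < rows.size(); i++) {
        seen.add(handoff.receive());
      }
      source.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println(run(Arrays.asList("a", "b", "c")));
  }
}
```

The engine side never sees more than one row in flight here, so anything 
smarter (batching, dropping, reactive demand signals) would stay inside the 
plugin.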

Are you thinking that the interpreter's enumerator would instead be based on 
just calling Node#next()? For nodes that keep a list, then you just buffer that 
in memory anyways on the node, which actually cleans up the interpreter a bit. 
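
Roughly what I have in mind for a pull-style node, as a sketch only. PullNode, 
ScanNode, FilterNode, SortNode and PullNodes are hypothetical names, not the 
interpreter's actual classes, and using null as the end-of-input marker is an 
assumption made to keep the example short. The filter streams row by row, 
while the sort has to buffer its whole input in memory on the first call:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical pull-style node; not Calcite's interpreter Node interface. */
interface PullNode<T> {
  /** Returns the next row, or null when the input is exhausted. */
  T next();
}

/** Leaf node that reads rows from an in-memory source. */
final class ScanNode<T> implements PullNode<T> {
  private final Iterator<T> rows;

  ScanNode(Iterable<T> source) {
    this.rows = source.iterator();
  }

  @Override public T next() {
    return rows.hasNext() ? rows.next() : null;
  }
}

/** Streaming node: pulls from its input until a row passes the condition. */
final class FilterNode<T> implements PullNode<T> {
  private final PullNode<T> input;
  private final Predicate<T> condition;

  FilterNode(PullNode<T> input, Predicate<T> condition) {
    this.input = input;
    this.condition = condition;
  }

  @Override public T next() {
    for (T row = input.next(); row != null; row = input.next()) {
      if (condition.test(row)) {
        return row;
      }
    }
    return null;
  }
}

/** Buffering node: drains and sorts its input on the first call to next(). */
final class SortNode<T extends Comparable<T>> implements PullNode<T> {
  private final PullNode<T> input;
  private Iterator<T> sorted; // null until the input has been buffered

  SortNode(PullNode<T> input) {
    this.input = input;
  }

  @Override public T next() {
    if (sorted == null) {
      List<T> buffer = new ArrayList<>();
      for (T row = input.next(); row != null; row = input.next()) {
        buffer.add(row);
      }
      Collections.sort(buffer);
      sorted = buffer.iterator();
    }
    return sorted.hasNext() ? sorted.next() : null;
  }
}

/** Helper that plays the role of the enumerator driving the root node. */
final class PullNodes {
  static <T> List<T> drain(PullNode<T> root) {
    List<T> out = new ArrayList<>();
    for (T row = root.next(); row != null; row = root.next()) {
      out.add(row);
    }
    return out;
  }
}
```

Note that on an infinite stream the FilterNode loop is exactly where a 
never-matching predicate would spin or block forever, and a SortNode would 
never finish buffering.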

From there, I see how WindowNode would work, but how would you redo something 
like TableScanNode, which doesn't use run() at all, but instead relies on 
create() to build the enumerable, which under the hood uses the generated code 
to match the condition? Seems like a fair bit of work there to rewrite that.

A challenge I had was where a non-pushed-down predicate never allows a row to 
pass from the TableScan, so you never bubble up a row to be checked (so the row 
count solution never gets a row to count), since it comes from the 
TableScanNode. My temp fix was a timeout-based enumerator, but that had the 
overhead of an extra thread per query. Maybe that can be fixed by pushing down 
the same logic into the generated predicate logic (Buzz) as well?
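
For reference, the timeout-based approach looks roughly like the sketch below. 
This is not the code from the attached patch; TimeoutIterator and StuckSource 
are hypothetical names, and the class only mirrors the moveNext()/current() 
shape of a linq4j Enumerator rather than implementing that interface. It also 
shows where the extra thread per query comes from:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical wrapper that gives up if no row arrives within a timeout. */
final class TimeoutIterator<T> implements AutoCloseable {
  private final Iterator<T> source;
  private final long timeoutMillis;
  // The extra thread per query: it performs the potentially blocking fetch.
  private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "timeout-iterator-worker");
    t.setDaemon(true);
    return t;
  });
  private T current;

  TimeoutIterator(Iterator<T> source, long timeoutMillis) {
    this.source = source;
    this.timeoutMillis = timeoutMillis;
  }

  /**
   * Returns true if a row was fetched in time. Returns false on end of input
   * and also on timeout; a real implementation would likely distinguish them.
   */
  boolean moveNext() {
    Future<Boolean> pending = worker.submit(() -> {
      if (!source.hasNext()) {
        return false;
      }
      current = source.next();
      return true;
    });
    try {
      return pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      pending.cancel(true); // interrupt the blocked fetch
      return false;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      pending.cancel(true);
      return false;
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    }
  }

  T current() {
    return current;
  }

  @Override public void close() {
    worker.shutdownNow();
  }
}

/** Stand-in for a scan whose predicate never lets a row through. */
final class StuckSource<T> implements Iterator<T> {
  private final CountDownLatch never = new CountDownLatch(1);

  @Override public boolean hasNext() {
    try {
      never.await(); // blocks until interrupted
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return false;
  }

  @Override public T next() {
    throw new NoSuchElementException();
  }
}
```

With a StuckSource, moveNext() returns false once the timeout elapses, so the 
caller gets control back and can check whether the statement was closed. The 
cancel only helps if the underlying fetch responds to interruption.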

Thanks Julian!

> Streams/Slow iterators dont close on statement close
> ----------------------------------------------------
>
>                 Key: CALCITE-849
>                 URL: https://issues.apache.org/jira/browse/CALCITE-849
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Jesse Yates
>            Assignee: Julian Hyde
>             Fix For: 1.5.0
>
>         Attachments: calcite-849-bug.patch
>
>
> This is easily seen when querying an infinite stream with a clause that 
> cannot be matched
> {code}
> select stream PRODUCT from orders where PRODUCT LIKE 'noMatch';
> select stream * from orders where PRODUCT LIKE 'noMatch';
> {code}
> The issue arises when accessing the results in a multi-threaded context. Yes, 
> it's not a good idea (and things will break, like here). However, this case 
> feels like it ought to be an exception.
> Suppose you are accessing a stream and have a query that doesn't match 
> anything on the stream for a long time. Because of the way a ResultSet is 
> built, the call to executeQuery() will hang until the first matching result 
> is received. In that case, you might want to cancel the query because it's 
> taking so long. You also want the thing that's accessing the stream (the 
> StreamTable implementation) to cancel the querying/collection - via a call to 
> close on the passed iterator/enumerable.
> Since the first result was never generated, the ResultSet was never returned 
> to the caller. You can get around this by using a second thread and keeping a 
> handle to the creating statement. When you go to close that statement though, 
> you end up not closing the cursor (and the underlying iterables/enumerables) 
> because it never finished getting created.
> It gets even more problematic if you use select *, as the iterable doesn't 
> finish getting created in the AvaticaResultSet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)