Hi Julian,

Thanks for the response. Will create a jira ticket and come up with some samples.
Milinda

On Sat, Nov 14, 2015 at 3:38 AM, Julian Hyde <[email protected]> wrote:

> Short answer: yes, we should allow it.
>
> The design falls into 3 parts:
> * Validation. We should allow any combination: table-table, stream-table
> and stream-stream joins, as long as the query can make progress. That often
> means that where a stream is involved, the join condition should involve a
> monotonic expression. If it is a stream-table join you can make progress
> without the monotonic expression, but if there are 2 streams you will need
> it.
> * Translation to relational algebra. Inspired by differential calculus’
> product rule[1], "stream(x join y)" becomes "x join stream(y) union all
> stream(x) join y". Suppose that products is a table (i.e. we do not receive
> notifications of new products); then "stream(products)" is empty. Suppose
> that orders is both a stream and a table, i.e. a stream with history.
> Because stream(products) is empty, "stream(products join orders)" is simply
> "products join stream(orders)". These rewrites would happen in a
> DeltaJoinTransposeRule.
> * Updates to relations. Suppose that the products table is updated two or
> three times during each day. How quickly does the end user expect those
> updated records to appear in the output of the stream-table join? If the
> table is updated at 10am, should the new data be loaded only when
> processing transactions from 10am (which might not hit the join until, say,
> 10:07am)? There is no ‘right answer’ here; we should offer the end user a
> choice of policies. A good basic policy would be “cache for no more than T
> seconds” or “cache as long as you like”, with a manual way to flush the
> cache.
>
> Can you please log a jira case to track this? Next step would be to write
> some sample queries and decide whether they are valid.
>
> Julian
>
> [1] https://en.wikipedia.org/wiki/Product_rule
>
> On Nov 13, 2015, at 9:35 PM, Milinda Pathirage <[email protected]> wrote:
> >
> > Hi devs,
> >
> > Current SqlValidatorImpl doesn't allow queries like the following:
> >
> > select stream orders.orderId, orders.productId, products.name from
> > orders join products on orders.productId = products.id
> >
> > if 'products' is a relation. This query fails at the modality check.
> > But I am not sure whether fixing (or changing) the modality checking
> > logic is enough to solve this. Do we need to change planner rules as
> > well? I would really appreciate any ideas on this.
> >
> > Thanks
> > Milinda
> >
> > p.s. I am trying to get this base case working, where every element from
> > a stream is joined with a relation. Stream-to-stream joins require changes
> > to the parser as well to support windowing. That's my understanding; Julian
> > may have better ideas.
> >
> > --
> > Milinda Pathirage
> >
> > PhD Student | Research Assistant
> > School of Informatics and Computing | Data to Insight Center
> > Indiana University
> >
> > twitter: milindalakmal
> > skype: milinda.pathirage
> > blog: http://milinda.pathirage.org
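
The product-rule rewrite in Julian's reply can be checked on toy data. The following is a minimal Python sketch, not Calcite code: the names join and delta_join and the sample rows are hypothetical, and relations are modelled as bags (collections.Counter) so that "union all" is plain addition. It assumes insert-only changes, and it reads the x in "x join stream(y)" as the contents of x before the change and the y in "stream(x) join y" as the contents of y after the change, which is one way to make the two terms add up exactly.

```python
from collections import Counter


def join(orders, products):
    """Bag equi-join of orders (order_id, product_id) with
    products (product_id, name) on product_id."""
    return Counter(
        (order_id, product_id, name)
        for (order_id, product_id) in orders.elements()
        for (pid, name) in products.elements()
        if product_id == pid
    )


def delta_join(x_old, dx, y_old, dy):
    """New rows of (x join y), given the rows dx and dy added to x and y.

    Mirrors "x join stream(y) union all stream(x) join y":
    old x against new y rows, plus new x rows against the updated y.
    """
    y_new = y_old + dy
    return join(x_old, dy) + join(dx, y_new)


# Hypothetical sample data.
orders_old = Counter([(1, 10)])
orders_new_rows = Counter([(2, 10), (3, 11)])      # stream(orders)
products = Counter([(10, "apple"), (11, "pear")])
no_product_changes = Counter()                     # stream(products) is empty

delta = delta_join(orders_old, orders_new_rows, products, no_product_changes)
for row in sorted(delta.elements()):
    print(row)
```

Because stream(products) is empty here, the first term contributes nothing and the result equals join(orders_new_rows, products), matching the simplification to "products join stream(orders)" in the reply.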
