I really appreciate the update, Nuwan! User error is entirely understandable - this is complex stuff. And we now know that there isn’t a lurking bug in Calcite.
By the way, 1.14 will add ANY, SOME and ALL operators to Calcite. These have similar tricky semantics to NOT IN but hopefully I got it right. Julian > On Sep 27, 2017, at 11:21 AM, Nuwan Abeysiriwardana <[email protected]> > wrote: > > Hi Julian, > i just found the issue. it's nothing to do with the query plan. it just a > mistake i did when converting iterator to Iterable. > > Linq4j.asEnumerable(new StreamIterable(rowSupplier.get().skip(1). > iterator())); > > public class StreamIterable<T> implements Iterable<T> { private final > Iterator<T> iterator; public StreamIterable(Iterator<T> iterator) { > this.iterator = iterator; } @Override public Iterator<T> iterator() { > return iterator; } } > > my above class implementation is wrong. it's using same iterator when > engine request new iterator. > > thought updating everyone is a good idea ,just for the knowledge. > > > thanks > nuwan > > On Wed, Sep 27, 2017 at 2:13 AM, Nuwan Abeysiriwardana <[email protected]> > wrote: > >> i have updated the test case with multiple columns table. still issue >> exist. i'll try to debug more and give you more detail on that. >> >> thanks a lot. >> >> >> nuwan >> >> On Wed, Sep 27, 2017 at 2:01 AM, Julian Hyde <[email protected]> wrote: >> >>> Thanks for the test case. Run with -Dcalcite.debug, take a look at the >>> generated code and see whether == and .equals are being used correctly. >>> Also see what happens if each table has two columns rather than one. Some >>> of our “optimizations” for single-column tables may be suspect. >>> >>> Julian >>> >>> >>>> On Sep 26, 2017, at 1:15 PM, Nuwan Abeysiriwardana <[email protected]> >>> wrote: >>>> >>>> Hi Julian, >>>> i'm sorry if I'm mislead with this CSV thing. this CSV table not the >>>> calcite inbuilt csv table. this table i have tested, created by me by >>> just >>>> implementing ScannableTable . test case created. please check below >>> link. >>>> >>>> <goog_1962026729> >>>> https://github.com/nuwansa/CalcitePlayGround >>>> >>>> >>>> thanks >>>> nuwan >>>> >>>> >>>> On Wed, Sep 27, 2017 at 12:41 AM, Julian Hyde <[email protected]> wrote: >>>> >>>>> It works fine for me: >>>>> >>>>> $ cat file/src/test/resources/sales/ORDERS.csv >>>>> Q:string >>>>> aaaa >>>>> 3333 >>>>> zzzz >>>>> $ cat file/src/test/resources/sales/TRADES.csv >>>>> A:string >>>>> aaaa >>>>> 1111 >>>>> 2222 >>>>> $ ./sqlline -u "jdbc:calcite:schemaFactory=org.apache.calcite.adapter. >>>>> file.FileSchemaFactory;schema.directory=/Users/jhyde/open1/ >>>>> calcite.2/file/src/test/resources/sales" >>>>> sqlline version 1.3.0 >>>>> 0: jdbc:calcite:schemaFactory=org.apache.calc> select a, a not in >>> (select >>>>> q from "adhoc".orders) from "adhoc".trades; >>>>> +---+--------+ >>>>> | A | EXPR$1 | >>>>> +---+--------+ >>>>> | aaaa | false | >>>>> | 1111 | true | >>>>> | 2222 | true | >>>>> +---+--------+ >>>>> 3 rows selected (1.918 seconds) >>>>> >>>>> >>>>> Can you provide a test case that fails? >>>>> >>>>> >>>>> Julian >>>>> >>>>> >>>>>> On Sep 26, 2017, at 11:46 AM, Nuwan Abeysiriwardana < >>> [email protected]> >>>>> wrote: >>>>>> >>>>>> Hi Julian, >>>>>> >>>>>> thanks for the quick reply.i have check the leading and trailing >>>>> spaces,it >>>>>> looks good. execution plan as follows. >>>>>> >>>>>> EnumerableCalc(expr#0..4=[{inputs}], expr#5=[0], expr#6=[=($t1, >>> $t5)], >>>>>> expr#7=[false], expr#8=[IS NOT NULL($t4)], expr#9=[true], expr#10=[IS >>>>>> NULL($t0)], expr#11=[null], expr#12=[<($t2, $t1)], expr#13=[CASE($t6, >>>>> $t7, >>>>>> $t8, $t9, $t10, $t11, $t12, $t11, $t7)], expr#14=[NOT($t13)], Q=[$t0], >>>>>> output=[$t14]) >>>>>> EnumerableJoin(condition=[=($0, $3)], joinType=[left]) >>>>>> EnumerableJoin(condition=[true], joinType=[inner]) >>>>>> EnumerableCalc(expr#0..2=[{inputs}], Q=[$t0]) >>>>>> EnumerableInterpreter >>>>>> BindableTableScan(table=[[csv, execution]]) >>>>>> EnumerableAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)]) >>>>>> EnumerableInterpreter >>>>>> BindableTableScan(table=[[csv, trades]]) >>>>>> EnumerableAggregate(group=[{0, 1}]) >>>>>> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[true], A=[$t0], >>>>> i=[$t3]) >>>>>> EnumerableInterpreter >>>>>> BindableTableScan(table=[[csv, trades]]) >>>>>> >>>>>> thanks >>>>>> nuwan >>>>>> >>>>>> On Wed, Sep 27, 2017 at 12:05 AM, Julian Hyde <[email protected]> >>> wrote: >>>>>> >>>>>>> Sounds like it might be a bug. Can you do ‘explain plan’ and send the >>>>>>> result? Also check for leading and trailing spaces in your data. >>>>>>> >>>>>>>> On Sep 26, 2017, at 11:11 AM, Nuwan Abeysiriwardana < >>>>> [email protected]> >>>>>>> wrote: >>>>>>>> >>>>>>>> Hi All, >>>>>>>> i have following csv tables. >>>>>>>> >>>>>>>> TableA TableB >>>>>>>> +------+ +-------+ >>>>>>>> | Q | | A | >>>>>>>> +------+ +-------+ >>>>>>>> | aaaa | | aaaa | >>>>>>>> | 1111 | | ssss | >>>>>>>> | 555 | | 22222 | >>>>>>>> +------+ +-------+ >>>>>>>> >>>>>>>> 'select Q,Q not in(select A from TableB) as output from TableA' >>>>> produce >>>>>>>> following result >>>>>>>> >>>>>>>> +------+--------+ >>>>>>>> | Q | output | >>>>>>>> +------+--------+ >>>>>>>> | aaaa | true | >>>>>>>> | 1111 | true | >>>>>>>> | 555 | true | >>>>>>>> +------+--------+ >>>>>>>> >>>>>>>> but I'm expecting false for value 'aaaa'. >>>>>>>> >>>>>>>> also i have tested the following query and it work as expected. >>>>>>>> 'select Q,Q not in('aaaa','1111') as output from TableA' >>>>>>>> >>>>>>>> output : >>>>>>>> +------+--------+ >>>>>>>> | Q | output | >>>>>>>> +------+--------+ >>>>>>>> | aaaa | false | >>>>>>>> | 1111 | false | >>>>>>>> | 555 | true | >>>>>>>> +------+--------+ >>>>>>>> >>>>>>>> i have tested similar query with PostgreSQL ,it works fine. >>>>>>>> >>>>>>>> thanks >>>>>>>> nuwan >>>>>>> >>>>>>> >>>>> >>>>> >>> >>> >>
