Thanks for the clarification. Being a newbie I need to ask some basic questions: Running test suite, I would assume I need to provide a model for the connection to DB2 (I guess making a db2 version of test/resources/mysql-foodmart-model.json)
And I further assume the foodmart-db and data should be pre-populated? Regards, Magnus > 13 maj 2016 kl. 06:28 skrev Julian Hyde <[email protected]>: > > No dispute about aliases — they are a good idea, and we should use them, and > in fact we do use them. > > But every database I’ve come across (with the exception of DB2) has IMPLICIT > aliases which are equivalent to the un-qualified table name. For example, in > Oracle, > > select emp.empno, emp.deptno from sales.emp > > is equivalent to > > select emp.empno, emp.deptno from sales.emp as emp > > This assumption has worked for many databases so I’d rather treat DB2 as the > exception. I believe our approach is basically sound so I don’t want to turn > everything upside down. The queries we generate are reasonably concise, and > because they use aliases based on table names they are easier for humans to > understand. If our approach is flawed, you will surely be able to break > things and log bugs with test cases. > > If someone were to add DB2 to https://github.com/vlsi/calcite-test-dataset > <https://github.com/vlsi/calcite-test-dataset> then we could do much better > testing. One test case is worth a dozen email messages. > > Julian > > >> On May 12, 2016, at 12:56 AM, Magnus Pierre <[email protected]> wrote: >> >> Understand. Just joined… :) >> >> Just a comment on your reasoning around aliases. >> >> Aliases are the safest, and most practical route for all databases, and >> leaves no room to the database to do crazy interpretations of what you try >> to achieve. I fully understand that aliases are not for everyone, but being >> system generated SQL that will go against databases and not humans, the use >> of aliases will save query length (provided the aliases are shorter than >> table names), allow for returning duplications of column names since the >> name of the column will be alias.column-name, it will avoid cross join >> behavior commonly seen in for instance Teradata when mixing aliases and >> non-aliases. >> >> Not to mention that it could simplify the processing done by Calcite, and >> hopefully also remove a set of error conditions for specific databases (like >> DB2) where things are more strict than in others. >> >> Aliases has been required to be supported if I remember correctly since >> SQL-1999 so it should be a fairly safe feature to use for almost all JDBC >> databases. >> I have never seen automatic generation of aliases for table names as >> described. (But I must admit that for the last 15 or so years I’ve used >> aliases at all cases so that does not tell much) >> >> Another aspect is that unqualified table references need to be evaluated by >> the dbms against the current namespace (schema) that the user uses. If I >> have a table in my currently used namespace (schema) with the same name as >> the ”automatic alias” in a query, how will the database know if I am >> referring to my table in the current namespace which is unqualified, or an >> in query alias for the table in the from? >> >> Of course these are just my observations and I have complete trust in you >> guys knowing far,far more than me about database internals and how a >> database interprets the SQL language. I’m just a user even though I used to >> write some dbms engine code (object relational things) back in the days. >> >> I’ll open a JIRA late today, and will do the testing requested. >> >> Regards, >> Magnus >> >> -------- >> You should join the list so that you get replies, and so I don’t have to >> moderate each of >> your questions through. >> >> Well, I’ve learned something about DB2 SQL. I’d never heard of a database >> generating composite >> table aliases. We should fix the adapter so we generate the correct SQL for >> DB2. The fix is >> probably to make the result() method’s behavior depend on >> dialect.getDatabaseProduct(). >> Can you log a JIRA case please? >> >> Since you have access to DB2, could you also run Calcite’s whole test suite >> against DB2? >> Let’s find out what other issues are there. Add a “DB2" entry to enum >> CalciteAssert.DatabaseInstance >> and run with “-Dcalcite.test.db=DB2”. >> >> Julian >> >> >> >>> On May 11, 2016, at 1:50 PM, Magnus Pierre <[email protected]> wrote: >>> >>> Hello Julian, >>> >>> Not very convenient to not be part of the mailing list when asking >>> questions, good that >> the calcite-dev archives list responses. :) >>> >>> But here is my claim: >>> >>> The query: >>> >>> SELECT * >>> FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS" >>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" ON >>> "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" >>> = "PRODUCT_SURVEY_TOPIC".”PRODUCT_TOPIC_CODE” >>> >>> >>> Using squirrel returns: (as well as with Drill) >>> Error: DB2 SQL Error: SQLCODE=-206, SQLSTATE=42703, >>> SQLERRMC=PRODUCT_SURVEY_RESULTS.PRODUCT_TOPIC_CODE, >> DRIVER=4.13.127 >>> SQLState: 42703 >>> ErrorCode: -206 >>> Error: DB2 SQL Error: SQLCODE=-727, SQLSTATE=56098, >>> SQLERRMC=2;-206;42703;PRODUCT_SURVEY_RESULTS.PRODUCT_TOPIC_CODE, >> DRIVER=4.13.127 >>> SQLState: 56098 >>> ErrorCode: -727 >>> >>> >>> >>> In Drill: >>> >>> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: >>> The JDBC storage >> plugin failed while trying setup the SQL query. sql SELECT * FROM >> "GOSALESMR"."PRODUCT_SURVEY_RESULTS" >> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" ON >> "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" >> = "PRODUCT_SURVEY_TOPIC"."PRODUCT_TOPIC_CODE" plugin gs_db2 Fragment 0:0 >> [Error Id: 9e5c6edc-8572-4de2-b9ad-dce8a74589b1 >> on mpie-mbp.lan:31010] (com.ibm.db2.jcc.am.SqlSyntaxErrorException) DB2 SQL >> Error: SQLCODE=-206, >> SQLSTATE=42703, SQLERRMC=PRODUCT_SURVEY_RESULTS.PRODUCT_TOPIC_CODE, >> DRIVER=4.13.127 com.ibm.db2.jcc.am.id.a():677 >> com.ibm.db2.jcc.am.id.a():60 com.ibm.db2.jcc.am.id.a():127 >> com.ibm.db2.jcc.am.no.c():2653 >> com.ibm.db2.jcc.am.no.d():2641 com.ibm.db2.jcc.am.no.a():2090 >> com.ibm.db2.jcc.t4.cb.h():141 >> com.ibm.db2.jcc.t4.cb.b():41 com.ibm.db2.jcc.t4.q.a():32 >> com.ibm.db2.jcc.t4.sb.i():135 com.ibm.db2.jcc.am.no.ib():2059 >> com.ibm.db2.jcc.am.no.a():3130 com.ibm.db2.jcc.am.no.a():688 >> com.ibm.db2.jcc.am.no.executeQuery():672 >> org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 >> org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 >> org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup():177 >> org.apache.drill.exec.physical.impl.ScanBatch.():108 >> org.apache.drill.exec.physical.impl.ScanBatch.():136 >> org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():40 >> org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():33 >> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():148 >> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171 >> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():128 >> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171 >> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():101 >> org.apache.drill.exec.physical.impl.ImplCreator.getExec():79 >> org.apache.drill.exec.work.fragment.FragmentExecutor.run():231 >> org.apache.drill.common.SelfCleaningRunnable.run():38 >> java.util.concurrent.ThreadPoolExecutor.runWorker():1145 >> java.util.concurrent.ThreadPoolExecutor$Worker.run():615 >> java.lang.Thread.run():745 >>> >>> >>> With aliases in Squirrel: >>> >>> SELECT * FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS" AS >>> "PRODUCT_SURVEY_RESULTS" >>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" AS "PRODUCT_SURVEY_TOPIC" >>> ON "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" = >>> "PRODUCT_SURVEY_TOPIC".”PRODUCT_TOPIC_CODE" >>> >>> Returns all data with no errors or warnings. >>> >>> SELECT * FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS" AS PRODUCT_SURVEY_RESULTS >>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" AS PRODUCT_SURVEY_TOPIC >>> ON "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" = >>> "PRODUCT_SURVEY_TOPIC"."PRODUCT_TOPIC_CODE" >>> >>> Returns all data with no errors or warnings. >>> >>> I distinctly remember this being the same behavior for both Netezza and >>> Teradata for >> instance. (even though it was a few years since I really did lots of SQL on >> these systems). >> I can test Netezza but have no access to a TD box currently. >>> >>> Regards, >>> Magnus >>> >>> ————————————— >>> >>> This is a response to: >>> ————————————— >>> >>> We originally generated an alias for every table in the FROM clause. But we >>> discovered >> that >>> the aliases were not necessary and made the query less easy to understand. >>> For example, >> the >>> query you mentioned earlier would become >>> >>> SELECT * >>> FROM "GOSALESMR".”PRODUCT_SURVEY_RESULTS” AS T0 >>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC” AS T1 ON >>> T0."PRODUCT_TOPIC_CODE" = T1.”PRODUCT_TOPIC_CODE" >>> >>> So, we add aliases only if the table aliases that are automatically >>> generated by the >> target >>> system’s SQL validator are not unique. >>> >>> This is the code that assigns aliases only if they are necessary: >>> >>> public Result result(SqlNode node, Collection<Clause> clauses, RelNode rel) >>> { >>> final String alias2 = SqlValidatorUtil.getAlias(node, -1); >>> final String alias3 = alias2 != null ? alias2 : "t"; >>> final String alias4 = >>> SqlValidatorUtil.uniquify( >>> alias3, aliasSet, SqlValidatorUtil.EXPR_SUGGESTER); >>> final String alias5 = alias2 == null || !alias2.equals(alias4) ? alias4 >>> : null; >>> return new Result(node, clauses, alias5, >>> Collections.singletonList(Pair.of(alias4, rel.getRowType()))); >>> } >>> >>> You assert that >>> >>> SELECT * >>> FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS" >>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" ON >>> "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" >>> = "PRODUCT_SURVEY_TOPIC".”PRODUCT_TOPIC_CODE” >>> >>> is not correct, but I disagree. I think it is fine. My understanding is >>> that the table >> "GOSALESMR".”PRODUCT_SURVEY_TOPIC” >>> will implicitly have an alias “PRODUCT_SURVEY_TOPIC” on all SQL systems, >>> including >> DB2. >>> If my understanding is wrong, then we do indeed have a problem. Does that >>> query fail >> to validate, >>> or give wrong results? >>> >>> Julian >>> >>> >>> >> >
