Thanks for the clarification. 

Being a newbie I need to ask some basic questions:
Running test suite, I would assume I need to provide a model for the connection 
to DB2 (I guess making a db2 version of 
test/resources/mysql-foodmart-model.json)

And I further assume the foodmart-db and data should be pre-populated? 

Regards,
Magnus


> 13 maj 2016 kl. 06:28 skrev Julian Hyde <[email protected]>:
> 
> No dispute about aliases — they are a good idea, and we should use them, and 
> in fact we do use them.
> 
> But every database I’ve come across (with the exception of DB2) has IMPLICIT 
> aliases which are equivalent to the un-qualified table name. For example, in 
> Oracle,
> 
>  select emp.empno, emp.deptno from sales.emp
> 
> is equivalent to 
> 
>  select emp.empno, emp.deptno from sales.emp as emp
> 
> This assumption has worked for many databases so I’d rather treat DB2 as the 
> exception. I believe our approach is basically sound so I don’t want to turn 
> everything upside down. The queries we generate are reasonably concise, and 
> because they use aliases based on table names they are easier for humans to 
> understand. If our approach is flawed, you will surely be able to break 
> things and log bugs with test cases.
> 
> If someone were to add DB2 to https://github.com/vlsi/calcite-test-dataset 
> <https://github.com/vlsi/calcite-test-dataset> then we could do much better 
> testing. One test case is worth a dozen email messages.
> 
> Julian
> 
> 
>> On May 12, 2016, at 12:56 AM, Magnus Pierre <[email protected]> wrote:
>> 
>> Understand. Just joined… :)
>> 
>> Just a comment on your reasoning around aliases. 
>> 
>> Aliases are the safest, and most practical route for all databases, and 
>> leaves no room to the database to do crazy interpretations of what you try 
>> to achieve. I fully understand that aliases are not for everyone, but being 
>> system generated SQL that will go against databases and not humans, the use 
>> of aliases will save query length (provided the aliases are shorter than 
>> table names), allow for returning duplications of column names since the 
>> name of the column will be alias.column-name, it will avoid cross join 
>> behavior commonly seen in for instance Teradata when mixing aliases and 
>> non-aliases.
>> 
>> Not to mention that it could simplify the processing done by Calcite, and 
>> hopefully also remove a set of error conditions for specific databases (like 
>> DB2) where things are more strict than in others.
>> 
>> Aliases has been required to be supported if I remember correctly since 
>> SQL-1999 so it should be a fairly safe feature to use for almost all JDBC 
>> databases.
>> I have never seen automatic generation of aliases for table names as 
>> described. (But I must admit that for the last 15 or so years I’ve used 
>> aliases at all cases so that does not tell much) 
>> 
>> Another aspect is that unqualified table references need to be evaluated by 
>> the dbms against the current namespace (schema) that the user uses. If I 
>> have a table in my currently used namespace (schema) with the same name as 
>> the ”automatic alias” in a query, how will the database know if I am 
>> referring to my table in the current namespace which is unqualified, or an 
>> in query alias for the table in the from? 
>> 
>> Of course these are just my observations and I have complete trust in you 
>> guys knowing far,far more than me about database internals and how a 
>> database interprets the SQL language. I’m just a user even though I used to 
>> write some dbms engine code (object relational things) back in the days.
>> 
>> I’ll open a JIRA late today, and will do the testing requested.
>> 
>> Regards,
>> Magnus
>> 
>> --------
>> You should join the list so that you get replies, and so I don’t have to 
>> moderate each of
>> your questions through.
>> 
>> Well, I’ve learned something about DB2 SQL. I’d never heard of a database 
>> generating composite
>> table aliases. We should fix the adapter so we generate the correct SQL for 
>> DB2. The fix is
>> probably to make the result() method’s behavior depend on 
>> dialect.getDatabaseProduct().
>> Can you log a JIRA case please?
>> 
>> Since you have access to DB2, could you also run Calcite’s whole test suite 
>> against DB2?
>> Let’s find out what other issues are there. Add a “DB2" entry to enum 
>> CalciteAssert.DatabaseInstance
>> and run with “-Dcalcite.test.db=DB2”.
>> 
>> Julian
>> 
>> 
>> 
>>> On May 11, 2016, at 1:50 PM, Magnus Pierre <[email protected]> wrote:
>>> 
>>> Hello Julian,
>>> 
>>> Not very convenient to not be part of the mailing list when asking 
>>> questions, good that
>> the calcite-dev archives list responses. :) 
>>> 
>>> But here is my claim:
>>> 
>>> The query:
>>> 
>>> SELECT *
>>> FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS"
>>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" ON 
>>> "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE"
>>> = "PRODUCT_SURVEY_TOPIC".”PRODUCT_TOPIC_CODE”
>>> 
>>> 
>>> Using squirrel returns: (as well as with Drill)
>>> Error: DB2 SQL Error: SQLCODE=-206, SQLSTATE=42703, 
>>> SQLERRMC=PRODUCT_SURVEY_RESULTS.PRODUCT_TOPIC_CODE,
>> DRIVER=4.13.127
>>> SQLState:  42703
>>> ErrorCode: -206
>>> Error: DB2 SQL Error: SQLCODE=-727, SQLSTATE=56098, 
>>> SQLERRMC=2;-206;42703;PRODUCT_SURVEY_RESULTS.PRODUCT_TOPIC_CODE,
>> DRIVER=4.13.127
>>> SQLState:  56098
>>> ErrorCode: -727
>>> 
>>> 
>>> 
>>> In Drill:
>>> 
>>> org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: 
>>> The JDBC storage
>> plugin failed while trying setup the SQL query. sql SELECT * FROM 
>> "GOSALESMR"."PRODUCT_SURVEY_RESULTS"
>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" ON 
>> "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE"
>> = "PRODUCT_SURVEY_TOPIC"."PRODUCT_TOPIC_CODE" plugin gs_db2 Fragment 0:0 
>> [Error Id: 9e5c6edc-8572-4de2-b9ad-dce8a74589b1
>> on mpie-mbp.lan:31010] (com.ibm.db2.jcc.am.SqlSyntaxErrorException) DB2 SQL 
>> Error: SQLCODE=-206,
>> SQLSTATE=42703, SQLERRMC=PRODUCT_SURVEY_RESULTS.PRODUCT_TOPIC_CODE, 
>> DRIVER=4.13.127 com.ibm.db2.jcc.am.id.a():677
>> com.ibm.db2.jcc.am.id.a():60 com.ibm.db2.jcc.am.id.a():127 
>> com.ibm.db2.jcc.am.no.c():2653
>> com.ibm.db2.jcc.am.no.d():2641 com.ibm.db2.jcc.am.no.a():2090 
>> com.ibm.db2.jcc.t4.cb.h():141
>> com.ibm.db2.jcc.t4.cb.b():41 com.ibm.db2.jcc.t4.q.a():32 
>> com.ibm.db2.jcc.t4.sb.i():135 com.ibm.db2.jcc.am.no.ib():2059
>> com.ibm.db2.jcc.am.no.a():3130 com.ibm.db2.jcc.am.no.a():688 
>> com.ibm.db2.jcc.am.no.executeQuery():672
>> org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 
>> org.apache.commons.dbcp.DelegatingStatement.executeQuery():208
>> org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup():177 
>> org.apache.drill.exec.physical.impl.ScanBatch.():108
>> org.apache.drill.exec.physical.impl.ScanBatch.():136 
>> org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():40
>> org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():33 
>> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():148
>> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171 
>> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():128
>> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171 
>> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():101
>> org.apache.drill.exec.physical.impl.ImplCreator.getExec():79 
>> org.apache.drill.exec.work.fragment.FragmentExecutor.run():231
>> org.apache.drill.common.SelfCleaningRunnable.run():38 
>> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
>> java.util.concurrent.ThreadPoolExecutor$Worker.run():615 
>> java.lang.Thread.run():745
>>> 
>>> 
>>> With aliases in Squirrel:
>>> 
>>> SELECT * FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS" AS 
>>> "PRODUCT_SURVEY_RESULTS" 
>>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" AS "PRODUCT_SURVEY_TOPIC" 
>>> ON "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" = 
>>> "PRODUCT_SURVEY_TOPIC".”PRODUCT_TOPIC_CODE"
>>> 
>>> Returns all data with no errors or warnings.
>>> 
>>> SELECT * FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS" AS PRODUCT_SURVEY_RESULTS
>>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" AS PRODUCT_SURVEY_TOPIC 
>>> ON "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE" = 
>>> "PRODUCT_SURVEY_TOPIC"."PRODUCT_TOPIC_CODE"
>>> 
>>> Returns all data with no errors or warnings.
>>> 
>>> I distinctly remember this being the same behavior for both Netezza and 
>>> Teradata for
>> instance. (even though it was a few years since I really did lots of SQL on 
>> these systems).
>> I can test Netezza but have no access to a TD box currently. 
>>> 
>>> Regards,
>>> Magnus
>>> 
>>> —————————————
>>> 
>>> This is a response to:
>>> —————————————
>>> 
>>> We originally generated an alias for every table in the FROM clause. But we 
>>> discovered
>> that
>>> the aliases were not necessary and made the query less easy to understand. 
>>> For example,
>> the
>>> query you mentioned earlier would become
>>> 
>>> SELECT *
>>> FROM "GOSALESMR".”PRODUCT_SURVEY_RESULTS” AS T0
>>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC” AS T1 ON 
>>> T0."PRODUCT_TOPIC_CODE" = T1.”PRODUCT_TOPIC_CODE"
>>> 
>>> So, we add aliases only if the table aliases that are automatically 
>>> generated by the
>> target
>>> system’s SQL validator are not unique.
>>> 
>>> This is the code that assigns aliases only if they are necessary:
>>> 
>>> public Result result(SqlNode node, Collection<Clause> clauses, RelNode rel) 
>>> {
>>> final String alias2 = SqlValidatorUtil.getAlias(node, -1);
>>> final String alias3 = alias2 != null ? alias2 : "t";
>>> final String alias4 =
>>>    SqlValidatorUtil.uniquify(
>>>        alias3, aliasSet, SqlValidatorUtil.EXPR_SUGGESTER);
>>> final String alias5 = alias2 == null || !alias2.equals(alias4) ? alias4
>>>    : null;
>>> return new Result(node, clauses, alias5,
>>>    Collections.singletonList(Pair.of(alias4, rel.getRowType())));
>>> }
>>> 
>>> You assert that 
>>> 
>>> SELECT *
>>> FROM "GOSALESMR"."PRODUCT_SURVEY_RESULTS"
>>> INNER JOIN "GOSALESMR"."PRODUCT_SURVEY_TOPIC" ON 
>>> "PRODUCT_SURVEY_RESULTS"."PRODUCT_TOPIC_CODE"
>>> = "PRODUCT_SURVEY_TOPIC".”PRODUCT_TOPIC_CODE”
>>> 
>>> is not correct, but I disagree. I think it is fine. My understanding is 
>>> that the table
>> "GOSALESMR".”PRODUCT_SURVEY_TOPIC”
>>> will implicitly have an alias “PRODUCT_SURVEY_TOPIC” on all SQL systems, 
>>> including
>> DB2.
>>> If my understanding is wrong, then we do indeed have a problem. Does that 
>>> query fail
>> to validate,
>>> or give wrong results?
>>> 
>>> Julian
>>> 
>>> 
>>> 
>> 
> 

Reply via email to