The logical plan should show you where the cross join is needed. Here is
where it is logged:
https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamQueryPlanner.java#L150

(That log statement should probably be moved to DEBUG level.)

If I look at the original template, like
https://github.com/gregrahn/tpcds-kit/blob/master/query_templates/query9.tpl
I see conditions "[RC.1]". Are those templates expected to be filled with
references to the `reason` table, perhaps? How does that change things?

I still think it would be good to support CROSS JOIN if we can. The
problem, of course, is that the output can be huge, but when one side is
small it should simply work.
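
To illustrate the small-side case, here is a minimal sketch in plain Python.
It is not Beam code: `broadcast_cross_join` and the toy rows are hypothetical,
and the column names are only illustrative. The idea is to materialize the
small side once and pair it with every row of the large side as it streams
by, which is presumably what a side-input-based implementation would do.

```python
def broadcast_cross_join(large_rows, small_rows):
    """Yield the merge of every row in large_rows with every row in small_rows."""
    small = list(small_rows)      # small side is held in memory once
    for left in large_rows:       # large side is only iterated, never copied
        for right in small:
            yield {**left, **right}


# Toy stand-ins for the two tables in the query (hypothetical data).
store_sales = [{"ss_quantity": q} for q in (5, 12, 18, 40)]
reason = [{"r_reason_sk": 1}]

# An uncorrelated scalar subquery such as SELECT count(*) ... yields one row.
bucket1 = [{"bucket1": sum(1 for r in store_sales if 1 <= r["ss_quantity"] <= 20)}]

# Joining the outer table with that single row is the cheap kind of cross join.
rows = list(broadcast_cross_join(reason, bucket1))
print(rows)  # [{'r_reason_sk': 1, 'bucket1': 3}]
```

The output has len(large) * len(small) rows, so this only stays cheap when
the small side is tiny, e.g. the single row of an aggregate.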

Kenn

On Tue, May 15, 2018 at 7:41 AM Kai Jiang wrote:

> Hi everyone,
>
> As a proof of concept for my GSoC project, I have been running some simple
> TPC-DS queries against generated data on the direct runner. Query example:
> <https://gist.github.com/vectorijk/7c54f90aeebfd6fd9e9d2ee224bfed50>
>
> The example is executed with TPC-DS query 9
> <https://gist.github.com/vectorijk/7c54f90aeebfd6fd9e9d2ee224bfed50#file-tpcdssql-java-L176-L222>.
> Briefly, Query 9 uses CASE WHEN clauses to select five counts from
> store_sales (table 1). To display those results, all of the CASE WHEN
> clauses sit inside a single SELECT clause. In short, it looks like:
> SELECT
>
> CASE WHEN ( SELECT count(*)  FROM  table 1 WHERE..... )
> THEN condition 1
> ELSE condition 2,
> .....
> CASE WHEN .....
>
> FROM table 2
>
> IIUC, this query doesn't need a join between table 1 and table 2, since
> the outer SELECT clause doesn't reference table 1.
> But the program plans one anyway and throws an error saying
> "java.lang.UnsupportedOperationException: CROSS JOIN is not supported" (error
> message detail
> <https://gist.github.com/vectorijk/5619a20485edc01113a56e348f87b0c3>).
>
> To make the query work, I am wondering where I should start:
> 1. Look at the logical plan?
> Will the logical plan explain why the query needs a CROSS JOIN?
>
> 2. Cross join support?
> I checked all the queries in the TPC-DS benchmark. Almost every query uses
> a cross join, so it is an important feature to implement. Unlike other
> joins, it consumes a lot of computing resources, but I think we will need
> cross join in the future. Should the join-library support it as well? I
> noticed James has opened
> BEAM-2194 <https://issues.apache.org/jira/browse/BEAM-2194> for
> supporting cross join.
>
> Looking forward to comments!
>
> Best,
> Kai
