Hi everyone,

Thanks for the help. I was unable to read through the mails over the weekend, so I have only just seen that the problem has been confirmed here. Do you need anything from me in order to pinpoint it?
Is a JIRA ticket already created, or should I create one myself? Thanks again.

On 22 June 2015 at 07:06, Rajkumar Singh <[email protected]> wrote:

> Hi Hao
>
> I have only one row in my sample data; find below the queries which I used to reproduce it.
>
> rsingh@Administrators-MacBook-Pro-4 ~/Downloads/apache-drill-1.0.0$ cat sample-data/master-data.csv
> 1,1,20.4.09,201,Orci Donec Nibh PC,1,[email protected],200,Dai Woodward,300,Hannah Sims,400,Abdul
>
> CREATE VIEW dfs.tmp.`fact_a` AS
> select distinct CAST(columns[0] as INTEGER) as b_id,
>        CAST(columns[1] as INTEGER) as c_id,
>        CAST(columns[2] as DATE) as a_date,
>        CAST(columns[3] AS INTEGER) as a_value
> from dfs.root.`/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv`
> order by b_id, c_id, a_date;
>
> CREATE VIEW dfs.tmp.`dim_c` AS
> select CAST(columns[1] as INTEGER) as c_id,
>        CAST(columns[12] as VARCHAR) as c_desc,
>        CAST(columns[11] as INTEGER) as c4_id,
>        CAST(columns[10] as VARCHAR) as c4_desc,
>        CAST(columns[9] as INTEGER) as c3_id,
>        CAST(columns[7] as INTEGER) as c2_id,
>        CAST(columns[5] as INTEGER) as c1_id
> from dfs.root.`/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv`
> order by c_id, c4_id, c3_id, c2_id, c1_id;
>
> select a11.c_id c_id,
>        max(a12.c_desc) c_desc,
>        max(a12.c4_desc) c4_desc,
>        sum(a11.a_value) a_value
> from dfs.tmp.`fact_a` a11
> join dfs.tmp.`dim_c` a12 on (a11.c_id = a12.c_id)
> group by a11.c_id;
>
> Rajkumar Singh
> MapR Technologies
>
>> On Jun 21, 2015, at 1:34 AM, Hao Zhu <[email protected]> wrote:
>>
>> Nice. How many rows did you create?
>> Just curious what the difference is with my reproduction.
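For readers following along without a Drill install, the shape of Rajkumar's repro can be sketched with Python's built-in sqlite3 as a stand-in engine. This only illustrates the expected result of the aggregation query, not Drill's planner (where the bug lives); the row values come from the single sample line above, and the date is kept as text since the sample's `20.4.09` format is not ISO:

```python
import csv
import io
import sqlite3

# The single sample row from master-data.csv in the repro above
raw = ("1,1,20.4.09,201,Orci Donec Nibh PC,1,[email protected],"
       "200,Dai Woodward,300,Hannah Sims,400,Abdul")
cols = next(csv.reader(io.StringIO(raw)))

con = sqlite3.connect(":memory:")
# fact_a mirrors columns[0..3]; dim_c mirrors the columns the query touches
con.execute("CREATE TABLE fact_a (b_id INT, c_id INT, a_date TEXT, a_value INT)")
con.execute("CREATE TABLE dim_c  (c_id INT, c_desc TEXT, c4_desc TEXT)")
con.execute("INSERT INTO fact_a VALUES (?,?,?,?)",
            (int(cols[0]), int(cols[1]), cols[2], int(cols[3])))
con.execute("INSERT INTO dim_c VALUES (?,?,?)",
            (int(cols[1]), cols[12], cols[10]))

# Same shape as the failing Drill query: fact join dim, grouped aggregation
row = con.execute("""
    SELECT a11.c_id, MAX(a12.c_desc), MAX(a12.c4_desc), SUM(a11.a_value)
    FROM fact_a a11 JOIN dim_c a12 ON a11.c_id = a12.c_id
    GROUP BY a11.c_id
""").fetchone()
print(row)  # (1, 'Abdul', 'Hannah Sims', 201)
```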
>> Here are mine:
>>
>> CREATE VIEW view_fact_account AS
>> select cast(columns[0] as int) account_id,
>>        cast(columns[1] as int) costcenter_id,
>>        cast(columns[2] as date) account_date,
>>        cast(columns[3] as int) account_value
>> from dfs.drill.fact_account;
>>
>> CREATE VIEW view_dim_costcenter AS
>> select cast(columns[0] as int) costcenter_id,
>>        cast(columns[1] as varchar) costcenter_desc,
>>        cast(columns[2] as int) costcenter_name_id,
>>        cast(columns[3] as varchar) costcenter_name_desc,
>>        cast(columns[4] as int) department_id,
>>        cast(columns[5] as int) division_id,
>>        cast(columns[6] as int) area_id
>> from dfs.drill.dim_costcenter;
>>
>> Both tables have only 1 row.
>>
>> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> select * from view_dim_costcenter;
>> +----------------+------------------+---------------------+-----------------------+----------------+--------------+----------+
>> | costcenter_id  | costcenter_desc  | costcenter_name_id  | costcenter_name_desc  | department_id  | division_id  | area_id  |
>> +----------------+------------------+---------------------+-----------------------+----------------+--------------+----------+
>> | 2              | a                | 2                   | a                     | 3              | 4            | 5        |
>> +----------------+------------------+---------------------+-----------------------+----------------+--------------+----------+
>> 1 row selected (0.255 seconds)
>>
>> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> select * from view_fact_account;
>> +-------------+----------------+---------------+----------------+
>> | account_id  | costcenter_id  | account_date  | account_value  |
>> +-------------+----------------+---------------+----------------+
>> | 1           | 2              | 2015-01-01    | 3              |
>> +-------------+----------------+---------------+----------------+
>> 1 row selected (0.141 seconds)
>>
>> Thanks,
>> Hao
>>
>> On Sat, Jun 20, 2015 at 9:31 AM, Rajkumar Singh <[email protected]> wrote:
>>
>>> Hi Hao
>>>
>>> I tried to reproduce the issue and was able to repro
>>> it. I am running Drill 1.0.0 in embedded mode with a small data set on my Mac.
>>>
>>> For the bad SQL I get the Calcite CannotPlanException (same as the error stack); for the good SQL, find the explain plan below.
>>>
>>> 00-00 Screen : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {15.1 rows, 129.1 cpu, 0.0 io, 0.0 network, 187.2 memory}, id = 1346
>>> 00-01   Project(c_id=[$0], c_desc=[$1], c4_desc=[$2], a_value=[$3]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {15.0 rows, 129.0 cpu, 0.0 io, 0.0 network, 187.2 memory}, id = 1345
>>> 00-02     HashAgg(group=[{0}], c_desc=[MAX($1)], c4_desc=[MAX($2)], a_value=[SUM($3)]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {15.0 rows, 129.0 cpu, 0.0 io, 0.0 network, 187.2 memory}, id = 1344
>>> 00-03       Project(c_id=[$8], c_desc=[$1], c4_desc=[$3], a_value=[$10]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {14.0 rows, 85.0 cpu, 0.0 io, 0.0 network, 169.6 memory}, id = 1343
>>> 00-04         HashJoin(condition=[=($8, $0)], joinType=[inner]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id, INTEGER b_id, INTEGER c_id0, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {14.0 rows, 85.0 cpu, 0.0 io, 0.0 network, 169.6 memory}, id = 1342
>>> 00-05           Project(b_id=[$0], c_id0=[$1], a_date=[$2], a_value=[$3]) : rowType = RecordType(INTEGER b_id, INTEGER c_id0, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {8.0 rows, 35.0 cpu, 0.0 io, 0.0 network, 96.0 memory}, id = 1341
>>> 00-07             SelectionVectorRemover : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {8.0 rows, 35.0 cpu, 0.0 io, 0.0 network, 96.0 memory}, id = 1340
>>> 00-09               Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {7.0 rows, 34.0 cpu, 0.0 io, 0.0 network, 96.0 memory}, id = 1339
>>> 00-11                 SelectionVectorRemover : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {6.0 rows, 34.0 cpu, 0.0 io, 0.0 network, 64.0 memory}, id = 1338
>>> 00-13                   Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {5.0 rows, 33.0 cpu, 0.0 io, 0.0 network, 64.0 memory}, id = 1337
>>> 00-14                     StreamAgg(group=[{0, 1, 2, 3}]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {4.0 rows, 33.0 cpu, 0.0 io, 0.0 network, 32.0 memory}, id = 1336
>>> 00-15                       Sort(sort0=[$0], sort1=[$1], sort2=[$2], sort3=[$3], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {3.0 rows, 17.0 cpu, 0.0 io, 0.0 network, 32.0 memory}, id = 1335
>>> 00-16                         Project(b_id=[CAST(ITEM($0, 0)):INTEGER], c_id=[CAST(ITEM($0, 1)):INTEGER], a_date=[CAST(ITEM($0, 2)):DATE], a_value=[CAST(ITEM($0, 3)):INTEGER]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {2.0 rows, 17.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1334
>>> 00-17                           Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv, numFiles=1, columns=[`columns`[0], `columns`[1], `columns`[2], `columns`[3]], files=[file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv]]]) : rowType = RecordType(ANY columns): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1333
>>> 00-06           SelectionVectorRemover : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id): rowcount = 1.0, cumulative cost = {4.0 rows, 30.0 cpu, 0.0 io, 0.0 network, 56.0 memory}, id = 1332
>>> 00-08             Sort(sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$5], sort4=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC], dir4=[ASC]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id): rowcount = 1.0, cumulative cost = {3.0 rows, 29.0 cpu, 0.0 io, 0.0 network, 56.0 memory}, id = 1331
>>> 00-10               Project(c_id=[CAST(ITEM($0, 1)):INTEGER], c_desc=[CAST(ITEM($0, 12)):VARCHAR(1) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], c4_id=[CAST(ITEM($0, 11)):INTEGER], c4_desc=[CAST(ITEM($0, 10)):VARCHAR(1) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], c3_id=[CAST(ITEM($0, 9)):INTEGER], c2_id=[CAST(ITEM($0, 7)):INTEGER], c1_id=[CAST(ITEM($0, 5)):INTEGER]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id): rowcount = 1.0, cumulative cost = {2.0 rows, 29.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1330
>>> 00-12                 Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv, numFiles=1, columns=[`columns`[1], `columns`[12], `columns`[11], `columns`[10], `columns`[9], `columns`[7], `columns`[5]], files=[file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv]]]) : rowType = RecordType(ANY columns): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1329
>>>
>>> {
>>>   "head" : {
>>>     "version" : 1,
>>>     "generator" : { "type" : "ExplainHandler", "info" : "" },
>>>     "type" : "APACHE_DRILL_PHYSICAL",
>>>     "options" : [ ],
>>>     "queue" : 0,
>>>     "resultMode" : "EXEC"
>>>   },
>>>   "graph" : [ {
>>>     "pop" : "fs-scan",
>>>     "@id" : 12,
>>>     "userName" : "rsingh",
>>>     "files" : [ "file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv" ],
>>>     "storage" : {
>>>       "type" : "file",
>>>       "enabled" : true,
>>>       "connection" : "file:///",
>>>       "workspaces" : {
>>>         "root" : { "location" : "/", "writable" : false, "defaultInputFormat" : null },
>>>         "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null }
>>>       },
>>>       "formats" : {
>>>         "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" },
>>>         "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
>>>         "tsv" : { "type" : "text", "extensions" : [ "tsv" ], "delimiter" : "\t" },
>>>         "parquet" : { "type" : "parquet" },
>>>         "json" : { "type" : "json" },
>>>         "avro" : { "type" : "avro" }
>>>       }
>>>     },
>>>     "format" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
>>>     "columns" : [ "`columns`[1]", "`columns`[12]", "`columns`[11]", "`columns`[10]", "`columns`[9]", "`columns`[7]", "`columns`[5]" ],
>>>     "selectionRoot" : "/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv",
>>>     "cost" : 1.0
>>>   }, {
>>>     "pop" : "project",
>>>     "@id" : 10,
>>>     "exprs" : [
>>>       { "ref" : "`c_id`", "expr" : "cast( (`columns`[1] ) as INT )" },
>>>       { "ref" : "`c_desc`", "expr" : "cast( (`columns`[12] ) as VARCHAR(1) )" },
>>>       { "ref" : "`c4_id`", "expr" : "cast( (`columns`[11] ) as INT )" },
>>>       { "ref" : "`c4_desc`", "expr" : "cast( (`columns`[10] ) as VARCHAR(1) )" },
>>>       { "ref" : "`c3_id`", "expr" : "cast( (`columns`[9] ) as INT )" },
>>>       { "ref" : "`c2_id`", "expr" : "cast( (`columns`[7] ) as INT )" },
>>>       { "ref" : "`c1_id`", "expr" : "cast( (`columns`[5] ) as INT )" }
>>>     ],
>>>     "child" : 12,
>>>     "initialAllocation" : 1000000,
>>>     "maxAllocation" : 10000000000,
>>>     "cost" : 1.0
>>>   }, {
>>>     "pop" : "external-sort",
>>>     "@id" : 8,
>>>     "child" : 10,
>>>     "orderings" : [
>>>       { "expr" : "`c_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>>       { "expr" : "`c4_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>>       { "expr" : "`c3_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>>       { "expr" : "`c2_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>>       { "expr" : "`c1_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" }
>>>     ],
>>>     "reverse" : false,
>>>     "initialAllocation" : 20000000,
>>>     "maxAllocation" : 10000000000,
>>>     "cost" : 1.0
>>>   }, {
>>>     "pop" : "selection-vector-remover",
>>>     "@id" : 6,
>>>     "child" : 8,
>>>     "initialAllocation" : 1000000,
>>>     "maxAllocation" : 10000000000,
>>>     "cost" : 1.0
>>>   }, {
>>>     "pop" : "fs-scan",
>>>     "@id" : 17,
>>>     "userName" : "rsingh",
>>>     "files" : [ "file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv" ],
>>>     "storage" : {
>>>       "type" : "file",
>>>       "enabled" : true,
>>>       "connection" : "file:///",
>>>       "workspaces" : {
>>>         "root" : { "location" : "/", "writable" : false, "defaultInputFormat" : null },
>>>         "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null }
>>>       },
>>>       "formats" : {
>>>         "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" },
>>>         "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
>>>         "tsv" : {
>>> +-------------
>>>
>>> Rajkumar Singh
>>> MapR Technologies
>>>
>>>> On Jun 20, 2015, at 9:07 PM, Hao Zhu <[email protected]> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I tried to create the same data in my lab with Drill 1.0 on MapR 4.1; however, both SQLs work fine on my end:
>>>>
>>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as costcenter_desc,
>>>>        max(a12.costcenter_name_desc) as costcenter_name_desc, sum(a11.account_value) as sss
>>>> from view_fact_account a11
>>>> join view_dim_costcenter a12 on (a11.costcenter_id = a12.costcenter_id)
>>>> group by a11.costcenter_id;
>>>> +----------------+------------------+-----------------------+------+
>>>> | costcenter_id  | costcenter_desc  | costcenter_name_desc  | sss  |
>>>> +----------------+------------------+-----------------------+------+
>>>> | 2              | a                | a                     | 3    |
>>>> +----------------+------------------+-----------------------+------+
>>>> 1 row selected (0.302 seconds)
>>>>
>>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as costcenter_desc,
>>>>        max(a12.costcenter_name_desc) as costcenter_name_desc, sum(a11.account_value) as sss
>>>> from view_dim_costcenter a12
>>>> join view_fact_account a11 on (a11.costcenter_id = a12.costcenter_id)
>>>> group by a11.costcenter_id;
>>>> +----------------+------------------+-----------------------+------+
>>>> | costcenter_id  | costcenter_desc  | costcenter_name_desc  | sss  |
>>>> +----------------+------------------+-----------------------+------+
>>>> | 2              | a                | a                     | 3    |
>>>> +----------------+------------------+-----------------------+------+
>>>> 1 row selected (0.209 seconds)
>>>>
>>>> To narrow down the issue, could you test a few things?
>>>>
>>>> 1. Is this issue only happening with usr2? Do you have the same issue using usr1?
>>>>    I just want to confirm whether the issue is related to impersonation or permissions.
>>>>
>>>> 2. Is this issue only happening with the 582-row tables? I mean, if the two tables
>>>>    have fewer rows, can the issue still be reproduced? In my test I created only 1 row.
>>>>    I just want to know whether the issue is data-driven or not.
>>>>
>>>> 3. Could you attach the good-SQL and bad-SQL profiles, so that the SQL plan is more readable?
>>>>
>>>> Thanks,
>>>> Hao
>>>>
>>>> On Thu, Jun 18, 2015 at 6:46 AM, Mustafa Engin Sözer <[email protected]> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I had an earlier thread about this issue, but no resolution came out of it and the problem couldn't be reproduced. Let me re-describe the issue and my cluster:
>>>>>
>>>>> I currently have a 5-node MapR cluster on AWS, including Drill. Security is enabled on both sides, and impersonation is also enabled in Drill. The only other Drill configuration I changed was the new-view permissions, which I set to 750. I'm using maprfs; our MapR version is 4.1.0 and our Drill version is 1.0.0.
>>>>>
>>>>> The process goes like this: there are two users involved, usr1 and usr2. usr1 is a kind of admin for the raw data, whereas usr2 is not allowed to access the raw data.
>>>>>
>>>>> usr1 writes 3 CSV files to the /raw/costcenter volume and creates a relational model using Drill views. These views are written to /views/costcenter, where usr2 has access. So usr2 can query these views without any issues.
>>>>>
>>>>> Now the problem. Along with several other tables, I have two views, fact_account and dim_costcenter (created from the same CSV). Here are the table definitions:
>>>>>
>>>>> describe dfs.views_costcenter.fact_account;
>>>>> +----------------+------------+--------------+
>>>>> | COLUMN_NAME    | DATA_TYPE  | IS_NULLABLE  |
>>>>> +----------------+------------+--------------+
>>>>> | account_id     | INTEGER    | YES          |
>>>>> | costcenter_id  | INTEGER    | YES          |
>>>>> | account_date   | DATE       | YES          |
>>>>> | account_value  | INTEGER    | YES          |
>>>>> +----------------+------------+--------------+
>>>>>
>>>>> describe dfs.views_costcenter.dim_costcenter;
>>>>> +-----------------------+------------+--------------+
>>>>> | COLUMN_NAME           | DATA_TYPE  | IS_NULLABLE  |
>>>>> +-----------------------+------------+--------------+
>>>>> | costcenter_id         | INTEGER    | YES          |
>>>>> | costcenter_desc       | VARCHAR    | YES          |
>>>>> | costcenter_name_id    | INTEGER    | YES          |
>>>>> | costcenter_name_desc  | VARCHAR    | YES          |
>>>>> | department_id         | INTEGER    | YES          |
>>>>> | division_id           | INTEGER    | YES          |
>>>>> | area_id               | INTEGER    | YES          |
>>>>> +-----------------------+------------+--------------+
>>>>>
>>>>> Both tables have 582 rows.
>>>>>
>>>>> I need to join these two tables and run some aggregations on them in order to create a report.
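The discussion below hinges on the algebraic claim that an inner join is commutative. That point (as opposed to Drill's planner behavior, which is the bug under discussion) can be sanity-checked with Python's built-in sqlite3 as a stand-in engine; the one-row tables and column subset here are illustrative stand-ins based on the describe output and sample values above:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# One-row stand-ins: only the columns the aggregation query touches
con.execute("CREATE TABLE fact_account "
            "(account_id INT, costcenter_id INT, account_date TEXT, account_value INT)")
con.execute("CREATE TABLE dim_costcenter "
            "(costcenter_id INT, costcenter_desc TEXT, costcenter_name_desc TEXT)")
con.execute("INSERT INTO fact_account VALUES (1, 2, '2015-01-01', 3)")
con.execute("INSERT INTO dim_costcenter VALUES (2, 'a', 'a')")

# Same aggregation, with only the join order swapped between the two runs
template = """
    SELECT a11.costcenter_id, MAX(a12.costcenter_desc),
           MAX(a12.costcenter_name_desc), SUM(a11.account_value) AS sss
    FROM {left} JOIN {right} ON a11.costcenter_id = a12.costcenter_id
    GROUP BY a11.costcenter_id
"""
t1_join_t2 = con.execute(template.format(left="fact_account a11",
                                         right="dim_costcenter a12")).fetchall()
t2_join_t1 = con.execute(template.format(left="dim_costcenter a12",
                                         right="fact_account a11")).fetchall()
print(t1_join_t2 == t2_join_t1)  # True: inner-join order does not change the result
```

Both orders return the same single group, which is why the order-dependent planner failure reported below reads as a bug rather than expected behavior.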
>>>>> I have the following query:
>>>>>
>>>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as costcenter_desc,
>>>>>        max(a12.costcenter_name_desc) as costcenter_name_desc, sum(a11.account_value) as sss
>>>>> from dfs.views_costcenter.fact_account a11
>>>>> join dfs.views_costcenter.dim_costcenter a12 on (a11.costcenter_id = a12.costcenter_id)
>>>>> group by a11.costcenter_id;
>>>>>
>>>>> When I run this query, the execution planner throws a huge exception, as you can see below. However, I've found a strange workaround. If I exchange the order of the tables within the join, i.e.
>>>>>
>>>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as costcenter_desc,
>>>>>        max(a12.costcenter_name_desc) as costcenter_name_desc, sum(a11.account_value) as sss
>>>>> from dfs.views_costcenter.dim_costcenter a12
>>>>> join dfs.views_costcenter.fact_account a11 on (a11.costcenter_id = a12.costcenter_id)
>>>>> group by a11.costcenter_id;
>>>>>
>>>>> it works perfectly. In summary: if I write t2 join t1 instead of t1 join t2 and change nothing else, it works like a charm. Since an inner join is commutative and associative, this was completely unexpected. Can someone confirm whether this is a bug? I didn't want to file it in JIRA before asking you here first.
>>>>>
>>>>> Thanks in advance for your help.
>>>>>
>>>>> Below you can find the exception (as it is huge, I've only posted part of it here; please let me know if you need the complete stack):
>>>>>
>>>>> Error: SYSTEM ERROR: org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node [rel#15146:Subset#27.PHYSICAL.SINGLETON([]).[]] could not be implemented; planner state:
>>>>>
>>>>> Root: rel#15146:Subset#27.PHYSICAL.SINGLETON([]).[]
>>>>> Original rel:
>>>>> AbstractConverter(subset=[rel#15146:Subset#27.PHYSICAL.SINGLETON([]).[]], convention=[PHYSICAL], DrillDistributionTraitDef=[SINGLETON([])], sort=[[]]): rowcount = 101.8, cumulative cost = {inf}, id = 15148
>>>>>   DrillScreenRel(subset=[rel#15145:Subset#27.LOGICAL.ANY([]).[]]): rowcount = 101.8, cumulative cost = {10.18 rows, 10.18 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15144
>>>>>     DrillAggregateRel(subset=[rel#15143:Subset#26.LOGICAL.ANY([]).[]], group=[{0}], costcenter_desc=[MAX($1)], costcenter_name_desc=[MAX($2)], sss=[SUM($3)]): rowcount = 101.8, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15142
>>>>>       DrillProjectRel(subset=[rel#15141:Subset#25.LOGICAL.ANY([]).[]], costcenter_id=[$1], costcenter_desc=[$5], costcenter_name_desc=[$7], account_value=[$3]): rowcount = 1018.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15140
>>>>>         DrillProjectRel(subset=[rel#15139:Subset#24.LOGICAL.ANY([]).[0, 2, 4, 5, 6]], account_id=[$7], costcenter_id=[$8], account_date=[$9], account_value=[$10], costcenter_id0=[$0], costcenter_desc=[$1], costcenter_name_id=[$2], costcenter_name_desc=[$3], department_id=[$4], division_id=[$5], area_id=[$6]): rowcount = 1018.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15138
>>>>>           DrillJoinRel(subset=[rel#15137:Subset#23.LOGICAL.ANY([]).[0, 2, 4, 5, 6]], condition=[=($8, $0)], joinType=[inner]): rowcount = 1018.0, cumulative cost = {1119.8 rows, 13030.4 cpu, 0.0 io, 0.0 network, 1791.68 memory}, id = 15136
>>>>>             DrillSortRel(subset=[rel#15128:Subset#18.LOGICAL.ANY([]).[0, 2, 4, 5, 6]], sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$5], sort4=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC], dir4=[ASC]): rowcount = 1018.0, cumulative cost = {197407.16549843678 rows, 1018.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15127
>>>>>               DrillProjectRel(subset=[rel#15126:Subset#17.LOGICAL.ANY([]).[]], costcenter_id=[CAST(ITEM($0, 1)):INTEGER], costcenter_desc=[CAST(ITEM($0, 12)):VARCHAR(45) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], costcenter_name_id=[CAST(ITEM($0, 11)):INTEGER], costcenter_name_desc=[CAST(ITEM($0, 10)):VARCHAR(45) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], department_id=[CAST(ITEM($0, 9)):INTEGER], division_id=[CAST(ITEM($0, 7)):INTEGER], area_id=[CAST(ITEM($0, 5)):INTEGER]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 28504.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15125
>>>>>                 DrillScanRel(subset=[rel#15124:Subset#16.LOGICAL.ANY([]).[]], table=[[dfs, raw_costcenter, master_datev*.csv]], groupscan=[EasyGroupScan [selectionRoot=/raw/costcenter/master_datev*.csv, numFiles=1, columns=[`columns`[1], `columns`[12], `columns`[11], `columns`[10], `columns`[9], `columns`[7], `columns`[5]], files=[maprfs:/raw/costcenter/master_datev_20150617.csv]]]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 1018.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15064
>>>>>             DrillSortRel(subset=[rel#15135:Subset#22.LOGICAL.ANY([]).[0, 1, 2]], sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC]): rowcount = 101.8, cumulative cost = {7529.958857584828 rows, 101.8 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15134
>>>>>               DrillAggregateRel(subset=[rel#15133:Subset#21.LOGICAL.ANY([]).[]], group=[{0, 1, 2, 3}]): rowcount = 101.8, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15132
>>>>>                 DrillProjectRel(subset=[rel#15131:Subset#20.LOGICAL.ANY([]).[]], account_id=[CAST(ITEM($0, 0)):INTEGER], costcenter_id=[CAST(ITEM($0, 1)):INTEGER], account_date=[CAST(ITEM($0, 2)):DATE], account_value=[CAST(ITEM($0, 3)):INTEGER]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 16288.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15130
>>>>>                   DrillScanRel(subset=[rel#15129:Subset#19.LOGICAL.ANY([]).[]], table=[[dfs, raw_costcenter, master_datev*.csv]], groupscan=[EasyGroupScan [selectionRoot=/raw/costcenter/master_datev*.csv, numFiles=1, columns=[`columns`[0], `columns`[1], `columns`[2], `columns`[3]], files=[maprfs:/raw/costcenter/master_datev_20150617.csv]]]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 1018.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15074
>>>>>
>>>>> Sets:
>>>>> Set#16, type: RecordType(ANY columns)
>>>>>   rel#15124:Subset#16.LOGICAL.ANY([]).[], best=rel#15064, importance=0.4304672100000001
>>>>>
>>>>> --
>>>>> M. Engin Sözer
>>>>> Junior Datawarehouse Manager
>>>>> [email protected]
>>>>>
>>>>> Goodgame Studios
>>>>> Theodorstr. 42-90, House 9
>>>>> 22761 Hamburg, Germany
>>>>> Phone: +49 (0)40 219 880 -0
>>>>> www.goodgamestudios.com
>>>>>
>>>>> Goodgame Studios is a branch of Altigi GmbH
>>>>> Altigi GmbH, District court Hamburg, HRB 99869
>>>>> Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian Ritter

--
M. Engin Sözer
Junior Datawarehouse Manager
[email protected]

Goodgame Studios
Theodorstr. 42-90, House 9
22761 Hamburg, Germany
Phone: +49 (0)40 219 880 -0
www.goodgamestudios.com

Goodgame Studios is a branch of Altigi GmbH
Altigi GmbH, District court Hamburg, HRB 99869
Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian Ritter
