Hi Hao, I have only one row in my sample data; please find below the queries which I used to reproduce it.
rsingh@Administrators-MacBook-Pro-4 ~/Downloads/apache-drill-1.0.0$ cat sample-data/master-data.csv
1,1,20.4.09,201,Orci Donec Nibh PC,1,[email protected],200,Dai Woodward,300,Hannah Sims,400,Abdul

CREATE VIEW dfs.tmp.`fact_a` AS
select distinct CAST(columns[0] as INTEGER) as b_id,
       CAST(columns[1] as INTEGER) as c_id,
       CAST(columns[2] as DATE) as a_date,
       CAST(columns[3] AS INTEGER) as a_value
from dfs.root.`/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv`
order by b_id, c_id, a_date;

CREATE VIEW dfs.tmp.`dim_c` AS
select CAST(columns[1] as INTEGER) as c_id,
       CAST(columns[12] as VARCHAR) as c_desc,
       CAST(columns[11] as INTEGER) as c4_id,
       CAST(columns[10] as VARCHAR) as c4_desc,
       CAST(columns[9] as INTEGER) as c3_id,
       CAST(columns[7] as INTEGER) as c2_id,
       CAST(columns[5] as INTEGER) as c1_id
from dfs.root.`/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv`
order by c_id, c4_id, c3_id, c2_id, c1_id;

select a11.c_id c_id,
       max(a12.c_desc) c_desc,
       max(a12.c4_desc) c4_desc,
       sum(a11.a_value) a_value
from dfs.tmp.`fact_a` a11
join dfs.tmp.`dim_c` a12 on (a11.c_id = a12.c_id)
group by a11.c_id;

Rajkumar Singh
MapR Technologies

> On Jun 21, 2015, at 1:34 AM, Hao Zhu <[email protected]> wrote:
>
> Nice. How many rows did you create?
> Just curious what the difference is from my reproduction.
>
> Here are mine:
>
> CREATE VIEW view_fact_account AS
> select cast(columns[0] as int) account_id,
>        cast(columns[1] as int) costcenter_id,
>        cast(columns[2] as date) account_date,
>        cast(columns[3] as int) account_value
> from dfs.drill.fact_account;
>
> CREATE VIEW view_dim_costcenter AS
> select cast(columns[0] as int) costcenter_id,
>        cast(columns[1] as varchar) costcenter_desc,
>        cast(columns[2] as int) costcenter_name_id,
>        cast(columns[3] as varchar) costcenter_name_desc,
>        cast(columns[4] as int) department_id,
>        cast(columns[5] as int) division_id,
>        cast(columns[6] as int) area_id
> from dfs.drill.dim_costcenter;
>
> Both tables only have 1 row.
>
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> select * from view_dim_costcenter;
> +----------------+------------------+---------------------+-----------------------+----------------+--------------+----------+
> | costcenter_id  | costcenter_desc  | costcenter_name_id  | costcenter_name_desc  | department_id  | division_id  | area_id  |
> +----------------+------------------+---------------------+-----------------------+----------------+--------------+----------+
> | 2              | a                | 2                   | a                     | 3              | 4            | 5        |
> +----------------+------------------+---------------------+-----------------------+----------------+--------------+----------+
> 1 row selected (0.255 seconds)
>
> 0: jdbc:drill:zk=h2.poc.com:5181,h3.poc.com:5> select * from view_fact_account;
> +-------------+----------------+---------------+----------------+
> | account_id  | costcenter_id  | account_date  | account_value  |
> +-------------+----------------+---------------+----------------+
> | 1           | 2              | 2015-01-01    | 3              |
> +-------------+----------------+---------------+----------------+
> 1 row selected (0.141 seconds)
>
> Thanks,
> Hao
>
> On Sat, Jun 20, 2015 at 9:31 AM, Rajkumar Singh <[email protected]> wrote:
>
>> Hi Hao,
>>
>> I tried to reproduce the issue and was able to repro it. I am running
>> drill-1.0.0 in embedded mode with a small data set on my mac machine.
>>
>> For the bad SQL I am getting the Calcite CannotPlanException (same as the
>> error stack); for the good SQL, find the explain plan below.
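>>
>> (Output like the plan below can be regenerated by prefixing the good join
>> query with EXPLAIN. The exact EXPLAIN variant used here is not recorded in
>> the thread; a form along these lines prints the cost attributes shown:
>>
>> EXPLAIN PLAN INCLUDING ALL ATTRIBUTES FOR <the good join query>;
>> )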
>>
>> ------------------------------------------------------------------------------------------------------------------------+-------+
>> | 00-00 Screen : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {15.1 rows, 129.1 cpu, 0.0 io, 0.0 network, 187.2 memory}, id = 1346
>> 00-01   Project(c_id=[$0], c_desc=[$1], c4_desc=[$2], a_value=[$3]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {15.0 rows, 129.0 cpu, 0.0 io, 0.0 network, 187.2 memory}, id = 1345
>> 00-02     HashAgg(group=[{0}], c_desc=[MAX($1)], c4_desc=[MAX($2)], a_value=[SUM($3)]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {15.0 rows, 129.0 cpu, 0.0 io, 0.0 network, 187.2 memory}, id = 1344
>> 00-03       Project(c_id=[$8], c_desc=[$1], c4_desc=[$3], a_value=[$10]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, VARCHAR(1) c4_desc, INTEGER a_value): rowcount = 1.0, cumulative cost = {14.0 rows, 85.0 cpu, 0.0 io, 0.0 network, 169.6 memory}, id = 1343
>> 00-04         HashJoin(condition=[=($8, $0)], joinType=[inner]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id, INTEGER b_id, INTEGER c_id0, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {14.0 rows, 85.0 cpu, 0.0 io, 0.0 network, 169.6 memory}, id = 1342
>> 00-05           Project(b_id=[$0], c_id0=[$1], a_date=[$2], a_value=[$3]) : rowType = RecordType(INTEGER b_id, INTEGER c_id0, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {8.0 rows, 35.0 cpu, 0.0 io, 0.0 network, 96.0 memory}, id = 1341
>> 00-07             SelectionVectorRemover : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {8.0 rows, 35.0 cpu, 0.0 io, 0.0 network, 96.0 memory}, id = 1340
>> 00-09               Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {7.0 rows, 34.0 cpu, 0.0 io, 0.0 network, 96.0 memory}, id = 1339
>> 00-11                 SelectionVectorRemover : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {6.0 rows, 34.0 cpu, 0.0 io, 0.0 network, 64.0 memory}, id = 1338
>> 00-13                   Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {5.0 rows, 33.0 cpu, 0.0 io, 0.0 network, 64.0 memory}, id = 1337
>> 00-14                     StreamAgg(group=[{0, 1, 2, 3}]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {4.0 rows, 33.0 cpu, 0.0 io, 0.0 network, 32.0 memory}, id = 1336
>> 00-15                       Sort(sort0=[$0], sort1=[$1], sort2=[$2], sort3=[$3], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {3.0 rows, 17.0 cpu, 0.0 io, 0.0 network, 32.0 memory}, id = 1335
>> 00-16                         Project(b_id=[CAST(ITEM($0, 0)):INTEGER], c_id=[CAST(ITEM($0, 1)):INTEGER], a_date=[CAST(ITEM($0, 2)):DATE], a_value=[CAST(ITEM($0, 3)):INTEGER]) : rowType = RecordType(INTEGER b_id, INTEGER c_id, DATE a_date, INTEGER a_value): rowcount = 1.0, cumulative cost = {2.0 rows, 17.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1334
>> 00-17                           Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv, numFiles=1, columns=[`columns`[0], `columns`[1], `columns`[2], `columns`[3]], files=[file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv]]]) : rowType = RecordType(ANY columns): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1333
>> 00-06           SelectionVectorRemover : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id): rowcount = 1.0, cumulative cost = {4.0 rows, 30.0 cpu, 0.0 io, 0.0 network, 56.0 memory}, id = 1332
>> 00-08             Sort(sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$5], sort4=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC], dir4=[ASC]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id): rowcount = 1.0, cumulative cost = {3.0 rows, 29.0 cpu, 0.0 io, 0.0 network, 56.0 memory}, id = 1331
>> 00-10               Project(c_id=[CAST(ITEM($0, 1)):INTEGER], c_desc=[CAST(ITEM($0, 12)):VARCHAR(1) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], c4_id=[CAST(ITEM($0, 11)):INTEGER], c4_desc=[CAST(ITEM($0, 10)):VARCHAR(1) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], c3_id=[CAST(ITEM($0, 9)):INTEGER], c2_id=[CAST(ITEM($0, 7)):INTEGER], c1_id=[CAST(ITEM($0, 5)):INTEGER]) : rowType = RecordType(INTEGER c_id, VARCHAR(1) c_desc, INTEGER c4_id, VARCHAR(1) c4_desc, INTEGER c3_id, INTEGER c2_id, INTEGER c1_id): rowcount = 1.0, cumulative cost = {2.0 rows, 29.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1330
>> 00-12                 Scan(groupscan=[EasyGroupScan [selectionRoot=/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv, numFiles=1, columns=[`columns`[1], `columns`[12], `columns`[11], `columns`[10], `columns`[9], `columns`[7], `columns`[5]], files=[file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv]]]) : rowType = RecordType(ANY columns): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 1329
>> | {
>>   "head" : {
>>     "version" : 1,
>>     "generator" : { "type" : "ExplainHandler", "info" : "" },
>>     "type" : "APACHE_DRILL_PHYSICAL",
>>     "options" : [ ],
>>     "queue" : 0,
>>     "resultMode" : "EXEC"
>>   },
>>   "graph" : [ {
>>     "pop" : "fs-scan",
>>     "@id" : 12,
>>     "userName" : "rsingh",
>>     "files" : [ "file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv" ],
>>     "storage" : {
>>       "type" : "file",
>>       "enabled" : true,
>>       "connection" : "file:///",
>>       "workspaces" : {
>>         "root" : { "location" : "/", "writable" : false, "defaultInputFormat" : null },
>>         "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null }
>>       },
>>       "formats" : {
>>         "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" },
>>         "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
>>         "tsv" : { "type" : "text", "extensions" : [ "tsv" ], "delimiter" : "\t" },
>>         "parquet" : { "type" : "parquet" },
>>         "json" : { "type" : "json" },
>>         "avro" : { "type" : "avro" }
>>       }
>>     },
>>     "format" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
>>     "columns" : [ "`columns`[1]", "`columns`[12]", "`columns`[11]", "`columns`[10]", "`columns`[9]", "`columns`[7]", "`columns`[5]" ],
>>     "selectionRoot" : "/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv",
>>     "cost" : 1.0
>>   }, {
>>     "pop" : "project",
>>     "@id" : 10,
>>     "exprs" : [
>>       { "ref" : "`c_id`", "expr" : "cast( (`columns`[1] ) as INT )" },
>>       { "ref" : "`c_desc`", "expr" : "cast( (`columns`[12] ) as VARCHAR(1) )" },
>>       { "ref" : "`c4_id`", "expr" : "cast( (`columns`[11] ) as INT )" },
>>       { "ref" : "`c4_desc`", "expr" : "cast( (`columns`[10] ) as VARCHAR(1) )" },
>>       { "ref" : "`c3_id`", "expr" : "cast( (`columns`[9] ) as INT )" },
>>       { "ref" : "`c2_id`", "expr" : "cast( (`columns`[7] ) as INT )" },
>>       { "ref" : "`c1_id`", "expr" : "cast( (`columns`[5] ) as INT )" }
>>     ],
>>     "child" : 12,
>>     "initialAllocation" : 1000000,
>>     "maxAllocation" : 10000000000,
>>     "cost" : 1.0
>>   }, {
>>     "pop" : "external-sort",
>>     "@id" : 8,
>>     "child" : 10,
>>     "orderings" : [
>>       { "expr" : "`c_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>       { "expr" : "`c4_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>       { "expr" : "`c3_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>       { "expr" : "`c2_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" },
>>       { "expr" : "`c1_id`", "order" : "ASC", "nullDirection" : "UNSPECIFIED" }
>>     ],
>>     "reverse" : false,
>>     "initialAllocation" : 20000000,
>>     "maxAllocation" : 10000000000,
>>     "cost" : 1.0
>>   }, {
>>     "pop" : "selection-vector-remover",
>>     "@id" : 6,
>>     "child" : 8,
>>     "initialAllocation" : 1000000,
>>     "maxAllocation" : 10000000000,
>>     "cost" : 1.0
>>   }, {
>>     "pop" : "fs-scan",
>>     "@id" : 17,
>>     "userName" : "rsingh",
>>     "files" : [ "file:/Users/rsingh/Downloads/apache-drill-1.0.0/sample-data/master-data.csv" ],
>>     "storage" : {
>>       "type" : "file",
>>       "enabled" : true,
>>       "connection" : "file:///",
>>       "workspaces" : {
>>         "root" : { "location" : "/", "writable" : false, "defaultInputFormat" : null },
>>         "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null }
>>       },
>>       "formats" : {
>>         "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" },
>>         "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
>>         "tsv" : {
>> |
>> +-------------
>>
>> Rajkumar Singh
>> MapR Technologies
>>
>>
>>> On Jun 20, 2015, at 9:07 PM, Hao Zhu <[email protected]> wrote:
>>>
>>> Hello,
>>>
>>> I tried to create the same data in my lab with Drill 1.0 on MapR 4.1;
>>> however, both SQLs work fine on my end:
>>>
>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as
>>> costcenter_desc, max(a12.costcenter_name_desc) as costcenter_name_desc,
>>> sum(a11.account_value) as sss from view_fact_account a11
>>> join view_dim_costcenter a12 on (a11.costcenter_id = a12.costcenter_id)
>>> group by a11.costcenter_id;
>>>
>>> +----------------+------------------+-----------------------+------+
>>> | costcenter_id  | costcenter_desc  | costcenter_name_desc  | sss  |
>>> +----------------+------------------+-----------------------+------+
>>> | 2              | a                | a                     | 3    |
>>> +----------------+------------------+-----------------------+------+
>>> 1 row selected (0.302 seconds)
>>>
>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as
>>> costcenter_desc, max(a12.costcenter_name_desc) as costcenter_name_desc,
>>> sum(a11.account_value) as sss from view_dim_costcenter a12
>>> join view_fact_account a11 on (a11.costcenter_id = a12.costcenter_id)
>>> group by a11.costcenter_id;
>>>
>>> +----------------+------------------+-----------------------+------+
>>> | costcenter_id  | costcenter_desc  | costcenter_name_desc  | sss  |
>>> +----------------+------------------+-----------------------+------+
>>> | 2              | a                | a                     | 3    |
>>> +----------------+------------------+-----------------------+------+
>>> 1 row selected (0.209 seconds)
>>>
>>> To narrow down the issue, could you test a few things?
>>>
>>> 1. Is this issue only happening with usr2? Do you have the same issue
>>> when using usr1 also? I just want to confirm whether this issue is
>>> related to impersonation or permissions.
>>>
>>> 2. Is this issue only happening with the 582-row tables? I mean, if the
>>> two tables have fewer rows, can the issue still be reproduced? In my
>>> test, I only created 1 row. I just want to know whether this issue is
>>> data driven or not.
>>>
>>> 3. Could you attach the good SQL and bad SQL profiles, so that the SQL
>>> plan is more readable?
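>>> (The profile for each query should be visible in the Drill Web UI under
>>> the Profiles tab — typically http://<drillbit-host>:8047/profiles, where
>>> the host name is just a placeholder. The JSON profile can be copied from
>>> there and attached.)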
>>>
>>> Thanks,
>>> Hao
>>>
>>> On Thu, Jun 18, 2015 at 6:46 AM, Mustafa Engin Sözer <
>>> [email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I've had an earlier topic regarding this issue, but no resolution came
>>>> out of it and it couldn't be reproduced. Let me re-describe the issue
>>>> and my cluster:
>>>>
>>>> Currently I have a 5-node MapR cluster on AWS, including Drill. On both
>>>> sides security is enabled, and on Drill impersonation is also enabled.
>>>> The only other configuration I changed in Drill was the new view
>>>> permissions, which I set to 750. I'm using maprfs, our MapR version is
>>>> 4.1.0, and the Drill version is 1.0.0.
>>>>
>>>> So the process goes like this:
>>>>
>>>> I have two users involved in this process, called usr1 and usr2. usr1 is
>>>> kind of an admin for the raw data, whereas usr2 is not allowed to access
>>>> the raw data.
>>>>
>>>> usr1 writes 3 csv files to the /raw/costcenter volume and creates a
>>>> relational model using Drill views. These views are written to
>>>> /views/costcenter, which usr2 has access to. So usr2 can query these
>>>> views without any issues.
>>>>
>>>> So here comes the problem. Along with several other tables, I have 2
>>>> views, namely fact_account and dim_costcenter (created out of the same
>>>> csv). Here are the table definitions:
>>>>
>>>> describe dfs.views_costcenter.fact_account;
>>>> +----------------+------------+--------------+
>>>> | COLUMN_NAME    | DATA_TYPE  | IS_NULLABLE  |
>>>> +----------------+------------+--------------+
>>>> | account_id     | INTEGER    | YES          |
>>>> | costcenter_id  | INTEGER    | YES          |
>>>> | account_date   | DATE       | YES          |
>>>> | account_value  | INTEGER    | YES          |
>>>> +----------------+------------+--------------+
>>>>
>>>> describe dfs.views_costcenter.dim_costcenter;
>>>> +-----------------------+------------+--------------+
>>>> | COLUMN_NAME           | DATA_TYPE  | IS_NULLABLE  |
>>>> +-----------------------+------------+--------------+
>>>> | costcenter_id         | INTEGER    | YES          |
>>>> | costcenter_desc       | VARCHAR    | YES          |
>>>> | costcenter_name_id    | INTEGER    | YES          |
>>>> | costcenter_name_desc  | VARCHAR    | YES          |
>>>> | department_id         | INTEGER    | YES          |
>>>> | division_id           | INTEGER    | YES          |
>>>> | area_id               | INTEGER    | YES          |
>>>> +-----------------------+------------+--------------+
>>>>
>>>> Both tables have 582 rows.
>>>>
>>>> I need to join these two tables and run some aggregations on them in
>>>> order to create a report. I have the following query:
>>>>
>>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as
>>>> costcenter_desc, max(a12.costcenter_name_desc) as costcenter_name_desc,
>>>> sum(a11.account_value) as sss from dfs.views_costcenter.fact_account a11
>>>> join dfs.views_costcenter.dim_costcenter a12 on (a11.costcenter_id =
>>>> a12.costcenter_id) group by a11.costcenter_id;
>>>>
>>>> When I run this query, the execution planner throws a huge exception, as
>>>> you can see below. However, I've found a strange workaround: if I swap
>>>> the order of the tables within the join, i.e.
>>>>
>>>> select a11.costcenter_id as costcenter_id, max(a12.costcenter_desc) as
>>>> costcenter_desc, max(a12.costcenter_name_desc) as costcenter_name_desc,
>>>> sum(a11.account_value) as sss from dfs.views_costcenter.dim_costcenter a12
>>>> join dfs.views_costcenter.fact_account a11 on (a11.costcenter_id =
>>>> a12.costcenter_id) group by a11.costcenter_id;
>>>>
>>>> it works perfectly. So in summary, if I write t2 join t1 instead of t1
>>>> join t2 and change nothing else, it works like a charm. As an inner join
>>>> is commutative and associative, this was completely unexpected for me.
>>>> Can someone confirm whether this is a bug? I didn't want to file a bug
>>>> in JIRA before asking you guys here first.
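>>>>
>>>> (If it helps to narrow this down, I can also re-run the failing order
>>>> with the join planning options toggled, to see whether only one physical
>>>> join type fails to plan. A rough sketch of what I would try, assuming
>>>> these session option names are correct for 1.0.0:
>>>>
>>>> ALTER SESSION SET `planner.enable_hashjoin` = false;
>>>> ALTER SESSION SET `planner.enable_mergejoin` = true;
>>>> -- then re-run the original fact_account-join-dim_costcenter query
>>>> )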
>>>>
>>>> Thanks in advance for your help.
>>>>
>>>> Below you can find the exception. (As the exception is huge, I've only
>>>> posted part of it here. Please let me know if you need the complete
>>>> exception.)
>>>>
>>>> Error: SYSTEM ERROR:
>>>> org.apache.calcite.plan.RelOptPlanner$CannotPlanException: Node
>>>> [rel#15146:Subset#27.PHYSICAL.SINGLETON([]).[]] could not be implemented;
>>>> planner state:
>>>>
>>>> Root: rel#15146:Subset#27.PHYSICAL.SINGLETON([]).[]
>>>> Original rel:
>>>> AbstractConverter(subset=[rel#15146:Subset#27.PHYSICAL.SINGLETON([]).[]], convention=[PHYSICAL], DrillDistributionTraitDef=[SINGLETON([])], sort=[[]]): rowcount = 101.8, cumulative cost = {inf}, id = 15148
>>>> DrillScreenRel(subset=[rel#15145:Subset#27.LOGICAL.ANY([]).[]]): rowcount = 101.8, cumulative cost = {10.18 rows, 10.18 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15144
>>>>   DrillAggregateRel(subset=[rel#15143:Subset#26.LOGICAL.ANY([]).[]], group=[{0}], costcenter_desc=[MAX($1)], costcenter_name_desc=[MAX($2)], sss=[SUM($3)]): rowcount = 101.8, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15142
>>>>     DrillProjectRel(subset=[rel#15141:Subset#25.LOGICAL.ANY([]).[]], costcenter_id=[$1], costcenter_desc=[$5], costcenter_name_desc=[$7], account_value=[$3]): rowcount = 1018.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15140
>>>>       DrillProjectRel(subset=[rel#15139:Subset#24.LOGICAL.ANY([]).[0, 2, 4, 5, 6]], account_id=[$7], costcenter_id=[$8], account_date=[$9], account_value=[$10], costcenter_id0=[$0], costcenter_desc=[$1], costcenter_name_id=[$2], costcenter_name_desc=[$3], department_id=[$4], division_id=[$5], area_id=[$6]): rowcount = 1018.0, cumulative cost = {0.0 rows, 0.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15138
>>>>         DrillJoinRel(subset=[rel#15137:Subset#23.LOGICAL.ANY([]).[0, 2, 4, 5, 6]], condition=[=($8, $0)], joinType=[inner]): rowcount = 1018.0, cumulative cost = {1119.8 rows, 13030.4 cpu, 0.0 io, 0.0 network, 1791.68 memory}, id = 15136
>>>>           DrillSortRel(subset=[rel#15128:Subset#18.LOGICAL.ANY([]).[0, 2, 4, 5, 6]], sort0=[$0], sort1=[$2], sort2=[$4], sort3=[$5], sort4=[$6], dir0=[ASC], dir1=[ASC], dir2=[ASC], dir3=[ASC], dir4=[ASC]): rowcount = 1018.0, cumulative cost = {197407.16549843678 rows, 1018.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15127
>>>>             DrillProjectRel(subset=[rel#15126:Subset#17.LOGICAL.ANY([]).[]], costcenter_id=[CAST(ITEM($0, 1)):INTEGER], costcenter_desc=[CAST(ITEM($0, 12)):VARCHAR(45) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], costcenter_name_id=[CAST(ITEM($0, 11)):INTEGER], costcenter_name_desc=[CAST(ITEM($0, 10)):VARCHAR(45) CHARACTER SET "ISO-8859-1" COLLATE "ISO-8859-1$en_US$primary"], department_id=[CAST(ITEM($0, 9)):INTEGER], division_id=[CAST(ITEM($0, 7)):INTEGER], area_id=[CAST(ITEM($0, 5)):INTEGER]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 28504.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15125
>>>>               DrillScanRel(subset=[rel#15124:Subset#16.LOGICAL.ANY([]).[]], table=[[dfs, raw_costcenter, master_datev*.csv]], groupscan=[EasyGroupScan [selectionRoot=/raw/costcenter/master_datev*.csv, numFiles=1, columns=[`columns`[1], `columns`[12], `columns`[11], `columns`[10], `columns`[9], `columns`[7], `columns`[5]], files=[maprfs:/raw/costcenter/master_datev_20150617.csv]]]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 1018.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15064
>>>>           DrillSortRel(subset=[rel#15135:Subset#22.LOGICAL.ANY([]).[0, 1, 2]], sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC]): rowcount = 101.8, cumulative cost = {7529.958857584828 rows, 101.8 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15134
>>>>             DrillAggregateRel(subset=[rel#15133:Subset#21.LOGICAL.ANY([]).[]], group=[{0, 1, 2, 3}]): rowcount = 101.8, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15132
>>>>               DrillProjectRel(subset=[rel#15131:Subset#20.LOGICAL.ANY([]).[]], account_id=[CAST(ITEM($0, 0)):INTEGER], costcenter_id=[CAST(ITEM($0, 1)):INTEGER], account_date=[CAST(ITEM($0, 2)):DATE], account_value=[CAST(ITEM($0, 3)):INTEGER]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 16288.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15130
>>>>                 DrillScanRel(subset=[rel#15129:Subset#19.LOGICAL.ANY([]).[]], table=[[dfs, raw_costcenter, master_datev*.csv]], groupscan=[EasyGroupScan [selectionRoot=/raw/costcenter/master_datev*.csv, numFiles=1, columns=[`columns`[0], `columns`[1], `columns`[2], `columns`[3]], files=[maprfs:/raw/costcenter/master_datev_20150617.csv]]]): rowcount = 1018.0, cumulative cost = {1018.0 rows, 1018.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 15074
>>>>
>>>> Sets:
>>>> Set#16, type: RecordType(ANY columns)
>>>> rel#15124:Subset#16.LOGICAL.ANY([]).[], best=rel#15064, importance=0.4304672100000001
>>>>
>>>>
>>>> --
>>>>
>>>> *M. Engin Sözer*
>>>> Junior Datawarehouse Manager
>>>> [email protected]
>>>>
>>>> Goodgame Studios
>>>> Theodorstr. 42-90, House 9
>>>> 22761 Hamburg, Germany
>>>> Phone: +49 (0)40 219 880 -0
>>>> *www.goodgamestudios.com <http://www.goodgamestudios.com>*
>>>>
>>>> Goodgame Studios is a branch of Altigi GmbH
>>>> Altigi GmbH, District court Hamburg, HRB 99869
>>>> Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian
>>>> Ritter
>>
>>
