[ 
https://issues.apache.org/jira/browse/DRILL-5804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5804:
------------------------------
    Attachment: drillbit.log

> External Sort times out, may be infinite loop
> ---------------------------------------------
>
>                 Key: DRILL-5804
>                 URL: https://issues.apache.org/jira/browse/DRILL-5804
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Robert Hou
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>         Attachments: drillbit.log
>
>
> Query is:
> {noformat}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> select count(*) from (
>   select * from (
>     select s1.type type, flatten(s1.rms.rptd) rptds, s1.rms, s1.uid 
>     from (
>       select d.type type, d.uid uid, flatten(d.map.rm) rms from 
> dfs.`/drill/testdata/resource-manager/nested_large` d order by d.uid
>     ) s1
>   ) s2
>   order by s2.rms.mapid, s2.rptds.a, s2.rptds.do_not_exist
> );
> {noformat}
> Plan is:
> {noformat}
> | 00-00    Screen
> 00-01      Project(EXPR$0=[$0])
> 00-02        StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
> 00-03          UnionExchange
> 01-01            StreamAgg(group=[{}], EXPR$0=[COUNT()])
> 01-02              Project($f0=[0])
> 01-03                SingleMergeExchange(sort0=[4 ASC], sort1=[5 ASC], 
> sort2=[6 ASC])
> 02-01                  SelectionVectorRemover
> 02-02                    Sort(sort0=[$4], sort1=[$5], sort2=[$6], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 02-03                      Project(type=[$0], rptds=[$1], rms=[$2], uid=[$3], 
> EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6])
> 02-04                        HashToRandomExchange(dist0=[[$4]], dist1=[[$5]], 
> dist2=[[$6]])
> 03-01                          UnorderedMuxExchange
> 04-01                            Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], 
> E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($6, hash32AsDouble($5, 
> hash32AsDouble($4, 1301011)))])
> 04-02                              Project(type=[$0], rptds=[$1], rms=[$2], 
> uid=[$3], EXPR$4=[ITEM($2, 'mapid')], EXPR$5=[ITEM($1, 'a')], 
> EXPR$6=[ITEM($1, 'do_not_exist')])
> 04-03                                Flatten(flattenField=[$1])
> 04-04                                  Project(type=[$0], rptds=[ITEM($2, 
> 'rptd')], rms=[$2], uid=[$1])
> 04-05                                    SingleMergeExchange(sort0=[1 ASC])
> 05-01                                      SelectionVectorRemover
> 05-02                                        Sort(sort0=[$1], dir0=[ASC])
> 05-03                                          Project(type=[$0], uid=[$1], 
> rms=[$2])
> 05-04                                            
> HashToRandomExchange(dist0=[[$1]])
> 06-01                                              UnorderedMuxExchange
> 07-01                                                Project(type=[$0], 
> uid=[$1], rms=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 1301011)])
> 07-02                                                  
> Flatten(flattenField=[$2])
> 07-03                                                    Project(type=[$0], 
> uid=[$1], rms=[ITEM($2, 'rm')])
> 07-04                                                      
> Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/resource-manager/nested_large]], 
> selectionRoot=maprfs:/drill/testdata/resource-manager/nested_large, 
> numFiles=1, usedMetadataFile=false, columns=[`type`, `uid`, `map`.`rm`]]])
> {noformat}
> Here is a segment of the drillbit.log, starting at line 55890:
> {noformat}
> 2017-09-19 04:22:56,258 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 142 us to sort 1023 records
> 2017-09-19 04:22:56,265 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 105 us to sort 1023 records
> 2017-09-19 04:22:56,268 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,275 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 145 us to sort 1023 records
> 2017-09-19 04:22:56,354 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:3:0] DEBUG 
> o.a.d.e.p.i.p.PartitionSenderRootExec - Partitioner.next(): got next record 
> batch with status OK
> 2017-09-19 04:22:56,357 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:2] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 143 us to sort 1023 records
> 2017-09-19 04:22:56,361 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.d.exec.compile.ClassTransformer - Compiled and merged 
> PriorityQueueCopierGen50: bytecode size = 11.0 KiB, time = 124 ms.
> 2017-09-19 04:22:56,365 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:4] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 108 us to sort 1023 records
> 2017-09-19 04:22:56,367 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.d.e.p.i.x.m.PriorityQueueCopierWrapper - Copier setup complete
> 2017-09-19 04:22:56,375 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:7] DEBUG 
> o.a.d.e.t.g.SingleBatchSorterGen44 - Took 144 us to sort 1023 records
> 2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.drill.exec.vector.BigIntVector - Reallocating vector 
> [$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
> 2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.drill.exec.vector.BigIntVector - Reallocating vector 
> [$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
> 2017-09-19 04:22:56,396 [263f0252-fc60-7f8d-a1b1-c075876d1bd2:frag:2:0] DEBUG 
> o.a.drill.exec.vector.BigIntVector - Reallocating vector 
> [$data$(BIGINT:REQUIRED)]. # of bytes: [0] -> [0]
> {noformat}
> From here on out, the drillbit.log file is filled with "Reallocating vector" 
> messages.  The log seems to be written as fast as Drill can write to disk.  
> The logs are filled within a minute.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to