[ https://issues.apache.org/jira/browse/ASTERIXDB-1413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15255489#comment-15255489 ]
Wenhai commented on ASTERIXDB-1413:
-----------------------------------
OK. Then once you handle that, please ping me to verify it. :)

> Massively parallel operators cause a "temporary file not found" error
> ----------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1413
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1413
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: Hyracks
>         Environment: Linux Ubuntu 12.04, 24 cores + 128 GB memory
>                      2 NCs x 12 partitions, 10 GB per NC
>            Reporter: Wenhai
>            Assignee: Jianfeng Jia
>
> When we use the fuzzyjoin patch in a three-way fuzzy join, the following error almost always arises; that is, it eventually shows up if we keep re-running the query.
> Patch link:
> {noformat}
> https://asterix-gerrit.ics.uci.edu/#/c/531/
> {noformat}
> Dataset:
> {noformat}
> http://yun.baidu.com/share/link?shareid=2678954841&uk=4030601168
> {noformat}
> Schema:
> {noformat}
> drop dataverse test if exists;
> create dataverse test;
> use dataverse test;
>
> create type PaperType as open {
>   tid: uuid,
>   title: string,
>   authors: string?,
>   year: int?,
>   conf: string?,
>   idx: string,
>   abstract: string?
> }
>
> create dataset ACM(PaperType) primary key tid autogenerated;
>
> use dataverse test;
> drop dataset ACM if exists;
> create dataset ACM(PaperType) primary key tid autogenerated;
> load dataset ACM
> using localfs
> (("path"="127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/acm_split.aa,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/acm_split.ab,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/acm_split.ac,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/acm_split.ad,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/acm_split.ae"),
>  ("format"="delimited-text"),("delimiter"="#"),("quote"="\u0000"));
>
> use dataverse test;
> create dataset OUTPUT(PaperType) primary key tid autogenerated;
> load dataset OUTPUT
> using localfs
> (("path"="127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.aa,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.ab,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.ac,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.ad,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.ae,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.af,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/outputacm_raw.ag"),
>  ("format"="delimited-text"),("delimiter"="#"),("quote"="\u0000"));
>
> use dataverse test;
> create dataset DBLP(PaperType) primary key tid autogenerated;
> load dataset DBLP
> using localfs
> (("path"="127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/dblp_split.aa,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/dblp_split.ab,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/dblp_split.ac,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/dblp_split.ad,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/dblp_split.ae,127.0.0.1:///home/hadoop/Downloads/doccorpus/reproduce/dblp_split.af"),
>  ("format"="delimited-text"),("delimiter"="#"),("quote"="\u0000"));
> {noformat}
> Query:
> {noformat}
> use dataverse test;
>
> set import-private-functions 'true';
> set simthreshold '.9f';
>
> let $s := sum(
>   for $t in dataset('ACM')
>   for $o in dataset('DBLP')
>   for $g in dataset('OUTPUT')
>   where word-tokens($o.authors) ~= word-tokens($t.authors)
>     and word-tokens($t.authors) ~= word-tokens($g.authors)
>   order by $o.id
>   return 1)
> return $s
> {noformat}
> Error message:
> {noformat}
> /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory) [FileNotFoundException]
> {noformat}
> Tracing information:
> {noformat}
> org.apache.hyracks.api.exceptions.HyracksException: Job failed on account of: org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at org.apache.hyracks.control.cc.job.JobRun.waitForCompletion(JobRun.java:211)
>     at org.apache.hyracks.control.cc.work.WaitForJobCompletionWork$1.run(WaitForJobCompletionWork.java:48)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at org.apache.hyracks.control.common.utils.ExceptionUtils.setNodeIds(ExceptionUtils.java:45)
>     at org.apache.hyracks.control.nc.Task.run(Task.java:319)
>     ... 3 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:218)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
>     at org.apache.hyracks.control.nc.Task.run(Task.java:263)
>     ... 3 more
> Caused by: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:212)
>     ... 5 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.close(IndexSearchOperatorNodePushable.java:230)
>     at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.close(EmptyTupleSourceRuntimeFactory.java:60)
>     at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.initialize(AlgebricksMetaOperatorDescriptor.java:116)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:205)
>     at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:202)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     ... 3 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at org.apache.hyracks.control.nc.io.IOManager.open(IOManager.java:81)
>     at org.apache.hyracks.dataflow.common.io.RunFileReader.open(RunFileReader.java:47)
>     at org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.applyInMemHashJoin(OptimizedHybridHashJoinOperatorDescriptor.java:670)
>     at org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.joinPartitionPair(OptimizedHybridHashJoinOperatorDescriptor.java:489)
>     at org.apache.hyracks.dataflow.std.join.OptimizedHybridHashJoinOperatorDescriptor$ProbeAndJoinActivityNode$1.close(OptimizedHybridHashJoinOperatorDescriptor.java:426)
>     at org.apache.hyracks.algebricks.runtime.operators.base.AbstractOneInputOneOutputOneFramePushRuntime.close(AbstractOneInputOneOutputOneFramePushRuntime.java:57)
>     at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$2.close(AlgebricksMetaOperatorDescriptor.java:153)
>     at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.close(IndexSearchOperatorNodePushable.java:227)
>     ... 9 more
> Caused by: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
>     at java.io.RandomAccessFile.open0(Native Method)
>     at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
>     at java.io.RandomAccessFile.<init>(RandomAccessFile.java:243)
>     at org.apache.hyracks.control.nc.io.FileHandle.open(FileHandle.java:70)
>     at org.apache.hyracks.control.nc.io.IOManager.open(IOManager.java:79)
>     ... 16 more
> Apr 24, 2016 12:01:39 PM org.apache.asterix.api.http.servlet.APIServlet doPost
> SEVERE: Job failed on account of: org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: org.apache.hyracks.api.exceptions.HyracksDataException: java.io.FileNotFoundException: /home/hadoop/asterixdb/hadoop/node1/io2/./RelS1188720785205710895.waf (No such file or directory)
> Apr 24, 2016 12:01:49 PM org.apache.hyracks.control.common.dataset.ResultStateSweeper sweep
> {noformat}
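Judging from the bottom frames of the trace (RunFileReader.open failing inside the probe phase of OptimizedHybridHashJoinOperatorDescriptor), the spilled run file appears to have been deleted before a consumer tried to reopen it. Below is a minimal, standalone Java sketch of that failure mode; the file name and structure are illustrative assumptions, not the actual Hyracks code.

{noformat}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Standalone illustration of the suspected failure mode: a spilled run file
// is deleted by one party while another still holds a reference to it and
// later tries to reopen it. Names are hypothetical; this is not Hyracks code.
public class RunFileRace {
    public static void main(String[] args) throws IOException {
        File runFile = File.createTempFile("RelS", ".waf");
        try (RandomAccessFile out = new RandomAccessFile(runFile, "rw")) {
            out.writeBytes("spilled partition data");
        }

        // Premature cleanup: the file is removed while a reader still
        // expects to reopen it during the probe/join phase.
        if (!runFile.delete()) {
            throw new IOException("could not delete " + runFile);
        }

        try (RandomAccessFile in = new RandomAccessFile(runFile, "r")) {
            System.out.println(in.readLine());
        } catch (IOException e) {
            // Prints the same "(No such file or directory)"
            // FileNotFoundException seen at the bottom of the trace above.
            System.err.println("reopen failed: " + e);
        }
    }
}
{noformat}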
> Plan:
> {noformat}
> distribute result [%0->$$22]
> -- DISTRIBUTE_RESULT |UNPARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |UNPARTITIONED|
> aggregate [$$22] <- [function-call: asterix:agg-sum, Args:[%0->$$278]]
> -- AGGREGATE |UNPARTITIONED|
> aggregate [$$278] <- [function-call: asterix:agg-local-sum, Args:[%0->$$20]]
> -- AGGREGATE |PARTITIONED|
> assign [$$20] <- [AInt64: {1}]
> -- ASSIGN |PARTITIONED|
> project ([])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$34(ASC) ] |PARTITIONED|
> order (ASC, %0->$$34)
> -- STABLE_SORT [$$34(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$34])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:and, Args:[function-call: algebricks:eq, Args:[%0->$$24, %0->$$200], function-call: algebricks:eq, Args:[%0->$$25, %0->$$201]])
> -- HYBRID_HASH_JOIN [$$24, $$25][$$200, $$201] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$34, $$24, $$25])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$24, %0->$$78])
> -- HYBRID_HASH_JOIN [$$24][$$78] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$24])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$24, $$0] <- test:ACM
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$78] |PARTITIONED|
> project ([$$34, $$25, $$78])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$25, %0->$$79])
> -- HYBRID_HASH_JOIN [$$25][$$79] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$34, $$25])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$34] <- [function-call: asterix:field-access-by-name, Args:[%0->$$1, AString: {id}]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$25, $$1] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$79] |PARTITIONED|
> group by ([$$78 := %0->$$231; $$79 := %0->$$233]) decor ([]) {
>     aggregate [] <- []
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$231, $$233] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$231, $$233] |PARTITIONED|
> project ([$$231, $$233])
> -- STREAM_PROJECT |PARTITIONED|
> select (function-call: algebricks:ge, Args:[function-call: asterix:similarity-jaccard-prefix, Args:[%0->$$56, %0->$$86, %0->$$67, %0->$$95, %0->$$65, AFloat: {0.9}], AFloat: {0.9}])
> -- STREAM_SELECT |PARTITIONED|
> project ([$$65, $$67, $$86, $$231, $$56, $$233, $$95])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$65, %0->$$76])
> -- HYBRID_HASH_JOIN [$$65][$$76] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$86, $$231, $$56, $$65])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$86, $$231, $$56, $$65] <- [%0->$$126, %0->$$242, %0->$$120, %0->$$124]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$124] |PARTITIONED|
> unnest $$124 <- function-call: asterix:subset-collection, Args:[%0->$$126, AInt32: {0}, function-call: asterix:prefix-len-jaccard, Args:[function-call: asterix:len, Args:[%0->$$126], AFloat: {0.9}]]
> -- UNNEST |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$242 := %0->$$117]) decor ([%0->$$120]) {
>     aggregate [$$126] <- [function-call: asterix:listify, Args:[%0->$$132]]
>     -- AGGREGATE |LOCAL|
>     order (ASC, %0->$$132)
>     -- IN_MEMORY_STABLE_SORT [$$132(ASC)] |LOCAL|
>     select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$241]])
>     -- STREAM_SELECT |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$117] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$241, $$132, $$117, $$120])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> left outer join (function-call: algebricks:eq, Args:[%0->$$133, %0->$$136])
> -- IN_MEMORY_HASH_JOIN [$$133][$$136] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$117, $$133, $$120])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$133 <- function-call: asterix:scan-collection, Args:[%0->$$291]
> -- UNNEST |PARTITIONED|
> assign [$$120] <- [function-call: asterix:len, Args:[%0->$$291]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$291, $$117])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$291] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$131, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$117, $$131] <- test:ACM
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$136, $$132, $$241])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$136, $$132, $$241] <- [%0->$$155, %0->$$151, %0->$$243]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- BROADCAST_EXCHANGE |PARTITIONED|
> assign [$$243] <- [TRUE]
> -- ASSIGN |PARTITIONED|
> running-aggregate [$$151] <- [function-call: asterix:tid, Args:[]]
> -- RUNNING_AGGREGATE |PARTITIONED|
> project ([$$155])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$254(ASC), $$155(ASC) ] |PARTITIONED|
> order (ASC, %0->$$254) (ASC, %0->$$155)
> -- STABLE_SORT [$$254(ASC), $$155(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$155 := %0->$$284]) decor ([]) {
>     aggregate [$$254] <- [function-call: asterix:sum-serial, Args:[%0->$$283]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$284] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$284] |PARTITIONED|
> group by ([$$284 := %0->$$157]) decor ([]) {
>     aggregate [$$283] <- [function-call: asterix:count-serial, Args:[AInt64: {1}]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$157] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$157])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$157 <- function-call: asterix:scan-collection, Args:[%0->$$158]
> -- UNNEST |PARTITIONED|
> project ([$$158])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$158] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$162, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> project ([$$162])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$160, $$162] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$95, $$233, $$67, $$76])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$95, $$233, $$67, $$76] <- [%0->$$145, %0->$$244, %0->$$122, %0->$$125]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$125] |PARTITIONED|
> unnest $$125 <- function-call: asterix:subset-collection, Args:[%0->$$145, AInt32: {0}, function-call: asterix:prefix-len-jaccard, Args:[function-call: asterix:len, Args:[%0->$$145], AFloat: {0.9}]]
> -- UNNEST |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$244 := %0->$$118]) decor ([%0->$$122]) {
>     aggregate [$$145] <- [function-call: asterix:listify, Args:[%0->$$151]]
>     -- AGGREGATE |LOCAL|
>     order (ASC, %0->$$151)
>     -- IN_MEMORY_STABLE_SORT [$$151(ASC)] |LOCAL|
>     select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$243]])
>     -- STREAM_SELECT |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$118] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$243, $$118, $$151, $$122])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> left outer join (function-call: algebricks:eq, Args:[%0->$$152, %0->$$155])
> -- IN_MEMORY_HASH_JOIN [$$152][$$155] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$118, $$152, $$122])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$152 <- function-call: asterix:scan-collection, Args:[%0->$$292]
> -- UNNEST |PARTITIONED|
> assign [$$122] <- [function-call: asterix:len, Args:[%0->$$292]]
> -- ASSIGN |PARTITIONED|
> project ([$$292, $$118])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$292] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$149, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$118, $$149] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- BROADCAST_EXCHANGE |PARTITIONED|
> assign [$$243] <- [TRUE]
> -- ASSIGN |PARTITIONED|
> running-aggregate [$$151] <- [function-call: asterix:tid, Args:[]]
> -- RUNNING_AGGREGATE |PARTITIONED|
> project ([$$155])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$254(ASC), $$155(ASC) ] |PARTITIONED|
> order (ASC, %0->$$254) (ASC, %0->$$155)
> -- STABLE_SORT [$$254(ASC), $$155(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$155 := %0->$$284]) decor ([]) {
>     aggregate [$$254] <- [function-call: asterix:sum-serial, Args:[%0->$$283]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$284] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$284] |PARTITIONED|
> group by ([$$284 := %0->$$157]) decor ([]) {
>     aggregate [$$283] <- [function-call: asterix:count-serial, Args:[AInt64: {1}]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$157] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$157])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$157 <- function-call: asterix:scan-collection, Args:[%0->$$158]
> -- UNNEST |PARTITIONED|
> project ([$$158])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$158] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$162, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> project ([$$162])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$160, $$162] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$200] |PARTITIONED|
> project ([$$200, $$201])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$26, %0->$$202])
> -- HYBRID_HASH_JOIN [$$26][$$202] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$26])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$26, $$2] <- test:DBLP
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$202] |PARTITIONED|
> group by ([$$200 := %0->$$236; $$201 := %0->$$238; $$202 := %0->$$240]) decor ([]) {
>     aggregate [] <- []
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$236, $$238, $$240] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$236, $$238, $$240] |PARTITIONED|
> project ([$$240, $$236, $$238])
> -- STREAM_PROJECT |PARTITIONED|
> select (function-call: algebricks:ge, Args:[function-call: asterix:similarity-jaccard-prefix, Args:[%0->$$178, %0->$$209, %0->$$189, %0->$$218, %0->$$187, AFloat: {0.9}], AFloat: {0.9}])
> -- STREAM_SELECT |PARTITIONED|
> project ([$$240, $$209, $$178, $$218, $$187, $$236, $$189, $$238])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$187, %0->$$198])
> -- HYBRID_HASH_JOIN [$$187][$$198] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$187] |PARTITIONED|
> unnest $$187 <- function-call: asterix:subset-collection, Args:[%0->$$209, AInt32: {0}, function-call: asterix:prefix-len-jaccard, Args:[function-call: asterix:len, Args:[%0->$$209], AFloat: {0.9}]]
> -- UNNEST |PARTITIONED|
> project ([$$209, $$178, $$236, $$238])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$235 := %0->$$112; $$236 := %0->$$105; $$237 := %0->$$106; $$238 := %0->$$111]) decor ([%0->$$178]) {
>     aggregate [$$209] <- [function-call: asterix:listify, Args:[%0->$$181]]
>     -- AGGREGATE |LOCAL|
>     select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$234]])
>     -- STREAM_SELECT |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$112, $$105, $$106, $$111] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> order (ASC, %0->$$112) (ASC, %0->$$105) (ASC, %0->$$106) (ASC, %0->$$111) (ASC, %0->$$181)
> -- STABLE_SORT [$$112(ASC), $$105(ASC), $$106(ASC), $$111(ASC), $$181(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$112, $$178, $$181, $$105, $$234, $$106, $$111])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> left outer join (function-call: algebricks:eq, Args:[%0->$$179, %0->$$184])
> -- IN_MEMORY_HASH_JOIN [$$179][$$184] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$112, $$178, $$179, $$105, $$106, $$111])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$179 <- function-call: asterix:scan-collection, Args:[%0->$$109]
> -- UNNEST |PARTITIONED|
> assign [$$178] <- [function-call: asterix:len, Args:[%0->$$109]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$105, %0->$$106])
> -- HYBRID_HASH_JOIN [$$105][$$106] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$109, $$105])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$109, $$105] <- [%0->$$291, %0->$$117]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$291, $$117])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$291] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$131, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$117, $$131] <- test:ACM
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$106] |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$111, %0->$$112])
> -- HYBRID_HASH_JOIN [$$111][$$112] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$111])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$111, $$115] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$112] |PARTITIONED|
> group by ([$$106 := %0->$$242; $$112 := %0->$$244]) decor ([]) {
>     aggregate [] <- []
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$242, $$244] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$242, $$244] |PARTITIONED|
> project ([$$242, $$244])
> -- STREAM_PROJECT |PARTITIONED|
> select (function-call: algebricks:ge, Args:[function-call: asterix:similarity-jaccard-prefix, Args:[%0->$$120, %0->$$126, %0->$$122, %0->$$145, %0->$$124, AFloat: {0.9}], AFloat: {0.9}])
> -- STREAM_SELECT |PARTITIONED|
> project ([$$145, $$242, $$244, $$120, $$122, $$124, $$126])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> join (function-call: algebricks:eq, Args:[%0->$$124, %0->$$125])
> -- HYBRID_HASH_JOIN [$$124][$$125] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$124] |PARTITIONED|
> unnest $$124 <- function-call: asterix:subset-collection, Args:[%0->$$126, AInt32: {0}, function-call: asterix:prefix-len-jaccard, Args:[function-call: asterix:len, Args:[%0->$$126], AFloat: {0.9}]]
> -- UNNEST |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$242 := %0->$$117]) decor ([%0->$$120]) {
>     aggregate [$$126] <- [function-call: asterix:listify, Args:[%0->$$132]]
>     -- AGGREGATE |LOCAL|
>     order (ASC, %0->$$132)
>     -- IN_MEMORY_STABLE_SORT [$$132(ASC)] |LOCAL|
>     select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$241]])
>     -- STREAM_SELECT |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$117] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$241, $$132, $$117, $$120])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> left outer join (function-call: algebricks:eq, Args:[%0->$$133, %0->$$136])
> -- IN_MEMORY_HASH_JOIN [$$133][$$136] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$117, $$133, $$120])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$133 <- function-call: asterix:scan-collection, Args:[%0->$$291]
> -- UNNEST |PARTITIONED|
> assign [$$120] <- [function-call: asterix:len, Args:[%0->$$291]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$291, $$117])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$291] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$131, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$117, $$131] <- test:ACM
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$136, $$132, $$241])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$136, $$132, $$241] <- [%0->$$155, %0->$$151, %0->$$243]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- BROADCAST_EXCHANGE |PARTITIONED|
> assign [$$243] <- [TRUE]
> -- ASSIGN |PARTITIONED|
> running-aggregate [$$151] <- [function-call: asterix:tid, Args:[]]
> -- RUNNING_AGGREGATE |PARTITIONED|
> project ([$$155])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$254(ASC), $$155(ASC) ] |PARTITIONED|
> order (ASC, %0->$$254) (ASC, %0->$$155)
> -- STABLE_SORT [$$254(ASC), $$155(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$155 := %0->$$284]) decor ([]) {
>     aggregate [$$254] <- [function-call: asterix:sum-serial, Args:[%0->$$283]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$284] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$284] |PARTITIONED|
> group by ([$$284 := %0->$$157]) decor ([]) {
>     aggregate [$$283] <- [function-call: asterix:count-serial, Args:[AInt64: {1}]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$157] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$157])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$157 <- function-call: asterix:scan-collection, Args:[%0->$$158]
> -- UNNEST |PARTITIONED|
> project ([$$158])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$158] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$162, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> project ([$$162])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$160, $$162] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$125] |PARTITIONED|
> unnest $$125 <- function-call: asterix:subset-collection, Args:[%0->$$145, AInt32: {0}, function-call: asterix:prefix-len-jaccard, Args:[function-call: asterix:len, Args:[%0->$$145], AFloat: {0.9}]]
> -- UNNEST |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$244 := %0->$$118]) decor ([%0->$$122]) {
>     aggregate [$$145] <- [function-call: asterix:listify, Args:[%0->$$151]]
>     -- AGGREGATE |LOCAL|
>     order (ASC, %0->$$151)
>     -- IN_MEMORY_STABLE_SORT [$$151(ASC)] |LOCAL|
>     select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$243]])
>     -- STREAM_SELECT |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$118] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$243, $$118, $$151, $$122])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> left outer join (function-call: algebricks:eq, Args:[%0->$$152, %0->$$155])
> -- IN_MEMORY_HASH_JOIN [$$152][$$155] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$118, $$152, $$122])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$152 <- function-call: asterix:scan-collection, Args:[%0->$$292]
> -- UNNEST |PARTITIONED|
> assign [$$122] <- [function-call: asterix:len, Args:[%0->$$292]]
> -- ASSIGN |PARTITIONED|
> project ([$$292, $$118])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$292] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$149, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$118, $$149] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- BROADCAST_EXCHANGE |PARTITIONED|
> assign [$$243] <- [TRUE]
> -- ASSIGN |PARTITIONED|
> running-aggregate [$$151] <- [function-call: asterix:tid, Args:[]]
> -- RUNNING_AGGREGATE |PARTITIONED|
> project ([$$155])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$254(ASC), $$155(ASC) ] |PARTITIONED|
> order (ASC, %0->$$254) (ASC, %0->$$155)
> -- STABLE_SORT [$$254(ASC), $$155(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$155 := %0->$$284]) decor ([]) {
>     aggregate [$$254] <- [function-call: asterix:sum-serial, Args:[%0->$$283]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$284] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$284] |PARTITIONED|
> group by ([$$284 := %0->$$157]) decor ([]) {
>     aggregate [$$283] <- [function-call: asterix:count-serial, Args:[AInt64: {1}]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$157] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$157])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$157 <- function-call: asterix:scan-collection, Args:[%0->$$158]
> -- UNNEST |PARTITIONED|
> project ([$$158])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$158] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$162, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> project ([$$162])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$160, $$162] <- test:OUTPUT
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- BROADCAST_EXCHANGE |PARTITIONED|
> assign [$$234] <- [TRUE]
> -- ASSIGN |PARTITIONED|
> running-aggregate [$$181] <- [function-call: asterix:tid, Args:[]]
> -- RUNNING_AGGREGATE |PARTITIONED|
> project ([$$184])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$255(ASC), $$184(ASC) ] |PARTITIONED|
> order (ASC, %0->$$255) (ASC, %0->$$184)
> -- STABLE_SORT [$$255(ASC), $$184(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$184 := %0->$$286]) decor ([]) {
>     aggregate [$$255] <- [function-call: asterix:sum-serial, Args:[%0->$$285]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$286] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$286] |PARTITIONED|
> group by ([$$286 := %0->$$183]) decor ([]) {
>     aggregate [$$285] <- [function-call: asterix:count-serial, Args:[AInt64: {1}]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$183] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$183])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$183 <- function-call: asterix:scan-collection, Args:[%0->$$169]
> -- UNNEST |PARTITIONED|
> project ([$$169])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$169] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$171, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> project ([$$171])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$172, $$171] <- test:DBLP
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$198] |PARTITIONED|
> unnest $$198 <- function-call: asterix:subset-collection, Args:[%0->$$218, AInt32: {0}, function-call: asterix:prefix-len-jaccard, Args:[function-call: asterix:len, Args:[%0->$$218], AFloat: {0.9}]]
> -- UNNEST |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$240 := %0->$$168]) decor ([%0->$$189]) {
>     aggregate [$$218] <- [function-call: asterix:listify, Args:[%0->$$192]]
>     -- AGGREGATE |LOCAL|
>     order (ASC, %0->$$192)
>     -- IN_MEMORY_STABLE_SORT [$$192(ASC)] |LOCAL|
>     select (function-call: algebricks:not, Args:[function-call: algebricks:is-null, Args:[%0->$$239]])
>     -- STREAM_SELECT |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- PRE_CLUSTERED_GROUP_BY[$$168] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$192, $$168, $$189, $$239])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> left outer join (function-call: algebricks:eq, Args:[%0->$$190, %0->$$195])
> -- IN_MEMORY_HASH_JOIN [$$190][$$195] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$168, $$189, $$190])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$190 <- function-call: asterix:scan-collection, Args:[%0->$$293]
> -- UNNEST |PARTITIONED|
> assign [$$189] <- [function-call: asterix:len, Args:[%0->$$293]]
> -- ASSIGN |PARTITIONED|
> project ([$$293, $$168])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$293] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$167, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$168, $$167] <- test:DBLP
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$195, $$192, $$239])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$195, $$192, $$239] <- [%0->$$184, %0->$$181, %0->$$234]
> -- ASSIGN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> replicate
> -- SPLIT |PARTITIONED|
> exchange
> -- BROADCAST_EXCHANGE |PARTITIONED|
> assign [$$234] <- [TRUE]
> -- ASSIGN |PARTITIONED|
> running-aggregate [$$181] <- [function-call: asterix:tid, Args:[]]
> -- RUNNING_AGGREGATE |PARTITIONED|
> project ([$$184])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- SORT_MERGE_EXCHANGE [$$255(ASC), $$184(ASC) ] |PARTITIONED|
> order (ASC, %0->$$255) (ASC, %0->$$184)
> -- STABLE_SORT [$$255(ASC), $$184(ASC)] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> group by ([$$184 := %0->$$286]) decor ([]) {
>     aggregate [$$255] <- [function-call: asterix:sum-serial, Args:[%0->$$285]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$286] |PARTITIONED|
> exchange
> -- HASH_PARTITION_EXCHANGE [$$286] |PARTITIONED|
> group by ([$$286 := %0->$$183]) decor ([]) {
>     aggregate [$$285] <- [function-call: asterix:count-serial, Args:[AInt64: {1}]]
>     -- AGGREGATE |LOCAL|
>     nested tuple source
>     -- NESTED_TUPLE_SOURCE |LOCAL|
> }
> -- EXTERNAL_GROUP_BY[$$183] |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> project ([$$183])
> -- STREAM_PROJECT |PARTITIONED|
> unnest $$183 <- function-call: asterix:scan-collection, Args:[%0->$$169]
> -- UNNEST |PARTITIONED|
> project ([$$169])
> -- STREAM_PROJECT |PARTITIONED|
> assign [$$169] <- [function-call: asterix:word-tokens, Args:[function-call: asterix:field-access-by-index, Args:[%0->$$171, AInt32: {2}]]]
> -- ASSIGN |PARTITIONED|
> project ([$$171])
> -- STREAM_PROJECT |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> data-scan []<-[$$172, $$171] <- test:DBLP
> -- DATASOURCE_SCAN |PARTITIONED|
> exchange
> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
> empty-tuple-source
> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
> {noformat}
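One more observation on the plan above: the tokenized inputs are materialized behind replicate (SPLIT) operators, so the same spilled data feeds several downstream joins. If run-file deletion were tied to the first consumer's close rather than the last, re-runs would fail nondeterministically with exactly this error. A hedged sketch of the kind of guard that would avoid it (a hypothetical helper, not an existing Hyracks API):

{noformat}
import java.io.File;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical guard: delete a run file shared by several consumers only
// after the last registered reader has closed. Illustration only.
public class SharedRunFile {
    private final File file;
    private final AtomicInteger readers;

    public SharedRunFile(File file, int expectedReaders) {
        this.file = file;
        this.readers = new AtomicInteger(expectedReaders);
    }

    public File open() {
        return file; // each consumer reads through its own stream
    }

    public void closeReader() throws IOException {
        // Only the final close may remove the backing file.
        if (readers.decrementAndGet() == 0 && !file.delete()) {
            throw new IOException("could not delete " + file);
        }
    }
}
{noformat}

Each SPLIT consumer would call closeReader() when it finishes; only the last call removes the .waf file, so no late reader can hit FileNotFoundException.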