camelia_c created TAJO-175:
------------------------------

             Summary: MergeJoinExec incorrect processing
                 Key: TAJO-175
                 URL: https://issues.apache.org/jira/browse/TAJO-175
             Project: Tajo
          Issue Type: Bug
          Components: physical operator
         Environment: DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.4 LTS"

Hadoop 0.20.2-cdh3u3

            Reporter: camelia_c


For the query:
select dep1.dep_id, emp1.dep_id, emp1.salary from dep1 join emp1 on 
dep1.dep_id=emp1.dep_id;


And the data:

---------------dep1

10,Purchasing,1
20,Shipping,1
30,Manufacturing,3
40,QA,6
50,Accounting,


create external table dep1 (dep_id int, dep_name text, loc_id int) using csv 
with ('csvfile.delimiter'=',') location 'file:/home/camelia/testdata/DEP1';

----------------- emp1

1000,Tom,Smith,10,333,100
1001,Mary,Thompson,10,555,
1002,Aron,Weber,,777,100
1003,Susan,Carlson,,999,

create external table emp1 (emp_id int, first_name text, last_name text, dep_id 
int, salary float, job_id int) using csv with ('csvfile.delimiter'=',') 
location 'file:/home/camelia/testdata/EMP1';


-------------------------------------------------

With the original MergeJoinExec, instrumented with INFO log messages along the 
processing steps, the query returns no rows, and the operator reads wrong 
values for emp1.dep_id (12 instead of NULL for the row with salary 777):

13/09/09 20:46:01 INFO physical.MergeJoinExec: ********rightChild.next() 
=(0=>555.0, 1=>10)

13/09/09 20:46:01 INFO physical.MergeJoinExec: ********rightChild.next() 
=(0=>777.0, 1=>12)


The Tajo output is:

tajo> select dep1.dep_id, emp1.dep_id, emp1.salary from dep1 join emp1 on 
dep1.dep_id=emp1.dep_id;
2013-09-09 20:45:52,947 INFO  client.TajoClient 
(TajoClient.java:connectionToQueryMaster(190)) - Connected to Query Master 
(qid=q_1378748585102_0001, addr=127.0.1.1:8091)
Progress: 0%, response time: 1.036 sec
Progress: 0%, response time: 2.04 sec
Progress: 0%, response time: 3.042 sec
Progress: 0%, response time: 4.045 sec
Progress: 0%, response time: 5.047 sec
Progress: 0%, response time: 6.049 sec
Progress: 0%, response time: 7.05 sec
Progress: 0%, response time: 8.052 sec
Progress: 100%, response time: 8.32 sec
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/camelia/tajo_git/incubator-tajo/tajo-dist/target/tajo-0.2.0-SNAPSHOT/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/usr/local/hadoop-2.0.3-alpha/share/hadoop/common/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2013-09-09 20:46:02,513 WARN  util.NativeCodeLoader 
(NativeCodeLoader.java:<clinit>(62)) - Unable to load native-hadoop library for 
your platform... using builtin-java classes where applicable
2013-09-09 20:46:02,782 INFO  rpc.NettyClientBase 
(NettyClientBase.java:close(87)) - Proxy is disconnected from 127.0.1.1:8091
2013-09-09 20:46:02,784 INFO  client.TajoClient 
(TajoClient.java:closeQuery(113)) - Closed a QueryMaster connection 
(qid=q_1378748585102_0001, addr=mmm2/127.0.1.1:8091)
final state: QUERY_SUCCEEDED, init time: 1.61 sec, execution time: 0.0 sec, 
total response time: 8.32 sec
result: file:/home/camelia/tajo/q_1378748585102_0001

dep_id,  dep_id,  salary
-------------------------------
tajo> 
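
For comparison, under standard SQL inner-join semantics a NULL join key never 
matches, so only the two emp1 rows with dep_id 10 should be returned. The 
following is a minimal sketch of a sort-merge join over the sample data above. 
It is not Tajo's MergeJoinExec: the class MergeJoinSketch and its helpers are 
hypothetical names, and treating an empty CSV field as NULL is an assumption 
made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MergeJoinSketch {
    // Sample rows copied from the report.
    static final String[] DEP1 = {
        "10,Purchasing,1", "20,Shipping,1", "30,Manufacturing,3",
        "40,QA,6", "50,Accounting,"
    };
    static final String[] EMP1 = {
        "1000,Tom,Smith,10,333,100", "1001,Mary,Thompson,10,555,",
        "1002,Aron,Weber,,777,100", "1003,Susan,Carlson,,999,"
    };

    // Assumption: an empty CSV field stands for SQL NULL.
    static Integer parseIntOrNull(String field) {
        return field.isEmpty() ? null : Integer.valueOf(field);
    }

    // Split each line (limit -1 keeps trailing empty fields), drop rows whose
    // join key is NULL, and sort the rest by the join key.
    static List<String[]> sortedNonNull(String[] lines, int keyCol) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            String[] fields = line.split(",", -1);
            if (parseIntOrNull(fields[keyCol]) != null) {
                rows.add(fields);
            }
        }
        rows.sort(Comparator.comparingInt((String[] f) -> Integer.parseInt(f[keyCol])));
        return rows;
    }

    // Inner sort-merge join on dep1.dep_id = emp1.dep_id.
    // Each output row is "dep1.dep_id, emp1.dep_id, emp1.salary".
    static List<String> join(String[] depLines, String[] empLines) {
        List<String[]> left = sortedNonNull(depLines, 0);   // dep1.dep_id
        List<String[]> right = sortedNonNull(empLines, 3);  // emp1.dep_id
        List<String> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            int l = Integer.parseInt(left.get(i)[0]);
            int r = Integer.parseInt(right.get(j)[3]);
            if (l < r) {
                i++;
            } else if (l > r) {
                j++;
            } else {
                // Emit every right row sharing this key, then advance the left side.
                for (int k = j; k < right.size()
                        && Integer.parseInt(right.get(k)[3]) == l; k++) {
                    out.add(l + ", " + r + ", " + Float.parseFloat(right.get(k)[4]));
                }
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String row : join(DEP1, EMP1)) {
            System.out.println(row);
        }
    }
}
```

On the sample data this sketch yields the rows (10, 10, 333.0) and 
(10, 10, 555.0), whereas the output above contains no rows.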



I will attach an archive with the log data.
