Re: Work around for using OR in Joins

2011-03-23 Thread MIS
Ning, thanks for the reply. Yes, you are right: using NOT and AND didn't work as expected. I'll try implementing a nested-loop map-side join. In the meantime, I moved the expression using OR out of the JOIN expression and into the filtering expression {in the WHERE clause
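For an inner join, evaluating the remaining equality condition in the join and applying the OR expression afterwards as a filter yields the same rows as evaluating everything in the join condition. The following is a minimal Python sketch of that idea, not Hive code; the table contents, column names (`k`, `status`, `flag`) and function names are hypothetical and are not taken from the query in this thread. The equivalence does not carry over to outer joins.

```python
from collections import defaultdict

def equi_join(left, right, left_key, right_key):
    """Inner join on a single equality key, using an in-memory hash index."""
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)
    return [(l, r) for l in left for r in index.get(l[left_key], [])]

def join_then_filter(left, right, left_key, right_key, where):
    """Join on equality first, then apply an arbitrary predicate (the WHERE part)."""
    return [(l, r) for (l, r) in equi_join(left, right, left_key, right_key)
            if where(l, r)]

# Hypothetical sample data.
table1 = [{"k": 1, "status": "new"}, {"k": 2, "status": "old"}, {"k": 3, "status": "old"}]
table2 = [{"k": 1, "flag": "n"}, {"k": 2, "flag": "y"},
          {"k": 3, "flag": "n"}, {"k": 4, "flag": "y"}]

# The OR expression lives in the filter, not in the join condition.
result = join_then_filter(
    table1, table2, "k", "k",
    where=lambda l, r: l["status"] == "new" or r["flag"] == "y",
)
matched_keys = [l["k"] for l, _ in result]
```

Here only the rows with `k` equal to 1 and 2 survive the filter: the first through `status`, the second through `flag`.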

Re: Work around for using OR in Joins

2011-03-23 Thread MIS
Here is my query: *select table1.column10, table1.column11, count(distinct table2.column3) as total from table1 JOIN table2 ON (table1.column1='value1' and to_date(table1.column2) = '2011-01-06' and to_date(table1.column2) '2011-01-07' and table2.column3!='' and table2.column3 is NOT NULL and

Work around for using OR in Joins

2011-03-22 Thread MIS
I want to use OR in the join expression, but it seems only AND is supported as of now. I have a workaround, though, which is to use De Morgan's law {C1 OR C2 = !(!C1 AND !C2)}, but it would be nice if somebody could point me to the location in the code base that would need modification to support the OR in the
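The identity quoted above can be checked exhaustively for plain two-valued booleans. This is a small Python sketch with made-up function names. It only confirms the logical equivalence and says nothing about whether Hive's join handling accepts or correctly evaluates a condition rewritten this way.

```python
from itertools import product

def or_direct(c1: bool, c2: bool) -> bool:
    """C1 OR C2, written directly."""
    return c1 or c2

def or_via_de_morgan(c1: bool, c2: bool) -> bool:
    """The same condition rewritten as !(!C1 AND !C2)."""
    return not ((not c1) and (not c2))

def de_morgan_holds() -> bool:
    """Compare both forms over all four combinations of truth values."""
    return all(
        or_direct(a, b) == or_via_de_morgan(a, b)
        for a, b in product([False, True], repeat=2)
    )
```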

Re: Work around for using OR in Joins

2011-03-22 Thread MIS
Found it at *org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.java*, line no. 1122. There is some concern mentioned that supporting OR would lead to data explosion. Is it discussed/documented in a little more detail somewhere? If so, some pointers towards the same will be helpful. Thanks, MIS.

Re: Work around for using OR in Joins

2011-03-22 Thread Ning Zhang
Joins with OR conditions are not currently supported by Hive. I think that even if you rewrite the condition to use only NOT and AND, the results may be wrong. It is quite hard to implement joins of any tables with OR conditions in a MapReduce framework. It is straightforward to implement it
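As an illustration of why an OR condition is easy to evaluate once both inputs can be compared pairwise, here is a minimal nested-loop join sketch in Python. It is not Hive code and not a MapReduce implementation; the row layouts and names are hypothetical, and it assumes both inputs fit in memory.

```python
def nested_loop_join(left, right, predicate):
    """Return every (l, r) pair for which predicate(l, r) is true.

    Every left row is compared with every right row, so the predicate
    may be any boolean expression, including one that uses OR.
    """
    return [(l, r) for l in left for r in right if predicate(l, r)]

# Hypothetical rows: left = (id, key1, key2), right = (key, value).
left_rows = [(1, "a", "x"), (2, "b", "a"), (3, "c", "c")]
right_rows = [("a", 10), ("c", 30)]

# Join condition with OR: match on either of the two left-side keys.
pairs = nested_loop_join(
    left_rows, right_rows,
    lambda l, r: l[1] == r[0] or l[2] == r[0],
)
joined_ids = [(l[0], r[0]) for l, r in pairs]
```

Row 3 satisfies both sides of the OR against the same right row, yet it appears only once in the output because the predicate is evaluated once per pair. The cost is one comparison for every pair of rows from the two inputs.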