Yeah, I was trying the same thing, though a little big ugly. My query needs to LJ/J with multiple tables. When there are 1 or 2 LJ/Js, rewriting works but when there are > 3 tables, the got the same exception triggered by the following bug.
https://issues.apache.org/jira/browse/HIVE-5891 Chen On Wed, Jul 30, 2014 at 10:07 PM, Eugene Koifman <ekoif...@hortonworks.com> wrote: > would manually rewriting the query from (T1 union all T2) LOJ S to > equivalent (T1 LOJ S) union all (T2 LOJ S) help work around this issue? > > > On Wed, Jul 30, 2014 at 6:19 PM, Chen Song <chen.song...@gmail.com> wrote: > >> I tried that and I got the following error. >> >> FAILED: SemanticException [Error 10227]: Not all clauses are supported >> with mapjoin hint. Please remove mapjoin hint. >> >> I then tried turning off auto join conversion. >> >> set hive.auto.convert.join=false >> >> But no luck, same error. >> >> Looks like it is a known issue, >> >> >> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.0.2/bk_releasenotes_hdp_2.0/content/ch_relnotes-hdp2.0.0.2-5-2.html >> >> Chen >> >> >> >> >> On Wed, Jul 30, 2014 at 9:10 PM, Navis류승우 <navis....@nexr.com> wrote: >> >>> Could you do it with hive.ignore.mapjoin.hint=false? Mapjoin hint is >>> ignored from hive-0.11.0 by default (see >>> https://issues.apache.org/jira/browse/HIVE-4042) >>> >>> Thanks, >>> Navis >>> >>> >>> 2014-07-31 10:04 GMT+09:00 Chen Song <chen.song...@gmail.com>: >>> >>> I am using cdh5 with hive 0.12. We have some hive jobs migrated from >>>> hive 0.10 and they are written like below: >>>> >>>> select /*+ MAPJOIN(sup) */ c1, c2, sup.c >>>> from >>>> ( >>>> select key, c1, c2 from table1 >>>> union all >>>> select key, c1, c2 from table2 >>>> ) table >>>> left outer join >>>> sup >>>> on (table.c1 = sup.key) >>>> distribute by c1 >>>> >>>> In Hive 0.10 (CDH4), Hive translates the left outer join into a map >>>> join (map only job), followed by a regular MR job for distribute by. >>>> >>>> In Hive 0.12 (CDH5), Hive is not able to convert the join into a map >>>> join. Instead it launches a common map reduce for the join, followed by >>>> another mr for distribute by. However, when I take out the union all >>>> operator, Hive seems to be able to create a single MR job, with map join on >>>> map phase, and reduce for distribute by. >>>> >>>> I read a bit on >>>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins >>>> and found out that there are some restrictions on map side join >>>> starting Hive 0.11. The following are not supported. >>>> >>>> >>>> - Union Followed by a MapJoin >>>> - Lateral View Followed by a MapJoin >>>> - Reduce Sink (Group By/Join/Sort By/Cluster By/Distribute By) >>>> Followed by MapJoin >>>> - MapJoin Followed by Union >>>> - MapJoin Followed by Join >>>> - MapJoin Followed by MapJoin >>>> >>>> >>>> So if one side of the table (big side) is a union of some tables and >>>> the other side is a small table, Hive would not be able to do a map join at >>>> all? Is that correct? >>>> >>>> If correct, what should I do to make the job backward compatible? >>>> >>>> -- >>>> Chen Song >>>> >>>> >>> >> >> >> -- >> Chen Song >> >> > > > -- > > Thanks, > Eugene > > CONFIDENTIALITY NOTICE > NOTICE: This message is intended for the use of the individual or entity > to which it is addressed and may contain information that is confidential, > privileged and exempt from disclosure under applicable law. If the reader > of this message is not the intended recipient, you are hereby notified that > any printing, copying, dissemination, distribution, disclosure or > forwarding of this communication is strictly prohibited. If you have > received this communication in error, please contact the sender immediately > and delete it from your system. Thank You. -- Chen Song