By the way, v0.10.0 adds a new CROSS JOIN feature, when you really intended to do one ;)
SELECT a.x, b.y FROM tablea a CROSS JOIN tableb b; On Mon, Jan 28, 2013 at 10:58 AM, Edward Capriolo <edlinuxg...@gmail.com>wrote: > Currently order by is very inefficient in hive. Hopefully you can use sort > by in most cases. > > It should be "visible" in the plan. The plan should be very different if > you are using the ON clause vs not. Can it be easily detected is another > question. > > > On Mon, Jan 28, 2013 at 11:45 AM, David Morel <dmore...@gmail.com> wrote: > >> On 28 Jan 2013, at 14:29, Edward Capriolo wrote: >> >> Iirc hive.mapred.mode strict should prevent this. If not we should add >>> it. >>> >> >> hi Edward, >> >> Yes, that's indeed what the book claims (quoting): >> >> hive> SELECT * FROM fracture_act JOIN fracture_ads >> > WHERE fracture_act.planner_id = fracture_ads.planner_id; >> FAILED: Error in semantic analysis: In strict mode, cartesian product >> is not allowed. If you really want to perform the operation, >> +set hive.mapred.mode=nonstrict+ >> >> I am about to re-enable this setting on my cluster (after fixing all the >> queries that it broke, especially all the ORDER BY ones :-) but I hoped >> it was visible right there in the query plan, or in some other way. If >> Hive can detect it, it should be visible somewhere, right? >> >> Thanks! >> >> david >> >> >>> On Monday, January 28, 2013, David Morel <dmore...@gmail.com> wrote: >>> >>>> Hi everyone, >>>> >>>> I had to kill some queries that were taking forever, and it turns out >>>> they were doing cartesian products (missing ON clause on a JOIN). >>>> >>>> I wonder how I could see that in the EXPLAIN output (which I still find >>>> a bit cryptic). Specifically, the stage that it was stuck in was this: >>>> >>>> Stage: Stage-7 >>>> Map Reduce >>>> Alias -> Map Operator Tree: >>>> $INTNAME >>>> Reduce Output Operator >>>> sort order: >>>> tag: 1 >>>> value expressions: >>>> expr: _col1 >>>> type: int >>>> $INTNAME1 >>>> Reduce Output Operator >>>> sort order: >>>> tag: 0 >>>> value expressions: >>>> expr: _col0 >>>> type: bigint >>>> expr: _col1 >>>> type: string >>>> Reduce Operator Tree: >>>> Join Operator >>>> condition map: >>>> Inner Join 0 to 1 >>>> condition expressions: >>>> 0 {VALUE._col0} {VALUE._col1} >>>> 1 {VALUE._col1} >>>> handleSkewJoin: false >>>> outputColumnNames: _col0, _col1, _col3 >>>> File Output Operator >>>> compressed: true >>>> GlobalTableId: 0 >>>> table: >>>> input format: >>>> >>> org.apache.hadoop.mapred.SequenceFileInputFormat >>> >>>> output format: >>>> >>> org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat >>> >>>> >>>> Is there anything in there that should have alerted me? >>>> >>>> I found out by looking at the query, but I wonder if the query plan (if >>>> I could read it) would have given me that information. >>>> >>>> Thanks a lot >>>> >>>> David Morel >>>> >>>> > -- *Dean Wampler, Ph.D.* thinkbiganalytics.com +1-312-339-1330