[ 
https://issues.apache.org/jira/browse/HIVE-15160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15680437#comment-15680437
 ] 

Vineet Garg commented on HIVE-15160:
------------------------------------

TPC-DS query 35 has the same issue 
{noformat}
select   
  ca_state,
  cd_gender,
  cd_marital_status,
  count(*) cnt1,
  avg(cd_dep_count),
  max(cd_dep_count),
  sum(cd_dep_count),
  cd_dep_employed_count,
  count(*) cnt2,
  avg(cd_dep_employed_count),
  max(cd_dep_employed_count),
  sum(cd_dep_employed_count),
  cd_dep_college_count,
  count(*) cnt3,
  avg(cd_dep_college_count),
  max(cd_dep_college_count),
  sum(cd_dep_college_count)
 from
  customer c,customer_address ca,customer_demographics
 where
  c.c_current_addr_sk = ca.ca_address_sk and
  cd_demo_sk = c.c_current_cdemo_sk and 
  exists (select *
          from store_sales,date_dim
          where c.c_customer_sk = ss_customer_sk and
                ss_sold_date_sk = d_date_sk and
                d_year = 1999 and
                d_qoy < 4) and
   (exists (select *
            from web_sales,date_dim
            where c.c_customer_sk = ws_bill_customer_sk and
                  ws_sold_date_sk = d_date_sk and
                  d_year = 1999 and
                  d_qoy < 4) or 
    exists (select * 
            from catalog_sales,date_dim
            where c.c_customer_sk = cs_ship_customer_sk and
                  cs_sold_date_sk = d_date_sk and
                  d_year = 1999 and
                  d_qoy < 4))
 group by ca_state,
          cd_gender,
          cd_marital_status,
          cd_dep_count,
          cd_dep_employed_count,
          cd_dep_college_count
 order by ca_state,
          cd_gender,
          cd_marital_status,
          cd_dep_count,
          cd_dep_employed_count,
          cd_dep_college_count
 limit 100;
{noformat}

cd_dep_count, cd_dep_employed_count are not selected but are part of order by.

Not sure if this would a simple fix. [~pxiong] Have you looked into this ? Any 
idea what would it take to fix this? Quick look at error stack suggests that it 
is giving up during operator tree generation while trying to resolve column 
names (probably for oder by )

> Can't order by an unselected column
> -----------------------------------
>
>                 Key: HIVE-15160
>                 URL: https://issues.apache.org/jira/browse/HIVE-15160
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>
> If a grouping key hasn't been selected, Hive complains. For comparison, 
> Postgres does not.
> Example. Notice i_item_id is not selected:
> {code}
> select  i_item_desc
>        ,i_category
>        ,i_class
>        ,i_current_price
>        ,sum(cs_ext_sales_price) as itemrevenue
>        ,sum(cs_ext_sales_price)*100/sum(sum(cs_ext_sales_price)) over
>            (partition by i_class) as revenueratio
>  from catalog_sales
>      ,item
>      ,date_dim
>  where cs_item_sk = i_item_sk
>    and i_category in ('Jewelry', 'Sports', 'Books')
>    and cs_sold_date_sk = d_date_sk
>  and d_date between cast('2001-01-12' as date)
>                               and (cast('2001-01-12' as date) + 30 days)
>  group by i_item_id
>          ,i_item_desc
>          ,i_category
>          ,i_class
>          ,i_current_price
>  order by i_category
>          ,i_class
>          ,i_item_id
>          ,i_item_desc
>          ,revenueratio
> limit 100;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to