[ https://issues.apache.org/jira/browse/SPARK-13832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202264#comment-15202264 ]

Xin Wu commented on SPARK-13832:
--------------------------------

What I meant is that in Spark 2.0, grouping__id appears to be deprecated and grouping_id() is used instead, so I needed to change this to proceed. After the query is parsed, the AnalysisException you reported in this JIRA
{code}"org.apache.spark.sql.AnalysisException: expression 
'i_category'..."{code} is not reproducible. As for the later execution 
error, I am still validating whether it is related to the data or to a 
Spark SQL execution issue, but it is not a parser or analyzer error.

In 1.6, the AnalysisException is reproducible, so this is no longer an 
issue in 2.0.
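For reference, the substitution I made is sketched below: the Hive-style grouping__id virtual column in the query's SELECT list and window PARTITION BY is replaced with the grouping_id() function. This is only the changed fragment of query 36, not a verified full rewrite, and assumes Spark 2.0's SQL dialect:

{code:sql}
-- Spark 2.0: the Hive-style grouping__id virtual column is replaced
-- by the grouping_id() function (sketch of the changed expressions only).
select
    sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin
   ,i_category
   ,i_class
   ,grouping_id() as lochierarchy
   ,rank() over (
        partition by grouping_id(),
        case when grouping_id() = 0 then i_category end
        order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as rank_within_parent
 ...
{code}

The FROM, WHERE, GROUP BY ... WITH ROLLUP, and ORDER BY clauses are unchanged from the query quoted below.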


> TPC-DS Query 36 fails with Parser error
> ---------------------------------------
>
>                 Key: SPARK-13832
>                 URL: https://issues.apache.org/jira/browse/SPARK-13832
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.1
>         Environment: Red Hat Enterprise Linux Server release 7.1 (Maipo)
> Linux bigaperf116.svl.ibm.com 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 
> 18:37:38 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
>            Reporter: Roy Cecil
>
> TPC-DS query 36 fails with the following error
> Analyzer error: 16/02/28 21:22:51 INFO parse.ParseDriver: Parse Completed
> Exception in thread "main" org.apache.spark.sql.AnalysisException: expression 
> 'i_category' is neither present in the group by, nor is it an aggregate 
> function. Add to group by or wrap in first() (or first_value) if you don't 
> care which value you get.;
>         at 
> org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
>         at 
> org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
> Query Text pasted here for quick reference.
>   select
>     sum(ss_net_profit)/sum(ss_ext_sales_price) as gross_margin
>    ,i_category
>    ,i_class
>    ,grouping__id as lochierarchy
>    ,rank() over (
>         partition by grouping__id,
>         case when grouping__id = 0 then i_category end
>         order by sum(ss_net_profit)/sum(ss_ext_sales_price) asc) as 
> rank_within_parent
>  from
>     store_sales
>    ,date_dim       d1
>    ,item
>    ,store
>  where
>     d1.d_year = 2001
>  and d1.d_date_sk = ss_sold_date_sk
>  and i_item_sk  = ss_item_sk
>  and s_store_sk  = ss_store_sk
>  and s_state in ('TN','TN','TN','TN',
>                  'TN','TN','TN','TN')
>  group by i_category,i_class WITH ROLLUP
>  order by
>    lochierarchy desc
>   ,case when lochierarchy = 0 then i_category end
>   ,rank_within_parent
>     limit 100;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
