[ 
https://issues.apache.org/jira/browse/HIVE-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588116#comment-13588116
 ] 

Ashutosh Chauhan commented on HIVE-4080:
----------------------------------------

bq.  Support for this feature will probably be removed. Causes ambiguities when 
Query contains different partition clauses. 
Do you mean feature which this patch is introducing (ability to have lead 
function independent of UDAFs in select expr) will be removed? Consider 
following query:
{noformat}
 select p_mfgr, p_retailprice,
lead(p_retailprice,1) as l1 over (partition by p_mfgr order by p_name),
lead(p_retailprice,1, p_retailprice) as l2 over (partition by p_size order by 
p_name),
p_retailprice - lead(p_retailprice,1) 
from part;
{noformat}
My guess is ambiguity you are referring to is once we start supporting 
different partitioning in same query than last lead() in above query becomes 
ambiguous as to which partitioning function it is refering to. But my 
understanding is sql standard says that lead and lag function must always be 
associated with over clause. So, above query is illegal in standard sql. It 
must be written as:
{noformat}
 select p_mfgr, p_retailprice,
lead(p_retailprice,1) as l1 over (partition by p_mfgr order by p_name),
lead(p_retailprice,1, p_retailprice) as l2 over (partition by p_size order by 
p_name),
p_retailprice - lead(p_retailprice,1)as l3 over (partition by p_size order by 
p_name) 
from part;
{noformat} 
Now we have this concept of default partitioning which would have made first 
query legal if partitioning scheme was identical for l1 and l2. I think long 
term:
* We should keep functionality introduced in this patch to stay compliant.
* Associate default partitioning with windowing function only if there is no 
ambiguity (i.e., there is only one partitioning clause in query).
* Raise error if user doesn't specify partitioning and there are more than one 
partitioning scheme to choose from.
Same argument stands for when lead/lag functions are used as arguments with 
UDAFs. Make sense ?

Further, I think this concept of default partitioning is only extra convenience 
we are offering to hive users which is non-standard. If it turns out its 
burdensome to support this I am fine with removing it and always requiring user 
to specify over clause.
                
> Add Lead & Lag UDAFs
> --------------------
>
>                 Key: HIVE-4080
>                 URL: https://issues.apache.org/jira/browse/HIVE-4080
>             Project: Hive
>          Issue Type: Bug
>          Components: PTF-Windowing
>            Reporter: Harish Butani
>            Assignee: Harish Butani
>         Attachments: HIVE-4080.1.patch.txt, HIVE-4080.D8961.1.patch
>
>
> Currently we support Lead/Lag as navigation UDFs usable with Windowing.
> To be standard compliant we need to support Lead & Lag UDAFs.
> Will continue to support Lead/Lag UDFs as arguments to UDAFs when Windowing 
> is in play. 
> Currently allow Lead/Lag expressions to appear in SelectLists even when they 
> are not arguments to UDAFs. Support for this feature will probably be 
> removed. Causes ambiguities when Query contains different partition clauses. 
> Will provide more details with associated Jira to remove this feature.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to