[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-11 Thread Phabricator (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599707#comment-13599707
 ] 

Phabricator commented on HIVE-4108:
---

ashutoshc has accepted the revision HIVE-4108 [jira] Allow over() clause to 
contain an order by with no partition by.

  +1

REVISION DETAIL
  https://reviews.facebook.net/D9309

BRANCH
  HIVE-4108

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, hbutani


 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Harish Butani
 Attachments: HIVE-4108.D9309.1.patch


 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-11 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599715#comment-13599715
 ] 

Ashutosh Chauhan commented on HIVE-4108:


Test {{windowing_windowspec.q}} failed.

 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Harish Butani
 Attachments: HIVE-4108.D9309.1.patch


 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-11 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13599718#comment-13599718
 ] 

Harish Butani commented on HIVE-4108:
-

this needs the 4142 patch. Sorry forgot to mention this.

 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland
Assignee: Harish Butani
 Attachments: HIVE-4108.D9309.1.patch


 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-07 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596119#comment-13596119
 ] 

Harish Butani commented on HIVE-4108:
-

A question I have is what should the behavior be when a Query level distribute 
and/or sort is specified:
So for this query
{noformat}
select sum(x) over()
from t1
distribute by x
sort by y
{noformat}

Should the partition for sum be the entire table or be based on x. 
Today we support this query:
{noformat}
select sum(x)
from t1
distribute by x
sort by y
{noformat}
we infer the partition for sum to be x.

We also support this
{noformat}
select sum(x) over(row unbounded preceding and current row)
from t1
distribute by x
sort by y
{noformat}
again we infer the partition for sum to be x.

So:
- either we remove the concept of inferring from the Query level distribute/sort
- or a missing partition in an Over clause should imply the entire table only 
when there is no Query level distribute/sort

Does this make sense?

 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland

 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596254#comment-13596254
 ] 

Ashutosh Chauhan commented on HIVE-4108:


Your table looks good. It will be good to add it in the comments in source code 
as well.

 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland

 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-07 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596280#comment-13596280
 ] 

Ashutosh Chauhan commented on HIVE-4108:


I think we should remove the concept of inference from query level 
distribute/sort.
For your first query I will read that as user first intends to do a partition 
on a constant using full table (which will be first MR job) and than wants 
second partitioning on x (2nd MR job) which as you pointed out is different 
than current behavior.
For your second query, my read will be same as previous which again deviates 
from implementation.
For third query, same ambiguity.

So, in all 3 cases current behavior is different than what I would have 
expected. Automatic inference is nasty. IMO we should drop it all together. 
Distribute/Sort if present in query shouldn't impact any over() clause 
specified in the query. Whenever they are present that will just imply user 
wants another MR job using that spec (which was the behavior in HIVE before 
this work).

 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland

 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-07 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13596337#comment-13596337
 ] 

Harish Butani commented on HIVE-4108:
-

Yes I have to agree. 
Seemed like a good idea, but it is not.
It gets more nebulous with multiple partitions  support (4041)
Will add a separate Jira to remove the current behavior first.

 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland

 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4108) Allow over() clause to contain an order by with no partition by

2013-03-05 Thread Harish Butani (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13594369#comment-13594369
 ] 

Harish Butani commented on HIVE-4108:
-

So this will be the behavior:
{noformat}
| has Part | has Order | has Window | note  |
| y| y | y  | everything specified  |
| y| y | n  | no window |
| y| n | y  | order = partition |
| y| n | n  | same as above |
| n| y | y  | partition on constant |
| n| y | n  | same as above |
| n| n | y  | same as above, o = p  |
| n| n | n  | the over() case   |
{noformat}


 Allow over() clause to contain an order by with no partition by
 ---

 Key: HIVE-4108
 URL: https://issues.apache.org/jira/browse/HIVE-4108
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Reporter: Brock Noland

 HIVE-4073 allows over() to be called with no partition by and no order by. We 
 should allow only an order by.
 From the review of HIVE-4073:
 Ashutosh
 {noformat}
 Can you also add following test. This should also work.
 select p_name, p_retailprice,
 avg(p_retailprice) over(order by p_name)
 from part
 partition by p_name;
 {noformat}
 Harish
 {noformat}
 This test will not work (:
 The grammar needs to be changed so:
 partitioningSpec
 @init { msgs.push(partitioningSpec clause); }
 @after { msgs.pop(); } 
 :
 partitionByClause orderByClause? - ^(TOK_PARTITIONINGSPEC partitionByClause 
 orderByClause?) |
 orderByClause - ^(TOK_PARTITIONINGSPEC orderByClause) |
 distributeByClause sortByClause? - ^(TOK_PARTITIONINGSPEC distributeByClause 
 sortByClause?) |
 sortByClause? - ^(TOK_PARTITIONINGSPEC sortByClause) |
 clusterByClause - ^(TOK_PARTITIONINGSPEC clusterByClause)
 ;
 And the SemanticAnalyzer::processPTFPartitionSpec has to handle this shape of 
 the AST Tree. The PTFTranslator also needs changes. Do this as another Jira
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira