date:20110831


 [ 
https://issues.apache.org/jira/browse/HIVE-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-2184:
-

   Resolution: Fixed
Fix Version/s: 0.9.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks Chinna!


 Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
 ---

 Key: HIVE-2184
 URL: https://issues.apache.org/jira/browse/HIVE-2184
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Fix For: 0.9.0

 Attachments: HIVE-2184.1.patch, HIVE-2184.1.patch, HIVE-2184.2.patch, 
 HIVE-2184.3.patch, HIVE-2184.patch


 1) Hive.close() calls HiveMetaStoreClient.close(), but in that method the 
 variable standAloneClient never becomes true, so client.shutdown() is never 
 called.
 2) After calling metaStoreClient.close(), Hive.close() needs to set 
 metaStoreClient = null.
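
 Point 2 can be illustrated with a minimal sketch. The classes below are 
 hypothetical stand-ins (SketchHive, SketchMetaStoreClient), not the real Hive 
 classes and not the committed patch. They only show why clearing the field 
 after close() matters: a later lookup then creates a fresh client instead of 
 handing back a closed one.

```java
// Hypothetical stand-ins for illustration only; not the real Hive classes.
class SketchMetaStoreClient {
    private boolean open = true;

    void close() {
        open = false;
    }

    boolean isOpen() {
        return open;
    }
}

class SketchHive {
    private SketchMetaStoreClient metaStoreClient;

    // Lazily creates the client, so a cleared field leads to a fresh instance.
    SketchMetaStoreClient getMSC() {
        if (metaStoreClient == null) {
            metaStoreClient = new SketchMetaStoreClient();
        }
        return metaStoreClient;
    }

    // Closes the client and drops the reference so it cannot be reused.
    void close() {
        if (metaStoreClient != null) {
            metaStoreClient.close();
            metaStoreClient = null;
        }
    }

    boolean hasClient() {
        return metaStoreClient != null;
    }
}
```

 Without the assignment to null, getMSC() in this sketch would keep returning 
 the already closed client.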

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-1451) Creating a table stores the full address of namenode in the metadata. This leads to problems when the namenode address changes.

2011-08-31 Thread MIS (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-1451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094706#comment-13094706
]

MIS commented on HIVE-1451:
---

+1 for the issue.
This is one of those features which many assume exists by default, but doesn't.

I too have run into this and resolved it by changing the DB_LOCATION_URI column
in the DBS table and the LOCATION column in the SDS table to point to the latest
namenode URI. (My metastore was on MySQL.)

Fixing this issue would save us from manually changing the namenode URI in the
DB should the address of the namenode change.

Creating a table stores the full address of namenode in the metadata. This
leads to problems when the namenode address changes.
---

Key: HIVE-1451
URL: https://issues.apache.org/jira/browse/HIVE-1451
Project: Hive
Issue Type: Bug
Components: Metastore, Query Processor
Affects Versions: 0.5.0
Environment: Any
Reporter: Arvind Prabhakar

Here is an excerpt from table metadata for an arbitrary table {{table1}}:
{noformat}
hive> describe extended table1;
OK
...
Detailed Table Information...
location:hdfs://localhost:9000/user/arvind/hive/warehouse/table1,
...
{noformat}
As can be seen, the full address of namenode is captured in the location
information for the table. This information is later used to run any queries
on the table - thus making it impossible to change the namenode location once
the table has been created. For example, for the above table, a query will
fail if the namenode is migrated from port 9000 to 8020:
{noformat}
hive> select * from table1;
OK
Failed with exception java.io.IOException:java.net.ConnectException: Call to
localhost/127.0.0.1:9000
failed on connection exception: java.net.ConnectException: Connection refused
Time taken: 10.78 seconds
hive>
{noformat}
It should be possible to change the namenode location regardless of when the
tables are created. Also, any query execution should work with the configured
namenode at that point in time rather than requiring the configuration to be
exactly the same at the time when the tables were created.
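
One way to picture the requested behaviour is to keep only the path of the
stored location and take the scheme and authority from whatever filesystem is
configured now. The sketch below uses java.net.URI for that. The class and
method names (LocationRebaseSketch, rebase) are made up for illustration, and
this is not how Hive resolves locations today; it is only an assumption about
what a fix could look like.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper for illustration; not part of Hive.
class LocationRebaseSketch {

    // Keeps the path of the stored location, but takes the scheme and
    // authority (host:port) from the currently configured default filesystem.
    static String rebase(String storedLocation, String defaultFs) {
        URI stored = URI.create(storedLocation);
        URI fs = URI.create(defaultFs);
        try {
            return new URI(fs.getScheme(), fs.getAuthority(),
                    stored.getPath(), null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For the table above, rebase("hdfs://localhost:9000/user/arvind/hive/warehouse/table1",
"hdfs://localhost:8020") would give the same path under port 8020.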


RE: Hive in EC2

2011-08-31 Thread Steven Wong

When you launch an EMR cluster (or job flow in EMR terminology), it launches 
new EC2 instances, optionally with an Elastic IP assigned to the cluster's 
master host. One does not install EMR on existing EC2 (non-EMR) instances.

-Original Message-
From: MIS [mailto:misapa...@gmail.com] 
Sent: Wednesday, August 31, 2011 10:38 AM
To: dev@hive.apache.org
Cc: u...@hive.apache.org
Subject: Re: Hive in EC2

But my concern is that I cannot run the Elastic Mapreduce on specific
instances which we already own and have elastic IPs. If it is possible to do
so, then using Hive EMR should be fine enough.

Thanks,
MIS

On Wed, Aug 31, 2011 at 12:21 AM, Aggarwal, Vaibhav vagg...@amazon.com wrote:

 You could also choose to look at Amazon ElasticMapReduce.
 It allows you to provision an EC2 cluster of your choice preinstalled with
 Hive and Hadoop.

 https://cwiki.apache.org/confluence/display/Hive/HiveAmazonElasticMapReduce

 Thanks
 Vaibhav

 -Original Message-
 From: MIS [mailto:misapa...@gmail.com]
 Sent: Monday, August 29, 2011 11:03 PM
 To: u...@hive.apache.org; hive
 Subject: Hive in EC2

 Hi,

 Can somebody point me to production level setup of Hive in EC2. The intent
 is to know the setup best practices being employed.

 Thanks.

[jira] [Commented] (HIVE-2413) BlockMergeTask ignores client-specified jars

2011-08-31 Thread He Yongqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094962#comment-13094962
 ] 

He Yongqiang commented on HIVE-2413:


[junit] java.lang.IllegalArgumentException: Can not create a Path from an empty string
[junit] at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
[junit] at org.apache.hadoop.fs.Path.<init>(Path.java:90)
[junit] at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:602)
[junit] at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)
[junit] at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
[junit] at org.apache.hadoop.hive.ql.io.rcfile.merge.BlockMergeTask.execute(BlockMergeTask.java:203)
[junit] at org.apache.hadoop.hive.ql.exec.DDLTask.mergeFiles(DDLTask.java:410)
[junit] at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:366)
[junit] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:132)
[junit] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
[junit] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1343)
[junit] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1134)
[junit] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:943)
[junit] at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:253)
[junit] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:210)
[junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:401)
[junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
[junit] at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:638)
[junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_concatenate_indexed_table(TestCliDriver.java:1190)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

I got this error with a bunch of test cases. Here are some of them: 
rcfile_merge3.q, load_fs.q, alter_merge.q, etc.

Can you take a look?
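
The exception message says an empty string reached a Path constructor while
the job's command-line options were being configured. One possible cause,
which is an assumption on my part and not confirmed anywhere in this thread,
is a comma-separated jar list that is empty or contains an empty entry. A
minimal sketch of building such a list defensively (JarListSketch and joinJars
are hypothetical names, not Hive or Hadoop APIs):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Hive or Hadoop API.
class JarListSketch {

    // Merges several comma-separated jar lists into one, dropping null
    // lists and blank entries so no empty path ends up in the result.
    static String joinJars(String... jarLists) {
        List<String> jars = new ArrayList<>();
        for (String list : jarLists) {
            if (list == null) {
                continue;
            }
            for (String jar : list.split(",")) {
                String trimmed = jar.trim();
                if (!trimmed.isEmpty()) {
                    jars.add(trimmed);
                }
            }
        }
        return String.join(",", jars);
    }
}
```

A caller would then set the job's jar property only when the returned string
is non-empty, rather than setting it to "".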


 BlockMergeTask ignores client-specified jars
 

 Key: HIVE-2413
 URL: https://issues.apache.org/jira/browse/HIVE-2413
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Krishna Kumar
Assignee: Krishna Kumar
Priority: Minor
 Attachments: HIVE-2413.v0.patch


 User-specified jars are not added to the hadoop tasks while executing a 
 BlockMergeTask resulting in a ClassNotFoundException.


[jira] [Commented] (HIVE-2417) Merging of compressed rcfiles fails to write the valuebuffer part correctly

2011-08-31 Thread He Yongqiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094963#comment-13094963
 ] 

He Yongqiang commented on HIVE-2417:


+1, will commit after tests pass

 Merging of compressed rcfiles fails to write the valuebuffer part correctly
 ---

 Key: HIVE-2417
 URL: https://issues.apache.org/jira/browse/HIVE-2417
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Krishna Kumar
Assignee: Krishna Kumar
 Attachments: HIVE-2417.v0.patch, HIVE-2417.v1.patch


 The blockmerge task does not create proper rc files when merging compressed 
 rc files as the valuebuffer writing is incorrect.


[jira] [Updated] (HIVE-2383) Incorrect alias filtering for predicate pushdown


 [ 
https://issues.apache.org/jira/browse/HIVE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-2383:
-

   Resolution: Fixed
Fix Version/s: (was: 0.8.0)
   0.9.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Passed tests and committed to trunk.  Thanks Charles!


 Incorrect alias filtering for predicate pushdown
 

 Key: HIVE-2383
 URL: https://issues.apache.org/jira/browse/HIVE-2383
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Charles Chen
Assignee: Charles Chen
Priority: Critical
 Fix For: 0.9.0

 Attachments: HIVE-2383v1.patch, HIVE-2383v2.patch, HIVE-2383v5.patch


 The predicate pushdown optimizer starts at the topmost operators and traverses 
 the operator tree, at each stage collecting predicates to be pushed down.  At 
 each operator, hive.ql.ppd.OpProcFactory.DefaultPPD.mergeWithChildrenPred is 
 called, which merges the predicates of the children nodes into the current 
 node.  The predicates are stored in hive.ql.ppd.ExprWalkerInfo.pushdownPreds 
 as a map from the alias a predicate refers to (a predicate may only refer to 
 one alias at a time, as only such predicates can be pushed) to a list of such 
 predicates.  Since the alias a predicate refers to may change at each stage 
 (subqueries may change aliases), this is updated for each operator 
 (hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds is called, which walks 
 the ExprNodeDesc for each predicate).  When a JoinOperator is encountered, 
 mergeWithChildrenPred is passed an optional parameter "aliases", which 
 contains the set of aliases that can be pushed per ANSI semantics (see 
 hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases).  The part that is 
 incorrect is that aliases are filtered in mergeWithChildrenPred before 
 extractPushdownPreds is called.  Since extractPushdownPreds is what associates 
 the predicates with the correct alias in the current operator's context, the 
 filtering should happen after it.
 In test case Q2 below, when the predicate a.bar=3 comes into the 
 JoinOperator, the incoming alias is "a", so it is accepted for pushdown.  
 When brought into the JoinOperator's context, however, the predicate 
 refers to b.bar in the inner scope, so we should not actually accept it for 
 pushdown.
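
 The ordering problem can be shown with a small sketch. It is a deliberately 
 simplified model with hypothetical names (PpdOrderSketch, remap, filter): 
 predicates are plain strings, and a lookup map stands in for the work that 
 extractPushdownPreds does when it re-associates a predicate with its alias in 
 the current operator's context. It is not the Hive code or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Simplified model for illustration; names and types are hypothetical.
class PpdOrderSketch {

    // Re-keys every predicate by the alias it refers to in the current
    // operator's context (stand-in for extractPushdownPreds).
    static Map<String, List<String>> remap(Map<String, List<String>> preds,
                                           Map<String, String> aliasInContext) {
        Map<String, List<String>> out = new TreeMap<>();
        for (List<String> list : preds.values()) {
            for (String pred : list) {
                String alias = aliasInContext.get(pred);
                if (alias != null) {
                    out.computeIfAbsent(alias, k -> new ArrayList<>()).add(pred);
                }
            }
        }
        return out;
    }

    // Keeps only predicates whose alias is qualified for pushdown.
    static Map<String, List<String>> filter(Map<String, List<String>> preds,
                                            Set<String> qualified) {
        Map<String, List<String>> out = new TreeMap<>(preds);
        out.keySet().retainAll(qualified);
        return out;
    }

    // Order described as incorrect: filter on the incoming alias, then remap.
    static Map<String, List<String>> filterThenRemap(Map<String, List<String>> preds,
            Map<String, String> aliasInContext, Set<String> qualified) {
        return remap(filter(preds, qualified), aliasInContext);
    }

    // Order described as correct: remap first, then filter.
    static Map<String, List<String>> remapThenFilter(Map<String, List<String>> preds,
            Map<String, String> aliasInContext, Set<String> qualified) {
        return filter(remap(preds, aliasInContext), qualified);
    }
}
```

 For Q2, the incoming predicate is keyed by "a", it resolves to "b" inside the 
 join, and only "a" (the left side of the left outer join) is qualified. 
 Filtering first lets the predicate through and it ends up on "b"; remapping 
 first rejects it.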
 With the test cases
 {noformat}
 -- Q1: predicate should not be pushed on the right side of a left outer join (this is correct in trunk)
 explain
 SELECT a.foo as foo1, b.foo as foo2, b.bar
 FROM pokes a LEFT OUTER JOIN pokes2 b
 ON a.foo=b.foo
 WHERE b.bar=3;
 -- Q2: predicate should not be pushed on the right side of a left outer join (this is broken in trunk)
 explain
 SELECT * FROM
 (SELECT a.foo as foo1, b.foo as foo2, b.bar
 FROM pokes a LEFT OUTER JOIN pokes2 b
 ON a.foo=b.foo) a
 WHERE a.bar=3;
 -- Q3: predicate should be pushed (this is correct in trunk)
 explain
 SELECT * FROM
 (SELECT a.foo as foo1, b.foo as foo2, a.bar
 FROM pokes a JOIN pokes2 b
 ON a.foo=b.foo) a
 WHERE a.bar=3;
 {noformat}
 The current output is
 {noformat}
 hive> 
     > -- Q1: predicate should not be pushed on the right side of a left outer join
     > explain
     > SELECT a.foo as foo1, b.foo as foo2, b.bar
     > FROM pokes a LEFT OUTER JOIN pokes2 b
     > ON a.foo=b.foo
     > WHERE b.bar=3;
 OK
 ABSTRACT SYNTAX TREE:
   (TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN (TOK_TABREF (TOK_TABNAME pokes) a) 
 (TOK_TABREF (TOK_TABNAME pokes2) b) (= (. (TOK_TABLE_OR_COL a) foo) (. 
 (TOK_TABLE_OR_COL b) foo)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
 TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) foo) foo1) 
 (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) foo) foo2) (TOK_SELEXPR (. 
 (TOK_TABLE_OR_COL b) bar))) (TOK_WHERE (= (. (TOK_TABLE_OR_COL b) bar) 3))))
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 a 
   TableScan
 alias: a
 Reduce Output Operator
   key expressions:
 expr: foo
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: foo
 type: int
   tag: 0
   value expressions:
 expr: foo
 type: int
 b 
   TableScan
 alias: b
 Reduce Output Operator
   key expressions:
 expr: foo
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: foo
 type: int
   tag: 1

[jira] [Resolved] (HIVE-1395) Table aliases are ambiguous


 [ 
https://issues.apache.org/jira/browse/HIVE-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi resolved HIVE-1395.
--

Resolution: Won't Fix

We're fixing the bugs and sticking with the normal SQL rules, which allow 
duplicate aliases, for the reasons mentioned above.


 Table aliases are ambiguous
 ---

 Key: HIVE-1395
 URL: https://issues.apache.org/jira/browse/HIVE-1395
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Adam Kramer

 Consider this query:
 SELECT a.num FROM (
   SELECT a.num AS num, b.num AS num2
   FROM foo a LEFT OUTER JOIN bar b ON a.num=b.num
 ) a
 WHERE a.num2 IS NULL;
 ...in this case, the table alias 'a' is ambiguous. It could be the outer 
 table (i.e., the subquery result), or it could be the inner table (foo).
 In the above case, Hive silently parses the outer reference to a as the inner 
 reference. The result, then, is akin to:
 SELECT foo.num FROM foo WHERE bar.num IS NULL. This is bad.
 The bigger problem, however, is that Hive even lets people use the same table 
 alias at multiple points in the query. We should simply throw an exception 
 during the parse stage if there is any ambiguity in which table is which, 
 just like we do if the column names are ambiguous.
 Or, if for some reason we need people to be able to use 'a' to refer to 
 multiple tables or subqueries, it would be excellent if the exact parsing 
 structure were made clear and added to the wiki. In that case, I will file a 
 separate bug JIRA to complain about how it should be different. :)


[jira] [Commented] (HIVE-2400) Update unittests Hadoop version

2011-08-31 Thread Marcin Kurczych (JIRA)

[
https://issues.apache.org/jira/browse/HIVE-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094999#comment-13094999
]

Marcin Kurczych commented on HIVE-2400:
---

I've manually replaced the hadoop-core and hadoop-tools jars with the Hadoop
0.20.3 ones and almost everything worked (all tests, including the new ones
that were failing because of Hadoop 0.20.1 bugs). I say "almost" because I ran
into one problem: VersionInfo.getVersion() was returning "Unknown", so for
testing I hardcoded something like if ("Unknown".equals(vers)) vers = "0.20.3";
and then everything went perfectly. This must be a problem with the jars; I
used the ones from
https://repository.apache.org/index.html#nexus-search;quick~hadoop .
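
For reference, the temporary workaround amounts to a fallback on the reported
version string. A minimal sketch, with a hypothetical helper name
(VersionFallbackSketch.effectiveVersion) and the fallback passed in rather than
hardcoded; this is a testing aid only, not a proposed fix:

```java
// Hypothetical helper for illustration; not part of Hive or Hadoop.
class VersionFallbackSketch {

    // Returns the fallback when the reported version is the literal
    // "Unknown", otherwise returns the reported version unchanged.
    static String effectiveVersion(String reported, String fallback) {
        return "Unknown".equals(reported) ? fallback : reported;
    }
}
```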

Update unittests Hadoop version
---

Key: HIVE-2400
URL: https://issues.apache.org/jira/browse/HIVE-2400
Project: Hive
Issue Type: Improvement
Reporter: Marcin Kurczych
Assignee: Marcin Kurczych

Hadoop 0.20.1 used in unittests contains bugs that were fixed in later
versions of Hadoop, for example
* har:// connections cannot be indexed by (scheme, authority, username) - the
path is significant as well. Caching them in this way limits a hadoop client
to opening one archive per filesystem. It seems to be safe not to cache them,
since they wrap another connection that does the actual networking.
fixed in https://issues.apache.org/jira/browse/HADOOP-6231 .


[jira] [Resolved] (HIVE-1342) Predicate push down get error result when sub-queries have the same alias name


 [ 
https://issues.apache.org/jira/browse/HIVE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi resolved HIVE-1342.
--

   Resolution: Fixed
Fix Version/s: 0.9.0

Fixed by committing sub-issues (not the patches attached to this issue).

 Predicate push down get error result when sub-queries have the same alias 
 name 
 ---

 Key: HIVE-1342
 URL: https://issues.apache.org/jira/browse/HIVE-1342
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Ted Xu
Assignee: Charles Chen
Priority: Critical
 Fix For: 0.9.0

 Attachments: HIVE-1342v1.patch, HIVE-1342v2.patch, HIVE-1342v3.patch, 
 HIVE-1342v4.patch, cmd.hql, explain, ppd_same_alias_1.patch, 
 ppd_same_alias_2.patch


 Query is over-optimized by PPD when sub-queries have the same alias name, see 
 the query:
 ---
 create table if not exists dm_fact_buyer_prd_info_d (
   category_id string
   ,gmv_trade_num  int
   ,user_id  int
   )
 PARTITIONED BY (ds int);
 set hive.optimize.ppd=true;
 set hive.map.aggr=true;
 explain select category_id1,category_id2,assoc_idx
 from (
   select 
   category_id1
   , category_id2
   , count(distinct user_id) as assoc_idx
   from (
   select 
   t1.category_id as category_id1
   , t2.category_id as category_id2
   , t1.user_id
   from (
   select category_id, user_id
   from dm_fact_buyer_prd_info_d
   group by category_id, user_id ) t1
   join (
   select category_id, user_id
   from dm_fact_buyer_prd_info_d
   group by category_id, user_id ) t2 on 
 t1.user_id=t2.user_id 
   ) t1
   group by category_id1, category_id2 ) t_o
   where category_id1 <> category_id2
   and assoc_idx  2;
 -
 The query above fails on execution, throwing the exception: can not cast 
 UDFOpNotEqual(Text, IntWritable) to UDFOpNotEqual(Text, Text). 
 I ran explain on the query and the execution plan looks really weird (only 
 Stage-1 is shown; see the highlighted predicate):
 ---
 Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t_o:t1:t1:dm_fact_buyer_prd_info_d 
   TableScan
 alias: dm_fact_buyer_prd_info_d
 Filter Operator
   predicate:
   expr: *(category_id <> user_id)*
   type: boolean
   Select Operator
 expressions:
   expr: category_id
   type: string
   expr: user_id
   type: bigint
 outputColumnNames: category_id, user_id
 Group By Operator
   keys:
 expr: category_id
 type: string
 expr: user_id
 type: bigint
   mode: hash
   outputColumnNames: _col0, _col1
   Reduce Output Operator
 key expressions:
   expr: _col0
   type: string
   expr: _col1
   type: bigint
 sort order: ++
 Map-reduce partition columns:
   expr: _col0
   type: string
   expr: _col1
   type: bigint
 tag: -1
   Reduce Operator Tree:
 Group By Operator
   keys:
 expr: KEY._col0
 type: string
 expr: KEY._col1
 type: bigint
   mode: mergepartial
   outputColumnNames: _col0, _col1
   Select Operator
 expressions:
   expr: _col0
   type: string
   expr: _col1
   type: bigint
 outputColumnNames: _col0, _col1
 File Output Operator
   compressed: true
   GlobalTableId: 0
   table:
   input format: 
 org.apache.hadoop.mapred.SequenceFileInputFormat
   output format: 
 org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

Re: Review Request: HIVE-2337: Predicate pushdown erroneously conservative with outer joins


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 00:08:37.474019)


Review request for hive.


Changes
---

Fixed ppd_outer_join4.q.out


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163856 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163856 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles

[jira] [Updated] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


 [ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-2337:
---

Attachment: HIVE-2337v4.patch

 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
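
 The left-associative reading can be sketched as a single left-to-right pass 
 over the join chain. The code below is an illustration with hypothetical 
 names (JoinAliasSketch, pushableAliases), not the attached patch: it tracks 
 which aliases remain eligible after each join, using the per-join rules 
 quoted in the comment above.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch; names are hypothetical and this is not the Hive code.
class JoinAliasSketch {

    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    // Walks a left-associative join chain: first alias, then one
    // (join type, right alias) pair per join, in query order.
    static Set<String> pushableAliases(String first, List<JoinType> types,
                                       List<String> rights) {
        Set<String> eligible = new TreeSet<>();
        eligible.add(first);
        for (int i = 0; i < types.size(); i++) {
            switch (types.get(i)) {
                case INNER:
                    // Both sides stay eligible; the right alias joins the set.
                    eligible.add(rights.get(i));
                    break;
                case LEFT_OUTER:
                    // Left side keeps its eligibility; right alias is excluded.
                    break;
                case RIGHT_OUTER:
                    // Only the right alias is eligible; the left side loses it.
                    eligible.clear();
                    eligible.add(rights.get(i));
                    break;
                case FULL_OUTER:
                    // Neither side is eligible.
                    eligible.clear();
                    break;
            }
        }
        return eligible;
    }
}
```

 For a RIGHT OUTER JOIN b LEFT OUTER JOIN c INNER JOIN d this pass ends with 
 b and d, matching the claim above.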
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 2
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
   Reduce Operator Tree:
 Join Operator
   condition map:
Outer Join 0 to 1
Inner Join 1 to 2
   condition expressions:
 0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 2 {VALUE._col0} {VALUE._col1} {VALUE._col2}
   handleSkewJoin: false
   outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7, 
 _col10, _col11, _col12
   Filter Operator
 predicate:

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins

2011-08-31 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095013#comment-13095013
 ] 

jirapos...@reviews.apache.org commented on HIVE-2337:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 00:08:37.474019)


Review request for hive.


Changes
---

Fixed ppd_outer_join4.q.out


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163856 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163856 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles



 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095015#comment-13095015
 ] 

Charles Chen commented on HIVE-2337:


I've fixed the test output--it seems to be an improvement.

 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 2
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
   Reduce Operator Tree:
 Join Operator
   condition map:
Outer Join 0 to 1
Inner Join 1 to 2
   condition expressions:
 0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 2 {VALUE._col0} {VALUE._col1} {VALUE._col2}
   handleSkewJoin: false
   outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7,

[jira] [Commented] (HIVE-2383) Incorrect alias filtering for predicate pushdown


[ 
https://issues.apache.org/jira/browse/HIVE-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095020#comment-13095020
 ] 

John Sichi commented on HIVE-2383:
--

Oh, um, also:  +1.


 Incorrect alias filtering for predicate pushdown
 

 Key: HIVE-2383
 URL: https://issues.apache.org/jira/browse/HIVE-2383
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Charles Chen
Assignee: Charles Chen
Priority: Critical
 Fix For: 0.9.0

 Attachments: HIVE-2383v1.patch, HIVE-2383v2.patch, HIVE-2383v5.patch


 The predicate pushdown optimizer starts at the topmost operators and traverses 
 the operator tree, at each stage collecting predicates to be pushed down.  At 
 each operator, hive.ql.ppd.OpProcFactory.DefaultPPD.mergeWithChildrenPred is 
 called, which merges the predicates of the children nodes into the current 
 node.  The predicates are stored in hive.ql.ppd.ExprWalkerInfo.pushdownPreds 
 as a map from the alias a predicate refers to (a predicate may only refer to 
 one alias at a time, as only such predicates can be pushed) to a list of such 
 predicates.  Since the alias a predicate refers to may change at each stage 
 (subqueries may change aliases), this map is updated for each operator 
 (hive.ql.ppd.ExprWalkerProcFactory.extractPushdownPreds is called, which walks 
 the ExprNodeDesc for each predicate).  When a JoinOperator is encountered, 
 mergeWithChildrenPred is passed an optional parameter, aliases, which 
 contains the set of aliases that can be pushed per ANSI semantics (see 
 hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases).  The bug is that 
 mergeWithChildrenPred filters aliases before extractPushdownPreds is called. 
 extractPushdownPreds is what associates the predicates with the correct 
 alias in the current operator's context, so the filtering should happen 
 after it.
 In test case Q2 below, when the predicate a.bar=3 comes into the 
 JoinOperator, the incoming alias is a, so it is accepted for pushdown.  
 When brought into the JoinOperator's context, however, the predicate 
 refers to alias b in the inner scope, so we should not actually accept it for 
 pushdown.
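
 The ordering problem described above can be sketched with a toy model. This 
 is not Hive code: the Pred class, the scope map and both method names are 
 hypothetical stand-ins (the scope map plays the role of the expression walk 
 that rewrites a predicate into the operator's own context), and the mapping 
 of the outer a.bar to the inner b.bar follows test case Q2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of the filtering order; hypothetical names, not Hive's classes.
final class AliasFilterOrderSketch {
    /** A predicate on a single column, for example alias "a", column "bar". */
    static final class Pred {
        final String alias;
        final String column;
        Pred(String alias, String column) {
            this.alias = alias;
            this.column = column;
        }
        @Override
        public String toString() {
            return alias + "." + column;
        }
    }

    /** Rewrites a predicate into the operator's own scope, if a mapping exists. */
    static Pred translate(Pred p, Map<String, Pred> scope) {
        Pred inner = scope.get(p.toString());
        return inner != null ? inner : p;
    }

    /** Order described as broken: filter on the incoming alias, then translate. */
    static List<Pred> filterThenTranslate(List<Pred> incoming, Set<String> qualified,
                                          Map<String, Pred> scope) {
        List<Pred> pushed = new ArrayList<>();
        for (Pred p : incoming) {
            if (qualified.contains(p.alias)) {
                pushed.add(translate(p, scope));
            }
        }
        return pushed;
    }

    /** Order proposed in the issue: translate first, then filter on the inner alias. */
    static List<Pred> translateThenFilter(List<Pred> incoming, Set<String> qualified,
                                          Map<String, Pred> scope) {
        List<Pred> pushed = new ArrayList<>();
        for (Pred p : incoming) {
            Pred inner = translate(p, scope);
            if (qualified.contains(inner.alias)) {
                pushed.add(inner);
            }
        }
        return pushed;
    }

    public static void main(String[] args) {
        // Q2: the outer column a.bar is really b.bar inside the subquery.
        Map<String, Pred> scope = new HashMap<>();
        scope.put("a.bar", new Pred("b", "bar"));
        // a LEFT OUTER JOIN b: only the left alias a qualifies for pushdown.
        Set<String> qualified = new HashSet<>(Arrays.asList("a"));
        List<Pred> incoming = Arrays.asList(new Pred("a", "bar"));

        System.out.println(filterThenTranslate(incoming, qualified, scope)); // prints [b.bar]
        System.out.println(translateThenFilter(incoming, qualified, scope)); // prints []
    }
}
```

 In this model the first ordering pushes a predicate on b, the right side of 
 the left outer join, while the second ordering pushes nothing.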
 With the test cases
 {noformat}
 -- Q1: predicate should not be pushed on the right side of a left outer join (this is correct in trunk)
 explain
 SELECT a.foo as foo1, b.foo as foo2, b.bar
 FROM pokes a LEFT OUTER JOIN pokes2 b
 ON a.foo=b.foo
 WHERE b.bar=3;
 -- Q2: predicate should not be pushed on the right side of a left outer join (this is broken in trunk)
 explain
 SELECT * FROM
 (SELECT a.foo as foo1, b.foo as foo2, b.bar
 FROM pokes a LEFT OUTER JOIN pokes2 b
 ON a.foo=b.foo) a
 WHERE a.bar=3;
 -- Q3: predicate should be pushed (this is correct in trunk)
 explain
 SELECT * FROM
 (SELECT a.foo as foo1, b.foo as foo2, a.bar
 FROM pokes a JOIN pokes2 b
 ON a.foo=b.foo) a
 WHERE a.bar=3;
 {noformat}
 The current output is
 {noformat}
 hive>
     > -- Q1: predicate should not be pushed on the right side of a left outer join
     > explain
     > SELECT a.foo as foo1, b.foo as foo2, b.bar
     > FROM pokes a LEFT OUTER JOIN pokes2 b
     > ON a.foo=b.foo
     > WHERE b.bar=3;
 OK
 ABSTRACT SYNTAX TREE:
   (TOK_QUERY (TOK_FROM (TOK_LEFTOUTERJOIN (TOK_TABREF (TOK_TABNAME pokes) a) 
 (TOK_TABREF (TOK_TABNAME pokes2) b) (= (. (TOK_TABLE_OR_COL a) foo) (. 
 (TOK_TABLE_OR_COL b) foo (TOK_INSERT (TOK_DESTINATION (TOK_DIR 
 TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (. (TOK_TABLE_OR_COL a) foo) foo1) 
 (TOK_SELEXPR (. (TOK_TABLE_OR_COL b) foo) foo2) (TOK_SELEXPR (. 
 (TOK_TABLE_OR_COL b) bar))) (TOK_WHERE (= (. (TOK_TABLE_OR_COL b) bar) 3
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias - Map Operator Tree:
 a 
   TableScan
 alias: a
 Reduce Output Operator
   key expressions:
 expr: foo
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: foo
 type: int
   tag: 0
   value expressions:
 expr: foo
 type: int
 b 
   TableScan
 alias: b
 Reduce Output Operator
   key expressions:
 expr: foo
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: foo
 type: int
   tag: 1
   value expressions:
 expr: foo
 type: int
 expr: bar
 type: int

Re: Review Request: HIVE-2337: Predicate pushdown erroneously conservative with outer joins


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 00:19:17.176704)


Review request for hive.


Changes
---

Rebased to current trunk


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins

2011-08-31 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095022#comment-13095022
 ] 

jirapos...@reviews.apache.org commented on HIVE-2337:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 00:19:17.176704)


Review request for hive.


Changes
---

Rebased to current trunk


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles



 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch


 The predicate pushdown filter does not apply the left associativity of joins 
 correctly when determining which aliases are eligible for predicate pushdown.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases are specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since Hive joins are left associative, something like "a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d" should be interpreted as "((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d", so there are cases where a join chain 
 with both left and right outer joins has aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up, while the current criteria 
 deem none eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias - Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id

[jira] [Updated] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


 [ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-2337:
---

Fix Version/s: 0.9.0
   Status: Patch Available  (was: Open)

 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch


 The predicate pushdown filter does not apply the left associativity of joins 
 correctly when determining which aliases are eligible for predicate pushdown.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases are specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since Hive joins are left associative, something like "a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d" should be interpreted as "((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d", so there are cases where a join chain 
 with both left and right outer joins has aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up, while the current criteria 
 deem none eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias - Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 2
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
   Reduce Operator Tree:
 Join Operator
   condition map:
Outer Join 0 to 1
Inner Join 1 to 2
   condition expressions:
 0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 2 {VALUE._col0} {VALUE._col1} {VALUE._col2}
   handleSkewJoin: false
   outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7,

[jira] [Updated] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


 [ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-2337:
---

Attachment: HIVE-2337v5.patch

 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch


 The predicate pushdown filter does not apply the left associativity of joins 
 correctly when determining which aliases are eligible for predicate pushdown.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases are specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since Hive joins are left associative, something like "a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d" should be interpreted as "((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d", so there are cases where a join chain 
 with both left and right outer joins has aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up, while the current criteria 
 deem none eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias - Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 2
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
   Reduce Operator Tree:
 Join Operator
   condition map:
Outer Join 0 to 1
Inner Join 1 to 2
   condition expressions:
 0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 2 {VALUE._col0} {VALUE._col1} {VALUE._col2}
   handleSkewJoin: false
   outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7, 
 _col10, _col11, _col12

Re: Review Request: Support archiving for multiple partitions if the table is partitioned by multiple columns

2011-08-31 Thread Marcin Kurczych


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1259/
---

(Updated 2011-09-01 01:23:23.280266)


Review request for hive, Paul Yang and namit jain.


Changes
---

Reverted accidentally deleted line.


Summary
---

Allows archiving at a chosen partition level. When a table is partitioned by 
ds, hr and min, partitions can be archived at the ds level, the hr level or the 
min level. The corresponding syntax is:
ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08');
ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08', hr='11');
ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08', hr='11', min='30');

You cannot do much with archived partitions. You can read them, but you cannot 
write to them or overwrite them. You can drop a single archived partition, but 
not part of a bigger archive.
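
The three ARCHIVE statements above each fix a leading run of the partition 
columns (ds, then ds and hr, then all three). A minimal sketch of such a check 
follows. It assumes, based only on those three examples, that a partition spec 
must name a prefix of the partition columns; the class and method names are 
hypothetical and this is not the code in the patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: assumes an archive spec must be a prefix of the partition columns.
final class ArchiveLevelSketch {
    /**
     * Returns the archive level (how many leading partition columns the spec
     * fixes), or -1 if the spec is empty or is not a prefix of the columns.
     */
    static int archiveLevel(List<String> partitionColumns, List<String> specColumns) {
        if (specColumns.isEmpty() || specColumns.size() > partitionColumns.size()) {
            return -1;
        }
        for (int i = 0; i < specColumns.size(); i++) {
            if (!partitionColumns.get(i).equals(specColumns.get(i))) {
                return -1;
            }
        }
        return specColumns.size();
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("ds", "hr", "min");
        System.out.println(archiveLevel(cols, Arrays.asList("ds")));       // prints 1
        System.out.println(archiveLevel(cols, Arrays.asList("ds", "hr"))); // prints 2
        System.out.println(archiveLevel(cols, Arrays.asList("hr")));       // prints -1
    }
}
```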


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1153271 
  trunk/data/files/archive_corrupt.rc UNKNOWN 
  trunk/metastore/if/hive_metastore.thrift 1153271 
  trunk/metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h 1153271 
  trunk/metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp 1153271 
  
trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Constants.java
 1153271 
  
trunk/metastore/src/gen/thrift/gen-php/hive_metastore/hive_metastore_constants.php
 1153271 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/constants.py 1153271 
  trunk/metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb 1153271 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1153271 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/DummyPartition.java 
1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1153271 
  trunk/ql/src/test/queries/clientnegative/archive_insert1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_insert2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_insert3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_insert4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi5.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi6.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi7.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_partspec1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_partspec2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_partspec3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/archive_corrupt.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/archive_multi.q PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive1.q.out 1153271 
  trunk/ql/src/test/results/clientnegative/archive2.q.out 1153271 
  trunk/ql/src/test/results/clientnegative/archive_insert1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi5.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi6.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi7.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_partspec1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_partspec2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_partspec3.q.out PRE-CREATION

[jira] [Updated] (HIVE-2388) Facing issues while executing commands on hive shell. The system throws following error: only on Windows Cygwin setup

2011-08-31 Thread Siddharth tiwari (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth tiwari updated HIVE-2388:
---

Fix Version/s: 0.7.1
   Labels: patch  (was: )
 Release Note: 
For Cygwin and Windows, please use the attached start.sh to start Hive rather 
than hive.sh; it calls the same script internally. A version 2 patch with a 
permanent solution in the jar will be uploaded soon.
 Hadoop Flags: [Reviewed]
   Status: Patch Available  (was: Open)

 Facing issues while executing commands on hive shell. The system throws 
 following error: only on Windows Cygwin setup
 -

 Key: HIVE-2388
 URL: https://issues.apache.org/jira/browse/HIVE-2388
 Project: Hive
  Issue Type: Bug
  Components: CLI, Query Processor
Affects Versions: 0.7.1
 Environment: Cygwin Windows
Reporter: Siddharth tiwari
Priority: Critical
  Labels: patch
 Fix For: 0.7.1

 Attachments: start.sh

   Original Estimate: 456h
  Remaining Estimate: 456h

 DDL runs well, but the following command throws an error. Please help with a 
 resolution and how to go about it.
 hive> show tables
     > ;
 FAILED: Hive Internal Error: java.lang.IllegalArgumentException(java.net.URISyntaxException: Relative path in absolute URI: file:C:/cygwin/tmp//siddharth/hive_2011-08-18_03-11-05_208_1818592223695168110)
 java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:C:/cygwin/tmp//siddharth/hive_2011-08-18_03-11-05_208_1818592223695168110
 at org.apache.hadoop.fs.Path.initialize(Path.java:140)
 at org.apache.hadoop.fs.Path.<init>(Path.java:132)
 at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:142)
 at org.apache.hadoop.hive.ql.Context.getLocalScratchDir(Context.java:168)
 at org.apache.hadoop.hive.ql.Context.getLocalTmpFileURI(Context.java:282)
 at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:205)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:736)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:C:/cygwin/tmp//siddharth/hive_2011-08-18_03-11-05_208_1818592223695168110
 at java.net.URI.checkPath(URI.java:1787)
 at java.net.URI.<init>(URI.java:735)
 at org.apache.hadoop.fs.Path.initialize(Path.java:137)
 ... 16 more
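
 The "Relative path in absolute URI" reason in the trace above is raised by 
 java.net.URI when a scheme is supplied together with a path that does not 
 start with "/". The sketch below reproduces only that JDK behavior; the class 
 name, the rejectionReason helper and the shortened scratch path are 
 illustrative, and nothing here is the start.sh workaround or a Hive fix.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative reproduction of the JDK check; hypothetical class and helper names.
final class RelativePathUriSketch {
    /** Returns the URISyntaxException reason, or null if the URI is accepted. */
    static String rejectionReason(String scheme, String path) {
        try {
            new URI(scheme, null, path, null, null);
            return null;
        } catch (URISyntaxException e) {
            return e.getReason();
        }
    }

    public static void main(String[] args) {
        // A Windows-style path has no leading slash, so it is relative for java.net.URI.
        System.out.println(rejectionReason("file", "C:/cygwin/tmp/scratch"));
        // With a leading slash the same path is accepted.
        System.out.println(rejectionReason("file", "/C:/cygwin/tmp/scratch"));
    }
}
```

 The first call returns the reason string seen in the trace and the second 
 returns null.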


[jira] [Updated] (HIVE-2388) Facing issues while executing commands on hive shell. The system throws following error: only on Windows Cygwin setup

2011-08-31 Thread Siddharth tiwari (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth tiwari updated HIVE-2388:
---

Attachment: start.sh

use this file to start hive on cygwin

bin/start.sh


[jira] [Updated] (HIVE-2388) Facing issues while executing commands on hive shell. The system throws following error: only on Windows Cygwin setup

2011-08-31 Thread Siddharth tiwari (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth tiwari updated HIVE-2388:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)


[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION

2011-08-31 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095082#comment-13095082
 ] 

jirapos...@reviews.apache.org commented on HIVE-2247:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1105/
---

(Updated 2011-09-01 02:23:59.714244)


Review request for Siying Dong.


Changes
---

+work.getInputs().add(new ReadEntity(oldPart));
+work.getOutputs().add(new WriteEntity(newPart));


Summary
---

Implement an ALTER TABLE PARTITION RENAME function to rename a partition. 
Add the HiveQL syntax ALTER TABLE bar PARTITION (k1='v1', k2='v2') RENAME TO 
PARTITION (k1='v3', k2='v4');
This is my first Hive diff. I learned everything from the existing codebase and 
may not understand it well yet. 
Please let me know if I got something wrong. Thanks.
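
For readers who want to generate the statement proposed in the summary above from code, here is a small sketch. It only builds the HiveQL string; the class RenamePartitionSql and its methods are hypothetical and are not part of the patch. It assumes partition values are plain strings without quote characters, and the syntax is the one proposed in this review request.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of the patch.
final class RenamePartitionSql {

    // Renders a partition spec such as (k1='v1', k2='v2').
    // Assumes values contain no quote characters.
    static String spec(Map<String, String> part) {
        StringJoiner joiner = new StringJoiner(", ", "(", ")");
        for (Map.Entry<String, String> e : part.entrySet()) {
            joiner.add(e.getKey() + "='" + e.getValue() + "'");
        }
        return joiner.toString();
    }

    // Builds the statement in the form given in the review summary.
    static String renameStatement(String table,
                                  Map<String, String> oldPart,
                                  Map<String, String> newPart) {
        return "ALTER TABLE " + table + " PARTITION " + spec(oldPart)
                + " RENAME TO PARTITION " + spec(newPart);
    }

    public static void main(String[] args) {
        Map<String, String> oldPart = new LinkedHashMap<>();
        oldPart.put("k1", "v1");
        oldPart.put("k2", "v2");
        Map<String, String> newPart = new LinkedHashMap<>();
        newPart.put("k1", "v3");
        newPart.put("k2", "v4");
        System.out.println(renameStatement("bar", oldPart, newPart));
    }
}
```

A LinkedHashMap is used so that the partition keys are printed in insertion order.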


This addresses bug HIVE-2247.
https://issues.apache.org/jira/browse/HIVE-2247


Diffs (updated)
-

  trunk/metastore/if/hive_metastore.thrift 1145366 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1145366 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1145366 
  
trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 
1145366 
  
trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1145366 
  trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 
1145366 
  
trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 
1145366 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
1145366 
  trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1145366 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
1145366 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1145366 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1145366 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1145366 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1145366 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1145366 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1145366 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java 
PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure.q 
PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure2.q 
PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure3.q 
PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/alter_rename_partition.q 
PRE-CREATION 
  
trunk/ql/src/test/queries/clientpositive/alter_rename_partition_authorization.q 
PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out 
PRE-CREATION 
  
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out 
PRE-CREATION 
  
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out 
PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/alter_rename_partition.q.out 
PRE-CREATION 
  
trunk/ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1105/diff


Testing
---

Add a partition A in the table
Rename partition A to partition B
Show the partitions in the table, it returns partition B.
SELECT the data from partition A, it returns no results
SELECT the data from partition B, it returns the data originally stored in 
partition A


Thanks,

Weiyan



 ALTER TABLE RENAME PARTITION
 

 Key: HIVE-2247
 URL: https://issues.apache.org/jira/browse/HIVE-2247
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Weiyan Wang
 Attachments: HIVE-2247.3.patch.txt,

[jira] [Updated] (HIVE-2247) ALTER TABLE RENAME PARTITION

2011-08-31 Thread Weiyan Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiyan Wang updated HIVE-2247:
--

Attachment: HIVE-2247.8.patch.txt

+work.getInputs().add(new ReadEntity(oldPart));
+work.getOutputs().add(new WriteEntity(newPart));

 ALTER TABLE RENAME PARTITION
 

 Key: HIVE-2247
 URL: https://issues.apache.org/jira/browse/HIVE-2247
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Weiyan Wang
 Attachments: HIVE-2247.3.patch.txt, HIVE-2247.4.patch.txt, 
 HIVE-2247.5.patch.txt, HIVE-2247.6.patch.txt, HIVE-2247.7.patch.txt, 
 HIVE-2247.8.patch.txt


 We need an ALTER TABLE RENAME PARTITION function that is similar to ALTER 
 TABLE RENAME.


Re: Review Request: HIVE-2337: Predicate pushdown erroneously conservative with outer joins

2011-08-31 Thread John Sichi


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/#review1710
---



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
https://reviews.apache.org/r/1275/#comment3884

There is a weird non-ASCII character on this line.


- John


On 2011-09-01 00:19:17, Charles Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1275/
 ---
 
 (Updated 2011-09-01 00:19:17)
 
 
 Review request for hive.
 
 
 Summary
 ---
 
 https://issues.apache.org/jira/browse/HIVE-2337
 
 
 This addresses bug HIVE-2337.
 https://issues.apache.org/jira/browse/HIVE-2337
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
  1163875 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
  1163875 
 
 Diff: https://reviews.apache.org/r/1275/diff
 
 
 Testing
 ---
 
 Unit tests passed
 
 
 Thanks,
 
 Charles

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins

2011-08-31 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095104#comment-13095104
 ] 

jirapos...@reviews.apache.org commented on HIVE-2337:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/#review1710
---



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
https://reviews.apache.org/r/1275/#comment3884

There is a weird non-ASCII character on this line.


- John


On 2011-09-01 00:19:17, Charles Chen wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1275/
bq.  ---
bq.  
bq.  (Updated 2011-09-01 00:19:17)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HIVE-2337
bq.  
bq.  
bq.  This addresses bug HIVE-2337.
bq.  https://issues.apache.org/jira/browse/HIVE-2337
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
bq.
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 
bq.  
bq.  Diff: https://reviews.apache.org/r/1275/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Unit tests passed
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Charles
bq.  
bq.



 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
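
 The left-associative reading described above can be sketched as a single pass over the join chain. This is an illustration only, not the code in the attached patches: the class QualifiedAliasesSketch, the Join type and the method qualifiedAliases are hypothetical names, and the rule is a simplification that assumes each join adds exactly one alias on its right side.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the Hive implementation.
final class QualifiedAliasesSketch {

    enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

    // One step of a left-associative chain: (everything so far) <type> rightAlias.
    static final class Join {
        final JoinType type;
        final String rightAlias;

        Join(JoinType type, String rightAlias) {
            this.type = type;
            this.rightAlias = rightAlias;
        }
    }

    // Returns the aliases that are never on the null-supplying side of an outer join.
    static Set<String> qualifiedAliases(String firstAlias, List<Join> chain) {
        Set<String> qualified = new LinkedHashSet<>();
        qualified.add(firstAlias);
        for (Join join : chain) {
            switch (join.type) {
                case INNER:
                    // Both sides are preserved.
                    qualified.add(join.rightAlias);
                    break;
                case LEFT_OUTER:
                    // The left side is preserved; the right side may be null-extended.
                    break;
                case RIGHT_OUTER:
                    // The right side is preserved; everything on the left may be null-extended.
                    qualified.clear();
                    qualified.add(join.rightAlias);
                    break;
                case FULL_OUTER:
                    // Neither side is preserved.
                    qualified.clear();
                    break;
            }
        }
        return qualified;
    }

    public static void main(String[] args) {
        // ((a RIGHT OUTER JOIN b) LEFT OUTER JOIN c) INNER JOIN d
        List<Join> chain = new ArrayList<>();
        chain.add(new Join(JoinType.RIGHT_OUTER, "b"));
        chain.add(new Join(JoinType.LEFT_OUTER, "c"));
        chain.add(new Join(JoinType.INNER, "d"));
        System.out.println(qualifiedAliases("a", chain)); // prints [b, d]
    }
}
```

 Under this rule a chain that mixes left and right outer joins is not automatically treated as a full outer join, which is the behaviour this issue asks for.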
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095105#comment-13095105
 ] 

John Sichi commented on HIVE-2337:
--

Charles, did you intentionally omit the new ppd_outer_join5.q from the latest 
patch?

Also, there's a weird non-ASCII character in the Javadoc.


 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 2
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
   Reduce Operator Tree:
 Join Operator
   condition map:
Outer Join 0 to 1
Inner Join 1 to 2
   condition expressions:
 0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 2 {VALUE._col0}

[jira] [Commented] (HIVE-1545) Add a bunch of UDFs and UDAFs

2011-08-31 Thread cyril liao (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095106#comment-13095106
 ] 

cyril liao commented on HIVE-1545:
--

com.facebook.hive.udf.lib.UDFUtils is not included.

Would you please upload it?

 Add a bunch of UDFs and UDAFs
 -

 Key: HIVE-1545
 URL: https://issues.apache.org/jira/browse/HIVE-1545
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Jonathan Chang
Assignee: Jonathan Chang
Priority: Minor
 Attachments: core.tar.gz, ext.tar.gz, udfs.tar.gz, udfs.tar.gz


 Here are some UD(A)Fs which can be incorporated into the Hive distribution:
 UDFArgMax - Find the 0-indexed index of the largest argument. e.g., ARGMAX(4, 
 5, 3) returns 1.
 UDFBucket - Find the bucket in which the first argument belongs. e.g., 
 BUCKET(x, b_1, b_2, b_3, ...), will return the smallest i such that x > b_{i} 
 but <= b_{i+1}. Returns 0 if x is smaller than all the buckets.
 UDFFindInArray - Finds the 1-index of the first element in the array given as 
 the second argument. Returns 0 if not found. Returns NULL if either argument 
 is NULL. E.g., FIND_IN_ARRAY(5, array(1,2,5)) will return 3. FIND_IN_ARRAY(5, 
 array(1,2,3)) will return 0.
 UDFGreatCircleDist - Finds the great circle distance (in km) between two 
 lat/long coordinates (in degrees).
 UDFLDA - Performs LDA inference on a vector given fixed topics.
 UDFNumberRows - Number successive rows starting from 1. Counter resets to 1 
 whenever any of its parameters changes.
 UDFPmax - Finds the maximum of a set of columns. e.g., PMAX(4, 5, 3) returns 
 5.
 UDFRegexpExtractAll - Like REGEXP_EXTRACT except that it returns all matches 
 in an array.
 UDFUnescape - Returns the string unescaped (using C/Java style unescaping).
 UDFWhich - Given a boolean array, return the indices which are TRUE.
 UDFJaccard
 UDAFCollect - Takes all the values associated with a row and converts it into 
 a list. Make sure to have: set hive.map.aggr = false;
 UDAFCollectMap - Like collect except that it takes tuples and generates a map.
 UDAFEntropy - Compute the entropy of a column.
 UDAFPearson (BROKEN!!!) - Computes the pearson correlation between two 
 columns.
 UDAFTop - TOP(KEY, VAL) - returns the KEY associated with the largest value 
 of VAL.
 UDAFTopN (BROKEN!!!) - Like TOP except returns a list of the keys associated 
 with the N (passed as the third parameter) largest values of VAL.
 UDAFHistogram
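
 The semantics described above for ARGMAX, PMAX and FIND_IN_ARRAY can be sketched in plain Java as follows. These are not the UDF classes from the attached tarballs: the class UdfSemanticsSketch and its methods are hypothetical, they work on plain Java values rather than Hive objects, and they assume at least one argument and that ties go to the first largest value.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the described semantics; not the attached UDF code.
final class UdfSemanticsSketch {

    // ARGMAX: 0-indexed position of the largest argument (first one on ties).
    static int argMax(double... values) {
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }

    // PMAX: the largest of the given values.
    static double pMax(double... values) {
        return values[argMax(values)];
    }

    // FIND_IN_ARRAY: 1-indexed position of the first match, 0 if absent,
    // null if either argument is null.
    static Integer findInArray(Integer needle, List<Integer> haystack) {
        if (needle == null || haystack == null) {
            return null;
        }
        return haystack.indexOf(needle) + 1;
    }

    public static void main(String[] args) {
        System.out.println(argMax(4, 5, 3));                        // prints 1
        System.out.println(pMax(4, 5, 3));                          // prints 5.0
        System.out.println(findInArray(5, Arrays.asList(1, 2, 5))); // prints 3
        System.out.println(findInArray(5, Arrays.asList(1, 2, 3))); // prints 0
    }
}
```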


Re: Review Request: HIVE-2337: Predicate pushdown erroneously conservative with outer joins


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 04:26:59.076177)


Review request for hive.


Changes
---

Oops, fixed the dropped unit test and the Javadoc character.


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins

2011-08-31 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095110#comment-13095110
 ] 

jirapos...@reviews.apache.org commented on HIVE-2337:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 04:26:59.076177)


Review request for hive.


Changes
---

Oops, fixed the dropped unit test and the Javadoc character.


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles



 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key

[jira] [Updated] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


 [ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-2337:
---

Attachment: HIVE-2337v6.patch

 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch, HIVE-2337v6.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 2
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
   Reduce Operator Tree:
 Join Operator
   condition map:
Outer Join 0 to 1
Inner Join 1 to 2
   condition expressions:
 0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
 2 {VALUE._col0} {VALUE._col1} {VALUE._col2}
   handleSkewJoin: false
   outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7, 
 _col10, _col11,

Re: Review Request: HIVE-2337: Predicate pushdown erroneously conservative with outer joins


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 04:42:25.815081)


Review request for hive.


Changes
---

Added TestParse changes


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/compiler/plan/input4.q.xml
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/compiler/plan/join8.q.xml
 1163875 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins

2011-08-31 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13095116#comment-13095116
 ] 

jirapos...@reviews.apache.org commented on HIVE-2337:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-09-01 04:42:25.815081)


Review request for hive.


Changes
---

Added TestParse changes


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join4.q.out
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/compiler/plan/input4.q.xml
 1163875 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/compiler/plan/join8.q.xml
 1163875 

Diff: https://reviews.apache.org/r/1275/diff


Testing
---

Unit tests passed


Thanks,

Charles



 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch, HIVE-2337v6.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr:

[jira] [Updated] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins


 [ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-2337:
---

Attachment: HIVE-2337v7.patch

 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Fix For: 0.9.0

 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch, 
 HIVE-2337v4.patch, HIVE-2337v5.patch, HIVE-2337v6.patch, HIVE-2337v7.patch


 Predicate pushdown does not apply the left associativity of joins correctly 
 when determining the aliases to which predicates can be pushed.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 qualifying aliases are specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can be
  * pushed For full outer join, none of the predicates can be pushed as that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since Hive joins are left associative, a chain such as a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so a chain containing both left and 
 right outer joins can still have aliases whose predicates can be pushed.  
 Here, aliases b and d are eligible to be pushed up, whereas the current 
 criteria treat none as eligible.
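 To make the left-associative reading concrete, here is a minimal sketch of 
 how the qualified aliases could be derived by walking the chain from left to 
 right.  It is not the code in OpProcFactory.JoinPPD and not the attached 
 patch: the names QualifiedAliasSketch, JoinType and JoinStep are 
 hypothetical, and it assumes a plain left-to-right chain of two-way joins 
 with predicates coming from the WHERE clause.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only; hypothetical names, not Hive's actual implementation. */
final class QualifiedAliasSketch {

  enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

  /** One step of a left-associative chain: (everything so far) JOIN rightAlias. */
  static final class JoinStep {
    final JoinType type;
    final String rightAlias;

    JoinStep(JoinType type, String rightAlias) {
      this.type = type;
      this.rightAlias = rightAlias;
    }
  }

  /**
   * Returns the aliases that stay on the row-preserving side of every join
   * applied after them, i.e. the aliases whose predicates may be pushed.
   */
  static Set<String> qualifiedAliases(String firstAlias, List<JoinStep> chain) {
    Set<String> qualified = new LinkedHashSet<>();
    qualified.add(firstAlias);
    for (JoinStep step : chain) {
      switch (step.type) {
        case INNER:
          // Both sides are filtered by the join: the new alias qualifies too.
          qualified.add(step.rightAlias);
          break;
        case LEFT_OUTER:
          // Left side is preserved; the right alias is null-supplying.
          break;
        case RIGHT_OUTER:
          // Only the right alias is preserved; everything joined so far is not.
          qualified.clear();
          qualified.add(step.rightAlias);
          break;
        case FULL_OUTER:
          // Neither side is preserved.
          qualified.clear();
          break;
      }
    }
    return qualified;
  }

  public static void main(String[] args) {
    // ((a RIGHT OUTER JOIN b) LEFT OUTER JOIN c) INNER JOIN d
    List<JoinStep> chain = Arrays.asList(
        new JoinStep(JoinType.RIGHT_OUTER, "b"),
        new JoinStep(JoinType.LEFT_OUTER, "c"),
        new JoinStep(JoinType.INNER, "d"));
    System.out.println(qualifiedAliases("a", chain)); // prints [b, d]
  }
}
```

 Under the same assumptions, t1 FULL OUTER JOIN t2 JOIN t3 yields only t3, 
 since the full outer join disqualifies t1 and t2 and the later inner join 
 adds t3.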
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
     Map Reduce
       Alias -> Map Operator Tree:
         t1
           TableScan
             alias: t1
             Reduce Output Operator
               key expressions:
                     expr: id
                     type: int
               sort order: +
               Map-reduce partition columns:
                     expr: id
                     type: int
               tag: 0
               value expressions:
                     expr: id
                     type: int
                     expr: key
                     type: string
                     expr: value
                     type: string
         t2
           TableScan
             alias: t2
             Reduce Output Operator
               key expressions:
                     expr: id
                     type: int
               sort order: +
               Map-reduce partition columns:
                     expr: id
                     type: int
               tag: 1
               value expressions:
                     expr: id
                     type: int
                     expr: key
                     type: string
                     expr: value
                     type: string
         t3
           TableScan
             alias: t3
             Reduce Output Operator
               key expressions:
                     expr: id
                     type: int
               sort order: +
               Map-reduce partition columns:
                     expr: id
                     type: int
               tag: 2
               value expressions:
                     expr: id
                     type: int
                     expr: key
                     type: string
                     expr: value
                     type: string
       Reduce Operator Tree:
         Join Operator
           condition map:
                Outer Join 0 to 1
                Inner Join 1 to 2
           condition expressions:
             0 {VALUE._col0} {VALUE._col1} {VALUE._col2}
             1 {VALUE._col0} {VALUE._col1} {VALUE._col2}
             2 {VALUE._col0} {VALUE._col1} {VALUE._col2}
           handleSkewJoin: false
           outputColumnNames: _col0, _col1, _col2, _col5, _col6, _col7,

Re: Review Request: HIVE-1989: recognize transitivity of predicates on join keys