[jira] [Updated] (HIVE-2344) filter is removed due to regression of HIVE-1538

2011-08-09 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-2344:
--

Attachment: ppd_udf_col.q.out.txt

bq. Any other filter on 'udf selected as column alias in select' will also be 
pushed down always.

Attaching test output with faulty explain plans. 

 filter is removed due to regression of HIVE-1538
 

 Key: HIVE-2344
 URL: https://issues.apache.org/jira/browse/HIVE-2344
 Project: Hive
  Issue Type: Bug
Reporter: He Yongqiang
Assignee: Amareshwari Sriramadasu
 Attachments: ppd_udf_col.q.out.txt


  select * from 
  (
  select type_bucket,randum123
  from (SELECT *, cast(rand() as double) AS randum123 FROM tbl where ds = ...) 
 a
  where randum123 <=0.1)s where s.randum123 <> 0.1 limit 20;
 This is returning results...
 and 
  explain
  select type_bucket,randum123
  from (SELECT *, cast(rand() as double) AS randum123 FROM tbl where ds = ...) 
 a
  where randum123 <=0.1
 shows that there is no filter.
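The regression is easiest to see by simulating the two plans outside Hive: because rand() is nondeterministic, pushing the filter below the select re-evaluates the UDF, so the value the filter tested is not the value the outer query sees. A minimal Python sketch (function names are illustrative, not Hive code):

```python
import random

def plan_correct(rows, seed):
    # Materialize the alias once per row, then filter on that stored value.
    rng = random.Random(seed)
    projected = [(r, rng.random()) for r in rows]
    return [(r, v) for r, v in projected if v <= 0.1]

def plan_pushed(rows, seed):
    # Wrong: the pushed-down filter re-evaluates rand(), so the rows that
    # survive carry a fresh draw unrelated to the value that was filtered.
    rng = random.Random(seed)
    kept = [r for r in rows if rng.random() <= 0.1]
    return [(r, rng.random()) for r in kept]
```

With the correct plan every surviving value is <= 0.1 by construction; with the pushed plan the surviving rows carry fresh draws, which is why an outer predicate like s.randum123 <> 0.1 can still return rows.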

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2344) filter is removed due to regression of HIVE-1538

2011-08-09 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-2344:
--

Attachment: hive-patch-2344.txt

bq. Any other filter on 'udf selected as column alias in select' will also be 
pushed down always. Do we want to do this? Might address in a separate jira.

Addressing this also in the patch.





[jira] [Commented] (HIVE-2344) filter is removed due to regression of HIVE-1538

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081567#comment-13081567
 ] 

jirapos...@reviews.apache.org commented on HIVE-2344:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1404/
---

Review request for hive, John Sichi and Yongqiang He.


Summary
---

Any filter on 'udf selected as column alias in select' will be pushed down 
through the select operator, which it should not be. The patch addresses this 
by walking the UDF expression again.


This addresses bug HIVE-2344.
https://issues.apache.org/jira/browse/HIVE-2344
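The fix described in the summary hinges on classifying a predicate as non-pushable when any node of its expression tree is a nondeterministic UDF. A rough Python sketch of such a walk (the node encoding and names are made up, not Hive's ExprWalkerProcFactory):

```python
# Hypothetical expression nodes: ("col", name), ("lit", value), or
# ("call", function_name, [child, ...]). Not Hive's internal representation.
NONDETERMINISTIC = {"rand", "unix_timestamp"}

def is_pushable(expr):
    """A predicate is safe to push below a select only if every function
    it calls is deterministic."""
    if expr[0] in ("col", "lit"):
        return True
    _, fname, children = expr
    return fname not in NONDETERMINISTIC and all(is_pushable(c) for c in children)

# randum123 <= 0.1, where randum123 aliases cast(rand() as double):
filter_on_udf = ("call", "<=",
                 [("call", "cast", [("call", "rand", [])]), ("lit", 0.1)])
# ds = '2011-08-09', a plain partition-column filter:
filter_on_col = ("call", "=", [("col", "ds"), ("lit", "2011-08-09")])
```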


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 
1153812 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1153812 
  trunk/ql/src/test/queries/clientpositive/ppd_udf_col.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/ppd_udf_col.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1404/diff


Testing
---


Thanks,

Amareshwari







[jira] [Updated] (HIVE-2344) filter is removed due to regression of HIVE-1538

2011-08-09 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-2344:
--

Fix Version/s: 0.8.0
Affects Version/s: 0.8.0
   Status: Patch Available  (was: Open)





[jira] [Commented] (HIVE-2358) JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types

2011-08-09 Thread Mythili Gopalakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081610#comment-13081610
 ] 

Mythili Gopalakrishnan commented on HIVE-2358:
--

Tested this patch and here are my results.

With Patch HIVE-2358 on a FLOAT Column
---
DatabaseMetaData.getColumns() COLUMN_SIZE returns 7
DatabaseMetaData.getColumns() DECIMAL_DIGITS returns 7

ResultSetMetaData.getPrecision() returns 7
ResultSetMetaData.getScale() returns 7

With Patch HIVE-2358 on a DOUBLE Column
---
DatabaseMetaData.getColumns() COLUMN_SIZE returns 15
DatabaseMetaData.getColumns() DECIMAL_DIGITS returns 15

ResultSetMetaData.getPrecision() returns 15
ResultSetMetaData.getScale() returns 15

--Mythili
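A way to read these results: both metadata APIs now answer from the same fixed per-type constants, so precision and scale cannot drift apart. A hypothetical sketch of that shared table (names are illustrative, not the Hive JDBC driver's API):

```python
# One shared table of (precision, scale) per type, mirroring the patched
# values reported above; function names here are illustrative.
TYPE_META = {"float": (7, 7), "double": (15, 15)}

def column_size(hive_type):
    # What DatabaseMetaData.getColumns() would report as COLUMN_SIZE.
    return TYPE_META[hive_type.lower()][0]

def decimal_digits(hive_type):
    # What DatabaseMetaData.getColumns() would report as DECIMAL_DIGITS.
    return TYPE_META[hive_type.lower()][1]

# ResultSetMetaData delegates to the same table, so getPrecision() and
# getScale() can never disagree with getColumns().
get_precision = column_size
get_scale = decimal_digits
```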

 JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types
 --

 Key: HIVE-2358
 URL: https://issues.apache.org/jira/browse/HIVE-2358
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.8.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2358.patch


 My patch for HIVE-1631 did not ensure the following (from comment on 1631):
 -
 Mythili Gopalakrishnan added a comment - 08/Aug/11 08:42
 Just tested this fix and does NOT work correctly. Here are my findings on a 
 FLOAT column
 Without Patch on a FLOAT Column
 
 DatabaseMetaData.getColumns () COLUMN_SIZE returns 12
 DatabaseMetaData.getColumns () DECIMAL_DIGITS - returns 0
 ResultSetMetaData.getPrecision() returns 0
 ResultSetMetaData.getScale() returns 0
 With Patch on a FLOAT Column
 
 DatabaseMetaData.getColumns () COLUMN_SIZE returns 24
 DatabaseMetaData.getColumns () DECIMAL_DIGITS - returns 0
 ResultSetMetaData.getPrecision() returns 7
 ResultSetMetaData.getScale() returns 7
 Also both DatabaseMetadata and ResultSetMetaData must return the same 
 information for Precision and Scale for FLOAT,DOUBLE types.





[jira] [Commented] (HIVE-2196) Ensure HiveConf includes all properties defined in hive-default.xml

2011-08-09 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081630#comment-13081630
 ] 

Chinna Rao Lalam commented on HIVE-2196:


Patch updated with the information below.

*The configurations below are not used in the code base, so they were removed 
from hive-default.xml:*

hive.mapjoin.hashtable.initialCapacity 
hive.mapjoin.hashtable.loadfactor 
hive.mapjoin.smalltable.filesize 
hive.optimize.pruner 
hive.stats.jdbc.atomic 
hive.concurrency.manager 

*The configurations below are not used in the code base but are required by 
the system, so they need to stay in hive-default.xml and do not need to be 
added to HiveConf.java:*

javax.jdo.option.ConnectionDriverName 
javax.jdo.PersistenceManagerFactoryClass 
javax.jdo.option.DetachAllOnCommit 
javax.jdo.option.NonTransactionalRead 
javax.jdo.option.ConnectionUserName 
fs.har.impl 
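An audit like this can be mechanized by diffing the property names in hive-default.xml against those declared in HiveConf. A self-contained sketch using a toy inline document in place of the real file:

```python
import xml.etree.ElementTree as ET

def xml_property_names(xml_text):
    """Collect <name> values from a Hadoop-style <configuration> document."""
    root = ET.fromstring(xml_text)
    return {prop.findtext("name") for prop in root.iter("property")}

def diff_conf(xml_text, hiveconf_names):
    # Report names present in only one of the two sources.
    xml_names = xml_property_names(xml_text)
    return {"only_in_xml": sorted(xml_names - hiveconf_names),
            "only_in_hiveconf": sorted(hiveconf_names - xml_names)}

SAMPLE = """<configuration>
  <property><name>hive.optimize.ppd</name><value>true</value></property>
  <property><name>hive.optimize.pruner</name><value>true</value></property>
</configuration>"""
```

Running `diff_conf(SAMPLE, {"hive.optimize.ppd"})` flags `hive.optimize.pruner` as defined only in the XML, the same situation the list above documents.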


 Ensure HiveConf includes all properties defined in hive-default.xml
 ---

 Key: HIVE-2196
 URL: https://issues.apache.org/jira/browse/HIVE-2196
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2196.patch


 There are a bunch of properties that are defined in hive-default.xml but not 
 in HiveConf.





[jira] [Updated] (HIVE-2196) Ensure HiveConf includes all properties defined in hive-default.xml

2011-08-09 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2196:
---

Attachment: HIVE-2196.1.patch





Re: Review Request: HIVE-2337: Predicate pushdown erroneously conservative with outer joins

2011-08-09 Thread Charles Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-08-09 17:41:41.427733)


Review request for hive.


Changes
---

Unit tests passed


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1153598 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1275/diff


Testing (updated)
---

Unit tests passed


Thanks,

Charles



Re: Review Request: HIVE-1342: Predicate push down get error result when sub-queries have the same alias name

2011-08-09 Thread Charles Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1329/
---

(Updated 2011-08-09 17:42:48.808459)


Review request for hive.


Changes
---

Unit tests passed


Summary
---

https://issues.apache.org/jira/browse/HIVE-1342


This addresses bug HIVE-1342.
https://issues.apache.org/jira/browse/HIVE-1342


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1155166 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_repeated_alias.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_repeated_alias.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1329/diff


Testing (updated)
---

Unit tests passed


Thanks,

Charles



[jira] [Commented] (HIVE-1342) Predicate push down get error result when sub-queries have the same alias name

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081789#comment-13081789
 ] 

jirapos...@reviews.apache.org commented on HIVE-1342:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1329/
---

(Updated 2011-08-09 17:42:48.808459)


Review request for hive.


Changes
---

Unit tests passed


Summary
---

https://issues.apache.org/jira/browse/HIVE-1342


This addresses bug HIVE-1342.
https://issues.apache.org/jira/browse/HIVE-1342


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1155166 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_repeated_alias.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_repeated_alias.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1329/diff


Testing (updated)
---

Unit tests passed


Thanks,

Charles



 Predicate push down get error result when sub-queries have the same alias 
 name 
 ---

 Key: HIVE-1342
 URL: https://issues.apache.org/jira/browse/HIVE-1342
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Ted Xu
Assignee: Charles Chen
Priority: Critical
 Attachments: HIVE-1342v1.patch, cmd.hql, explain, 
 ppd_same_alias_1.patch, ppd_same_alias_2.patch


 Query is over-optimized by PPD when sub-queries have the same alias name, see 
 the query:
 ---
 create table if not exists dm_fact_buyer_prd_info_d (
   category_id string
   ,gmv_trade_num  int
   ,user_id int
   )
 PARTITIONED BY (ds int);
 set hive.optimize.ppd=true;
 set hive.map.aggr=true;
 explain select category_id1,category_id2,assoc_idx
 from (
   select 
   category_id1
   , category_id2
   , count(distinct user_id) as assoc_idx
   from (
   select 
   t1.category_id as category_id1
   , t2.category_id as category_id2
   , t1.user_id
   from (
   select category_id, user_id
   from dm_fact_buyer_prd_info_d
   group by category_id, user_id ) t1
   join (
   select category_id, user_id
   from dm_fact_buyer_prd_info_d
   group by category_id, user_id ) t2 on 
 t1.user_id=t2.user_id 
   ) t1
   group by category_id1, category_id2 ) t_o
   where category_id1 <> category_id2
   and assoc_idx > 2;
 -
 The query above fails when executed, throwing the exception: cannot cast 
 UDFOpNotEqual(Text, IntWritable) to UDFOpNotEqual(Text, Text). 
 I explained the query and the execution plan looks really weird (only Stage-1 
 shown; see the highlighted predicate):
 ---
 Stage: Stage-1
 Map Reduce
   Alias -> Map Operator Tree:
 t_o:t1:t1:dm_fact_buyer_prd_info_d 
   TableScan
 alias: dm_fact_buyer_prd_info_d
 Filter Operator
   predicate:
   expr: *(category_id <> user_id)*
   type: boolean
   Select Operator
 expressions:
   expr: category_id
   type: string
   expr: user_id
   type: bigint
 outputColumnNames: category_id, user_id
 Group By Operator
   keys:
 expr: category_id
 type: string
 expr: user_id
 type: bigint
   mode: hash
   outputColumnNames: _col0, _col1
   Reduce Output Operator
 key expressions:
   expr: _col0
   type: string
   expr: _col1
   type: bigint
 sort order: ++
 Map-reduce partition columns:
   expr: _col0
   type: string
   expr: _col1
   type: bigint
 

[jira] [Commented] (HIVE-2337) Predicate pushdown erroneously conservative with outer joins

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081788#comment-13081788
 ] 

jirapos...@reviews.apache.org commented on HIVE-2337:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1275/
---

(Updated 2011-08-09 17:41:41.427733)


Review request for hive.


Changes
---

Unit tests passed


Summary
---

https://issues.apache.org/jira/browse/HIVE-2337


This addresses bug HIVE-2337.
https://issues.apache.org/jira/browse/HIVE-2337


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
 1153598 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/ppd_outer_join5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/ppd_outer_join5.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1275/diff


Testing (updated)
---

Unit tests passed


Thanks,

Charles



 Predicate pushdown erroneously conservative with outer joins
 

 Key: HIVE-2337
 URL: https://issues.apache.org/jira/browse/HIVE-2337
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Charles Chen
Assignee: Charles Chen
 Attachments: HIVE-2337v1.patch, HIVE-2337v2.patch, HIVE-2337v3.patch


 The predicate pushdown filter is not applying left associativity of joins 
 correctly in determining possible aliases for pushing predicates.
 In hive.ql.ppd.OpProcFactory.JoinPPD.getQualifiedAliases, the criteria for 
 pushing aliases is specified as:
 {noformat}
 /**
  * Figures out the aliases for whom it is safe to push predicates based on
  * ANSI SQL semantics For inner join, all predicates for all aliases can 
 be
  * pushed For full outer join, none of the predicates can be pushed as 
 that
  * would limit the number of rows for join For left outer join, all the
  * predicates on the left side aliases can be pushed up For right outer
  * join, all the predicates on the right side aliases can be pushed up 
 Joins
  * chain containing both left and right outer joins are treated as full
  * outer join. [...]
  *
  * @param op
  *  Join Operator
  * @param rr
  *  Row resolver
  * @return set of qualified aliases
  */
 {noformat}
 Since hive joins are left associative, something like a RIGHT OUTER JOIN b 
 LEFT OUTER JOIN c INNER JOIN d should be interpreted as ((a RIGHT OUTER 
 JOIN b) LEFT OUTER JOIN c) INNER JOIN d, so there would be cases where joins 
 with both left and right outer joins can have aliases that can be pushed.  
 Here, aliases b and d are eligible to be pushed up while the current criteria 
 provide that none are eligible.
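The left-associative reading above can be made concrete by folding the join chain left to right and tracking which aliases remain safely pushable after each join, following the per-join rules quoted from getQualifiedAliases. A Python sketch (illustrative, not the patch itself):

```python
def qualified_aliases(first, joins):
    """Fold a left-associative join chain, returning the aliases whose
    predicates are safe to push below the join.

    `joins` is a list of (join_type, right_alias) pairs, e.g.
    [("right_outer", "b"), ("left_outer", "c"), ("inner", "d")].
    """
    pushable = {first}
    for join_type, right in joins:
        if join_type == "inner":
            pushable.add(right)       # all aliases stay pushable
        elif join_type == "left_outer":
            pass                      # left side keeps its aliases; right is unsafe
        elif join_type == "right_outer":
            pushable = {right}        # only the right side survives unfiltered
        elif join_type == "full_outer":
            pushable = set()          # nothing may be pushed
        else:
            raise ValueError(join_type)
    return pushable

# ((a RIGHT OUTER JOIN b) LEFT OUTER JOIN c) INNER JOIN d
# -> {"b", "d"}, matching the example in the description.
```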
 Using:
 {noformat}
 create table t1 (id int, key string, value string);
 create table t2 (id int, key string, value string);
 create table t3 (id int, key string, value string);
 create table t4 (id int, key string, value string);
 {noformat}
 For example, the query
 {noformat}
 explain select * from t1 full outer join t2 on t1.id=t2.id join t3 on 
 t2.id=t3.id where t3.id=20; 
 {noformat}
 currently gives
 {noformat}
 STAGE DEPENDENCIES:
   Stage-1 is a root stage
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-1
 Map Reduce
    Alias -> Map Operator Tree:
 t1 
   TableScan
 alias: t1
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 0
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t2 
   TableScan
 alias: t2
 Reduce Output Operator
   key expressions:
 expr: id
 type: int
   sort order: +
   Map-reduce partition columns:
 expr: id
 type: int
   tag: 1
   value expressions:
 expr: id
 type: int
 expr: key
 type: string
 expr: value
 type: string
 t3 
   TableScan
 alias: t3
 Reduce Output Operator
   key expressions:
   

[jira] [Commented] (HIVE-2352) create empty files if and only if table is bucketed and hive.enforce.bucketing=true

2011-08-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081809#comment-13081809
 ] 

Ning Zhang commented on HIVE-2352:
--

Franklin, there are 2 more diffs after the last change: input26.q and 
bucketmapjoin2.q. It seems the changes are more involved than we expected; 
can you run the whole unit test suite before submitting the next fix? 

 create empty files if and only if table is bucketed and 
 hive.enforce.bucketing=true
 ---

 Key: HIVE-2352
 URL: https://issues.apache.org/jira/browse/HIVE-2352
 Project: Hive
  Issue Type: Bug
Reporter: Franklin Hu
Assignee: Franklin Hu
Priority: Minor
 Fix For: 0.8.0

 Attachments: hive-2352.1.patch, hive-2352.2.patch, hive-2352.3.patch


 create table t1 (key int, value string) stored as rcfile;
 insert overwrite table t1 select * from src where false;
 Creates an empty RCFile with no rows and size 151B. The file should not be 
 created since there are no rows.
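The intended rule is a simple conjunction: an empty output file is only worth creating when the table is bucketed and bucketing is enforced, since each bucket must then exist as a file even with no rows. A sketch of that decision with hypothetical names:

```python
def should_create_empty_file(num_rows, is_bucketed, enforce_bucketing):
    """Decide whether an output file with no rows should still be created.

    Empty files are only needed when bucketing is enforced: every bucket
    must exist as a file even if it received no rows.
    """
    if num_rows > 0:
        return True  # non-empty output is always written
    return is_bucketed and enforce_bucketing
```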





[jira] [Updated] (HIVE-1987) HWI admin_list_jobs JSP page throws exception

2011-08-09 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1987:
--

Attachment: hive-1987.patch.txt.1

 HWI admin_list_jobs JSP page throws exception
 -

 Key: HIVE-1987
 URL: https://issues.apache.org/jira/browse/HIVE-1987
 Project: Hive
  Issue Type: Bug
  Components: Web UI
Affects Versions: 0.7.0, 0.7.1
Reporter: Carl Steinbach
Assignee: Edward Capriolo
 Attachments: hive-1987.patch.txt.1


 It looks like the admin_list_jobs.jsp page is trying to reference 
 ExecDriver.runningJobKillURIs, which is now private to ExecDriver:
 {code}
 RequestURI=/hwi/admin_list_jobs.jsp
 Caused by:
 org.apache.jasper.JasperException: Unable to compile class for JSP
 An error occurred at line: 24 in the jsp file: /admin_list_jobs.jsp
 Generated servlet error:
 The field ExecDriver.runningJobKillURIs is not visible
 An error occurred at line: 27 in the jsp file: /admin_list_jobs.jsp
 Generated servlet error:
 The field ExecDriver.runningJobKillURIs is not visible
   at 
 org.apache.jasper.compiler.DefaultErrorHandler.javacError(DefaultErrorHandler.java:84)
   at 
 org.apache.jasper.compiler.ErrorDispatcher.javacError(ErrorDispatcher.java:328)
   at 
 org.apache.jasper.compiler.JDTCompiler.generateClass(JDTCompiler.java:409)
   at org.apache.jasper.compiler.Compiler.compile(Compiler.java:288)
   at org.apache.jasper.compiler.Compiler.compile(Compiler.java:267)
   at org.apache.jasper.compiler.Compiler.compile(Compiler.java:255)
   at 
 org.apache.jasper.JspCompilationContext.compile(JspCompilationContext.java:563)
   at 
 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:293)
   at 
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
   at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
   at 
 org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
   at 
 org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
   at 
 org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
   at 
 org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
   at 
 org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at 
 org.mortbay.jetty.handler.RequestLogHandler.handle(RequestLogHandler.java:49)
   at 
 org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
   at org.mortbay.jetty.Server.handle(Server.java:324)
   at 
 org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
   at 
 org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
   at 
 org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
   at 
 org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
 {code}





Re: Review Request: Support archiving for multiple partitions if the table is partitioned by multiple columns

2011-08-09 Thread Ning Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1259/#review1359
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java
https://reviews.apache.org/r/1259/#comment3058

Do you want to return null for input like (hr='13'), or return a non-null 
PartSpecInfo whose fields (prefixFields and prefixValues) are all null?

It seems the function implements the second option. In that case, how do you 
distinguish the case where the partition spec is hr='13' from the case where 
there is no partition spec at all (meaning all partitions in the table)? Should 
we raise an exception for the first case (hr='13'), since it is not a correct 
usage?
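One way to resolve the ambiguity raised here is to require the supplied spec to be a strict prefix of the table's partition columns and raise on anything else, so (hr='13') is rejected rather than silently treated like an empty spec. A Python sketch under that assumption (names are illustrative):

```python
def prefix_spec(part_cols, spec):
    """Return the (columns, values) prefix for `spec`, or raise if `spec`
    is not a prefix of the table's partition columns.

    part_cols: ordered partition columns, e.g. ["ds", "hr", "min"]
    spec: ordered (column, value) pairs from the ARCHIVE statement
    """
    if not spec:
        return [], []  # no spec: the whole table
    cols = [c for c, _ in spec]
    if cols != part_cols[:len(cols)]:
        raise ValueError("partition spec %s is not a prefix of %s"
                         % (cols, part_cols))
    return cols, [v for _, v in spec]

# prefix_spec(["ds", "hr", "min"], [("hr", "13")]) raises: hr alone is not
# a valid prefix, so the ambiguous case is rejected explicitly.
```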



trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java
https://reviews.apache.org/r/1259/#comment3056

our coding convention is like:

if () {
} else {
}



trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
https://reviews.apache.org/r/1259/#comment3059

Here the argument Map doesn't guarantee the order of the key-value pairs. 
You'll need to use a LinkedHashMap for that purpose. 



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
https://reviews.apache.org/r/1259/#comment3060

indentation


- Ning


On 2011-08-09 01:28:13, Marcin Kurczych wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1259/
 ---
 
 (Updated 2011-08-09 01:28:13)
 
 
 Review request for hive, Paul Yang and namit jain.
 
 
 Summary
 ---
 
 Allow archiving at a chosen level. When a table is partitioned by ds, hr, min, 
 it allows archiving at the ds level, hr level, and min level. The corresponding 
 syntaxes are:
 ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08');
 ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08', hr='11');
 ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08', hr='11', min='30');
 
 You cannot do much to archived partitions. You can read them. You cannot 
 write to them / overwrite them. You can drop single archived partitions, but 
 not parts of bigger archives.
 
 
 Diffs
 -
 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1153271 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
  1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/DummyPartition.java 
 1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1153271 
   trunk/metastore/src/gen/thrift/gen-py/hive_metastore/constants.py 1153271 
   trunk/metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb 1153271 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
 1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java 
 PRE-CREATION 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1153271 
   trunk/data/conf/hive-site.xml 1153271 
   trunk/metastore/if/hive_metastore.thrift 1153271 
   trunk/metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h 1153271 
   trunk/metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp 1153271 
   
 trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Constants.java
  1153271 
   
 trunk/metastore/src/gen/thrift/gen-php/hive_metastore/hive_metastore_constants.php
  1153271 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
 1153271 
   trunk/ql/src/test/queries/clientnegative/archive_insert1.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_insert2.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_insert3.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_insert4.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi1.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi2.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi3.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi4.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi5.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi6.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_multi7.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_partspec1.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_partspec2.q PRE-CREATION 
   trunk/ql/src/test/queries/clientnegative/archive_partspec3.q 

Re: Review Request: Support archiving for multiple partitions if the table is partitioned by multiple columns

2011-08-09 Thread Marcin Kurczych


 On 2011-08-03 22:39:04, Paul Yang wrote:
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, lines 
  1172-1173
  https://reviews.apache.org/r/1259/diff/1/?file=30272#file30272line1172
 
  Should be info or debug, not error

ok


 On 2011-08-03 22:39:04, Paul Yang wrote:
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java, line 1105
  https://reviews.apache.org/r/1259/diff/1/?file=30272#file30272line1105
 
  One possible issue is that if the user changes the value of this 
  through the CLI (i.e. with a set xxx=yyy;), it wouldn't take effect. It 
  should be read in the constructor or in the methods.

ok


- Marcin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1259/#review1278
---


   trunk/ql/src/test/queries/clientnegative/archive_partspec3.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/archive_corrupt.q PRE-CREATION 
   trunk/ql/src/test/queries/clientpositive/archive_multi.q PRE-CREATION 
   trunk/ql/src/test/results/clientnegative/archive1.q.out 1153271 
   trunk/ql/src/test/results/clientnegative/archive2.q.out 1153271 
   trunk/ql/src/test/results/clientnegative/archive_insert1.q.out PRE-CREATION 
   trunk/ql/src/test/results/clientnegative/archive_insert2.q.out PRE-CREATION 
   

[jira] [Resolved] (HIVE-2347) Make Hadoop Job ID available after task finishes executing

2011-08-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang resolved HIVE-2347.
--

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]

Committed. Thanks Kevin!

 Make Hadoop Job ID available after task finishes executing
 --

 Key: HIVE-2347
 URL: https://issues.apache.org/jira/browse/HIVE-2347
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.8.0

 Attachments: HIVE-2347.1.patch.txt


 After MapReduce tasks finish the execute method (ExecDriver and 
 BlockMergeTask), the Hadoop Job ID is inaccessible to the Driver, and hence to 
 the hooks it runs.  Exposing this information could help improve logging, 
 debugging, etc.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2156) Improve error messages emitted during task execution

2011-08-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081847#comment-13081847
 ] 

Ning Zhang commented on HIVE-2156:
--

Syed, can you update the review board with your latest patch as well? Otherwise 
it's hard to track the additional changes.

 Improve error messages emitted during task execution
 

 Key: HIVE-2156
 URL: https://issues.apache.org/jira/browse/HIVE-2156
 Project: Hive
  Issue Type: Improvement
Reporter: Syed S. Albiz
Assignee: Syed S. Albiz
 Attachments: HIVE-2156.1.patch, HIVE-2156.10.patch, 
 HIVE-2156.11.patch, HIVE-2156.12.patch, HIVE-2156.2.patch, HIVE-2156.4.patch, 
 HIVE-2156.8.patch, HIVE-2156.9.patch


 Follow-up to HIVE-1731
 A number of issues were related to reporting errors from task execution and 
 surfacing these in a more useful form.
 Currently, a cryptic message is emitted containing only Execution Error, a 
 return code, and the class name of the task.
 The most useful log messages here are emitted to the local logs, which can be 
 found through jobtracker. Having either a pointer to these logs as part of 
 the error message or the actual content would improve the usefulness 
 substantially. It may also warrant looking into how the underlying error 
 reporting through Hadoop is done and if more information can be propagated up 
 from there.
 Specific issues raised in  HIVE-1731:
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 * issue was in regexp_extract syntax
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 * tried: desc table_does_not_exist;





Review Request: Ensure HiveConf includes all properties defined in hive-default.xml

2011-08-09 Thread chinnarao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1423/
---

Review request for hive, Carl Steinbach and John Sichi.


Summary
---

The configurations below are not used in the code base, so they have been 
removed from hive-default.xml:

hive.mapjoin.hashtable.initialCapacity 
hive.mapjoin.hashtable.loadfactor 
hive.mapjoin.smalltable.filesize 
hive.optimize.pruner 
hive.stats.jdbc.atomic 
hive.concurrency.manager 

The configurations below are not used in the code base but are required by the 
system, so they must stay in hive-default.xml and need not be added to 
HiveConf.java:

javax.jdo.option.ConnectionDriverName 
javax.jdo.PersistenceManagerFactoryClass 
javax.jdo.option.DetachAllOnCommit 
javax.jdo.option.NonTransactionalRead 
javax.jdo.option.ConnectionUserName 
fs.har.impl 


This addresses bug HIVE-2196.
https://issues.apache.org/jira/browse/HIVE-2196


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155275 
  trunk/conf/hive-default.xml 1155275 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1155275 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionRemote.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestRemoteHiveMetaStore.java
 1155275 
  
trunk/shims/src/common/java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerChain.java
 1155275 
  
trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 
1155275 

Diff: https://reviews.apache.org/r/1423/diff


Testing
---

All unit tests passed


Thanks,

chinna



[jira] [Commented] (HIVE-2196) Ensure HiveConf includes all properties defined in hive-default.xml

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081850#comment-13081850
 ] 

jirapos...@reviews.apache.org commented on HIVE-2196:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1423/
---

Review request for hive, Carl Steinbach and John Sichi.


Summary
---

The configurations below are not used in the code base, so they have been 
removed from hive-default.xml:

hive.mapjoin.hashtable.initialCapacity 
hive.mapjoin.hashtable.loadfactor 
hive.mapjoin.smalltable.filesize 
hive.optimize.pruner 
hive.stats.jdbc.atomic 
hive.concurrency.manager 

The configurations below are not used in the code base but are required by the 
system, so they must stay in hive-default.xml and need not be added to 
HiveConf.java:

javax.jdo.option.ConnectionDriverName 
javax.jdo.PersistenceManagerFactoryClass 
javax.jdo.option.DetachAllOnCommit 
javax.jdo.option.NonTransactionalRead 
javax.jdo.option.ConnectionUserName 
fs.har.impl 


This addresses bug HIVE-2196.
https://issues.apache.org/jira/browse/HIVE-2196


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155275 
  trunk/conf/hive-default.xml 1155275 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1155275 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMarkPartitionRemote.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreAuthorization.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
 1155275 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestRemoteHiveMetaStore.java
 1155275 
  
trunk/shims/src/common/java/org/apache/hadoop/hive/io/HiveIOExceptionHandlerChain.java
 1155275 
  
trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 
1155275 

Diff: https://reviews.apache.org/r/1423/diff


Testing
---

All unit tests passed


Thanks,

chinna



 Ensure HiveConf includes all properties defined in hive-default.xml
 ---

 Key: HIVE-2196
 URL: https://issues.apache.org/jira/browse/HIVE-2196
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2196.1.patch, HIVE-2196.patch


 There are a bunch of properties that are defined in hive-default.xml but not 
 in HiveConf.





[jira] [Commented] (HIVE-2196) Ensure HiveConf includes all properties defined in hive-default.xml

2011-08-09 Thread Chinna Rao Lalam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081851#comment-13081851
 ] 

Chinna Rao Lalam commented on HIVE-2196:


Added review board request.

https://reviews.apache.org/r/1423/


 Ensure HiveConf includes all properties defined in hive-default.xml
 ---

 Key: HIVE-2196
 URL: https://issues.apache.org/jira/browse/HIVE-2196
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2196.1.patch, HIVE-2196.patch


 There are a bunch of properties that are defined in hive-default.xml but not 
 in HiveConf.





Re: Review Request: HIVE-2156: Improve Execution Error Messages

2011-08-09 Thread Syed Albiz

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/777/
---

(Updated 2011-08-09 19:33:53.896288)


Review request for hive and John Sichi.


Changes
---

update patch to avoid TestExecDriver errors and include miniMR mode testcase


Summary
---

- Add local error messages to point to job logs and provide TaskIDs
- Add a timeout to the fetching of task logs and errors
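The timeout in the second bullet can be sketched with a plain ExecutorService. 
This is an illustration only, not the patch's actual implementation; the class 
and method names here are invented:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TimedLogFetch {
    // Run a (placeholder) log fetch, giving up once the timeout expires so a
    // slow or hung jobtracker cannot block error reporting indefinitely.
    static String fetchWithTimeout(Callable<String> fetch, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return executor.submit(fetch).get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (Exception e) {
            // TimeoutException, ExecutionException, and InterruptedException
            // all degrade to a placeholder message instead of hanging.
            return "(could not fetch task logs within " + timeoutMs + " ms)";
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchWithTimeout(() -> "task log contents", 1000));
    }
}
```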


This addresses bug HIVE-2156.
https://issues.apache.org/jira/browse/HIVE-2156


Diffs (updated)
-

  build-common.xml 4856c5f 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f 
  conf/hive-default.xml 3a4f833 
  contrib/src/test/results/clientnegative/case_with_row_sequence.q.out 4447c65 
  ql/build.xml 449b47a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1c6f092 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java e687b1a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 3d5e95d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cec0d46 
  ql/src/test/queries/clientnegative/minimr_broken_pipe.q PRE-CREATION 
  ql/src/test/results/clientnegative/dyn_part3.q.out 5f4df65 
  ql/src/test/results/clientnegative/index_compact_entry_limit.q.out 06437c7 
  ql/src/test/results/clientnegative/index_compact_size_limit.q.out 54f1262 
  ql/src/test/results/clientnegative/minimr_broken_pipe.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/script_broken_pipe1.q.out d33d2cc 
  ql/src/test/results/clientnegative/script_broken_pipe2.q.out afbaa44 
  ql/src/test/results/clientnegative/script_broken_pipe3.q.out fe8f757 
  ql/src/test/results/clientnegative/script_error.q.out c72d780 
  ql/src/test/results/clientnegative/udf_reflect_neg.q.out f2082a3 
  ql/src/test/results/clientnegative/udf_test_error.q.out 5fd9a00 
  ql/src/test/results/clientnegative/udf_test_error_reduce.q.out ddc5e5b 
  ql/src/test/results/clientpositive/auto_join25.q.out 362b2fa 
  ql/src/test/results/clientpositive/mapjoin_hook.q.out acdeefa 
  ql/src/test/templates/TestNegativeCliDriver.vm ec13f79 

Diff: https://reviews.apache.org/r/777/diff


Testing
---

Tested TestNegativeCliDriver in both local and miniMR mode


Thanks,

Syed



[jira] [Commented] (HIVE-2156) Improve error messages emitted during task execution

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081865#comment-13081865
 ] 

jirapos...@reviews.apache.org commented on HIVE-2156:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/777/
---

(Updated 2011-08-09 19:33:53.896288)


Review request for hive and John Sichi.


Changes
---

update patch to avoid TestExecDriver errors and include miniMR mode testcase


Summary
---

- Add local error messages to point to job logs and provide TaskIDs
- Add a timeout to the fetching of task logs and errors


This addresses bug HIVE-2156.
https://issues.apache.org/jira/browse/HIVE-2156


Diffs (updated)
-

  build-common.xml 4856c5f 
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b46976f 
  conf/hive-default.xml 3a4f833 
  contrib/src/test/results/clientnegative/case_with_row_sequence.q.out 4447c65 
  ql/build.xml 449b47a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1c6f092 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapRedTask.java e687b1a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 3d5e95d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java cec0d46 
  ql/src/test/queries/clientnegative/minimr_broken_pipe.q PRE-CREATION 
  ql/src/test/results/clientnegative/dyn_part3.q.out 5f4df65 
  ql/src/test/results/clientnegative/index_compact_entry_limit.q.out 06437c7 
  ql/src/test/results/clientnegative/index_compact_size_limit.q.out 54f1262 
  ql/src/test/results/clientnegative/minimr_broken_pipe.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/script_broken_pipe1.q.out d33d2cc 
  ql/src/test/results/clientnegative/script_broken_pipe2.q.out afbaa44 
  ql/src/test/results/clientnegative/script_broken_pipe3.q.out fe8f757 
  ql/src/test/results/clientnegative/script_error.q.out c72d780 
  ql/src/test/results/clientnegative/udf_reflect_neg.q.out f2082a3 
  ql/src/test/results/clientnegative/udf_test_error.q.out 5fd9a00 
  ql/src/test/results/clientnegative/udf_test_error_reduce.q.out ddc5e5b 
  ql/src/test/results/clientpositive/auto_join25.q.out 362b2fa 
  ql/src/test/results/clientpositive/mapjoin_hook.q.out acdeefa 
  ql/src/test/templates/TestNegativeCliDriver.vm ec13f79 

Diff: https://reviews.apache.org/r/777/diff


Testing
---

Tested TestNegativeCliDriver in both local and miniMR mode


Thanks,

Syed



 Improve error messages emitted during task execution
 

 Key: HIVE-2156
 URL: https://issues.apache.org/jira/browse/HIVE-2156
 Project: Hive
  Issue Type: Improvement
Reporter: Syed S. Albiz
Assignee: Syed S. Albiz
 Attachments: HIVE-2156.1.patch, HIVE-2156.10.patch, 
 HIVE-2156.11.patch, HIVE-2156.12.patch, HIVE-2156.2.patch, HIVE-2156.4.patch, 
 HIVE-2156.8.patch, HIVE-2156.9.patch


 Follow-up to HIVE-1731
 A number of issues were related to reporting errors from task execution and 
 surfacing these in a more useful form.
 Currently, a cryptic message is emitted containing only Execution Error, a 
 return code, and the class name of the task.
 The most useful log messages here are emitted to the local logs, which can be 
 found through jobtracker. Having either a pointer to these logs as part of 
 the error message or the actual content would improve the usefulness 
 substantially. It may also warrant looking into how the underlying error 
 reporting through Hadoop is done and if more information can be propagated up 
 from there.
 Specific issues raised in  HIVE-1731:
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 * issue was in regexp_extract syntax
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 * tried: desc table_does_not_exist;





Jenkins build is back to normal : Hive-trunk-h0.21 #883

2011-08-09 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/883/




[jira] [Updated] (HIVE-2233) Show current database in hive prompt

2011-08-09 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2233:
--

Status: Patch Available  (was: Open)

 Show current database in hive prompt
 

 Key: HIVE-2233
 URL: https://issues.apache.org/jira/browse/HIVE-2233
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.8.0

 Attachments: HIVE-2233.patch


 Currently the hive prompt doesn't show which database the user is in.  It would 
 be nice if it were something along the lines of {noformat}hive 
 (prod_tracking){noformat} or such.





[jira] [Updated] (HIVE-2233) Show current database in hive prompt

2011-08-09 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated HIVE-2233:
--

Attachment: HIVE-2233.patch

Patch for review.
* Add new option {{hive.cli.print.current.db}} that, if enabled, will show the 
currently selected database in the prompt like so:
{noformat}Hive history file=/tmp/jhoman/hive_job_log_jhoman_201108091406_1596091783.txt

hive (default)> show
              > tables;
OK
Time taken: 3.07 seconds
hive (default)> use abcdefghijklmnopqrstuvwxyz;
OK
Time taken: 0.013 seconds
hive (abcdefghijklmnopqrstuvwxyz)> show
                                 > tables;
OK
Time taken: 0.095 seconds
hive (abcdefghijklmnopqrstuvwxyz)> set hive.cli.print.current.db=false;
hive>{noformat}

The current database is obtained by creating a Hive instance within the 
CliSessionState.  This appears to be a reasonable solution, though an 
alternative would be to add a get_current_db call to the Thrift interface and 
go through the already existing HiveClient in the CliSessionState.  
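A minimal sketch of how such a prompt could be assembled. The real patch works 
inside CliSessionState; the method and its signature here are hypothetical, 
mirroring only the behavior shown in the transcript above:

```java
public class PromptSketch {
    // Build the CLI prompt, optionally including the selected database,
    // as controlled by the new hive.cli.print.current.db option.
    static String prompt(String currentDb, boolean printCurrentDb) {
        return printCurrentDb ? "hive (" + currentDb + ")> " : "hive> ";
    }

    public static void main(String[] args) {
        System.out.println(prompt("default", true));
        System.out.println(prompt("default", false));
    }
}
```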

Since this only affects interactive clients, TestCliDriver tests aren't 
applicable.  It's been tested manually (as above), and I'd be happy to write 
some unit tests, but I've already got two patches that add Mockito waiting to 
be committed (HIVE-2334 and HIVE-2171), so there's not much point in adding it 
again.   

 Show current database in hive prompt
 

 Key: HIVE-2233
 URL: https://issues.apache.org/jira/browse/HIVE-2233
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Jakob Homan
Assignee: Jakob Homan
 Fix For: 0.8.0

 Attachments: HIVE-2233.patch


 Currently the hive prompt doesn't show which database the user is in.  It would 
 be nice if it were something along the lines of {noformat}hive 
 (prod_tracking){noformat} or such.





Re: Review Request: HIVE-2346: Allow hooks to be run when a job fails.

2011-08-09 Thread Ning Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/#review1364
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
https://reviews.apache.org/r/1295/#comment3082

As a convention, we should declare the variable as the interface type (List in 
this case) rather than the implementation type (ArrayList). 
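The convention being requested, sketched in isolation (the hook-name list and 
class names here are invented for illustration, not taken from the patch):

```java
import java.util.ArrayList;
import java.util.List;

public class HookListConvention {
    // Declared as the interface type (List), constructed as the
    // implementation (ArrayList): callers depend only on List, so the
    // backing implementation can change without touching call sites.
    static List<String> failureHookNames() {
        List<String> hooks = new ArrayList<String>();
        hooks.add("org.example.LoggingFailureHook"); // hypothetical hook class
        return hooks;
    }

    public static void main(String[] args) {
        System.out.println(failureHookNames().size());
    }
}
```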



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
https://reviews.apache.org/r/1295/#comment3086

comments are not correct: "post execution" should read "failure"


- Ning


On 2011-08-04 19:06:43, Kevin Wilfong wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1295/
 ---
 
 (Updated 2011-08-04 19:06:43)
 
 
 Review request for hive and Ning Zhang.
 
 
 Summary
 ---
 
 I added a new type of hook, which will be run when a job fails.
 
 
 This addresses bug HIVE-2346.
 https://issues.apache.org/jira/browse/HIVE-2346
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1153966 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1153966 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 1153966 
 
 Diff: https://reviews.apache.org/r/1295/diff
 
 
 Testing
 ---
 
 I ran the TestCliDriver and TestNegativeCliDriver test suites and verified 
 they passed.
 
 In addition, I created a sample hook, which simply logged that it was run.  I 
 verified it was run on a failure, but not when a job succeeded.
 
 
 Thanks,
 
 Kevin
 




[jira] [Commented] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081944#comment-13081944
 ] 

jirapos...@reviews.apache.org commented on HIVE-2346:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/#review1364
---



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
https://reviews.apache.org/r/1295/#comment3082

As a convention, we should declare the variable as the interface type (List in 
this case) rather than the implementation type (ArrayList). 



trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
https://reviews.apache.org/r/1295/#comment3086

comments are not correct: "post execution" should read "failure"


- Ning


On 2011-08-04 19:06:43, Kevin Wilfong wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1295/
bq.  ---
bq.  
bq.  (Updated 2011-08-04 19:06:43)
bq.  
bq.  
bq.  Review request for hive and Ning Zhang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  I added a new type of hook, which will be run when a job fails.
bq.  
bq.  
bq.  This addresses bug HIVE-2346.
bq.  https://issues.apache.org/jira/browse/HIVE-2346
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1153966 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1153966 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 
1153966 
bq.  
bq.  Diff: https://reviews.apache.org/r/1295/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  I ran the TestCliDriver and TestNegativeCliDriver test suites and verified 
they passed.
bq.  
bq.  In addition, I created a sample hook, which simply logged that it was run. 
 I verified it was run on a failure, but not when a job succeeded.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.



 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.
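The mechanism can be sketched with a locally defined stand-in interface. The 
real patch uses Hive's own hook types and HookContext, which are deliberately 
not reproduced here; every name below is illustrative:

```java
import java.util.ArrayList;
import java.util.List;

public class FailureHookSketch {
    // Stand-in for a failure-hook interface; not Hive's actual API.
    interface OnFailureHook { void run(String queryId); }

    static final List<String> log = new ArrayList<String>();

    // What the Driver would do on a failed query: invoke each configured hook.
    static void runFailureHooks(List<OnFailureHook> hooks, String queryId) {
        for (OnFailureHook hook : hooks) {
            hook.run(queryId);
        }
    }

    public static void main(String[] args) {
        List<OnFailureHook> hooks = new ArrayList<OnFailureHook>();
        hooks.add(qid -> log.add("query failed: " + qid)); // sample logging hook
        runFailureHooks(hooks, "job_201108091406_0042");    // made-up job id
        System.out.println(log.get(0));
    }
}
```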





Re: Review Request: Support archiving for multiple partitions if the table is partitioned by multiple columns

2011-08-09 Thread Marcin Kurczych

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1259/
---

(Updated 2011-08-09 21:46:08.628498)


Review request for hive, Paul Yang and namit jain.


Changes
---

Changes requested by Ning.


Summary
---

Allowing archiving at a chosen level. When a table is partitioned by ds, hr, min, 
it allows archiving at the ds level, the hr level, and the min level. The 
corresponding syntaxes are:
ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08');
ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08', hr='11');
ALTER TABLE test ARCHIVE PARTITION (ds='2008-04-08', hr='11', min='30');

You cannot do much to archived partitions. You can read them. You cannot write 
to them / overwrite them. You can drop single archived partitions, but not 
parts of bigger archives.
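One way to read the level rule: an archive spec must name a prefix of the 
table's partition columns. A hedged sketch of that check, assuming prefix 
semantics; this is illustrative, not the validation code from the patch:

```java
import java.util.Arrays;
import java.util.List;

public class ArchiveSpecCheck {
    // True iff the spec names a non-empty prefix of the table's partition
    // columns, e.g. (ds) or (ds, hr) for a table partitioned by (ds, hr, min).
    static boolean isValidArchiveSpec(List<String> partCols, List<String> spec) {
        if (spec.isEmpty() || spec.size() > partCols.size()) {
            return false;
        }
        for (int i = 0; i < spec.size(); i++) {
            if (!partCols.get(i).equalsIgnoreCase(spec.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("ds", "hr", "min");
        System.out.println(isValidArchiveSpec(cols, Arrays.asList("ds", "hr")));
        System.out.println(isValidArchiveSpec(cols, Arrays.asList("hr")));
    }
}
```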


Diffs (updated)
-

  trunk/ql/src/test/results/clientnegative/archive_multi2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_insert1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive2.q.out 1153271 
  trunk/ql/src/test/results/clientnegative/archive1.q.out 1153271 
  trunk/ql/src/test/queries/clientpositive/archive_multi.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/archive_corrupt.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_partspec3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_partspec1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_partspec2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi6.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi7.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_insert3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_insert4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi4.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_multi5.q PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1153271 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/lockmgr/zookeeper/ZooKeeperHiveLockManager.java
 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/DummyPartition.java 
1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1153271 
  trunk/ql/src/test/queries/clientnegative/archive_insert1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/archive_insert2.q PRE-CREATION 
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1153271 
  trunk/data/conf/hive-site.xml 1153271 
  trunk/metastore/if/hive_metastore.thrift 1153271 
  trunk/metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.h 1153271 
  trunk/metastore/src/gen/thrift/gen-cpp/hive_metastore_constants.cpp 1153271 
  
trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/Constants.java
 1153271 
  
trunk/metastore/src/gen/thrift/gen-php/hive_metastore/hive_metastore_constants.php
 1153271 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/constants.py 1153271 
  trunk/metastore/src/gen/thrift/gen-rb/hive_metastore_constants.rb 1153271 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java 
1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1153271 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ArchiveUtils.java 
PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1153271 
  trunk/ql/src/test/results/clientnegative/archive_multi3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi4.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi5.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi6.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_multi7.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_partspec1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_partspec2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/archive_partspec3.q.out PRE-CREATION 
  

[jira] [Commented] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db

2011-08-09 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081956#comment-13081956
 ] 

Paul Yang commented on HIVE-2246:
-

+1 - tests passed. Will commit.

 Dedupe tables' column schemas from partitions in the metastore db
 -

 Key: HIVE-2246
 URL: https://issues.apache.org/jira/browse/HIVE-2246
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch, 
 HIVE-2246.8.patch


 Note: this patch proposes a schema change, and is therefore incompatible with 
 the current metastore.
 We can re-organize the JDO models to reduce space usage to keep the metastore 
 scalable for the future.  Currently, partitions are the fastest growing 
 objects in the metastore, and the metastore keeps a separate copy of the 
 columns list for each partition.  We can normalize the metastore db by 
 decoupling Columns from Storage Descriptors and not storing duplicate lists 
 of the columns for each partition. 
 An idea is to create an additional level of indirection with a Column 
 Descriptor that has a list of columns.  A table has a reference to its 
 latest Column Descriptor (note: a table may have more than one Column 
 Descriptor in the case of schema evolution).  Partitions and Indexes can 
 reference the same Column Descriptors as their parent table.
  Currently, the COLUMNS table in the metastore has roughly (number of 
  partitions + number of tables) * (average number of columns per table) rows.  
 We can reduce this to (number of tables) * (average number of columns per 
 table) rows, while incurring a small cost proportional to the number of 
 tables to store the Column Descriptors.
 Please see the latest review board for additional implementation details.
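The row-count claim above can be checked with quick arithmetic. The numbers below are purely illustrative (not from a real metastore); they just show why partitions dominate the COLUMNS table before the dedupe:

```python
# Illustrative sizing of the metastore COLUMNS table before and after
# deduping column schemas (HIVE-2246). All numbers are made up.
num_tables = 1_000
num_partitions = 500_000          # partitions are the fastest-growing objects
avg_columns_per_table = 20

# Before: every table AND every partition stores its own column list.
rows_before = (num_tables + num_partitions) * avg_columns_per_table

# After: one column list per Column Descriptor; in the common case
# (no schema evolution) that is one per table, plus a small per-table
# cost for the descriptor row itself.
rows_after = num_tables * avg_columns_per_table
descriptor_rows = num_tables      # cost proportional to the number of tables

print(rows_before)   # 10020000
print(rows_after)    # 20000
```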

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-2346: Allow hooks to be run when a job fails.

2011-08-09 Thread Kevin Wilfong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/
---

(Updated 2011-08-09 22:01:28.619424)


Review request for hive and Ning Zhang.


Changes
---

Made the changes requested by nzhang.

Regarding the change ArrayList -> List, I had copied that code from the 
analogous methods for Post and Pre Exec hooks.  So I made the corresponding 
changes in those methods too.


Summary
---

I added a new type of hook, which will be run when a job fails.


This addresses bug HIVE-2346.
https://issues.apache.org/jira/browse/HIVE-2346


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155569 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155569 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 1155569 

Diff: https://reviews.apache.org/r/1295/diff


Testing
---

I ran the TestCliDriver and TestNegativeCliDriver test suites and verified they 
passed.

In addition, I created a sample hook, which simply logged that it was run.  I 
verified it was run on a failure, but not when a job succeeded.


Thanks,

Kevin



[jira] [Commented] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081960#comment-13081960
 ] 

jirapos...@reviews.apache.org commented on HIVE-2346:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/
---

(Updated 2011-08-09 22:01:28.619424)


Review request for hive and Ning Zhang.


Changes
---

Made the changes requested by nzhang.

Regarding the change ArrayList -> List, I had copied that code from the 
analogous methods for Post and Pre Exec hooks.  So I made the corresponding 
changes in those methods too.


Summary
---

I added a new type of hook, which will be run when a job fails.


This addresses bug HIVE-2346.
https://issues.apache.org/jira/browse/HIVE-2346


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155569 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155569 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 1155569 

Diff: https://reviews.apache.org/r/1295/diff


Testing
---

I ran the TestCliDriver and TestNegativeCliDriver test suites and verified they 
passed.

In addition, I created a sample hook, which simply logged that it was run.  I 
verified it was run on a failure, but not when a job succeeded.


Thanks,

Kevin



 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.
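A minimal sketch of the idea: the driver runs a list of callbacks only when execution fails, mirroring how pre/post execution hooks already work. The class and function names here are hypothetical illustrations, not Hive's actual hook API:

```python
# Hypothetical sketch of on-failure hooks (HIVE-2346). Names are
# illustrative only; Hive's real hook interface and wiring differ.
class LoggingFailureHook:
    """A sample hook that simply records that it was run."""
    def __init__(self):
        self.messages = []

    def run(self, query, error):
        self.messages.append(f"query failed: {query!r}: {error}")

def execute(query, failure_hooks):
    """Toy driver: run failure hooks only when the query fails."""
    try:
        if "bad" in query:
            raise RuntimeError("simulated execution error")
        return "ok"
    except Exception as e:
        for hook in failure_hooks:   # hooks run on failure, not on success
            hook.run(query, e)
        return "failed"

hook = LoggingFailureHook()
assert execute("select 1", [hook]) == "ok"      # hook not invoked
assert execute("select bad", [hook]) == "failed"  # hook invoked once
print(hook.messages[0])
```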





[jira] [Assigned] (HIVE-2362) HiveConf properties not appearing in the output of 'set' or 'set -v'

2011-08-09 Thread Jakob Homan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan reassigned HIVE-2362:
-

Assignee: Jakob Homan

 HiveConf properties not appearing in the output of 'set' or 'set -v'
 

 Key: HIVE-2362
 URL: https://issues.apache.org/jira/browse/HIVE-2362
 Project: Hive
  Issue Type: Bug
  Components: CLI, Configuration
Reporter: Carl Steinbach
Assignee: Jakob Homan
Priority: Blocker
 Fix For: 0.8.0








[jira] [Resolved] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db

2011-08-09 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang resolved HIVE-2246.
-

   Resolution: Fixed
Fix Version/s: 0.8.0
 Release Note: This makes an incompatible change in the metastore DB table 
schema from previous versions (< 0.8). Older metastores created with previous 
versions of Hive will need to be upgraded with the supplied scripts.

Committed. Thanks Sohan!

 Dedupe tables' column schemas from partitions in the metastore db
 -

 Key: HIVE-2246
 URL: https://issues.apache.org/jira/browse/HIVE-2246
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Fix For: 0.8.0

 Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch, HIVE-2246.4.patch, 
 HIVE-2246.8.patch


 Note: this patch proposes a schema change, and is therefore incompatible with 
 the current metastore.
 We can re-organize the JDO models to reduce space usage to keep the metastore 
 scalable for the future.  Currently, partitions are the fastest growing 
 objects in the metastore, and the metastore keeps a separate copy of the 
 columns list for each partition.  We can normalize the metastore db by 
 decoupling Columns from Storage Descriptors and not storing duplicate lists 
 of the columns for each partition. 
 An idea is to create an additional level of indirection with a Column 
 Descriptor that has a list of columns.  A table has a reference to its 
 latest Column Descriptor (note: a table may have more than one Column 
 Descriptor in the case of schema evolution).  Partitions and Indexes can 
 reference the same Column Descriptors as their parent table.
 Currently, the COLUMNS table in the metastore has roughly (number of 
 partitions + number of tables) * (average number of columns per table) rows.  
 We can reduce this to (number of tables) * (average number of columns per 
 table) rows, while incurring a small cost proportional to the number of 
 tables to store the Column Descriptors.
 Please see the latest review board for additional implementation details.





[jira] [Commented] (HIVE-1916) Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081980#comment-13081980
 ] 

jirapos...@reviews.apache.org commented on HIVE-1916:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1441/
---

Review request for hive and Ning Zhang.


Summary
---

Default behavior will be as before.
Adding new Hive conf vars to make the column names include the aggregation 
function and params.


This addresses bug HIVE-1916.
https://issues.apache.org/jira/browse/HIVE-1916


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155181 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1155181 
  trunk/ql/src/test/queries/clientpositive/autogen_colname.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/autogen_colname.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1441/diff


Testing
---

Added new query file with expected results. All unit tests pass


Thanks,

sameerm



 Change Default Alias For Aggregated Columns (_c1)
 -

 Key: HIVE-1916
 URL: https://issues.apache.org/jira/browse/HIVE-1916
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
 Environment: All
Reporter: James Mayfield
Priority: Minor

 Problem:
 When running a Hive query that aggregates (does a group by operation), Hive 
 will automatically name the aggregated column _c0, _c1, _c2, etc. This is a 
 problem because Hive will not execute a query against a column whose name 
 begins with _, forcing the user to manually add back-ticks to get the query 
 to run.
 Potential Solution:
 By default, Hive should name these columns after their query expression, like 
 sum_active30day_users, or if that is not possible, something simple like 
 column_1, so that users can query the new column without adding special 
 back-ticks.
 Example Query:
 SELECT a.ds, COUNT(a.num_accounts)
 Example Result:
 ds, count_num_accounts OR ds, column_1
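One plausible shape for the proposed naming, shown as a small sketch. This is illustrative only; the naming rule the HIVE-1916 patch actually implements may differ:

```python
import re

def auto_alias(expr, position):
    """Derive a back-tick-free alias from an aggregate expression,
    e.g. 'COUNT(a.num_accounts)' -> 'count_num_accounts'. Illustrative
    only; not the exact rule in the HIVE-1916 patch."""
    m = re.match(r'(\w+)\((?:\w+\.)?(\w+)\)', expr)
    if m:
        func, col = m.groups()
        return f"{func.lower()}_{col.lower()}"
    # Fallback to a simple positional name that does not start with '_'.
    return f"column_{position}"

print(auto_alias("COUNT(a.num_accounts)", 1))   # count_num_accounts
print(auto_alias("SUM(active30day_users)", 2))  # sum_active30day_users
print(auto_alias("1 + 2", 3))                   # column_3
```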





[jira] [Resolved] (HIVE-2268) CREATE.. TABLE.. LIKE should not inherit the original owner of the table.

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2268.
--

Resolution: Duplicate

 CREATE.. TABLE.. LIKE should not inherit the original owner of the table.
 -

 Key: HIVE-2268
 URL: https://issues.apache.org/jira/browse/HIVE-2268
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security
Affects Versions: 0.7.0, 0.7.1, 0.8.0
Reporter: Esteban Gutierrez
Assignee: Charles Chen
  Labels: create, ddl, table
 Attachments: hive-2268.1.patch


 When a new table is created using CREATE.. TABLE.. LIKE.., the new table 
 inherits the owner of the original table. This issue is potentially 
 problematic for multiuser environments where Hive authorization is planned 
 for future use.
 -- alice creates table 
 CREATE EXTERNAL TABLE foo(bar double)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
 STORED AS TEXTFILE LOCATION '/user/alice/foo';
 -- table owner is alice as expected
 hive> DESCRIBE EXTENDED foo;
 OK
 bar double  
  
 Detailed Table Information  Table(tableName:foo, dbName:default, 
 {color:red} owner:alice {color}, createTime:1309996190, lastAccessTime:0, 
 retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:bar, type:double, 
 comment:null)], location:hdfs://localhost/user/alice/foo, 
 inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
 outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
 compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
 serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
 parameters:{serialization.format=,, field.delim=,, line.delim=   
 }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], 
 parameters:{EXTERNAL=TRUE, transient_lastDdlTime=1309996190}, 
 viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)  
 -- bob calls CREATE..TABLE..LIKE
 CREATE TABLE foo_like LIKE foo;
 -- bob created a new table using LIKE but the owner is still alice
 -- but the expected owner is bob
 hive> DESCRIBE EXTENDED foo_like;
 OK
 bar double  
  
 Detailed Table Information  Table(tableName:foo_like, dbName:default, 
 {color:red} owner:alice {color}, createTime:1309996554, lastAccessTime:0, 
 retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:bar, type:double, 
 comment:null)], location:hdfs://localhost/user/hive/warehouse/foo_like, 
 inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
 outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
 compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
 serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
 parameters:{serialization.format=,, field.delim=,, line.delim=  
 }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], 
 parameters:{transient_lastDdlTime=1309996554}, viewOriginalText:null, 
 viewExpandedText:null, tableType:MANAGED_TABLE)





[jira] [Commented] (HIVE-1916) Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread Sameer M (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081981#comment-13081981
 ] 

Sameer M commented on HIVE-1916:


Added patch for review https://reviews.apache.org/r/1441/
Default behaviour will be as before.
Adding new Hive conf vars to make the column names include the aggregation 
function and params.

 Change Default Alias For Aggregated Columns (_c1)
 -

 Key: HIVE-1916
 URL: https://issues.apache.org/jira/browse/HIVE-1916
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
 Environment: All
Reporter: James Mayfield
Priority: Minor

 Problem:
 When running a Hive query that aggregates (does a group by operation), Hive 
 will automatically name the aggregated column _c0, _c1, _c2, etc. This is a 
 problem because Hive will not execute a query against a column whose name 
 begins with _, forcing the user to manually add back-ticks to get the query 
 to run.
 Potential Solution:
 By default, Hive should name these columns after their query expression, like 
 sum_active30day_users, or if that is not possible, something simple like 
 column_1, so that users can query the new column without adding special 
 back-ticks.
 Example Query:
 SELECT a.ds, COUNT(a.num_accounts)
 Example Result:
 ds, count_num_accounts OR ds, column_1





[jira] [Reopened] (HIVE-2268) CREATE.. TABLE.. LIKE should not inherit the original owner of the table.

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach reopened HIVE-2268:
--


 CREATE.. TABLE.. LIKE should not inherit the original owner of the table.
 -

 Key: HIVE-2268
 URL: https://issues.apache.org/jira/browse/HIVE-2268
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security
Affects Versions: 0.7.0, 0.7.1, 0.8.0
Reporter: Esteban Gutierrez
Assignee: Charles Chen
  Labels: create, ddl, table
 Attachments: hive-2268.1.patch


 When a new table is created using CREATE.. TABLE.. LIKE.., the new table 
 inherits the owner of the original table. This issue is potentially 
 problematic for multiuser environments where Hive authorization is planned 
 for future use.
 -- alice creates table 
 CREATE EXTERNAL TABLE foo(bar double)
 ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
 STORED AS TEXTFILE LOCATION '/user/alice/foo';
 -- table owner is alice as expected
 hive> DESCRIBE EXTENDED foo;
 OK
 bar double  
  
 Detailed Table Information  Table(tableName:foo, dbName:default, 
 {color:red} owner:alice {color}, createTime:1309996190, lastAccessTime:0, 
 retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:bar, type:double, 
 comment:null)], location:hdfs://localhost/user/alice/foo, 
 inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
 outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
 compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
 serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
 parameters:{serialization.format=,, field.delim=,, line.delim=   
 }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], 
 parameters:{EXTERNAL=TRUE, transient_lastDdlTime=1309996190}, 
 viewOriginalText:null, viewExpandedText:null, tableType:EXTERNAL_TABLE)  
 -- bob calls CREATE..TABLE..LIKE
 CREATE TABLE foo_like LIKE foo;
 -- bob created a new table using LIKE but the owner is still alice
 -- but the expected owner is bob
 hive> DESCRIBE EXTENDED foo_like;
 OK
 bar double  
  
 Detailed Table Information  Table(tableName:foo_like, dbName:default, 
 {color:red} owner:alice {color}, createTime:1309996554, lastAccessTime:0, 
 retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:bar, type:double, 
 comment:null)], location:hdfs://localhost/user/hive/warehouse/foo_like, 
 inputFormat:org.apache.hadoop.mapred.TextInputFormat, 
 outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, 
 compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
 serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe, 
 parameters:{serialization.format=,, field.delim=,, line.delim=  
 }), bucketCols:[], sortCols:[], parameters:{}), partitionKeys:[], 
 parameters:{transient_lastDdlTime=1309996554}, viewOriginalText:null, 
 viewExpandedText:null, tableType:MANAGED_TABLE)





[jira] [Updated] (HIVE-1218) CREATE TABLE t LIKE some_view should create a new empty base table, but instead creates a copy of view

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1218:
-

  Component/s: (was: Logging)
   Query Processor
Fix Version/s: 0.8.0

 CREATE TABLE t LIKE some_view should create a new empty base table, but 
 instead creates a copy of view
 --

 Key: HIVE-1218
 URL: https://issues.apache.org/jira/browse/HIVE-1218
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: Charles Chen
 Fix For: 0.8.0

 Attachments: HIVE-1218v0.patch, HIVE-1218v2.patch, HIVE-1218v3.patch, 
 HIVE-1218v4.patch, HIVE-1218v5.patch, HIVE-1218v6.patch, HIVE-1218v7.patch, 
 HIVE-1218v8.patch


 I think it should copy only the column definitions from the view metadata.  
 Currently it is copying the entire descriptor, resulting in a new view 
 instead of a new base table.





[jira] [Updated] (HIVE-2100) virtual column references inside subqueries cause execution exceptions

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2100:
-

  Component/s: Query Processor
Fix Version/s: 0.8.0

 virtual column references inside subqueries cause execution exceptions
 --

 Key: HIVE-2100
 URL: https://issues.apache.org/jira/browse/HIVE-2100
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Joydeep Sen Sarma
Assignee: Syed S. Albiz
 Fix For: 0.8.0

 Attachments: HIVE-2100.2.patch, HIVE-2100.4.patch, HIVE-2100.txt


 example:
 create table jssarma_nilzma_bad as select a.fname, a.offset, a.val from 
 (select 
 hash(eventid,userid,eventtime,browsercookie,userstate,useragent,userip,serverip,clienttime,geoid,countrycode\
 ,actionid,lastimpressionid,lastnavimpressionid,impressiontype,fullurl,fullreferrer,pagesection,modulesection,adsection)
  as val, INPUT__FILE__NAME as fname, BLOCK__OFFSET__INSIDE__FILE as offset 
 from nectar_impression_lzma_unverified where ds='2010-07-28') a join 
 jssarma_hc_diff b on (a.val=b.val);
 causes
 Caused by: java.lang.RuntimeException: Map operator initialization failed
   at 
 org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
   ... 18 more
 Caused by: java.lang.RuntimeException: cannot find field input__file__name 
 from 
 [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@664310d0,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3d04fc23,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@12457d21,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@101a0ae6,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1dc18a4c,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@d5e92d7,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3bfa681c,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@34c92507,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@19e09a4,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2e8aeed0,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2344b18f,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@72e5355f,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@26132ae7,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@3465b738,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@1dfd868,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@ef894ce,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@61f1680f,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2fe6e305,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@5f4275d4,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@445e228,
  
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@802b249]
   at 
 org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:321)
   at 
 org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:96)
   at 
 org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:878)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:904)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
   at 
 org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:73)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
   at 
 org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:133)
   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
   

[jira] [Updated] (HIVE-936) dynamic partitions creation based on values

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-936:


Issue Type: Task  (was: New Feature)

 dynamic partitions creation based on values
 ---

 Key: HIVE-936
 URL: https://issues.apache.org/jira/browse/HIVE-936
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Ning Zhang
Assignee: Ning Zhang
 Fix For: 0.7.0

 Attachments: dp_design.txt


 If a Hive table is created as partitioned, DML can insert into only one 
 partition per query. Ideally partitions should be created on the fly based 
 on the value of the partition columns. As an example:
 {{{
   create table T (a int, b string) partitioned by (ds string);
   insert overwrite table T select a, b, ds from S where ds >= '2009-11-01' 
 and ds <= '2009-11-16';
 }}}
 should be able to execute as one DML rather than possibly 16 DMLs, one for 
 each distinct ds value. CTAS and alter table should be able to do the same thing:
 {{{
   create table T partitioned by (ds string) as select * from S where ds >= 
 '2009-11-01' and ds <= '2009-11-16';
 }}}
  and
 {{{
   create table T(a int, b string, ds string);
   insert overwrite table T select * from S where ds >= '2009-11-1' and ds <= 
 '2009-11-16';
   alter table T partitioned by (ds);
 }}}
 should all return the same results.
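Conceptually, dynamic partitioning routes each output row to a partition chosen from the row's own column values. A rough simulation of that routing (not Hive's implementation):

```python
from collections import defaultdict

# Rough simulation of a dynamic-partition insert (HIVE-936): rows are
# routed to a partition "directory" based on the value of the partition
# column ds, creating partitions on the fly in a single pass.
rows = [
    (1, "x", "2009-11-01"),
    (2, "y", "2009-11-01"),
    (3, "z", "2009-11-16"),
]

partitions = defaultdict(list)       # maps 'ds=<value>' -> rows for T
for a, b, ds in rows:
    partitions[f"ds={ds}"].append((a, b))

print(sorted(partitions))            # ['ds=2009-11-01', 'ds=2009-11-16']
```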





[jira] [Updated] (HIVE-936) dynamic partitions creation based on values

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-936:


Component/s: Query Processor

 dynamic partitions creation based on values
 ---

 Key: HIVE-936
 URL: https://issues.apache.org/jira/browse/HIVE-936
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Ning Zhang
Assignee: Ning Zhang
 Fix For: 0.7.0

 Attachments: dp_design.txt


 If a Hive table is created as partitioned, DML can insert into only one 
 partition per query. Ideally partitions should be created on the fly based 
 on the value of the partition columns. As an example:
 {{{
   create table T (a int, b string) partitioned by (ds string);
   insert overwrite table T select a, b, ds from S where ds >= '2009-11-01' 
 and ds <= '2009-11-16';
 }}}
 should be able to execute as one DML rather than possibly 16 DMLs, one for 
 each distinct ds value. CTAS and alter table should be able to do the same thing:
 {{{
   create table T partitioned by (ds string) as select * from S where ds >= 
 '2009-11-01' and ds <= '2009-11-16';
 }}}
  and
 {{{
   create table T(a int, b string, ds string);
   insert overwrite table T select * from S where ds >= '2009-11-1' and ds <= 
 '2009-11-16';
   alter table T partitioned by (ds);
 }}}
 should all return the same results.





[jira] [Updated] (HIVE-936) dynamic partitions creation based on values

2011-08-09 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-936:


Fix Version/s: 0.7.0

 dynamic partitions creation based on values
 ---

 Key: HIVE-936
 URL: https://issues.apache.org/jira/browse/HIVE-936
 Project: Hive
  Issue Type: Task
  Components: Query Processor
Reporter: Ning Zhang
Assignee: Ning Zhang
 Fix For: 0.7.0

 Attachments: dp_design.txt


 If a Hive table is created as partitioned, DML can insert into only one 
 partition per query. Ideally partitions should be created on the fly based 
 on the value of the partition columns. As an example:
 {{{
   create table T (a int, b string) partitioned by (ds string);
   insert overwrite table T select a, b, ds from S where ds >= '2009-11-01' 
 and ds <= '2009-11-16';
 }}}
 should be able to execute as one DML rather than possibly 16 DMLs, one for 
 each distinct ds value. CTAS and alter table should be able to do the same thing:
 {{{
   create table T partitioned by (ds string) as select * from S where ds >= 
 '2009-11-01' and ds <= '2009-11-16';
 }}}
  and
 {{{
   create table T(a int, b string, ds string);
   insert overwrite table T select * from S where ds >= '2009-11-1' and ds <= 
 '2009-11-16';
   alter table T partitioned by (ds);
 }}}
 should all return the same results.





[jira] [Created] (HIVE-2363) Implicitly CLUSTER BY when dynamically partitioning

2011-08-09 Thread Adam Kramer (JIRA)
Implicitly CLUSTER BY when dynamically partitioning
---

 Key: HIVE-2363
 URL: https://issues.apache.org/jira/browse/HIVE-2363
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Adam Kramer
Priority: Critical


Whenever someone is dynamically creating partitions, the underlying 
implementation is to look at the output data, write it to a file so long as the 
partition columns are contiguous, then to close that file and open a new one if 
the partition column changes. This leads to potentially way too many files 
generated.

The solution is to ensure that a partition column's data all appears in a row 
and on the same reducer. I.e., to cluster by the partitioning columns on the 
way out.

This improvement is to detect whether a query is clustering by the eventual 
partition columns, and if not, to do so as an additional step at the end of the 
query. This will potentially save lots of space.
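The file-count blow-up can be seen with a toy model: without clustering, every reducer that sees a partition value opens its own file for it; with CLUSTER BY on the partition columns, each value lands on exactly one reducer. This is a toy worst-case model, not Hive's planner:

```python
# Toy model of output-file counts under dynamic partitioning (HIVE-2363).
# Assumption (worst case): without clustering, every partition value is
# spread across all reducers; with CLUSTER BY, each value hashes to one.
num_reducers = 32
partition_values = [f"ds=2011-08-{d:02d}" for d in range(1, 11)]  # 10 values

# Without clustering: each reducer writes one file per value it sees.
files_without_cluster_by = num_reducers * len(partition_values)

# With CLUSTER BY: one reducer, hence one file, per partition value.
files_with_cluster_by = len(partition_values)

print(files_without_cluster_by)  # 320
print(files_with_cluster_by)     # 10
```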





Review Request: HIVE-1916: Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread sam_vm

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1441/
---

Review request for hive and Ning Zhang.


Summary
---

Default behavior will be as before.
Adding new Hive conf vars to make the column names include the aggregation 
function and params.


This addresses bug HIVE-1916.
https://issues.apache.org/jira/browse/HIVE-1916


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155181 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1155181 
  trunk/ql/src/test/queries/clientpositive/autogen_colname.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/autogen_colname.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1441/diff


Testing
---

Added new query file with expected results. All unit tests pass


Thanks,

sameerm



Re: Review Request: HIVE-1916: Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1441/#review1368
---



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/1441/#comment3136

Please add these definitions to hive-default.xml along with a description 
for each property.



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1441/#comment3140

These are initialized in the constructor. Please don't also initialize them 
here.



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1441/#comment3137

Please run checkstyle and correct any violations introduced by your changes.



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1441/#comment3142

Create a static final variable for the 20 char length limit.


- Carl


On 2011-08-09 22:40:56, sameerm wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1441/
 ---
 
 (Updated 2011-08-09 22:40:56)
 
 
 Review request for hive and Ning Zhang.
 
 
 Summary
 ---
 
 Default behavior will be as before.
 Adding new Hive conf vars to make the column names include the aggregation 
 function and params.
 
 
 This addresses bug HIVE-1916.
 https://issues.apache.org/jira/browse/HIVE-1916
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155181 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
 1155181 
   trunk/ql/src/test/queries/clientpositive/autogen_colname.q PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/autogen_colname.q.out PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/1441/diff
 
 
 Testing
 ---
 
 Added new query file with expected results. All unit tests pass
 
 
 Thanks,
 
 sameerm
 




[jira] [Commented] (HIVE-1916) Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081986#comment-13081986
 ] 

Ning Zhang commented on HIVE-1916:
--

+1. Will commit if tests pass. 

 Change Default Alias For Aggregated Columns (_c1)
 -

 Key: HIVE-1916
 URL: https://issues.apache.org/jira/browse/HIVE-1916
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
 Environment: All
Reporter: James Mayfield
Priority: Minor

 Problem:
 When running a Hive query that aggregates (does a group by operation), Hive 
 will automatically name this column _c0, _c1, _c2, etc. This is a problem 
 because Hive will not execute a query against a column whose name begins with 
 _, so the user has to manually add back-ticks in order to get the 
 query to run.
 Potential Solution:
 Hive should by default call these columns by their query assignment like 
 sum_active30day_users or if that is not possible, call it something simple 
 like column_1 so that users can then query the new column without adding 
 special back-ticks.
 Example Query:
 SELECT a.ds, COUNT(a.num_accounts)
 Example Result:
 ds, count_num_accounts OR ds, column_1
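The naming behavior described above can be sketched as follows. The helper class, flag, and truncation limit below are illustrative stand-ins, not the patch's actual SemanticAnalyzer logic:

```java
// Hypothetical sketch of the alias choice: positional names like _c1 by
// default, or a name derived from the aggregation function and its argument
// when the feature is enabled, truncated to a fixed length limit.
public class AutogenColName {
    static final int MAX_LEN = 20; // stand-in for the 20-char limit discussed in review

    static String alias(boolean includeFuncName, int pos, String func, String arg) {
        if (!includeFuncName) {
            return "_c" + pos;                  // existing default behavior
        }
        String name = func + "_" + arg.replace('.', '_');
        return name.length() > MAX_LEN ? name.substring(0, MAX_LEN) : name;
    }

    public static void main(String[] args) {
        System.out.println(alias(false, 1, "count", "a.num_accounts")); // _c1
        System.out.println(alias(true, 1, "count", "a.num_accounts"));  // count_a_num_accounts
    }
}
```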

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2346:


Attachment: HIVE-2346.2.patch.txt

 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt, HIVE-2346.2.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.
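A minimal sketch of the hook mechanism being proposed; the interface and method names here are hypothetical, not Hive's actual HookContext API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: post-execution hooks run only on success, and the new
// failure hooks run when the query fails (previously nothing ran on failure).
public class FailureHookSketch {
    interface Hook { void run(String queryId); }

    static final List<String> log = new ArrayList<>();

    static void runQuery(String queryId, boolean fails,
                         List<Hook> postExecHooks, List<Hook> onFailureHooks) {
        if (fails) {
            for (Hook h : onFailureHooks) h.run(queryId); // the new behavior
        } else {
            for (Hook h : postExecHooks) h.run(queryId);  // existing post-exec hooks
        }
    }

    public static void main(String[] args) {
        runQuery("q1", false, List.of(q -> log.add("post:" + q)), List.of(q -> log.add("fail:" + q)));
        runQuery("q2", true, List.of(q -> log.add("post:" + q)), List.of(q -> log.add("fail:" + q)));
        System.out.println(log); // [post:q1, fail:q2]
    }
}
```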

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2344) filter is removed due to regression of HIVE-1538

2011-08-09 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081991#comment-13081991
 ] 

John Sichi commented on HIVE-2344:
--

+1.  Will commit when tests pass.

 filter is removed due to regression of HIVE-1538
 

 Key: HIVE-2344
 URL: https://issues.apache.org/jira/browse/HIVE-2344
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: He Yongqiang
Assignee: Amareshwari Sriramadasu
 Fix For: 0.8.0

 Attachments: hive-patch-2344.txt, ppd_udf_col.q.out.txt


  select * from 
  (
  select type_bucket,randum123
  from (SELECT *, cast(rand() as double) AS randum123 FROM tbl where ds = ...) 
 a
  where randum123 <=0.1)s where s.randum123>0.1 limit 20;
 This is returning results...
 and 
  explain
  select type_bucket,randum123
  from (SELECT *, cast(rand() as double) AS randum123 FROM tbl where ds = ...) 
 a
  where randum123 <=0.1
 shows that there is no filter.
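The reason the filter must survive is that randum123 comes from rand(), a non-deterministic UDF: pushing the outer predicate down (and re-evaluating rand()) changes the query's results. A small illustration, not Hive code:

```java
import java.util.Random;

// Illustration (not Hive code): with a single evaluation per row, the filter
// pair r <= 0.1 AND r > 0.1 can never match; if the predicate is pushed down
// and the non-deterministic expression is re-evaluated, rows leak through.
public class NonDeterministicFilter {
    static int[] countRows(long seed, int rows) {
        Random rnd = new Random(seed);
        int correctPlan = 0, brokenPlan = 0;
        for (int i = 0; i < rows; i++) {
            double r = rnd.nextDouble();             // randum123 computed once
            if (r <= 0.1 && r > 0.1) correctPlan++;  // contradictory: never true
            // broken plan: each filter re-evaluates the non-deterministic UDF
            if (rnd.nextDouble() <= 0.1 && rnd.nextDouble() > 0.1) brokenPlan++;
        }
        return new int[]{correctPlan, brokenPlan};
    }

    public static void main(String[] args) {
        int[] counts = countRows(42, 100_000);
        System.out.println("correct plan rows: " + counts[0]); // 0
        System.out.println("broken plan rows:  " + counts[1]); // some positive count
    }
}
```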

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-2346: Allow hooks to be run when a job fails.

2011-08-09 Thread Ning Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/#review1370
---



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/1295/#comment3144

Kevin, can you add this parameter to conf/hive-default.xml? We should add 
all newly introduced parameters to that file. 

Otherwise it looks good to me. 


- Ning


On 2011-08-09 22:01:28, Kevin Wilfong wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1295/
 ---
 
 (Updated 2011-08-09 22:01:28)
 
 
 Review request for hive and Ning Zhang.
 
 
 Summary
 ---
 
 I added a new type of hook, which will be run when a job fails.
 
 
 This addresses bug HIVE-2346.
 https://issues.apache.org/jira/browse/HIVE-2346
 
 
 Diffs
 -
 
   trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155569 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155569 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 1155569 
 
 Diff: https://reviews.apache.org/r/1295/diff
 
 
 Testing
 ---
 
 I ran the TestCliDriver and TestNegativeCliDriver test suites and verified 
 they passed.
 
 In addition, I created a sample hook, which simply logged that it was run.  I 
 verified it was run on a failure, but not when a job succeeded.
 
 
 Thanks,
 
 Kevin
 




[jira] [Commented] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081993#comment-13081993
 ] 

jirapos...@reviews.apache.org commented on HIVE-2346:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/#review1370
---



trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
https://reviews.apache.org/r/1295/#comment3144

Kevin, can you add this parameter to conf/hive-default.xml? We should add 
all newly introduced parameters to that file. 

Otherwise it looks good to me. 


- Ning


On 2011-08-09 22:01:28, Kevin Wilfong wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1295/
bq.  ---
bq.  
bq.  (Updated 2011-08-09 22:01:28)
bq.  
bq.  
bq.  Review request for hive and Ning Zhang.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  I added a new type of hook, which will be run when a job fails.
bq.  
bq.  
bq.  This addresses bug HIVE-2346.
bq.  https://issues.apache.org/jira/browse/HIVE-2346
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155569 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155569 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 
1155569 
bq.  
bq.  Diff: https://reviews.apache.org/r/1295/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  I ran the TestCliDriver and TestNegativeCliDriver test suites and verified 
they passed.
bq.  
bq.  In addition, I created a sample hook, which simply logged that it was run. 
 I verified it was run on a failure, but not when a job succeeded.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Kevin
bq.  
bq.



 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt, HIVE-2346.2.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2360) create dynamic partition if and only if intermediate source has files

2011-08-09 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2360:
--

Attachment: hive-2360.2.patch

refactor to use one call to Utilities.getFileStatusRecurse()

 create dynamic partition if and only if intermediate source has files
 -

 Key: HIVE-2360
 URL: https://issues.apache.org/jira/browse/HIVE-2360
 Project: Hive
  Issue Type: Bug
Reporter: Franklin Hu
Assignee: Franklin Hu
Priority: Minor
 Fix For: 0.8.0

 Attachments: hive-2360.1.patch, hive-2360.2.patch


 There are some conditions under which a partition description is created, due 
 to an insert overwrite of a table using dynamic partitioning, for partitions 
 that are empty (have no files).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: HIVE-2360 create dynamic partition iff intermediate source has files

2011-08-09 Thread Franklin Hu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1442/
---

Review request for hive, Ning Zhang and Siying Dong.


Summary
---

There are some conditions under which partition descriptions are created in 
memory and committed to the metastore despite there being no intermediate or 
final files in that directory (due to dynamic partitioning).
In this change, a check is added so that loadPartition is only called for 
partitions that have files in them.
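The check can be pictured like this. This is illustrative only: the actual change works against HDFS paths via Utilities.getFileStatusRecurse in Hive.java, not java.io.File:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the fix: before committing dynamic partitions to the
// metastore, keep only the partition directories that actually contain files.
public class NonEmptyPartitions {
    static List<File> partitionsToLoad(List<File> partitionDirs) {
        List<File> nonEmpty = new ArrayList<>();
        for (File dir : partitionDirs) {
            File[] files = dir.listFiles(File::isFile);
            if (files != null && files.length > 0) {
                nonEmpty.add(dir); // only these get loadPartition calls
            }
        }
        return nonEmpty;
    }

    public static void main(String[] args) throws IOException {
        File empty = Files.createTempDirectory("ds_empty").toFile();
        File full = Files.createTempDirectory("ds_full").toFile();
        new File(full, "000000_0").createNewFile();
        System.out.println(partitionsToLoad(List.of(empty, full)).size()); // 1
    }
}
```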


This addresses bug HIVE-2360.
https://issues.apache.org/jira/browse/HIVE-2360


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1155968 

Diff: https://reviews.apache.org/r/1442/diff


Testing
---

Unit tests pass


Thanks,

Franklin



[jira] [Commented] (HIVE-1916) Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081994#comment-13081994
 ] 

Ning Zhang commented on HIVE-1916:
--

Sorry, Carl and I were commenting at the same time. @Sameer, can you update your 
patch to address Carl's comments? 

 Change Default Alias For Aggregated Columns (_c1)
 -

 Key: HIVE-1916
 URL: https://issues.apache.org/jira/browse/HIVE-1916
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
 Environment: All
Reporter: James Mayfield
Priority: Minor

 Problem:
 When running a Hive query that aggregates (does a group by operation), Hive 
 will automatically name this column _c0, _c1, _c2, etc. This is a problem 
 because Hive will not execute a query against a column whose name begins with 
 _, so the user has to manually add back-ticks in order to get the 
 query to run.
 Potential Solution:
 Hive should by default call these columns by their query assignment like 
 sum_active30day_users or if that is not possible, call it something simple 
 like column_1 so that users can then query the new column without adding 
 special back-ticks.
 Example Query:
 SELECT a.ds, COUNT(a.num_accounts)
 Example Result:
 ds, count_num_accounts OR ds, column_1

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2346:


Attachment: HIVE-2346.3.patch.txt

 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt, HIVE-2346.2.patch.txt, 
 HIVE-2346.3.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081998#comment-13081998
 ] 

jirapos...@reviews.apache.org commented on HIVE-2346:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1295/
---

(Updated 2011-08-09 23:44:00.972625)


Review request for hive and Ning Zhang.


Changes
---

Added the hive.exec.failure.hooks property to hive-default.xml


Summary
---

I added a new type of hook, which will be run when a job fails.


This addresses bug HIVE-2346.
https://issues.apache.org/jira/browse/HIVE-2346


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155569 
  trunk/conf/hive-default.xml 1155569 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155569 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/hooks/HookContext.java 1155569 

Diff: https://reviews.apache.org/r/1295/diff


Testing
---

I ran the TestCliDriver and TestNegativeCliDriver test suites and verified they 
passed.

In addition, I created a sample hook, which simply logged that it was run.  I 
verified it was run on a failure, but not when a job succeeded.


Thanks,

Kevin



 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt, HIVE-2346.2.patch.txt, 
 HIVE-2346.3.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2344) filter is removed due to regression of HIVE-1538

2011-08-09 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-2344:
-

Assignee: Ido Hadanny  (was: Amareshwari Sriramadasu)
  Status: Open  (was: Patch Available)

I'm getting many regression test failures due to EXPLAIN plan changes.


 filter is removed due to regression of HIVE-1538
 

 Key: HIVE-2344
 URL: https://issues.apache.org/jira/browse/HIVE-2344
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: He Yongqiang
Assignee: Ido Hadanny
 Fix For: 0.8.0

 Attachments: hive-patch-2344.txt, ppd_udf_col.q.out.txt


  select * from 
  (
  select type_bucket,randum123
  from (SELECT *, cast(rand() as double) AS randum123 FROM tbl where ds = ...) 
 a
  where randum123 <=0.1)s where s.randum123>0.1 limit 20;
 This is returning results...
 and 
  explain
  select type_bucket,randum123
  from (SELECT *, cast(rand() as double) AS randum123 FROM tbl where ds = ...) 
 a
  where randum123 <=0.1
 shows that there is no filter.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2364) Make performance logging configurable.

2011-08-09 Thread Kevin Wilfong (JIRA)
Make performance logging configurable.
--

 Key: HIVE-2364
 URL: https://issues.apache.org/jira/browse/HIVE-2364
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong


Currently, we measure the time spent by a piece of code using the PerfLogBegin 
and PerfLogEnd methods in the Utilities class.  If we made the class responsible 
for logging this data configurable, it would allow for more flexibility; in 
particular, we would not be restricted to logging to an Apache Commons Log.
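The configurable-logger idea can be sketched as follows; the class and property names here are illustrative, not the actual patch's ql/log/PerfLogger:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: a base PerfLogger whose implementation class is looked up by name
// from configuration, so deployments can substitute a subclass that logs
// somewhere other than an Apache Commons Log.
public class PerfLoggerSketch {
    public static class PerfLogger {
        protected final Map<String, Long> startTimes = new HashMap<>();

        public void perfLogBegin(String method) {
            startTimes.put(method, System.currentTimeMillis());
        }

        // Returns the elapsed time; subclasses override to change where it goes.
        public long perfLogEnd(String method) {
            return System.currentTimeMillis() - startTimes.remove(method);
        }
    }

    // Stand-in for reading a config property and instantiating the named class.
    static PerfLogger create(String className) throws Exception {
        return (PerfLogger) Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        PerfLogger logger = create("PerfLoggerSketch$PerfLogger");
        logger.perfLogBegin("compile");
        System.out.println(logger.perfLogEnd("compile") >= 0); // true
    }
}
```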

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #884

2011-08-09 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/884/changes

Changes:

[nzhang] HIVE-2347. Make Hadoop Job ID available after task finish executing 
(Kevin Wilfong via Ning Zhang)

--
[...truncated 33862 lines...]

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-metastore/0.8.0-SNAPSHOT/hive-metastore-0.8.0-20110810.000515-63.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 1659K from apache.snapshots.https
[artifact:deploy] Uploaded 1659K
[artifact:deploy] [INFO] Uploading project information for hive-metastore 
0.8.0-20110810.000515-63
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-metastore:0.8.0-SNAPSHOT'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-metastore'

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve-maven-ant-tasks:
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

ivy-retrieve-maven-ant-tasks:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-serde/0.8.0-SNAPSHOT/hive-serde-0.8.0-20110810.000518-63.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 453K from apache.snapshots.https
[artifact:deploy] Uploaded 453K
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-serde'
[artifact:deploy] [INFO] Uploading project information for hive-serde 
0.8.0-20110810.000518-63
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-serde:0.8.0-SNAPSHOT'

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve-maven-ant-tasks:
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

ivy-retrieve-maven-ant-tasks:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-service/0.8.0-SNAPSHOT/hive-service-0.8.0-20110810.000520-63.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 170K from apache.snapshots.https
[artifact:deploy] Uploaded 170K
[artifact:deploy] [INFO] Uploading project information for hive-service 
0.8.0-20110810.000520-63
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-service'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 

[jira] [Commented] (HIVE-2272) add TIMESTAMP data type

2011-08-09 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082007#comment-13082007
 ] 

Siying Dong commented on HIVE-2272:
---

+1, please open a follow-up JIRA for setting timezones.

 add TIMESTAMP data type
 ---

 Key: HIVE-2272
 URL: https://issues.apache.org/jira/browse/HIVE-2272
 Project: Hive
  Issue Type: New Feature
Reporter: Franklin Hu
Assignee: Franklin Hu
 Fix For: 0.8.0

 Attachments: hive-2272.1.patch, hive-2272.10.patch, 
 hive-2272.2.patch, hive-2272.3.patch, hive-2272.4.patch, hive-2272.5.patch, 
 hive-2272.6.patch, hive-2272.7.patch, hive-2272.8.patch, hive-2272.9.patch


 Add TIMESTAMP type to serde2 that supports unix timestamp (1970-01-01 
 00:00:01 UTC to 2038-01-19 03:14:07 UTC) with optional nanosecond precision 
 using both LazyBinary and LazySimple SerDes. 
 For LazySimpleSerDe, the data is stored as JDBC-compliant strings parsable by 
 java.sql.Timestamp.
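For reference, the JDBC string form mentioned above carries the optional nanosecond precision through java.sql.Timestamp:

```java
import java.sql.Timestamp;

// Quick check of the "yyyy-mm-dd hh:mm:ss[.fffffffff]" form that
// LazySimpleSerDe stores: java.sql.Timestamp round-trips it, including
// the optional nanosecond field.
public class TimestampDemo {
    public static void main(String[] args) {
        Timestamp ts = Timestamp.valueOf("2011-08-09 00:00:01.123456789");
        System.out.println(ts.getNanos()); // 123456789
        System.out.println(ts);            // 2011-08-09 00:00:01.123456789
    }
}
```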

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2347) Make Hadoop Job ID available after task finishes executing

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082010#comment-13082010
 ] 

Hudson commented on HIVE-2347:
--

Integrated in Hive-trunk-h0.21 #884 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/884/])
HIVE-2347. Make Hadoop Job ID available after task finish executing (Kevin 
Wilfong via Ning Zhang)

nzhang : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155493
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java


 Make Hadoop Job ID available after task finishes executing
 --

 Key: HIVE-2347
 URL: https://issues.apache.org/jira/browse/HIVE-2347
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Fix For: 0.8.0

 Attachments: HIVE-2347.1.patch.txt


 After Map Reduce tasks finish the execute method (ExecDriver and 
 BlockMergeTask), the Hadoop Job ID is inaccessible to the Driver, and hence to 
 the hooks it runs.  Exposing this information could help to improve logging, 
 debugging, etc.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2364) Make performance logging configurable.

2011-08-09 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082012#comment-13082012
 ] 

Kevin Wilfong commented on HIVE-2364:
-

https://reviews.apache.org/r/1443/

 Make performance logging configurable.
 --

 Key: HIVE-2364
 URL: https://issues.apache.org/jira/browse/HIVE-2364
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2364.1.patch.txt


 Currently, the way we measure the length of time spent by a piece of code is 
 using the methods PerfLogBegin and PerfLogEnd in the Utilities class.  If we 
 made the class responsible for logging this data configurable, it would allow 
 for more flexibility, in particular, we would not be restricted to logging to 
 an Apache Commons Log.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2364) Make performance logging configurable.

2011-08-09 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2364:


Attachment: HIVE-2364.1.patch.txt

 Make performance logging configurable.
 --

 Key: HIVE-2364
 URL: https://issues.apache.org/jira/browse/HIVE-2364
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2364.1.patch.txt


 Currently, the way we measure the length of time spent by a piece of code is 
 using the methods PerfLogBegin and PerfLogEnd in the Utilities class.  If we 
 made the class responsible for logging this data configurable, it would allow 
 for more flexibility, in particular, we would not be restricted to logging to 
 an Apache Commons Log.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2364) Make performance logging configurable.

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082014#comment-13082014
 ] 

jirapos...@reviews.apache.org commented on HIVE-2364:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1443/
---

Review request for hive and Ning Zhang.


Summary
---

I created a new class PerfLogger, which wraps the old functionality of 
Utilities' PerfLogBegin and PerfLogEnd methods.  I also added a config 
variable hive.exec.perf.logger.  The value of this variable can be changed to 
point to a subclass of PerfLogger which can customize the code in PerfLogBegin 
and PerfLogEnd.


This addresses bug HIVE-2364.
https://issues.apache.org/jira/browse/HIVE-2364


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155439 
  trunk/conf/hive-default.xml 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
1155439 

Diff: https://reviews.apache.org/r/1443/diff


Testing
---

I ran the code as it is and made sure it continued to log the performance 
messages as before.

I also created a subclass of PerfLogger and used its class name as the value of 
hive.exec.perf.logger, and verified the subclass's code was run.


Thanks,

Kevin



 Make performance logging configurable.
 --

 Key: HIVE-2364
 URL: https://issues.apache.org/jira/browse/HIVE-2364
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2364.1.patch.txt


 Currently, the way we measure the length of time spent by a piece of code is 
 using the methods PerfLogBegin and PerfLogEnd in the Utilities class.  If we 
 made the class responsible for logging this data configurable, it would allow 
 for more flexibility, in particular, we would not be restricted to logging to 
 an Apache Commons Log.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: Make performance logging configurable.

2011-08-09 Thread Kevin Wilfong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1443/
---

(Updated 2011-08-10 00:18:50.035513)


Review request for hive and Ning Zhang.


Changes
---

Forgot to add the licensing info to the top of PerfLogger.


Summary
---

I created a new class PerfLogger, which wraps the old functionality of 
Utilities' PerfLogBegin and PerfLogEnd methods.  I also added a config 
variable hive.exec.perf.logger.  The value of this variable can be changed to 
point to a subclass of PerfLogger which can customize the code in PerfLogBegin 
and PerfLogEnd.


This addresses bug HIVE-2364.
https://issues.apache.org/jira/browse/HIVE-2364


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155439 
  trunk/conf/hive-default.xml 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
1155439 

Diff: https://reviews.apache.org/r/1443/diff


Testing
---

I ran the code as it is and made sure it continued to log the performance 
messages as before.

I also created a subclass of PerfLogger and used its class name as the value of 
hive.exec.perf.logger, and verified the subclass's code was run.


Thanks,

Kevin



[jira] [Updated] (HIVE-2364) Make performance logging configurable.

2011-08-09 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-2364:


Attachment: HIVE-2364.2.patch.txt

 Make performance logging configurable.
 --

 Key: HIVE-2364
 URL: https://issues.apache.org/jira/browse/HIVE-2364
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2364.1.patch.txt, HIVE-2364.2.patch.txt


 Currently, the way we measure the length of time spent by a piece of code is 
 using the methods PerfLogBegin and PerfLogEnd in the Utilities class.  If we 
 made the class responsible for logging this data configurable, it would allow 
 for more flexibility, in particular, we would not be restricted to logging to 
 an Apache Commons Log.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2364) Make performance logging configurable.

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082017#comment-13082017
 ] 

jirapos...@reviews.apache.org commented on HIVE-2364:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1443/
---

(Updated 2011-08-10 00:18:50.035513)


Review request for hive and Ning Zhang.


Changes
---

Forgot to add the licensing info to the top of PerfLogger.


Summary
---

I created a new class PerfLogger, which wraps the old functionality of 
Utilities' PerfLogBegin and PerfLogEnd methods.  I also added a config 
variable hive.exec.perf.logger.  The value of this variable can be changed to 
point to a subclass of PerfLogger which can customize the code in PerfLogBegin 
and PerfLogEnd.


This addresses bug HIVE-2364.
https://issues.apache.org/jira/browse/HIVE-2364


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155439 
  trunk/conf/hive-default.xml 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1155439 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/log/PerfLogger.java PRE-CREATION 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/ppr/PartitionPruner.java 
1155439 

Diff: https://reviews.apache.org/r/1443/diff


Testing
---

I ran the code as it is and made sure it continued to log the performance 
messages as before.

I also created a subclass of PerfLogger and used its class name as the value of 
hive.exec.perf.logger, and verified the subclass's code was run.


Thanks,

Kevin



 Make performance logging configurable.
 --

 Key: HIVE-2364
 URL: https://issues.apache.org/jira/browse/HIVE-2364
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2364.1.patch.txt, HIVE-2364.2.patch.txt


 Currently, the way we measure the time spent in a piece of code is 
 using the methods PerfLogBegin and PerfLogEnd in the Utilities class.  If we 
 made the class responsible for logging this data configurable, it would allow 
 for more flexibility; in particular, we would not be restricted to logging to 
 an Apache Commons Log.





Re: Review Request: HIVE-1916: Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread sam_vm

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1441/
---

(Updated 2011-08-10 00:56:44.290697)


Review request for hive and Ning Zhang.


Changes
---

- Added hive.autogen.columnname.prefix.label and 
hive.autogen.columnname.prefix.includefuncname to hive-default.xml
- Created a static final variable for the 20-char length limit in 
SemanticAnalyzer.java
- Took out default variable values from SemanticAnalyzer.java (now set in the 
constructor)
- Fixed checkstyle line-length warnings introduced by the patch in 
HiveConf.java and SemanticAnalyzer.java
- Added some more comments in SemanticAnalyzer.java


Summary
---

Default behavior is unchanged.
This adds new Hive conf vars to make auto-generated column names include the 
aggregation function and its parameters.


This addresses bug HIVE-1916.
https://issues.apache.org/jira/browse/HIVE-1916


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155181 
  trunk/conf/hive-default.xml 1155181 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1155181 
  trunk/ql/src/test/queries/clientpositive/autogen_colname.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/autogen_colname.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1441/diff


Testing
---

Added new query file with expected results. All unit tests pass


Thanks,

sameerm



[jira] [Commented] (HIVE-1916) Change Default Alias For Aggregated Columns (_c1)

2011-08-09 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082032#comment-13082032
 ] 

jirapos...@reviews.apache.org commented on HIVE-1916:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1441/
---

(Updated 2011-08-10 00:56:44.290697)


Review request for hive and Ning Zhang.


Changes
---

- Added hive.autogen.columnname.prefix.label and 
hive.autogen.columnname.prefix.includefuncname to hive-default.xml
- Created a static final variable for the 20-char length limit in 
SemanticAnalyzer.java
- Took out default variable values from SemanticAnalyzer.java (now set in the 
constructor)
- Fixed checkstyle line-length warnings introduced by the patch in 
HiveConf.java and SemanticAnalyzer.java
- Added some more comments in SemanticAnalyzer.java


Summary
---

Default behavior is unchanged.
This adds new Hive conf vars to make auto-generated column names include the 
aggregation function and its parameters.


This addresses bug HIVE-1916.
https://issues.apache.org/jira/browse/HIVE-1916


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1155181 
  trunk/conf/hive-default.xml 1155181 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1155181 
  trunk/ql/src/test/queries/clientpositive/autogen_colname.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/autogen_colname.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1441/diff


Testing
---

Added new query file with expected results. All unit tests pass


Thanks,

sameerm



 Change Default Alias For Aggregated Columns (_c1)
 -

 Key: HIVE-1916
 URL: https://issues.apache.org/jira/browse/HIVE-1916
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
 Environment: All
Reporter: James Mayfield
Priority: Minor

 Problem:
 When running a Hive query that aggregates (does a group-by operation), Hive 
 automatically names the resulting columns _c0, _c1, _c2, etc. This is a 
 problem because Hive will not execute a query against a column whose name 
 begins with an underscore, so the user has to manually add back-ticks to get 
 the query to run.
 Potential Solution:
 By default, Hive should name these columns after their query assignment, like 
 sum_active30day_users, or, if that is not possible, something simple 
 like column_1, so that users can query the new column without adding 
 special back-ticks.
 Example Query:
 SELECT a.ds, COUNT(a.num_accounts)
 Example Result:
 ds, count_num_accounts OR ds, column_1
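
The naming idea above can be sketched as follows (a hypothetical illustration, 
not the patch's actual SemanticAnalyzer code; the 20-char truncation mirrors 
the limit mentioned in the review):

```java
// Derive a readable column alias from the aggregation function and its
// argument, truncated to a fixed length, instead of positional _c0/_c1 names.
public class AutoAlias {
    static final int MAX_LEN = 20; // assumed limit, per the review comments

    // e.g. ("COUNT", "a.num_accounts") -> "count_a_num_accounts"
    static String alias(String func, String arg) {
        String raw = (func + "_" + arg).toLowerCase().replaceAll("[^a-z0-9]", "_");
        return raw.length() <= MAX_LEN ? raw : raw.substring(0, MAX_LEN);
    }

    public static void main(String[] args) {
        System.out.println(alias("COUNT", "a.num_accounts"));
    }
}
```

Because the alias contains only lowercase letters, digits, and underscores and 
never starts with one, it can be referenced without back-ticks.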





[jira] [Commented] (HIVE-1101) Extend Hive ODBC to support more functions

2011-08-09 Thread Brian Lau (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082106#comment-13082106
 ] 

Brian Lau commented on HIVE-1101:
-

unixODBC-2.2.14-p2-HIVE-1101/DriverManager/SQLAllocHandle.c
contains the line
   return SQL_ERROR; // Zn: just for testing !!

 Extend Hive ODBC to support more functions
 --

 Key: HIVE-1101
 URL: https://issues.apache.org/jira/browse/HIVE-1101
 Project: Hive
  Issue Type: New Feature
  Components: ODBC
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1101.patch, unixODBC-2.2.14-p2-HIVE-1101.tar.gz


 Currently the Hive ODBC driver only supports a minimal list of functions, 
 enough to make some applications work. Other applications require support for 
 more functions, including:
 *SQLCancel
 *SQLFetchScroll
 *SQLGetData   
 *SQLGetInfo
 *SQLMoreResults
 *SQLRowCount
 *SQLSetConnectAtt
 *SQLSetStmtAttr
 *SQLEndTran
 *SQLPrepare
 *SQLNumParams
 *SQLDescribeParam
 *SQLBindParameter
 *SQLGetConnectAttr
 *SQLSetEnvAttr
 *SQLPrimaryKeys (not ODBC API? Hive does not support primary keys yet)
 *SQLForeignKeys (not ODBC API? Hive does not support foreign keys yet)
 We should support as many of them as possible. 





[jira] [Updated] (HIVE-2196) Ensure HiveConf includes all properties defined in hive-default.xml

2011-08-09 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2196:
---

Status: Patch Available  (was: Open)

 Ensure HiveConf includes all properties defined in hive-default.xml
 ---

 Key: HIVE-2196
 URL: https://issues.apache.org/jira/browse/HIVE-2196
 Project: Hive
  Issue Type: Bug
  Components: Configuration
Reporter: Carl Steinbach
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2196.1.patch, HIVE-2196.patch


 There are a bunch of properties that are defined in hive-default.xml but not 
 in HiveConf.





Build failed in Jenkins: Hive-trunk-h0.21 #885

2011-08-09 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/885/changes

Changes:

[pauly] HIVE-2246. Dedupe tables' column schemas from partitions in the 
metastore db (Sohan Jain via pauly)

--
[...truncated 33777 lines...]

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-metastore/0.8.0-SNAPSHOT/hive-metastore-0.8.0-20110810.043604-64.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 1664K from apache.snapshots.https
[artifact:deploy] Uploaded 1664K
[artifact:deploy] [INFO] Uploading project information for hive-metastore 
0.8.0-20110810.043604-64
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-metastore:0.8.0-SNAPSHOT'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-metastore'

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve-maven-ant-tasks:
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

ivy-retrieve-maven-ant-tasks:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-serde/0.8.0-SNAPSHOT/hive-serde-0.8.0-20110810.043606-64.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 453K from apache.snapshots.https
[artifact:deploy] Uploaded 453K
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-serde'
[artifact:deploy] [INFO] Uploading project information for hive-serde 
0.8.0-20110810.043606-64
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-serde:0.8.0-SNAPSHOT'

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve-maven-ant-tasks:
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

ivy-retrieve-maven-ant-tasks:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-service/0.8.0-SNAPSHOT/hive-service-0.8.0-20110810.043609-64.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 170K from apache.snapshots.https
[artifact:deploy] Uploaded 170K
[artifact:deploy] [INFO] Uploading project information for hive-service 
0.8.0-20110810.043609-64
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-service'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 

[jira] [Commented] (HIVE-2310) CREATE EXTERNAL TABLE should require a valid LOCATION clause

2011-08-09 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082134#comment-13082134
 ] 

Ashutosh Chauhan commented on HIVE-2310:


What about the following:
{code}
create table mytbl (a string) location '/tmp/tbl';
{code}

Is it fine to allow a location for a table without it being 'external'? I think 
the semantics should be: if you want to specify a location, then the table must 
be external.
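
The rule being discussed could be sketched like this (assumed semantics for 
illustration only; this is not Hive's actual SemanticAnalyzer validation, and 
the class and method names are hypothetical):

```java
// Validate the EXTERNAL/LOCATION combination of a CREATE TABLE statement:
// EXTERNAL tables must supply a LOCATION, and a LOCATION is rejected on
// non-external (managed) tables.
public class CreateTableCheck {
    static void validate(boolean isExternal, String location) {
        if (isExternal && (location == null || location.isEmpty())) {
            throw new IllegalArgumentException(
                "CREATE EXTERNAL TABLE requires a LOCATION clause");
        }
        if (!isExternal && location != null) {
            throw new IllegalArgumentException(
                "LOCATION is only allowed on EXTERNAL tables");
        }
    }

    public static void main(String[] args) {
        validate(true, "/tmp/tbl");          // valid: external with location
        try {
            validate(false, "/tmp/tbl");     // the case questioned above
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```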


 CREATE EXTERNAL TABLE should require a valid LOCATION clause
 

 Key: HIVE-2310
 URL: https://issues.apache.org/jira/browse/HIVE-2310
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: Carl Steinbach
Assignee: Franklin Hu
 Fix For: 0.8.0

 Attachments: hive-2310.1.patch








[jira] [Commented] (HIVE-2358) JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types

2011-08-09 Thread Mythili Gopalakrishnan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082135#comment-13082135
 ] 

Mythili Gopalakrishnan commented on HIVE-2358:
--

Yes Patrick, I am OK with this patch being committed. 

 JDBC DatabaseMetaData and ResultSetMetaData need to match for particular types
 --

 Key: HIVE-2358
 URL: https://issues.apache.org/jira/browse/HIVE-2358
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.8.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2358.patch


 My patch for HIVE-1631 did not ensure the following (from comment on 1631):
 -
 Mythili Gopalakrishnan added a comment - 08/Aug/11 08:42
 Just tested this fix and does NOT work correctly. Here are my findings on a 
 FLOAT column
 Without Patch on a FLOAT Column
 
 DatabaseMetaData.getColumns () COLUMN_SIZE returns 12
 DatabaseMetaData.getColumns () DECIMAL_DIGITS - returns 0
 ResultSetMetaData.getPrecision() returns 0
 ResultSetMetaData.getScale() returns 0
 With Patch on a FLOAT Column
 
 DatabaseMetaData.getColumns () COLUMN_SIZE returns 24
 DatabaseMetaData.getColumns () DECIMAL_DIGITS - returns 0
 ResultSetMetaData.getPrecision() returns 7
 ResultSetMetaData.getScale() returns 7
 Also both DatabaseMetadata and ResultSetMetaData must return the same 
 information for Precision and Scale for FLOAT,DOUBLE types.
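
One way to guarantee the consistency requirement above is to make both metadata 
paths read from a single shared type definition, as in this sketch (the 
precision/scale values and names here are assumptions for illustration, not 
Hive's actual JDBC constants):

```java
// Shared per-type metadata so DatabaseMetaData.getColumns() and
// ResultSetMetaData.getPrecision()/getScale() cannot drift apart.
enum HiveJdbcType {
    FLOAT(24, 7),    // (columnSize, decimalDigits) - assumed values
    DOUBLE(53, 15);

    final int precision;
    final int scale;

    HiveJdbcType(int precision, int scale) {
        this.precision = precision;
        this.scale = scale;
    }
}

public class MetaDataConsistency {
    public static void main(String[] args) {
        // Both metadata interfaces would delegate to the same enum entry,
        // so COLUMN_SIZE always equals getPrecision() for a given type.
        HiveJdbcType t = HiveJdbcType.FLOAT;
        System.out.println("precision=" + t.precision + " scale=" + t.scale);
    }
}
```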





[jira] [Updated] (HIVE-2156) Improve error messages emitted during task execution

2011-08-09 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2156:
-

Status: Open  (was: Patch Available)

 Improve error messages emitted during task execution
 

 Key: HIVE-2156
 URL: https://issues.apache.org/jira/browse/HIVE-2156
 Project: Hive
  Issue Type: Improvement
Reporter: Syed S. Albiz
Assignee: Syed S. Albiz
 Attachments: HIVE-2156.1.patch, HIVE-2156.10.patch, 
 HIVE-2156.11.patch, HIVE-2156.12.patch, HIVE-2156.2.patch, HIVE-2156.4.patch, 
 HIVE-2156.8.patch, HIVE-2156.9.patch


 Follow-up to HIVE-1731
 A number of issues were related to reporting errors from task execution and 
 surfacing these in a more useful form.
 Currently a cryptic message containing Execution Error, a return code, and 
 the class name of the task is emitted.
 The most useful log messages here are emitted to the local logs, which can be 
 found through the jobtracker. Including either a pointer to these logs or 
 their actual content in the error message would substantially improve its 
 usefulness. It may also be worth looking into how the underlying error 
 reporting through Hadoop is done and whether more information can be 
 propagated up from there.
 Specific issues raised in  HIVE-1731:
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 * issue was in regexp_extract syntax
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 * tried: desc table_does_not_exist;





[jira] [Commented] (HIVE-2156) Improve error messages emitted during task execution

2011-08-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082142#comment-13082142
 ] 

Ning Zhang commented on HIVE-2156:
--

Syed, there are still 2 tests failing: minimr_broken_pipe.q in 
TestNegativeCliDriver and TestNegativeMinimrCliDriver. Can you take yet another 
look? Also, please rerun all unit tests and make sure all of them pass before 
submitting; otherwise it takes too many iterations. 

 Improve error messages emitted during task execution
 

 Key: HIVE-2156
 URL: https://issues.apache.org/jira/browse/HIVE-2156
 Project: Hive
  Issue Type: Improvement
Reporter: Syed S. Albiz
Assignee: Syed S. Albiz
 Attachments: HIVE-2156.1.patch, HIVE-2156.10.patch, 
 HIVE-2156.11.patch, HIVE-2156.12.patch, HIVE-2156.2.patch, HIVE-2156.4.patch, 
 HIVE-2156.8.patch, HIVE-2156.9.patch


 Follow-up to HIVE-1731
 A number of issues were related to reporting errors from task execution and 
 surfacing these in a more useful form.
 Currently a cryptic message containing Execution Error, a return code, and 
 the class name of the task is emitted.
 The most useful log messages here are emitted to the local logs, which can be 
 found through the jobtracker. Including either a pointer to these logs or 
 their actual content in the error message would substantially improve its 
 usefulness. It may also be worth looking into how the underlying error 
 reporting through Hadoop is done and whether more information can be 
 propagated up from there.
 Specific issues raised in  HIVE-1731:
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 * issue was in regexp_extract syntax
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask
 * tried: desc table_does_not_exist;





[jira] [Commented] (HIVE-2346) Add hooks to run when execution fails.

2011-08-09 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082145#comment-13082145
 ] 

Ning Zhang commented on HIVE-2346:
--

+1. Will commit if tests pass.

 Add hooks to run when execution fails.
 --

 Key: HIVE-2346
 URL: https://issues.apache.org/jira/browse/HIVE-2346
 Project: Hive
  Issue Type: Improvement
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
 Attachments: HIVE-2346.1.patch.txt, HIVE-2346.2.patch.txt, 
 HIVE-2346.3.patch.txt


 Currently, when a query fails, the Post Execution Hooks are not run.
 Adding hooks to be run when a query fails could allow for better logging etc.
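
A minimal sketch of the idea (the interface and class names here are 
hypothetical, not Hive's actual hook API): the driver keeps a list of hooks 
and runs each one when execution fails, so failures can be logged just like 
successful queries.

```java
import java.util.ArrayList;
import java.util.List;

// Hook invoked when query execution fails.
interface OnFailureHook {
    void run(String queryId, Throwable error);
}

class QueryDriver {
    private final List<OnFailureHook> failureHooks = new ArrayList<>();

    void addFailureHook(OnFailureHook hook) {
        failureHooks.add(hook);
    }

    void execute(String queryId, Runnable query) {
        try {
            query.run();
        } catch (RuntimeException e) {
            // Give every registered hook a chance to observe the failure,
            // then rethrow so the caller still sees the error.
            for (OnFailureHook h : failureHooks) {
                h.run(queryId, e);
            }
            throw e;
        }
    }
}

public class FailureHookDemo {
    public static void main(String[] args) {
        QueryDriver d = new QueryDriver();
        d.addFailureHook((id, err) ->
            System.out.println("query " + id + " failed: " + err.getMessage()));
        try {
            d.execute("q1", () -> { throw new RuntimeException("boom"); });
        } catch (RuntimeException ignored) {
            // already reported by the hook
        }
    }
}
```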
