date:20110425


 [ 
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2126:
---

Status: Patch Available  (was: Open)

 Hive's symlink text input format should be able to work with 
 ComineHiveInputFormat
 --

 Key: HIVE-2126
 URL: https://issues.apache.org/jira/browse/HIVE-2126
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: He Yongqiang
 Attachments: HIVE-2126.1.patch


 At compile time, if a partition's file format is SymlinkTextInputFormat, Hive will 
 replace the symlink path with the paths listed in the symlink file. This way, it will 
 work with Hive's CombineHiveInputFormat.
 We do this at compile time for two reasons:
 1) At run time, the input path is used not only to get the record reader but 
 also for Hive to look up aliases and thus the operator tree. 
 CombineHiveInputFormat can have multiple paths in each split, and when it 
 switches paths it also sets the new input file name on the job. So it always 
 requires a real input path name; the path cannot be faked.
 2) Writing a new input format would duplicate a lot of the work already done 
 in the existing CombineHiveInputFormat.
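
A minimal sketch of the path replacement described above, in Java. It assumes a symlink file is a plain text file that lists one target path per line, and it reads from the local filesystem through java.nio rather than from HDFS. SymlinkPathExpander, readTargets and expand are hypothetical names used only for illustration; this is not the code in HIVE-2126.1.patch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive.
class SymlinkPathExpander {

    /** Reads one symlink file and returns the non-blank target paths it lists. */
    static List<String> readTargets(Path symlinkFile) throws IOException {
        List<String> targets = new ArrayList<>();
        for (String line : Files.readAllLines(symlinkFile, StandardCharsets.UTF_8)) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                targets.add(trimmed);
            }
        }
        return targets;
    }

    /** Replaces every symlink file in the input list with the paths it points to. */
    static List<String> expand(List<Path> symlinkFiles) throws IOException {
        List<String> realPaths = new ArrayList<>();
        for (Path symlinkFile : symlinkFiles) {
            realPaths.addAll(readTargets(symlinkFile));
        }
        return realPaths;
    }
}
```

Once the plan holds only the expanded paths, every path a combined split switches to is a real file, which is the property point 1) asks for.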

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Review Request: Hive's symlink text input format should be able to work with ComineHiveInputFormat

2011-04-25 Thread Yongqiang He


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/653/
---

Review request for hive.


Summary
---

Hive's symlink text input format should be able to work with 
ComineHiveInputFormat


This addresses bug hive-2126.
https://issues.apache.org/jira/browse/hive-2126


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096093 
  trunk/conf/hive-default.xml 1096093 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1096093 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1096093 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/ReworkMapredInputFormat.java PRE-CREATION
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java 1096093
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 1096093
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1096093
  trunk/ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 1096093

Diff: https://reviews.apache.org/r/653/diff


Testing
---


Thanks,

Yongqiang

[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat


[ 
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024760#comment-13024760
 ] 

He Yongqiang commented on HIVE-2126:


review board: https://reviews.apache.org/r/653/


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat

2011-04-25 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024761#comment-13024761
 ] 

jirapos...@reviews.apache.org commented on HIVE-2126:
-





[jira] [Created] (HIVE-2131) Bitmap Operation UDF doesn't clear return list

Bitmap Operation UDF doesn't clear return list
--

 Key: HIVE-2131
 URL: https://issues.apache.org/jira/browse/HIVE-2131
 Project: Hive
  Issue Type: Bug
Reporter: Marquis Wang
Assignee: Marquis Wang


The AbstractGenericUDFEWAHBitmapBop.java does not clear the return list when 
evaluate() is called, causing each subsequent call to a bitmap operation to 
return the wrong values.
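
A small sketch of this class of bug, in Java. It assumes the UDF keeps a single result list in a field and reuses it across evaluate() calls, which is what the description implies; ReusedListOp and the clearFirst flag are hypothetical and stand in for the real class only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a UDF that reuses one result list across calls.
class ReusedListOp {
    private final List<Long> ret = new ArrayList<>();
    private final boolean clearFirst;

    ReusedListOp(boolean clearFirst) {
        this.clearFirst = clearFirst;
    }

    /** Bitwise AND of two equal-length word lists, written into the reused list. */
    List<Long> evaluate(List<Long> a, List<Long> b) {
        if (clearFirst) {
            ret.clear(); // the fix: drop results left over from the previous call
        }
        for (int i = 0; i < a.size(); i++) {
            ret.add(a.get(i) & b.get(i));
        }
        return ret;
    }
}
```

With clearFirst set to false, the second call returns the first call's words followed by its own, so every call after the first is wrong. With clearFirst set to true, each call returns only its own result.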


[jira] [Updated] (HIVE-2131) Bitmap Operation UDF doesn't clear return list


 [ 
https://issues.apache.org/jira/browse/HIVE-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-2131:
---

Attachment: HIVE-2131.1.patch

Small patch that solves this problem.


[jira] [Updated] (HIVE-2131) Bitmap Operation UDF doesn't clear return list


 [ 
https://issues.apache.org/jira/browse/HIVE-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-2131:
---

Status: Patch Available  (was: Open)


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat

[
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024840#comment-13024840
]

Namit Jain commented on HIVE-2126:
--

I haven't looked at the code yet, but I have a high-level question.
Should we call it SymbolicInputFormat instead?
I mean, it should work for all kinds of files, not just text files.
For backward compatibility, we can make SymbolicTextInputFormat extend
SymbolicInputFormat.


[jira] [Commented] (HIVE-2131) Bitmap Operation UDF doesn't clear return list


[ 
https://issues.apache.org/jira/browse/HIVE-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024842#comment-13024842
 ] 

John Sichi commented on HIVE-2131:
--

Can you add a test case?


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat

[
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024883#comment-13024883
]

He Yongqiang commented on HIVE-2126:

The reason for using ReworkMapredInputFormat is that the reworkMapred
interface can also be used by other formats in the future, for example if
another file format also wants to change the mapred work depending on its input.
What do you think?


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat

[
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024891#comment-13024891
]

Namit Jain commented on HIVE-2126:
--

It might be simpler if HiveInputFormat (or a new interface that extends
InputFormat) adds this new method.

All Hive input formats will implement that interface. The default
implementation does nothing.

You don't need code like the following:
if (partDesc.getInputFileFormatClass().equals(SymlinkTextInputFormat.class)) {
//change to TextInputFormat

You always call the new method, which is a no-op for all other input formats
right now.
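
A rough sketch of that suggestion, in Java. All names here (JobPlan, ReworkableInputFormat, rework, PlainTextFormat, SymlinkFormat) are hypothetical and are not the Hive API. It also uses a Java 8 default method for the no-op, which is an assumption made for brevity; an abstract base class with an empty method would express the same idea.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; none of this is the Hive API.
class JobPlan {
    final List<String> inputPaths = new ArrayList<>();
}

interface ReworkableInputFormat {
    /** Hook called for every input format; the default leaves the plan untouched. */
    default void rework(JobPlan plan) {
        // no-op
    }
}

class PlainTextFormat implements ReworkableInputFormat {
    // Inherits the no-op hook.
}

class SymlinkFormat implements ReworkableInputFormat {
    private final List<String> targets;

    SymlinkFormat(List<String> targets) {
        this.targets = targets;
    }

    @Override
    public void rework(JobPlan plan) {
        // Swap the symlink path for the real paths it points to.
        plan.inputPaths.clear();
        plan.inputPaths.addAll(targets);
    }
}
```

The caller invokes rework on whatever format the partition uses, so the class-equality check quoted above is no longer needed.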


[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures


 [ 
https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2127:
-

Attachment: HIVE-2127.patch

 Improve stats gathering reliability by retries on failures
 --

 Key: HIVE-2127
 URL: https://issues.apache.org/jira/browse/HIVE-2127
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2127.patch


 Stats publishing and aggregation are tried only once; if there is any exception, 
 the operation fails and returns. If many mappers/reducers update stats at the 
 same time, lock timeouts are very common. We should make stats more 
 reliable by retrying when there is an SQLException.


Review Request: HIVE-2127. Improve stats gathering reliability by retries on failures

2011-04-25 Thread Ning Zhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/664/
---

Review request for hive.


Summary
---

The major changes are:

 0) Two parameters are introduced: hive.stats.retries.max (default 0), the 
maximum number of retries on SQLException failures, and hive.stats.retries.wait 
(default 3 sec), the base time window (explained below) to wait before the 
next retry. 

 1) Introduced a couple of Utilities functions to execute SQL queries with 
retries on failures. One of them determines the wait time from the number of 
failures and a base wait window (the same scheme as the one introduced in 
HDFS-767 for DFSClient to retry on BlockMissingExceptions). The actual wait 
time is baseWindow * failures + baseWindow * (failures + 1) * r, where r is a 
random number in [0.0, 1.0].

 2) Changed JDBCStatsAggregator.java to use PreparedStatement so that it can 
use executeWithRetries(). 

 3) Changed JDBCStatsPublisher.java and JDBCStatsAggregator.java to use 
retries on SQL connections and SQL executions. 
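
The wait-time rule in 1) can be sketched in Java as follows. This is an illustration under assumptions, not the patch itself: StatsRetry, SqlAction, waitMillis and runWithRetries are hypothetical names, the failure count passed to the formula is taken to be the number of failures so far, and java.util.Random.nextDouble() returns a value in [0.0, 1.0), so the upper end of the range is never reached exactly.

```java
import java.sql.SQLException;
import java.util.Random;

// Hypothetical sketch of retry-with-backoff; not the Utilities code from the patch.
class StatsRetry {

    /** An SQL step that may fail with SQLException. */
    interface SqlAction<T> {
        T run() throws SQLException;
    }

    /** Wait time: baseWindow * failures + baseWindow * (failures + 1) * r, r in [0.0, 1.0). */
    static long waitMillis(long baseWindowMs, int failures, Random rng) {
        return (long) (baseWindowMs * failures
                + baseWindowMs * (failures + 1) * rng.nextDouble());
    }

    /** Runs the action, retrying up to maxRetries times when it throws SQLException. */
    static <T> T runWithRetries(SqlAction<T> action, int maxRetries,
                                long baseWindowMs, Random rng)
            throws SQLException, InterruptedException {
        int failures = 0;
        while (true) {
            try {
                return action.run();
            } catch (SQLException e) {
                failures++;
                if (failures > maxRetries) {
                    throw e; // retries exhausted: surface the last failure
                }
                Thread.sleep(waitMillis(baseWindowMs, failures, rng));
            }
        }
    }
}
```

The fixed term grows with each failure and the random term spreads out tasks that failed at the same moment, so many mappers and reducers are less likely to hit the lock again in step. With a maximum of 0 retries, the default stated above, the action runs exactly once.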


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1095959 
  trunk/conf/hive-default.xml 1095959 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1095959
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 1095959
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 1095959

Diff: https://reviews.apache.org/r/664/diff


Testing
---

Running unit tests. 


Thanks,

Ning

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures


 [ 
https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2127:
-

Status: Patch Available  (was: Open)

Review board: https://reviews.apache.org/r/664/



[REMINDER] Hive Contrib Meeting Today + Preliminary Agenda

2011-04-25 Thread Carl Steinbach

Hi,

The April Hive Contributors meeting is convening later today at Facebook.
See the Meetup page for more details:
http://www.meetup.com/Hive-Contributors-Group/

Here's the preliminary agenda for the meeting:

1) Should we do an 0.7.1 release?
2) Scheduling the 0.8.0 release.
3) Improving the testing situation.
4) Plan out agenda for Hive Contributor Day at Yahoo on June 30
5) Status update on HCatalog
6) Looking for volunteers for improving code quality/maintainability (e.g.
checkstyle enforcement)
7) Suggestions for candidate projects for FB summer interns on Hive

Please email me if you want to add any additional items to the list.

Thanks.

Carl

Re: [REMINDER] Hive Contrib Meeting Today + Preliminary Agenda

2011-04-25 Thread Edward Capriolo

On Mon, Apr 25, 2011 at 2:38 PM, Carl Steinbach c...@cloudera.com wrote:
> Hi,
>
> The April Hive Contributors meeting is convening later today at Facebook.
> See the Meetup page for more details:
> http://www.meetup.com/Hive-Contributors-Group/
>
> Here's the preliminary agenda for the meeting:
>
> 1) Should we do an 0.7.1 release?
> 2) Scheduling the 0.8.0 release.
> 3) Improving the testing situation.
> 4) Plan out agenda for Hive Contributor Day at Yahoo on June 30
> 5) Status update on HCatalog
> 6) Looking for volunteers for improving code quality/maintainability (e.g.
> checkstyle enforcement)
> 7) Suggestions for candidate projects for FB summer interns on Hive
>
> Please email me if you want to add any additional items to the list.
>
> Thanks.
>
> Carl


Sorry, I will not be able to attend. As for #3: once we complete the
ticket for Hive in Maven, I have a tool called hive_unit that can
launch a micro Hive cluster inside a unit test and verify the result
rows (it does not verify the parsing, planning, etc. as the current CLI
does).

Edward

Build failed in Jenkins: Hive-trunk-h0.20 #691

2011-04-25 Thread Apache Jenkins Server

See https://builds.apache.org/hudson/job/Hive-trunk-h0.20/691/

--
[...truncated 30012 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-25_12-29-34_873_6844360564202271837/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-04-25 12:29:37,966 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-25_12-29-34_873_6844360564202271837/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104251229_1192548424.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-25_12-29-39_515_6608217275322098033/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-25_12-29-39_515_6608217275322098033/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104251229_2003801571.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type:

[jira] [Updated] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat


 [ 
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2126:
---

Attachment: HIVE-2126.2.patch


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat


[ 
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024956#comment-13024956
 ] 

Namit Jain commented on HIVE-2126:
--

Can you update the review board?


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat

2011-04-25 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024957#comment-13024957
 ] 

jirapos...@reviews.apache.org commented on HIVE-2126:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/653/
---

(Updated 2011-04-25 20:48:09.176419)


Review request for hive.


Changes
---

Added a new class, SymbolicInputFormat, and moved reworkMapred to this new 
class.
Moved the new code from SemanticAnalyzer to a Utilities method.


Summary
---

Hive's symlink text input format should be able to work with 
ComineHiveInputFormat


This addresses bug hive-2126.
https://issues.apache.org/jira/browse/hive-2126


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096548 
  trunk/conf/hive-default.xml 1096548 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1096548 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1096548 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1096548 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/ReworkMapredInputFormat.java PRE-CREATION
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java PRE-CREATION
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/SymlinkTextInputFormat.java 1096548
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ExplainSemanticAnalyzer.java 1096548
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1096548
  trunk/ql/src/test/org/apache/hadoop/hive/ql/io/TestSymlinkTextInputFormat.java 1096548

Diff: https://reviews.apache.org/r/653/diff


Testing
---


Thanks,

Yongqiang




[jira] [Updated] (HIVE-2131) Bitmap Operation UDF doesn't clear return list


 [ 
https://issues.apache.org/jira/browse/HIVE-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-2131:
---

Attachment: HIVE-2131.2.patch

I've updated the udf_bitmap_and and udf_bitmap_or tests so that they would have 
detected the bug in the old code.


[jira] [Updated] (HIVE-2125) alter table concatenate fails and deletes data


 [ 
https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2125:
---

Status: Patch Available  (was: Open)

 alter table concatenate fails and deletes data
 --

 Key: HIVE-2125
 URL: https://issues.apache.org/jira/browse/HIVE-2125
 Project: Hive
  Issue Type: Bug
Reporter: Joydeep Sen Sarma
Assignee: He Yongqiang
Priority: Critical
 Attachments: HIVE-2125.1.patch


 The number of reducers is not set by this command (unlike other Hive 
 queries). Since mapred.reduce.tasks=-1 (to let Hive infer this automatically), 
 the JobTracker fails the job (the number of reducers cannot be negative):
 hive> alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
 alter table ad_imps_2 partition(ds='2009-06-16') concatenate;
 Starting Job = job_201103101203_453180, Tracking URL = 
 http://curium.data.facebook.com:50030/jobdetails.jsp?jobid=job_201103101203_453180
 Kill Command = /mnt/vol/hive/sites/curium/hadoop/bin/../bin/hadoop job  
 -Dmapred.job.tracker=curium.data.facebook.com:50029 -kill 
 job_201103101203_453180
 Hadoop job information for null: number of mappers: 0; number of reducers: 0
 2011-04-22 10:21:24,046 null map = 100%,  reduce = 100%
 Ended Job = job_201103101203_453180 with errors
 Moved to trash: /user/facebook/warehouse/ad_imps_2/_backup.ds=2009-06-16
 After the job fails, the partition is deleted.
 Thankfully, it is still in the trash.


Review Request: alter table concatenate fails and deletes data

2011-04-25 Thread Yongqiang He


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/665/
---

Review request for hive.


Summary
---

alter table concatenate fails and deletes data

The failure occurs because the number of reducers is set to -1.

This patch sets it to zero.

It also adds a move task as a child of the merge task, adds a conf to control 
whether to check the index, and sets the job name for the merge job.
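
The reducer-count part of this change can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names are hypothetical (they are not from HIVE-2125.1.patch), and it assumes the merge job needs no reduce phase, which is how "set to zero" reads here.

```java
// Illustrative sketch only; names are hypothetical and not taken from the patch.
final class ReducerCountSketch {
    // mapred.reduce.tasks = -1 asks Hive to infer the reducer count.
    static final int INFER = -1;

    // Mirrors the check described in the bug report: a negative reducer
    // count is rejected when the job is submitted.
    static void checkSubmittable(int numReducers) {
        if (numReducers < 0) {
            throw new IllegalArgumentException(
                "number of reducers cannot be negative: " + numReducers);
        }
    }

    // Resolves the count to submit. A map-only job (assumed here to include
    // the merge job) always gets zero; any other job replaces the -1
    // placeholder with an inferred value supplied by the caller.
    static int resolve(int configured, boolean mapOnly, int inferred) {
        if (mapOnly) {
            return 0;
        }
        return configured < 0 ? inferred : configured;
    }

    public static void main(String[] args) {
        int merge = resolve(INFER, true, 7);
        checkSubmittable(merge); // passes, because 0 is not negative
        System.out.println("merge job reducers = " + merge);
    }
}
```

Passing the raw -1 straight to checkSubmittable() reproduces the failure in the report; resolving it first is the whole fix in this sketch.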


This addresses bug HIVE-2125.
https://issues.apache.org/jira/browse/HIVE-2125


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096599 
  trunk/conf/hive-default.xml 1096599 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1096599 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1096599 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1096599 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1096599 
  trunk/ql/src/test/queries/clientnegative/alter_merge_index.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/alter_merge_index.q PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/alter_merge_index.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/alter_merge_index.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/665/diff


Testing
---


Thanks,

Yongqiang

[jira] [Commented] (HIVE-2125) alter table concatenate fails and deletes data


[ 
https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024993#comment-13024993
 ] 

He Yongqiang commented on HIVE-2125:


The failure occurs because the number of reducers is set to -1.
This patch sets it to zero.
It also adds a move task as a child of the merge task, adds a conf to control 
whether to check the index, and sets the job name for the merge job.


[jira] [Commented] (HIVE-2125) alter table concatenate fails and deletes data

2011-04-25 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13024994#comment-13024994
 ] 

jirapos...@reviews.apache.org commented on HIVE-2125:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/665/
---


[jira] [Created] (HIVE-2133) DROP TABLE IF EXISTS should not fail if a view of that name exists

DROP TABLE IF EXISTS should not fail if a view of that name exists
--

 Key: HIVE-2133
 URL: https://issues.apache.org/jira/browse/HIVE-2133
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: John Sichi
Assignee: John Sichi


We should match MySQL behavior on this.  Likewise for DROP VIEW IF EXISTS when 
a table of that name exists.

Note that without IF EXISTS, we still want the statement to fail when the 
existing object type does not match.
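
The intended behavior can be written down as a small decision function. The Java sketch below is only an illustration: the names are hypothetical, and it is not Hive's DDL code. The case where no object of that name exists and IF EXISTS is absent is not discussed in this issue; the sketch treats it as a failure, which is an assumption.

```java
// Illustrative decision table for DROP TABLE / DROP VIEW; hypothetical names.
final class DropSemanticsSketch {
    enum Kind { TABLE, VIEW }
    enum Outcome { DROP, NO_OP, FAIL }

    // requested: the object type named in the statement (DROP TABLE or DROP VIEW).
    // existing:  the type of the object currently holding that name, or null
    //            if no such object exists.
    // ifExists:  whether the statement carries IF EXISTS.
    static Outcome decide(Kind requested, Kind existing, boolean ifExists) {
        if (existing == requested) {
            return Outcome.DROP;
        }
        // Either nothing has that name or the object is of the other type.
        // With IF EXISTS this is a silent no-op; without it the statement fails
        // (for the missing-object case this is an assumption of the sketch).
        return ifExists ? Outcome.NO_OP : Outcome.FAIL;
    }

    public static void main(String[] args) {
        System.out.println(decide(Kind.TABLE, Kind.VIEW, true));
        System.out.println(decide(Kind.TABLE, Kind.VIEW, false));
    }
}
```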



[jira] [Commented] (HIVE-2125) alter table concatenate fails and deletes data

2011-04-25 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13025005#comment-13025005
 ] 

jirapos...@reviews.apache.org commented on HIVE-2125:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/665/#review554
---



trunk/conf/hive-default.xml
https://reviews.apache.org/r/665/#comment1169

can you make the indentation consistent with the other property elements?



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
https://reviews.apache.org/r/665/#comment1170

So after adding this, will the block-level merge after INSERT OVERWRITE be 
automatically supported?


- Ning



[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures


 [ 
https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2127:
-

Attachment: HIVE-2127.2.patch

 Improve stats gathering reliability by retries on failures
 --

 Key: HIVE-2127
 URL: https://issues.apache.org/jira/browse/HIVE-2127
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-2127.2.patch, HIVE-2127.patch


 Stats publishing and aggregation try only once; if there is any exception, 
 they fail and return. If many mappers/reducers are updating stats at the same 
 time, lock timeouts are very common. We should make stats more reliable by 
 retrying when there is an SQLException.


Re: Review Request: HIVE-2127. Improve stats gathering reliability by retries on failures

2011-04-25 Thread Ning Zhang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/664/
---

(Updated 2011-04-25 23:42:52.373740)


Review request for hive.


Changes
---

This patch has the following changes:
 1) Changed Utilities.executeWithRetry() to take a SQLCommand helper class and 
a PreparedStatement, so that customized functions can be included in the retry 
logic.
 2) Defined Utilities.connectWithRetry() and Utilities.prepareWithRetry() with 
logic similar to executeWithRetry().
 3) The major change in the retry logic is that the Utilities.*withRetry() 
functions handle one subclass of SQLException: SQLTransientException. These are 
the exceptions that can be resolved by simply retrying after some time. Other 
types of SQLException (SQLNonTransientException and SQLRecoverableException) 
are thrown to the callers to handle.
 4) The callers (JDBCStatsAggregator/Publisher) handle SQLRecoverableException 
in a similar fashion to Utilities.*withRetry(), but they close the SQL 
connection first and wait some time before reopening the connection and 
retrying the SQL statements. 
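
The retry policy in item 3 can be sketched as follows. This is a minimal illustration, not the code in Utilities.java: the SqlWork interface is a hypothetical stand-in for the SQLCommand helper (whose real signature is not shown in this thread), and the fixed wait between attempts is a simplification.

```java
import java.sql.SQLException;
import java.sql.SQLTransientException;

// Minimal retry sketch; names are hypothetical, not the Hive Utilities API.
final class RetrySketch {
    // Hypothetical callback wrapping one SQL operation.
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    // Runs the work, retrying only on SQLTransientException and at most
    // maxRetries times. Every other SQLException (for example
    // SQLNonTransientException or SQLRecoverableException) propagates to
    // the caller on the first occurrence.
    static <T> T withRetry(SqlWork<T> work, int maxRetries, long waitMillis)
            throws SQLException {
        for (int failures = 0; ; failures++) {
            try {
                return work.run();
            } catch (SQLTransientException e) {
                if (failures >= maxRetries) {
                    throw e; // retries exhausted
                }
                try {
                    Thread.sleep(waitMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e; // stop retrying if the thread is interrupted
                }
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        int[] calls = {0};
        String result = withRetry(() -> {
            if (calls[0]++ < 2) {
                throw new SQLTransientException("lock timeout");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With maxRetries set to 0, the default mentioned for hive.stats.retries.max, this sketch makes a single attempt and rethrows the first transient failure.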


Summary
---

The major changes are:

 0) Two parameters are introduced: hive.stats.retries.max (default 0), the 
maximum number of retries on SQLException failures, and hive.stats.retries.wait 
(default 3 sec), the base time window (explained below) to wait before the 
next retry. 

 1) Introduced a couple of Utilities functions to execute SQL queries with 
retries on failures. One Utilities function determines the wait time based 
on the number of failures and a base wait window (the same as the one 
introduced in HDFS-767 for DFSClient to retry on BlockMissingExceptions). The 
actual wait time is baseWindow * failures + baseWindow * (failures + 1) * 
(random number in [0.0, 1.0]).

 2) Changed JDBCStatsAggregator.java to use PreparedStatement so that it can 
use executeWithRetries(). 

 3) Changed JDBCStatsPublisher.java and JDBCStatsAggregator.java to use 
retries on SQL connections and SQL executions. 
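
The wait-time formula from item 1 can be sketched directly in Java. The class and method names below are hypothetical, not the ones in Utilities.java. Note that java.util.Random.nextDouble() draws from [0.0, 1.0), so the upper bound is exclusive in this sketch.

```java
import java.util.Random;

// Sketch of the wait-time formula; names are hypothetical.
final class WaitWindowSketch {
    // waitTime = baseWindow * failures + baseWindow * (failures + 1) * r,
    // where r is a random number drawn from [0.0, 1.0).
    static long waitTimeMillis(long baseWindowMillis, int failures, Random rnd) {
        double fixedPart = (double) baseWindowMillis * failures;
        double randomPart = (double) baseWindowMillis * (failures + 1) * rnd.nextDouble();
        return (long) (fixedPart + randomPart);
    }

    public static void main(String[] args) {
        Random rnd = new Random();
        long base = 3000L; // the 3 sec default of hive.stats.retries.wait
        for (int failures = 0; failures < 3; failures++) {
            System.out.println("failures=" + failures + " wait(ms)="
                + waitTimeMillis(base, failures, rnd));
        }
    }
}
```

With a 3000 ms base window, the wait falls in [0, 3000) before any failure has been counted, [3000, 9000) after one failure, and [6000, 15000) after two, so both the minimum wait and the spread grow with the failure count. The random term keeps many mappers/reducers from retrying at the same instant.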


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1096632 
  trunk/conf/hive-default.xml 1096632 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1096632 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsAggregator.java 1096632 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/stats/jdbc/JDBCStatsPublisher.java 1096632 

Diff: https://reviews.apache.org/r/664/diff


Testing
---

Running unit tests. 


Thanks,

Ning

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures


 [ 
https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-2127:
-

Status: Patch Available  (was: Open)

Updated the review board. 


Re: Review Request: alter table concatenate fails and deletes data

2011-04-25 Thread Yongqiang He



 On 2011-04-25 23:20:59, Ning Zhang wrote:
  trunk/conf/hive-default.xml, line 1042
  https://reviews.apache.org/r/665/diff/1/?file=17317#file17317line1042
 
  can you make the indentation consistent with the other property 
  elements?

It shows the same indentation in my local copy, so it might just be a Review 
Board display issue.


 On 2011-04-25 23:20:59, Ning Zhang wrote:
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java, 
  line 1203
  https://reviews.apache.org/r/665/diff/1/?file=17321#file17321line1203
 
  So after adding this, will the block-level merge after INSERT OVERWRITE 
  be automatically supported?

No, it is not automatically supported. We still need to do some work there, but 
that is a separate issue.


- Yongqiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/665/#review554
---



[jira] [Commented] (HIVE-2125) alter table concatenate fails and deletes data

2011-04-25 Thread jirapos...@reviews.apache.org (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13025014#comment-13025014
 ] 

jirapos...@reviews.apache.org commented on HIVE-2125:
-



bq.  On 2011-04-25 23:20:59, Ning Zhang wrote:
bq.   trunk/conf/hive-default.xml, line 1042
bq.   https://reviews.apache.org/r/665/diff/1/?file=17317#file17317line1042
bq.  
bq.   can you make the indentation consistent with the other property elements?

It shows the same indentation in my local copy, so it might just be a Review 
Board display issue.


bq.  On 2011-04-25 23:20:59, Ning Zhang wrote:
bq.   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java, line 1203
bq.   https://reviews.apache.org/r/665/diff/1/?file=17321#file17321line1203
bq.  
bq.   So after adding this, will the block-level merge after INSERT OVERWRITE be automatically supported?

No, it is not automatically supported. We still need to do some work there, but 
that is a separate issue.


- Yongqiang


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/665/#review554
---



[jira] [Assigned] (HIVE-1433) move tables created in QTestUtil to a init file


 [ 
https://issues.apache.org/jira/browse/HIVE-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi reassigned HIVE-1433:


Assignee: Carl Steinbach  (was: John Sichi)

 move tables created in QTestUtil to a init file
 ---

 Key: HIVE-1433
 URL: https://issues.apache.org/jira/browse/HIVE-1433
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Namit Jain
Assignee: Carl Steinbach

 Followup for https://issues.apache.org/jira/browse/HIVE-1405


[jira] [Updated] (HIVE-2131) Bitmap Operation UDF doesn't clear return list


 [ 
https://issues.apache.org/jira/browse/HIVE-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-2131:
-

   Resolution: Fixed
Fix Version/s: 0.8.0
       Status: Resolved  (was: Patch Available)

Committed.  Thanks Marquis!


 Bitmap Operation UDF doesn't clear return list
 --

 Key: HIVE-2131
 URL: https://issues.apache.org/jira/browse/HIVE-2131
 Project: Hive
  Issue Type: Bug
Reporter: Marquis Wang
Assignee: Marquis Wang
 Fix For: 0.8.0

 Attachments: HIVE-2131.1.patch, HIVE-2131.2.patch


 The AbstractGenericUDFEWAHBitmapBop.java does not clear the return list when 
 evaluate() is called, causing each subsequent call to a bitmap operation to 
 return the wrong values.


[jira] [Commented] (HIVE-2126) Hive's symlink text input format should be able to work with ComineHiveInputFormat