[jira] [Resolved] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-07-21 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang resolved HIVE-2201.


Resolution: Fixed

Committed. Thanks, Siying!

 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, 
 HIVE-2201.4.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp as well as
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files (a rough sketch of the revised
 protocol follows below).
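
As a rough illustration only (not the actual HIVE-2201 patch), a minimal sketch of the revised protocol using Hadoop's FileSystem API might look like the following; the directory names tmp1/tmp2/tmp3 and the helper names are assumptions made for illustration:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch of the revised commit protocol described above.
public class TmpCommitSketch {
  public static void commitTask(FileSystem fs, Path tmp1, Path tmp2, int taskId)
      throws IOException {
    // 1. The task wrote its output to a _tmp file under tmp1.
    Path taskTmpFile = new Path(tmp1, "_tmp_" + taskId);
    // 2. At the end of the operator, rename it directly into tmp2, dropping the
    //    _tmp prefix (the old protocol renamed within tmp1 first).
    fs.rename(taskTmpFile, new Path(tmp2, String.valueOf(taskId)));
  }

  public static void commitJob(FileSystem fs, Path tmp2, Path tmp3) throws IOException {
    // 3. Move the whole directory tmp2 to tmp3 in a single name node call.
    fs.rename(tmp2, tmp3);
    // 4. Only duplicates (e.g. from speculative execution) remain to clean up;
    //    no _tmp files ever reach tmp3, so fewer listings and deletes are needed.
    for (FileStatus stat : fs.listStatus(tmp3)) {
      if (isDuplicate(stat.getPath())) {   // duplicate detection elided here
        fs.delete(stat.getPath(), false);
      }
    }
  }

  private static boolean isDuplicate(Path p) {
    return false; // placeholder: real logic compares task attempt IDs
  }
}
{code}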

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #838

2011-07-21 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/838/changes

Changes:

[heyongqiang] HIVE-2209: add support for map comparison in serde layer (Krishna 
Kumar via He Yongqiang)

--
[...truncated 20304 lines...]
[junit] #
[junit] Running org.apache.hadoop.hive.ql.parse.TestParseNegative
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.ql.parse.TestParseNegative FAILED 
(crashed)
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] #  SIGBUS (0x7) at pc=0xf76fd7a4, pid=19865, tid=4137978736
[junit] #
[junit] # JRE version: 6.0_20-b02
[junit] # Java VM: Java HotSpot(TM) Server VM (16.3-b01 mixed mode 
linux-x86 )
[junit] # Problematic frame:
[junit] # C  [libc.so.6+0x1117a4]
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ql/hs_err_pid19865.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] #   http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #
[junit] Running org.apache.hadoop.hive.ql.tool.TestLineageInfo
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.ql.tool.TestLineageInfo FAILED (crashed)
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] #  SIGBUS (0x7) at pc=0xf76fa7a4, pid=19872, tid=4137966448
[junit] #
[junit] # JRE version: 6.0_20-b02
[junit] # Java VM: Java HotSpot(TM) Server VM (16.3-b01 mixed mode 
linux-x86 )
[junit] # Problematic frame:
[junit] # C  [libc.so.6+0x1117a4]
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ql/hs_err_pid19872.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] #   http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #
[junit] Running org.apache.hadoop.hive.ql.udf.TestUDFDateAdd
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.ql.udf.TestUDFDateAdd FAILED (crashed)
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] #  SIGBUS (0x7) at pc=0xf76937a4, pid=19879, tid=4137544560
[junit] #
[junit] # JRE version: 6.0_20-b02
[junit] # Java VM: Java HotSpot(TM) Server VM (16.3-b01 mixed mode 
linux-x86 )
[junit] # Problematic frame:
[junit] # C  [libc.so.6+0x1117a4]
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ql/hs_err_pid19879.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] #   http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #
[junit] Running org.apache.hadoop.hive.ql.udf.TestUDFDateDiff
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.ql.udf.TestUDFDateDiff FAILED (crashed)
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] #  SIGBUS (0x7) at pc=0xf76db7a4, pid=19886, tid=4137839472
[junit] #
[junit] # JRE version: 6.0_20-b02
[junit] # Java VM: Java HotSpot(TM) Server VM (16.3-b01 mixed mode 
linux-x86 )
[junit] # Problematic frame:
[junit] # C  [libc.so.6+0x1117a4]
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ql/hs_err_pid19886.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] #   http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #
[junit] Running org.apache.hadoop.hive.ql.udf.TestUDFDateSub
[junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0 sec
[junit] Test org.apache.hadoop.hive.ql.udf.TestUDFDateSub FAILED (crashed)
[junit] #
[junit] # A fatal error has been detected by the Java Runtime Environment:
[junit] #
[junit] #  SIGBUS (0x7) at pc=0xf76857a4, pid=19893, tid=4137487216
[junit] #
[junit] # JRE version: 6.0_20-b02
[junit] # Java VM: Java HotSpot(TM) Server VM (16.3-b01 mixed mode 
linux-x86 )
[junit] # Problematic frame:
[junit] # C  [libc.so.6+0x1117a4]
[junit] #
[junit] # An error report file with more information is saved as:
[junit] # 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ql/hs_err_pid19893.log
[junit] #
[junit] # If you would like to submit a bug report, please visit:
[junit] #   http://java.sun.com/webapps/bugreport/crash.jsp
[junit] #

[jira] [Commented] (HIVE-2209) Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object

2011-07-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068838#comment-13068838
 ] 

Hudson commented on HIVE-2209:
--

Integrated in Hive-trunk-h0.21 #838 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/838/])
HIVE-2209: add support for map comparison in serde layer (Krishna Kumar via 
He Yongqiang)

heyongqiang : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149027
Files : 
* 
/hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestCrossMapEqualComparer.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java
* 
/hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestSimpleMapEqualComparer.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/SimpleMapEqualComparer.java
* 
/hive/trunk/serde/src/test/org/apache/hadoop/hive/serde2/objectinspector/TestFullMapEqualComparer.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/MapEqualComparer.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/FullMapEqualComparer.java
* 
/hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/CrossMapEqualComparer.java


 Provide a way by which ObjectInspectorUtils.compare can be extended by the 
 caller for comparing maps which are part of the object
 -

 Key: HIVE-2209
 URL: https://issues.apache.org/jira/browse/HIVE-2209
 Project: Hive
  Issue Type: Improvement
Reporter: Krishna Kumar
Assignee: Krishna Kumar
Priority: Minor
 Attachments: HIVE-2209v0.patch, HIVE-2209v2.patch, HIVE2209v1.patch


 Currently, ObjectInspectorUtils.compare throws an exception if a map is contained 
 (recursively) within the objects being compared. Two obvious implementations 
 are:
 - a simple map comparer, which assumes keys of the first map can be used to 
 fetch values from the second
 - a 'cross-product' comparer, which compares every pair of key-value pairs in 
 the two maps and declares a match if and only if all pairs are matched
 Note that it would be difficult to provide a transitive 
 greater-than/less-than indication with maps, so that is not in scope. A rough 
 sketch of the cross-product approach follows below.
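
For intuition only, here is a minimal, hedged sketch of the 'cross-product' idea on plain java.util.Map objects; the real patch works through ObjectInspectors, and the class and method names below are invented for illustration:
{code}
import java.util.Map;

// Illustrative only: equality-style comparison of two maps where key lookup
// across maps cannot be assumed (the cross-product approach described above).
public final class CrossMapEqualSketch {
  public static boolean mapsEqual(Map<?, ?> m1, Map<?, ?> m2) {
    if (m1.size() != m2.size()) {
      return false;
    }
    // For every entry in m1, search m2 for a matching key/value pair.
    for (Map.Entry<?, ?> e1 : m1.entrySet()) {
      boolean matched = false;
      for (Map.Entry<?, ?> e2 : m2.entrySet()) {
        if (equalsWithNulls(e1.getKey(), e2.getKey())
            && equalsWithNulls(e1.getValue(), e2.getValue())) {
          matched = true;
          break;
        }
      }
      if (!matched) {
        return false; // a match only if *all* pairs are matched
      }
    }
    return true;
  }

  private static boolean equalsWithNulls(Object a, Object b) {
    return (a == null) ? (b == null) : a.equals(b);
  }
}
{code}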

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #839

2011-07-21 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/839/changes

Changes:

[heyongqiang] HIVE-2201:reduce name node calls in hive by creating temporary 
directories (Siying Dong via He Yongqiang)

--
[...truncated 4330 lines...]
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java
AUql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIndex.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStringToMap.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNull.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/UDTFCollector.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFArrayContains.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPOr.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFEWAHBitmap.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/NumericHistogram.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFStruct.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMax.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPNotEqual.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVarianceSample.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFStd.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBridge.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBridge.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapEmpty.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUtils.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/NGramEstimator.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPAnd.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqual.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCoalesce.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPLessThan.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFField.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFElt.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFExplode.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFPercentileApprox.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFUnion.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFVariance.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/Collector.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrGreaterThan.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFInstr.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFWhen.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCollectSet.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTF.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPGreaterThan.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapAnd.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDFEWAHBitmapBop.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPEqualOrLessThan.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSize.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCovarianceSample.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseCompare.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFHistogramNumeric.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFConcatWS.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFCorrelation.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFEWAHBitmapOr.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFHash.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFResolver2.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFMin.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/AbstractGenericUDAFResolver.java
A 
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFSentences.java
A ql/src/java/org/apache/hadoop/hive/ql/udf/generic/package-info.java
A 

[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-07-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068839#comment-13068839
 ] 

Hudson commented on HIVE-2201:
--

Integrated in Hive-trunk-h0.21 #839 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/839/])
HIVE-2201:reduce name node calls in hive by creating temporary directories 
(Siying Dong via He Yongqiang)

heyongqiang : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149047
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileOutputFormat.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java


 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, 
 HIVE-2201.4.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp as well as
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2296) bad compressed file names from insert into

2011-07-21 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2296:
--

Fix Version/s: 0.8.0
   Status: Patch Available  (was: Open)

 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Fix For: 0.8.0

 Attachments: hive-2296.1.patch, hive-2296.2.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 in the new files under bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 This causes corrupted output when doing a SELECT * on the table.
 The correct behavior would be to pick a valid filename such as:
 00_0_copy_1.gz
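
As an illustration of the intended naming only (not the actual hive-2296 patch), a small helper that inserts the _copy_N suffix before the compression extension might look like this; the class and method names are hypothetical:
{code}
// Hypothetical helper, not the actual HIVE-2296 patch: insert the _copy_N
// suffix before the compression extension rather than after it.
public class CopyFileNameSketch {
  public static String makeCopyName(String fileName, int copyNumber) {
    int dot = fileName.lastIndexOf('.');
    if (dot < 0) {
      return fileName + "_copy_" + copyNumber; // uncompressed: just append
    }
    String base = fileName.substring(0, dot);    // e.g. "00_0"
    String ext = fileName.substring(dot);        // e.g. ".gz"
    return base + "_copy_" + copyNumber + ext;   // "00_0_copy_1.gz"
  }

  public static void main(String[] args) {
    System.out.println(makeCopyName("00_0.gz", 1)); // prints 00_0_copy_1.gz
  }
}
{code}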

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Work stopped] (HIVE-2296) bad compressed file names from insert into

2011-07-21 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-2296 stopped by Franklin Hu.

 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Fix For: 0.8.0

 Attachments: hive-2296.1.patch, hive-2296.2.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 in the new files under bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 This causes corrupted output when doing a SELECT * on the table.
 The correct behavior would be to pick a valid filename such as:
 00_0_copy_1.gz

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: Cli: Print Hadoop's CPU milliseconds

2011-07-21 Thread Siying Dong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

(Updated 2011-07-21 17:30:55.228025)


Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Changes
---

fix a bug


Summary
---

In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.


This addresses bug HIVE-2236.
https://issues.apache.org/jira/browse/HIVE-2236


Diffs (updated)
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 
1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1148623 

Diff: https://reviews.apache.org/r/948/diff


Testing
---

Ran the updated code against real clusters and made sure the printing is 
correct.


Thanks,

Siying



[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-07-21 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2236:
--

Attachment: HIVE-2236.3.patch

fix a bug

 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch, HIVE-2236.2.patch, HIVE-2236.3.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out to the Hive CLI when executing a job will help users know more about their 
 jobs.
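
For orientation only (the actual HIVE-2236 patch routes this through HadoopJobExecHelper and the new MapRedStats class listed in the diff), a minimal sketch of reading the CPU time counter from a finished job with the old mapred API might look like this; the counter group and name strings are assumptions and may differ across Hadoop versions:
{code}
import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.RunningJob;

// Illustrative sketch: pull CPU milliseconds out of a job's counters and print it.
public class CpuMillisSketch {
  // Assumed counter group/name; e.g. Hadoop 0.20-era jobs expose
  // CPU_MILLISECONDS under the "org.apache.hadoop.mapred.Task$Counter" group.
  private static final String GROUP = "org.apache.hadoop.mapred.Task$Counter";
  private static final String NAME = "CPU_MILLISECONDS";

  public static void printCpuMsec(RunningJob job) throws Exception {
    Counters counters = job.getCounters();
    if (counters == null) {
      return; // counters may be unavailable, e.g. for retired jobs
    }
    long cpuMsec = counters.findCounter(GROUP, NAME).getCounter();
    System.out.println("Total cumulative CPU time: " + cpuMsec + " msec");
  }
}
{code}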

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-07-21 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2236:
--

Status: Open  (was: Patch Available)

 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch, HIVE-2236.2.patch, HIVE-2236.3.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out to the Hive CLI when executing a job will help users know more about their 
 jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-07-21 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2236:
--

Status: Patch Available  (was: Open)

 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch, HIVE-2236.2.patch, HIVE-2236.3.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out to the Hive CLI when executing a job will help users know more about their 
 jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-07-21 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069094#comment-13069094
 ] 

jirapos...@reviews.apache.org commented on HIVE-2236:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

(Updated 2011-07-21 17:30:55.228025)


Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Changes
---

fix a bug


Summary
---

In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.


This addresses bug HIVE-2236.
https://issues.apache.org/jira/browse/HIVE-2236


Diffs (updated)
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 
1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1148623 

Diff: https://reviews.apache.org/r/948/diff


Testing
---

Ran the updated code against real clusters and made sure the printing is 
correct.


Thanks,

Siying



 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch, HIVE-2236.2.patch, HIVE-2236.3.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out to the Hive CLI when executing a job will help users know more about their 
 jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2285) Pretty-print output of DESCRIBE TABLE EXTENDED

2011-07-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069117#comment-13069117
 ] 

Carl Steinbach commented on HIVE-2285:
--

Formatted output should be the default behavior of DESCRIBE TABLE EXTENDED, but 
due to backwards-compatibility concerns DESCRIBE FORMATTED was introduced 
instead. In this ticket I'm proposing that we introduce a configuration 
variable, hive.formatted.describe.extended, which, when set to true, will cause 
the output of DESCRIBE EXTENDED to be formatted.
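
For illustration only, and assuming the proposed variable keeps the name above, the describe path could branch on the flag roughly as follows; the formatter call here is a placeholder, not a real Hive API:
{code}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch of how DESCRIBE EXTENDED output could branch on the
// proposed flag; prettyPrint() stands in for the DESCRIBE FORMATTED renderer.
public class DescribeExtendedSketch {
  public static String describeExtended(Configuration conf, String rawOutput) {
    // Proposed variable from the comment above; default stays unformatted
    // to preserve backwards compatibility.
    boolean pretty = conf.getBoolean("hive.formatted.describe.extended", false);
    return pretty ? prettyPrint(rawOutput) : rawOutput;
  }

  private static String prettyPrint(String raw) {
    // Placeholder formatting only, for the sake of a runnable example.
    return raw.replace(", ", ",\n  ");
  }
}
{code}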

 Pretty-print output of DESCRIBE TABLE EXTENDED
 --

 Key: HIVE-2285
 URL: https://issues.apache.org/jira/browse/HIVE-2285
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Reporter: Carl Steinbach
Assignee: Carl Steinbach



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION

2011-07-21 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069126#comment-13069126
 ] 

Siying Dong commented on HIVE-2247:
---

I'm looking at the patch. Please test backward compatibility between the old 
server with a new client, and the new server with an old client. Please come by 
if you don't know how to test it.

 ALTER TABLE RENAME PARTITION
 

 Key: HIVE-2247
 URL: https://issues.apache.org/jira/browse/HIVE-2247
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Weiyan Wang
 Attachments: HIVE-2247.3.patch.txt, HIVE-2247.4.patch.txt, 
 HIVE-2247.5.patch.txt


 We need an ALTER TABLE RENAME PARTITION function that is similar to ALTER 
 TABLE RENAME.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2139:
-

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Patrick!

 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like "set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2" every time. It's not convenient for 
 separating the configuration between the server side and the client side.
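
For orientation only: the committed patch routes argument handling through the new CommonCliOptions class listed in the commit, so the snippet below is a simplified, hypothetical stand-in that just shows the idea of accepting -hiveconf key=value pairs instead of treating every argument as the port number:
{code}
import java.util.Properties;

// Simplified, hypothetical sketch: collect "-hiveconf key=value" overrides and
// keep treating any remaining plain argument as the Thrift port.
public class HiveServerArgsSketch {
  public static void main(String[] args) {
    Properties overrides = new Properties();
    int port = 10000; // assumed default Thrift port for the example
    for (int i = 0; i < args.length; i++) {
      if ("-hiveconf".equals(args[i]) && i + 1 < args.length) {
        String[] kv = args[++i].split("=", 2);
        if (kv.length == 2) {
          overrides.setProperty(kv[0], kv[1]);
        }
      } else {
        port = Integer.parseInt(args[i]); // plain argument: the port, as before
      }
    }
    System.out.println("port=" + port + ", overrides=" + overrides);
    // A real server would copy the overrides into HiveConf before starting Thrift.
  }
}
{code}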

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Attachment: fix_npe.patch

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069134#comment-13069134
 ] 

Vaibhav Aggarwal commented on HIVE-2297:


Some file systems can return null if there are no objects to list.
Added a fix for that.
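
For illustration only (the exact code in ConditionalResolverSkewJoin is not reproduced here), the defensive pattern amounts to a null check around FileSystem.listStatus, roughly:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch: some FileSystem implementations return null (rather than
// an empty array) when a directory has nothing to list.
public class ListStatusGuardSketch {
  public static long totalLength(FileSystem fs, Path dir) throws IOException {
    FileStatus[] statuses = fs.listStatus(dir);
    if (statuses == null) {
      return 0; // nothing to list; avoid the NullPointerException
    }
    long total = 0;
    for (FileStatus status : statuses) {
      total += status.getLen();
    }
    return total;
  }
}
{code}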

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)
Fix NPE in ConditionalResolverSkewJoin
--

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Status: Patch Available  (was: Open)

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: fix_npe.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Attachment: HIVE-2298.patch

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch


 UDAFPercentile, when passed a null percentile list, will throw a null pointer 
 exception.
 Submitting a small fix for that.
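
As a rough illustration only (not the actual patch), the guard amounts to skipping the aggregation step when the percentile list is null; the class and method names below are hypothetical:
{code}
import java.util.List;

// Hypothetical sketch of the null guard: tolerate a null or empty percentile
// list instead of dereferencing it and throwing a NullPointerException.
public class PercentileNullGuardSketch {
  public static void iterate(List<Double> percentiles, long value) {
    if (percentiles == null || percentiles.isEmpty()) {
      return; // nothing requested; treat the row as a no-op
    }
    for (Double p : percentiles) {
      if (p == null || p < 0.0 || p > 1.0) {
        throw new IllegalArgumentException("percentile must be in [0, 1]: " + p);
      }
      // ... accumulate `value` into the histogram for percentile p ...
    }
  }
}
{code}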

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-21 Thread Vaibhav Aggarwal (JIRA)
Fix UDAFPercentile to tolerate null percentiles
---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch

UDAFPercentile, when passed a null percentile list, will throw a null pointer 
exception.
Submitting a small fix for that.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2298) Fix UDAFPercentile to tolerate null percentiles

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2298:
---

Status: Patch Available  (was: Open)

 Fix UDAFPercentile to tolerate null percentiles
 ---

 Key: HIVE-2298
 URL: https://issues.apache.org/jira/browse/HIVE-2298
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2298.patch


 UDAFPercentile, when passed a null percentile list, will throw a null pointer 
 exception.
 Submitting a small fix for that.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)
Optimize Hive query startup time for multiple partitions


 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal


Added an optimization to the way input splits are computed.
Reduced an O(n^2) operation to an O(n) operation.
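
The ticket does not spell out the change, but the usual shape of such a fix is replacing a nested scan (for each input path, scan all partition descriptors) with a one-pass hash index; a generic, hypothetical sketch, not the actual HIVE-2299 patch:
{code}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Generic, hypothetical sketch of an O(n^2) -> O(n) rewrite for split setup:
// rather than scanning every (path, descriptor) pair with a nested loop,
// index the descriptors by path once and do a constant-time lookup per split.
public class SplitLookupSketch {

  static final class PartDesc {
    final String path;
    final String alias;
    PartDesc(String path, String alias) { this.path = path; this.alias = alias; }
  }

  // O(n^2) overall: for every split path, linearly search all descriptors.
  static String aliasForSlow(String splitPath, List<PartDesc> parts) {
    for (PartDesc p : parts) {
      if (p.path.equals(splitPath)) {
        return p.alias;
      }
    }
    return null;
  }

  // O(n) overall: build the index once, then each lookup is O(1).
  static Map<String, String> indexByPath(List<PartDesc> parts) {
    Map<String, String> index = new HashMap<String, String>();
    for (PartDesc p : parts) {
      index.put(p.path, p.alias);
    }
    return index;
  }
}
{code}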

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2299:
---

Description: 
Added an optimization to the way input splits are computed.
Reduced an O(n^2) operation to O n operation.

  was:
Added an optimization to the way input splits are computed.
Reduced an O(n^2) operation to O(n) operation.


 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal

 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to an O(n) operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2297) Fix NPE in ConditionalResolverSkewJoin

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2297:
---

Attachment: HIVE-2297.patch

 Fix NPE in ConditionalResolverSkewJoin
 --

 Key: HIVE-2297
 URL: https://issues.apache.org/jira/browse/HIVE-2297
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2297.patch, fix_npe.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2299:
---

Attachment: HIVE-2299.patch

 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal
 Attachments: HIVE-2299.patch


 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to an O(n) operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2299) Optimize Hive query startup time for multiple partitions

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-2299:
---

Assignee: Vaibhav Aggarwal
  Status: Patch Available  (was: Open)

 Optimize Hive query startup time for multiple partitions
 

 Key: HIVE-2299
 URL: https://issues.apache.org/jira/browse/HIVE-2299
 Project: Hive
  Issue Type: Improvement
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-2299.patch


 Added an optimization to the way input splits are computed.
 Reduced an O(n^2) operation to an O(n) operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION

2011-07-21 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069146#comment-13069146
 ] 

jirapos...@reviews.apache.org commented on HIVE-2247:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1105/#review1156
---


Please try to add the new column in the middle first. If that works, we should 
do it that way to make it consistent with the alter_table() call. If that 
doesn't work, it's OK to add it to the end for now.


trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/1105/#comment2385

Why do we still need another function call, rename_partition_core()? Can't we 
just modify alter_partition_core() to always use the same logic?


- Siying


On 2011-07-21 01:20:25, Weiyan Wang wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1105/
bq.  ---
bq.  
bq.  (Updated 2011-07-21 01:20:25)
bq.  
bq.  
bq.  Review request for Siying Dong.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Implement an ALTER TABLE PARTITION RENAME function to rename a partition. 
bq.  Add HiveQL syntax: ALTER TABLE bar PARTITION (k1='v1', k2='v2') RENAME TO 
PARTITION (k1='v3', k2='v4');
bq.  This is my first Hive diff; I just learned everything from the existing 
codebase and may not have a good understanding of it. 
bq.  Feel free to let me know if I got something wrong. Thanks.
bq.  
bq.  
bq.  This addresses bug HIVE-2247.
bq.  https://issues.apache.org/jira/browse/HIVE-2247
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.trunk/metastore/if/hive_metastore.thrift 1145366 
bq.trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1145366 
bq.trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1145366 
bq.
trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 
1145366 
bq.
trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1145366 
bq.
trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 
1145366 
bq.
trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 
1145366 
bq.
trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
1145366 
bq.trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1145366 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
1145366 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1145366 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1145366 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1145366 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1145366 
bq.trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1145366 
bq.
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1145366 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1145366 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1145366 
bq.
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1145366 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1145366 
bq.
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
1145366 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 
1145366 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1145366 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 
1145366 
bq.
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java 
PRE-CREATION 
bq.
trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure.q 
PRE-CREATION 
bq.
trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure2.q 
PRE-CREATION 
bq.
trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure3.q 
PRE-CREATION 
bq.trunk/ql/src/test/queries/clientpositive/alter_rename_partition.q 
PRE-CREATION 
bq.
trunk/ql/src/test/queries/clientpositive/alter_rename_partition_authorization.q 
PRE-CREATION 
bq.
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out 
PRE-CREATION 
bq.
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out 
PRE-CREATION 
bq.
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out 
PRE-CREATION 
bq.

[jira] [Updated] (HIVE-2086) Add test coverage for external table data loss issue

2011-07-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2086:
-

Summary: Add test coverage for external table data loss issue  (was: Data 
loss with external table)

 Add test coverage for external table data loss issue
 

 Key: HIVE-2086
 URL: https://issues.apache.org/jira/browse/HIVE-2086
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
 Environment: Amazon Elastic MapReduce cluster
Reporter: Q Long
Assignee: Jonathan Natkins
 Attachments: HIVE-2086.1.patch, HIVE-2086.2.patch, HIVE-2086.3.patch, 
 create_like.q.out


 Data loss when using a "create external table like" statement. 
 1) Set up an external table S pointing to location L. Populate data in S.
 2) Create another external table T, using a statement like this:
 create external table T like S location L
 Make sure table T points to the same location as the original table S.
 3) Query table T and see the same set of data as in S.
 4) Drop table T.
 5) Querying table S will now return nothing, and location L is deleted. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2086) Add test coverage for external table data loss issue

2011-07-21 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2086:
-

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks Natty!

 Add test coverage for external table data loss issue
 

 Key: HIVE-2086
 URL: https://issues.apache.org/jira/browse/HIVE-2086
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
 Environment: Amazon Elastic MapReduce cluster
Reporter: Q Long
Assignee: Jonathan Natkins
 Fix For: 0.8.0

 Attachments: HIVE-2086.1.patch, HIVE-2086.2.patch, HIVE-2086.3.patch, 
 create_like.q.out


 Data loss when using a "create external table like" statement. 
 1) Set up an external table S pointing to location L. Populate data in S.
 2) Create another external table T, using a statement like this:
 create external table T like S location L
 Make sure table T points to the same location as the original table S.
 3) Query table T and see the same set of data as in S.
 4) Drop table T.
 5) Querying table S will now return nothing, and location L is deleted. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #840

2011-07-21 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/840/

--
[...truncated 36020 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/jenkins/hive_2011-07-21_12-25-43_113_324537898771116411/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-07-21 12:25:46,011 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2011-07-21_12-25-43_113_324537898771116411/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201107211225_1946650877.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/data/files/kv1.txt' 
into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
file:/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/data/files/kv1.txt' 
into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/jenkins/hive_2011-07-21_12-25-47_339_5931340635384748549/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/jenkins/hive_2011-07-21_12-25-47_339_5931340635384748549/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build/service/tmp/hive_job_log_jenkins_201107211225_522256161.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] 

Build failed in Jenkins: Hive-trunk-h0.21 #841

2011-07-21 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/841/changes

Changes:

[cws] HIVE-2086. Add test coverage for external table data loss issue (Jonathan 
Natkins via cws)

[cws] HIVE-2139. Enable HiveServer to accept -hiveconf option (Patrick Hunt via 
cws)

--
[...truncated 14342 lines...]
[junit] o[2] class = class java.util.HashMap
[junit] o = [234, [firstString, secondString], {firstKey=1, secondKey=2}, 
-234, 1.0, -2.5]
[junit] Testing protocol: org.apache.thrift.protocol.TBinaryProtocol
[junit] TypeName = 
struct_hello:int,2bye:arraystring,another:mapstring,int,nhello:int,d:double,nd:double
[junit] bytes 
=x08xffxffx00x00x00xeax0fxffxfex0bx00x00x00x02x00x00x00x0bx66x69x72x73x74x53x74x72x69x6ex67x00x00x00x0cx73x65x63x6fx6ex64x53x74x72x69x6ex67x0dxffxfdx0bx08x00x00x00x02x00x00x00x08x66x69x72x73x74x4bx65x79x00x00x00x01x00x00x00x09x73x65x63x6fx6ex64x4bx65x79x00x00x00x02x08xffxfcxffxffxffx16x04xffxfbx3fxf0x00x00x00x00x00x00x04xffxfaxc0x04x00x00x00x00x00x00x00
[junit] o class = class java.util.ArrayList
[junit] o size = 6
[junit] o[0] class = class java.lang.Integer
[junit] o[1] class = class java.util.ArrayList
[junit] o[2] class = class java.util.HashMap
[junit] o = [234, [firstString, secondString], {firstKey=1, secondKey=2}, 
-234, 1.0, -2.5]
[junit] Testing protocol: org.apache.thrift.protocol.TJSONProtocol
[junit] TypeName = 
struct_hello:int,2bye:arraystring,another:mapstring,int,nhello:int,d:double,nd:double
[junit] bytes 
=x7bx22x2dx31x22x3ax7bx22x69x33x32x22x3ax32x33x34x7dx2cx22x2dx32x22x3ax7bx22x6cx73x74x22x3ax5bx22x73x74x72x22x2cx32x2cx22x66x69x72x73x74x53x74x72x69x6ex67x22x2cx22x73x65x63x6fx6ex64x53x74x72x69x6ex67x22x5dx7dx2cx22x2dx33x22x3ax7bx22x6dx61x70x22x3ax5bx22x73x74x72x22x2cx22x69x33x32x22x2cx32x2cx7bx22x66x69x72x73x74x4bx65x79x22x3ax31x2cx22x73x65x63x6fx6ex64x4bx65x79x22x3ax32x7dx5dx7dx2cx22x2dx34x22x3ax7bx22x69x33x32x22x3ax2dx32x33x34x7dx2cx22x2dx35x22x3ax7bx22x64x62x6cx22x3ax31x2ex30x7dx2cx22x2dx36x22x3ax7bx22x64x62x6cx22x3ax2dx32x2ex35x7dx7d
[junit] bytes in text 
={-1:{i32:234},-2:{lst:[str,2,firstString,secondString]},-3:{map:[str,i32,2,{firstKey:1,secondKey:2}]},-4:{i32:-234},-5:{dbl:1.0},-6:{dbl:-2.5}}
[junit] o class = class java.util.ArrayList
[junit] o size = 6
[junit] o[0] class = class java.lang.Integer
[junit] o[1] class = class java.util.ArrayList
[junit] o[2] class = class java.util.HashMap
[junit] o = [234, [firstString, secondString], {firstKey=1, secondKey=2}, 
-234, 1.0, -2.5]
[junit] Testing protocol: 
org.apache.hadoop.hive.serde2.thrift.TCTLSeparatedProtocol
[junit] TypeName = 
struct_hello:int,2bye:arraystring,another:mapstring,int,nhello:int,d:double,nd:double
[junit] bytes 
=x32x33x34x01x66x69x72x73x74x53x74x72x69x6ex67x02x73x65x63x6fx6ex64x53x74x72x69x6ex67x01x66x69x72x73x74x4bx65x79x03x31x02x73x65x63x6fx6ex64x4bx65x79x03x32x01x2dx32x33x34x01x31x2ex30x01x2dx32x2ex35
[junit] bytes in text 
=234firstStringsecondStringfirstKey1secondKey2-2341.0-2.5
[junit] o class = class java.util.ArrayList
[junit] o size = 6
[junit] o[0] class = class java.lang.Integer
[junit] o[1] class = class java.util.ArrayList
[junit] o[2] class = class java.util.HashMap
[junit] o = [234, [firstString, secondString], {firstKey=1, secondKey=2}, 
-234, 1.0, -2.5]
[junit] Beginning Test testTBinarySortableProtocol:
[junit] Testing struct test { double hello}
[junit] Testing struct test { i32 hello}
[junit] Testing struct test { i64 hello}
[junit] Testing struct test { string hello}
[junit] Testing struct test { string hello, double another}
[junit] Test testTBinarySortableProtocol passed!
[junit] bytes in text =234  firstStringsecondString
firstKey1secondKey2
[junit] compare to=234  firstStringsecondString
firstKey1secondKey2
[junit] o class = class java.util.ArrayList
[junit] o size = 3
[junit] o[0] class = class java.lang.Integer
[junit] o[1] class = class java.util.ArrayList
[junit] o[2] class = class java.util.HashMap
[junit] o = [234, [firstString, secondString], {firstKey=1, secondKey=2}]
[junit] bytes in text =234  firstStringsecondString
firstKey1secondKey2
[junit] compare to=234  firstStringsecondString
firstKey1secondKey2
[junit] o class = class java.util.ArrayList
[junit] o size = 3
[junit] o = [234, null, {firstKey=1, secondKey=2}]
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.305 sec
[junit] Running org.apache.hadoop.hive.serde2.lazy.TestLazyArrayMapStruct
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.169 sec
[junit] Running org.apache.hadoop.hive.serde2.lazy.TestLazyPrimitive
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.146 sec
[junit] Running org.apache.hadoop.hive.serde2.lazy.TestLazySimpleSerDe

[jira] [Commented] (HIVE-2086) Add test coverage for external table data loss issue

2011-07-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069164#comment-13069164
 ] 

Hudson commented on HIVE-2086:
--

Integrated in Hive-trunk-h0.21 #841 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/841/])
HIVE-2086. Add test coverage for external table data loss issue (Jonathan 
Natkins via cws)

cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149331
Files : 
* /hive/trunk/data/files/ext_test
* /hive/trunk/ql/src/test/queries/clientpositive/create_like.q
* /hive/trunk/ql/src/test/results/clientpositive/create_like.q.out
* /hive/trunk/data/files/ext_test/test.dat
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/QTestUtil.java
* /hive/trunk/build-common.xml


 Add test coverage for external table data loss issue
 

 Key: HIVE-2086
 URL: https://issues.apache.org/jira/browse/HIVE-2086
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.7.0
 Environment: Amazon Elastic MapReduce cluster
Reporter: Q Long
Assignee: Jonathan Natkins
 Fix For: 0.8.0

 Attachments: HIVE-2086.1.patch, HIVE-2086.2.patch, HIVE-2086.3.patch, 
 create_like.q.out


 Data loss when using a "create external table like" statement. 
 1) Set up an external table S pointing to location L. Populate data in S.
 2) Create another external table T, using a statement like this:
 create external table T like S location L
 Make sure table T points to the same location as the original table S.
 3) Query table T and see the same set of data as in S.
 4) Drop table T.
 5) Querying table S will now return nothing, and location L is deleted. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069163#comment-13069163
 ] 

Hudson commented on HIVE-2139:
--

Integrated in Hive-trunk-h0.21 #841 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/841/])
HIVE-2139. Enable HiveServer to accept -hiveconf option (Patrick Hunt via 
cws)

cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149311
Files : 
* /hive/trunk/common/src/java/org/apache/hadoop/hive/common/LogUtils.java
* /hive/trunk/hwi/src/java/org/apache/hadoop/hive/hwi/HWISessionItem.java
* /hive/trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java
* 
/hive/trunk/common/src/java/org/apache/hadoop/hive/common/cli/CommonCliOptions.java
* /hive/trunk/metastore/ivy.xml
* /hive/trunk/bin/ext/metastore.sh
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
* /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java
* /hive/trunk/common/ivy.xml
* /hive/trunk/common/build.xml
* 
/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
* /hive/trunk/common/src/java/org/apache/hadoop/hive/common/cli
* /hive/trunk/bin/ext/hiveserver.sh


 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2 every time. It's not convenient for 
 separating the configuration between server-side and client-side.
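 For reference, a sketch of the two styles this separates (quorum hosts as in 
 the report; only the first command is enabled by this change):
 {code}
 # server-side configuration, passed once at startup (works after HIVE-2139):
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2

 # client-side workaround used before the fix: every session must first run
 #   set hbase.zookeeper.quorum=hdp0,hdp1,hdp2;
 {code}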

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-21 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069184#comment-13069184
 ] 

Patrick Hunt commented on HIVE-2139:


Should I update the docs for this? If so, where, and are there any guidelines 
for doing so (for example, how to note version differences)?

 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2 every time. It's not convenient for 
 separating the configuration between server-side and client-side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2300) Move TempStatsStore directory to build/test

2011-07-21 Thread Carl Steinbach (JIRA)
Move TempStatsStore directory to build/test
---

 Key: HIVE-2300
 URL: https://issues.apache.org/jira/browse/HIVE-2300
 Project: Hive
  Issue Type: Bug
  Components: Statistics, Testing Infrastructure
Reporter: Carl Steinbach




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HIVE-1604) Patch to allow variables in Hive

2011-07-21 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal reassigned HIVE-1604:
--

Assignee: Vaibhav Aggarwal

 Patch to allow variables in Hive
 

 Key: HIVE-1604
 URL: https://issues.apache.org/jira/browse/HIVE-1604
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-1604.patch


 A patch to Hive which allows command-line substitution.
 The patch modifies the Hive command line driver and options processor to 
 support the following arguments:
 hive [-d key=value] [-define key=value]
   -d        Substitution to apply to the script
   -define   Substitution to apply to the script
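 A hypothetical usage sketch, assuming substituted keys are referenced as 
 ${key} inside the script (file, key, and table names are illustrative):
 {code}
 hive -d DT=2011-07-21 -f daily_report.q

 -- daily_report.q
 SELECT COUNT(*) FROM page_views WHERE dt = '${DT}';
 {code}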

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-21 Thread Charles Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-1078:
---

Status: Open  (was: Patch Available)

Fails create_or_replace_view.q

 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.
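 A minimal sketch of the combined form (view name and definition are 
 illustrative):
 {code}
 -- replaces both the DROP VIEW and the CREATE VIEW above in one statement
 CREATE OR REPLACE VIEW v AS
 SELECT key, value FROM src WHERE key > 100;
 {code}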

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-1078: CREATE VIEW followup: CREATE OR REPLACE

2011-07-21 Thread Charles Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1058/
---

(Updated 2011-07-21 22:07:29.150219)


Review request for hive.


Changes
---

Fix failure of create_or_replace_view.q


Summary
---

https://issues.apache.org/jira/browse/HIVE-1078


This addresses bug HIVE-1078.
https://issues.apache.org/jira/browse/HIVE-1078


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view6.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view7.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view8.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/recursive_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view6.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view7.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view8.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out
 1146902 

Diff: https://reviews.apache.org/r/1058/diff


Testing
---

Passes unit tests


Thanks,

Charles



[jira] [Commented] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-21 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069240#comment-13069240
 ] 

jirapos...@reviews.apache.org commented on HIVE-1078:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1058/
---

(Updated 2011-07-21 22:07:29.150219)


Review request for hive.


Changes
---

Fix failure of create_or_replace_view.q


Summary
---

https://issues.apache.org/jira/browse/HIVE-1078


This addresses bug HIVE-1078.
https://issues.apache.org/jira/browse/HIVE-1078


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view6.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view7.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view8.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/recursive_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view6.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view7.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view8.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out
 1146902 

Diff: https://reviews.apache.org/r/1058/diff


Testing
---

Passes unit tests


Thanks,

Charles



 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch, HIVE-1078v9.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: 

[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-21 Thread Charles Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-1078:
---

Attachment: HIVE-1078v9.patch

 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch, HIVE-1078v9.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-21 Thread Charles Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-1078:
---

Status: Patch Available  (was: Open)

 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch, HIVE-1078v9.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069245#comment-13069245
 ] 

Carl Steinbach commented on HIVE-2139:
--

I think doc updates should go here:

https://cwiki.apache.org/confluence/display/Hive/HiveServer

 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2 every time. It's not convenient for 
 separating the configuration between server-side and client-side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-21 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069251#comment-13069251
 ] 

Patrick Hunt commented on HIVE-2139:


Ok, I'll update that. Do you all have some common way to handle documenting the 
fact that pre-0.8.0 it's one way, post 0.8.0 it's another? Is there an example 
you can point me to?

 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2 every time. It's not convenient for 
 separating the configuration between server-side and client-side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2128) Automatic Indexing with multiple tables

2011-07-21 Thread Syed S. Albiz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed S. Albiz updated HIVE-2128:


Attachment: HIVE-2128.6.patch

 Automatic Indexing with multiple tables
 ---

 Key: HIVE-2128
 URL: https://issues.apache.org/jira/browse/HIVE-2128
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Attachments: HIVE-2128.1.patch, HIVE-2128.1.patch, HIVE-2128.2.patch, 
 HIVE-2128.4.patch, HIVE-2128.5.patch, HIVE-2128.6.patch


 Make automatic indexing work with jobs which access multiple tables.  We'll 
 probably need to modify the way that the index input format works in order to 
 associate index formats/files with specific tables.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2128) Automatic Indexing with multiple tables

2011-07-21 Thread Syed S. Albiz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed S. Albiz updated HIVE-2128:


Status: Patch Available  (was: Open)

 Automatic Indexing with multiple tables
 ---

 Key: HIVE-2128
 URL: https://issues.apache.org/jira/browse/HIVE-2128
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Attachments: HIVE-2128.1.patch, HIVE-2128.1.patch, HIVE-2128.2.patch, 
 HIVE-2128.4.patch, HIVE-2128.5.patch, HIVE-2128.6.patch


 Make automatic indexing work with jobs which access multiple tables.  We'll 
 probably need to modify the way that the index input format works in order to 
 associate index formats/files with specific tables.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2128) Automatic Indexing with multiple tables

2011-07-21 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069299#comment-13069299
 ] 

jirapos...@reviews.apache.org commented on HIVE-2128:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1010/
---

(Updated 2011-07-21 23:52:23.929900)


Review request for hive and John Sichi.


Changes
---

Added ORDER BY to the test cases. This revealed an existing bug where we would 
walk the entire operator tree for each task in the task tree in 
IndexWhereTaskDispatcher. I amended this to walk only the subset of the 
operator tree in the current task.


Summary
---

During optimized query generation, grab the indexed tables and their associated 
path URIs, and keep those around in the Configuration object. When the job is 
passed to ExecDriver, this data is extracted and used in HiveIndexedInputFormat 
to decide whether to use the index file or to delegate to the parent 
(HiveInputFormat) class. Not sure if this is robust. 
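For illustration, a hedged sketch of the kind of multi-table query this targets, 
using the compact index handler named in the diffs (table names, index names, 
and the filter values are illustrative):

{code}
CREATE INDEX src_key_idx ON TABLE src (key)
  AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
  WITH DEFERRED REBUILD;
ALTER INDEX src_key_idx ON src REBUILD;

CREATE INDEX srcpart_key_idx ON TABLE srcpart (key)
  AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
  WITH DEFERRED REBUILD;
ALTER INDEX srcpart_key_idx ON srcpart REBUILD;

-- enable automatic use of indexes during query rewriting
SET hive.optimize.index.filter=true;

-- a join over two indexed tables: the input format must now associate
-- index files with the specific table each split belongs to
SELECT a.key, b.value
FROM src a JOIN srcpart b ON (a.key = b.key)
WHERE a.key > 80 AND b.key > 80
ORDER BY a.key;
{code}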


This addresses bug HIVE-2128.
https://issues.apache.org/jira/browse/HIVE-2128


Diffs (updated)
-

  ql/src/test/results/clientpositive/index_bitmap_auto_partitioned.q.out 
4c9efd1 
  ql/src/test/results/clientpositive/index_auto_self_join.q.out PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexResult.java b9b586e 
  ql/src/java/org/apache/hadoop/hive/ql/index/HiveIndexedInputFormat.java 
f1ee95d 
  ql/src/java/org/apache/hadoop/hive/ql/index/bitmap/BitmapIndexHandler.java 
61bbbf5 
  ql/src/java/org/apache/hadoop/hive/ql/index/compact/CompactIndexHandler.java 
7c91946 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereProcessor.java
 dbc489f 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/index/IndexWhereTaskDispatcher.java
 da084f6 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java a03a9a6 
  ql/src/test/queries/clientpositive/index_auto_mult_tables.q PRE-CREATION 
  ql/src/test/queries/clientpositive/index_auto_mult_tables_compact.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/index_auto_self_join.q PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_mult_tables.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/index_auto_mult_tables_compact.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/1010/diff


Testing
---

added new testcase index_auto_mult_tables.q


Thanks,

Syed



 Automatic Indexing with multiple tables
 ---

 Key: HIVE-2128
 URL: https://issues.apache.org/jira/browse/HIVE-2128
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Attachments: HIVE-2128.1.patch, HIVE-2128.1.patch, HIVE-2128.2.patch, 
 HIVE-2128.4.patch, HIVE-2128.5.patch, HIVE-2128.6.patch


 Make automatic indexing work with jobs which access multiple tables.  We'll 
 probably need to modify the way that the index input format works in order to 
 associate index formats/files with specific tables.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-21 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069304#comment-13069304
 ] 

John Sichi commented on HIVE-1078:
--

+1.  Will commit when tests pass.

 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch, HIVE-1078v9.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2296) bad compressed file names from insert into

2011-07-21 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069314#comment-13069314
 ] 

Siying Dong commented on HIVE-2296:
---

+1

 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Fix For: 0.8.0

 Attachments: hive-2296.1.patch, hive-2296.2.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files already in the table, it 
 may copy the new files in under bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 This causes corrupted output when doing a SELECT * on the table.
 Correct behavior should be to pick a valid filename such as:
 00_0_copy_1.gz

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-21 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069328#comment-13069328
 ] 

Carl Steinbach commented on HIVE-2139:
--

@Patrick: Good idea. I created a page on the wiki for notes like this: 
https://cwiki.apache.org/confluence/display/Hive/HiveChangeLog

Please add a blurb there and be sure to include a link back to this ticket. 
Thanks!

 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Fix For: 0.8.0

 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2 every time. It's not convenient for 
 separating the configuration between server-side and client-side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: HIVE-2246: Dedupe tables' column schemas from partitions in the metastore db

2011-07-21 Thread Sohan Jain

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1183/
---

Review request for hive, Ning Zhang and Paul Yang.


Summary
---

This patch tries to make minimal changes to the API while keeping migration 
short and somewhat easy to revert.

The new schema can be described as follows:
- CDS is a table corresponding to Column Descriptor objects.  Currently, it 
only stores a CD_ID.
- COLUMNS_V2 is a table corresponding to MFieldSchema objects, or columns.  A 
Column Descriptor holds a list of columns.  COLUMNS_V2 has a foreign key to the 
CD_ID to which it belongs.
- SDS was modified to reference a Column Descriptor. So SDS now has a foreign 
key to a CD_ID which describes its columns.

During migration, we create Column Descriptors for tables in a straightforward 
manner: their columns are now just wrapped inside a column descriptor.  The SDS 
of partitions use their parent table's column descriptor, since currently a 
partition and its table share the same list of columns.

When altering or adding a partition, give it its parent table's column 
descriptor IF the columns they describe are the same.  Otherwise, create a new 
column descriptor for its columns.

When adding or altering a table, create a new column descriptor every time.

Whenever you drop a storage descriptor (e.g., when dropping tables or 
partitions), check to see if the related column descriptor has any other 
references in the table.  That is, check to see if any other storage 
descriptors point to that column descriptor.  If none do, then delete that 
column descriptor.  This check is in place so we don't have unreferenced column 
descriptors and columns hanging around after schema evolution for tables.
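Putting the pieces above together, a rough DDL sketch of the new layout (a 
simplification only; the authoritative definition is the 
008-HIVE-2246.mysql.sql upgrade script in the diff, and everything beyond the 
CDS/COLUMNS_V2/SDS table names and the CD_ID key named above is illustrative):

{code}
-- one row per column descriptor
CREATE TABLE CDS (
  CD_ID BIGINT NOT NULL,
  PRIMARY KEY (CD_ID)
);

-- the columns owned by a descriptor, keyed back to CDS
CREATE TABLE COLUMNS_V2 (
  CD_ID BIGINT NOT NULL,
  COLUMN_NAME VARCHAR(128) NOT NULL,
  TYPE_NAME VARCHAR(4000),
  INTEGER_IDX INT NOT NULL,
  PRIMARY KEY (CD_ID, COLUMN_NAME),
  CONSTRAINT COLUMNS_V2_FK1 FOREIGN KEY (CD_ID) REFERENCES CDS (CD_ID)
);

-- each storage descriptor now points at a (possibly shared) column descriptor
ALTER TABLE SDS
  ADD COLUMN CD_ID BIGINT,
  ADD CONSTRAINT SDS_FK2 FOREIGN KEY (CD_ID) REFERENCES CDS (CD_ID);
{code}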


This addresses bug HIVE-2246.
https://issues.apache.org/jira/browse/HIVE-2246


Diffs
-

  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1148945 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MStorageDescriptor.java
 1148945 
  trunk/metastore/src/model/package.jdo 1148945 

Diff: https://reviews.apache.org/r/1183/diff


Testing
---

Passes facebook's regression testing and all existing test cases.  In one 
instance, before migration, the overhead involved with storage descriptors and 
columns was ~11 GB.  After migration, the overhead was ~1.5 GB.


Thanks,

Sohan



[jira] [Commented] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db

2011-07-21 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069389#comment-13069389
 ] 

jirapos...@reviews.apache.org commented on HIVE-2246:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1183/
---

Review request for hive, Ning Zhang and Paul Yang.


Summary
---

This patch tries to make minimal changes to the API while keeping migration 
short and somewhat easy to revert.

The new schema can be described as follows:
- CDS is a table corresponding to Column Descriptor objects.  Currently, it 
only stores a CD_ID.
- COLUMNS_V2 is a table corresponding to MFieldSchema objects, or columns.  A 
Column Descriptor holds a list of columns.  COLUMNS_V2 has a foreign key to the 
CD_ID to which it belongs.
- SDS was modified to reference a Column Descriptor. So SDS now has a foreign 
key to a CD_ID which describes its columns.

During migration, we create Column Descriptors for tables in a straightforward 
manner: their columns are now just wrapped inside a column descriptor.  The SDS 
of partitions use their parent table's column descriptor, since currently a 
partition and its table share the same list of columns.

When altering or adding a partition, give it its parent table's column 
descriptor IF the columns they describe are the same.  Otherwise, create a new 
column descriptor for its columns.

When adding or altering a table, create a new column descriptor every time.

Whenever you drop a storage descriptor (e.g., when dropping tables or 
partitions), check to see if the related column descriptor has any other 
references in the table.  That is, check to see if any other storage 
descriptors point to that column descriptor.  If none do, then delete that 
column descriptor.  This check is in place so we don't have unreferenced column 
descriptors and columns hanging around after schema evolution for tables.


This addresses bug HIVE-2246.
https://issues.apache.org/jira/browse/HIVE-2246


Diffs
-

  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1148945 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MStorageDescriptor.java
 1148945 
  trunk/metastore/src/model/package.jdo 1148945 

Diff: https://reviews.apache.org/r/1183/diff


Testing
---

Passes facebook's regression testing and all existing test cases.  In one 
instance, before migration, the overhead involved with storage descriptors and 
columns was ~11 GB.  After migration, the overhead was ~1.5 GB.


Thanks,

Sohan



 Dedupe tables' column schemas from partitions in the metastore db
 -

 Key: HIVE-2246
 URL: https://issues.apache.org/jira/browse/HIVE-2246
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2246.2.patch


 We can re-organize the JDO models to reduce space usage to keep the metastore 
 scalable for the future.  Currently, partitions are the fastest growing 
 objects in the metastore, and the metastore keeps a separate copy of the 
 columns list for each partition.  We can normalize the metastore db by 
 decoupling Columns from Storage Descriptors and not storing duplicate lists 
 of the columns for each partition. 
 An idea is to create an additional level of indirection with a Column 
 Descriptor that has a list of columns.  A table has a reference to its 
 latest Column Descriptor (note: a table may have more than one Column 
 Descriptor in the case of schema evolution).  Partitions and Indexes can 
 reference the same Column Descriptors as their parent table.
 Currently, the COLUMNS table in the metastore has roughly (number of 
 partitions + number of tables) * (average number of columns per table) rows.  
 We can reduce this to (number of tables) * (average number of columns per 
 table) rows, while incurring a small cost proportional to the number of 
 tables to store the Column Descriptors.
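 For a sense of scale (illustrative numbers only): with 100 tables averaging 20 
 columns each and 10,000 partitions overall, the old layout stores roughly 
 (10,000 + 100) * 20 = 202,000 column rows, while the deduplicated layout 
 stores about 100 * 20 = 2,000, plus roughly one column descriptor row per 
 table.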

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db

2011-07-21 Thread Sohan Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sohan Jain updated HIVE-2246:
-

Description: 
Note: this patch proposes a schema change, and is therefore incompatible with 
the current metastore.

We can re-organize the JDO models to reduce space usage to keep the metastore 
scalable for the future.  Currently, partitions are the fastest growing objects 
in the metastore, and the metastore keeps a separate copy of the columns list 
for each partition.  We can normalize the metastore db by decoupling Columns 
from Storage Descriptors and not storing duplicate lists of the columns for 
each partition. 

An idea is to create an additional level of indirection with a Column 
Descriptor that has a list of columns.  A table has a reference to its latest 
Column Descriptor (note: a table may have more than one Column Descriptor in 
the case of schema evolution).  Partitions and Indexes can reference the same 
Column Descriptors as their parent table.

Currently, the COLUMNS table in the metastore has roughly (number of partitions 
+ number of tables) * (average number of columns per table) rows.  We can reduce 
this to (number of tables) * (average number of columns per table) rows, while 
incurring a small cost proportional to the number of tables to store the Column 
Descriptors.

Please see the latest review board for additional implementation details.

  was:
We can re-organize the JDO models to reduce space usage to keep the metastore 
scalable for the future.  Currently, partitions are the fastest growing objects 
in the metastore, and the metastore keeps a separate copy of the columns list 
for each partition.  We can normalize the metastore db by decoupling Columns 
from Storage Descriptors and not storing duplicate lists of the columns for 
each partition. 

An idea is to create an additional level of indirection with a Column 
Descriptor that has a list of columns.  A table has a reference to its latest 
Column Descriptor (note: a table may have more than one Column Descriptor in 
the case of schema evolution).  Partitions and Indexes can reference the same 
Column Descriptors as their parent table.

Currently, the COLUMNS table in the metastore has roughly (number of partitions 
+ number of tables) * (average number of columns per table) rows.  We can reduce 
this to (number of tables) * (average number of columns per table) rows, while 
incurring a small cost proportional to the number of tables to store the Column 
Descriptors.

   Tags: metastore, schema, JDO

 Dedupe tables' column schemas from partitions in the metastore db
 -

 Key: HIVE-2246
 URL: https://issues.apache.org/jira/browse/HIVE-2246
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2246.2.patch


 Note: this patch proposes a schema change, and is therefore incompatible with 
 the current metastore.
 We can re-organize the JDO models to reduce space usage to keep the metastore 
 scalable for the future.  Currently, partitions are the fastest growing 
 objects in the metastore, and the metastore keeps a separate copy of the 
 columns list for each partition.  We can normalize the metastore db by 
 decoupling Columns from Storage Descriptors and not storing duplicate lists 
 of the columns for each partition. 
 An idea is to create an additional level of indirection with a Column 
 Descriptor that has a list of columns.  A table has a reference to its 
 latest Column Descriptor (note: a table may have more than one Column 
 Descriptor in the case of schema evolution).  Partitions and Indexes can 
 reference the same Column Descriptors as their parent table.
 Currently, the COLUMNS table in the metastore has roughly (number of 
 partitions + number of tables) * (average number of columns per table) rows.  
 We can reduce this to (number of tables) * (average number of columns per 
 table) rows, while incurring a small cost proportional to the number of 
 tables to store the Column Descriptors.
 Please see the latest review board for additional implementation details.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-2246: Dedupe tables' column schemas from partitions in the metastore db

2011-07-21 Thread Sohan Jain

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1183/
---

(Updated 2011-07-22 05:30:29.026246)


Review request for hive, Ning Zhang and Paul Yang.


Changes
---

Adding some files I missed in the last diff.


Summary
---

This patch tries to make minimal changes to the API while keeping migration 
short and somewhat easy to revert.

The new schema can be described as follows:
- CDS is a table corresponding to Column Descriptor objects.  Currently, it 
only stores a CD_ID.
- COLUMNS_V2 is a table corresponding to MFieldSchema objects, or columns.  A 
Column Descriptor holds a list of columns.  COLUMNS_V2 has a foreign key to the 
CD_ID to which it belongs.
- SDS was modified to reference a Column Descriptor. So SDS now has a foreign 
key to a CD_ID which describes its columns.

During migration, we create Column Descriptors for tables in a straightforward 
manner: their columns are now just wrapped inside a column descriptor.  The SDS 
of partitions use their parent table's column descriptor, since currently a 
partition and its table share the same list of columns.

When altering or adding a partition, give it its parent table's column 
descriptor IF the columns they describe are the same.  Otherwise, create a new 
column descriptor for its columns.

When adding or altering a table, create a new column descriptor every time.

Whenever you drop a storage descriptor (e.g., when dropping tables or 
partitions), check to see if the related column descriptor has any other 
references in the table.  That is, check to see if any other storage 
descriptors point to that column descriptor.  If none do, then delete that 
column descriptor.  This check is in place so we don't have unreferenced column 
descriptors and columns hanging around after schema evolution for tables.


This addresses bug HIVE-2246.
https://issues.apache.org/jira/browse/HIVE-2246


Diffs (updated)
-

  trunk/metastore/scripts/upgrade/mysql/008-HIVE-2246.mysql.sql PRE-CREATION 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1148945 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MColumnDescriptor.java
 PRE-CREATION 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MStorageDescriptor.java
 1148945 
  trunk/metastore/src/model/package.jdo 1148945 

Diff: https://reviews.apache.org/r/1183/diff


Testing
---

Passes facebook's regression testing and all existing test cases.  In one 
instance, before migration, the overhead involved with storage descriptors and 
columns was ~11 GB.  After migration, the overhead was ~1.5 GB.


Thanks,

Sohan



[jira] [Updated] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db

2011-07-21 Thread Sohan Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sohan Jain updated HIVE-2246:
-

Attachment: HIVE-2246.3.patch

Adding some missing files that I forgot to svn add

 Dedupe tables' column schemas from partitions in the metastore db
 -

 Key: HIVE-2246
 URL: https://issues.apache.org/jira/browse/HIVE-2246
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch


 Note: this patch proposes a schema change, and is therefore incompatible with 
 the current metastore.
 We can re-organize the JDO models to reduce space usage to keep the metastore 
 scalable for the future.  Currently, partitions are the fastest growing 
 objects in the metastore, and the metastore keeps a separate copy of the 
 columns list for each partition.  We can normalize the metastore db by 
 decoupling Columns from Storage Descriptors and not storing duplicate lists 
 of the columns for each partition. 
 An idea is to create an additional level of indirection with a Column 
 Descriptor that has a list of columns.  A table has a reference to its 
 latest Column Descriptor (note: a table may have more than one Column 
 Descriptor in the case of schema evolution).  Partitions and Indexes can 
 reference the same Column Descriptors as their parent table.
 Currently, the COLUMNS table in the metastore has roughly (number of 
 partitions + number of tables) * (average number of columns per table) rows.  
 We can reduce this to (number of tables) * (average number of columns per 
 table) rows, while incurring a small cost proportional to the number of 
 tables to store the Column Descriptors.
 Please see the latest review board for additional implementation details.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2246) Dedupe tables' column schemas from partitions in the metastore db

2011-07-21 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069392#comment-13069392
 ] 

jirapos...@reviews.apache.org commented on HIVE-2246:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1183/
---

(Updated 2011-07-22 05:30:29.026246)


Review request for hive, Ning Zhang and Paul Yang.


Changes
---

Adding some files I missed in the last diff.


Summary
---

This patch tries to make minimal changes to the API while keeping migration 
short and somewhat easy to revert.

The new schema can be described as follows:
- CDS is a table corresponding to Column Descriptor objects.  Currently, it 
only stores a CD_ID.
- COLUMNS_V2 is a table corresponding to MFieldSchema objects, or columns.  A 
Column Descriptor holds a list of columns.  COLUMNS_V2 has a foreign key to the 
CD_ID to which it belongs.
- SDS was modified to reference a Column Descriptor. So SDS now has a foreign 
key to a CD_ID which describes its columns.

During migration, we create Column Descriptors for tables in a straightforward 
manner: their columns are now just wrapped inside a column descriptor.  The SDS 
of partitions use their parent table's column descriptor, since currently a 
partition and its table share the same list of columns.

When altering or adding a partition, give it its parent table's column 
descriptor IF the columns they describe are the same.  Otherwise, create a new 
column descriptor for its columns.

When adding or altering a table, create a new column descriptor every time.

Whenever you drop a storage descriptor (e.g., when dropping tables or 
partitions), check to see if the related column descriptor has any other 
references in the table.  That is, check to see if any other storage 
descriptors point to that column descriptor.  If none do, then delete that 
column descriptor.  This check is in place so we don't have unreferenced column 
descriptors and columns hanging around after schema evolution for tables.


This addresses bug HIVE-2246.
https://issues.apache.org/jira/browse/HIVE-2246


Diffs (updated)
-

  trunk/metastore/scripts/upgrade/mysql/008-HIVE-2246.mysql.sql PRE-CREATION 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1148945 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MColumnDescriptor.java
 PRE-CREATION 
  
trunk/metastore/src/model/org/apache/hadoop/hive/metastore/model/MStorageDescriptor.java
 1148945 
  trunk/metastore/src/model/package.jdo 1148945 

Diff: https://reviews.apache.org/r/1183/diff


Testing
---

Passes facebook's regression testing and all existing test cases.  In one 
instance, before migration, the overhead involved with storage descriptors and 
columns was ~11 GB.  After migration, the overhead was ~1.5 GB.


Thanks,

Sohan



 Dedupe tables' column schemas from partitions in the metastore db
 -

 Key: HIVE-2246
 URL: https://issues.apache.org/jira/browse/HIVE-2246
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Attachments: HIVE-2246.2.patch, HIVE-2246.3.patch


 Note: this patch proposes a schema change, and is therefore incompatible with 
 the current metastore.
 We can re-organize the JDO models to reduce space usage to keep the metastore 
 scalable for the future.  Currently, partitions are the fastest growing 
 objects in the metastore, and the metastore keeps a separate copy of the 
 columns list for each partition.  We can normalize the metastore db by 
 decoupling Columns from Storage Descriptors and not storing duplicate lists 
 of the columns for each partition. 
 An idea is to create an additional level of indirection with a Column 
 Descriptor that has a list of columns.  A table has a reference to its 
 latest Column Descriptor (note: a table may have more than one Column 
 Descriptor in the case of schema evolution).  Partitions and Indexes can 
 reference the same Column Descriptors as their parent table.
 Currently, the COLUMNS table in the metastore has roughly (number of 
 partitions + number of tables) * (average number of columns per table) rows.  
 We can reduce this to (number of tables) * (average number of columns per 
 table) rows, while incurring a small cost proportional to the number of 
 tables to store the Column Descriptors.
 Please see the latest review board for additional implementation details.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira