[jira] [Updated] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters
[ https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4349: - Status: Open (was: Patch Available) -1 for the following reasons: * In the near future we're going to stop manually constructing the classpath and let Ivy (or maybe even Maven or Gradle) do it for us. When this happens this change will break. * This problem can be avoided in the first place by ensuring that the root of Hive's source directory is <= 11 characters. Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters Key: HIVE-4349 URL: https://issues.apache.org/jira/browse/HIVE-4349 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Xi Fang Fix For: 0.11.0 Attachments: HIVE-4349.1.patch If the Hive enlistment root path is longer than ~12 characters, the test classpath "hadoop.testcp" exceeds 8K characters, so we are unable to run most of the Hive unit tests on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
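For context on the size blow-up: every classpath entry repeats the enlistment root, so each extra character in the root is paid once per jar on the test classpath. A minimal sketch of the arithmetic (the jar names, directory layout, and jar count below are made up for illustration, not Hive's actual list):

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClasspathLength {
    // Build a classpath string from hypothetical jar locations under a root.
    // Each entry starts with the root, so root length is multiplied by jarCount.
    static String buildClasspath(String root, int jarCount) {
        List<String> entries = new ArrayList<>();
        for (int i = 0; i < jarCount; i++) {
            entries.add(root + File.separator + "build"
                    + File.separator + "ivy"
                    + File.separator + "lib-" + i + ".jar");
        }
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        int jars = 200; // hypothetical; real Hive test classpaths reference hundreds of jars
        int shortLen = buildClasspath("C:\\hive", jars).length();
        int longLen = buildClasspath("C:\\Users\\someuser\\workspaces\\hive-enlistment", jars).length();
        // The long root pushes the total past the ~8K character limit.
        System.out.println(shortLen + " vs " + longLen);
    }
}
```

With 200 entries, a ~44-character root alone contributes nearly 9K characters before a single jar name is counted, which is why a short root sidesteps the problem.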
[jira] [Commented] (HIVE-3384) HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification
[ https://issues.apache.org/jira/browse/HIVE-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644474#comment-13644474 ] Willem van Asperen commented on HIVE-3384: -- Just checked out svn and it seems this error is still in trunk. HIVE JDBC module won't compile under JDK1.7 as new methods added in JDBC specification -- Key: HIVE-3384 URL: https://issues.apache.org/jira/browse/HIVE-3384 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 0.10.0 Reporter: Weidong Bian Assignee: Chris Drome Priority: Minor Fix For: 0.11.0 Attachments: D6873-0.9.1.patch, D6873.1.patch, D6873.2.patch, D6873.3.patch, D6873.4.patch, D6873.5.patch, D6873.6.patch, D6873.7.patch, HIVE-3384-0.10.patch, HIVE-3384-2012-12-02.patch, HIVE-3384-2012-12-04.patch, HIVE-3384.2.patch, HIVE-3384-branch-0.9.patch, HIVE-3384.patch, HIVE-JDK7-JDBC.patch The jdbc module couldn't be compiled with JDK 7, as it adds some abstract methods in the JDBC specification. Some error info: error: HiveCallableStatement is not abstract and does not override abstract method <T>getObject(String,Class<T>) in CallableStatement . . . -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
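For readers hitting the same error: JDBC 4.1 (JDK 7) added generic methods such as `<T> getObject(String, Class<T>)` to CallableStatement, so any concrete implementation compiled against the older interface stops compiling. A common compatibility fix is to stub the new methods to throw. The sketch below uses a simplified stand-in interface rather than the real java.sql.CallableStatement (which has far too many methods to implement here); the class and method names are illustrative, not Hive's actual code:

```java
import java.sql.SQLFeatureNotSupportedException;

public class Jdk7StubDemo {
    // Simplified stand-in for the JDBC 4.1 addition: CallableStatement gained
    // <T> T getObject(String, Class<T>) in JDK 7, so concrete drivers that
    // predate it no longer compile until they override the new method.
    interface MiniCallableStatement {
        <T> T getObject(String columnLabel, Class<T> type)
                throws SQLFeatureNotSupportedException;
    }

    // The usual compatibility fix: implement the new method as an explicit
    // "not supported" stub rather than leaving the class abstract.
    static class MiniHiveCallableStatement implements MiniCallableStatement {
        @Override
        public <T> T getObject(String columnLabel, Class<T> type)
                throws SQLFeatureNotSupportedException {
            throw new SQLFeatureNotSupportedException("Method not supported");
        }
    }

    public static void main(String[] args) {
        try {
            new MiniHiveCallableStatement().getObject("col", String.class);
        } catch (SQLFeatureNotSupportedException e) {
            System.out.println("stubbed: " + e.getMessage());
        }
    }
}
```

The same pattern applies to every abstract method the new specification adds: the class compiles on both JDK 6 and JDK 7, and callers of the unimplemented method get a clear runtime exception.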
[jira] [Updated] (HIVE-4381) Implement vectorized aggregation expressions
[ https://issues.apache.org/jira/browse/HIVE-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4381: -- Attachment: HIVE-4381.D10551.2.patch rusanu updated the revision HIVE-4381 [jira] Implement vectorized aggregation expressions. update patch after 4f7470d Reviewers: JIRA REVISION DETAIL https://reviews.facebook.net/D10551 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D10551?vs=32901&id=33147#toc AFFECTED FILES ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorGroupByOperator.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ColumnExpression.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/VectorAggregateExpression.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFAvgLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFCountDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFCountLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMaxDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMaxLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMinDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFMinLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdPopDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdPopLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdSampDouble.java 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFStdSampLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFSumLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarPopDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarPopLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarSampDouble.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/aggregates/gen/VectorUDAFVarSampLong.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFAvg.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFCount.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFMinMax.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFSum.txt ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/VectorUDAFVar.txt ql/src/test/org/apache/hadoop/hive/ql/exec/vector/TestVectorGroupByOperator.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeCaptureOutputDesc.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeCaptureOutputOperator.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorDataSourceOperator.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorDataSourceOperatorDesc.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorRowBatchBase.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorRowBatchFromConcat.java ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorRowBatchFromIterables.java 
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/util/FakeVectorRowBatchFromRepeats.java To: JIRA, rusanu Implement vectorized aggregation expressions Key: HIVE-4381 URL: https://issues.apache.org/jira/browse/HIVE-4381 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Jitendra Nath Pandey Assignee: Remus Rusanu Labels: patch Fix For: vectorization-branch Attachments: HIVE-4381.D10449.1.patch, HIVE-4381.D10449.2.patch, HIVE-4381.D10449.3.patch, HIVE-4381.D10449.4.patch, HIVE-4381.D10551.1.patch, HIVE-4381.D10551.2.patch Vectorized implementation for sum, min, max, average and count. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4381) Implement vectorized aggregation expressions
[ https://issues.apache.org/jira/browse/HIVE-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644563#comment-13644563 ] Phabricator commented on HIVE-4381: --- ashutoshc has accepted the revision HIVE-4381 [jira] Implement vectorized aggregation expressions. +1 REVISION DETAIL https://reviews.facebook.net/D10551 BRANCH vectorization ARCANIST PROJECT hive To: JIRA, ashutoshc, rusanu Implement vectorized aggregation expressions Key: HIVE-4381 URL: https://issues.apache.org/jira/browse/HIVE-4381 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Jitendra Nath Pandey Assignee: Remus Rusanu Labels: patch Fix For: vectorization-branch Attachments: HIVE-4381.D10449.1.patch, HIVE-4381.D10449.2.patch, HIVE-4381.D10449.3.patch, HIVE-4381.D10449.4.patch, HIVE-4381.D10551.1.patch, HIVE-4381.D10551.2.patch Vectorized implementation for sum, min, max, average and count. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4381) Implement vectorized aggregation expressions
[ https://issues.apache.org/jira/browse/HIVE-4381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4381: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to branch. Thanks, Remus! Implement vectorized aggregation expressions Key: HIVE-4381 URL: https://issues.apache.org/jira/browse/HIVE-4381 Project: Hive Issue Type: Sub-task Components: Query Processor Affects Versions: vectorization-branch Reporter: Jitendra Nath Pandey Assignee: Remus Rusanu Labels: patch Fix For: vectorization-branch Attachments: HIVE-4381.D10449.1.patch, HIVE-4381.D10449.2.patch, HIVE-4381.D10449.3.patch, HIVE-4381.D10449.4.patch, HIVE-4381.D10551.1.patch, HIVE-4381.D10551.2.patch Vectorized implementation for sum, min, max, average and count. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4383) Implement vectorized string column-scalar filters
[ https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4383: --- Status: Open (was: Patch Available) Patch is not applying cleanly on branch. Can you please rebase it? Implement vectorized string column-scalar filters - Key: HIVE-4383 URL: https://issues.apache.org/jira/browse/HIVE-4383 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4383.1.patch, HIVE-4383.2.patch Create patch for implementing string columns compared with scalars as vectorized filters, and apply it to vectorization branch. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4370) Change ORC tree readers to return batches of rows instead of a row
[ https://issues.apache.org/jira/browse/HIVE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4370: --- Affects Version/s: vectorization-branch Status: Open (was: Patch Available) Patch is not applying cleanly. Can you please rebase it? Change ORC tree readers to return batches of rows instead of a row --- Key: HIVE-4370 URL: https://issues.apache.org/jira/browse/HIVE-4370 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Attachments: HIVE-4370.1.patch, HIVE-4370.2.patch, HIVE-4370.3.patch Change ORC Record reader and Tree readers to return a set of Rows instead of a row. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4389) thrift files are re-generated by compiling
[ https://issues.apache.org/jira/browse/HIVE-4389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644631#comment-13644631 ] Gang Tim Liu commented on HIVE-4389: +1 thrift files are re-generated by compiling -- Key: HIVE-4389 URL: https://issues.apache.org/jira/browse/HIVE-4389 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.4389.1.patch I am not sure what is going on, but there seems to be a bunch of thrift changes if I perform ant thriftif. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4370) Change ORC tree readers to return batches of rows instead of a row
[ https://issues.apache.org/jira/browse/HIVE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga updated HIVE-4370: - Attachment: HIVE-4370.4.patch Sure. Attached a new patch which is rebased. Change ORC tree readers to return batches of rows instead of a row --- Key: HIVE-4370 URL: https://issues.apache.org/jira/browse/HIVE-4370 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Attachments: HIVE-4370.1.patch, HIVE-4370.2.patch, HIVE-4370.3.patch, HIVE-4370.4.patch, HIVE-4370.4.patch, HIVE-4370.4.patch Change ORC Record reader and Tree readers to return a set of Rows instead of a row. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4370) Change ORC tree readers to return batches of rows instead of a row
[ https://issues.apache.org/jira/browse/HIVE-4370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarvesh Sakalanaga updated HIVE-4370: - Attachment: HIVE-4370.4.patch Hi Ashutosh, Sorry about that. Can you try the one attached? Thanks, Sarvesh Change ORC tree readers to return batches of rows instead of a row --- Key: HIVE-4370 URL: https://issues.apache.org/jira/browse/HIVE-4370 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Sarvesh Sakalanaga Assignee: Sarvesh Sakalanaga Attachments: HIVE-4370.1.patch, HIVE-4370.2.patch, HIVE-4370.3.patch, HIVE-4370.4.patch, HIVE-4370.4.patch, HIVE-4370.4.patch Change ORC Record reader and Tree readers to return a set of Rows instead of a row. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4441) [WebHCat] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Attachment: HIVE-4441-1.patch [WebHCat] WebHCat does not honor user home directory Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and I specify statusdir as a relative path, I would expect results to be stored in the folder relative to the user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d execute=show+tables; -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
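The behavior the report expects can be sketched as a small helper: relative statusdir paths resolve against the *requesting* user's home directory, not the home of the user running the WebHCat server. The helper name and the /user/&lt;name&gt; home-directory convention below are illustrative assumptions, not WebHCat's actual code (which would go through the HDFS FileSystem API):

```java
import java.net.URI;

public class StatusDirResolution {
    // Hypothetical helper: resolve statusdir the way HIVE-4441 expects.
    // Absolute paths are used verbatim; relative ones land under the
    // requesting user's home directory (assumed /user/<username> here).
    static String resolveStatusDir(String requestUserName, String statusDir) {
        URI uri = URI.create(statusDir);
        if (uri.getPath().startsWith("/")) {
            return uri.getPath();   // absolute: use as given
        }
        return "/user/" + requestUserName + "/" + uri.getPath();
    }

    public static void main(String[] args) {
        // The curl example above: user.name=hdinsightuser, statusdir=pokes.output
        System.out.println(resolveStatusDir("hdinsightuser", "pokes.output"));
        // prints /user/hdinsightuser/pokes.output
    }
}
```

The bug described in the issue is that the resolution effectively uses the server user's home (/user/hdp) instead of the request's user.name, which is what the fix changes.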
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Summary: [HCatalog] WebHCat does not honor user home directory (was: [WebHCat] WebHCat does not honor user home directory) [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and I specify statusdir as a relative path, I would expect results to be stored in the folder relative to the user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d execute=show+tables; -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
Daniel Dai created HIVE-4444: Summary: [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently there are no files and args parameters for Hive. We should add them to make Hive similar to Pig. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Component/s: HCatalog [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark deprecated * GET queue/jobID - mark deprecated * DELETE queue/jobID - mark deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed Job info * GET jobs/jobID - return the single JSON object containing the detailed Job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4442: - Component/s: HCatalog [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Currently templeton for the Queue call uses the user.name to filter the results of the call in addition to the default security. Ideally the filter is an optional parameter to the call, independent of the security check. I would suggest a parameter in addition, so that GET queue (jobs) gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4441) [HCatalog] WebHCat does not honor user home directory
[ https://issues.apache.org/jira/browse/HIVE-4441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4441: - Component/s: HCatalog [HCatalog] WebHCat does not honor user home directory - Key: HIVE-4441 URL: https://issues.apache.org/jira/browse/HIVE-4441 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4441-1.patch If I submit a job as user A and I specify statusdir as a relative path, I would expect results to be stored in the folder relative to the user A's home folder. For example, if I run: {code}curl -s -d user.name=hdinsightuser -d execute=show+tables; -d statusdir=pokes.output 'http://localhost:50111/templeton/v1/hive'{code} I get the results under: {code}/user/hdp/pokes.output{code} And I expect them to be under: {code}/user/hdinsightuser/pokes.output{code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4442: - Attachment: HIVE-4442-1.patch [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4442-1.patch Currently templeton for the Queue call uses the user.name to filter the results of the call in addition to the default security. Ideally the filter is an optional parameter to the call, independent of the security check. I would suggest a parameter in addition, so that GET queue (jobs) gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch Attached patch. The patch also contains e2e tests for HIVE-4442. That is because HIVE-4442 and HIVE-4443 are very intervolved and it is harder to separate the tests. [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark deprecated * GET queue/jobID - mark deprecated * DELETE queue/jobID - mark deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed Job info * GET jobs/jobID - return the single JSON object containing the detailed Job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: (was: HIVE-4443-1.patch) [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark deprecated * GET queue/jobID - mark deprecated * DELETE queue/jobID - mark deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed Job info * GET jobs/jobID - return the single JSON object containing the detailed Job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4443: - Attachment: HIVE-4443-1.patch [HCatalog] Have an option for GET queue to return all job information in single call - Key: HIVE-4443 URL: https://issues.apache.org/jira/browse/HIVE-4443 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4443-1.patch Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest: * GET queue - mark deprecated * GET queue/jobID - mark deprecated * DELETE queue/jobID - mark deprecated * GET jobs - return the list of JSON objects with jobid but no detailed info * GET jobs/fields=* - return the list of JSON objects containing detailed Job info * GET jobs/jobID - return the single JSON object containing the detailed Job info for the job with the given ID (equivalent to GET queue/jobID) * DELETE jobs/jobID - equivalent to DELETE queue/jobID -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4442) [HCatalog] WebHCat should not override user.name parameter for Queue call
[ https://issues.apache.org/jira/browse/HIVE-4442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644674#comment-13644674 ] Daniel Dai commented on HIVE-4442: -- Attached patch. Note the e2e tests are intervolved with HIVE-4443; I included all tests in HIVE-4443. [HCatalog] WebHCat should not override user.name parameter for Queue call - Key: HIVE-4442 URL: https://issues.apache.org/jira/browse/HIVE-4442 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4442-1.patch Currently templeton for the Queue call uses the user.name to filter the results of the call in addition to the default security. Ideally the filter is an optional parameter to the call, independent of the security check. I would suggest a parameter in addition, so that GET queue (jobs) gives you all the jobs a user has permission to see: GET queue?showall=true -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4444) [HCatalog] WebHCat Hive should support equivalent parameters as Pig
[ https://issues.apache.org/jira/browse/HIVE-4444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4444: - Attachment: HIVE-4444-1.patch [HCatalog] WebHCat Hive should support equivalent parameters as Pig Key: HIVE-4444 URL: https://issues.apache.org/jira/browse/HIVE-4444 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4444-1.patch Currently there are no files and args parameters for Hive. We should add them to make Hive similar to Pig. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
Weird issue with running tests
Hi folks, I'm running into a weird issue with testing HIVE-3682. I don't think it's so much to do with the jira at hand itself, as it is to do with the test or the testing framework. Basically, if I run the .q file test itself, it succeeds. If I run it as part of a broader ant test, it fails, and seemingly consistently. What's more, the reason it fails is that the produced .out file does not match the golden output in an interesting way. I'm attaching the two files with this mail if anyone wants to look at it further, but the diff between them is as follows: 926,929c926 REHOOK: query: create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ','^@163:val_163^@ --- 163:val_163 943c940 REHOOK: type: CREATETABLE^@444:val_444^@ --- 444:val_444 1027a1025,1029 PREHOOK: query: create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' PREHOOK: type: CREATETABLE Note#1 : the PREHOOK log line for a create table seems to have been logged before the !cat that preceded it finished logging. Note#2 : The ^@ separators seem to be indicating a switch between streams writing out to the .out file. Note#3 : The P in PREHOOK seems to get gobbled up each time this happens. To me, this looks like !cat runs in a separate thread, or at least on a separate PrintStream that hasn't quite completely flushed its buffers. Is there a way to force this? I mean, yes, I suppose I can go edit QTestUtil.execute so as to put an explicit flush, but I won't know if that works or not till after I do a complete test run (given that a solo .q run succeeds), and even then, if it succeeds, I won't know if that is what fixed it. Has anyone hit something like this before or have any thoughts/theories? Thanks, -Sushanth
Re: Weird issue with running tests
Hi Sushanth, I would suggest trying dfs -cat in your test instead of !cat, because for ! we fork a different process, so it's possible streams get mangled up, but dfs -cat would get you what you want without needing to fork. Thanks, Ashutosh On Mon, Apr 29, 2013 at 10:46 AM, Sushanth Sowmyan khorg...@gmail.com wrote: Hi folks, I'm running into a weird issue with testing HIVE-3682. I don't think it's so much to do with the jira at hand itself, as it is to do with the test or the testing framework. Basically, if I run the .q file test itself, it succeeds. If I run it as part of a broader ant test, it fails, and seemingly consistently. What's more, the reason it fails is that the produced .out file does not match the golden output in an interesting way. I'm attaching the two files with this mail if anyone wants to look at it further, but the diff between them is as follows: 926,929c926 REHOOK: query: create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ','^@163:val_163^@ --- 163:val_163 943c940 REHOOK: type: CREATETABLE^@444:val_444^@ --- 444:val_444 1027a1025,1029 PREHOOK: query: create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' PREHOOK: type: CREATETABLE Note#1 : the PREHOOK log line for a create table seems to have been logged before the !cat that preceded it finished logging. Note#2 : The ^@ separators seem to be indicating a switch between streams writing out to the .out file. Note#3 : The P in PREHOOK seems to get gobbled up each time this happens. To me, this looks like !cat runs in a separate thread, or at least on a separate PrintStream that hasn't quite completely flushed its buffers. Is there a way to force this? 
I mean, yes, I suppose I can go edit QTestUtil.execute so as to put an explicit flush, but I won't know if that works or not till after I do a complete test run(given that a solo .q run succeeds), and even then, if it succeeds, I won't know if that is what fixed it. Has anyone hit something like this before or have any thoughts/theories? Thanks, -Sushanth
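The failure mode Ashutosh describes — a forked process writing through its own buffered stream into the same output file — can be sketched in plain Java. This is a hypothetical, simplified reproduction (a ByteArrayOutputStream stands in for the .q.out file, and all class and variable names here are invented), not Hive's actual test code:

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class FlushDemo {
    // Returns the bytes that actually reached the shared sink, in arrival order.
    static String demo() {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stands in for the forked !cat child's buffered pipe into the .out file.
        PrintStream buffered = new PrintStream(new BufferedOutputStream(sink, 8192), false);
        // Stands in for the test driver's own stream into the same file.
        PrintStream direct = new PrintStream(sink, true);

        buffered.print("163:val_163");              // sits in the 8K buffer, not yet in the sink
        direct.print("PREHOOK: type: CREATETABLE"); // reaches the sink first
        buffered.flush();                           // the cat output lands late, out of order
        return sink.toString();
    }

    public static void main(String[] args) {
        // Prints the PREHOOK line *before* the cat output that logically preceded it.
        System.out.println(demo());
    }
}
```

Because the child's bytes only arrive when its buffer flushes, the interleaving depends on buffer sizes and timing — which is consistent with the solo .q run passing while the broader ant run fails.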
Re: Weird issue with running tests
Aha, that makes sense. Thanks!

On Mon, Apr 29, 2013 at 10:55 AM, Ashutosh Chauhan hashut...@apache.org wrote: [...]
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated HIVE-4443:
    Attachment: (was: HIVE-4443-1.patch)

[HCatalog] Have an option for GET queue to return all job information in single call

Key: HIVE-4443
URL: https://issues.apache.org/jira/browse/HIVE-4443
Project: Hive
Issue Type: Improvement
Components: HCatalog
Reporter: Daniel Dai

Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects with jobid but no detailed info
* GET jobs?fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
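For illustration, the proposed endpoints map to URLs a client would build like the hypothetical sketch below. The class name and the WebHCat base URL are assumptions for the example, not part of the proposal:

```java
public class JobsApi {
    final String base;  // e.g. the server's WebHCat base URL

    JobsApi(String base) { this.base = base; }

    // Proposed GET jobs - job IDs only, no detailed info
    String listJobs() { return base + "/jobs"; }

    // Proposed GET jobs?fields=* - detailed info for all jobs in one call
    String listJobsDetailed() { return base + "/jobs?fields=*"; }

    // Proposed GET jobs/jobID - detail for one job (equivalent to GET queue/jobID)
    String jobInfo(String jobId) { return base + "/jobs/" + jobId; }

    public static void main(String[] args) {
        JobsApi api = new JobsApi("http://localhost:50111/templeton/v1");
        System.out.println(api.listJobsDetailed());
    }
}
```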
[jira] [Updated] (HIVE-4443) [HCatalog] Have an option for GET queue to return all job information in single call
[ https://issues.apache.org/jira/browse/HIVE-4443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated HIVE-4443:
    Attachment: HIVE-4443-1.patch

[HCatalog] Have an option for GET queue to return all job information in single call

Key: HIVE-4443
URL: https://issues.apache.org/jira/browse/HIVE-4443
Project: Hive
Issue Type: Improvement
Components: HCatalog
Reporter: Daniel Dai
Attachments: HIVE-4443-1.patch

Currently, to display a summary of all jobs, one has to call GET queue to retrieve all the jobids and then call GET queue/:jobid for each job. It would be nice to do this in a single call. I would suggest:
* GET queue - mark as deprecated
* GET queue/jobID - mark as deprecated
* DELETE queue/jobID - mark as deprecated
* GET jobs - return the list of JSON objects with jobid but no detailed info
* GET jobs?fields=* - return the list of JSON objects containing detailed job info
* GET jobs/jobID - return the single JSON object containing the detailed job info for the job with the given ID (equivalent to GET queue/jobID)
* DELETE jobs/jobID - equivalent to DELETE queue/jobID
[jira] [Updated] (HIVE-3952) merge map-job followed by map-reduce job
[ https://issues.apache.org/jira/browse/HIVE-3952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli updated HIVE-3952:
    Attachment: HIVE-3952-20130428-branch-0.11-bugfix.txt

Patch with only the bug fix. The previously failing tests pass now.

merge map-job followed by map-reduce job

Key: HIVE-3952
URL: https://issues.apache.org/jira/browse/HIVE-3952
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Namit Jain
Assignee: Vinod Kumar Vavilapalli
Fix For: 0.11.0
Attachments: hive.3952.1.patch, HIVE-3952-20130226.txt, HIVE-3952-20130227.1.txt, HIVE-3952-20130301.txt, HIVE-3952-20130421.txt, HIVE-3952-20130424.txt, HIVE-3952-20130428-branch-0.11-bugfix.txt, HIVE-3952-20130428-branch-0.11.txt, HIVE-3952-20130428-branch-0.11-v2.txt

Consider a query like:

select count(*) FROM (
  select idOne, idTwo, value FROM bigTable
  JOIN smallTableOne on (bigTable.idOne = smallTableOne.idOne)
) firstjoin
JOIN smallTableTwo on (firstjoin.idTwo = smallTableTwo.idTwo);

where smallTableOne and smallTableTwo are smaller than hive.auto.convert.join.noconditionaltask.size and hive.auto.convert.join.noconditionaltask is set to true. The joins are collapsed into mapjoins, and it leads to a map-only job (for the map-joins) followed by a map-reduce job (for the group by). Ideally, the map-only job should be merged with the following map-reduce job.
[jira] [Updated] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-3682:
    Attachment: HIVE-3682.D10275.4.patch

khorgath updated the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice."

Converted !cat to dfs -cat to prevent issues with multiple streams writing to the .out file.

Reviewers: ashutoshc, JIRA, omalley

REVISION DETAIL
https://reviews.facebook.net/D10275

CHANGE SINCE LAST DIFF
https://reviews.facebook.net/D10275?vs=33045&id=33153#toc

BRANCH
HIVE-3682

ARCANIST PROJECT
hive

AFFECTED FILES
data/files/array_table.txt
data/files/map_table.txt
ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out

To: JIRA, ashutoshc, omalley, khorgath

when output hive table to file,users should could have a separator of their own choice

Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux, java version 1.6.0_25, hadoop-0.20.2-cdh3u0, hive-0.8.1
Reporter: caofangkun
Assignee: Sushanth Sowmyan
Fix For: 0.11.0
Attachments: HIVE-3682-1.patch, HIVE-3682.D10275.1.patch, HIVE-3682.D10275.2.patch, HIVE-3682.D10275.3.patch, HIVE-3682.D10275.4.patch, HIVE-3682.with.serde.patch

By default, when outputting a hive table to file, columns of the Hive table are separated by the ^A character (that is, \001). But users should indeed have the right to set a separator of their own choice.
Usage Example:

create table for_test (key string, value string);
load data local inpath './in1.txt' into table for_test;
select * from for_test;

UT-01: the default field separator is \001 and the line separator is \n
insert overwrite local directory './test-01' select * from src;

create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
load data local inpath '../hive/examples/files/arraytest.txt' overwrite into table table2;
CREATE TABLE map_table (foo STRING, bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

UT-02: field separator defined as ':'
insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src;

UT-03: the line separator is NOT ALLOWED to be defined as another separator
insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src;

UT-04: define map separators
insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;
[jira] [Updated] (HIVE-4383) Implement vectorized string column-scalar filters
[ https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-4383:
    Attachment: HIVE-4383.3.patch

Updated patch to apply to current version of the public vectorization branch.

Implement vectorized string column-scalar filters

Key: HIVE-4383
URL: https://issues.apache.org/jira/browse/HIVE-4383
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Attachments: HIVE-4383.1.patch, HIVE-4383.2.patch, HIVE-4383.3.patch

Create a patch implementing string columns compared with scalars as vectorized filters, and apply it to the vectorization branch.
Review Request: Implement vectorized string column-scalar filters
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10840/
---

Review request for hive.

Description
-----------
Implement vectorized string column-scalar filters. Includes changes equivalent to HIVE-4348 to correct unit test build failure on Windows in this branch.

This addresses bug HIVE-4383.
https://issues.apache.org/jira/browse/HIVE-4383

Diffs
-----
hbase-handler/src/test/templates/TestHBaseCliDriver.vm 9c1651a
hbase-handler/src/test/templates/TestHBaseNegativeCliDriver.vm 5940cbb
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColGreaterEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColGreaterStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColLessEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColLessStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColNotEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java 318541b
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterStringColumnCompareScalar.txt PRE-CREATION
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java fe34b11

Diff: https://reviews.apache.org/r/10840/diff/

Testing
-------

Thanks,
Eric Hanson
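The core idea behind a vectorized column-scalar string filter can be sketched in a few lines: compare every row of a string column against a single scalar in a tight loop, compacting an array of selected row indices. This is a hypothetical simplification — the class, method, and field names below are invented and much smaller than Hive's generated FilterStringCol*StringScalar classes and VectorizedRowBatch:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringColEqualScalarFilter {
    // Keeps only rows where col[i] equals the scalar; writes their indices
    // into selected[0..newSize) and returns the new number of selected rows.
    static int filter(byte[][] col, byte[] scalar, int[] selected, int size) {
        int newSize = 0;
        for (int i = 0; i < size; i++) {
            if (Arrays.equals(col[i], scalar)) {
                selected[newSize++] = i;
            }
        }
        return newSize;
    }

    public static void main(String[] args) {
        byte[][] col = {
            "val_163".getBytes(StandardCharsets.UTF_8),
            "val_444".getBytes(StandardCharsets.UTF_8),
            "val_163".getBytes(StandardCharsets.UTF_8),
        };
        int[] selected = new int[col.length];
        int n = filter(col, "val_163".getBytes(StandardCharsets.UTF_8), selected, col.length);
        System.out.println(n + " rows pass: " + Arrays.toString(Arrays.copyOf(selected, n)));
    }
}
```

Processing a whole batch per call, with no per-row virtual dispatch, is what makes this style of filter amenable to JIT optimization — which is the motivation for generating one such class per (operator, type) pair from a template, as the diff above does.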
[jira] [Created] (HIVE-4445) Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
Xi Fang created HIVE-4445:

Summary: Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
Key: HIVE-4445
URL: https://issues.apache.org/jira/browse/HIVE-4445
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 0.11.0
Reporter: Xi Fang

The following unit tests fail on Windows because Linux scripts or commands are used in the test cases or .q files:
1. TestMinimrCliDriver: scriptfile1.q
2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q
3. TestCliDriver: hiveprofiler_script0.q
[jira] [Commented] (HIVE-4383) Implement vectorized string column-scalar filters
[ https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644725#comment-13644725 ]

Eric Hanson commented on HIVE-4383:

Code review available at https://reviews.apache.org/r/10840/

Implement vectorized string column-scalar filters

Key: HIVE-4383
URL: https://issues.apache.org/jira/browse/HIVE-4383
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Attachments: HIVE-4383.1.patch, HIVE-4383.2.patch, HIVE-4383.3.patch

Create a patch implementing string columns compared with scalars as vectorized filters, and apply it to the vectorization branch.
[jira] [Updated] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sushanth Sowmyan updated HIVE-3682:
    Attachment: HIVE-3682.D10275.4.patch.for.0.11

Attaching 0.11 patch for the latest patch.

when output hive table to file,users should could have a separator of their own choice

Key: HIVE-3682
URL: https://issues.apache.org/jira/browse/HIVE-3682
Project: Hive
Issue Type: New Feature
Components: CLI
Affects Versions: 0.8.1
Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux, java version 1.6.0_25, hadoop-0.20.2-cdh3u0, hive-0.8.1
Reporter: caofangkun
Assignee: Sushanth Sowmyan
Fix For: 0.11.0
Attachments: HIVE-3682-1.patch, HIVE-3682.D10275.1.patch, HIVE-3682.D10275.2.patch, HIVE-3682.D10275.3.patch, HIVE-3682.D10275.4.patch, HIVE-3682.D10275.4.patch.for.0.11, HIVE-3682.with.serde.patch

By default, when outputting a hive table to file, columns of the Hive table are separated by the ^A character (that is, \001). But users should indeed have the right to set a separator of their own choice.
Usage Example:

create table for_test (key string, value string);
load data local inpath './in1.txt' into table for_test;
select * from for_test;

UT-01: the default field separator is \001 and the line separator is \n
insert overwrite local directory './test-01' select * from src;

create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
load data local inpath '../hive/examples/files/arraytest.txt' overwrite into table table2;
CREATE TABLE map_table (foo STRING, bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;

UT-02: field separator defined as ':'
insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src;

UT-03: the line separator is NOT ALLOWED to be defined as another separator
insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src;

UT-04: define map separators
insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;
[jira] [Updated] (HIVE-4373) Hive Version returned by HiveDatabaseMetaData.getDatabaseProductVersion is incorrect
[ https://issues.apache.org/jira/browse/HIVE-4373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-4373:
    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Committed to trunk and the 0.11 branch. Thanks, Thejas!

Hive Version returned by HiveDatabaseMetaData.getDatabaseProductVersion is incorrect

Key: HIVE-4373
URL: https://issues.apache.org/jira/browse/HIVE-4373
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.11.0
Reporter: Deepesh Khandelwal
Assignee: Thejas M Nair
Priority: Minor
Fix For: 0.11.0
Attachments: HIVE-4373.1.patch, HIVE-4373.2.patch, HIVE-4373.3.patch

When running beeline:
{code}
% beeline -u 'jdbc:hive2://localhost:1' -n hive -p passwd -d org.apache.hive.jdbc.HiveDriver
Connecting to jdbc:hive2://localhost:1
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.11.0)
Transaction isolation: TRANSACTION_REPEATABLE_READ
{code}
The Hive version in the "Connected to:" string says 0.10.0 instead of 0.11.0. Looking at the code, it seems the version is hardcoded at two places:
line 250 in jdbc/src/java/org/apache/hive/jdbc/HiveDatabaseMetaData.java
line 833 in jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java
[jira] [Updated] (HIVE-4445) Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
[ https://issues.apache.org/jira/browse/HIVE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated HIVE-4445:
    Environment: Windows

Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases

Key: HIVE-4445
URL: https://issues.apache.org/jira/browse/HIVE-4445
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 0.11.0
Environment: Windows
Reporter: Xi Fang
Attachments: HIVE-4445.1.patch

The following unit tests fail on Windows because Linux scripts or commands are used in the test cases or .q files:
1. TestMinimrCliDriver: scriptfile1.q
2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q
3. TestCliDriver: hiveprofiler_script0.q
[jira] [Updated] (HIVE-4445) Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
[ https://issues.apache.org/jira/browse/HIVE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated HIVE-4445:
    Attachment: HIVE-4445.1.patch

Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases

Key: HIVE-4445
URL: https://issues.apache.org/jira/browse/HIVE-4445
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 0.11.0
Reporter: Xi Fang
Attachments: HIVE-4445.1.patch

The following unit tests fail on Windows because Linux scripts or commands are used in the test cases or .q files:
1. TestMinimrCliDriver: scriptfile1.q
2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q
3. TestCliDriver: hiveprofiler_script0.q
[jira] [Updated] (HIVE-4445) Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
[ https://issues.apache.org/jira/browse/HIVE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xi Fang updated HIVE-4445:
    Fix Version/s: 0.11.0
    Status: Patch Available (was: Open)

Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases

Key: HIVE-4445
URL: https://issues.apache.org/jira/browse/HIVE-4445
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 0.11.0
Environment: Windows
Reporter: Xi Fang
Fix For: 0.11.0
Attachments: HIVE-4445.1.patch

The following unit tests fail on Windows because Linux scripts or commands are used in the test cases or .q files:
1. TestMinimrCliDriver: scriptfile1.q
2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q
3. TestCliDriver: hiveprofiler_script0.q
[jira] [Commented] (HIVE-4445) Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases
[ https://issues.apache.org/jira/browse/HIVE-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644751#comment-13644751 ]

Xi Fang commented on HIVE-4445:

Update the .q script files for these test cases so that they can work on Windows:
1. TestMinimrCliDriver: scriptfile1.q
2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q
3. TestCliDriver: hiveprofiler_script0.q

Fix the Hive unit test failures on Windows when Linux scripts or commands are used in test cases

Key: HIVE-4445
URL: https://issues.apache.org/jira/browse/HIVE-4445
Project: Hive
Issue Type: Bug
Components: Tests
Affects Versions: 0.11.0
Environment: Windows
Reporter: Xi Fang
Fix For: 0.11.0
Attachments: HIVE-4445.1.patch

The following unit tests fail on Windows because Linux scripts or commands are used in the test cases or .q files:
1. TestMinimrCliDriver: scriptfile1.q
2. TestNegativeMinimrCliDriver: mapreduce_stack_trace_hadoop20.q, minimr_broken_pipe.q
3. TestCliDriver: hiveprofiler_script0.q
Re: Review Request: Implement vectorized string column-scalar filters
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10840/
---

(Updated April 29, 2013, 6:53 p.m.)

Review request for hive.

Changes
-------
Removed changes related to HIVE-4348.

Description
-----------
Implement vectorized string column-scalar filters. Includes changes equivalent to HIVE-4348 to correct unit test build failure on Windows in this branch.

This addresses bug HIVE-4383.
https://issues.apache.org/jira/browse/HIVE-4383

Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColGreaterEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColGreaterStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColLessEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColLessStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/gen/FilterStringColNotEqualStringScalar.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/CodeGen.java 318541b
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/templates/FilterStringColumnCompareScalar.txt PRE-CREATION
ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java fe34b11

Diff: https://reviews.apache.org/r/10840/diff/

Testing
-------

Thanks,
Eric Hanson
[jira] [Updated] (HIVE-4383) Implement vectorized string column-scalar filters
[ https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Hanson updated HIVE-4383:
    Attachment: HIVE-4384.4.patch

Removed changes related to HIVE-4348 (unit test compile failure).

Implement vectorized string column-scalar filters

Key: HIVE-4383
URL: https://issues.apache.org/jira/browse/HIVE-4383
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Attachments: HIVE-4383.1.patch, HIVE-4383.2.patch, HIVE-4383.3.patch, HIVE-4384.4.patch

Create a patch implementing string columns compared with scalars as vectorized filters, and apply it to the vectorization branch.
Re: Review Request: Implement vectorized string column-scalar filters
---
This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10840/#review19883
---

Ship it!

Ship It!

- Ashutosh Chauhan

On April 29, 2013, 6:53 p.m., Eric Hanson wrote: [...]
[jira] [Resolved] (HIVE-4383) Implement vectorized string column-scalar filters
[ https://issues.apache.org/jira/browse/HIVE-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan resolved HIVE-4383.
    Resolution: Fixed
    Fix Version/s: vectorization-branch

Committed to branch. Thanks, Eric!

Implement vectorized string column-scalar filters

Key: HIVE-4383
URL: https://issues.apache.org/jira/browse/HIVE-4383
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Eric Hanson
Fix For: vectorization-branch
Attachments: HIVE-4383.1.patch, HIVE-4383.2.patch, HIVE-4383.3.patch, HIVE-4384.4.patch

Create a patch implementing string columns compared with scalars as vectorized filters, and apply it to the vectorization branch.
[jira] [Commented] (HIVE-3739) Hive auto convert join result error: java.lang.InstantiationException: org.antlr.runtime.CommonToken
[ https://issues.apache.org/jira/browse/HIVE-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644781#comment-13644781 ]

Garry Turkington commented on HIVE-3739:

I've been seeing this issue on and off but couldn't find a pattern until today. The variable appears to be JDK 7 vs JDK 6. On a CDH 4.2.1 cluster running Oracle JDK 6 u32 64-bit, my Hive queries that make use of auto join run successfully. But as a test I tried running the cluster with Oracle JDK 7 u21 64-bit, and the same Hive queries throw lots of these Antlr exceptions. On further investigation, one of my client boxes has JDK 7 by default, and this is where the errors were seen in the past; JDK 6 clients didn't show the issue.

As mentioned, this was on Cloudera 4.2.1, i.e. Hive 0.10; I'm not sure whether core Hive views JDK 7 as a supported platform or not. I see other jiras resolved that fix JDK 7 problems, but the getting started page only mentions JDK 6: https://cwiki.apache.org/confluence/display/Hive/GettingStarted

Garry

Hive auto convert join result error: java.lang.InstantiationException: org.antlr.runtime.CommonToken

Key: HIVE-3739
URL: https://issues.apache.org/jira/browse/HIVE-3739
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0
Environment: hive.auto.convert.join=true
Reporter: fantasy

After I set hive.auto.convert.join=true, any HiveQL with a join executed in Hive results in an error like this:

java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
(repeats)

Can anyone tell why?
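The "failed to evaluate: <unbound>=Class.new();" lines are the signature of reflective bean instantiation (as used by java.beans.XMLEncoder, which Hive used at the time to serialize query plans): it can only recreate objects that expose a public no-arg constructor, and org.antlr.runtime.CommonToken does not have one. A hypothetical minimal illustration — TokenLike is an invented stand-in for CommonToken, not the real class:

```java
// Invented stand-in for org.antlr.runtime.CommonToken: only a one-arg constructor.
class TokenLike {
    final int type;
    TokenLike(int type) { this.type = type; }
}

public class InstantiationDemo {
    // Mirrors what reflective bean cloning attempts: Class.newInstance().
    static boolean canInstantiate(Class<?> c) {
        try {
            c.newInstance();
            return true;
        } catch (InstantiationException | IllegalAccessException e) {
            return false;  // CommonToken-style classes land here
        }
    }

    public static void main(String[] args) {
        System.out.println(canInstantiate(String.class));     // public no-arg ctor exists
        System.out.println(canInstantiate(TokenLike.class));  // throws InstantiationException
    }
}
```

Why the failure appears only under JDK 7 is not explained by this sketch alone; the JDK version plausibly changes which code path or bean introspection order is taken, which matches the "on and off" pattern Garry describes.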
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shreepadma Venugopalan updated HIVE-4435:
    Status: Patch Available (was: Open)

Column stats: Distinct value estimator should use hash functions that are pairwise independent

Key: HIVE-4435
URL: https://issues.apache.org/jira/browse/HIVE-4435
Project: Hive
Issue Type: Bug
Components: Statistics
Affects Versions: 0.10.0
Reporter: Shreepadma Venugopalan
Assignee: Shreepadma Venugopalan
Attachments: HIVE-4435.1.patch
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Attachment: HIVE-4435.1.patch Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4435.1.patch
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Description: The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
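For background (standard form of the estimator, not quoted from the patch): Flajolet-Martin hashes each value, tracks the maximum position of the least-significant set bit seen, and converts that into a distinct-value estimate. The analysis assumes the hash values spread uniformly, which is why a weak hash family applied to a monotonically increasing key column skews the result:

```latex
% \rho(y) = position of the least-significant 1-bit of y
% R = \max_{x \in \mathrm{column}} \rho(h(x))
\widehat{\mathrm{NDV}} \approx \frac{2^{R}}{\varphi},
\qquad \varphi \approx 0.77351
```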
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644840#comment-13644840 ] Shreepadma Venugopalan commented on HIVE-4435: -- The fix is to use hash functions that are pairwise independent. More on pairwise independence and family of hash functions - http://people.csail.mit.edu/ronitt/COURSE/S12/handouts/lec5.pdf Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
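For reference, the textbook pairwise-independent family covered in the lecture notes linked above is h(x) = ((a*x + b) mod p) mod m, with p prime and a, b drawn at random once per hash function. A sketch of that family (class name and constants are mine, not taken from the NumDistinctValueEstimator patch):

```java
import java.util.Random;

// Sketch of a pairwise-independent hash family h(x) = ((a*x + b) mod p) mod m.
// p is a prime larger than the universe; a in [1, p-1] and b in [0, p-1] are
// drawn once per hash function instance.
public class PairwiseHash {
    private static final long P = 2147483647L; // Mersenne prime 2^31 - 1
    private final long a; // in [1, p-1]
    private final long b; // in [0, p-1]
    private final long m; // number of buckets

    public PairwiseHash(long m, Random rnd) {
        this.m = m;
        this.a = 1 + rnd.nextInt((int) (P - 1));
        this.b = rnd.nextInt((int) P);
    }

    // x must lie in [0, 2^31 - 1] so a*x fits in a signed 64-bit long.
    public long hash(long x) {
        return ((a * x + b) % P) % m;
    }
}
```

For any two distinct inputs, the pair of hash values is uniform over all bucket pairs, which is exactly the property the FM analysis needs and a fixed ad-hoc hash lacks.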
Review Request: HIVE-4435: Column stats: Distinct value estimator should use hash functions that are pairwise independent
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10841/ --- Review request for hive. Description --- Fixes the FM estimator to use hash functions that are pairwise independent. This addresses bug HIVE-4435. https://issues.apache.org/jira/browse/HIVE-4435 Diffs - ql/src/java/org/apache/hadoop/hive/ql/udf/generic/NumDistinctValueEstimator.java 69e6f46 Diff: https://reviews.apache.org/r/10841/diff/ Testing --- The estimates are within the expected error after this fix. Tested on TPCH of varying sizes. Thanks, Shreepadma Venugopalan
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644844#comment-13644844 ] Shreepadma Venugopalan commented on HIVE-4435: -- review board: https://reviews.apache.org/r/10841/ Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shreepadma Venugopalan updated HIVE-4435: - Attachment: chart_1(1).png Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4435) Column stats: Distinct value estimator should use hash functions that are pairwise independent
[ https://issues.apache.org/jira/browse/HIVE-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644850#comment-13644850 ] Shreepadma Venugopalan commented on HIVE-4435: -- Attached plot of relative error vs. number of distinct values after the fix. Dataset: TPC-H of varying sizes up to 10TB hive.stats.ndv.error = 5% (standard error for the estimator) Column types: String, Long, Double Column stats: Distinct value estimator should use hash functions that are pairwise independent -- Key: HIVE-4435 URL: https://issues.apache.org/jira/browse/HIVE-4435 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Assignee: Shreepadma Venugopalan Attachments: chart_1(1).png, HIVE-4435.1.patch The current implementation of Flajolet-Martin estimator to estimate the number of distinct values doesn't use hash functions that are pairwise independent. This is problematic because the input values don't distribute uniformly. When run on large TPC-H data sets, this leads to a huge discrepancy for primary key columns. Primary key columns are typically a monotonically increasing sequence. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4446) [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444
[ https://issues.apache.org/jira/browse/HIVE-4446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-4446: - Attachment: HIVE-4446-1.patch [HCatalog] Documentation for HIVE-4442, HIVE-4443, HIVE-4444 Key: HIVE-4446 URL: https://issues.apache.org/jira/browse/HIVE-4446 Project: Hive Issue Type: Improvement Components: HCatalog Reporter: Daniel Dai Attachments: HIVE-4446-1.patch
[jira] [Updated] (HIVE-3682) when output hive table to file,users should could have a separator of their own choice
[ https://issues.apache.org/jira/browse/HIVE-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3682: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk and 0.11 branch. Thanks, Sushanth! when output hive table to file,users should could have a separator of their own choice -- Key: HIVE-3682 URL: https://issues.apache.org/jira/browse/HIVE-3682 Project: Hive Issue Type: New Feature Components: CLI Affects Versions: 0.8.1 Environment: Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux java version 1.6.0_25 hadoop-0.20.2-cdh3u0 hive-0.8.1 Reporter: caofangkun Assignee: Sushanth Sowmyan Fix For: 0.11.0 Attachments: HIVE-3682-1.patch, HIVE-3682.D10275.1.patch, HIVE-3682.D10275.2.patch, HIVE-3682.D10275.3.patch, HIVE-3682.D10275.4.patch, HIVE-3682.D10275.4.patch.for.0.11, HIVE-3682.with.serde.patch By default, when outputting a Hive table to a file, the columns are separated by the ^A character (that is, \001). But users should have the right to set a separator of their own choice. 
Usage Example:
create table for_test (key string, value string);
load data local inpath './in1.txt' into table for_test;
select * from for_test;
UT-01: default separator is \001, line separator is \n
insert overwrite local directory './test-01' select * from src ;
create table array_table (a array<string>, b array<string>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',';
load data local inpath '../hive/examples/files/arraytest.txt' overwrite into table table2;
CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' STORED AS TEXTFILE;
UT-02: field separator defined as ':'
insert overwrite local directory './test-02' row format delimited FIELDS TERMINATED BY ':' select * from src ;
UT-03: the line separator is NOT allowed to be defined as another separator
insert overwrite local directory './test-03' row format delimited FIELDS TERMINATED BY ':' select * from src ;
UT-04: define map separators
insert overwrite local directory './test-04' row format delimited FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':' select * from src;
[jira] [Commented] (HIVE-4349) Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters
[ https://issues.apache.org/jira/browse/HIVE-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13644998#comment-13644998 ] Carl Steinbach commented on HIVE-4349: -- I want to add that a better place for a patch like this is Ant's JUnit task. That way everyone automatically benefits from the fix without having to gunk up their build files with special case logic and bespoke Ant tasks. Based on the prevalence of this problem I'm kind of surprised that someone hasn't already done this. Fix the Hive unit test failures when the Hive enlistment root path is longer than ~12 characters Key: HIVE-4349 URL: https://issues.apache.org/jira/browse/HIVE-4349 Project: Hive Issue Type: Bug Affects Versions: 0.11.0 Reporter: Xi Fang Fix For: 0.11.0 Attachments: HIVE-4349.1.patch If the Hive enlistment root path is longer than 12 chars then test classpath “hadoop.testcp” is exceeding the 8K chars so we are unable to run most of the Hive unit tests on Windows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4385) Implement vectorized LIKE filter
[ https://issues.apache.org/jira/browse/HIVE-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4385: -- Attachment: HIVE-4385.3.patch Based off most recent vectorization branch Implement vectorized LIKE filter Key: HIVE-4385 URL: https://issues.apache.org/jira/browse/HIVE-4385 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4385.2.patch, HIVE-4385.3.patch
[jira] [Commented] (HIVE-4385) Implement vectorized LIKE filter
[ https://issues.apache.org/jira/browse/HIVE-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645038#comment-13645038 ] Eric Hanson commented on HIVE-4385: --- Code review available at https://reviews.apache.org/r/10844/ Implement vectorized LIKE filter Key: HIVE-4385 URL: https://issues.apache.org/jira/browse/HIVE-4385 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4385.2.patch, HIVE-4385.3.patch
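For context, the core idea of a vectorized filter like this one is to evaluate the predicate over a whole column batch in one tight loop and compact a selection vector in place, instead of interpreting the expression row by row. A simplified sketch of that pattern (my own code, specialized to a "prefix%" LIKE pattern over Java Strings; the actual patch handles general LIKE patterns):

```java
// Simplified sketch of a vectorized LIKE filter, specialized to "prefix%".
// sel holds the indices of rows still selected; the filter compacts it in
// place and returns the new selected-row count. Names are illustrative.
public class VectorizedLikeSketch {
    public static int filterPrefix(String[] col, int[] sel, int size, String prefix) {
        int newSize = 0;
        for (int i = 0; i < size; i++) {
            int row = sel[i];
            if (col[row] != null && col[row].startsWith(prefix)) {
                sel[newSize++] = row; // row survives the filter
            }
        }
        return newSize;
    }
}
```

The payoff is that the per-row work is a branch and an array store, with no virtual calls or per-row object allocation inside the loop.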
[jira] [Updated] (HIVE-4385) Implement vectorized LIKE filter
[ https://issues.apache.org/jira/browse/HIVE-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Hanson updated HIVE-4385: -- Status: Patch Available (was: Open) apply to vectorization branch Implement vectorized LIKE filter Key: HIVE-4385 URL: https://issues.apache.org/jira/browse/HIVE-4385 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Eric Hanson Attachments: HIVE-4385.2.patch, HIVE-4385.3.patch
Build failed in Jenkins: Hive-0.10.0-SNAPSHOT-h0.20.1 #135
See https://builds.apache.org/job/Hive-0.10.0-SNAPSHOT-h0.20.1/135/ -- [...truncated 41967 lines...] [junit] Hadoop job information for null: number of mappers: 0; number of reducers: 0 [junit] 2013-04-29 16:33:11,107 null map = 100%, reduce = 100% [junit] Ended Job = job_local_0001 [junit] Execution completed successfully [junit] Mapred Local Task Succeeded . Convert the Join into MapJoin [junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-04-29_16-33-07_856_7912485236401692106/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201304291633_223742817.txt [junit] Copying file: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 
'/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] Table default.testhivedrivertable stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 5812, raw_data_size: 0] [junit] POSTHOOK: query: load data local inpath '/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-04-29_16-33-12_495_3859802960431102280/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/localscratchdir/hive_2013-04-29_16-33-12_495_3859802960431102280/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history 
file=/x1/jenkins/jenkins-slave/workspace/Hive-0.10.0-SNAPSHOT-h0.20.1/hive/build/service/tmp/hive_job_log_jenkins_201304291633_605803189.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable
[jira] [Created] (HIVE-4447) hcatalog version numbers need to be updated
Ashutosh Chauhan created HIVE-4447: -- Summary: hcatalog version numbers need to be updated Key: HIVE-4447 URL: https://issues.apache.org/jira/browse/HIVE-4447 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Ashutosh Chauhan
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #362
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/362/ -- [...truncated 36511 lines...] [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/jenkins/hive_2013-04-29_16-57-50_551_7553408207680302479/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/362/artifact/hive/build/service/tmp/hive_job_log_jenkins_201304291657_645085803.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] Copying file: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 'https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath 'https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: 
DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/jenkins/hive_2013-04-29_16-57-55_154_3606858085066822556/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/jenkins/hive_2013-04-29_16-57-55_154_3606858085066822556/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/362/artifact/hive/build/service/tmp/hive_job_log_jenkins_201304291657_1454527651.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: 
default@testhivedrivertable [junit] OK [junit] Hive history file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/362/artifact/hive/build/service/tmp/hive_job_log_jenkins_201304291657_1510446235.txt [junit] Hive history file=https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/362/artifact/hive/build/service/tmp/hive_job_log_jenkins_201304291657_1387074030.txt [junit] Copying file: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/ws/hive/data/files/kv1.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK:
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Affects Version/s: 0.12.0 Fix Version/s: 0.12.0 JDBC2 HiveConnection has odd defaults - Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Chris Drome Assignee: Chris Drome Fix For: 0.11.0, 0.12.0 Attachments: HIVE-4232-1.patch, HIVE-4232-2.patch, HIVE-4232.patch HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport auth must be set to noSasl; furthermore noSasl is case sensitive. Code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying QOP level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: HIVE-4232-trunk-3.patch HIVE-4232-0.11-3.patch New patch fixes test failure. JDBC2 HiveConnection has odd defaults - Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Chris Drome Assignee: Chris Drome Fix For: 0.11.0, 0.12.0 Attachments: HIVE-4232-0.11-3.patch, HIVE-4232-1.patch, HIVE-4232-2.patch, HIVE-4232.patch, HIVE-4232-trunk-3.patch HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport auth must be set to noSasl; furthermore noSasl is case sensitive. Code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying QOP level.
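To make the described behavior concrete: with the current defaults, a client connecting to a HiveServer2 that runs without SASL must request the raw transport explicitly in the connection URL, and the value is case-sensitive ("noSasl", not "nosasl"). Host, port, and database below are placeholders:

```
jdbc:hive2://<host>:<port>/<db>;auth=noSasl
```

Omitting the auth parameter makes HiveConnection fall back to a plain SASL transport, which is the mismatch the issue calls out.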
[jira] [Updated] (HIVE-4392) Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4392: -- Attachment: HIVE-4392.D10431.4.patch navis updated the revision HIVE-4392 [jira] Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns. Addressed comments Reviewers: ashutoshc, JIRA REVISION DETAIL https://reviews.facebook.net/D10431 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D10431?vs=32847id=33177#toc AFFECTED FILES metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java metastore/src/java/org/apache/hadoop/hive/metastore/MetaStoreUtils.java ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/test/queries/clientpositive/ctas_colname.q ql/src/test/results/clientpositive/ctas_colname.q.out To: JIRA, ashutoshc, navis Cc: hbutani Illogical InvalidObjectException throwed when use mulit aggregate functions with star columns -- Key: HIVE-4392 URL: https://issues.apache.org/jira/browse/HIVE-4392 Project: Hive Issue Type: Bug Components: Query Processor Environment: Apache Hadoop 0.20.1 Apache Hive Trunk Reporter: caofangkun Assignee: Navis Priority: Minor Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch For Example: hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0003, Tracking URL = 
http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0003 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS Total MapReduce CPU Time Spent: 0 msec hive (default) create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks not specified. Estimated from input data size: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=number In order to limit the maximum number of reducers: set hive.exec.reducers.max=number In order to set a constant number of reducers: set mapred.reduce.tasks=number Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004 Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004 Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1 2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0% 2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100% 2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100% Ended Job = job_201304191025_0004 Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1 FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name) FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS Total MapReduce CPU Time Spent: 0 msec But the following two queries work: hive (default) create table liza_1 as select * from new_src; Total MapReduce jobs = 3 Launching Job 1 out of 3 Number of reduce tasks is set to 0 since there's no reduce operator Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006 Kill Command =
[jira] [Commented] (HIVE-4392) Illogical InvalidObjectException thrown when using multiple aggregate functions with star columns
[ https://issues.apache.org/jira/browse/HIVE-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645096#comment-13645096 ] Navis commented on HIVE-4392: - [~rhbutani] / [~ashutoshc] Updated patch. Could you take a look at it?

Illogical InvalidObjectException thrown when using multiple aggregate functions with star columns
Key: HIVE-4392
URL: https://issues.apache.org/jira/browse/HIVE-4392
Project: Hive
Issue Type: Bug
Components: Query Processor
Environment: Apache Hadoop 0.20.1, Apache Hive Trunk
Reporter: caofangkun
Assignee: Navis
Priority: Minor
Attachments: HIVE-4392.D10431.1.patch, HIVE-4392.D10431.2.patch, HIVE-4392.D10431.3.patch, HIVE-4392.D10431.4.patch

For example:

hive (default)> create table liza_1 as select *, sum(key), sum(value) from new_src;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>
In order to set a constant number of reducers: set mapred.reduce.tasks=<number>
Starting Job = job_201304191025_0003, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0003
Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0003
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
2013-04-22 11:09:28,017 Stage-1 map = 0%, reduce = 0%
2013-04-22 11:09:34,054 Stage-1 map = 0%, reduce = 100%
2013-04-22 11:09:37,074 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0003
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 12 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

hive (default)> create table liza_1 as select *, sum(key), sum(value) from new_src group by key, value;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number>
In order to set a constant number of reducers: set mapred.reduce.tasks=<number>
Starting Job = job_201304191025_0004, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0004
Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0004
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 1
2013-04-22 11:11:58,945 Stage-1 map = 0%, reduce = 0%
2013-04-22 11:12:01,964 Stage-1 map = 0%, reduce = 100%
2013-04-22 11:12:04,982 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0004
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
FAILED: Error in metadata: InvalidObjectException(message:liza_1 is not a valid object name)
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
MapReduce Jobs Launched: Job 0: Reduce: 1 HDFS Read: 0 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec

But the following two queries work:

hive (default)> create table liza_1 as select * from new_src;
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201304191025_0006, Tracking URL = http://hd17-vm5:51030/jobdetails.jsp?jobid=job_201304191025_0006
Kill Command = /home/zongren/hadoop-current/bin/../bin/hadoop job -kill job_201304191025_0006
Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2013-04-22 11:15:00,681 Stage-1 map = 0%, reduce = 0%
2013-04-22 11:15:03,697 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201304191025_0006
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive-scratchdir/hive_2013-04-22_11-14-54_632_6709035018023861094/-ext-10001
Moving data to: hdfs://hd17-vm5:9101/user/zongren/hive/liza_1
Table default.liza_1 stats: [num_partitions: 0, num_files: 0, num_rows: 0, total_size: 0, raw_data_size: 0]
MapReduce Jobs
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: (was: HIVE-4232-0.11-3.patch) JDBC2 HiveConnection has odd defaults - Key: HIVE-4232 URL: https://issues.apache.org/jira/browse/HIVE-4232 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0, 0.12.0 Reporter: Chris Drome Assignee: Chris Drome Fix For: 0.11.0, 0.12.0 Attachments: HIVE-4232-1.patch, HIVE-4232-2.patch, HIVE-4232.patch HiveConnection defaults to using a plain SASL transport if auth is not set. To get a raw transport auth must be set to noSasl; furthermore noSasl is case sensitive. Code tries to infer Kerberos or plain authentication based on the presence of principal. There is no provision for specifying QOP level. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3739) Hive auto convert join result error: java.lang.InstantiationException: org.antlr.runtime.CommonToken
[ https://issues.apache.org/jira/browse/HIVE-3739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645102#comment-13645102 ] Navis commented on HIVE-3739: - It's XMLEncoder complaining that the AST is not serializable (similar to HIVE-4222). Is Hive in CDH different from vanilla Hive?

Hive auto convert join result error: java.lang.InstantiationException: org.antlr.runtime.CommonToken
Key: HIVE-3739
URL: https://issues.apache.org/jira/browse/HIVE-3739
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 0.9.0
Environment: hive.auto.convert.join=true
Reporter: fantasy

After I set hive.auto.convert.join=true, any HiveQL with a join executed in Hive results in an error like this:

java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...
java.lang.InstantiationException: org.antlr.runtime.CommonToken
Continuing ...
java.lang.RuntimeException: failed to evaluate: <unbound>=Class.new();
Continuing ...

Can anyone tell why?
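The "Continuing ..." lines are printed by XMLEncoder's default ExceptionListener. A minimal, self-contained sketch (not Hive code; the `TokenLike` class is a hypothetical stand-in for org.antlr.runtime.CommonToken) reproduces the failure mode: XMLEncoder can only persist bean-style classes with a public no-arg constructor, which CommonToken lacks.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class XmlEncoderDemo {
    // Stand-in for org.antlr.runtime.CommonToken: no public no-arg
    // constructor, so XMLEncoder cannot re-instantiate it.
    public static class TokenLike {
        private final int type;
        public TokenLike(int type) { this.type = type; }
        public int getType() { return type; }
    }

    // Encode one TokenLike and return how many failures the listener saw.
    // With the default listener these would surface as the
    // "InstantiationException ... Continuing ..." lines in the log above.
    public static int countFailures() {
        List<Exception> errors = new ArrayList<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        XMLEncoder enc = new XMLEncoder(out);
        enc.setExceptionListener(errors::add);
        enc.writeObject(new TokenLike(42));
        enc.close();
        return errors.size();
    }

    public static void main(String[] args) {
        System.out.println("encoding failures: " + countFailures());
    }
}
```

This is why the map-join plan serialization breaks as soon as an AST node such as CommonToken ends up inside the object graph handed to XMLEncoder.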
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: HIVE-4232-0.11-3.patch Missed a file.
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: (was: HIVE-4232-0.11-3.patch)
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: (was: HIVE-4232-trunk-3.patch)
[jira] [Updated] (HIVE-4232) JDBC2 HiveConnection has odd defaults
[ https://issues.apache.org/jira/browse/HIVE-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Drome updated HIVE-4232: -- Attachment: HIVE-4232-3-trunk.patch HIVE-4232-3-0.11.patch Uploaded renamed files.
ObjectInspectorUtils.copyToStandardObject and union types
Calling ObjectInspectorUtils.copyToStandardObject() on a union type object returns just the object without any sort of information about its type. (See lines 302-310, which start case UNION:.) I'd like to change the method to return an object of type UnionObject so that we retain type information. Does anyone have any issues with this? Siyang Chen
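The proposal can be sketched with a minimal, self-contained model (not Hive's actual `UnionObject`/`ObjectInspector` classes; the names below are illustrative): today the copy returns only the bare value, while the proposed change copies the tag alongside it so the standard object still knows which union alternative it holds.

```java
public class UnionCopyDemo {
    // Minimal stand-in for a union value: a tag selecting the active
    // alternative, plus the value itself.
    public static final class Union {
        public final byte tag;
        public final Object value;
        public Union(byte tag, Object value) { this.tag = tag; this.value = value; }
        public byte getTag() { return tag; }
        public Object getObject() { return value; }
    }

    // Current behavior (sketch): the copy strips the tag.
    public static Object copyLosingTag(Union u) {
        return u.getObject();
    }

    // Proposed behavior (sketch): copy tag and value together so type
    // information about the active alternative survives the copy.
    public static Union copyKeepingTag(Union u) {
        return new Union(u.getTag(), u.getObject());
    }

    public static void main(String[] args) {
        Union u = new Union((byte) 1, "hello");
        Object lossy = copyLosingTag(u);       // just "hello"; which alternative it was is gone
        Union faithful = copyKeepingTag(u);    // the tag survives
        System.out.println(lossy + " vs tag=" + faithful.getTag());
    }
}
```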
[jira] [Updated] (HIVE-2379) Hive/HBase integration could be improved
[ https://issues.apache.org/jira/browse/HIVE-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2379: Attachment: HIVE-2379-0.11.patch.txt Sorry, missed this.

Hive/HBase integration could be improved
Key: HIVE-2379
URL: https://issues.apache.org/jira/browse/HIVE-2379
Project: Hive
Issue Type: Bug
Components: CLI, Clients, HBase Handler
Affects Versions: 0.7.1, 0.8.0, 0.9.0
Reporter: Roman Shaposhnik
Assignee: Navis
Priority: Critical
Fix For: 0.12.0
Attachments: HIVE-2379-0.11.patch.txt, HIVE-2379.D7347.1.patch, HIVE-2379.D7347.2.patch, HIVE-2379.D7347.3.patch

For now, any Hive/HBase query requires the following jars to be explicitly added via Hive's add jar command:
add jar /usr/lib/hive/lib/hbase-0.90.1-cdh3u0.jar;
add jar /usr/lib/hive/lib/hive-hbase-handler-0.7.0-cdh3u0.jar;
add jar /usr/lib/hive/lib/zookeeper-3.3.1.jar;
add jar /usr/lib/hive/lib/guava-r06.jar;
The longer-term solution, perhaps, is to have the code call HBase's TableMapReduceUtil.addDependencyJars(job, HBaseStorageHandler.class) at submit time to ship it via the DistributedCache.
[jira] [Updated] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it
[ https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-4209: Attachment: HIVE-4209.6.patch.txt

Cache evaluation result of deterministic expression and reuse it
Key: HIVE-4209
URL: https://issues.apache.org/jira/browse/HIVE-4209
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-4209.6.patch.txt, HIVE-4209.D9585.1.patch, HIVE-4209.D9585.2.patch, HIVE-4209.D9585.3.patch, HIVE-4209.D9585.4.patch, HIVE-4209.D9585.5.patch

For example:
{noformat}
select key from src where key + 1 > 100 AND key + 1 < 200 limit 3;
{noformat}
key + 1 need not be evaluated twice.
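The idea can be sketched outside Hive's ExprNodeEvaluator machinery (the classes below are illustrative, not Hive code): a deterministic sub-expression like `key + 1` that appears twice in a predicate is evaluated once per row and the result reused.

```java
public class ExprCacheDemo {
    // Count how many times the deterministic expression is evaluated.
    public static int evalCount = 0;

    // The deterministic sub-expression "key + 1".
    public static int keyPlusOne(int key) {
        evalCount++;
        return key + 1;
    }

    // Without caching: "key + 1 > 100 AND key + 1 < 200" evaluates
    // key + 1 twice when the first conjunct passes.
    public static boolean filterNoCache(int key) {
        return keyPlusOne(key) > 100 && keyPlusOne(key) < 200;
    }

    // With caching: evaluate once per row and reuse the result.
    public static boolean filterCached(int key) {
        int cached = keyPlusOne(key);
        return cached > 100 && cached < 200;
    }

    public static void main(String[] args) {
        evalCount = 0;
        filterNoCache(150);
        int without = evalCount;   // two evaluations for this row
        evalCount = 0;
        filterCached(150);
        int with = evalCount;      // one evaluation for this row
        System.out.println("evaluations: " + without + " vs " + with);
    }
}
```

In Hive the cache would be invalidated per row; only deterministic (non-stateful) UDFs are eligible for this reuse.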
[jira] [Created] (HIVE-4448) Fix metastore warehouse incorrect path on Windows for test case TestExecDriver and TestHiveMetaStoreChecker
Shuaishuai Nie created HIVE-4448:
Summary: Fix metastore warehouse incorrect path on Windows for test case TestExecDriver and TestHiveMetaStoreChecker
Key: HIVE-4448
URL: https://issues.apache.org/jira/browse/HIVE-4448
Project: Hive
Issue Type: Bug
Components: Testing Infrastructure
Affects Versions: 0.11.0
Environment: Windows
Reporter: Shuaishuai Nie
Assignee: Shuaishuai Nie
Attachments: HIVE-4448.1.patch

Unit test cases that do not use QTestUtil pass an incompatible Windows path for METASTOREWAREHOUSE to HiveConf, which results in the /test/data/warehouse folder being created in the wrong location on Windows. This folder is not deleted at the beginning of the unit test, and its contents cause unit tests to fail if the same test case is run repeatedly. The root cause is that for a path like pfile://C:\hive\build\ql/test/data/warehouse, the C:\hive\build\ part is parsed as the authority of the path and removed from the path string. The patch fixes this problem and makes the unit test results consistent between Windows and Linux.
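The failure mode can be illustrated with a naive `scheme://authority/path` split (this mimics the spirit of URI-style parsing, not Hive's or Hadoop's actual code): the drive-letter prefix of the unconverted Windows path lands in the authority component and disappears from the path.

```java
public class PathAuthorityDemo {
    // Naive URI-style split into {scheme, authority, path}: the
    // authority is everything between "://" and the next '/'.
    public static String[] split(String s) {
        int schemeEnd = s.indexOf("://");
        String scheme = s.substring(0, schemeEnd);
        String rest = s.substring(schemeEnd + 3);
        int slash = rest.indexOf('/');
        String authority = slash < 0 ? rest : rest.substring(0, slash);
        String path = slash < 0 ? "" : rest.substring(slash);
        return new String[] { scheme, authority, path };
    }

    public static void main(String[] args) {
        // Backslashes are not path separators here, so the whole
        // Windows prefix is swallowed by the authority component.
        String[] parts = split("pfile://C:\\hive\\build\\ql/test/data/warehouse");
        System.out.println("authority=" + parts[1] + " path=" + parts[2]);
    }
}
```

This is why the warehouse directory ends up rooted at /test/data/warehouse instead of under the build tree, and why converting backslashes to forward slashes before handing the path to HiveConf fixes the tests.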
[jira] [Updated] (HIVE-4448) Fix metastore warehouse incorrect path on Windows for test case TestExecDriver and TestHiveMetaStoreChecker
[ https://issues.apache.org/jira/browse/HIVE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4448: - Attachment: (was: HIVE-4448.1.patch)
[jira] [Updated] (HIVE-4448) Fix metastore warehouse incorrect path on Windows for test case TestExecDriver and TestHiveMetaStoreChecker
[ https://issues.apache.org/jira/browse/HIVE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuaishuai Nie updated HIVE-4448: - Status: Patch Available (was: Open)
[jira] [Commented] (HIVE-4448) Fix metastore warehouse incorrect path on Windows for test case TestExecDriver and TestHiveMetaStoreChecker
[ https://issues.apache.org/jira/browse/HIVE-4448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645130#comment-13645130 ] Shuaishuai Nie commented on HIVE-4448: -- All the unit test cases that do not use QTestUtil can use the same method from the patch to convert the paths in HiveConf and avoid inconsistent test results on Windows.
[jira] [Updated] (HIVE-4447) hcatalog version numbers need to be updated
[ https://issues.apache.org/jira/browse/HIVE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4447: - Attachment: hcat-releasable-build.patch hcatalog version numbers need to be updated Key: HIVE-4447 URL: https://issues.apache.org/jira/browse/HIVE-4447 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Ashutosh Chauhan Attachments: hcat-releasable-build.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4447) hcatalog version numbers need to be updated
[ https://issues.apache.org/jira/browse/HIVE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4447: - Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive
[ https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-4196: -- Attachment: HIVE-4196.v1.patch

Draft patch for review, based on phase 1 mentioned in the design doc. Deviates slightly:
1) Adds a couple of (temporary) REST calls to enable/disable streaming on a table. Later these will be replaced with support in DDL.
2) All HTTP methods are GET for easy testing with a web browser.
3) Authentication is disabled on the new streaming HTTP methods.

Usage examples on a db named 'sdb' and a table named 'log':
1) *Set up the db and table with a single partition column 'date':*
hcat -e "create database sdb; use sdb; create table log(msg string, region string) partitioned by (date string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE;"
2) *Check streaming status:*
http://localhost:50111/templeton/v1/streaming/status?database=sdb&table=log
3) *Enable streaming:*
http://localhost:50111/templeton/v1/streaming/enable?database=sdb&table=log&col=date&value=1000
4) *Get a chunk file to write to:*
http://localhost:50111/templeton/v1/streaming/chunkget?database=sdb&table=log&schema=blah&format=blah&record_separator=blah&field_separator=blah
5) *Commit a chunk file:*
http://localhost:50111/templeton/v1/streaming/chunkcommit?database=sdb&table=log&chunkfile=/user/hive/streaming/tmp/sdb/log/2
6) *Abort a chunk file:*
http://localhost:50111/templeton/v1/streaming/chunkabort?database=sdb&table=log&chunkfile=/user/hive/streaming/tmp/sdb/log/3
7) *Roll a partition:*
http://localhost:50111/templeton/v1/streaming/partitionroll?database=sdb&table=log&partition_column=date&partition_value=3000

Support for Streaming Partitions in Hive
Key: HIVE-4196
URL: https://issues.apache.org/jira/browse/HIVE-4196
Project: Hive
Issue Type: New Feature
Components: Database/Schema, HCatalog
Affects Versions: 0.10.1
Reporter: Roshan Naik
Assignee: Roshan Naik
Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign.docx, HIVE-4196.v1.patch

Motivation: Allow Hive users to immediately query data streaming in through clients such as Flume. Currently Hive partitions must be created after all the data for the partition is available. Thereafter, data in the partitions is considered immutable. This proposal introduces the notion of a streaming partition into which new files can be committed periodically and made available for queries before the partition is closed and converted into a standard partition.

The admin enables streaming partition on a table using DDL. He provides the following pieces of information:
- Name of the partition in the table on which streaming is enabled
- Frequency at which the streaming partition should be closed and converted into a standard partition.

Tables with streaming partition enabled will be partitioned by one and only one column. It is assumed that this column will contain a timestamp. Closing the current streaming partition converts it into a standard partition. Based on the specified frequency, the current streaming partition is closed and a new one created for future writes. This is referred to as 'rolling the partition'.

A streaming partition's life cycle is as follows:
- A new streaming partition is instantiated for writes.
- Streaming clients request (via WebHCat) an HDFS file name into which they can write a chunk of records for a specific table.
- Streaming clients write a chunk (via WebHDFS) to that file and commit it (via WebHCat). Committing merely indicates that the chunk has been written completely and is ready for serving queries.
- When the partition is rolled, all committed chunks are swept into a single directory and a standard partition pointing to that directory is created. The streaming partition is closed and a new streaming partition is created. Rolling the partition is atomic. Streaming clients are agnostic of partition rolling.
- Hive queries will be able to query the partition that is currently open for streaming. Only committed chunks will be visible. Read consistency will be ensured so that repeated reads of the same partition are idempotent for the lifespan of the query.

Partition rolling requires an active agent/thread running to check when it is time to roll and trigger the roll. This could be achieved either by using an external agent such as Oozie (preferably) or an internal agent.
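The ampersands in the templeton query strings above were eaten by the mail archiver; a small sketch (the base URL and parameter names are taken from the draft patch's temporary GET endpoints, and everything else is illustrative) shows how the URLs are assembled with '&'-joined parameters:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StreamingUrlDemo {
    public static final String BASE = "http://localhost:50111/templeton/v1/streaming";

    // Build a streaming REST URL, joining query parameters with '&'.
    public static String url(String op, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(BASE).append('/').append(op).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> p = new LinkedHashMap<>();   // preserves parameter order
        p.put("database", "sdb");
        p.put("table", "log");
        System.out.println(url("status", p));
    }
}
```

No HTTP call is made here; the sketch only shows the URL shape that a streaming client would GET.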
[jira] [Commented] (HIVE-4447) hcatalog version numbers need to be updated
[ https://issues.apache.org/jira/browse/HIVE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645171#comment-13645171 ] Ashutosh Chauhan commented on HIVE-4447: +1
Review Request: Draft patch for review. Based on phase 1 mentioned in design doc.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/10857/ ---

Review request for hive.

Description
---
Draft patch for review, based on phase 1 mentioned in the design doc. Deviates slightly from the doc in the following ways:
1) Adds a couple of (temporary) REST calls to enable/disable streaming on a table. Later these will be replaced with support in DDL.
2) All HTTP methods are GET for easy testing with a web browser.
3) Authentication is disabled on the new streaming HTTP methods.

Usage examples on a db named 'sdb' and a table named 'log':
1) Set up the db and table with a single partition column 'date':
hcat -e "create database sdb; use sdb; create table log(msg string, region string) partitioned by (date string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' STORED AS TEXTFILE;"
2) Check streaming status:
http://localhost:50111/templeton/v1/streaming/status?database=sdb&table=log
3) Enable streaming:
http://localhost:50111/templeton/v1/streaming/enable?database=sdb&table=log&col=date&value=1000
4) Get a chunk file to write to:
http://localhost:50111/templeton/v1/streaming/chunkget?database=sdb&table=log&schema=blah&format=blah&record_separator=blah&field_separator=blah
5) Commit a chunk file:
http://localhost:50111/templeton/v1/streaming/chunkcommit?database=sdb&table=log&chunkfile=/user/hive/streaming/tmp/sdb/log/2
6) Abort a chunk file:
http://localhost:50111/templeton/v1/streaming/chunkabort?database=sdb&table=log&chunkfile=/user/hive/streaming/tmp/sdb/log/3
7) Roll a partition:
http://localhost:50111/templeton/v1/streaming/partitionroll?database=sdb&table=log&partition_column=date&partition_value=3000

This addresses bug HIVE-4196.
https://issues.apache.org/jira/browse/HIVE-4196 Diffs - common/src/java/org/apache/hadoop/hive/conf/HiveConf.java c61d95b hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/HcatStreamingDelegator.java PRE-CREATION hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/Server.java 29ac4b3 metastore/if/hive_metastore.thrift c2051f4 metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 7b31d28 metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 3d69472 metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 3b90b44 metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SkewedInfo.java d8d6e71 metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/SkewedValueList.java 030b54a metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 5929cda metastore/src/gen/thrift/gen-php/metastore/ThriftHiveMetastore.php a69d214 metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 6fd2cce metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 9b856e5 metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 25aa30c metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java dc14084 metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java cef50f4 metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java a2d6b1b metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 2079337 metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 233fb46 metastore/src/model/org/apache/hadoop/hive/metastore/model/MTable.java 2a78ce9 metastore/src/model/package.jdo a84d2bf metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 00eb0b4 Diff: https://reviews.apache.org/r/10857/diff/ Testing --- Manual testing only Thanks, Roshan Naik
[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive
[ https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-4196: -- Attachment: (was: HCatalogStreamingIngestFunctionalSpecificationandDesign.docx) Support for Streaming Partitions in Hive Key: HIVE-4196 URL: https://issues.apache.org/jira/browse/HIVE-4196 Project: Hive Issue Type: New Feature Components: Database/Schema, HCatalog Affects Versions: 0.10.1 Reporter: Roshan Naik Assignee: Roshan Naik Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- apr 29- patch1.docx, HIVE-4196.v1.patch Motivation: Allow Hive users to immediately query data streaming in through clients such as Flume. Currently Hive partitions must be created after all the data for the partition is available. Thereafter, data in the partitions is considered immutable. This proposal introduces the notion of a streaming partition into which new files an be committed periodically and made available for queries before the partition is closed and converted into a standard partition. The admin enables streaming partition on a table using DDL. He provides the following pieces of information: - Name of the partition in the table on which streaming is enabled - Frequency at which the streaming partition should be closed and converted into a standard partition. Tables with streaming partition enabled will be partitioned by one and only one column. It is assumed that this column will contain a timestamp. Closing the current streaming partition converts it into a standard partition. Based on the specified frequency, the current streaming partition is closed and a new one created for future writes. This is referred to as 'rolling the partition'. A streaming partition's life cycle is as follows: - A new streaming partition is instantiated for writes - Streaming clients request (via webhcat) for a HDFS file name into which they can write a chunk of records for a specific table. 
- Streaming clients write a chunk (via webhdfs) to that file and commit it (via webhcat). Committing merely indicates that the chunk has been written completely and is ready for serving queries.
- When the partition is rolled, all committed chunks are swept into a single directory and a standard partition pointing to that directory is created. The streaming partition is closed and a new streaming partition is created. Rolling the partition is atomic. Streaming clients are agnostic of partition rolling.
- Hive queries will be able to query the partition that is currently open for streaming. Only committed chunks will be visible. Read consistency will be ensured so that repeated reads of the same partition will be idempotent for the lifespan of the query.
Partition rolling requires an active agent/thread running to check when it is time to roll and trigger the roll. This could be achieved either by using an external agent such as Oozie (preferably) or an internal agent. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
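The life cycle above (request a chunk, write, commit, roll) can be sketched as a toy in-memory model. This is purely illustrative: the class and method names are hypothetical stand-ins for the proposed webhcat/webhdfs interactions, not an actual HCatalog API.

```python
import uuid

class StreamingPartition:
    """Toy model of the proposed streaming-partition life cycle.

    Hypothetical sketch only: chunk allocation stands in for asking
    webhcat for an HDFS file name, and commit/roll mirror the
    visibility rules described in the proposal.
    """
    def __init__(self):
        self.open_chunks = {}        # chunk id -> records, not yet visible
        self.committed = {}          # chunk id -> records, visible to queries
        self.closed_partitions = []  # rolled (standard) partitions

    def request_chunk(self):
        # Client asks for a destination to write a chunk of records into.
        chunk_id = str(uuid.uuid4())
        self.open_chunks[chunk_id] = []
        return chunk_id

    def write(self, chunk_id, records):
        # Writing alone does not make the records query-visible.
        self.open_chunks[chunk_id].extend(records)

    def commit(self, chunk_id):
        # Committing only marks the chunk complete and query-visible.
        self.committed[chunk_id] = self.open_chunks.pop(chunk_id)

    def query(self):
        # Queries see committed chunks only.
        return [r for chunk in self.committed.values() for r in chunk]

    def roll(self):
        # Sweep all committed chunks into one standard partition;
        # uncommitted chunks carry over to the new streaming partition.
        self.closed_partitions.append(self.query())
        self.committed = {}

part = StreamingPartition()
chunk = part.request_chunk()
part.write(chunk, ["r1", "r2"])
visible_before = part.query()   # uncommitted, so invisible
part.commit(chunk)
visible_after = part.query()    # now visible
part.roll()                     # swept into a standard partition
```

Note how the model keeps rolling atomic from the client's point of view: a chunk is either in the streaming partition or in exactly one closed partition, never both.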
[jira] [Updated] (HIVE-4196) Support for Streaming Partitions in Hive
[ https://issues.apache.org/jira/browse/HIVE-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roshan Naik updated HIVE-4196: -- Attachment: HCatalogStreamingIngestFunctionalSpecificationandDesign- apr 29- patch1.docx Support for Streaming Partitions in Hive Key: HIVE-4196 URL: https://issues.apache.org/jira/browse/HIVE-4196 Project: Hive Issue Type: New Feature Components: Database/Schema, HCatalog Affects Versions: 0.10.1 Reporter: Roshan Naik Assignee: Roshan Naik Attachments: HCatalogStreamingIngestFunctionalSpecificationandDesign- apr 29- patch1.docx, HIVE-4196.v1.patch
[jira] [Commented] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it
[ https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645175#comment-13645175 ] Namit Jain commented on HIVE-4209: -- Thanks [~navis] Looks good. Can you commit it if tests pass? +1 Cache evaluation result of deterministic expression and reuse it Key: HIVE-4209 URL: https://issues.apache.org/jira/browse/HIVE-4209 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-4209.6.patch.txt, HIVE-4209.D9585.1.patch, HIVE-4209.D9585.2.patch, HIVE-4209.D9585.3.patch, HIVE-4209.D9585.4.patch, HIVE-4209.D9585.5.patch For example, {noformat} select key from src where key + 1 > 100 AND key + 1 < 200 limit 3; {noformat} key + 1 need not be evaluated twice.
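The optimization discussed here can be illustrated by memoizing a deterministic expression's value per input row, so that both predicates referencing `key + 1` share one evaluation. This is a hypothetical sketch, not Hive's actual `ExprNodeEvaluator` implementation.

```python
class CachedExprEvaluator:
    """Sketch of caching a deterministic expression's result per row.

    Hypothetical stand-in for Hive's expression evaluator: since the
    expression is deterministic, its value for a given row can be
    computed once and reused by every predicate that references it.
    """
    def __init__(self, fn):
        self.fn = fn
        self.calls = 0          # how many real evaluations happened
        self._row = object()    # sentinel: no row evaluated yet
        self._value = None

    def evaluate(self, row):
        if row is not self._row:            # new row: compute and cache
            self.calls += 1
            self._row, self._value = row, self.fn(row)
        return self._value                  # same row: reuse cached value

# 'key + 1' as a deterministic expression over a row
expr = CachedExprEvaluator(lambda row: row["key"] + 1)
row = {"key": 150}
# Both predicates share a single evaluation of key + 1:
result = expr.evaluate(row) > 100 and expr.evaluate(row) < 200
```

With the cache, the filter `key + 1 > 100 AND key + 1 < 200` computes the addition once per row instead of twice.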
[jira] [Commented] (HIVE-4209) Cache evaluation result of deterministic expression and reuse it
[ https://issues.apache.org/jira/browse/HIVE-4209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645176#comment-13645176 ] Phabricator commented on HIVE-4209: --- njain has accepted the revision HIVE-4209 [jira] Cache evaluation result of deterministic expression and reuse it. REVISION DETAIL https://reviews.facebook.net/D9585 BRANCH HIVE-4209 ARCANIST PROJECT hive To: JIRA, njain, navis Cc: njain
[jira] [Commented] (HIVE-4440) SMB Operator spills to disk like it's 1999
[ https://issues.apache.org/jira/browse/HIVE-4440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645178#comment-13645178 ] Namit Jain commented on HIVE-4440: -- I really like the title of the jira. Changing the parameter name is backward incompatible. Can you support both the current parameter and the proposed parameter for now? Document it clearly, and say that the current parameter hive.mapjoin.bucket.cache.size will not be supported for this from 0.13 or something like that. SMB Operator spills to disk like it's 1999 -- Key: HIVE-4440 URL: https://issues.apache.org/jira/browse/HIVE-4440 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-4440.1.patch I was recently looking into a performance issue with a query that used SMB join and was running really slow. It turns out that the SMB join by default caches only 100 values per key before spilling to disk. That seems overly conservative to me. Changing the parameter resulted in a ~5x speedup - quite significant. The parameter is hive.mapjoin.bucket.cache.size, which right now is only used by the SMB Operator as far as I can tell. The parameter was introduced originally (3 yrs ago) for the map join operator (looks like pre-SMB) and set to 100 to avoid OOM. That seems to have been in a different context though, where you had to avoid running out of memory with the cached hash table in the same process, I think. Two things I'd like to propose: a) Rename it to what it does: hive.smbjoin.cache.rows b) Set it to something less restrictive: 1 If you string together a 5-table SMB join with a map join and a map-side group by aggregation you might still run out of memory, but the renamed parameter should be easier to find and reduce. For most queries, I would think that 1 is still a reasonable number to cache (on the reduce side we use 25000 for shuffle joins).
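The slowdown described above can be illustrated with a toy per-key row cache that spills to disk once it exceeds a configured size. This is a hypothetical sketch of the behavior governed by hive.mapjoin.bucket.cache.size (the proposed hive.smbjoin.cache.rows), not Hive's actual join code: a small limit forces frequent spills and the disk I/O that goes with them.

```python
import os
import pickle
import tempfile

class RowCache:
    """Toy per-key row cache that spills to disk past a size limit.

    Hypothetical illustration: with a low cache_size (like the default
    of 100), a large key group triggers many spills; a higher limit
    keeps the same rows in memory.
    """
    def __init__(self, cache_size):
        self.cache_size = cache_size
        self.rows = []
        self.spills = 0
        self.spill_files = []

    def add(self, row):
        self.rows.append(row)
        if len(self.rows) > self.cache_size:
            self._spill()

    def _spill(self):
        # Write the in-memory rows out and start over; every spill is
        # extra disk I/O paid during the join.
        with tempfile.NamedTemporaryFile(delete=False) as f:
            pickle.dump(self.rows, f)
            self.spill_files.append(f.name)
        self.rows = []
        self.spills += 1

small = RowCache(cache_size=100)    # default-like limit
big = RowCache(cache_size=25000)    # roomier limit, as on the reduce side
for i in range(1000):               # one large key group of 1000 rows
    small.add(i)
    big.add(i)

for name in small.spill_files:      # clean up the temporary spill files
    os.unlink(name)
```

With 1000 rows in one key group, the small cache spills repeatedly while the larger one never touches disk, which is the gap the proposed rename-and-raise is meant to close.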
[jira] [Updated] (HIVE-4447) hcatalog version numbers need to be updated
[ https://issues.apache.org/jira/browse/HIVE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-4447: - Resolution: Fixed Assignee: Alan Gates Status: Resolved (was: Patch Available) Patch checked into branch 0.11. Thanks Ashutosh for the review. hcatalog version numbers need to be updated Key: HIVE-4447 URL: https://issues.apache.org/jira/browse/HIVE-4447 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Ashutosh Chauhan Assignee: Alan Gates Attachments: hcat-releasable-build.patch
[jira] [Updated] (HIVE-4438) Remove unused join configuration parameter: hive.mapjoin.size.key
[ https://issues.apache.org/jira/browse/HIVE-4438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4438: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Gunther! Remove unused join configuration parameter: hive.mapjoin.size.key - Key: HIVE-4438 URL: https://issues.apache.org/jira/browse/HIVE-4438 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4438.1.patch The config parameter that used to limit the number of cached rows per key is no longer used in the code base. I suggest removing it to make things less confusing.
[jira] [Updated] (HIVE-4439) Remove unused join configuration parameter: hive.mapjoin.cache.numrows
[ https://issues.apache.org/jira/browse/HIVE-4439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4439: --- Resolution: Fixed Fix Version/s: 0.12.0 Status: Resolved (was: Patch Available) Committed to trunk. Thanks, Gunther! Remove unused join configuration parameter: hive.mapjoin.cache.numrows -- Key: HIVE-4439 URL: https://issues.apache.org/jira/browse/HIVE-4439 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.12.0 Attachments: HIVE-4439.1.patch The description says: "How many rows should be cached by jdbm for map join." I can't find any reference to that parameter in the code, however.
[jira] [Commented] (HIVE-4064) Handle db qualified names consistently across all HiveQL statements
[ https://issues.apache.org/jira/browse/HIVE-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13645269#comment-13645269 ] Jeremy Rayner commented on HIVE-4064: - Any progress on this issue? I understand that there are workarounds, but it would be nice if this got resolved sometime in the near future. Handle db qualified names consistently across all HiveQL statements --- Key: HIVE-4064 URL: https://issues.apache.org/jira/browse/HIVE-4064 Project: Hive Issue Type: Bug Components: SQL Affects Versions: 0.10.0 Reporter: Shreepadma Venugopalan Hive doesn't consistently handle db qualified names across all HiveQL statements. While some HiveQL statements such as SELECT support DB qualified names, others such as CREATE INDEX don't.
Is HCatalog stand-alone going to die?
Hi, I have followed the discussion about the merging of HCatalog into Hive. However, it is not clear to me whether new stand-alone versions of HCatalog are going to be released. Is 0.5.0-incubating the last? Will it be possible to build only HCatalog from the Hive tree? Regards, Rodrigo Trujillo
[VOTE] Apache Hive 0.11.0 Release Candidate 0
Hey all, I am excited to announce availability of Apache Hive 0.11.0 Release Candidate 0 at: http://people.apache.org/~hashutosh/hive-0.11.0-rc0/ Maven artifacts are available here: https://repository.apache.org/content/repositories/orgapachehive-154/ This release has many goodies including HiveServer2, windowing and analytical functions, decimal data type, better query planning, performance enhancements and various bug fixes. In total, we resolved more than 350 issues. Full list of fixed issues can be found at: http://s.apache.org/8Fr Voting will conclude in 72 hours. Hive PMC Members: Please test and vote. Thanks, Ashutosh (On behalf of Hive contributors who made 0.11 a possibility)
[jira] [Updated] (HIVE-4447) hcatalog version numbers need to be updated
[ https://issues.apache.org/jira/browse/HIVE-4447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-4447: --- Fix Version/s: 0.11.0 hcatalog version numbers need to be updated Key: HIVE-4447 URL: https://issues.apache.org/jira/browse/HIVE-4447 Project: Hive Issue Type: Bug Components: HCatalog Reporter: Ashutosh Chauhan Assignee: Alan Gates Fix For: 0.11.0 Attachments: hcat-releasable-build.patch