[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160033#comment-14160033 ]

Navis commented on HIVE-8319:

[~ashutoshc] This is trivial, but it enables various extensions to hiveserver2. Could you review this?

Add configuration for custom services in hiveserver2

Key: HIVE-8319
URL: https://issues.apache.org/jira/browse/HIVE-8319
Project: Hive
Issue Type: Improvement
Components: HiveServer2
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-8319.1.patch.txt

NO PRECOMMIT TESTS

Register services to hiveserver2, for example:
{noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
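The patch itself is not shown here, but the property above suggests HiveServer2 would split the comma-separated class list and reflectively instantiate each entry as a child service. The sketch below illustrates only the class-list parsing step; the method name and behavior are assumptions for illustration, not the actual patch code.

```java
import java.util.ArrayList;
import java.util.List;

public class ServiceClassList {
    // Split the comma-separated value of hive.server2.service.classes
    // into individual class names, trimming incidental whitespace.
    static List<String> parse(String value) {
        List<String> names = new ArrayList<>();
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // HiveServer2 would presumably call Class.forName(name) on each
        // entry and add the instance as a service; here we only list them.
        List<String> names = parse(
            "com.nexr.hive.service.HiveStatus, com.nexr.hive.service.AzkabanService");
        for (String n : names) {
            System.out.println(n);
        }
    }
}
```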
[jira] [Commented] (HIVE-8352) Enable windowing.q for spark
[ https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160055#comment-14160055 ] Hive QA commented on HIVE-8352: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673055/HIVE-8352.1-spark.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6739 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_parallel {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/196/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/196/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-196/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12673055 Enable windowing.q for spark Key: HIVE-8352 URL: https://issues.apache.org/jira/browse/HIVE-8352 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Priority: Minor Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, hive-8385.patch We should enable windowing.q for basic windowing coverage. 
After checking out the spark branch, we would build: {noformat} $ mvn clean install -DskipTests -Phadoop-2 $ cd itests/ $ mvn clean install -DskipTests -Phadoop-2 {noformat} Then generate the windowing.q.out file: {noformat} $ cd qtest-spark/ $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 -Dtest.output.overwrite=true {noformat} Compare the output against MapReduce: {noformat} $ diff -y -W 150 ../../ql/src/test/results/clientpositive/spark/windowing.q.out ../../ql/src/test/results/clientpositive/windowing.q.out| less {noformat} And if everything looks good, add it to {{spark.query.files}} in {{./itests/src/test/resources/testconfiguration.properties}} then submit the patch including the .q file -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7205) Wrong results when union all of grouping followed by group by with correlation optimization
[ https://issues.apache.org/jira/browse/HIVE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160067#comment-14160067 ] Hive QA commented on HIVE-7205: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673048/HIVE-7205.4.patch.txt {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6525 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_stats_counter org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1128/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1128/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1128/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12673048 Wrong results when union all of grouping followed by group by with correlation optimization --- Key: HIVE-7205 URL: https://issues.apache.org/jira/browse/HIVE-7205 Project: Hive Issue Type: Bug Affects Versions: 0.12.0, 0.13.0, 0.13.1 Reporter: dima machlin Assignee: Navis Priority: Critical Attachments: HIVE-7205.1.patch.txt, HIVE-7205.2.patch.txt, HIVE-7205.3.patch.txt, HIVE-7205.4.patch.txt use case : table TBL (a string,b string) contains single row : 'a','a' the following query : {code:sql} select b, sum(cc) from ( select b,count(1) as cc from TBL group by b union all select a as b,count(1) as cc from TBL group by a ) z group by b {code} returns a 1 a 1 while set hive.optimize.correlation=true; if we change set hive.optimize.correlation=false; it returns correct results : a 2 The plan with correlation optimization : {code:sql} ABSTRACT SYNTAX TREE: (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL b (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME DB TBL))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL a) b) (TOK_SELEXPR (TOK_FUNCTION count 1) cc)) (TOK_GROUPBY (TOK_TABLE_OR_COL a) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_TABLE_OR_COL b)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_TABLE_OR_COL cc (TOK_GROUPBY (TOK_TABLE_OR_COL b STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 is a root stage STAGE PLANS: Stage: Stage-1 Map Reduce Alias - Map Operator Tree: null-subquery1:z-subquery1:TBL TableScan alias: TBL Select Operator expressions: expr: b type: string outputColumnNames: b Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: b type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output 
Operator key expressions: expr: _col0 type: string sort order: + Map-reduce partition columns: expr: _col0 type: string tag: 0 value expressions: expr: _col1 type: bigint null-subquery2:z-subquery2:TBL TableScan alias: TBL Select Operator expressions: expr: a type: string outputColumnNames: a Group By Operator aggregations: expr: count(1) bucketGroup: false keys: expr: a type: string mode: hash outputColumnNames: _col0, _col1 Reduce Output Operator key
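The plan dump above is truncated, but the intended semantics of the failing query are simple: each union branch produces a per-key count, and the outer group-by must sum the counts across branches. A minimal sketch of that merge (plain Java maps standing in for the grouped branches, not Hive's actual operators):

```java
import java.util.HashMap;
import java.util.Map;

public class UnionAggDemo {
    // Mimics: select b, sum(cc) from (branch1 union all branch2) z group by b.
    static Map<String, Long> sumByKey(Map<String, Long> branch1,
                                      Map<String, Long> branch2) {
        Map<String, Long> result = new HashMap<>(branch1);
        for (Map.Entry<String, Long> e : branch2.entrySet()) {
            result.merge(e.getKey(), e.getValue(), Long::sum);
        }
        return result;
    }

    public static void main(String[] args) {
        // TBL holds the single row ('a','a'), so each branch groups to {a=1}.
        Map<String, Long> groupByB = new HashMap<>();
        groupByB.put("a", 1L);
        Map<String, Long> groupByA = new HashMap<>();
        groupByA.put("a", 1L);
        // The correct outer aggregation merges the branches to a -> 2;
        // with hive.optimize.correlation=true the branches were not merged.
        System.out.println(sumByKey(groupByB, groupByA));
    }
}
```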
[jira] [Commented] (HIVE-8193) Hook HiveServer2 dynamic service discovery with session time out
[ https://issues.apache.org/jira/browse/HIVE-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160068#comment-14160068 ]

Vaibhav Gumashta commented on HIVE-8193:

[~thejas] None of the failures are related. Thanks!

Hook HiveServer2 dynamic service discovery with session time out

Key: HIVE-8193
URL: https://issues.apache.org/jira/browse/HIVE-8193
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.14.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta
Priority: Critical
Fix For: 0.14.0
Attachments: HIVE-8193.1.patch

For dynamic service discovery, if the HiveServer2 instance is removed from ZooKeeper, currently, on the last client close, the server shuts down. However, we need to ensure that this also happens when a session is closed on timeout and no current sessions exist on this instance of HiveServer2.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8172) HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace
[ https://issues.apache.org/jira/browse/HIVE-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160070#comment-14160070 ] Vaibhav Gumashta commented on HIVE-8172: cc [~thejas] HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace Key: HIVE-8172 URL: https://issues.apache.org/jira/browse/HIVE-8172 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.14.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Critical Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-8172.1.patch Currently the client provides a url like: jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2. The zooKeeperNamespace param when not provided should use the default value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
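The fix described above boils down to defaulting one session variable in the JDBC URL. The following is an illustrative parser only, not the actual Hive JDBC driver code; the assumed default namespace "hiveserver2" matches the value shown in the example URL.

```java
import java.util.HashMap;
import java.util.Map;

public class ZkNamespaceDefault {
    static final String DEFAULT_NAMESPACE = "hiveserver2"; // assumed default

    // Parse the ';'-separated session variables that follow the '/' in a
    // HiveServer2 JDBC URL and fall back to the default namespace when
    // zooKeeperNamespace is absent.
    static String namespaceOf(String sessionVars) {
        Map<String, String> params = new HashMap<>();
        for (String kv : sessionVars.split(";")) {
            int eq = kv.indexOf('=');
            if (eq > 0) {
                params.put(kv.substring(0, eq), kv.substring(eq + 1));
            }
        }
        return params.getOrDefault("zooKeeperNamespace", DEFAULT_NAMESPACE);
    }

    public static void main(String[] args) {
        // Without the param, the client should use the default namespace.
        System.out.println(namespaceOf("serviceDiscoveryMode=zooKeeper"));
        // With the param, the explicit value wins.
        System.out.println(namespaceOf(
            "serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=custom"));
    }
}
```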
[jira] [Updated] (HIVE-8324) Shim KerberosName (causes build failure on hadoop-1)
[ https://issues.apache.org/jira/browse/HIVE-8324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-8324: --- Resolution: Fixed Status: Resolved (was: Patch Available) The test failure is not related. Patch committed to trunk and 14. Thanks for reviewing [~szehon]! Shim KerberosName (causes build failure on hadoop-1) Key: HIVE-8324 URL: https://issues.apache.org/jira/browse/HIVE-8324 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Szehon Ho Assignee: Vaibhav Gumashta Priority: Blocker Fix For: 0.14.0 Attachments: HIVE-8324.1.patch, HIVE-8324.2.patch Unfortunately even after HIVE-8265, there are still more compile failures. {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-service: Compilation failure: Compilation failure: [ERROR] /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[35,54] cannot find symbol [ERROR] symbol: class KerberosName [ERROR] location: package org.apache.hadoop.security.authentication.util [ERROR] /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[241,7] cannot find symbol [ERROR] symbol: class KerberosName [ERROR] location: class org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction [ERROR] /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[241,43] cannot find symbol [ERROR] symbol: class KerberosName [ERROR] location: class org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction [ERROR] /Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[252,7] cannot find symbol [ERROR] symbol: class KerberosName [ERROR] location: class org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction [ERROR] 
/Users/szehon/svn-repos/trunk/service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java:[252,43] cannot find symbol [ERROR] symbol: class KerberosName [ERROR] location: class org.apache.hive.service.cli.thrift.ThriftHttpServlet.HttpKerberosServerAction {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 24136: HIVE-4329: HCatalog should use getHiveRecordWriter.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/24136/ --- (Updated Oct. 6, 2014, 8:03 a.m.) Review request for hive. Changes --- Rebase on trunk. Disable specific test methods for storage formats. Bugs: HIVE-4329 https://issues.apache.org/jira/browse/HIVE-4329 Repository: hive-git Description --- HIVE-4329: HCatalog should use getHiveRecordWriter. Diffs (updated) - hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 4fdb5c985108bb3225cf945024ae679745e5f3bc hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultOutputFormatContainer.java 3a07b0ca7c1956d45e611005cbc5ba2464596471 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DefaultRecordWriterContainer.java 209d7bcef5624100c6cdbc2a0a137dcaf1c1fc42 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/DynamicPartitionFileRecordWriterContainer.java 4df912a935221e527c106c754ff233d212df9246 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputFormatContainer.java 1a7595fd6dd0a5ffbe529bc24015c482068233bf hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileRecordWriterContainer.java 2a883d6517bfe732b6a6dffa647d9d44e4145b38 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java bfa8657cd1b16aec664aab3e22b430b304a3698d hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseOutputFormat.java 4f7a74a002cedf3b54d0133041184fbcd9d9c4ab hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatMapRedUtil.java b651cb323771843da43667016a7dd2c9d9a1ddac hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java 694739821a202780818924d54d10edb707cfbcfa hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java 1980ef50af42499e0fed8863b6ff7a45f926d9fc hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InternalUtil.java 9b979395e47e54aac87487cb990824e3c3a2ee19 
hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputFormatContainer.java d83b003f9c16e78a39b3cc7ce810ff19f70848c2 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/RecordWriterContainer.java 5905b46178b510b3a43311739fea2b95f47b4ed7 hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/StaticPartitionFileRecordWriterContainer.java b3ea76e6a79f94e09972bc060c06105f60087b71 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java ee57f3fd126af2e36039f84686a4169ef6267593 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java 0d87c6ce2b9a2169c3b7c9d80ff33417279fb465 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalDynamicPartitioned.java 58764a5d093524a0a3566e6db817fdb4b2364ac8 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalNonPartitioned.java 6e060c08ce03b71a4f2216f5137d73b468e5be46 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalPartitioned.java 9f16b3b9811c2020adfb6a2da7eb76ac1bc8cfb9 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutableDynamicPartitioned.java 5b18739d0e9a92b94a6cc2647bc37d1aa0c0e5ca hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutableNonPartitioned.java 354ae109adbec93363a5f3813413dcc50bd8ffa3 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMutablePartitioned.java a22a993c8f154fcbf2faaaea2ab1ce69c4f13717 hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatNonPartitioned.java 174a92f443cb5deeb4972f4016109ecedae8bd3e hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatPartitioned.java a386415fb406bb0cda18f7913650874d6a236e21 hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/PigHCatUtil.java 36221b77d52474393668284d12877fd6b43c88d6 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoader.java 
5eabba151b6b39b8e251fbbce2ffd4b9f7b503c6 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatLoaderComplexSchema.java 447f39fade0b5d562dd30915377a3ddf8dd422cd hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorer.java a380f619493c12c440679f501a401d0a61788838 hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestHCatStorerMulti.java 0c3ec8bd93f2a50d2d44c2d892180142613dc68d hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/TestUtil.java 8a652f0bb9323497bbcc7fd4a76f616ee8917c1e ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 2ad7330365b8327e6f1b78ad5b9760e252d1339b
[jira] [Updated] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Chen updated HIVE-4329: - Attachment: HIVE-4329.4.patch Attaching a new patch rebased on master, incorporating the test utils from HIVE-7286 to disable specific test methods for given storage formats. HCatalog should use getHiveRecordWriter rather than getRecordWriter --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, HIVE-4329.3.patch, HIVE-4329.4.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the 
AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
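The key-type mismatch described above can be shown in plain Java without Hadoop on the classpath. The stand-in classes below are hypothetical substitutes for LongWritable and NullWritable; the point is that a writer declared against a raw key type but casting internally fails only at write time, exactly as in the stack trace.

```java
public class KeyMismatchDemo {
    // Stand-ins for org.apache.hadoop.io.LongWritable / NullWritable.
    static class LongKey { long v; }
    static class NullKey {}

    // Like AvroContainerOutputFormat's anonymous writer: declared against
    // a raw Object key, but casting to the key type it actually expects.
    static boolean writeAccepts(Object key) {
        try {
            LongKey k = (LongKey) key; // throws when handed a NullKey
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // FileRecordWriterContainer always passes the NullWritable analogue:
        System.out.println("NullKey accepted: " + writeAccepts(new NullKey()));
        System.out.println("LongKey accepted: " + writeAccepts(new LongKey()));
    }
}
```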
[jira] [Resolved] (HIVE-6692) Location for new table or partition should be a write entity
[ https://issues.apache.org/jira/browse/HIVE-6692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis resolved HIVE-6692.
Resolution: Won't Fix

Because locations for a new table or partition are, by policy, treated as read entities (which still feels strange to me), the remaining part of the patch is whether to use a qualified path or a simple string for path-type entities. I'll close this and open a new issue for that.

Location for new table or partition should be a write entity

Key: HIVE-6692
URL: https://issues.apache.org/jira/browse/HIVE-6692
Project: Hive
Issue Type: Task
Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-6692.1.patch.txt

Locations for "create table" and "alter table add partition" should be write entities.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8357) Path type entities should use qualified path rather than string
Navis created HIVE-8357: --- Summary: Path type entities should use qualified path rather than string Key: HIVE-8357 URL: https://issues.apache.org/jira/browse/HIVE-8357 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8357) Path type entities should use qualified path rather than string
[ https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-8357:
Attachment: HIVE-8357.1.patch.txt

Running a preliminary test; expecting many test failures.

Path type entities should use qualified path rather than string

Key: HIVE-8357
URL: https://issues.apache.org/jira/browse/HIVE-8357
Project: Hive
Issue Type: Improvement
Components: Authorization
Reporter: Navis
Assignee: Navis
Priority: Minor
Attachments: HIVE-8357.1.patch.txt

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8357) Path type entities should use qualified path rather than string
[ https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-8357: Status: Patch Available (was: Open) Path type entities should use qualified path rather than string --- Key: HIVE-8357 URL: https://issues.apache.org/jira/browse/HIVE-8357 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8357.1.patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8186) Self join may fail if one side has VCs and other doesn't
[ https://issues.apache.org/jira/browse/HIVE-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160159#comment-14160159 ] Hive QA commented on HIVE-8186: --- {color:green}Overall{color}: +1 all checks pass Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673047/HIVE-8186.3.patch.txt {color:green}SUCCESS:{color} +1 6525 tests passed Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1129/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1129/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1129/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase {noformat} This message is automatically generated. ATTACHMENT ID: 12673047 Self join may fail if one side has VCs and other doesn't Key: HIVE-8186 URL: https://issues.apache.org/jira/browse/HIVE-8186 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-8186.1.patch.txt, HIVE-8186.2.patch.txt, HIVE-8186.3.patch.txt See comments. This also fails on trunk, although not on original join_vc query -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7733) Ambiguous column reference error on query
[ https://issues.apache.org/jira/browse/HIVE-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160271#comment-14160271 ] Hive QA commented on HIVE-7733: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673054/HIVE-7733.5.patch.txt {color:red}ERROR:{color} -1 due to 54 failed/errored test(s), 6526 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_correctness org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cluster org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_or_replace_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cte_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_formatted_view_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_describe_formatted_view_partitioned_json org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_field_garbage org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_join5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_repeated_alias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_col org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subq_where_serialization org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_explain_rewrite org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_exists_having 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_explain_rewrite org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_in_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notexists_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_unqualcolumnrefs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_views org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_temp_table_subquery1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_compare_java_string org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_to_unix_timestamp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_top_level org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_view_inputs org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_windowing_streaming org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_exists org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_subquery_in org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_as_select_with_partition org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_failure6 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col0 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_ambiguous_col2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_create_or_replace_view7 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalid_select_column_with_subquery org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_invalidate_view1 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_recursive_view {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1130/testReport Console output:
[jira] [Commented] (HIVE-7641) INSERT ... SELECT with no source table leads to NPE
[ https://issues.apache.org/jira/browse/HIVE-7641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14160302#comment-14160302 ]

Xuefu Zhang commented on HIVE-7641:

Looking at the patch, it seems to make more sense to return an error in this case, to be consistent with a regular "select x from table" query, which gives an error if the FROM table is missing.

INSERT ... SELECT with no source table leads to NPE

Key: HIVE-7641
URL: https://issues.apache.org/jira/browse/HIVE-7641
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.13.1
Reporter: Lenni Kuff
Assignee: Navis
Attachments: HIVE-7641.1.patch.txt

When no source table is provided for an INSERT statement, Hive fails with an NPE.
{code}
0: jdbc:hive2://localhost:11050/default> create table test_tbl(i int);
No rows affected (0.333 seconds)
0: jdbc:hive2://localhost:11050/default> insert into table test_tbl select 1;
Error: Error while compiling statement: FAILED: NullPointerException null (state=42000,code=4)
-- Get an NPE even when using incorrect syntax (no TABLE keyword)
0: jdbc:hive2://localhost:11050/default> insert into test_tbl select 1;
Error: Error while compiling statement: FAILED: NullPointerException null (state=42000,code=4)
-- Works when a source table is provided
0: jdbc:hive2://localhost:11050/default> insert into table test_tbl select 1 from foo;
No rows affected (5.751 seconds)
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4329) HCatalog should use getHiveRecordWriter rather than getRecordWriter
[ https://issues.apache.org/jira/browse/HIVE-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160352#comment-14160352 ] Hive QA commented on HIVE-4329: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673066/HIVE-4329.4.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6563 tests executed *Failed tests:* {noformat} org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.testPigPopulation {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1131/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1131/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1131/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12673066 HCatalog should use getHiveRecordWriter rather than getRecordWriter --- Key: HIVE-4329 URL: https://issues.apache.org/jira/browse/HIVE-4329 Project: Hive Issue Type: Bug Components: HCatalog, Serializers/Deserializers Affects Versions: 0.14.0 Environment: discovered in Pig, but it looks like the root cause impacts all non-Hive users Reporter: Sean Busbey Assignee: David Chen Attachments: HIVE-4329.0.patch, HIVE-4329.1.patch, HIVE-4329.2.patch, HIVE-4329.3.patch, HIVE-4329.4.patch Attempting to write to a HCatalog defined table backed by the AvroSerde fails with the following stacktrace: {code} java.lang.ClassCastException: org.apache.hadoop.io.NullWritable cannot be cast to org.apache.hadoop.io.LongWritable at org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat$1.write(AvroContainerOutputFormat.java:84) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:253) at org.apache.hcatalog.mapreduce.FileRecordWriterContainer.write(FileRecordWriterContainer.java:53) at org.apache.hcatalog.pig.HCatBaseStorer.putNext(HCatBaseStorer.java:242) at org.apache.hcatalog.pig.HCatStorer.putNext(HCatStorer.java:52) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:139) at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:98) at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:559) at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:85) {code} The proximal cause of this failure is that the AvroContainerOutputFormat's signature mandates a LongWritable key and HCat's FileRecordWriterContainer forces a NullWritable. I'm not sure of a general fix, other than redefining HiveOutputFormat to mandate a WritableComparable. 
It looks like accepting WritableComparable is what's done in the other Hive OutputFormats, and there's no reason AvroContainerOutputFormat couldn't also be changed, since it's ignoring the key. That way fixing things so FileRecordWriterContainer can always use NullWritable could get spun into a different issue? The underlying cause for failure to write to AvroSerde tables is that AvroContainerOutputFormat doesn't meaningfully implement getRecordWriter, so fixing the above will just push the failure into the placeholder RecordWriter. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
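The mismatch described above can be reduced to a few lines. The classes below are plain stand-ins, not the real Hadoop Writable types, purely to illustrate why a writer typed to LongWritable keys rejects the NullWritable key that FileRecordWriterContainer passes:

```java
public class CastDemo {
    // Stand-ins for the Hadoop types involved (illustrative, not the real classes).
    public interface Writable {}
    public static class NullWritable implements Writable {}
    public static class LongWritable implements Writable { public long value; }

    // FileRecordWriterContainer hands every record a NullWritable key, while the
    // writer returned by AvroContainerOutputFormat's getRecordWriter expects
    // LongWritable keys; the mismatch surfaces as a runtime cast failure.
    public static void write(Writable key) {
        LongWritable k = (LongWritable) key; // ClassCastException for NullWritable
        System.out.println("wrote key " + k.value);
    }

    public static void main(String[] args) {
        try {
            write(new NullWritable());
        } catch (ClassCastException e) {
            System.out.println("ClassCastException, as in the stacktrace above");
        }
    }
}
```

This is why accepting a WritableComparable key (or ignoring the key entirely, as AvroContainerOutputFormat already does semantically) avoids the failure.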
[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160428#comment-14160428 ] Ashutosh Chauhan commented on HIVE-8319: cc: [~thejas] , [~vgumashta] Add configuration for custom services in hiveserver2 Key: HIVE-8319 URL: https://issues.apache.org/jira/browse/HIVE-8319 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8319.1.patch.txt NO PRECOMMIT TESTS Register services to hiveserver2, for example:
{noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8137) Empty ORC file handling
[ https://issues.apache.org/jira/browse/HIVE-8137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160456#comment-14160456 ] Pankit Thapar commented on HIVE-8137: - [~gopalv] , could you please comment on the failures? I don't think the above failures are due to my patch. Also, could you please review the patch as well? Empty ORC file handling --- Key: HIVE-8137 URL: https://issues.apache.org/jira/browse/HIVE-8137 Project: Hive Issue Type: Improvement Components: File Formats Affects Versions: 0.13.1 Reporter: Pankit Thapar Fix For: 0.14.0 Attachments: HIVE-8137.patch Hive 13 does not handle reading a zero-size ORC file properly. An ORC file is supposed to have a PostScript, which the ReaderImpl class tries to read to initialize the footer. But in case the file is empty or of zero size, it runs into an IndexOutOfBoundsException because ReaderImpl tries to read in its constructor. Code snippet:
{code}
// get length of PostScript
int psLen = buffer.get(readSize - 1) & 0xff;
{code}
In the above code, readSize for an empty file is zero. I see that the ensureOrcFooter() method performs some sanity checks for the footer, so either we can move the above code snippet to ensureOrcFooter() and throw a "Malformed ORC file" exception, or we can create a dummy Reader that does not initialize the footer and basically has hasNext() set to false so that it returns false on the first call. Basically, I would like to know what might be the correct way to handle an empty ORC file in a mapred job? Should we ignore it and not throw an exception, or should we throw an exception that the ORC file is malformed? Please let me know your thoughts on this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
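A minimal sketch of the guard discussed above, with illustrative names (postScriptLength is not Hive's actual method): reject a zero-length read before touching the buffer, instead of letting buffer.get(readSize - 1) throw IndexOutOfBoundsException:

```java
import java.nio.ByteBuffer;

public class OrcFooterGuard {
    // Reads the last byte of the tail buffer as the PostScript length, but
    // rejects an empty read first, mirroring the "Malformed ORC file" option.
    public static int postScriptLength(ByteBuffer buffer, int readSize) {
        if (readSize < 1) {
            throw new IllegalArgumentException("Malformed ORC file: file is empty");
        }
        // & 0xff widens the signed byte to an unsigned length in 0..255
        return buffer.get(readSize - 1) & 0xff;
    }

    public static void main(String[] args) {
        ByteBuffer tail = ByteBuffer.wrap(new byte[]{0, 0, (byte) 0xEE});
        System.out.println(postScriptLength(tail, 3)); // 238, not -18
        try {
            postScriptLength(ByteBuffer.wrap(new byte[0]), 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The alternative discussed in the issue, a dummy Reader whose hasNext() returns false, would silently skip empty files rather than failing the job.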
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160480#comment-14160480 ] Ken Williams commented on HIVE-6050: I'm also looking for a workaround to this - I'm seeing the error when trying to connect to a 0.13 Hive. JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Carl Steinbach Priority: Blocker Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1) returns the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:158)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
    at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
    ... 37 more
{noformat}
On code analysis, it looks like the 'client_protocol' field is a Thrift enum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read():
1. The method will call 'TProtocolVersion.findValue()' on the Thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server (v4 is unknown to the server).
2. The method will then call struct.validate(), which will throw the above exception because of the null version.
So it doesn't look like the current backward-compatibility scheme will work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
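The findValue() behavior described in step 1 can be sketched with a stand-in enum. The real TProtocolVersion is Thrift-generated; this mimics only the findValue contract of returning null for unrecognized ids (the id 3 for v4 is used for illustration):

```java
public class ProtocolDemo {
    // Stand-in for the Thrift-generated TProtocolVersion enum on a Hive 0.10
    // server, which only knows v1.
    public enum ServerProtocolVersion {
        HIVE_CLI_SERVICE_PROTOCOL_V1(0);

        private final int value;

        ServerProtocolVersion(int value) { this.value = value; }

        // Thrift-generated enums expose findValue(int), which returns null
        // for wire values they do not recognize.
        public static ServerProtocolVersion findValue(int value) {
            for (ServerProtocolVersion v : values()) {
                if (v.value == value) return v;
            }
            return null; // unknown id from a newer client
        }
    }

    public static void main(String[] args) {
        // A 0.13 client sends a newer protocol id; the old server cannot map
        // it, so client_protocol stays null and struct.validate() fails.
        System.out.println(ServerProtocolVersion.findValue(3)); // null
        System.out.println(ServerProtocolVersion.findValue(0)); // HIVE_CLI_SERVICE_PROTOCOL_V1
    }
}
```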
[jira] [Created] (HIVE-8358) Constant folding should happen before predicate pushdown
Ashutosh Chauhan created HIVE-8358: -- Summary: Constant folding should happen before predicate pushdown Key: HIVE-8358 URL: https://issues.apache.org/jira/browse/HIVE-8358 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan So that partition pruning and transitive predicate propagation may take advantage of constant folding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8358) Constant folding should happen before predicate pushdown
[ https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8358: --- Status: Patch Available (was: Open) Constant folding should happen before predicate pushdown Key: HIVE-8358 URL: https://issues.apache.org/jira/browse/HIVE-8358 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8358.patch So that partition pruning and transitive predicate propagation may take advantage of constant folding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8358) Constant folding should happen before predicate pushdown
[ https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8358: --- Attachment: HIVE-8358.patch Running tests to see if there are any failures. Not ready for review yet. Constant folding should happen before predicate pushdown Key: HIVE-8358 URL: https://issues.apache.org/jira/browse/HIVE-8358 Project: Hive Issue Type: Improvement Components: Logical Optimizer Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8358.patch So that partition pruning and transitive predicate propagation may take advantage of constant folding. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6050) JDBC backward compatibility is broken
[ https://issues.apache.org/jira/browse/HIVE-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160519#comment-14160519 ] Brock Noland commented on HIVE-6050: AFAIK there is no present workaround. The server version must be higher than or equal to the client's. JDBC backward compatibility is broken - Key: HIVE-6050 URL: https://issues.apache.org/jira/browse/HIVE-6050 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.13.0 Reporter: Szehon Ho Assignee: Carl Steinbach Priority: Blocker Connecting from the JDBC driver of Hive 0.13 (TProtocolVersion=v4) to HiveServer2 of Hive 0.10 (TProtocolVersion=v1) returns the following exception:
{noformat}
java.sql.SQLException: Could not establish connection to jdbc:hive2://localhost:1/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:336)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:158)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:187)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.getConnection(MyTestJdbcDriver2.java:73)
    at org.apache.hive.jdbc.MyTestJdbcDriver2.<init>(MyTestJdbcDriver2.java:49)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:187)
    at org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:236)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:233)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:39)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:523)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1063)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:914)
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null)
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:160)
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:147)
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:327)
    ... 37 more
{noformat}
On code analysis, it looks like the 'client_protocol' field is a Thrift enum, which doesn't seem to be backward-compatible. Look at the code path in the generated file 'TOpenSessionReq.java', method TOpenSessionReqStandardScheme.read():
1. The method will call 'TProtocolVersion.findValue()' on the Thrift protocol's byte stream, which returns null if the client is sending an enum value unknown to the server (v4 is unknown to the server).
2. The method will then call struct.validate(), which will throw the above exception because of the null version.
So it doesn't look like the current backward-compatibility scheme will work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8352) Enable windowing.q for spark
[ https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160524#comment-14160524 ] Brock Noland commented on HIVE-8352: [~jxiang] does parallel.q pass for you locally, without test.overwrite.output=true? If it does pass, can you open a subtask of HIVE-7292 to investigate the flakiness? +1 pending resolution of parallel.q Enable windowing.q for spark Key: HIVE-8352 URL: https://issues.apache.org/jira/browse/HIVE-8352 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Priority: Minor Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, hive-8385.patch We should enable windowing.q for basic windowing coverage. After checking out the spark branch, we would build: {noformat} $ mvn clean install -DskipTests -Phadoop-2 $ cd itests/ $ mvn clean install -DskipTests -Phadoop-2 {noformat} Then generate the windowing.q.out file: {noformat} $ cd qtest-spark/ $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 -Dtest.output.overwrite=true {noformat} Compare the output against MapReduce: {noformat} $ diff -y -W 150 ../../ql/src/test/results/clientpositive/spark/windowing.q.out ../../ql/src/test/results/clientpositive/windowing.q.out| less {noformat} And if everything looks good, add it to {{spark.query.files}} in {{./itests/src/test/resources/testconfiguration.properties}} then submit the patch including the .q file -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8359) Map containing null values are not correctly written in Parquet files
Frédéric TERRAZZONI created HIVE-8359: - Summary: Map containing null values are not correctly written in Parquet files Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Tried to write a map<string,string> column in a Parquet file. The table should contain:
{code}
{key3:val3,key4:null}
{key3:val3,key4:null}
{key1:null,key2:val2}
{key3:val3,key4:null}
{key3:val3,key4:null}
{code}
... and when you do a query like {code}SELECT * FROM mytable{code} we can see that the table is corrupted:
{code}
{key3:val3}
{key4:val3}
{key3:val2}
{key4:val3}
{key1:val3}
{code}
I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. Don't know how to attach it here though ... :) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frédéric TERRAZZONI updated HIVE-8359: -- Description: Tried to write a map<string,string> column in a Parquet file. The table should contain:
{code}
{key3:val3,key4:null}
{key3:val3,key4:null}
{key1:null,key2:val2}
{key3:val3,key4:null}
{key3:val3,key4:null}
{code}
... and when you do a query like {code}SELECT * FROM mytable{code} we can see that the table is corrupted:
{code}
{key3:val3}
{key4:val3}
{key3:val2}
{key4:val3}
{key1:val3}
{code}
I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. was: Tried to write a map<string,string> column in a Parquet file. The table should contain:
{code}
{key3:val3,key4:null}
{key3:val3,key4:null}
{key1:null,key2:val2}
{key3:val3,key4:null}
{key3:val3,key4:null}
{code}
... and when you do a query like {code}SELECT * FROM mytable{code} we can see that the table is corrupted:
{code}
{key3:val3}
{key4:val3}
{key3:val2}
{key4:val3}
{key1:val3}
{code}
I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. Don't know how to attach it here though ... :) Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Tried to write a map<string,string> column in a Parquet file. The table should contain:
{code}
{key3:val3,key4:null}
{key3:val3,key4:null}
{key1:null,key2:val2}
{key3:val3,key4:null}
{key3:val3,key4:null}
{code}
... and when you do a query like {code}SELECT * FROM mytable{code} we can see that the table is corrupted:
{code}
{key3:val3}
{key4:val3}
{key3:val2}
{key4:val3}
{key1:val3}
{code}
I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8359) Map containing null values are not correctly written in Parquet files
[ https://issues.apache.org/jira/browse/HIVE-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frédéric TERRAZZONI updated HIVE-8359: -- Attachment: map_null_val.avro Avro file containing the sample data. To reproduce the issue, just create a Hive table from this file and issue:
{code}
CREATE TABLE broken_parquet_table STORED AS PARQUET AS SELECT * FROM the_avro_table;
SELECT * FROM broken_parquet_table;
{code}
Map containing null values are not correctly written in Parquet files - Key: HIVE-8359 URL: https://issues.apache.org/jira/browse/HIVE-8359 Project: Hive Issue Type: Bug Affects Versions: 0.13.1 Reporter: Frédéric TERRAZZONI Attachments: map_null_val.avro Tried to write a map<string,string> column in a Parquet file. The table should contain:
{code}
{key3:val3,key4:null}
{key3:val3,key4:null}
{key1:null,key2:val2}
{key3:val3,key4:null}
{key3:val3,key4:null}
{code}
... and when you do a query like {code}SELECT * FROM mytable{code} we can see that the table is corrupted:
{code}
{key3:val3}
{key4:val3}
{key3:val2}
{key4:val3}
{key1:val3}
{code}
I've not been able to read the Parquet file in our software afterwards, and consequently I suspect it to be corrupted. For those who are interested, I generated this Parquet table from an Avro file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8319) Add configuration for custom services in hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160536#comment-14160536 ] Thejas M Nair commented on HIVE-8319: - This patch is making the Service interface public. We should mark it with the @Public annotation in that case, and probably @Unstable (or at least @Evolving) as well. The interface also needs some cleanup, so that unused functions are removed (such as register/unregister). We should also clarify the public/private API status of the classes within the org.apache.hive.service package, as users might also end up using classes like CompositeService. (I think we should mark them @Private unless it is clear that users would benefit from them and they can be kept stable.) Add configuration for custom services in hiveserver2 Key: HIVE-8319 URL: https://issues.apache.org/jira/browse/HIVE-8319 Project: Hive Issue Type: Improvement Components: HiveServer2 Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8319.1.patch.txt NO PRECOMMIT TESTS Register services to hiveserver2, for example:
{noformat}
<property>
  <name>hive.server2.service.classes</name>
  <value>com.nexr.hive.service.HiveStatus,com.nexr.hive.service.AzkabanService</value>
</property>
<property>
  <name>azkaban.ssl.port</name>
  <name>...</name>
</property>
{noformat}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
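For illustration, here is a self-contained sketch of how a comma-separated class list like hive.server2.service.classes could be instantiated reflectively. The Service interface and HiveStatus class below are stand-ins, and the loading loop is an assumed mechanism, not the patch's actual code:

```java
import java.util.ArrayList;
import java.util.List;

public class ServiceLoaderSketch {
    // Stand-in for the Service interface the patch exposes.
    public interface Service {
        void start();
    }

    // Example service, standing in for something like com.nexr.hive.service.HiveStatus.
    public static class HiveStatus implements Service {
        @Override
        public void start() { System.out.println("HiveStatus started"); }
    }

    // Split the configured value on commas and instantiate each class
    // reflectively via its no-arg constructor.
    public static List<Service> loadServices(String classList) throws Exception {
        List<Service> services = new ArrayList<>();
        for (String name : classList.split(",")) {
            Class<?> clazz = Class.forName(name.trim());
            services.add((Service) clazz.getDeclaredConstructor().newInstance());
        }
        return services;
    }

    public static void main(String[] args) throws Exception {
        String configured = ServiceLoaderSketch.class.getName() + "$HiveStatus";
        for (Service s : loadServices(configured)) {
            s.start();
        }
    }
}
```

This kind of reflective loading is exactly why the stability annotations discussed above matter: every class name placed in the config becomes a de facto public extension point.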
[jira] [Commented] (HIVE-8357) Path type entities should use qualified path rather than string
[ https://issues.apache.org/jira/browse/HIVE-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160549#comment-14160549 ] Hive QA commented on HIVE-8357: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673067/HIVE-8357.1.patch.txt {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 6524 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_add_part_multiple org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_map_ppr_multi_distinct org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_udf_local_resource org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithDfsResource org.apache.hadoop.hive.ql.TestCreateUdfEntities.testUdfWithLocalResource {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1132/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1132/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1132/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12673067 Path type entities should use qualified path rather than string --- Key: HIVE-8357 URL: https://issues.apache.org/jira/browse/HIVE-8357 Project: Hive Issue Type: Improvement Components: Authorization Reporter: Navis Assignee: Navis Priority: Minor Attachments: HIVE-8357.1.patch.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan
[ https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-8225: -- Status: Patch Available (was: Open) CBO trunk merge: union11 test fails due to incorrect plan - Key: HIVE-8225 URL: https://issues.apache.org/jira/browse/HIVE-8225 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, HIVE-8225.inprogress.patch, HIVE-8225.patch The result changes to be as if the union didn't have count() inside. The issue can be fixed by using srcunion.value outside the subquery in count (replace count(1) with count(srcunion.value)). Otherwise, it looks like the count(1) node from the union-ed queries is not present in the AST at all, which might cause this result. -Interestingly, adding group by to each query in a union produces a completely weird result (count(1) is 309 for each key, whereas it should be 1, and the logically incorrect value if the internal count is lost would be 500)- Nm, that groups by the table column called key, which is weird but is what Hive does -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26209: CBO trunk merge: union11 test fails due to incorrect plan
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26209/ --- (Updated Oct. 6, 2014, 5:39 p.m.) Review request for hive and Ashutosh Chauhan. Repository: hive-git Description --- create a derived table with new proj and aggr to address it Diffs (updated) - ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/PlanModifierForASTConv.java 3d90ae7 ql/src/test/queries/clientpositive/cbo_correctness.q f7f0722 ql/src/test/results/clientpositive/cbo_correctness.q.out 3335d4d ql/src/test/results/clientpositive/tez/cbo_correctness.q.out 5920612 Diff: https://reviews.apache.org/r/26209/diff/ Testing --- Thanks, pengcheng xiong
[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan
[ https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-8225: -- Status: Open (was: Patch Available) CBO trunk merge: union11 test fails due to incorrect plan - Key: HIVE-8225 URL: https://issues.apache.org/jira/browse/HIVE-8225 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, HIVE-8225.inprogress.patch, HIVE-8225.patch The result changes to be as if the union didn't have count() inside. The issue can be fixed by using srcunion.value outside the subquery in count (replace count(1) with count(srcunion.value)). Otherwise, it looks like the count(1) node from the union-ed queries is not present in the AST at all, which might cause this result. -Interestingly, adding group by to each query in a union produces a completely weird result (count(1) is 309 for each key, whereas it should be 1, and the logically incorrect value if the internal count is lost would be 500)- Nm, that groups by the table column called key, which is weird but is what Hive does -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8225) CBO trunk merge: union11 test fails due to incorrect plan
[ https://issues.apache.org/jira/browse/HIVE-8225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-8225: -- Attachment: HIVE-8225.5.patch CBO trunk merge: union11 test fails due to incorrect plan - Key: HIVE-8225 URL: https://issues.apache.org/jira/browse/HIVE-8225 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8225.1.patch, HIVE-8225.2.patch, HIVE-8225.3.patch, HIVE-8225.4.patch, HIVE-8225.5.patch, HIVE-8225.inprogress.patch, HIVE-8225.inprogress.patch, HIVE-8225.patch The result changes to be as if the union didn't have count() inside. The issue can be fixed by using srcunion.value outside the subquery in count (replace count(1) with count(srcunion.value)). Otherwise, it looks like the count(1) node from the union-ed queries is not present in the AST at all, which might cause this result. -Interestingly, adding group by to each query in a union produces a completely weird result (count(1) is 309 for each key, whereas it should be 1, and the logically incorrect value if the internal count is lost would be 500)- Nm, that groups by the table column called key, which is weird but is what Hive does -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.
[ https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-8340: --- Resolution: Fixed Status: Resolved (was: Patch Available) Committed to trunk and 14. Thanks for the patch [~xiaobingo]! HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start. - Key: HIVE-8340 URL: https://issues.apache.org/jira/browse/HIVE-8340 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, HIVE-8340.4.patch On stopping the HS2 service from the services tab, it only kills the root process and does not kill the child java process. As a result resources are not freed and this throws an error on restarting from command line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.
[ https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160567#comment-14160567 ] Vaibhav Gumashta commented on HIVE-8340: Thanks for reviewing the configs [~leftylev] HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start. - Key: HIVE-8340 URL: https://issues.apache.org/jira/browse/HIVE-8340 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, HIVE-8340.4.patch On stopping the HS2 service from the services tab, it only kills the root process and does not kill the child java process. As a result resources are not freed and this throws an error on restarting from command line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8360) Add cross cluster support for webhcat E2E tests
Aswathy Chellammal Sreekumar created HIVE-8360: -- Summary: Add cross cluster support for webhcat E2E tests Key: HIVE-8360 URL: https://issues.apache.org/jira/browse/HIVE-8360 Project: Hive Issue Type: Test Components: Tests, WebHCat Environment: Secure cluster Reporter: Aswathy Chellammal Sreekumar In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail since the realm name for user principals is not included in the kinit command. This patch concatenates the realm name to the user principal, thereby resulting in a successful kinit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8360) Add cross cluster support for webhcat E2E tests
[ https://issues.apache.org/jira/browse/HIVE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aswathy Chellammal Sreekumar updated HIVE-8360: --- Attachment: AD-MIT.patch Including the patch that implements cross domain support in secure cluster for E2E tests. Please review the same. Add cross cluster support for webhcat E2E tests --- Key: HIVE-8360 URL: https://issues.apache.org/jira/browse/HIVE-8360 Project: Hive Issue Type: Test Components: Tests, WebHCat Environment: Secure cluster Reporter: Aswathy Chellammal Sreekumar Attachments: AD-MIT.patch In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail since the realm name for user principals is not included in the kinit command. This patch concatenates the realm name to the user principal, thereby resulting in a successful kinit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
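The realm-qualification step the description refers to can be sketched as follows. This is a hedged illustration with hypothetical names, not the actual AD-MIT.patch code:

```java
public class KinitPrincipal {
    // Qualify a bare user name with the Kerberos realm so that kinit in a
    // cross-domain run resolves the principal against the correct realm.
    static String qualify(String user, String realm) {
        if (user.contains("@")) {
            return user; // already fully qualified; leave it alone
        }
        return user + "@" + realm;
    }

    public static void main(String[] args) {
        // The E2E harness would then invoke something like:
        //   kinit -k -t <keytab> hcatuser@AD.EXAMPLE.COM
        System.out.println(qualify("hcatuser", "AD.EXAMPLE.COM"));
    }
}
```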
[jira] [Updated] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.
[ https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-8340: - Labels: TODOC14 (was: ) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start. - Key: HIVE-8340 URL: https://issues.apache.org/jira/browse/HIVE-8340 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, HIVE-8340.4.patch On stopping the HS2 service from the services tab, it only kills the root process and does not kill the child java process. As a result resources are not freed and this throws an error on restarting from command line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8340) HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start.
[ https://issues.apache.org/jira/browse/HIVE-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160612#comment-14160612 ] Lefty Leverenz commented on HIVE-8340: -- Doc note: This adds *hive.hadoop.classpath* to HiveConf.java, so it needs to be documented in the wiki. Although the parameter doesn't start with hive.server2..., it belongs in the HiveServer2 section: * [Configuration Properties -- HiveServer2 | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2] HiveServer2 service doesn't stop backend jvm process, which prevents follow-up service start. - Key: HIVE-8340 URL: https://issues.apache.org/jira/browse/HIVE-8340 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Environment: Windows Reporter: Xiaobing Zhou Assignee: Xiaobing Zhou Priority: Critical Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-8340.1.patch, HIVE-8340.2.patch, HIVE-8340.3.patch, HIVE-8340.4.patch On stopping the HS2 service from the services tab, it only kills the root process and does not kill the child java process. As a result resources are not freed and this throws an error on restarting from command line. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite
[ https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8336: - Resolution: Fixed Fix Version/s: 0.14.0 Status: Resolved (was: Patch Available) Committed to trunk and branch .14 Update pom, now that Optiq is renamed to Calcite Key: HIVE-8336 URL: https://issues.apache.org/jira/browse/HIVE-8336 Project: Hive Issue Type: Bug Reporter: Julian Hyde Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-8336.1.patch Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 and OPTIQ-430. There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite
[ https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160619#comment-14160619 ] Gunther Hagleitner commented on HIVE-8336: -- [~leftylev] i've changed the name in hiveconf on commit. Update pom, now that Optiq is renamed to Calcite Key: HIVE-8336 URL: https://issues.apache.org/jira/browse/HIVE-8336 Project: Hive Issue Type: Bug Reporter: Julian Hyde Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-8336.1.patch Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 and OPTIQ-430. There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8336) Update pom, now that Optiq is renamed to Calcite
[ https://issues.apache.org/jira/browse/HIVE-8336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160622#comment-14160622 ] Vikram Dixit K commented on HIVE-8336: -- +1 for 0.14 Update pom, now that Optiq is renamed to Calcite Key: HIVE-8336 URL: https://issues.apache.org/jira/browse/HIVE-8336 Project: Hive Issue Type: Bug Reporter: Julian Hyde Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-8336.1.patch Apache Optiq is in the process of renaming to Apache Calcite. See INFRA-8413 and OPTIQ-430. There is not yet a snapshot of {groupId: 'org.apache.calcite', artifactId: 'calcite-*'} deployed to nexus. When there is, I'll post a patch to pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
[ https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-8258: - Status: Open (was: Patch Available) The unit test is failing due to timing issues. Compactor cleaners can be starved on a busy table or partition. --- Key: HIVE-8258 URL: https://issues.apache.org/jira/browse/HIVE-8258 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, HIVE-8258.patch Currently the cleaning thread in the compactor does not run on a table or partition while any locks are held on this partition. This leaves it open to starvation in the case of a busy table or partition. It only needs to wait until all locks on the table/partition at the time of the compaction have expired. Any jobs initiated after that (and thus any locks obtained) will be for the new versions of the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
[ https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-8258: - Attachment: HIVE-8258.4.patch A new version of the patch that actually makes sure the cleaner goes through the loop rather than relying on timing and hoping it works out. Compactor cleaners can be starved on a busy table or partition. --- Key: HIVE-8258 URL: https://issues.apache.org/jira/browse/HIVE-8258 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, HIVE-8258.patch Currently the cleaning thread in the compactor does not run on a table or partition while any locks are held on this partition. This leaves it open to starvation in the case of a busy table or partition. It only needs to wait until all locks on the table/partition at the time of the compaction have expired. Any jobs initiated after that (and thus any locks obtained) will be for the new versions of the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
[ https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-8258: - Status: Patch Available (was: Open) Compactor cleaners can be starved on a busy table or partition. --- Key: HIVE-8258 URL: https://issues.apache.org/jira/browse/HIVE-8258 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, HIVE-8258.patch Currently the cleaning thread in the compactor does not run on a table or partition while any locks are held on this partition. This leaves it open to starvation in the case of a busy table or partition. It only needs to wait until all locks on the table/partition at the time of the compaction have expired. Any jobs initiated after that (and thus any locks obtained) will be for the new versions of the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
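The waiting rule in the description (clean once all locks that existed at compaction time have expired, instead of waiting for the table/partition to be lock-free) can be sketched roughly as below. The class and method names are hypothetical, not the actual patch:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Rough sketch of a non-starving cleaner gate: remember which locks existed
// when the compaction ran, and proceed once only newer locks remain.
public class CleanerGate {
    private final Set<Long> locksAtCompaction = new HashSet<>();

    void recordLocksAtCompaction(Set<Long> currentLockIds) {
        locksAtCompaction.addAll(currentLockIds);
    }

    // Safe to clean when none of the original locks is still held; locks
    // acquired after the compaction reference the new file versions.
    boolean safeToClean(Set<Long> currentLockIds) {
        for (long id : currentLockIds) {
            if (locksAtCompaction.contains(id)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        CleanerGate gate = new CleanerGate();
        gate.recordLocksAtCompaction(new HashSet<>(Arrays.asList(1L, 2L)));
        System.out.println(gate.safeToClean(new HashSet<>(Arrays.asList(2L, 3L)))); // old lock 2 still held
        System.out.println(gate.safeToClean(new HashSet<>(Arrays.asList(3L, 4L)))); // only newer locks remain
    }
}
```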
[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez
[ https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8344: - Status: Open (was: Patch Available) Hive on Tez sets mapreduce.framework.name to yarn-tez - Key: HIVE-8344 URL: https://issues.apache.org/jira/browse/HIVE-8344 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, we don't switch back when the user specifies MR as exec engine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez
[ https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8344: - Status: Patch Available (was: Open) Hive on Tez sets mapreduce.framework.name to yarn-tez - Key: HIVE-8344 URL: https://issues.apache.org/jira/browse/HIVE-8344 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, we don't switch back when the user specifies MR as exec engine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8344) Hive on Tez sets mapreduce.framework.name to yarn-tez
[ https://issues.apache.org/jira/browse/HIVE-8344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8344: - Attachment: HIVE-8344.2.patch Hive on Tez sets mapreduce.framework.name to yarn-tez - Key: HIVE-8344 URL: https://issues.apache.org/jira/browse/HIVE-8344 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Attachments: HIVE-8344.1.patch, HIVE-8344.2.patch This was done to run MR jobs when in Tez mode (emulate MR on Tez). However, we don't switch back when the user specifies MR as exec engine. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7375) Add option in test infra to compile in other profiles (like hadoop-1)
[ https://issues.apache.org/jira/browse/HIVE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160650#comment-14160650 ] Szehon Ho commented on HIVE-7375: - [~brocknoland] I had filed this some time back to try to catch hadoop-1 compile errors in precommit. (At the time trying to avoid having to fund an additional precommit machine cluster for hadoop-1). Are you thinking we can get funding for one more cluster for hadoop-1 in the near future, as HIVE-8351 suggests? If so, I can resolve this JIRA in favor of that one. Add option in test infra to compile in other profiles (like hadoop-1) - Key: HIVE-7375 URL: https://issues.apache.org/jira/browse/HIVE-7375 Project: Hive Issue Type: Test Reporter: Szehon Ho Assignee: Szehon Ho As we are seeing some commits breaking hadoop-1 compilation due to lack of pre-commit coverage, it might be nice to add an option in the test infra to compile on optional profiles as a pre-step before testing on the main profile. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7375) Add option in test infra to compile in other profiles (like hadoop-1)
[ https://issues.apache.org/jira/browse/HIVE-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160667#comment-14160667 ] Brock Noland commented on HIVE-7375: Yes, I think we can resolve this one in favor of HIVE-8351. Add option in test infra to compile in other profiles (like hadoop-1) - Key: HIVE-7375 URL: https://issues.apache.org/jira/browse/HIVE-7375 Project: Hive Issue Type: Test Reporter: Szehon Ho Assignee: Szehon Ho As we are seeing some commits breaking hadoop-1 compilation due to lack of pre-commit coverage, it might be nice to add an option in the test infra to compile on optional profiles as a pre-step before testing on the main profile. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8361) NPE in PTFOperator when there are empty partitions
Harish Butani created HIVE-8361: --- Summary: NPE in PTFOperator when there are empty partitions Key: HIVE-8361 URL: https://issues.apache.org/jira/browse/HIVE-8361 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Here is a simple query to reproduce this: {code} select sum(p_size) over (partition by p_mfgr ) from part where p_mfgr = 'some non existent mfgr'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8361) NPE in PTFOperator when there are empty partitions
[ https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-8361: Status: Patch Available (was: Open) NPE in PTFOperator when there are empty partitions -- Key: HIVE-8361 URL: https://issues.apache.org/jira/browse/HIVE-8361 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-8361.1.patch Here is a simple query to reproduce this: {code} select sum(p_size) over (partition by p_mfgr ) from part where p_mfgr = 'some non existent mfgr'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8361) NPE in PTFOperator when there are empty partitions
[ https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harish Butani updated HIVE-8361: Attachment: HIVE-8361.1.patch NPE in PTFOperator when there are empty partitions -- Key: HIVE-8361 URL: https://issues.apache.org/jira/browse/HIVE-8361 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-8361.1.patch Here is a simple query to reproduce this: {code} select sum(p_size) over (partition by p_mfgr ) from part where p_mfgr = 'some non existent mfgr'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
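The SQL semantics the fix needs to preserve can be illustrated with a minimal sketch (hypothetical names, not PTFOperator code): an aggregate over an empty window partition should yield NULL rather than a NullPointerException.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class WindowedSum {
    // Null-safe windowed SUM: an empty partition (the "no rows match the
    // WHERE clause" case from the repro query) yields SQL NULL instead of
    // dereferencing a missing partition.
    static Long sumOver(List<Integer> partitionRows) {
        if (partitionRows == null || partitionRows.isEmpty()) {
            return null; // SQL semantics: aggregating over no rows gives NULL
        }
        long total = 0;
        for (int v : partitionRows) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOver(Collections.<Integer>emptyList()));
        System.out.println(sumOver(Arrays.asList(1, 2, 3)));
    }
}
```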
[jira] [Updated] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp
[ https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mostafa Mokhtar updated HIVE-8292: -- Attachment: HIVE-8292.1.patch This patch addresses the regression but doesn't handle multiple inputs for SMB join. Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp --- Key: HIVE-8292 URL: https://issues.apache.org/jira/browse/HIVE-8292 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: cn105 Reporter: Mostafa Mokhtar Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch Reading from bucketed partitioned tables has significantly higher overhead compared to non-bucketed non-partitioned files. 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp 5% the CPU in {code} Path onepath = normalizePath(onefile); {code} And 45% the CPU in {code} onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri()); {code} From the profiler {code} Stack Trace Sample CountPercentage(%) hive.ql.exec.tez.MapRecordSource.processRow(Object) 5,327 62.348 hive.ql.exec.vector.VectorMapOperator.process(Writable)5,326 62.336 hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851 56.777 hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849 56.753 java.net.URI.relativize(URI) 3,903 45.681 java.net.URI.relativize(URI, URI) 3,903 45.681 java.net.URI.normalize(String) 2,169 25.386 java.net.URI.equal(String, String) 526 6.156 java.net.URI.equalIgnoringCase(String, String) 1 0.012 java.lang.String.substring(int) 1 0.012 hive.ql.exec.MapOperator.normalizePath(String)506 5.922 org.apache.commons.logging.impl.Log4JLogger.info(Object) 32 0.375 java.net.URI.equals(Object) 12 0.14 java.util.HashMap$KeySet.iterator() 5 0.059 java.util.HashMap.get(Object)4 0.047 java.util.LinkedHashMap.get(Object) 3 0.035 hive.ql.exec.Operator.cleanUpInputFileChanged() 1 0.012 hive.ql.exec.Operator.forward(Object, 
ObjectInspector) 473 5.536 hive.ql.exec.mr.ExecMapperContext.inputFileChanged()1 0.012 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp
[ https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160683#comment-14160683 ] Mostafa Mokhtar commented on HIVE-8292: --- [~vikram.dixit] Patch which addresses the regression attached. Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp --- Key: HIVE-8292 URL: https://issues.apache.org/jira/browse/HIVE-8292 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: cn105 Reporter: Mostafa Mokhtar Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch Reading from bucketed partitioned tables has significantly higher overhead compared to non-bucketed non-partitioned files. 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp 5% the CPU in {code} Path onepath = normalizePath(onefile); {code} And 45% the CPU in {code} onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri()); {code} From the profiler {code} Stack Trace Sample CountPercentage(%) hive.ql.exec.tez.MapRecordSource.processRow(Object) 5,327 62.348 hive.ql.exec.vector.VectorMapOperator.process(Writable)5,326 62.336 hive.ql.exec.Operator.cleanUpInputFileChanged() 4,851 56.777 hive.ql.exec.MapOperator.cleanUpInputFileChangedOp() 4,849 56.753 java.net.URI.relativize(URI) 3,903 45.681 java.net.URI.relativize(URI, URI) 3,903 45.681 java.net.URI.normalize(String) 2,169 25.386 java.net.URI.equal(String, String) 526 6.156 java.net.URI.equalIgnoringCase(String, String) 1 0.012 java.lang.String.substring(int) 1 0.012 hive.ql.exec.MapOperator.normalizePath(String)506 5.922 org.apache.commons.logging.impl.Log4JLogger.info(Object) 32 0.375 java.net.URI.equals(Object) 12 0.14 java.util.HashMap$KeySet.iterator() 5 0.059 java.util.HashMap.get(Object)4 0.047 java.util.LinkedHashMap.get(Object) 3 0.035 hive.ql.exec.Operator.cleanUpInputFileChanged() 1 0.012 hive.ql.exec.Operator.forward(Object, 
ObjectInspector) 473 5.536 hive.ql.exec.mr.ExecMapperContext.inputFileChanged()1 0.012 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
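The two hot calls quoted in the description can be reproduced in isolation. The snippet below is a hedged sketch showing what the relativize-based membership check computes, plus a cheaper prefix comparison that assumes both paths were normalized once up front (an assumption for illustration, not the committed fix):

```java
import java.net.URI;

public class PathMembership {
    public static void main(String[] args) {
        URI dir  = URI.create("hdfs://nn/warehouse/t1/part=a/");
        URI file = URI.create("hdfs://nn/warehouse/t1/part=a/bucket_00000");

        // The hot check from MapOperator: URI.relativize returns its argument
        // unchanged when 'file' is NOT under 'dir', so equals(file) means the
        // current input file does not belong to this alias's directory.
        // relativize() re-normalizes internally, which is where the CPU goes.
        boolean notUnderDir = dir.relativize(file).equals(file);
        System.out.println(notUnderDir);

        // Cheaper sketch: if both URIs are already normalized, a plain string
        // prefix test gives the same answer without per-row normalization.
        boolean notUnderDirFast = !file.toString().startsWith(dir.toString());
        System.out.println(notUnderDirFast);
    }
}
```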
[jira] [Commented] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160690#comment-14160690 ] Szehon Ho commented on HIVE-6500: - Hi [~leftylev] I had a question about docs. I came across an outdated wiki page still mentioning db as the only option, should that page be maintained as FS is now supported? [https://cwiki.apache.org/confluence/display/Hive/StatsDev|https://cwiki.apache.org/confluence/display/Hive/StatsDev] It is actually not linked from the top, but it does seem useful. Not sure the policy for these pages? Stats collection via filesystem --- Key: HIVE-6500 URL: https://issues.apache.org/jira/browse/HIVE-6500 Project: Hive Issue Type: New Feature Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Labels: TODOC14 Fix For: 0.13.0 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch Recently, support for stats gathering via counter was [added | https://issues.apache.org/jira/browse/HIVE-4632] Although, its useful it has following issues: * [Length of counter group name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] * [Length of counter name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] * [Number of distinct counter groups are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] * [Number of distinct counters are limited | 
https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] Although, these limits are configurable, but setting them to higher value implies increased memory load on AM and job history server. Now, whether these limits makes sense or not is [debatable | https://issues.apache.org/jira/browse/MAPREDUCE-5680] it is desirable that Hive doesn't make use of counters features of framework so that it we can evolve this feature without relying on support from framework. Filesystem based counter collection is a step in that direction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8362) Investigate flaky test parallel.q [Spark Branch]
Jimmy Xiang created HIVE-8362: - Summary: Investigate flaky test parallel.q [Spark Branch] Key: HIVE-8362 URL: https://issues.apache.org/jira/browse/HIVE-8362 Project: Hive Issue Type: Sub-task Reporter: Jimmy Xiang Assignee: Jimmy Xiang Test parallel.q is flaky. It fails sometimes with error like: {noformat} Failed tests: TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected exception junit.framework.AssertionFailedError: Client Execution results failed with error code = 1 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8168) With dynamic partition enabled fact table selectivity is not taken into account when generating the physical plan (Use CBO cardinality using physical plan generation)
[ https://issues.apache.org/jira/browse/HIVE-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-8168: - Attachment: HIVE-8168.4.patch Addressed [~mmokhtar]'s review comments. With dynamic partition enabled fact table selectivity is not taken into account when generating the physical plan (Use CBO cardinality using physical plan generation) -- Key: HIVE-8168 URL: https://issues.apache.org/jira/browse/HIVE-8168 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Prasanth J Priority: Critical Labels: performance Fix For: vectorization-branch, 0.14.0 Attachments: HIVE-8168.1.patch, HIVE-8168.2.patch, HIVE-8168.3.patch, HIVE-8168.4.patch When estimating row counts and data sizes during physical plan generation, StatsRulesProcFactory doesn't know that dynamic partition pruning will take place, and it is hard to know how many partitions will qualify at runtime. As a result, with dynamic partition pruning enabled, query 32 can run with 570 tasks, compared to 70 tasks with dynamic partition pruning disabled and actual partition filters on the fact table. The long-term solution for this issue is to use the cardinality estimates from CBO, as they take join selectivity and the like into account. Estimates from CBO won't address the number of tasks used for the partitioned table, but they will address the incorrect number of tasks used for the consequent reducers, where the majority of the slowdown is coming from. 
Plan dynamic partition pruning on {code} Map 5 Map Operator Tree: TableScan alias: ss filterExpr: ss_store_sk is not null (type: boolean) Statistics: Num rows: 550076554 Data size: 47370018896 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: ss_store_sk is not null (type: boolean) Statistics: Num rows: 275038277 Data size: 23685009448 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {ss_store_sk} {ss_net_profit} 1 keys: 0 ss_sold_date_sk (type: int) 1 d_date_sk (type: int) outputColumnNames: _col6, _col21 input vertices: 1 Map 1 Statistics: Num rows: 302542112 Data size: 26053511168 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {_col21} 1 {s_county} {s_state} keys: 0 _col6 (type: int) 1 s_store_sk (type: int) outputColumnNames: _col21, _col80, _col81 input vertices: 1 Map 2 Statistics: Num rows: 332796320 Data size: 28658862080 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Left Semi Join 0 to 1 condition expressions: 0 {_col21} {_col80} {_col81} 1 keys: 0 _col81 (type: string) 1 _col0 (type: string) outputColumnNames: _col21, _col80, _col81 input vertices: 1 Reducer 11 Statistics: Num rows: 366075968 Data size: 31524749312 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col81 (type: string), _col80 (type: string), _col21 (type: float) outputColumnNames: _col81, _col80, _col21 Statistics: Num rows: 366075968 Data size: 31524749312 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: sum(_col21) keys: _col81 (type: string), _col80 (type: string), '0'
[jira] [Commented] (HIVE-8352) Enable windowing.q for spark
[ https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160716#comment-14160716 ] Jimmy Xiang commented on HIVE-8352: --- Parallel.q is ok for me locally sometimes. Filed HIVE-8362 to look into the failure. Enable windowing.q for spark Key: HIVE-8352 URL: https://issues.apache.org/jira/browse/HIVE-8352 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Priority: Minor Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, hive-8385.patch We should enable windowing.q for basic windowing coverage. After checking out the spark branch, we would build: {noformat} $ mvn clean install -DskipTests -Phadoop-2 $ cd itests/ $ mvn clean install -DskipTests -Phadoop-2 {noformat} Then generate the windowing.q.out file: {noformat} $ cd qtest-spark/ $ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 -Dtest.output.overwrite=true {noformat} Compare the output against MapReduce: {noformat} $ diff -y -W 150 ../../ql/src/test/results/clientpositive/spark/windowing.q.out ../../ql/src/test/results/clientpositive/windowing.q.out| less {noformat} And if everything looks good, add it to {{spark.query.files}} in {{./itests/src/test/resources/testconfiguration.properties}} then submit the patch including the .q file -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8363) AccumuloStorageHandler compile failure hadoop-1
Szehon Ho created HIVE-8363: --- Summary: AccumuloStorageHandler compile failure hadoop-1 Key: HIVE-8363 URL: https://issues.apache.org/jira/browse/HIVE-8363 Project: Hive Issue Type: Bug Components: StorageHandler Affects Versions: 0.14.0 Reporter: Szehon Ho Priority: Blocker There's a compile error in AccumuloStorageHandler on hadoop-1. It seems the signature of split() is not the same. Looks like we should use another utility to fix this. {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-accumulo-handler: Compilation failure [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/columns/ColumnMapper.java:[57,52] no suitable method found for split(java.lang.String,char) [ERROR] method org.apache.hadoop.util.StringUtils.split(java.lang.String,char,char) is not applicable {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
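One portable workaround consistent with the error above (hadoop-1's org.apache.hadoop.util.StringUtils only offers the split(String, char escapeChar, char separator) overload) would be to avoid the version-specific utility entirely. The sketch below uses a plain JDK split; it is illustrative, not the actual fix:

```java
import java.util.regex.Pattern;

public class PortableSplit {
    // Hypothetical replacement for the version-specific call: splitting on a
    // literal separator character with the JDK compiles on both hadoop-1 and
    // hadoop-2 profiles. Pattern.quote guards against regex metacharacters;
    // limit -1 keeps trailing empty fields, matching typical split utilities.
    static String[] splitOn(String s, char separator) {
        return s.split(Pattern.quote(String.valueOf(separator)), -1);
    }

    public static void main(String[] args) {
        for (String part : splitOn("cf:cq,cf2:cq2", ',')) {
            System.out.println(part);
        }
    }
}
```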
[jira] [Created] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez
Gunther Hagleitner created HIVE-8364: Summary: We're not waiting for all inputs in MapRecordProcessor on Tez Key: HIVE-8364 URL: https://issues.apache.org/jira/browse/HIVE-8364 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Vikram Dixit K Fix For: 0.14.0 Seems like this could be a race condition: We're blocking for some inputs to become available, but the main MR input is just assumed ready... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez
[ https://issues.apache.org/jira/browse/HIVE-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8364: - Attachment: HIVE-8364.1.patch Proposed patch. We're not waiting for all inputs in MapRecordProcessor on Tez - Key: HIVE-8364 URL: https://issues.apache.org/jira/browse/HIVE-8364 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: HIVE-8364.1.patch Seems like this could be a race condition: We're blocking for some inputs to become available, but the main MR input is just assumed ready... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8292) Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp
[ https://issues.apache.org/jira/browse/HIVE-8292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160729#comment-14160729 ] Gopal V commented on HIVE-8292: --- [~mmokhtar]: Probably better to just read the exec context off mapOp.getExecContext(). Reading from partitioned bucketed tables has high overhead in MapOperator.cleanUpInputFileChangedOp --- Key: HIVE-8292 URL: https://issues.apache.org/jira/browse/HIVE-8292 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Environment: cn105 Reporter: Mostafa Mokhtar Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: 2014_09_29_14_46_04.jfr, HIVE-8292.1.patch Reading from bucketed, partitioned tables has significantly higher overhead compared to non-bucketed, non-partitioned files. 50% of the profile is spent in MapOperator.cleanUpInputFileChangedOp: 5% of the CPU in {code} Path onepath = normalizePath(onefile); {code} and 45% of the CPU in {code} onepath.toUri().relativize(fpath.toUri()).equals(fpath.toUri()); {code} From the profiler: {code}
Stack Trace                                                  Sample Count  Percentage (%)
hive.ql.exec.tez.MapRecordSource.processRow(Object)          5,327         62.348
hive.ql.exec.vector.VectorMapOperator.process(Writable)      5,326         62.336
hive.ql.exec.Operator.cleanUpInputFileChanged()              4,851         56.777
hive.ql.exec.MapOperator.cleanUpInputFileChangedOp()         4,849         56.753
java.net.URI.relativize(URI)                                 3,903         45.681
java.net.URI.relativize(URI, URI)                            3,903         45.681
java.net.URI.normalize(String)                               2,169         25.386
java.net.URI.equal(String, String)                           526           6.156
java.net.URI.equalIgnoringCase(String, String)               1             0.012
java.lang.String.substring(int)                              1             0.012
hive.ql.exec.MapOperator.normalizePath(String)               506           5.922
org.apache.commons.logging.impl.Log4JLogger.info(Object)     32            0.375
java.net.URI.equals(Object)                                  12            0.14
java.util.HashMap$KeySet.iterator()                          5             0.059
java.util.HashMap.get(Object)                                4             0.047
java.util.LinkedHashMap.get(Object)                          3             0.035
hive.ql.exec.Operator.cleanUpInputFileChanged()              1             0.012
hive.ql.exec.Operator.forward(Object, ObjectInspector)       473           5.536
hive.ql.exec.mr.ExecMapperContext.inputFileChanged()         1             0.012
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
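The hot spot in the profile above is URI.relativize being re-evaluated on every input-file change. As a hedged illustration only (class and method names here are hypothetical, not Hive's actual fix), the check can be memoized per (dir, file) pair so it runs once per input file rather than once per row. Note that relativize() returns its argument unchanged when the file does not live under the directory, which is what the equality test quoted in the JIRA detects:

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache the result of the expensive
// URI.relativize()-based prefix check (seen at ~45% CPU in the
// profile) so it runs once per file instead of once per row.
class PathMatchCache {
    private final Map<String, Boolean> cache = new HashMap<>();

    // Same shape as the check quoted in the JIRA: relativize() returns
    // fpath unchanged when fpath is not under onepath, so equality
    // means "fpath does not live under onepath".
    boolean notUnder(URI onepath, URI fpath) {
        return cache.computeIfAbsent(onepath + "|" + fpath,
                k -> onepath.relativize(fpath).equals(fpath));
    }

    // String convenience overload for illustration.
    boolean notUnder(String dir, String file) {
        return notUnder(URI.create(dir), URI.create(file));
    }
}
```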
[jira] [Updated] (HIVE-8364) We're not waiting for all inputs in MapRecordProcessor on Tez
[ https://issues.apache.org/jira/browse/HIVE-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8364: - Status: Patch Available (was: Open) We're not waiting for all inputs in MapRecordProcessor on Tez - Key: HIVE-8364 URL: https://issues.apache.org/jira/browse/HIVE-8364 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Vikram Dixit K Fix For: 0.14.0 Attachments: HIVE-8364.1.patch Seems like this could be a race condition: We're blocking for some inputs to become available, but the main MR input is just assumed ready... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
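The race described above, blocking on some inputs while assuming the main MR input is ready, can be sketched generically with a latch that every input, the main one included, must count down before processing starts. This is illustrative only, not Tez's actual input API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative sketch of "wait for ALL inputs": each input signals
// readiness, and record processing starts only after every one
// (including the main MR input) has done so. Not Tez's actual API.
class AllInputsReady {
    private final CountDownLatch latch;

    AllInputsReady(int inputCount) {
        this.latch = new CountDownLatch(inputCount);
    }

    // Called once per input as it becomes available, main input included.
    void inputReady() {
        latch.countDown();
    }

    // Returns true only when every input has signalled readiness.
    boolean awaitQuietly(long timeoutMs) {
        try {
            return latch.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```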
[jira] [Updated] (HIVE-8227) NPE w/ hive on tez when doing unions on empty tables
[ https://issues.apache.org/jira/browse/HIVE-8227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-8227: - Fix Version/s: 0.14.0 NPE w/ hive on tez when doing unions on empty tables Key: HIVE-8227 URL: https://issues.apache.org/jira/browse/HIVE-8227 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Fix For: 0.14.0 Attachments: HIVE-8227.1.patch, HIVE-8227.2.patch We're looking at aliasToWork.values() to determine input paths etc. This can contain nulls when we're scanning empty tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
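The guard implied by the description, skipping null entries when walking aliasToWork.values(), can be sketched as follows (map and method names are hypothetical stand-ins, not Hive's actual code):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of the null guard: scans of empty tables can leave null
// values in the alias-to-operator map, so skip them when collecting
// input aliases/paths. Names are illustrative only.
class UnionInputs {
    static List<String> inputAliases(Map<String, Object> aliasToWork) {
        List<String> aliases = new ArrayList<>();
        for (Map.Entry<String, Object> e : aliasToWork.entrySet()) {
            if (e.getValue() == null) {
                continue; // empty table: no operator tree, nothing to scan
            }
            aliases.add(e.getKey());
        }
        return aliases;
    }

    // Small self-contained demo: one real entry, one empty-table null.
    static List<String> demo() {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("t1", new Object());
        m.put("empty_table", null);
        return inputAliases(m);
    }
}
```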
[jira] [Commented] (HIVE-8272) Query with particular decimal expression causes NPE during execution initialization
[ https://issues.apache.org/jira/browse/HIVE-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160762#comment-14160762 ] Ashutosh Chauhan commented on HIVE-8272: +1 Query with particular decimal expression causes NPE during execution initialization --- Key: HIVE-8272 URL: https://issues.apache.org/jira/browse/HIVE-8272 Project: Hive Issue Type: Bug Components: Logical Optimizer, Physical Optimizer Reporter: Matt McCline Assignee: Jason Dere Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8272.1.patch Query: {code}
select cast(sum(dc)*100 as decimal(11,3)) as c1 from somedecimaltable order by c1 limit 100;
{code} Fails during execution initialization due to *null* ExprNodeDesc. Noticed while trying to simplify a Vectorization issue and realized it was a more general issue. {code}
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:154)
    ... 22 more
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:215)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:427)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:65)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:464)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:420)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:193)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
    at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:380)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:133)
    ... 22 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.getExprString(ExprNodeGenericFuncDesc.java:154)
    at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.getExprString(ExprNodeGenericFuncDesc.java:154)
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initializeOp(ReduceSinkOperator.java:148)
    ... 38 more
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
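The NPE above comes from building an expression string over a child list that contains a null ExprNodeDesc. A hedged sketch of the defensive version, with plain strings standing in for Hive's ExprNodeDesc (types and names are hypothetical):

```java
import java.util.Arrays;
import java.util.List;

// Illustrative stand-in for ExprNodeGenericFuncDesc.getExprString():
// builds "func(child, child, ...)" but tolerates null children instead
// of throwing NullPointerException. Not Hive's actual code.
class ExprString {
    static String exprString(String func, List<String> childExprs) {
        StringBuilder sb = new StringBuilder(func).append("(");
        for (int i = 0; i < childExprs.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            String child = childExprs.get(i);
            sb.append(child == null ? "null" : child); // guard: null child desc
        }
        return sb.append(")").toString();
    }

    // Demo mirroring the failing query's shape: one real child, one null.
    static String demo() {
        return exprString("sum", Arrays.asList("dc", null));
    }
}
```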
[jira] [Updated] (HIVE-8258) Compactor cleaners can be starved on a busy table or partition.
[ https://issues.apache.org/jira/browse/HIVE-8258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-8258: - Status: Open (was: Patch Available) Found an issue where this patch prevents the initiator from starting properly. Compactor cleaners can be starved on a busy table or partition. --- Key: HIVE-8258 URL: https://issues.apache.org/jira/browse/HIVE-8258 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.13.1 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8258.2.patch, HIVE-8258.3.patch, HIVE-8258.4.patch, HIVE-8258.patch Currently the cleaning thread in the compactor does not run on a table or partition while any locks are held on this partition. This leaves it open to starvation in the case of a busy table or partition. It only needs to wait until all locks on the table/partition at the time of the compaction have expired. Any jobs initiated after that (and thus any locks obtained) will be for the new versions of the files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
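The behavior proposed above, waiting only for the locks that existed when compaction ran rather than for a lock-free instant, can be sketched with a snapshot of lock IDs (illustrative only, not the metastore's actual lock API):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: snapshot the lock IDs held at compaction time; the cleaner
// may run once all of THOSE locks have been released, regardless of
// newer locks taken since (newer jobs only see the compacted files).
// Names are illustrative, not Hive's actual classes.
class CleanerGate {
    private final Set<Long> locksAtCompaction;

    CleanerGate(Set<Long> snapshot) {
        this.locksAtCompaction = new HashSet<>(snapshot);
    }

    void lockReleased(long lockId) {
        locksAtCompaction.remove(lockId); // no-op for post-compaction locks
    }

    boolean mayClean() {
        return locksAtCompaction.isEmpty(); // starvation-free: finite set
    }

    // Demo: one lock (id 1) held at compaction time.
    static CleanerGate demo() {
        Set<Long> snapshot = new HashSet<>();
        snapshot.add(1L);
        return new CleanerGate(snapshot);
    }
}
```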
[jira] [Created] (HIVE-8365) TPCDS query #7 fails with IndexOutOfBoundsException [Spark Branch]
Xuefu Zhang created HIVE-8365: - Summary: TPCDS query #7 fails with IndexOutOfBoundsException [Spark Branch] Key: HIVE-8365 URL: https://issues.apache.org/jira/browse/HIVE-8365 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Xuefu Zhang Running TPCDS query #17, given below, results in an IndexOutOfBoundsException: {code}
14/10/06 12:24:05 ERROR executor.Executor: Exception in task 0.0 in stage 7.0 (TID 2)
java.lang.IndexOutOfBoundsException: Index: 1902425, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:604)
    at java.util.ArrayList.get(ArrayList.java:382)
    at org.apache.hive.com.esotericsoftware.kryo.util.MapReferenceResolver.getReadObject(MapReferenceResolver.java:42)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readReferenceOrNull(Kryo.java:820)
    at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:670)
    at org.apache.hadoop.hive.ql.exec.spark.KryoSerializer.deserialize(KryoSerializer.java:51)
    at org.apache.hadoop.hive.ql.exec.spark.HiveKVResultCache.next(HiveKVResultCache.java:114)
    at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:139)
    at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:92)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:210)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
{code} The query is: {code}
select i_item_id,
       avg(ss_quantity) agg1,
       avg(ss_list_price) agg2,
       avg(ss_coupon_amt) agg3,
       avg(ss_sales_price) agg4
from store_sales, customer_demographics, date_dim, item, promotion
where ss_sold_date_sk = d_date_sk
  and ss_item_sk = i_item_sk
  and ss_cdemo_sk = cd_demo_sk
  and ss_promo_sk = p_promo_sk
  and cd_gender = 'F'
  and cd_marital_status = 'W'
  and cd_education_status = 'Primary'
  and (p_channel_email = 'N' or p_channel_event = 'N')
  and d_year = 1998
  and ss_sold_date_sk between 2450815 and 2451179 -- partition key filter
group by i_item_id
order by i_item_id
limit 100;
{code} Many other TPCDS queries give the same exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8366) CBO fails if there is a table sample in subquery
Ashutosh Chauhan created HIVE-8366: -- Summary: CBO fails if there is a table sample in subquery Key: HIVE-8366 URL: https://issues.apache.org/jira/browse/HIVE-8366 Project: Hive Issue Type: Bug Components: CBO, Logical Optimizer Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8366.patch Bail out from cbo in such cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8366) CBO fails if there is a table sample in subquery
[ https://issues.apache.org/jira/browse/HIVE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8366: --- Attachment: HIVE-8366.patch CBO fails if there is a table sample in subquery Key: HIVE-8366 URL: https://issues.apache.org/jira/browse/HIVE-8366 Project: Hive Issue Type: Bug Components: CBO, Logical Optimizer Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8366.patch Bail out from cbo in such cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8366) CBO fails if there is a table sample in subquery
[ https://issues.apache.org/jira/browse/HIVE-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-8366: --- Status: Patch Available (was: Open) CBO fails if there is a table sample in subquery Key: HIVE-8366 URL: https://issues.apache.org/jira/browse/HIVE-8366 Project: Hive Issue Type: Bug Components: CBO, Logical Optimizer Affects Versions: 0.14.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-8366.patch Bail out from cbo in such cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160813#comment-14160813 ] Lefty Leverenz commented on HIVE-6500: -- Good catch, [~szehon]. Yes, the Newly Created Tables section of the StatsDev wikidoc needs to be updated, keeping in mind that releases 0.7 through 0.12 have jdbc:derby as the default for *hive.stats.dbclass* so we can't just swap in the new default value. Linking to/from *hive.stats.dbclass* in the Configuration Properties doc will help with future maintenance. * [StatsDev -- Newly Created Tables | https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-NewlyCreatedTables] * [Configuration Properties -- hive.stats.dbclass | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.dbclass] Also, the HiveConf.java description of *hive.stats.dbclass* omits the fs value. I can correct that in the next patch for HIVE-6586, perhaps using the wiki description or a variant of it: {quote} The storage that stores temporary Hive statistics. In FS based statistics collection, each task writes statistics it has collected in a file on the filesystem, which will be aggregated after the job has finished. Supported values are fs (filesystem), jdbc(:.*), hbase, counter and custom (HIVE-6500). {quote} Suggested changes to that description: (1) change FS to filesystem (fs), (2) remove or move (HIVE-6500) so it doesn't imply that HIVE-6500 added custom, (3) change jdbc(:.*) to jdbc:database and explain that database can be derby, mysql, ... and what others -- is there a complete list anywhere? P.S. What do you mean by "It is actually not linked from the top"? Top of what? Maybe you mean it belongs on the Home page. Currently it's listed on the LanguageManual page, but that's easy to change -- we can even list it both places. 
Stats collection via filesystem --- Key: HIVE-6500 URL: https://issues.apache.org/jira/browse/HIVE-6500 Project: Hive Issue Type: New Feature Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Labels: TODOC14 Fix For: 0.13.0 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch Recently, support for stats gathering via counter was [added | https://issues.apache.org/jira/browse/HIVE-4632] Although, its useful it has following issues: * [Length of counter group name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] * [Length of counter name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] * [Number of distinct counter groups are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] * [Number of distinct counters are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] Although, these limits are configurable, but setting them to higher value implies increased memory load on AM and job history server. Now, whether these limits makes sense or not is [debatable | https://issues.apache.org/jira/browse/MAPREDUCE-5680] it is desirable that Hive doesn't make use of counters features of framework so that it we can evolve this feature without relying on support from framework. Filesystem based counter collection is a step in that direction. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
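The filesystem-based collection described above, where each task writes its statistics to its own file and the client aggregates them after the job finishes, might be sketched like this. Plain java.nio stands in for the HDFS API, and the file layout is hypothetical:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Sketch of filesystem-based stats collection: each task writes one
// row count to its own file under a per-job stats directory; the
// client sums the files once the job completes. No counter limits on
// name length or cardinality apply. Layout is illustrative only.
class FsStats {
    static void writeTaskStat(Path statsDir, String taskId, long rowCount)
            throws IOException {
        Files.createDirectories(statsDir);
        Files.write(statsDir.resolve(taskId + ".stat"),
                Long.toString(rowCount).getBytes());
    }

    static long aggregate(Path statsDir) throws IOException {
        try (Stream<Path> files = Files.list(statsDir)) {
            return files.mapToLong(p -> {
                try {
                    return Long.parseLong(new String(Files.readAllBytes(p)).trim());
                } catch (IOException e) {
                    throw new java.io.UncheckedIOException(e);
                }
            }).sum();
        }
    }

    // Demo: two "tasks" write counts, then the client aggregates.
    static long demo() {
        try {
            Path dir = Files.createTempDirectory("fsstats");
            writeTaskStat(dir, "task_0", 100L);
            writeTaskStat(dir, "task_1", 250L);
            return aggregate(dir);
        } catch (IOException e) {
            return -1L;
        }
    }
}
```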
Review Request 26379: Disable cbo for tablesample
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26379/ --- Review request for hive and John Pullokkaran. Bugs: HIVE-8366 https://issues.apache.org/jira/browse/HIVE-8366 Repository: hive-git Description --- Disable cbo for tablesample Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 7c2b0cd Diff: https://reviews.apache.org/r/26379/diff/ Testing --- udf_substr.q Thanks, Ashutosh Chauhan
[jira] [Commented] (HIVE-8120) Umbrella JIRA tracking Parquet improvements
[ https://issues.apache.org/jira/browse/HIVE-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160823#comment-14160823 ] Brock Noland commented on HIVE-8120: Linking to HIVE-4329 Umbrella JIRA tracking Parquet improvements --- Key: HIVE-8120 URL: https://issues.apache.org/jira/browse/HIVE-8120 Project: Hive Issue Type: Improvement Reporter: Brock Noland -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6500: - Labels: TODOC13 TODOC14 (was: TODOC14) Stats collection via filesystem --- Key: HIVE-6500 URL: https://issues.apache.org/jira/browse/HIVE-6500 Project: Hive Issue Type: New Feature Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Labels: TODOC13, TODOC14 Fix For: 0.13.0 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch Recently, support for stats gathering via counter was [added | https://issues.apache.org/jira/browse/HIVE-4632] Although, its useful it has following issues: * [Length of counter group name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] * [Length of counter name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] * [Number of distinct counter groups are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] * [Number of distinct counters are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] Although, these limits are configurable, but setting them to higher value implies increased memory load on AM and job history server. 
Now, whether these limits makes sense or not is [debatable | https://issues.apache.org/jira/browse/MAPREDUCE-5680] it is desirable that Hive doesn't make use of counters features of framework so that it we can evolve this feature without relying on support from framework. Filesystem based counter collection is a step in that direction. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7800) Parquet Column Index Access Schema Size Checking
[ https://issues.apache.org/jira/browse/HIVE-7800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-7800: --- Resolution: Fixed Fix Version/s: (was: 0.14.0) 0.15.0 Status: Resolved (was: Patch Available) Thank you so much Daniel! I have committed this to trunk. [~vikram.dixit] could we get this into 0.14? Parquet Column Index Access Schema Size Checking Key: HIVE-7800 URL: https://issues.apache.org/jira/browse/HIVE-7800 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Daniel Weeks Assignee: Daniel Weeks Priority: Critical Fix For: 0.15.0 Attachments: HIVE-7800.1.patch, HIVE-7800.2.patch, HIVE-7800.3.patch In the case that a parquet formatted table has partitions where the files have different size schema, using column index access can result in an index out of bounds exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-6500) Stats collection via filesystem
[ https://issues.apache.org/jira/browse/HIVE-6500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003107#comment-14003107 ] Lefty Leverenz edited comment on HIVE-6500 at 10/6/14 8:02 PM: --- Unfortunately my review board advice not to patch hive-default.xml.template led to release 0.13.0 having the obsolete default value for *hive.stats.dbclass* in the template file. But it's updated in the most recent patch for HIVE-6037, so presumably it will be corrected by release 0.14.0. Sorry about that. Edit: The updated parameter description didn't make it into the new version of HiveConf.java, so it needs to be fixed in another patch. (I suggest HIVE-6586.) was (Author: le...@hortonworks.com): Unfortunately my review board advice not to patch hive-default.xml.template led to release 0.13.0 having the obsolete default value for *hive.stats.dbclass* in the template file. But it's updated in the most recent patch for HIVE-6037, so presumably it will be corrected by release 0.14.0. Sorry about that. 
Stats collection via filesystem --- Key: HIVE-6500 URL: https://issues.apache.org/jira/browse/HIVE-6500 Project: Hive Issue Type: New Feature Components: Statistics Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Labels: TODOC13, TODOC14 Fix For: 0.13.0 Attachments: HIVE-6500.2.patch, HIVE-6500.3.patch, HIVE-6500.patch Recently, support for stats gathering via counter was [added | https://issues.apache.org/jira/browse/HIVE-4632] Although, its useful it has following issues: * [Length of counter group name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L340] * [Length of counter name is limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L337] * [Number of distinct counter groups are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L343] * [Number of distinct counters are limited | https://github.com/apache/hadoop-common/blob/branch-2.3/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/MRJobConfig.java?source=c#L334] Although, these limits are configurable, but setting them to higher value implies increased memory load on AM and job history server. Now, whether these limits makes sense or not is [debatable | https://issues.apache.org/jira/browse/MAPREDUCE-5680] it is desirable that Hive doesn't make use of counters features of framework so that it we can evolve this feature without relying on support from framework. Filesystem based counter collection is a step in that direction. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8361) NPE in PTFOperator when there are empty partitions
[ https://issues.apache.org/jira/browse/HIVE-8361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160851#comment-14160851 ] Mostafa Mokhtar commented on HIVE-8361: --- [~rhbutani] Validated the fix on query98 and it ran fine. NPE in PTFOperator when there are empty partitions -- Key: HIVE-8361 URL: https://issues.apache.org/jira/browse/HIVE-8361 Project: Hive Issue Type: Bug Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-8361.1.patch Here is a simple query to reproduce this: {code} select sum(p_size) over (partition by p_mfgr ) from part where p_mfgr = 'some non existent mfgr'; {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
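A generic sketch of the guard implied by the fix: a windowed aggregate over an empty partition (the repro's WHERE clause matches no rows) should emit nothing rather than assume at least one row. This is illustrative, not PTFOperator's actual code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: sum(...) over (partition by ...) with an unbounded window.
// An empty partition must yield an empty output list, not an NPE.
class WindowedSum {
    static List<Long> sumOverPartition(List<Integer> rows) {
        List<Long> out = new ArrayList<>();
        if (rows == null || rows.isEmpty()) {
            return out; // empty partition: nothing to emit
        }
        long total = 0;
        for (int v : rows) {
            total += v;
        }
        for (int i = 0; i < rows.size(); i++) {
            out.add(total); // unbounded window: every row sees the full sum
        }
        return out;
    }

    static List<Long> demo() {
        return sumOverPartition(Arrays.asList(10, 20, 30));
    }
}
```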
[jira] [Updated] (HIVE-8352) Enable windowing.q for spark
[ https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8352: --- Resolution: Fixed Fix Version/s: spark-branch Status: Resolved (was: Patch Available) Enable windowing.q for spark Key: HIVE-8352 URL: https://issues.apache.org/jira/browse/HIVE-8352 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Priority: Minor Fix For: spark-branch Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, hive-8385.patch We should enable windowing.q for basic windowing coverage. After checking out the spark branch, we would build: {noformat}
$ mvn clean install -DskipTests -Phadoop-2
$ cd itests/
$ mvn clean install -DskipTests -Phadoop-2
{noformat} Then generate the windowing.q.out file: {noformat}
$ cd qtest-spark/
$ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 -Dtest.output.overwrite=true
{noformat} Compare the output against MapReduce: {noformat}
$ diff -y -W 150 ../../ql/src/test/results/clientpositive/spark/windowing.q.out ../../ql/src/test/results/clientpositive/windowing.q.out | less
{noformat} And if everything looks good, add it to {{spark.query.files}} in {{./itests/src/test/resources/testconfiguration.properties}}, then submit the patch including the .q file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26325: HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26325/#review55571 --- Ship it! Ship It! - Thejas Nair On Oct. 3, 2014, 7:13 p.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26325/ --- (Updated Oct. 3, 2014, 7:13 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-8172 https://issues.apache.org/jira/browse/HIVE-8172 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-8172 Diffs - jdbc/src/java/org/apache/hive/jdbc/Utils.java e6b1a36 jdbc/src/java/org/apache/hive/jdbc/ZooKeeperHiveClientHelper.java 06795a5 Diff: https://reviews.apache.org/r/26325/diff/ Testing --- Thanks, Vaibhav Gumashta
[jira] [Commented] (HIVE-8172) HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace
[ https://issues.apache.org/jira/browse/HIVE-8172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160863#comment-14160863 ] Thejas M Nair commented on HIVE-8172: - +1 HiveServer2 dynamic service discovery should let the JDBC client use default ZooKeeper namespace Key: HIVE-8172 URL: https://issues.apache.org/jira/browse/HIVE-8172 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.14.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Critical Labels: TODOC14 Fix For: 0.14.0 Attachments: HIVE-8172.1.patch Currently the client provides a url like: jdbc:hive2://vgumashta.local:2181,vgumashta.local:2182,vgumashta.local:2183/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2. The zooKeeperNamespace param when not provided should use the default value. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
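The fix above amounts to falling back to a default namespace when the JDBC URL omits the zooKeeperNamespace parameter. A hedged sketch of that defaulting follows; the default string "hiveserver2" is an assumption drawn from the example URL in the JIRA, not a confirmed constant:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of session-variable defaulting for the HS2 JDBC URL: use the
// zooKeeperNamespace variable when present, otherwise a default. The
// default value below is assumed from the JIRA's example URL.
class ZkNamespace {
    static final String DEFAULT_NAMESPACE = "hiveserver2"; // assumed default

    static String resolve(Map<String, String> sessionVars) {
        String ns = sessionVars.get("zooKeeperNamespace");
        return (ns == null || ns.isEmpty()) ? DEFAULT_NAMESPACE : ns;
    }

    static String demoExplicit() {
        Map<String, String> vars = new HashMap<>();
        vars.put("zooKeeperNamespace", "custom-ns");
        return resolve(vars);
    }

    static String demoDefault() {
        return resolve(new HashMap<>()); // parameter omitted from the URL
    }
}
```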
[jira] [Resolved] (HIVE-8335) TestHCatLoader/TestHCatStorer failures on pre-commit tests
[ https://issues.apache.org/jira/browse/HIVE-8335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere resolved HIVE-8335. -- Resolution: Fixed Fix Version/s: 0.14.0 Assignee: Gopal V Issue was resolved by Gopal reverting HIVE-8271. TestHCatLoader/TestHCatStorer failures on pre-commit tests -- Key: HIVE-8335 URL: https://issues.apache.org/jira/browse/HIVE-8335 Project: Hive Issue Type: Bug Components: HCatalog, Tests Reporter: Jason Dere Assignee: Gopal V Fix For: 0.14.0 Looks like a number of Hive pre-commit tests have been failing with the following failures: {noformat} org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadBasic[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testConvertBooleanToInt[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadComplex[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testColumnarStorePushdown[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testGetInputBytes[5] org.apache.hive.hcatalog.pig.TestHCatStorer.testNoAlias[5] org.apache.hive.hcatalog.pig.TestHCatStorer.testEmptyStore[5] org.apache.hive.hcatalog.pig.TestHCatStorer.testDynamicPartitioningMultiPartColsNoDataInDataNoSpec[5] org.apache.hive.hcatalog.pig.TestHCatStorer.testPartitionPublish[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testSchemaLoadPrimitiveTypes[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataBasic[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testReadPartitionedBasic[5] org.apache.hive.hcatalog.pig.TestHCatLoader.testProjectionsBasic[5] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26277: Shim KerberosName (causes build failure on hadoop-1)
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26277/#review55572 --- Ship it! Ship It! - Thejas Nair On Oct. 3, 2014, 6:39 p.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26277/ --- (Updated Oct. 3, 2014, 6:39 p.m.) Review request for hive, dilli dorai, Szehon Ho, and Thejas Nair. Bugs: HIVE-8324 https://issues.apache.org/jira/browse/HIVE-8324 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-8324 Diffs - service/src/java/org/apache/hive/service/auth/HiveAuthFactory.java 83dd2e6 service/src/java/org/apache/hive/service/cli/thrift/ThriftHttpServlet.java 312d05e shims/0.20/src/main/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java a353a46 shims/0.20S/src/main/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java 030cb75 shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java 0731108 shims/common/src/main/java/org/apache/hadoop/hive/shims/HadoopShims.java 4fcaa1e Diff: https://reviews.apache.org/r/26277/diff/ Testing --- Thanks, Vaibhav Gumashta
[jira] [Updated] (HIVE-8321) Fix serialization of TypeInfo for qualified types
[ https://issues.apache.org/jira/browse/HIVE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-8321: - Attachment: HIVE-8321.3.patch Looks like HCat tests were failing due to HIVE-8335. Re-attaching same patch. Fix serialization of TypeInfo for qualified types - Key: HIVE-8321 URL: https://issues.apache.org/jira/browse/HIVE-8321 Project: Hive Issue Type: Bug Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-8321.1.patch, HIVE-8321.2.patch, HIVE-8321.3.patch TypeInfos for decimal/char/varchar don't appear to be serializing properly with javaXML. Decimal needed proper getters/setters for precision/scale. Also disabling setTypeInfo since for decimal/char/varchar the proper type name should already be set by the constructor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
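The fix above hinges on JavaBeans XML serialization (java.beans.XMLEncoder) only persisting properties that expose matching getter/setter pairs on a class with a public no-arg constructor, which is why precision/scale needed proper accessors. A minimal sketch of such a bean (a hypothetical class, not Hive's DecimalTypeInfo):

```java
// Sketch: java.beans.XMLEncoder persists only properties with matching
// getter/setter pairs and requires a public no-arg constructor. A bean
// shaped like this round-trips its precision/scale; one without the
// setters would silently lose them. Hypothetical class for illustration.
public class DecimalTypeBean {
    private int precision = 10;
    private int scale = 0;

    public DecimalTypeBean() { } // required by XMLDecoder

    public int getPrecision() { return precision; }
    public void setPrecision(int precision) { this.precision = precision; }

    public int getScale() { return scale; }
    public void setScale(int scale) { this.scale = scale; }

    // Demo mirroring decimal(11,3) from the related HIVE-8272 query.
    public static DecimalTypeBean demo() {
        DecimalTypeBean b = new DecimalTypeBean();
        b.setPrecision(11);
        b.setScale(3);
        return b;
    }
}
```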
[jira] [Updated] (HIVE-8352) Enable windowing.q for spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brock Noland updated HIVE-8352: --- Summary: Enable windowing.q for spark [Spark Branch] (was: Enable windowing.q for spark) Enable windowing.q for spark [Spark Branch] --- Key: HIVE-8352 URL: https://issues.apache.org/jira/browse/HIVE-8352 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Brock Noland Assignee: Jimmy Xiang Priority: Minor Fix For: spark-branch Attachments: HIVE-8352.1-spark.patch, HIVE-8352.1-spark.patch, hive-8385.patch We should enable windowing.q for basic windowing coverage. After checking out the spark branch, we would build: {noformat}
$ mvn clean install -DskipTests -Phadoop-2
$ cd itests/
$ mvn clean install -DskipTests -Phadoop-2
{noformat} Then generate the windowing.q.out file: {noformat}
$ cd qtest-spark/
$ mvn test -Dtest=TestSparkCliDriver -Dqfile=windowing.q -Phadoop-2 -Dtest.output.overwrite=true
{noformat} Compare the output against MapReduce: {noformat}
$ diff -y -W 150 ../../ql/src/test/results/clientpositive/spark/windowing.q.out ../../ql/src/test/results/clientpositive/windowing.q.out | less
{noformat} And if everything looks good, add it to {{spark.query.files}} in {{./itests/src/test/resources/testconfiguration.properties}}, then submit the patch including the .q file. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8321) Fix serialization of TypeInfo for qualified types
[ https://issues.apache.org/jira/browse/HIVE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-8321: - Status: Patch Available (was: Open) Fix serialization of TypeInfo for qualified types - Key: HIVE-8321 URL: https://issues.apache.org/jira/browse/HIVE-8321 Project: Hive Issue Type: Bug Components: Types Reporter: Jason Dere Assignee: Jason Dere Attachments: HIVE-8321.1.patch, HIVE-8321.2.patch, HIVE-8321.3.patch TypeInfos for decimal/char/varchar don't appear to be serializing properly with javaXML. Decimal needed proper getters/setters for precision/scale. Also disabling setTypeInfo since for decimal/char/varchar the proper type name should already be set by the constructor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8358) Constant folding should happen before predicate pushdown
[ https://issues.apache.org/jira/browse/HIVE-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160884#comment-14160884 ] Hive QA commented on HIVE-8358: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12673119/HIVE-8358.patch {color:red}ERROR:{color} -1 due to 57 failed/errored test(s), 6525 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_columnstats_partlvl_dp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_dp org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_1_23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_sort_skew_1_23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auto_unused org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_stale_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input26 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part0 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_partition_metadataonly org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_merge_dynamic_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_nonmr_fetch_threshold org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_udf_case org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_union_view org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_quotedid_partition org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_truncate_column_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_remove_25 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_view 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucket3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_sample1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_transform_ppr2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1133/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/1133/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-1133/ Messages: {noformat}
[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160899#comment-14160899 ] Szehon Ho commented on HIVE-7068: - This breaks hadoop-1 compilation, [~elserj] would you have a chance to look at this? Filed as HIVE-8363; the cause is a reference to a StringUtils method whose signature changed. Integrate AccumuloStorageHandler Key: HIVE-7068 URL: https://issues.apache.org/jira/browse/HIVE-7068 Project: Hive Issue Type: New Feature Reporter: Josh Elser Assignee: Josh Elser Fix For: 0.14.0 Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, HIVE-7068.4.patch [Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to HBase. Some [initial work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done to support querying an Accumulo table using Hive already. It is not a complete solution as, most notably, the current implementation presently lacks support for INSERTs. I would like to polish up the AccumuloStorageHandler (presently based on 0.10), implement missing basic functionality and compare it to the HBaseStorageHandler (to ensure that we follow the same general usage patterns). I've also been in communication with [~bfem] (the initial author) who expressed interest in working on this again. I hope to coordinate efforts with him. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-8363) AccumuloStorageHandler compile failure hadoop-1
[ https://issues.apache.org/jira/browse/HIVE-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Elser reassigned HIVE-8363: Assignee: Josh Elser AccumuloStorageHandler compile failure hadoop-1 --- Key: HIVE-8363 URL: https://issues.apache.org/jira/browse/HIVE-8363 Project: Hive Issue Type: Bug Components: StorageHandler Affects Versions: 0.14.0 Reporter: Szehon Ho Assignee: Josh Elser Priority: Blocker There's a compilation error in AccumuloStorageHandler on hadoop-1. It seems the signature of split() is not the same. Looks like we should use another utility method to fix this. {code} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hive-accumulo-handler: Compilation failure [ERROR] /data/hive-ptest/working/apache-svn-trunk-source/accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/columns/ColumnMapper.java:[57,52] no suitable method found for split(java.lang.String,char) [ERROR] method org.apache.hadoop.util.StringUtils.split(java.lang.String,char,char) is not applicable {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
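The compile error above comes from hadoop-1's org.apache.hadoop.util.StringUtils lacking the two-argument split(String, char) overload. One version-agnostic workaround is to avoid Hadoop's StringUtils altogether and split on unescaped separators in plain Java. The helper below is a hypothetical sketch of that idea (PortableSplit is an illustrative name), not the actual HIVE-8363 patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical portable helper: splits on an unescaped separator char,
// avoiding the StringUtils.split(String, char) overload that hadoop-1
// does not provide (it only has split(String, char, char)).
public class PortableSplit {
    public static List<String> split(String s, char escape, char separator) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean escaped = false;
        for (char c : s.toCharArray()) {
            if (escaped) {
                // previous char was the escape: take this char literally
                cur.append(c);
                escaped = false;
            } else if (c == escape) {
                escaped = true;
            } else if (c == separator) {
                parts.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        parts.add(cur.toString());
        return parts;
    }

    public static void main(String[] args) {
        // splitting a column-mapping-style string on commas
        System.out.println(split(":key,cf:string,:timestamp", '\\', ','));
        // prints [:key, cf:string, :timestamp]
    }
}
```

Because the helper depends on nothing version-specific, it compiles identically against hadoop-1 and hadoop-2 classpaths.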
[jira] [Commented] (HIVE-7068) Integrate AccumuloStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160903#comment-14160903 ] Josh Elser commented on HIVE-7068: -- [~szehon], yeah, I can get a patch up there today. Integrate AccumuloStorageHandler Key: HIVE-7068 URL: https://issues.apache.org/jira/browse/HIVE-7068 Project: Hive Issue Type: New Feature Reporter: Josh Elser Assignee: Josh Elser Fix For: 0.14.0 Attachments: HIVE-7068.1.patch, HIVE-7068.2.patch, HIVE-7068.3.patch, HIVE-7068.4.patch [Accumulo|http://accumulo.apache.org] is a BigTable-clone which is similar to HBase. Some [initial work|https://github.com/bfemiano/accumulo-hive-storage-manager] has been done to support querying an Accumulo table using Hive already. It is not a complete solution as, most notably, the current implementation presently lacks support for INSERTs. I would like to polish up the AccumuloStorageHandler (presently based on 0.10), implement missing basic functionality and compare it to the HBaseStorageHandler (to ensure that we follow the same general usage patterns). I've also been in communication with [~bfem] (the initial author) who expressed interest in working on this again. I hope to coordinate efforts with him. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26379: Disable cbo for tablesample
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26379/#review55577 --- Ship it! Ship It! - John Pullokkaran On Oct. 6, 2014, 7:50 p.m., Ashutosh Chauhan wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26379/ --- (Updated Oct. 6, 2014, 7:50 p.m.) Review request for hive and John Pullokkaran. Bugs: HIVE-8366 https://issues.apache.org/jira/browse/HIVE-8366 Repository: hive-git Description --- Disable cbo for tablesample Diffs - ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/HiveOptiqUtil.java 7c2b0cd Diff: https://reviews.apache.org/r/26379/diff/ Testing --- udf_substr.q Thanks, Ashutosh Chauhan
[jira] [Created] (HIVE-8367) delete writes records in wrong order in some cases
Alan Gates created HIVE-8367: Summary: delete writes records in wrong order in some cases Key: HIVE-8367 URL: https://issues.apache.org/jira/browse/HIVE-8367 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Blocker Fix For: 0.14.0 I have found one query with 10k records where you do:
create table
insert into table -- 10k records
delete from table -- just some records
The records in the delete delta are not ordered properly by rowid. I assume this applies to updates as well, but I haven't tested it yet. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: Review Request 26282: Hook HiveServer2 dynamic service discovery with session time out
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26282/#review55581 --- Ship it! Ship It! - Thejas Nair On Oct. 2, 2014, 9 p.m., Vaibhav Gumashta wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26282/ --- (Updated Oct. 2, 2014, 9 p.m.) Review request for hive and Thejas Nair. Bugs: HIVE-8193 https://issues.apache.org/jira/browse/HIVE-8193 Repository: hive-git Description --- https://issues.apache.org/jira/browse/HIVE-8193 Diffs - service/src/java/org/apache/hive/service/cli/CLIService.java b46c5b4 service/src/java/org/apache/hive/service/cli/session/SessionManager.java ecc9b96 service/src/java/org/apache/hive/service/cli/thrift/EmbeddedThriftBinaryCLIService.java 9ee9785 service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 4a1e004 service/src/java/org/apache/hive/service/server/HiveServer2.java c667533 service/src/test/org/apache/hive/service/auth/TestPlainSaslHelper.java fb784aa service/src/test/org/apache/hive/service/cli/session/TestSessionGlobalInitFile.java 47d3a56 Diff: https://reviews.apache.org/r/26282/diff/ Testing --- Manually with ZooKeeper. Thanks, Vaibhav Gumashta
[jira] [Commented] (HIVE-8193) Hook HiveServer2 dynamic service discovery with session time out
[ https://issues.apache.org/jira/browse/HIVE-8193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160935#comment-14160935 ] Thejas M Nair commented on HIVE-8193: - +1 Hook HiveServer2 dynamic service discovery with session time out Key: HIVE-8193 URL: https://issues.apache.org/jira/browse/HIVE-8193 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.14.0 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Priority: Critical Fix For: 0.14.0 Attachments: HIVE-8193.1.patch For dynamic service discovery, if the HiveServer2 instance is removed from ZooKeeper, currently, on the last client close, the server shuts down. However, we need to ensure that this also happens when a session is closed on timeout and no current sessions exist on this instance of HiveServer2. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HIVE-8368) compactor is improperly writing delete records in base file
Alan Gates created HIVE-8368: Summary: compactor is improperly writing delete records in base file Key: HIVE-8368 URL: https://issues.apache.org/jira/browse/HIVE-8368 Project: Hive Issue Type: Bug Components: Transactions Affects Versions: 0.14.0 Reporter: Alan Gates Assignee: Alan Gates Priority: Critical Fix For: 0.14.0 When the compactor reads records from the base and deltas, it is not properly dropping delete records. This leads to oversized base files, and possibly to wrong query results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
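To illustrate the expected behavior the report describes: when a major compaction merges the base and deltas, delete events should remove their target rows and be dropped themselves, never copied into the new base. The sketch below models that with illustrative types; CompactorSketch, Record, and the OP_ constants are assumptions for the example, not Hive's actual ACID reader classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of correct compactor behavior: delete events remove
// their target rows and are themselves discarded. Copying delete events
// into the base is the bug described above (oversized base files,
// possibly wrong query results).
public class CompactorSketch {
    static final int OP_INSERT = 0;
    static final int OP_DELETE = 2;

    static class Record {
        final long rowId;
        final int operation;
        Record(long rowId, int operation) {
            this.rowId = rowId;
            this.operation = operation;
        }
    }

    // Merge base + delta events into a new base: keep only rows that are
    // not deleted, and never emit the delete events themselves.
    static List<Record> compactToBase(List<Record> merged) {
        Set<Long> deleted = new HashSet<>();
        for (Record r : merged) {
            if (r.operation == OP_DELETE) {
                deleted.add(r.rowId);
            }
        }
        List<Record> base = new ArrayList<>();
        for (Record r : merged) {
            if (r.operation != OP_DELETE && !deleted.contains(r.rowId)) {
                base.add(r);
            }
        }
        return base;
    }

    public static void main(String[] args) {
        List<Record> merged = new ArrayList<>();
        merged.add(new Record(1, OP_INSERT));
        merged.add(new Record(2, OP_INSERT));
        merged.add(new Record(1, OP_DELETE)); // delta deletes row 1
        // new base holds only the surviving row (rowId 2)
        System.out.println(compactToBase(merged).size()); // prints 1
    }
}
```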
[jira] [Commented] (HIVE-8360) Add cross cluster support for webhcat E2E tests
[ https://issues.apache.org/jira/browse/HIVE-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160957#comment-14160957 ] Thejas M Nair commented on HIVE-8360: - +1 Add cross cluster support for webhcat E2E tests --- Key: HIVE-8360 URL: https://issues.apache.org/jira/browse/HIVE-8360 Project: Hive Issue Type: Test Components: Tests, WebHCat Environment: Secure cluster Reporter: Aswathy Chellammal Sreekumar Attachments: AD-MIT.patch In the current WebHCat E2E test setup, cross-domain secure cluster runs will fail since the realm name for user principals is not included in the kinit command. This patch concatenates the realm name to the user principal, thereby resulting in a successful kinit. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8362) Investigate flaky test parallel.q [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160959#comment-14160959 ] Chao commented on HIVE-8362: Ran it several times - sometimes I got this diff: {noformat} --- a/ql/src/test/results/clientpositive/spark/parallel.q.out +++ b/ql/src/test/results/clientpositive/spark/parallel.q.out @@ -149,6 +149,7 @@ POSTHOOK: type: QUERY POSTHOOK: Input: default@src POSTHOOK: Output: default@src_a POSTHOOK: Output: default@src_b +POSTHOOK: Lineage: src_a.key SIMPLE [(src)src.FieldSchema(name:key, type:string, comment:default), ] POSTHOOK: Lineage: src_a.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] POSTHOOK: Lineage: src_b.key SIMPLE [(src)src.FieldSchema(name:key, type:string, comment:default), ] POSTHOOK: Lineage: src_b.value SIMPLE [(src)src.FieldSchema(name:value, type:string, comment:default), ] {noformat} Investigate flaky test parallel.q [Spark Branch] Key: HIVE-8362 URL: https://issues.apache.org/jira/browse/HIVE-8362 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Jimmy Xiang Assignee: Jimmy Xiang Labels: spark Test parallel.q is flaky. It fails sometimes with error like: {noformat} Failed tests: TestSparkCliDriver.testCliDriver_parallel:120-runTest:146 Unexpected exception junit.framework.AssertionFailedError: Client Execution results failed with error code = 1 See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs. {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-2828) make timestamp accessible in the hbase KeyValue
[ https://issues.apache.org/jira/browse/HIVE-2828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160981#comment-14160981 ] Sushanth Sowmyan commented on HIVE-2828: Sure, I'll try to look into this tonight. make timestamp accessible in the hbase KeyValue Key: HIVE-2828 URL: https://issues.apache.org/jira/browse/HIVE-2828 Project: Hive Issue Type: Improvement Components: HBase Handler Reporter: Navis Assignee: Navis Priority: Trivial Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.2.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.3.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.4.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2828.D1989.5.patch, HIVE-2828.6.patch.txt, HIVE-2828.7.patch.txt, HIVE-2828.8.patch.txt Originated from HIVE-2781 and not accepted, but I think this could be helpful to someone. By using special column notation ':timestamp' in HBASE_COLUMNS_MAPPING, user might access timestamp value in hbase KeyValue. {code} CREATE TABLE hbase_table (key int, value string, time timestamp) STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:string,:timestamp") {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8261) CBO : Predicate pushdown is removed by Optiq
[ https://issues.apache.org/jira/browse/HIVE-8261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14160999#comment-14160999 ] Harish Butani commented on HIVE-8261: - [~vikram.dixit] can we add this to the 0.14 branch? CBO : Predicate pushdown is removed by Optiq - Key: HIVE-8261 URL: https://issues.apache.org/jira/browse/HIVE-8261 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.14.0, 0.13.1 Reporter: Mostafa Mokhtar Assignee: Harish Butani Fix For: 0.14.0 Attachments: HIVE-8261.1.patch The plan for TPC-DS Q64 wasn't optimal; upon looking at the logical plan I realized that predicate pushdown is not applied on date_dim d1. Interestingly, before Optiq we have the predicate pushed: {code} HiveFilterRel(condition=[=($5, $1)]) HiveJoinRel(condition=[=($3, $6)], joinType=[inner]) HiveProjectRel(_o__col0=[$0], _o__col1=[$2], _o__col2=[$3], _o__col3=[$1]) HiveFilterRel(condition=[=($0, 2000)]) HiveAggregateRel(group=[{0, 1}], agg#0=[count()], agg#1=[sum($2)]) HiveProjectRel($f0=[$4], $f1=[$5], $f2=[$2]) HiveJoinRel(condition=[=($1, $8)], joinType=[inner]) HiveJoinRel(condition=[=($1, $5)], joinType=[inner]) HiveJoinRel(condition=[=($0, $3)], joinType=[inner]) HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], ss_wholesale_cost=[$11]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.store_sales]]) HiveProjectRel(d_date_sk=[$0], d_year=[$6]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.date_dim]]) HiveFilterRel(condition=[AND(in($2, 'maroon', 'burnished', 'dim', 'steel', 'navajo', 'chocolate'), between(false, $1, 35, +(35, 10)), between(false, $1, +(35, 1), +(35, 15)))]) HiveProjectRel(i_item_sk=[$0], i_current_price=[$5], i_color=[$17]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.item]]) HiveProjectRel(_o__col0=[$0]) HiveAggregateRel(group=[{0}]) HiveProjectRel($f0=[$0]) HiveJoinRel(condition=[AND(=($0, $2), =($1, $3))], joinType=[inner]) HiveProjectRel(cs_item_sk=[$15], cs_order_number=[$17]) 
HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_sales]]) HiveProjectRel(cr_item_sk=[$2], cr_order_number=[$16]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_returns]]) HiveProjectRel(_o__col0=[$0], _o__col1=[$2], _o__col3=[$1]) HiveFilterRel(condition=[=($0, +(2000, 1))]) HiveAggregateRel(group=[{0, 1}], agg#0=[count()]) HiveProjectRel($f0=[$4], $f1=[$5], $f2=[$2]) HiveJoinRel(condition=[=($1, $8)], joinType=[inner]) HiveJoinRel(condition=[=($1, $5)], joinType=[inner]) HiveJoinRel(condition=[=($0, $3)], joinType=[inner]) HiveProjectRel(ss_sold_date_sk=[$0], ss_item_sk=[$2], ss_wholesale_cost=[$11]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.store_sales]]) HiveProjectRel(d_date_sk=[$0], d_year=[$6]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.date_dim]]) HiveFilterRel(condition=[AND(in($2, 'maroon', 'burnished', 'dim', 'steel', 'navajo', 'chocolate'), between(false, $1, 35, +(35, 10)), between(false, $1, +(35, 1), +(35, 15)))]) HiveProjectRel(i_item_sk=[$0], i_current_price=[$5], i_color=[$17]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.item]]) HiveProjectRel(_o__col0=[$0]) HiveAggregateRel(group=[{0}]) HiveProjectRel($f0=[$0]) HiveJoinRel(condition=[AND(=($0, $2), =($1, $3))], joinType=[inner]) HiveProjectRel(cs_item_sk=[$15], cs_order_number=[$17]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_sales]]) HiveProjectRel(cr_item_sk=[$2], cr_order_number=[$16]) HiveTableScanRel(table=[[tpcds_bin_partitioned_orc_200.catalog_returns]]) {code} While after Optiq the filter on date_dim gets pulled up the plan {code} HiveFilterRel(condition=[=($5, $1)]): rowcount = 1.0, cumulative cost = {5.50188454E8 rows, 0.0 cpu, 0.0 io}, id = 6895 HiveProjectRel(_o__col0=[$0], _o__col1=[$1], _o__col2=[$2], _o__col3=[$3], _o__col00=[$4], _o__col10=[$5], _o__col30=[$6]):
[jira] [Commented] (HIVE-7914) Simplify join predicates for CBO to avoid cross products
[ https://issues.apache.org/jira/browse/HIVE-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14161000#comment-14161000 ] Mostafa Mokhtar commented on HIVE-7914: --- Issue still exists {code} hive explain select avg(ss_quantity) ,avg(ss_ext_sales_price) ,avg(ss_ext_wholesale_cost) ,sum(ss_ext_wholesale_cost) from store_sales ,store ,customer_demographics ,household_demographics ,customer_address ,date_dim where store.s_store_sk = store_sales.ss_store_sk and store_sales.ss_sold_date_sk = date_dim.d_date_sk and date_dim.d_year = 2001 and((store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'M' and customer_demographics.cd_education_status = '4 yr Degree' and store_sales.ss_sales_price between 100.00 and 150.00 and household_demographics.hd_dep_count = 3 )or (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = store_sales.ss_cdemo_sk and customer_demographics.cd_marital_status = 'D' and customer_demographics.cd_education_status = 'Primary' and store_sales.ss_sales_price between 50.00 and 100.00 and household_demographics.hd_dep_count = 1 ) or (store_sales.ss_hdemo_sk=household_demographics.hd_demo_sk and customer_demographics.cd_demo_sk = ss_cdemo_sk and customer_demographics.cd_marital_status = 'U' and customer_demographics.cd_education_status = 'Advanced Degree' and store_sales.ss_sales_price between 150.00 and 200.00 and household_demographics.hd_dep_count = 1 )) and((store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('KY', 'GA', 'NM') and store_sales.ss_net_profit between 100 and 200 ) or (store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('MT', 'OR', 'IN') and store_sales.ss_net_profit 
between 150 and 300 ) or (store_sales.ss_addr_sk = customer_address.ca_address_sk and customer_address.ca_country = 'United States' and customer_address.ca_state in ('WI', 'MO', 'WV') and store_sales.ss_net_profit between 50 and 250 )) ; Warning: Map Join MAPJOIN[49][bigTable=?] in task 'Map 4' is a cross product Warning: Map Join MAPJOIN[48][bigTable=?] in task 'Map 4' is a cross product Warning: Map Join MAPJOIN[47][bigTable=?] in task 'Map 4' is a cross product OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 4 - Map 1 (BROADCAST_EDGE), Map 2 (BROADCAST_EDGE), Map 3 (BROADCAST_EDGE), Map 6 (BROADCAST_EDGE), Map 7 (BROADCAST_EDGE) Reducer 5 - Map 4 (SIMPLE_EDGE) DagName: mmokhtar_20141006173232_992a372b-cc0e-40d5-b51f-7098561df464:3 Vertices: Map 1 Map Operator Tree: TableScan alias: household_demographics Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 7200 Data size: 770400 Basic stats: COMPLETE Column stats: NONE value expressions: hd_demo_sk (type: int), hd_dep_count (type: int) Execution mode: vectorized Map 2 Map Operator Tree: TableScan alias: store filterExpr: s_store_sk is not null (type: boolean) Statistics: Num rows: 212 Data size: 405680 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: s_store_sk is not null (type: boolean) Statistics: Num rows: 106 Data size: 202840 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: s_store_sk (type: int) sort order: + Map-reduce partition columns: s_store_sk (type: int) Statistics: Num rows: 106 Data size: 202840 Basic stats: COMPLETE Column stats: NONE Execution mode: vectorized Map 3 Map Operator Tree: TableScan alias: customer_address Statistics: Num rows: 80 Data size: 811903688 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 80 Data size: 
811903688 Basic stats: COMPLETE Column stats: NONE value expressions: ca_address_sk (type: int), ca_state (type: string), ca_country (type: string) Execution mode: vectorized Map 4 Map Operator Tree: TableScan alias: