[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434683#comment-15434683 ] Hive QA commented on HIVE-14462: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825218/HIVE-14462.6.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10459 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partitions_json] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[add_partition_with_whitelist] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/974/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/974/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-974/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12825218 - PreCommit-HIVE-MASTER-Build > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-14571) Document configuration hive.msck.repair.batch.size
[ https://issues.apache.org/jira/browse/HIVE-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam resolved HIVE-14571. - Resolution: Fixed Updated wiki doc. > Document configuration hive.msck.repair.batch.size > -- > > Key: HIVE-14571 > URL: https://issues.apache.org/jira/browse/HIVE-14571 > Project: Hive > Issue Type: Improvement > Components: Documentation >Reporter: Chinna Rao Lalam >Assignee: Chinna Rao Lalam >Priority: Minor > Labels: TODOC2.2 > Fix For: 2.2.0 > > > Update here > [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)] > {quote} > When there is a large number of untracked partitions, the MSCK REPAIR > TABLE command can be run batch-wise > to avoid an OOME. When a batch size is configured through the property > *hive.msck.repair.batch.size*, the command processes the partitions in batches internally. The > default value of the property is zero, which means all > partitions are processed in one shot. > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
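The batching behaviour described in the quoted wiki text can be illustrated with a minimal sketch. It is not Hive's implementation: the class `MsckBatchSketch` and the method `toBatches` are hypothetical names, and the only behaviour taken from the text is that a batch size of zero means everything is processed in one go.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the code behind MSCK REPAIR TABLE.
final class MsckBatchSketch {

    // Splits the untracked partitions into batches of at most batchSize items.
    // A batchSize of zero (the documented default) yields a single batch.
    // Assumption: negative values are treated like zero.
    static <T> List<List<T>> toBatches(List<T> items, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        if (items.isEmpty()) {
            return batches;
        }
        if (batchSize <= 0) {
            batches.add(new ArrayList<>(items));
            return batches;
        }
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("ds=1", "ds=2", "ds=3", "ds=4", "ds=5");
        System.out.println(toBatches(partitions, 2)); // three batches
        System.out.println(toBatches(partitions, 0)); // one batch with all partitions
    }
}
```

Smaller batches bound how many partitions are held in memory at once, at the cost of more round trips.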
[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3
[ https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435905#comment-15435905 ] Thomas Poepping commented on HIVE-14373: Abdullah, what is wrong with: * at the beginning of the test run, do a mkdir in S3 for a unique test run id * at the end of the test run, do a rmdir for that directory That will remove all leftover data. Maybe we could have a setting to optionally not delete data at the end, to allow for more targeted debugging. Then it would be the responsibility of the user to delete those files after the fact. Are you planning on updating this patch again? > Add integration tests for hive on S3 > > > Key: HIVE-14373 > URL: https://issues.apache.org/jira/browse/HIVE-14373 > Project: Hive > Issue Type: Sub-task >Reporter: Sergio Peña >Assignee: Abdullah Yousufi > Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, > HIVE-14373.04.patch, HIVE-14373.patch > > > With Hive doing improvements to run on S3, it would be ideal to have better > integration testing on S3. > These S3 tests cannot be executed by HiveQA because they need > Amazon credentials. We need to write a suite based on ideas from the Hadoop > project where: > - an xml file is provided with S3 credentials > - a committer must run these tests manually to verify they work > - the xml file should not be part of the commit, and hiveqa should not run > these tests. > https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure -- This message was sent by Atlassian JIRA (v6.3.4#6332)
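The mkdir/rmdir proposal in the comment above can be sketched as follows. This is only an illustration under stated assumptions: a local directory stands in for the S3 bucket, and `TestRunDirSketch`, `createRunDir`, `cleanup` and the `keepData` flag are hypothetical names, not part of the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; a local directory stands in for the S3 bucket.
final class TestRunDirSketch {

    // Start of the test run: create a directory named after a unique run id.
    static Path createRunDir(Path root) {
        try {
            return Files.createDirectories(root.resolve("run-" + UUID.randomUUID()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // End of the test run: remove the whole directory unless keepData is set.
    static void cleanup(Path runDir, boolean keepData) {
        if (keepData) {
            return; // the user is then responsible for deleting the files
        }
        try {
            List<Path> paths;
            try (Stream<Path> walk = Files.walk(runDir)) {
                // Reverse order puts children before their parent directories.
                paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
            }
            for (Path p : paths) {
                Files.delete(p);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hive-s3-sketch");
        Path runDir = createRunDir(root);
        Files.write(runDir.resolve("leftover.txt"), "data".getBytes());
        cleanup(runDir, false);
        System.out.println("run dir still exists: " + Files.exists(runDir));
        cleanup(root, false);
    }
}
```

Because every test writes under the run directory, a single recursive delete at the end removes all leftover data, and the flag keeps it around for targeted debugging.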
[jira] [Commented] (HIVE-13403) Make Streaming API not create empty buckets (at least as an option)
[ https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435904#comment-15435904 ] Hive QA commented on HIVE-13403: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825135/HIVE-13403.5.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10444 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby10.q-skewjoinopt5.q-join32_lessSize.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/976/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/976/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-976/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12825135 - PreCommit-HIVE-MASTER-Build > Make Streaming API not create empty buckets (at least as an option) > --- > > Key: HIVE-13403 > URL: https://issues.apache.org/jira/browse/HIVE-13403 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng >Priority: Critical > Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, > HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch > > > As of HIVE-11983, when a TransactionBatch is opened in StreamingAPI, a full > complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is > created on disk even though some may end up receiving no data. > It would be better to create them on demand and not clog the FS. > Tez can handle missing (empty) buckets, and on MR, bucket join algorithms will > check if all buckets are there and bail out if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14462: Attachment: HIVE-14462.7.patch > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13403) Make Streaming API not create empty buckets
[ https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13403: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks Eugene for review. > Make Streaming API not create empty buckets > --- > > Key: HIVE-13403 > URL: https://issues.apache.org/jira/browse/HIVE-13403 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, > HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch > > > As of HIVE-11983, when a TransactionBatch is opened in StreamingAPI, a full > complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is > created on disk even though some may end up receiving no data. > It would be better to create them on demand and not clog the FS. > Tez can handle missing (empty) buckets, and on MR, bucket join algorithms will > check if all buckets are there and bail out if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14617) NPE in UDF MapValues() if input is null
[ https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-14617: --- Attachment: HIVE-14617.patch > NPE in UDF MapValues() if input is null > --- > > Key: HIVE-14617 > URL: https://issues.apache.org/jira/browse/HIVE-14617 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-14617.patch > > > For query > {code} > select exploded_traits from hdrone.vehiclestore_udr_vehicle > lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as > exploded_traits > where datestr > '2016-08-22' LIMIT 100 > {code} > Job fails with error msg as follows: > {code} > Error: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > 
{"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... > 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error > evaluating map_values(vehicle_traits.vehicle_traits) at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157) > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) > ... 9 more Caused by: java.lang.NullPointerException at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64) > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77) > ... 15 more > {code} > It appears that null is not properly handled in > GenericUDFMapValues.evaluate() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
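The report above attributes the failure to a missing null check in GenericUDFMapValues.evaluate(). The kind of guard involved can be sketched in plain Java. This is not the attached patch: `NullSafeMapValuesSketch` and `mapValues` are hypothetical names, no Hive classes are used, and returning an empty list for a null map is an assumption about the desired behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the UDF logic; it does not use any Hive API.
final class NullSafeMapValuesSketch {

    // Returns the values of the map as a list.
    // Assumption: a null input map yields an empty list instead of an NPE.
    static <K, V> List<V> mapValues(Map<K, V> input) {
        List<V> result = new ArrayList<>();
        if (input == null) {
            return result; // guard against rows where the map column is null
        }
        result.addAll(input.values());
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> traits = new LinkedHashMap<>();
        traits.put("color", "red");
        traits.put("size", "large");
        System.out.println(mapValues(traits));
        System.out.println(mapValues(null));
    }
}
```

With an empty result, a lateral view explode over the value list produces no rows for that input row, so the query in the report would skip rows whose vehicle_traits is null rather than fail.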
[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060 ] Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:05 AM: -- This basically creates the nodes in ZK for "slots" in the cluster. The LLAPs try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, the predecessors location (if restarted in a different place), or the total count of nodes in the cluster. The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay the 3rd and retain cache locality. This also handles size increase, as new nodes will always be added to the end of the sequence, which is what consistent hashing needs. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. was (Author: sershe): This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, the predecessors location (if restarted in a different place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay the 3rd and retain cache locality. This also handles size increase, as new nodes will always be added to the end of the sequence, which is what consistent hashing needs. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060 ] Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:07 AM: -- Edit: removed the confusion between ZK node vs LLAP node/machine. This basically creates the nodes in ZK for "slots" in the cluster. The LLAPs try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The LLAPs are always sorted by the slot number for splits. The idea is that as long as an LLAP is running, it will retain the same position in the ordering, regardless of other LLAPs restarting, and without the LLAPs knowing about each other, their predecessors' location (if restarted in a different place), or the total size of the cluster. The restarting LLAPs may not take the same positions as their predecessors (i.e. if two LLAPs restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever slots, but 3 will stay the 3rd and retain cache locality. This also handles size increase, as new LLAPs will always be added to the end of the sequence, which is what consistent hashing needs. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if LLAPs are removed that have the slots in the middle; until some are restarted, it will result in misses. was (Author: sershe): This basically creates the nodes in ZK for "slots" in the cluster. The LLAPs try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. 
The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, the predecessors location (if restarted in a different place), or the total count of nodes in the cluster. The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay the 3rd and retain cache locality. This also handles size increase, as new nodes will always be added to the end of the sequence, which is what consistent hashing needs. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
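The slot scheme described in the comment can be sketched with an in-memory registry. This is an illustration under stated assumptions, not the patch: a `TreeMap` stands in for the ZK slot nodes, and `SlotRegistrySketch`, `acquire`, `release` and `ordered` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the ZK "slot" nodes.
final class SlotRegistrySketch {

    // Slot number -> id of the LLAP currently holding it, sorted by slot.
    private final TreeMap<Integer, String> slots = new TreeMap<>();

    // A starting LLAP takes the lowest available slot, starting from 0.
    int acquire(String llapId) {
        int slot = 0;
        while (slots.containsKey(slot)) {
            slot++;
        }
        slots.put(slot, llapId);
        return slot;
    }

    // A stopped LLAP frees its slot so that it can be reused.
    void release(int slot) {
        slots.remove(slot);
    }

    // LLAPs ordered by slot number, as used when assigning splits.
    List<String> ordered() {
        return new ArrayList<>(slots.values());
    }

    public static void main(String[] args) {
        SlotRegistrySketch registry = new SlotRegistrySketch();
        registry.acquire("a");
        registry.acquire("b");
        int slotOfC = registry.acquire("c");
        registry.acquire("d");

        // Restart everything except "c".
        registry.release(0);
        registry.release(1);
        registry.release(3);
        registry.acquire("x");
        registry.acquire("y");
        registry.acquire("z");

        System.out.println("c kept slot " + slotOfC + ", ordering: " + registry.ordered());
    }
}
```

The LLAP that kept running stays at the same position in the ordering, which is what preserves its cache locality. Releasing a middle slot without a replacement leaves the gap mentioned at the end of the comment.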
[jira] [Commented] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435917#comment-15435917 ] Prasanth Jayachandran commented on HIVE-14621: -- Mostly looks good. {code} LowLevelCacheImpl cacheImpl = new LowLevelCacheImpl(cacheMetrics, cachePolicy, allocator, true); cacheImpl.init(); {code} Can we do init() inside the ctor, so that we can avoid {code}cache = cacheImpl;{code}? Also, can you add a test case for this? > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile because it's pointless and > just creates a zoo of configurations. 
> {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 
6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14617) NPE in UDF MapValues() if input is null
[ https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-14617: --- Status: Patch Available (was: Open) > NPE in UDF MapValues() if input is null > --- > > Key: HIVE-14617 > URL: https://issues.apache.org/jira/browse/HIVE-14617 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-14617.patch > > > For query > {code} > select exploded_traits from hdrone.vehiclestore_udr_vehicle > lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as > exploded_traits > where datestr > '2016-08-22' LIMIT 100 > {code} > Job fails with error msg as follows: > {code} > Error: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > 
{"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... > 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error > evaluating map_values(vehicle_traits.vehicle_traits) at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157) > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) > ... 9 more Caused by: java.lang.NullPointerException at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64) > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77) > ... 15 more > {code} > It appears that null is not properly handled in > GenericUDFMapValues.evaluate() method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435959#comment-15435959 ] Sergey Shelukhin commented on HIVE-14462: - +1 > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436052#comment-15436052 ] Siddharth Seth commented on HIVE-14589: --- [~sershe] - could you provide a brief description of the change please. Makes the review a little easier. > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14625: -- Status: Patch Available (was: Open) > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14623) add CREATE TABLE FROM FILE command for self-describing formats
[ https://issues.apache.org/jira/browse/HIVE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435752#comment-15435752 ] Sergey Shelukhin commented on HIVE-14623: - Whatever works, as long as the table is created ;) > add CREATE TABLE FROM FILE command for self-describing formats > -- > > Key: HIVE-14623 > URL: https://issues.apache.org/jira/browse/HIVE-14623 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > For self-describing formats like ORC, it should be possible to create a table > from a file without explicitly specifying the schema. It would be useful for > debugging, but also for all kinds of ad-hoc activities with data (and I bet > someone will also use it for ETL, sadly ;)). > The schema should be established in metastore as the final schema for the > table; it should not be an attached+derived schema, like for Avro. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435944#comment-15435944 ] Chaoyu Tang commented on HIVE-14626: The patch has been uploaded to https://reviews.apache.org/r/51395/ and a review has been requested. Thanks in advance. > Support Trash in Truncate Table > --- > > Key: HIVE-14626 > URL: https://issues.apache.org/jira/browse/HIVE-14626 > Project: Hive > Issue Type: Sub-task > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-14626.patch > > > Currently Truncate Table (or Partition) is implemented by calling > FileSystem.delete and then recreating the directory, so > 1. it does not support HDFS Trash > 2. if the table/partition directory is initially encryption protected, after > being deleted and recreated, it is no longer protected. > The new implementation cleans the contents of the directory using > multi-threaded trashFiles. If Trash is enabled and has a lower encryption > level than the data directory, the files under it will be deleted. Otherwise, > they will be moved to Trash. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14536) Unit test code cleanup
[ https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14536: -- Attachment: HIVE-14536.patch Removed wildcard import > Unit test code cleanup > -- > > Key: HIVE-14536 > URL: https://issues.apache.org/jira/browse/HIVE-14536 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14536.patch > > > Clean up the itest infrastructure, to create a readable, easy to understand > code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14625: -- Attachment: HIVE-14625.03.patch Minor fix with the stopwatch. > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, > HIVE-14625.03.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14536) Unit test code cleanup
[ https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14536: -- Status: Patch Available (was: Open) > Unit test code cleanup > -- > > Key: HIVE-14536 > URL: https://issues.apache.org/jira/browse/HIVE-14536 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14536.patch > > > Clean up the itest infrastructure, to create a readable, easy to understand > code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14561) Minor ptest2 improvements
[ https://issues.apache.org/jira/browse/HIVE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435992#comment-15435992 ] Prasanth Jayachandran commented on HIVE-14561: -- lgtm, +1 > Minor ptest2 improvements > - > > Key: HIVE-14561 > URL: https://issues.apache.org/jira/browse/HIVE-14561 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14561.01.patch > > > Re-purposed to track a few more improvements. > - Update spring framework to work with Java8 > - Change elapseTime logging to milliseconds from seconds > - Add thread name to log files. > - Allow an empty logsEndPoint if outputDir is not specified > - Log configuration when starting in a web server > - Allow tests to be run even if no qtests property is set > - Fix an exception on test completion when using FixedExecutionContextProvider -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14626: --- Status: Patch Available (was: Open) > Support Trash in Truncate Table > --- > > Key: HIVE-14626 > URL: https://issues.apache.org/jira/browse/HIVE-14626 > Project: Hive > Issue Type: Sub-task > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-14626.patch > > > Currently Truncate Table (or Partition) is implemented using > FileSystem.delete and then recreate the directory, so > 1. it does not support HDFS Trash > 2. if the table/partition directory is initially encryption protected, after > being deleted and recreated, it is no more protected. > The new implementation is to clean the contents of directory using > multi-threaded trashFiles. If Trash is enabled and has a lower encryption > level than the data directory, the files under it will be deleted. Otherwise, > they will be Trashed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14626) Support Trash in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-14626: --- Attachment: HIVE-14626.patch > Support Trash in Truncate Table > --- > > Key: HIVE-14626 > URL: https://issues.apache.org/jira/browse/HIVE-14626 > Project: Hive > Issue Type: Sub-task > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang >Priority: Minor > Attachments: HIVE-14626.patch > > > Currently Truncate Table (or Partition) is implemented using > FileSystem.delete and then recreate the directory, so > 1. it does not support HDFS Trash > 2. if the table/partition directory is initially encryption protected, after > being deleted and recreated, it is no more protected. > The new implementation is to clean the contents of directory using > multi-threaded trashFiles. If Trash is enabled and has a lower encryption > level than the data directory, the files under it will be deleted. Otherwise, > they will be Trashed -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435929#comment-15435929 ] Rajesh Balamohan commented on HIVE-14462: - Thanks [~sershe]. Addressed in the recent patch. > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435967#comment-15435967 ] Prasanth Jayachandran commented on HIVE-14625: -- +1 > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, > HIVE-14625.03.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14621: Attachment: HIVE-14621.01.patch Impl is still needed for other interfaces that it implements as we pass it on to other object. Renamed the call to startThreads for clarity... do you think we should start threads in ctor? As for tests, we don't have any for this mode now. IO is initialized during driver init, so we'd need a separate CliDriver. Btw, I was going to combine the interfaces for buffermanager and cache, since in both cases one object is used for both, but I couldn't come up with a name better than "CacheAndBufferManager", so I didn't do that. > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.01.patch, HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile cause it's pointless and > just creates a zoo of configurations. 
> {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 
6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10485) Create md5 UDF
[ https://issues.apache.org/jira/browse/HIVE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435707#comment-15435707 ] Krishna Anisetty commented on HIVE-10485: - We are using Hive 1.1.0 and have no plans to upgrade to 2.0.0. Is there a standalone way to install just this function, perhaps as a UDF? > Create md5 UDF > -- > > Key: HIVE-10485 > URL: https://issues.apache.org/jira/browse/HIVE-10485 > Project: Hive > Issue Type: Task > Components: UDF >Reporter: Alexander Pivovarov >Assignee: Alexander Pivovarov > Fix For: 1.3.0, 2.0.0 > > Attachments: HIVE-10485.1.patch, HIVE-10485.2.patch, > HIVE-10485.3.patch > > > MD5(str) > Calculates an MD5 128-bit checksum for the string. The value is returned as a > string of 32 hex digits, or NULL if the argument was NULL. The return value > can, for example, be used as a hash key. > Example: > {code} > SELECT MD5('udf_md5'); > 'ce62ef0d2d27dc37b6d488b92f4b24fd' > {code} > online md5 generator: http://www.md5.cz/ > MySQL has md5 function: > https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html#function_md5 > PostgreSQL also has md5 function: > http://www.postgresql.org/docs/9.1/static/functions-string.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
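For readers wondering what the computation described in this issue amounts to, here is a minimal standalone sketch in plain Java. It is not the code from the attached patches: the class name `Md5Sketch` and method `md5Hex` are hypothetical, and encoding the string as UTF-8 before hashing is an assumption. It only illustrates the stated contract (32 lowercase hex digits, or null for a null argument); making it callable from Hive would still require wrapping it in a UDF class, which is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Hive: computes an MD5 checksum as hex text.
final class Md5Sketch {
    private Md5Sketch() {}

    // Returns 32 lowercase hex digits, or null when the argument is null.
    static String md5Hex(String input) {
        if (input == null) {
            return null;
        }
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Assumption: the string is hashed as its UTF-8 byte sequence.
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(Character.forDigit((b >> 4) & 0xf, 16));
                hex.append(Character.forDigit(b & 0xf, 16));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform, so this is unexpected.
            throw new IllegalStateException("MD5 digest not available", e);
        }
    }
}
```

Calling `Md5Sketch.md5Hex("abc")` yields `900150983cd24fb0d6963f7d28e17f72`, the well-known MD5 test vector for that input.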
[jira] [Updated] (HIVE-14612) org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure
[ https://issues.apache.org/jira/browse/HIVE-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14612: -- Parent Issue: HIVE-14547 (was: HIVE-13503) > org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout > failure > > > Key: HIVE-14612 > URL: https://issues.apache.org/jira/browse/HIVE-14612 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14612.1.patch > > > Failing for some time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13403) Make Streaming API not create empty buckets (at least as an option)
[ https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435934#comment-15435934 ] Wei Zheng commented on HIVE-13403: -- Test failure for TestOperationLoggingLayout.testSwitchLogLayout is not related, and will be fixed by HIVE-14612 > Make Streaming API not create empty buckets (at least as an option) > --- > > Key: HIVE-13403 > URL: https://issues.apache.org/jira/browse/HIVE-13403 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng >Priority: Critical > Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, > HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch > > > as of HIVE-11983, when a TransactionBatch is opened in StreamingAPI, a full > complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is > created on disk even though some may end up receiving no data. > It would be better to create them on demand and not clog the FS. > Tez can handle missing (empty) buckets and on MR bucket join algorithms will > check if all buckets are there and bail out if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14536) Unit test code cleanup
[ https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14536: -- Attachment: (was: HIVE-14536.patch) > Unit test code cleanup > -- > > Key: HIVE-14536 > URL: https://issues.apache.org/jira/browse/HIVE-14536 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Reporter: Peter Vary >Assignee: Peter Vary > > Clean up the itest infrastructure, to create a readable, easy to understand > code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14536) Unit test code cleanup
[ https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Vary updated HIVE-14536: -- Attachment: HIVE-14536.patch I have tested them on my machine (at least 20 for every Driver), the results seem consistent. Adding here first to validate every query > Unit test code cleanup > -- > > Key: HIVE-14536 > URL: https://issues.apache.org/jira/browse/HIVE-14536 > Project: Hive > Issue Type: Sub-task > Components: Testing Infrastructure >Reporter: Peter Vary >Assignee: Peter Vary > Attachments: HIVE-14536.patch > > > Clean up the itest infrastructure, to create a readable, easy to understand > code -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14612) org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure
[ https://issues.apache.org/jira/browse/HIVE-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435734#comment-15435734 ] Hive QA commented on HIVE-14612: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825320/HIVE-14612.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10459 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/975/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/975/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-975/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12825320 - PreCommit-HIVE-MASTER-Build > org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout > failure > > > Key: HIVE-14612 > URL: https://issues.apache.org/jira/browse/HIVE-14612 > Project: Hive > Issue Type: Sub-task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-14612.1.patch > > > Failing for some time -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435788#comment-15435788 ] Prasanth Jayachandran commented on HIVE-14625: -- Setting PerfLogger to INFO will cause failures in the operation logging tests. https://github.com/apache/hive/blob/master/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java Why do we want to change the PerfLogger level to INFO? I don't think it contributes a large share of the logging. I have seen log lines from block readers that are more common than those from PerfLogger. I think we should set the log level for the hive package to DEBUG and all others to INFO. Also, JUnit has RunListener, which we could implement for logging, computing time, etc. I guess that would be cleaner? > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435890#comment-15435890 ] Siddharth Seth commented on HIVE-14625: --- Saw a lot of PerfLogger noise while debugging. I'm fine leaving it at debug if it's useful. At some point, the logger can be enabled for the specific test that will fail. Removing any log changes for now. bq. Also, JUnit has RunListener, which we could implement for logging, computing time, etc. I guess that would be cleaner? RunListener requires a custom test runner. It also does not provide hooks for individual sections. > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
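The point about timing individual sections can be illustrated with a small helper. This is only a sketch of the general idea and not the code in the HIVE-14625 patches: the class `SectionTimer` and its methods are hypothetical names, and it uses plain `System.nanoTime()` rather than any particular stopwatch library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: accumulates wall-clock time per named section of a test run.
final class SectionTimer {
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    // Runs the body and adds its duration to the named section, even if it throws.
    void time(String section, Runnable body) {
        long start = System.nanoTime();
        try {
            body.run();
        } finally {
            elapsedNanos.merge(section, System.nanoTime() - start, Long::sum);
        }
    }

    // Total time recorded for a section in milliseconds; 0 if it never ran.
    long elapsedMillis(String section) {
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos.getOrDefault(section, 0L));
    }

    // One line per section, in the order the sections were first timed.
    String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : elapsedNanos.entrySet()) {
            sb.append(e.getKey()).append(": ")
              .append(TimeUnit.NANOSECONDS.toMillis(e.getValue())).append(" ms\n");
        }
        return sb.toString();
    }
}
```

A driver could wrap each phase, for example `timer.time("cleanup", () -> doCleanup())` where `doCleanup` stands for whatever the phase does, and log `timer.report()` at the end. Unlike a run listener, the caller decides where a section starts and ends.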
[jira] [Updated] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14621: Status: Patch Available (was: Open) > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile cause it's pointless and > just creates a zoo of configurations. > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435970#comment-15435970 ] Ashutosh Chauhan commented on HIVE-14418: - You can reset a specific config as well, e.g. reset hive.auto.convert.join.noconditionaltask; How is unset different from that? > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14609) HS2 cannot drop a function whose associated jar file has been removed
[ https://issues.apache.org/jira/browse/HIVE-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435978#comment-15435978 ] Yibing Shi commented on HIVE-14609: --- To drop a function, Hive first gets the function definition: https://github.com/cloudera/hive/blob/cdh5-1.1.0_5.8.0/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java#L99 {code} FunctionInfo info = FunctionRegistry.getFunctionInfo(functionName); if (info == null) { if (throwException) { throw new SemanticException(ErrorMsg.INVALID_FUNCTION.getMsg(functionName)); } else { // Fail silently return; } } else if (info.isBuiltIn()) { throw new SemanticException(ErrorMsg.DROP_NATIVE_FUNCTION.getMsg(functionName)); } {code} Unfortunately, {{FunctionRegistry.getFunctionInfo}} tries to load the function into the registry after it gets the definition, which includes downloading the jars, and that step causes the failure. We should be able to fix this by adding a parameter to the getFunctionInfo method that controls whether the function is added to the registry. As for why Hive fails silently: "hive.exec.drop.ignorenonexistent" is set to true by default, so Hive doesn't throw any exception when the failure happens. > HS2 cannot drop a function whose associated jar file has been removed > - > > Key: HIVE-14609 > URL: https://issues.apache.org/jira/browse/HIVE-14609 > Project: Hive > Issue Type: Bug >Reporter: Yibing Shi >Assignee: Chaoyu Tang > > Create a permanent function with the command below: > {code:sql} > create function yshi.dummy as 'com.yshi.hive.udf.DummyUDF' using jar > 'hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar'; > {code} > After that, delete the HDFS file > {{hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar}}, and > *restart HS2 to remove the loaded class*. 
> Now the function cannot be dropped: > {noformat} > 0: jdbc:hive2://10.17.81.144:1/default> show functions yshi.dummy; > INFO : Compiling > command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded): > show functions yshi.dummy > INFO : Semantic Analysis Completed > INFO : Returning Hive schema: > Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from > deserializer)], properties:null) > INFO : Completed compiling > command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded); > Time taken: 1.259 seconds > INFO : Executing > command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded): > show functions yshi.dummy > INFO : Starting task [Stage-0:DDL] in serial mode > INFO : SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead. > INFO : Completed executing > command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded); > Time taken: 0.024 seconds > INFO : OK > +-+--+ > | tab_name | > +-+--+ > | yshi.dummy | > +-+--+ > 1 row selected (3.877 seconds) > 0: jdbc:hive2://10.17.81.144:1/default> drop function yshi.dummy; > INFO : Compiling > command(queryId=hive_20160821213434_47d14df5-59b3-4ebc-9a48-5e1d9c60c1fc): > drop function yshi.dummy > INFO : converting to local > hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar > ERROR : Failed to read external resource > hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar > java.lang.RuntimeException: Failed to read external resource > hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar > at > org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1200) > at > org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1136) > at > org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1126) > at > org.apache.hadoop.hive.ql.exec.FunctionTask.addFunctionResources(FunctionTask.java:304) > at > 
org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:470) > at > org.apache.hadoop.hive.ql.exec.Registry.getQualifiedFunctionInfo(Registry.java:456) > at > org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfo(Registry.java:245) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:455) > at > org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:99) > at > org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:61) > at >
[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060 ] Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:02 AM: -- This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, the predecessors location (if restarted in a different place), or the total count of nodes in the cluster. The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay the 3rd and retain cache locality. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. was (Author: sershe): This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sort by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, their predecessors location, or the total count of nodes in the cluster. The restarting nodes may not take the same positions as their predecessors (i.e. 
if two nodes restart they can swap slots) but it doesn't matter as much because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay 3rd and retain cache locality. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060 ] Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:03 AM: -- This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, the predecessors location (if restarted in a different place), or the total count of nodes in the cluster. The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay the 3rd and retain cache locality. This also handles size increase, as new nodes will always be added to the end of the sequence, which is what consistent hashing needs. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. was (Author: sershe): This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, the predecessors location (if restarted in a different place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it shouldn't matter because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay the 3rd and retain cache locality. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
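The slot scheme described in this comment can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory sorted map stands in for the ZK slot nodes, and the class and method names (SlotRegistrySketch, register, unregister, orderedNodes) are hypothetical and not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** In-memory stand-in for the ZK "slot" nodes; every name here is hypothetical. */
final class SlotRegistrySketch {
  // slot number -> node id, kept sorted by slot so the ordering used for splits is stable
  private final TreeMap<Integer, String> slots = new TreeMap<>();

  /** Claims the lowest free slot, starting from 0, and returns its number. */
  int register(String nodeId) {
    int slot = 0;
    while (slots.containsKey(slot)) {
      slot++;
    }
    slots.put(slot, nodeId);
    return slot;
  }

  /** Frees the slot of a node that went away, so that a later node can reuse it. */
  void unregister(String nodeId) {
    slots.values().remove(nodeId);
  }

  /** Node ids in slot order. */
  List<String> orderedNodes() {
    return new ArrayList<>(slots.values());
  }

  public static void main(String[] args) {
    SlotRegistrySketch registry = new SlotRegistrySketch();
    for (String node : new String[] {"n1", "n2", "n3", "n4"}) {
      registry.register(node);
    }
    // Restart every node except n3; the replacements take whatever slots are free.
    registry.unregister("n1");
    registry.unregister("n2");
    registry.unregister("n4");
    registry.register("n4-restarted");
    registry.register("n1-restarted");
    registry.register("n2-restarted");
    System.out.println(registry.orderedNodes());
  }
}
```

In this sketch n3 keeps its position in the ordering while the restarted nodes fill the freed slots in arrival order. A node that is removed for good leaves its slot empty until some later registration claims it, which is the gap case mentioned at the end of the comment.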
[jira] [Commented] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435973#comment-15435973 ] Hive QA commented on HIVE-14625: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825356/HIVE-14625.02.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 7117 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver org.apache.hadoop.hive.cli.TestContribCliDriver.org.apache.hadoop.hive.cli.TestContribCliDriver org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/977/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/977/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-977/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 9 tests failed {noformat} This message is automatically 
generated. ATTACHMENT ID: 12825356 - PreCommit-HIVE-MASTER-Build > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, > HIVE-14625.03.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14627) Improvements to MiniMr tests
[ https://issues.apache.org/jira/browse/HIVE-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-14627: - Description: Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr and the following is the execution time breakdown: Total time - 13m59s Junit reported time for testcase - 50s Most of the time is spent in creating/loading/analyzing initial tables - ~12m Cleanup - ~1m There is a huge overhead for running MiniMr tests when compared to the actual test runtime. Ran the same test without init script. Total time - 2m17s Junit reported time for testcase - 52s Also, I noticed some tests that don't have to run on MiniMr (like udf_using.q, which does not require MiniMr; it just reads/writes to HDFS, which we can do in MiniTez/MiniLlap, which are way faster). Most tests access only a few of the initial tables, to read a few rows from them. We can fix those tests to load just the table that is required for the test instead of all initial tables. Also, we can remove q_init_script.sql initialization for MiniMr after rewriting and moving over the unwanted tests, which should cut down the runtime a lot. was: Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr and the following is the execution time breakdown: Total time - 13m59s Junit reported time for testcase - 50s Most of the time is spent in creating/loading/analyzing initial tables - ~12m Cleanup - ~1m There is a huge overhead for running MiniMr tests when compared to the actual test runtime. Also, I noticed some tests that don't have to run on MiniMr (like udf_using.q, which does not require MiniMr; it just reads/writes to HDFS, which we can do in MiniTez/MiniLlap, which are way faster). Most tests access only a few of the initial tables, to read a few rows from them. We can fix those tests to load just the table that is required for the test instead of all initial tables. 
Also, we can remove q_init_script.sql initialization for MiniMr after rewriting and moving over the unwanted tests, which should cut down the runtime a lot. > Improvements to MiniMr tests > > > Key: HIVE-14627 > URL: https://issues.apache.org/jira/browse/HIVE-14627 > Project: Hive > Issue Type: Sub-task >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr and the > following is the execution time breakdown: > Total time - 13m59s > Junit reported time for testcase - 50s > Most of the time is spent in creating/loading/analyzing initial tables - ~12m > Cleanup - ~1m > There is a huge overhead for running MiniMr tests when compared to the actual > test runtime. > Ran the same test without init script. > Total time - 2m17s > Junit reported time for testcase - 52s > Also, I noticed some tests that don't have to run on MiniMr (like > udf_using.q, which does not require MiniMr; it just reads/writes to HDFS, which > we can do in MiniTez/MiniLlap, which are way faster). Most tests access only > a few of the initial tables, to read a few rows from them. We can fix those tests to > load just the table that is required for the test instead of all initial > tables. Also, we can remove q_init_script.sql initialization for MiniMr after > rewriting and moving over the unwanted tests, which should cut down the > runtime a lot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435999#comment-15435999 ] Ashutosh Chauhan commented on HIVE-14418: - How is removing override different than setting it to default value ? > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14589: Attachment: HIVE-14589.01.patch Rebased the patch. [~sseth] [~prasanth_j] ping? ;) > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436008#comment-15436008 ] Sergey Shelukhin commented on HIVE-14418: - Overrides are the ones specified in system properties, commandline and via set... commands. The things set in configuration files (and via whatever other means there may be) still stay. Do you think this should instead add some argument to reset command? > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
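The distinction drawn in this comment, between removing an override and setting a value back to its default, can be illustrated with a small layered lookup. This is a hedged sketch, not Hive's configuration code: the three layers, the class LayeredConfSketch, its methods and the key name example.fraction.max are all assumptions made for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical three-layer configuration: built-in defaults, config files, overrides. */
final class LayeredConfSketch {
  private final Map<String, String> defaults = new HashMap<>();
  private final Map<String, String> fileValues = new HashMap<>();
  // Overrides model values coming from system properties, the command line and set commands.
  private final Map<String, String> overrides = new HashMap<>();

  void define(String key, String defaultValue) {
    defaults.put(key, defaultValue);
  }

  void loadFromFile(String key, String value) {
    fileValues.put(key, value);
  }

  void set(String key, String value) {
    overrides.put(key, value);
  }

  /** Drops only the override; a value from a configuration file becomes visible again. */
  void removeOverride(String key) {
    overrides.remove(key);
  }

  /** Forces the built-in default, hiding any value from a configuration file. */
  void setToDefault(String key) {
    overrides.put(key, defaults.get(key));
  }

  String get(String key) {
    if (overrides.containsKey(key)) {
      return overrides.get(key);
    }
    if (fileValues.containsKey(key)) {
      return fileValues.get(key);
    }
    return defaults.get(key);
  }

  public static void main(String[] args) {
    LayeredConfSketch conf = new LayeredConfSketch();
    conf.define("example.fraction.max", "0.3");
    conf.loadFromFile("example.fraction.max", "0.5");
    conf.set("example.fraction.max", "0.9");
    conf.removeOverride("example.fraction.max");
    System.out.println(conf.get("example.fraction.max"));
  }
}
```

In this model the two operations differ only when a configuration file supplies a value: removing the override falls back to the file value, while setting to default ignores it.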
[jira] [Commented] (HIVE-14617) NPE in UDF MapValues() if input is null
[ https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436010#comment-15436010 ] Chao Sun commented on HIVE-14617: - +1 LGTM nit: add parenthesis on line 64 of GenericUDFMapValues.java and remove leading whitespace on line 49 of TestGenericUDFMapValues.java. > NPE in UDF MapValues() if input is null > --- > > Key: HIVE-14617 > URL: https://issues.apache.org/jira/browse/HIVE-14617 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-14617.patch > > > For query > {code} > select exploded_traits from hdrone.vehiclestore_udr_vehicle > lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as > exploded_traits > where datestr > '2016-08-22' LIMIT 100 > {code} > Job fails with error msg as follows: > {code} > Error: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: > 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... > 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error > evaluating map_values(vehicle_traits.vehicle_traits) at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157) > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) > ... 9 more Caused by: java.lang.NullPointerException at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64) > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77) > ... 15 more > {code} > It appears that null is not properly handled in > GenericUDFMapValues.evaluate() method. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
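The failure mode in this report, a map-valued argument that is null for some rows, is usually handled with a guard before the map is touched. The following is a plain-Java sketch of that kind of guard and is not the code in HIVE-14617.patch: it does not use Hive's GenericUDF or ObjectInspector classes, and returning an empty list for a null map is an assumption, since the thread does not say what the patched UDF returns.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java analogue of a null-safe "map values" function; names are hypothetical. */
final class MapValuesNullGuardSketch {
  /** Returns the values of the map, or an empty list when the map itself is null. */
  static <K, V> List<V> mapValues(Map<K, V> input) {
    List<V> result = new ArrayList<>();
    if (input == null) {
      // Assumption: treat a null map like an empty one instead of dereferencing it.
      return result;
    }
    result.addAll(input.values());
    return result;
  }

  public static void main(String[] args) {
    Map<String, String> traits = new HashMap<>();
    traits.put("color", "red");
    System.out.println(mapValues(traits));
    System.out.println(mapValues(null));
  }
}
```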
[jira] [Commented] (HIVE-14589) add consistent node replacement to LLAP for splits
[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060 ] Sergey Shelukhin commented on HIVE-14589: - This basically creates the nodes in ZK for "slots" in the cluster. The nodes try to take the lowest available slot, starting from 0. Unlike worker-... nodes, the slots are reused, which is the intent. The nodes are always sorted by the slot number for splits. The idea is that as long as the node is running, it will retain the same position in the ordering, regardless of other nodes restarting, without knowing about each other, their predecessors' locations, or the total count of nodes in the cluster. The restarting nodes may not take the same positions as their predecessors (i.e. if two nodes restart they can swap slots) but it doesn't matter as much because they have lost their cache anyway. I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will take whatever spots, but 3 will stay 3rd and retain cache locality. One case it doesn't handle is permanent cluster size reduction. There will be a permanent gap if nodes are removed that have the slots in the middle; until some nodes restart, it will result in misses. > add consistent node replacement to LLAP for splits > -- > > Key: HIVE-14589 > URL: https://issues.apache.org/jira/browse/HIVE-14589 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14589.01.patch, HIVE-14589.patch > > > See HIVE-14574 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14623) add CREATE TABLE FROM FILE command for self-describing formats
[ https://issues.apache.org/jira/browse/HIVE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435686#comment-15435686 ] Gopal V commented on HIVE-14623: Why not jump on "IMPORT" instead of CREATE? > add CREATE TABLE FROM FILE command for self-describing formats > -- > > Key: HIVE-14623 > URL: https://issues.apache.org/jira/browse/HIVE-14623 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > For self-describing formats like ORC, it should be possible to create a table > from a file without explicitly specifying the schema. It would be useful for > debugging, but also for all kinds of ad-hoc activities with data (and I bet > someone will also use it for ETL, sadly ;)). > The schema should be established in metastore as the final schema for the > table; it should not be an attached+derived schema, like for Avro. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14623) add CREATE TABLE FROM FILE command for self-describing formats
[ https://issues.apache.org/jira/browse/HIVE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435686#comment-15435686 ] Gopal V edited comment on HIVE-14623 at 8/24/16 9:06 PM: - Why not jump on "IMPORT" instead of CREATE? Instead of reading the _metadata/ folder, it could go to the self describing input format. was (Author: gopalv): Why not jump on "IMPORT" instead of CREATE? > add CREATE TABLE FROM FILE command for self-describing formats > -- > > Key: HIVE-14623 > URL: https://issues.apache.org/jira/browse/HIVE-14623 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin > > For self-describing formats like ORC, it should be possible to create a table > from a file without explicitly specifying the schema. It would be useful for > debugging, but also for all kinds of ad-hoc activities with data (and I bet > someone will also use it for ETL, sadly ;)). > The schema should be established in metastore as the final schema for the > table; it should not be an attached+derived schema, like for Avro. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14625: -- Attachment: HIVE-14625.01.patch Patch to address the 3 items mentioned in the description. [~prasanth_j] - could you please review. > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14437) Vectorization: Optimize key misses in VectorMapJoinFastBytesHashTable
[ https://issues.apache.org/jira/browse/HIVE-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435769#comment-15435769 ] Matt McCline commented on HIVE-14437: - +1 LGTM. > Vectorization: Optimize key misses in VectorMapJoinFastBytesHashTable > - > > Key: HIVE-14437 > URL: https://issues.apache.org/jira/browse/HIVE-14437 > Project: Hive > Issue Type: Improvement > Components: Vectorization >Affects Versions: 2.2.0 >Reporter: Gopal V >Assignee: Gopal V > Attachments: HIVE-14437.1.patch > > > Currently, the lookup in VectorMapJoinFastBytesHashTable proceeds until the > max number of metric put conflicts have been reached. > This can have a fast-exit when encountering the first empty slot during the > probe, to speed up looking for non-existent keys. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
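The fast exit described in this issue can be shown on a toy open-addressing table. This is a simplified sketch and not the VectorMapJoinFastBytesHashTable code: it stores long keys with linear probing, and the class ProbeFastExitSketch and its fields are hypothetical. The early return relies on the assumption that entries are never deleted, so an empty slot on the probe path means the key was never inserted.

```java
/** Toy linear-probing set of long keys; illustrates stopping a lookup at the first empty slot. */
final class ProbeFastExitSketch {
  private final long[] keys;
  private final boolean[] used;
  private final int mask;
  int lastProbeCount; // number of slots inspected by the most recent contains() call

  ProbeFastExitSketch(int capacityPowerOfTwo) {
    if (Integer.bitCount(capacityPowerOfTwo) != 1) {
      throw new IllegalArgumentException("capacity must be a power of two");
    }
    keys = new long[capacityPowerOfTwo];
    used = new boolean[capacityPowerOfTwo];
    mask = capacityPowerOfTwo - 1;
  }

  private int slotFor(long key) {
    return Long.hashCode(key) & mask;
  }

  void put(long key) {
    int i = slotFor(key);
    for (int n = 0; n <= mask; n++) {
      if (!used[i] || keys[i] == key) {
        used[i] = true;
        keys[i] = key;
        return;
      }
      i = (i + 1) & mask;
    }
    throw new IllegalStateException("table is full");
  }

  boolean contains(long key) {
    int i = slotFor(key);
    lastProbeCount = 0;
    for (int n = 0; n <= mask; n++) {
      lastProbeCount++;
      if (!used[i]) {
        return false; // fast exit: an empty slot ends the probe sequence for a missing key
      }
      if (keys[i] == key) {
        return true;
      }
      i = (i + 1) & mask;
    }
    return false; // every slot was occupied by other keys
  }

  public static void main(String[] args) {
    ProbeFastExitSketch table = new ProbeFastExitSketch(8);
    table.put(1L);
    table.put(9L); // lands in the same home slot as 1 in a table of 8
    System.out.println(table.contains(17L) + " after " + table.lastProbeCount + " probes");
  }
}
```

Without the empty-slot check, a lookup for a missing key would keep probing until some fixed limit is reached, which is the cost the issue describes for non-existent keys.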
[jira] [Updated] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14621: Attachment: HIVE-14621.patch The reason is that the code relies on cache for refcount increment; no cache means no refcount [~prasanth_j] can you take a look? > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile cause it's pointless and > just creates a zoo of configurations. > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435908#comment-15435908 ] Sergey Shelukhin edited comment on HIVE-14621 at 8/24/16 11:08 PM: --- The reason is that the code relies on cache for refcount increment; no cache means no refcount [~prasanth_j] can you take a look? was (Author: sershe): The reason is that the code relies on cache for refcount increment; no cache means no refcount [~prasanth_j] can you take a look? > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile cause it's pointless and > just creates a zoo of configurations. 
> {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 
6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14625: -- Attachment: HIVE-14625.02.patch Updated patch without the log changes. > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11667) Support Trash and Snapshot in Truncate Table
[ https://issues.apache.org/jira/browse/HIVE-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang updated HIVE-11667: --- Priority: Major (was: Minor) Issue Type: Task (was: Improvement) Separate Trash and Snapshot supports to subtasks > Support Trash and Snapshot in Truncate Table > > > Key: HIVE-11667 > URL: https://issues.apache.org/jira/browse/HIVE-11667 > Project: Hive > Issue Type: Task > Components: Query Processor >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > > Currently Truncate Table (or Partition) is implemented using > FileSystem.delete and then recreate the directory. It does not support HDFS > Trash if it is turned on. The table/partition can not be truncated if it has > a snapshot. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14572) Investigate jenkins test report timings
[ https://issues.apache.org/jira/browse/HIVE-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435924#comment-15435924 ] Zoltan Haindrich commented on HIVE-14572: - created infra ticket > Investigate jenkins test report timings > --- > > Key: HIVE-14572 > URL: https://issues.apache.org/jira/browse/HIVE-14572 > Project: Hive > Issue Type: Sub-task > Components: Tests >Reporter: Zoltan Haindrich >Assignee: Zoltan Haindrich > > [~sseth] has noticed some odd timings in the jenkins reports > I've created a sample project, to emulate a clidriver run during qtest: > the testclass: > * 1 sec beforeclass > * 3x 0.2s test > created using junit4 parameterized. > Double checkout; second project runs different tests...or at least they have > different names. > here are my preliminary findings: > || thing || expected || 2.16 || 2.19.1 > | total time | ~3.4s | 1.2s | 3.4s > | package time | ~3.4s | 0.61s | 1.7s > | class time | ~3.4s | 0.61s | 1.7s > | testcase times | ~.2s | ~.2s | ~.2s > notes: > * using 2.16 beforeclass timings are totally hidden or lost > * 2.19.1 does account for beforeclass but still fails to correctly aggregate > the two runs of the similarly named testclasses > it might be worth a try to look at the bleeding edge of this jenkins plugin... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435938#comment-15435938 ] Sergey Shelukhin commented on HIVE-14418: - [~ashutoshc] ping? > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13403) Make Streaming API not create empty buckets
[ https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13403: - Summary: Make Streaming API not create empty buckets (was: Make Streaming API not create empty buckets (at least as an option)) > Make Streaming API not create empty buckets > --- > > Key: HIVE-13403 > URL: https://issues.apache.org/jira/browse/HIVE-13403 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 1.3.0 >Reporter: Eugene Koifman >Assignee: Wei Zheng >Priority: Critical > Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, > HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch > > > as of HIVE-11983, when a TransactionBatch is opened in StreamingAPI, a full > complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is > created on disk even though some may end up receiving no data. > It would be better to create them on demand and not clog the FS. > Tez can handle missing (empty) buckets and on MR bucket join algorithms will > check if all buckets are there and bail out if not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14574) use consistent hashing for LLAP consistent splits to alleviate impact from cluster changes
[ https://issues.apache.org/jira/browse/HIVE-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-14574: Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Committed to master. Thanks for the review! > use consistent hashing for LLAP consistent splits to alleviate impact from > cluster changes > -- > > Key: HIVE-14574 > URL: https://issues.apache.org/jira/browse/HIVE-14574 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.2.0 > > Attachments: HIVE-14574.01.patch, HIVE-14574.02.patch, > HIVE-14574.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings
[ https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435981#comment-15435981 ] Sergey Shelukhin commented on HIVE-14418: - reset only removes overrides in the session; unset sets it to default value > Hive config validation prevents unsetting the settings > -- > > Key: HIVE-14418 > URL: https://issues.apache.org/jira/browse/HIVE-14418 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, > HIVE-14418.03.patch, HIVE-14418.patch > > > {noformat} > hive> set hive.tez.task.scale.memory.reserve.fraction.max=; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > hive> set hive.tez.task.scale.memory.reserve.fraction.max=null; > Query returned non-zero code: 1, cause: 'SET > hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because > hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value. > {noformat} > unset also doesn't work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14617) NPE in UDF MapValues() if input is null
[ https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436117#comment-15436117 ] Xuefu Zhang commented on HIVE-14617: Thanks for the review, Chao. I will take care of those after the test results come back but before committing to git. > NPE in UDF MapValues() if input is null > --- > > Key: HIVE-14617 > URL: https://issues.apache.org/jira/browse/HIVE-14617 > Project: Hive > Issue Type: Bug > Components: HiveServer2 >Affects Versions: 2.1.0 >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Attachments: HIVE-14617.patch > > > For query > {code} > select exploded_traits from hdrone.vehiclestore_udr_vehicle > lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as > exploded_traits > where datestr > '2016-08-22' LIMIT 100 > {code} > Job fails with error msg as follows: > {code} > Error: java.lang.RuntimeException: > org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at > org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at > org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at > org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at > org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at > java.security.AccessController.doPrivileged(Native Method) at > javax.security.auth.Subject.doAs(Subject.java:422) at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: > 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while > processing row > {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) > at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... > 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error > evaluating map_values(vehicle_traits.vehicle_traits) at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37) > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at > org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) > at > org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157) > at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) > ... 9 more Caused by: java.lang.NullPointerException at > org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64) > at > org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) > at > org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) > at > org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77) > ... 15 more > {code} > It appears that null is not properly handled in > GenericUDFMapValues.evaluate() method. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
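The null handling that HIVE-14617 reports as missing can be sketched outside Hive as follows. This is a plain-Java illustration, not the attached patch and not the GenericUDFMapValues API; NullSafeMapValues and valuesOf are hypothetical names. Returning an empty list for a null map is one reasonable choice and is an assumption here; returning null instead would be another.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; stands in for the body of a map_values-style function.
final class NullSafeMapValues {
    private NullSafeMapValues() {
    }

    // Returns the map's values, or an empty list when the map itself is null,
    // instead of dereferencing the null map and throwing NullPointerException.
    static <K, V> List<V> valuesOf(Map<K, V> map) {
        List<V> result = new ArrayList<>();
        if (map != null) {
            result.addAll(map.values());
        }
        return result;
    }
}
```

In the failing query above, vehicle_traits is null for the reported row, which is the case the guard covers; with an empty result, explode() would simply emit no rows for that input.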
[jira] [Updated] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ke Jia updated HIVE-13589: -- Attachment: HIVE-13589.4.patch > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch, HIVE-13589.4.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14625) Minor qtest fixes
[ https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436184#comment-15436184 ] Hive QA commented on HIVE-14625: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825376/HIVE-14625.03.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10459 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/979/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/979/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-979/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12825376 - PreCommit-HIVE-MASTER-Build > Minor qtest fixes > - > > Key: HIVE-14625 > URL: https://issues.apache.org/jira/browse/HIVE-14625 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth >Assignee: Siddharth Seth > Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, > HIVE-14625.03.patch > > > Log times for CoreCliDriver > Exit early if cleanup and createsSources fails > Turn PerfLogger off for ptests -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds
[ https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14249: --- Status: Patch Available (was: In Progress) > Add simple materialized views with manual rebuilds > -- > > Key: HIVE-14249 > URL: https://issues.apache.org/jira/browse/HIVE-14249 > Project: Hive > Issue Type: New Feature > Components: Materialized views, Parser >Reporter: Alan Gates >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10459.2.patch > > > This patch is a start at implementing simple views. It doesn't have enough > testing yet (e.g. there's no negative testing). And I know it fails in the > partitioned case. I suspect things like security and locking don't work > properly yet either. But I'm posting it as a starting point. > In this initial patch I'm just handling simple materialized views with manual > rebuilds. In later JIRAs we can add features such as allowing the optimizer > to rewrite queries to use materialized views rather than tables named in the > queries, giving the optimizer the ability to determine when a materialized > view is stale, etc. > Also, I didn't rebase this patch against trunk after the migration from > svn->git so it may not apply cleanly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds
[ https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14249: --- Attachment: HIVE-14249.05.patch > Add simple materialized views with manual rebuilds > -- > > Key: HIVE-14249 > URL: https://issues.apache.org/jira/browse/HIVE-14249 > Project: Hive > Issue Type: New Feature > Components: Materialized views, Parser >Reporter: Alan Gates >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10459.2.patch, HIVE-14249.05.patch > > > This patch is a start at implementing simple views. It doesn't have enough > testing yet (e.g. there's no negative testing). And I know it fails in the > partitioned case. I suspect things like security and locking don't work > properly yet either. But I'm posting it as a starting point. > In this initial patch I'm just handling simple materialized views with manual > rebuilds. In later JIRAs we can add features such as allowing the optimizer > to rewrite queries to use materialized views rather than tables named in the > queries, giving the optimizer the ability to determine when a materialized > view is stale, etc. > Also, I didn't rebase this patch against trunk after the migration from > svn->git so it may not apply cleanly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HIVE-14249) Add simple materialized views with manual rebuilds
[ https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-14249 started by Jesus Camacho Rodriguez. -- > Add simple materialized views with manual rebuilds > -- > > Key: HIVE-14249 > URL: https://issues.apache.org/jira/browse/HIVE-14249 > Project: Hive > Issue Type: New Feature > Components: Materialized views, Parser >Reporter: Alan Gates >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10459.2.patch > > > This patch is a start at implementing simple views. It doesn't have enough > testing yet (e.g. there's no negative testing). And I know it fails in the > partitioned case. I suspect things like security and locking don't work > properly yet either. But I'm posting it as a starting point. > In this initial patch I'm just handling simple materialized views with manual > rebuilds. In later JIRAs we can add features such as allowing the optimizer > to rewrite queries to use materialized views rather than tables named in the > queries, giving the optimizer the ability to determine when a materialized > view is stale, etc. > Also, I didn't rebase this patch against trunk after the migration from > svn->git so it may not apply cleanly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds
[ https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14249: --- Status: Open (was: Patch Available) > Add simple materialized views with manual rebuilds > -- > > Key: HIVE-14249 > URL: https://issues.apache.org/jira/browse/HIVE-14249 > Project: Hive > Issue Type: New Feature > Components: Materialized views, Parser >Reporter: Alan Gates >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10459.2.patch > > > This patch is a start at implementing simple views. It doesn't have enough > testing yet (e.g. there's no negative testing). And I know it fails in the > partitioned case. I suspect things like security and locking don't work > properly yet either. But I'm posting it as a starting point. > In this initial patch I'm just handling simple materialized views with manual > rebuilds. In later JIRAs we can add features such as allowing the optimizer > to rewrite queries to use materialized views rather than tables named in the > queries, giving the optimizer the ability to determine when a materialized > view is stale, etc. > Also, I didn't rebase this patch against trunk after the migration from > svn->git so it may not apply cleanly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds
[ https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14249: --- Attachment: (was: HIVE-14249.04.patch) > Add simple materialized views with manual rebuilds > -- > > Key: HIVE-14249 > URL: https://issues.apache.org/jira/browse/HIVE-14249 > Project: Hive > Issue Type: New Feature > Components: Materialized views, Parser >Reporter: Alan Gates >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10459.2.patch > > > This patch is a start at implementing simple views. It doesn't have enough > testing yet (e.g. there's no negative testing). And I know it fails in the > partitioned case. I suspect things like security and locking don't work > properly yet either. But I'm posting it as a starting point. > In this initial patch I'm just handling simple materialized views with manual > rebuilds. In later JIRAs we can add features such as allowing the optimizer > to rewrite queries to use materialized views rather than tables named in the > queries, giving the optimizer the ability to determine when a materialized > view is stale, etc. > Also, I didn't rebase this patch against trunk after the migration from > svn->git so it may not apply cleanly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds
[ https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-14249: --- Attachment: (was: HIVE-14249.03.patch) > Add simple materialized views with manual rebuilds > -- > > Key: HIVE-14249 > URL: https://issues.apache.org/jira/browse/HIVE-14249 > Project: Hive > Issue Type: New Feature > Components: Materialized views, Parser >Reporter: Alan Gates >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-10459.2.patch > > > This patch is a start at implementing simple views. It doesn't have enough > testing yet (e.g. there's no negative testing). And I know it fails in the > partitioned case. I suspect things like security and locking don't work > properly yet either. But I'm posting it as a starting point. > In this initial patch I'm just handling simple materialized views with manual > rebuilds. In later JIRAs we can add features such as allowing the optimizer > to rewrite queries to use materialized views rather than tables named in the > queries, giving the optimizer the ability to determine when a materialized > view is stale, etc. > Also, I didn't rebase this patch against trunk after the migration from > svn->git so it may not apply cleanly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436126#comment-15436126 ] Yechao Chen commented on HIVE-14398: [~xuefuz] Yes,I checkout the latest trunk,the code has change a lot; I test it, It is already fixed.thanks for your answer,so i just cancel this patch or else? > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
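The symptom in HIVE-14398 (default.test.a instead of test.a) is what results when the current database is prepended to a name that is already qualified. A minimal sketch of the intended behavior follows; it is not the attached patch or the fix on trunk, and TableNames.qualify is a hypothetical helper.

```java
// Hypothetical helper; not part of Hive.
final class TableNames {
    private TableNames() {
    }

    // Prepends the current database only when the name is not already db.table.
    static String qualify(String currentDb, String tableName) {
        if (currentDb == null || tableName == null) {
            throw new IllegalArgumentException("currentDb and tableName must not be null");
        }
        if (tableName.indexOf('.') >= 0) {
            return tableName; // already qualified, e.g. "test.a"
        }
        return currentDb + "." + tableName;
    }
}
```

For the session in the report, qualify("default", "test.a") leaves the name as test.a, while an unqualified a becomes default.a.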
[jira] [Commented] (HIVE-14233) Improve vectorization for ACID by eliminating row-by-row stitching
[ https://issues.apache.org/jira/browse/HIVE-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436236#comment-15436236 ] Saket Saurabh commented on HIVE-14233: -- Thanks [~ekoifman] for the comments, working now on fixing them. > Improve vectorization for ACID by eliminating row-by-row stitching > -- > > Key: HIVE-14233 > URL: https://issues.apache.org/jira/browse/HIVE-14233 > Project: Hive > Issue Type: New Feature > Components: Transactions, Vectorization >Reporter: Saket Saurabh >Assignee: Saket Saurabh > Attachments: HIVE-14233.01.patch, HIVE-14233.02.patch, > HIVE-14233.03.patch, HIVE-14233.04.patch, HIVE-14233.05.patch, > HIVE-14233.06.patch, HIVE-14233.07.patch, HIVE-14233.08.patch, > HIVE-14233.09.patch > > > This JIRA proposes to improve vectorization for ACID by eliminating > row-by-row stitching when reading back ACID files. In the current > implementation, a vectorized row batch is created by populating the batch one > row at a time, before the vectorized batch is passed up along the operator > pipeline. This row-by-row stitching limitation was because of the fact that > the ACID insert/update/delete events from various delta files needed to be > merged together before the actual version of a given row was found out. > HIVE-14035 has enabled us to break away from that limitation by splitting > ACID update events into a combination of delete+insert. In fact, it has now > enabled us to create splits on delta files. > Building on top of HIVE-14035, this JIRA proposes to solve this earlier > bottleneck in the vectorized code path for ACID by now directly reading row > batches from the underlying ORC files and avoiding any stitching altogether. > Once a row batch is read from the split (which may be on a base/delta file), > the deleted rows will be found by cross-referencing them against a data > structure that will just keep track of deleted events (found in the > deleted_delta files). 
This will lead to a large performance gain when reading > ACID files in vectorized fashion, while enabling further optimizations in > future that can be done on top of that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-14582) Add trunc(numeric) udf
[ https://issues.apache.org/jira/browse/HIVE-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam reassigned HIVE-14582: --- Assignee: Chinna Rao Lalam > Add trunc(numeric) udf > -- > > Key: HIVE-14582 > URL: https://issues.apache.org/jira/browse/HIVE-14582 > Project: Hive > Issue Type: Sub-task > Components: SQL >Reporter: Ashutosh Chauhan >Assignee: Chinna Rao Lalam > > https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions200.htm -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436098#comment-15436098 ] Hive QA commented on HIVE-14621: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825358/HIVE-14621.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10461 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/978/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/978/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-978/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12825358 - PreCommit-HIVE-MASTER-Build > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.01.patch, HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile cause it's pointless and > just creates a zoo of configurations. > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > 
org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14618) beeline fetch logging delays before query completion
[ https://issues.apache.org/jira/browse/HIVE-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tao Li updated HIVE-14618: -- Attachment: HIVE-14618.2.patch > beeline fetch logging delays before query completion > > > Key: HIVE-14618 > URL: https://issues.apache.org/jira/browse/HIVE-14618 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-14618.1.patch, HIVE-14618.2.patch > > > Beeline has a thread that fetches logs from HS2. However, it uses the same > HiveStatement object to also wait for query completion using a long-poll > (with default interval of 5 seconds). > The JDBC client has a lock around the Thrift API calls, resulting in the > getLogs API blocking on the query completion check, i.e., the logs would get > shown only every 5 seconds by default. > cc [~vgumashta] [~gopalv] [~thejas] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436153#comment-15436153 ] Ferdinand Xu commented on HIVE-13589: - [~vihangk1] Good idea. Let's do it this way. [~Jk_Self] thank you for your update. LGTM +1 > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch, HIVE-13589.4.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14618) beeline fetch logging delays before query completion
[ https://issues.apache.org/jira/browse/HIVE-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436160#comment-15436160 ] Gopal V commented on HIVE-14618: LGTM - +1 tests pending. > beeline fetch logging delays before query completion > > > Key: HIVE-14618 > URL: https://issues.apache.org/jira/browse/HIVE-14618 > Project: Hive > Issue Type: Bug >Reporter: Tao Li >Assignee: Tao Li > Attachments: HIVE-14618.1.patch, HIVE-14618.2.patch > > > Beeline has a thread that fetches logs from HS2. However, it uses the same > HiveStatement object to also wait for query completion using a long-poll > (with default interval of 5 seconds). > The JDBC client has a lock around the Thrift API calls, resulting in the > getLogs API blocking on the query completion check, i.e., the logs would get > shown only every 5 seconds by default. > cc [~vgumashta] [~gopalv] [~thejas] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14628) Flaky tests: MiniYarnClusterDir in /tmp - TestPigHBaseStorageHandler
[ https://issues.apache.org/jira/browse/HIVE-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-14628: -- Attachment: hive.log > Flaky tests: MiniYarnClusterDir in /tmp - TestPigHBaseStorageHandler > > > Key: HIVE-14628 > URL: https://issues.apache.org/jira/browse/HIVE-14628 > Project: Hive > Issue Type: Sub-task >Reporter: Siddharth Seth > Attachments: hive.log > > > Configure the MiniYarnCluster to work within a test specific directory, > instead of using /tmp > https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/979/testReport/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436271#comment-15436271 ] Hive QA commented on HIVE-14462: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12825362/HIVE-14462.7.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10461 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partitions_json] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char] org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/980/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/980/console Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-980/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12825362 - PreCommit-HIVE-MASTER-Build > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436126#comment-15436126 ] Yechao Chen edited comment on HIVE-14398 at 8/25/16 2:06 AM: - [~xuefuz] Yes, I checked out the latest trunk; the code has changed a lot. I tested it and the issue is already fixed. Thanks for your answer. Should I just cancel this patch, or do something else? was (Author: chenyechao): [~xuefuz] Yes,I checkout the latest trunk,the code has change a lot; I test it, It is already fixed.thanks for your answer,so i just cancel this patch or else? > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14614) Insert overwrite local directory fails with IllegalStateException
[ https://issues.apache.org/jira/browse/HIVE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436106#comment-15436106 ] Mohit Sabharwal commented on HIVE-14614: Nit: I'd just define a new HADOOP_LOCAL_FS_SCHEME and do {code} +isLocal = path.toUri().getScheme().equals(HADOOP_LOCAL_FS_SCHEME); {code} for readability. Otherwise LGTM pending tests. [~spena] should take a look since he worked on HIVE-14270. > Insert overwrite local directory fails with IllegalStateException > - > > Key: HIVE-14614 > URL: https://issues.apache.org/jira/browse/HIVE-14614 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-14614.2.patch > > > insert overwrite local directory select * from table; fails with > "java.lang.IllegalStateException: Cannot create staging directory" when the > path sent to the getTempDirForPath(Path path) is a local fs path. > This is a regression caused by the fix for HIVE-14270 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
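The readability suggestion in the review comment above can be sketched as follows. This is only an illustration, not the patch itself: it uses java.net.URI in place of Hadoop's Path so that it runs standalone, the constant name HADOOP_LOCAL_FS_SCHEME is taken from the comment, and its value "file" is an assumption. The sketch also puts the constant first in the comparison, since getScheme() returns null for a scheme-less path and calling equals on that result would throw.

```java
import java.net.URI;

final class LocalSchemeCheck {
    // Name taken from the review comment; the value "file" is an assumption.
    static final String HADOOP_LOCAL_FS_SCHEME = "file";

    // Constant-first equals avoids a NullPointerException when the URI
    // carries no scheme at all (for example "/tmp/out").
    static boolean isLocal(URI uri) {
        return HADOOP_LOCAL_FS_SCHEME.equals(uri.getScheme());
    }

    public static void main(String[] args) {
        System.out.println(isLocal(URI.create("file:///tmp/out")));
        System.out.println(isLocal(URI.create("hdfs://nn:8020/tmp/out")));
        System.out.println(isLocal(URI.create("/tmp/out")));
    }
}
```

Whether a scheme-less path should count as local depends on the default file system in a real deployment; this sketch simply treats it as non-local.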
[jira] [Commented] (HIVE-14621) LLAP: memory.mode = none has NPE
[ https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436111#comment-15436111 ] Sergey Shelukhin commented on HIVE-14621: - Test failures appear unrelated (mostly the usual stats/hashtable order changes) > LLAP: memory.mode = none has NPE > > > Key: HIVE-14621 > URL: https://issues.apache.org/jira/browse/HIVE-14621 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-14621.01.patch, HIVE-14621.patch > > > When IO elevator is enabled, but cache and allocator are both disabled, NPEs > happen. It's not really a recommended mode, but it's the only way to disable > cache, so we probably need to fix it. I am also going to nuke the > intermediate mode (allocator w/no cache) meanwhile cause it's pointless and > just creates a zoo of configurations. > {noformat} > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63) > at > org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514) > at > org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162) > at > org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76) > at > org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30) > at > 
org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224) > at > org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93) > ... 6 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14233) Improve vectorization for ACID by eliminating row-by-row stitching
[ https://issues.apache.org/jira/browse/HIVE-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436113#comment-15436113 ] Eugene Koifman commented on HIVE-14233: --- [~saketj] more comments on RB > Improve vectorization for ACID by eliminating row-by-row stitching > -- > > Key: HIVE-14233 > URL: https://issues.apache.org/jira/browse/HIVE-14233 > Project: Hive > Issue Type: New Feature > Components: Transactions, Vectorization >Reporter: Saket Saurabh >Assignee: Saket Saurabh > Attachments: HIVE-14233.01.patch, HIVE-14233.02.patch, > HIVE-14233.03.patch, HIVE-14233.04.patch, HIVE-14233.05.patch, > HIVE-14233.06.patch, HIVE-14233.07.patch, HIVE-14233.08.patch, > HIVE-14233.09.patch > > > This JIRA proposes to improve vectorization for ACID by eliminating > row-by-row stitching when reading back ACID files. In the current > implementation, a vectorized row batch is created by populating the batch one > row at a time, before the vectorized batch is passed up along the operator > pipeline. This row-by-row stitching limitation was because of the fact that > the ACID insert/update/delete events from various delta files needed to be > merged together before the actual version of a given row was found out. > HIVE-14035 has enabled us to break away from that limitation by splitting > ACID update events into a combination of delete+insert. In fact, it has now > enabled us to create splits on delta files. > Building on top of HIVE-14035, this JIRA proposes to solve this earlier > bottleneck in the vectorized code path for ACID by now directly reading row > batches from the underlying ORC files and avoiding any stitching altogether. > Once a row batch is read from the split (which may be on a base/delta file), > the deleted rows will be found by cross-referencing them against a data > structure that will just keep track of deleted events (found in the > deleted_delta files). 
This will lead to a large performance gain when reading > ACID files in vectorized fashion, while enabling further optimizations in > future that can be done on top of that. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
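The idea in the description above (read a whole batch, then drop deleted rows by looking them up in a structure built from the delete events) can be sketched in plain Java. Everything here is hypothetical and chosen for illustration: the class name, the use of a single long as a row identifier, and a HashSet as the delete-tracking structure. The real implementation works on ORC row batches and ACID row identifiers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class DeleteFilterSketch {
    /**
     * Returns the positions of the rows in a batch that are still live.
     * rowIds[i] is a hypothetical unique id for the i-th row of the batch;
     * deletedRowIds stands in for the structure built from delete events.
     */
    static int[] selectLiveRows(long[] rowIds, Set<Long> deletedRowIds) {
        int[] selected = new int[rowIds.length];
        int size = 0;
        for (int i = 0; i < rowIds.length; i++) {
            // One lookup per row; no row-by-row merging of delta files.
            if (!deletedRowIds.contains(rowIds[i])) {
                selected[size++] = i;
            }
        }
        return Arrays.copyOf(selected, size);
    }

    public static void main(String[] args) {
        long[] batch = {100L, 101L, 102L, 103L};
        Set<Long> deleted = new HashSet<>(Arrays.asList(101L, 103L));
        System.out.println(Arrays.toString(selectLiveRows(batch, deleted)));
    }
}
```

Returning a list of selected positions, rather than copying the surviving rows, mirrors how a vectorized pipeline can pass a batch along unchanged together with a selection vector.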
[jira] [Updated] (HIVE-13930) upgrade Hive to latest Hadoop version
[ https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13930: Attachment: HIVE-13930.07.patch Actually I cannot repro those (other than maybe some ordering changes if I run with java7). Will run again and try to catch the logs/test report this time > upgrade Hive to latest Hadoop version > - > > Key: HIVE-13930 > URL: https://issues.apache.org/jira/browse/HIVE-13930 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, > HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.05.patch, > HIVE-13930.06.patch, HIVE-13930.07.patch, HIVE-13930.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436127#comment-15436127 ] Xuefu Zhang commented on HIVE-14398: I closed this as "not reproducible". > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14398) import database.tablename from path error
[ https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-14398: --- Resolution: Cannot Reproduce Status: Resolved (was: Patch Available) > import database.tablename from path error > - > > Key: HIVE-14398 > URL: https://issues.apache.org/jira/browse/HIVE-14398 > Project: Hive > Issue Type: Bug > Components: Import/Export >Affects Versions: 1.1.0 >Reporter: Yechao Chen >Assignee: Yechao Chen > Fix For: 1.1.0 > > Attachments: HIVE-14398.1.patch > > > hive>create table a(id int,name string); > hive>export table a to '/tmp/a'; > hive> import table test.a from '/tmp/a'; > Copying data from hdfs://test:8020/tmp/a/data > Loading data to table default.test.a > Failed with exception Invalid table name default.test.a > FAILED: Execution Error, return code 1 from > org.apache.hadoop.hive.ql.exec.MoveTask > tablename should be test.a not default.test.a -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434353#comment-15434353 ] Ferdinand Xu commented on HIVE-13589: - [~vihangk1], I took a look at the BeeLine code. Currently -p is associated with a password. If we make its argument optional, the parser will look at the string following "-p" to see whether a password is present. That way, it's possible for "--" to be treated as a password, since - doesn't exist in the Beeline options. One approach I can think of is to add a new option that takes no password and prompts the user to enter it. Any thoughts? > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434353#comment-15434353 ] Ferdinand Xu edited comment on HIVE-13589 at 8/24/16 7:27 AM: -- [~vihangk1], I took a look at the BeeLine code. Currently "- p" is associated with a password. If we make its argument optional, the parser will look at the string following "- p" to see whether a password is present. That way, it's possible for "--" to be treated as a password, since - doesn't exist in the Beeline options. One approach I can think of is to add a new option that takes no password and prompts the user to enter it. Any thoughts? was (Author: ferd): [~vihangk1], I take a look at the BeeLine code. Now -p has been associated with a password. If we make it optional, it will parse the string next to "-p" to see whether it exists a password. In this way, it's possible to treat "--" as a password since - doesn't exist in Beeline options. One way I can think of is that we could add a new option which has no password. And prompt user to enter the password. Any thoughputs? > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14446) Adjust bloom filter for hybrid grace hash join when row count exceeds certain limit
[ https://issues.apache.org/jira/browse/HIVE-14446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434394#comment-15434394 ] Gopal V commented on HIVE-14446: LGTM - +1. > Adjust bloom filter for hybrid grace hash join when row count exceeds certain > limit > --- > > Key: HIVE-14446 > URL: https://issues.apache.org/jira/browse/HIVE-14446 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.3.0, 2.2.0, 2.1.1 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-14446.1.patch, HIVE-14446.2.patch > > > When row count exceeds certain limit, it doesn't make sense to generate a > bloom filter, since its size will be a few hundred MB or even a few GB. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
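To see why the filter size becomes a problem at high row counts, the standard Bloom filter sizing formula m = -n ln(p) / (ln 2)^2 can be evaluated directly. The sketch below is a back-of-the-envelope calculation only; the 5% false-positive probability and the row counts are assumptions for illustration, and the class and method names are hypothetical rather than taken from Hive.

```java
final class BloomSizeSketch {
    // Optimal number of bits for n expected entries and false-positive
    // probability p, from the textbook formula m = -n * ln(p) / (ln 2)^2.
    static long optimalBits(long expectedEntries, double fpp) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedEntries * Math.log(fpp) / (ln2 * ln2));
    }

    static double bitsToMebibytes(long bits) {
        return bits / 8.0 / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        double fpp = 0.05; // assumed false-positive probability
        long[] rowCounts = {1_000_000L, 100_000_000L, 1_000_000_000L};
        for (long n : rowCounts) {
            long bits = optimalBits(n, fpp);
            System.out.printf("%d rows -> %.1f MiB%n", n, bitsToMebibytes(bits));
        }
    }
}
```

At a 5% false-positive rate the formula works out to about 6.2 bits per entry, so a billion rows already lands in the hundreds of megabytes, which matches the range mentioned in the issue description.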
[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434292#comment-15434292 ] Ferdinand Xu commented on HIVE-13589: - [~Jk_Self], we should not move those options to Beeline, since that will break backwards compatibility. Is there any way to make this option required so that, if the user doesn't enter a password, we pass in an empty value? When the value is empty, we prompt the user to enter their password. > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434269#comment-15434269 ] Ke Jia commented on HIVE-13589: --- Hi [~vihangk1], [~Ferd], the current patch can prompt for the password when the password is null or an empty string. However, if "-p" is followed by an option that is not among those registered in Beeline.java [L290-L391], Apache Commons CLI will treat that option as the value of the "-p" argument. For example, with "hive --service beeline -u jdbc:hive2://localhost:1 -n root -p --force=true -e 'show tables;'", because the "--force" option is not registered in Beeline.java, Apache Commons CLI assigns "--force=true" to the "-p" option. So when it executes the code you mentioned above, the password is not null and Beeline will not prompt for a password. To avoid this bad user experience, do you think we need to add all the options in Beeline.java? > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
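The parsing problem described in the comment above can be illustrated without any CLI library. The sketch below is hypothetical and is neither Beeline's code nor the Commons CLI API: it hand-rolls one possible rule, namely that the token after "-p" counts as the password only if it does not itself look like an option. The class and method names are invented for the example.

```java
final class PasswordArgSketch {
    /**
     * Returns true when "-p" is present but no usable password follows it,
     * meaning the caller should prompt. A following token that starts with
     * "-" is assumed to be another option, not a password.
     */
    static boolean shouldPromptForPassword(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-p".equals(args[i])) {
                boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("-");
                return !hasValue;
            }
        }
        // "-p" was not given at all; this sketch leaves that case to the caller.
        return false;
    }

    public static void main(String[] args) {
        String[] fromTheComment = {
            "-u", "jdbc:hive2://localhost:1", "-n", "root",
            "-p", "--force=true", "-e", "show tables;"
        };
        System.out.println(shouldPromptForPassword(fromTheComment));
        System.out.println(shouldPromptForPassword(new String[] {"-p", "secret"}));
    }
}
```

The obvious cost of this rule is that a real password beginning with "-" would be mistaken for an option, which is the ambiguity raised elsewhere in this thread.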
[jira] [Updated] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14462: Status: Patch Available (was: Open) > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14462) Reduce number of partition check calls in add_partitions
[ https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated HIVE-14462: Attachment: HIVE-14462.6.patch > Reduce number of partition check calls in add_partitions > > > Key: HIVE-14462 > URL: https://issues.apache.org/jira/browse/HIVE-14462 > Project: Hive > Issue Type: Improvement > Components: Metastore >Reporter: Rajesh Balamohan >Assignee: Rajesh Balamohan >Priority: Minor > Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, > HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-14362) Support explain analyze in Hive
[ https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435149#comment-15435149 ] Pengcheng Xiong commented on HIVE-14362: Thanks [~gopalv] for the detailed performance analysis. I have addressed the local file issue and also the vectorization issue. I still have some other small issues to address before I submit another patch. Thanks. > Support explain analyze in Hive > --- > > Key: HIVE-14362 > URL: https://issues.apache.org/jira/browse/HIVE-14362 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, > compare_on_cluster.pdf > > > Right now all the explain levels only support stats before query runs. We > would like to have an explain analyze similar to Postgres for real stats > after query runs. This will help to identify the major gap between > estimated/real stats and make not only query optimization better but also > query performance debugging easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets
[ https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Liew updated HIVE-13680: -- Attachment: HIVE-13680.4.patch > HiveServer2: Provide a way to compress ResultSets > - > > Key: HIVE-13680 > URL: https://issues.apache.org/jira/browse/HIVE-13680 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2, JDBC >Reporter: Vaibhav Gumashta >Assignee: Kevin Liew > Attachments: HIVE-13680.2.patch, HIVE-13680.3.patch, > HIVE-13680.4.patch, HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf > > > With HIVE-12049 in, we can provide an option to compress ResultSets before > writing to disk. The user can specify a compression library via a config > param which can be used in the tasks. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14617) NPE in UDF MapValues() if input is null
[ https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-14617: --- Description: For query {code} select exploded_traits from hdrone.vehiclestore_udr_vehicle lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as exploded_traits where datestr > '2016-08-22' LIMIT 100 {code} Job fails with error msg as follows: {code} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) at 
org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating map_values(vehicle_traits.vehicle_traits) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157) at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) ... 9 more Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64) at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77) at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65) at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77) ... 15 more {code} It appears that null is not properly handled in GenericUDFMapValues.evaluate() method. 
was: Job fails with error msg as follows: {code} Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"} at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
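The failure in the description above comes from evaluating map_values on a null map. A null-tolerant version of the operation can be sketched in plain Java as below. This is not the GenericUDFMapValues code and not necessarily what the eventual fix does: the class name is hypothetical, and returning null for a null input is an assumption modelled on the usual SQL convention.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class NullSafeMapValues {
    // Returns the values of the map as a list, or null when the map is null.
    // Null-in, null-out is an assumption; an empty list would be another option.
    static <K, V> List<V> mapValues(Map<K, V> map) {
        if (map == null) {
            return null;
        }
        return new ArrayList<>(map.values());
    }

    public static void main(String[] args) {
        Map<String, Integer> missing = null;
        System.out.println(mapValues(missing));
        System.out.println(mapValues(Collections.singletonMap("wheels", 4)));
    }
}
```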
[jira] [Resolved] (HIVE-14601) Altering table/partition file format with preexisting data should not be allowed
[ https://issues.apache.org/jira/browse/HIVE-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Barna Zsombor Klara resolved HIVE-14601. Resolution: Not A Bug Altering a table file format with records inside is indeed useful if the user forgets to specify the correct format and uploads the data file directly to the table location. In this case the user should be able to alter the file format to the correct value to fix the table setup. > Altering table/partition file format with preexisting data should not be > allowed > > > Key: HIVE-14601 > URL: https://issues.apache.org/jira/browse/HIVE-14601 > Project: Hive > Issue Type: Bug > Components: Metastore >Reporter: Barna Zsombor Klara >Assignee: Barna Zsombor Klara >Priority: Minor > > The file format of a table or a partition can be changed using an alter > statement. However this only affects the metadata, the data in hdfs is not > changed, leading to a table from which you cannot select anymore. > Changing the file format back fixes the issue, but a better approach would be > to prevent the alter to the file format if we have data in the tables. > The issue is reproducible by executing the following commands: > {code} > create table test (id int); > insert into test values (1); > alter table test set fileformat parquet; > insert into test values (2); > select * from test; > {code} > Will result in: > {code} > java.lang.RuntimeException: .../00_0 is not a Parquet file (too small) > (state=,code=0) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14362) Support explain analyze in Hive
[ https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14362: --- Attachment: HIVE-14362.03.patch > Support explain analyze in Hive > --- > > Key: HIVE-14362 > URL: https://issues.apache.org/jira/browse/HIVE-14362 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, > HIVE-14362.03.patch, compare_on_cluster.pdf > > > Right now all the explain levels only support stats before query runs. We > would like to have an explain analyze similar to Postgres for real stats > after query runs. This will help to identify the major gap between > estimated/real stats and make not only query optimization better but also > query performance debugging easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14362) Support explain analyze in Hive
[ https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14362: --- Status: Open (was: Patch Available) > Support explain analyze in Hive > --- > > Key: HIVE-14362 > URL: https://issues.apache.org/jira/browse/HIVE-14362 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, > HIVE-14362.03.patch, compare_on_cluster.pdf > > > Right now all the explain levels only support stats before query runs. We > would like to have an explain analyze similar to Postgres for real stats > after query runs. This will help to identify the major gap between > estimated/real stats and make not only query optimization better but also > query performance debugging easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-14362) Support explain analyze in Hive
[ https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-14362: --- Status: Patch Available (was: Open) > Support explain analyze in Hive > --- > > Key: HIVE-14362 > URL: https://issues.apache.org/jira/browse/HIVE-14362 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, > HIVE-14362.03.patch, compare_on_cluster.pdf > > > Right now all the explain levels only support stats before query runs. We > would like to have an explain analyze similar to Postgres for real stats > after query runs. This will help to identify the major gap between > estimated/real stats and make not only query optimization better but also > query performance debugging easier. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option
[ https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435315#comment-15435315 ] Vihang Karajgaonkar commented on HIVE-13589: [~Ferd] I agree that we should not change the behavior of the -p argument, since that would break backward compatibility. Adding another option to achieve the same purpose seems like overkill. Beeline should be smart enough to prompt for the password if no password is given on the command line, e.g.: 1. beeline -u "jdbc:hive2://localhost:1" -n username > beeline should prompt for the password. 2. beeline -u "jdbc:hive2://localhost:1" -n username -p If we can achieve (1) above within beeline, I think that should be sufficient to solve this issue without any of the workarounds mentioned by [~thejas] in the first comment. > beeline - support prompt for password with '-u' option > -- > > Key: HIVE-13589 > URL: https://issues.apache.org/jira/browse/HIVE-13589 > Project: Hive > Issue Type: Bug > Components: Beeline >Reporter: Thejas M Nair >Assignee: Ke Jia > Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, > HIVE-13589.3.patch > > > Specifying connection string using commandline options in beeline is > convenient, as it gets saved in shell command history, and it is easy to > retrieve it from there. > However, specifying the password in command prompt is not secure as it gets > displayed on screen and saved in the history. > It should be possible to specify '-p' without an argument to make beeline > prompt for password.
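The rule proposed in the comment above (prompt when a username is given but no password value is) can be sketched roughly as follows. This is only an illustration in Python, not Beeline's actual code or the attached patch: the function name `resolve_password`, the injectable `prompt` parameter, and the simplified argument scan are all hypothetical. Treating a bare `-p` the same as a missing `-p` is also an assumption taken from the issue description rather than something the comment states.

```python
import getpass
from typing import Callable, List, Optional


def resolve_password(
    argv: List[str],
    prompt: Callable[[str], str] = getpass.getpass,
) -> Optional[str]:
    """Return the password to use, prompting only when none was supplied.

    Hypothetical sketch: only -n and -p are inspected; every other
    argument (including -u and its URL) is skipped.
    """
    username = None
    password = None
    i = 0
    while i < len(argv):
        arg = argv[i]
        # An option has a value only if the next token exists and is not
        # itself an option. Limitation of this sketch: a password that
        # starts with "-" would be misread as an option.
        has_value = i + 1 < len(argv) and not argv[i + 1].startswith("-")
        if arg == "-n" and has_value:
            username = argv[i + 1]
            i += 2
        elif arg == "-p" and has_value:
            password = argv[i + 1]
            i += 2
        else:
            i += 1

    # Case (1) in the comment: username given, no password value.
    # A bare "-p" (case 2) ends up here as well, by assumption.
    if username is not None and password is None:
        return prompt("Enter password for %s: " % username)
    return password
```

With this rule, an explicit `-p value` keeps its current meaning, so existing scripts would not be affected, and the password never needs to appear in the shell history.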
[jira] [Commented] (HIVE-14614) Insert overwrite local directory fails with IllegalStateException
[ https://issues.apache.org/jira/browse/HIVE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435335#comment-15435335 ] Vihang Karajgaonkar commented on HIVE-14614: [~spena] and [~mohitsabharwal] Can you please review the patch when you get a chance? Thanks! > Insert overwrite local directory fails with IllegalStateException > - > > Key: HIVE-14614 > URL: https://issues.apache.org/jira/browse/HIVE-14614 > Project: Hive > Issue Type: Bug >Reporter: Vihang Karajgaonkar >Assignee: Vihang Karajgaonkar >Priority: Minor > Attachments: HIVE-14614.2.patch > > > "insert overwrite local directory select * from table;" fails with > "java.lang.IllegalStateException: Cannot create staging directory" when the > path passed to getTempDirForPath(Path path) is a local fs path. > This is a regression caused by the fix for HIVE-14270.