[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15434683#comment-15434683
 ] 

Hive QA commented on HIVE-14462:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825218/HIVE-14462.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 10459 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partitions_json]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[add_partition_with_whitelist]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/974/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/974/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-974/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825218 - PreCommit-HIVE-MASTER-Build

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-14571) Document configuration hive.msck.repair.batch.size

2016-08-24 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam resolved HIVE-14571.
-
Resolution: Fixed

Updated wiki doc.

> Document configuration hive.msck.repair.batch.size
> --
>
> Key: HIVE-14571
> URL: https://issues.apache.org/jira/browse/HIVE-14571
> Project: Hive
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Chinna Rao Lalam
>Assignee: Chinna Rao Lalam
>Priority: Minor
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
>
> Update here 
> [https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)]
> {quote}
> When there is a large number of untracked partitions, the MSCK REPAIR 
> TABLE command can be run in batches to avoid an OutOfMemoryError (OOME). 
> Setting the property *hive.msck.repair.batch.size* to a positive value 
> makes it process that many partitions per batch internally. The default 
> value of the property is zero, which means all partitions are processed 
> in one shot.
> {quote}
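For example, the batched repair described above can be invoked like this (the table name is illustrative):

```sql
-- Process untracked partitions in batches of 100 instead of all at once
SET hive.msck.repair.batch.size=100;
MSCK REPAIR TABLE page_view;
```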



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14373) Add integration tests for hive on S3

2016-08-24 Thread Thomas Poepping (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435905#comment-15435905
 ] 

Thomas Poepping commented on HIVE-14373:


Abdullah, what is wrong with:
* at the beginning of the test run, do a mkdir in S3 for a unique test run id
* at the end of the test run, do a rmdir for that directory

That will remove all leftover data. Maybe we could have a setting to optionally 
not delete data at the end, to allow for more targeted debugging. Then it would 
be the responsibility of the user to delete those files after the fact.
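The run-scoped directory lifecycle described above can be sketched as follows, using the local filesystem as a stand-in for S3 (the class and directory names are illustrative, not part of the patch):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.UUID;
import java.util.stream.Stream;

// Sketch of the proposed test-run isolation: every run gets a unique
// root directory that is created before the tests and removed after,
// unless the caller asks to keep it for debugging.
public class TestRunScratchDir {
    private final Path root;

    public TestRunScratchDir(Path base) throws IOException {
        // Unique id per test run, e.g. s3://bucket/hive-tests/<uuid>/ on S3.
        this.root = Files.createDirectories(base.resolve("hive-tests-" + UUID.randomUUID()));
    }

    public Path getRoot() {
        return root;
    }

    // Recursive delete, the equivalent of the "rmdir" at the end of the run.
    public void cleanup(boolean keepForDebugging) throws IOException {
        if (keepForDebugging) {
            return; // user is responsible for deleting the files later
        }
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```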

Are you planning on updating this patch again?

> Add integration tests for hive on S3
> 
>
> Key: HIVE-14373
> URL: https://issues.apache.org/jira/browse/HIVE-14373
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergio Peña
>Assignee: Abdullah Yousufi
> Attachments: HIVE-14373.02.patch, HIVE-14373.03.patch, 
> HIVE-14373.04.patch, HIVE-14373.patch
>
>
> With Hive making improvements to run on S3, it would be ideal to have better 
> integration testing on S3.
> These S3 tests cannot be executed by HiveQA because they need Amazon 
> credentials. We need to write a suite based on ideas from the Hadoop 
> project where:
> - an xml file is provided with S3 credentials
> - a committer must run these tests manually to verify it works
> - the xml file should not be part of the commit, and hiveqa should not run 
> these tests.
> https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13403) Make Streaming API not create empty buckets (at least as an option)

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435904#comment-15435904
 ] 

Hive QA commented on HIVE-13403:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825135/HIVE-13403.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10444 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby10.q-skewjoinopt5.q-join32_lessSize.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/976/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/976/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-976/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825135 - PreCommit-HIVE-MASTER-Build

> Make Streaming API not create empty buckets (at least as an option)
> ---
>
> Key: HIVE-13403
> URL: https://issues.apache.org/jira/browse/HIVE-13403
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, 
> HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch
>
>
> As of HIVE-11983, when a TransactionBatch is opened in the Streaming API, a full 
> complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is 
> created on disk even though some may end up receiving no data.
> It would be better to create them on demand and not clog the FS.
> Tez can handle missing (empty) buckets, and on MR the bucket join algorithms 
> will check whether all buckets are there and bail out if not.  
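The on-demand alternative the issue asks for can be sketched like this (a simplified stand-in for AbstractRecordWriter, with a hypothetical createUpdater hook; not the actual patch):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Instead of pre-creating one file per bucket when the transaction batch
// opens, create each bucket's writer the first time a record hashes to it,
// so empty buckets never touch the filesystem.
public class LazyBucketWriters<W> {
    private final Map<Integer, W> writers = new HashMap<>();
    private final IntFunction<W> createUpdater; // opens the bucket file on disk

    public LazyBucketWriters(IntFunction<W> createUpdater) {
        this.createUpdater = createUpdater;
    }

    public W forBucket(int bucket) {
        // computeIfAbsent: the bucket file is created only on first use
        return writers.computeIfAbsent(bucket, createUpdater::apply);
    }

    public int openBucketCount() {
        return writers.size();
    }
}
```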



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-14462:

Attachment: HIVE-14462.7.patch

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13403) Make Streaming API not create empty buckets

2016-08-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13403:
-
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks Eugene for review.

> Make Streaming API not create empty buckets
> ---
>
> Key: HIVE-13403
> URL: https://issues.apache.org/jira/browse/HIVE-13403
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, 
> HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch
>
>
> As of HIVE-11983, when a TransactionBatch is opened in the Streaming API, a full 
> complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is 
> created on disk even though some may end up receiving no data.
> It would be better to create them on demand and not clog the FS.
> Tez can handle missing (empty) buckets, and on MR the bucket join algorithms 
> will check whether all buckets are there and bail out if not.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14617) NPE in UDF MapValues() if input is null

2016-08-24 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-14617:
---
Attachment: HIVE-14617.patch

> NPE in UDF MapValues() if input is null
> ---
>
> Key: HIVE-14617
> URL: https://issues.apache.org/jira/browse/HIVE-14617
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-14617.patch
>
>
> For query
> {code}
> select exploded_traits from hdrone.vehiclestore_udr_vehicle 
> lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as 
> exploded_traits 
> where datestr > '2016-08-22' LIMIT 100
> {code}
> Job fails with error msg as follows:
> {code}
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at 
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) 
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... 
> 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating map_values(vehicle_traits.vehicle_traits) at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>  at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) 
> ... 9 more Caused by: java.lang.NullPointerException at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>  at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>  ... 15 more 
> {code}
> It appears that null is not properly handled in the 
> GenericUDFMapValues.evaluate() method.
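A minimal sketch of the kind of null guard the fix presumably needs (the ObjectInspector plumbing of the real UDF is elided; names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of a null-safe map_values(): return null when the input map
// itself is null instead of dereferencing it and throwing an NPE.
public class NullSafeMapValues {
    public static <K, V> List<V> evaluate(Map<K, V> map) {
        if (map == null) {
            return null; // previously this fell through to map.values() -> NPE
        }
        return new ArrayList<>(map.values());
    }
}
```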



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436060#comment-15436060
 ] 

Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:05 AM:
--

This basically creates the nodes in ZK for "slots" in the cluster. The LLAPs 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting, without knowing 
about each other, their predecessors' location (if restarted in a different 
place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay the 3rd and retain cache locality.

This also handles size increase, as new nodes will always be added to the end 
of the sequence, which is what consistent hashing needs.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 


was (Author: sershe):
This basically creates the nodes in ZK for "slots" in the cluster. The nodes 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting, without knowing 
about each other, the predecessors location (if restarted in a different 
place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay the 3rd and retain cache locality.

This also handles size increase, as new nodes will always be added to the end 
of the sequence, which is what consistent hashing needs.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436060#comment-15436060
 ] 

Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:07 AM:
--

Edit: removed the confusion between ZK node vs LLAP node/machine.

This basically creates the nodes in ZK for "slots" in the cluster. The LLAPs 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The LLAPs are always sorted 
by the slot number for splits.
The idea is that as long as an LLAP is running, it will retain the same position 
in the ordering, regardless of other LLAPs restarting, without knowing about 
each other, their predecessors' location (if restarted in a different place), or 
the total size of the cluster. 
The restarting LLAPs may not take the same positions as their predecessors 
(i.e. if two LLAPs restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4, 
they will take whatever slots, but 3 will stay the 3rd and retain cache 
locality.

This also handles size increase, as new LLAPs will always be added to the end 
of the sequence, which is what consistent hashing needs.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if LLAPs are removed that have the slots in the middle; until 
some are restarted, it will result in misses. 
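The "lowest available slot" behavior described above can be sketched like this (a toy in-memory registry standing in for the ZK slot nodes; class and method names are illustrative):

```java
import java.util.Set;
import java.util.TreeSet;

// Each starting LLAP daemon claims the lowest slot number not currently
// held; slots freed by dead daemons are reused, so long-lived daemons
// keep their position in the ordering used for split assignment.
public class SlotRegistry {
    private final Set<Integer> taken = new TreeSet<>();

    public synchronized int claimLowestFree() {
        int slot = 0;
        while (taken.contains(slot)) {
            slot++;
        }
        taken.add(slot);
        return slot;
    }

    public synchronized void release(int slot) {
        taken.remove(slot);
    }
}
```

Note how a restarted daemon fills the first gap rather than appending to the end, which is what keeps surviving daemons at stable positions for consistent hashing.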


was (Author: sershe):
This basically creates the nodes in ZK for "slots" in the cluster. The LLAPs 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting, without knowing 
about each other, the predecessors location (if restarted in a different 
place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay the 3rd and retain cache locality.

This also handles size increase, as new nodes will always be added to the end 
of the sequence, which is what consistent hashing needs.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435917#comment-15435917
 ] 

Prasanth Jayachandran commented on HIVE-14621:
--

Mostly looks good. 

{code}
LowLevelCacheImpl cacheImpl = new LowLevelCacheImpl(cacheMetrics, cachePolicy, 
allocator, true);
cacheImpl.init();
{code}

Can we do init() inside the ctor, so that we can avoid {code}cache = 
cacheImpl;{code}? 

Also, can you add a test case for this?
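The refactoring being suggested, in a generic sketch (not the actual LowLevelCacheImpl signature):

```java
// Review suggestion in a nutshell: fold the separate init() call into
// the constructor so the object is usable as soon as it is built and
// callers cannot forget the second step.
public class EagerInitCache {
    private boolean initialized;

    public EagerInitCache() {
        init(); // was: caller had to invoke init() after construction
    }

    private void init() {
        initialized = true;
    }

    public boolean isInitialized() {
        return initialized;
    }
}
```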

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.patch
>
>
> When the IO elevator is enabled but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable 
> the cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator with no cache) in the meantime because it's 
> pointless and just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14617) NPE in UDF MapValues() if input is null

2016-08-24 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-14617:
---
Status: Patch Available  (was: Open)

> NPE in UDF MapValues() if input is null
> ---
>
> Key: HIVE-14617
> URL: https://issues.apache.org/jira/browse/HIVE-14617
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-14617.patch
>
>
> For query
> {code}
> select exploded_traits from hdrone.vehiclestore_udr_vehicle 
> lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as 
> exploded_traits 
> where datestr > '2016-08-22' LIMIT 100
> {code}
> Job fails with error msg as follows:
> {code}
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at 
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) 
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... 
> 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating map_values(vehicle_traits.vehicle_traits) at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>  at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) 
> ... 9 more Caused by: java.lang.NullPointerException at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>  at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>  ... 15 more 
> {code}
> It appears that null is not properly handled in the 
> GenericUDFMapValues.evaluate() method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435959#comment-15435959
 ] 

Sergey Shelukhin commented on HIVE-14462:
-

+1

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15436052#comment-15436052
 ] 

Siddharth Seth commented on HIVE-14589:
---

[~sershe] - could you provide a brief description of the change, please? It 
makes the review a little easier.

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14625:
--
Status: Patch Available  (was: Open)

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14623) add CREATE TABLE FROM FILE command for self-describing formats

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435752#comment-15435752
 ] 

Sergey Shelukhin commented on HIVE-14623:
-

Whatever works, as long as the table is created ;)

> add CREATE TABLE FROM FILE command for self-describing formats
> --
>
> Key: HIVE-14623
> URL: https://issues.apache.org/jira/browse/HIVE-14623
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> For self-describing formats like ORC, it should be possible to create a table 
> from a file without explicitly specifying the schema. It would be useful for 
> debugging, but also for all kinds of ad-hoc activities with data (and I bet 
> someone will also use it for ETL, sadly ;)). 
> The schema should be established in metastore as the final schema for the 
> table; it should not be an attached+derived schema, like for Avro.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14626) Support Trash in Truncate Table

2016-08-24 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15435944#comment-15435944
 ] 

Chaoyu Tang commented on HIVE-14626:


Patch has been uploaded to https://reviews.apache.org/r/51395/ and a review has 
been requested. Thanks in advance.

> Support Trash in Truncate Table
> ---
>
> Key: HIVE-14626
> URL: https://issues.apache.org/jira/browse/HIVE-14626
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-14626.patch
>
>
> Currently Truncate Table (or Partition) is implemented using 
> FileSystem.delete and then recreating the directory, so
> 1. it does not support HDFS Trash
> 2. if the table/partition directory is initially encryption protected, after 
> being deleted and recreated, it is no longer protected.
> The new implementation is to clean the contents of the directory using 
> multi-threaded trashFiles. If Trash is enabled and has a lower encryption 
> level than the data directory, the files under it will be deleted. Otherwise, 
> they will be Trashed.





[jira] [Updated] (HIVE-14536) Unit test code cleanup

2016-08-24 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14536:
--
Attachment: HIVE-14536.patch

Removed wildcard import

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14536.patch
>
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code.





[jira] [Updated] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14625:
--
Attachment: HIVE-14625.03.patch

Minor fix with the stopwatch.

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, 
> HIVE-14625.03.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Updated] (HIVE-14536) Unit test code cleanup

2016-08-24 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14536:
--
Status: Patch Available  (was: Open)

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14536.patch
>
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code.





[jira] [Commented] (HIVE-14561) Minor ptest2 improvements

2016-08-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435992#comment-15435992
 ] 

Prasanth Jayachandran commented on HIVE-14561:
--

lgtm, +1

> Minor ptest2 improvements
> -
>
> Key: HIVE-14561
> URL: https://issues.apache.org/jira/browse/HIVE-14561
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14561.01.patch
>
>
> Re-purposed to track a few more improvements.
> - Update spring framework to work with Java8
> - Change elapseTime logging to milliseconds from seconds
> - Add thread name to log files.
> - Allow an empty logsEndPoint if outputDir is not specified
> - Log configuration when starting in a web server
> - Allow tests to be run even if no qtests property is set
> - Fix an exception on test completion when using FixedExecutionContextProvider





[jira] [Updated] (HIVE-14626) Support Trash in Truncate Table

2016-08-24 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14626:
---
Status: Patch Available  (was: Open)

> Support Trash in Truncate Table
> ---
>
> Key: HIVE-14626
> URL: https://issues.apache.org/jira/browse/HIVE-14626
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-14626.patch
>
>
> Currently Truncate Table (or Partition) is implemented using 
> FileSystem.delete and then recreating the directory, so
> 1. it does not support HDFS Trash
> 2. if the table/partition directory is initially encryption protected, after 
> being deleted and recreated, it is no longer protected.
> The new implementation is to clean the contents of the directory using 
> multi-threaded trashFiles. If Trash is enabled and has a lower encryption 
> level than the data directory, the files under it will be deleted. Otherwise, 
> they will be Trashed.





[jira] [Updated] (HIVE-14626) Support Trash in Truncate Table

2016-08-24 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-14626:
---
Attachment: HIVE-14626.patch

> Support Trash in Truncate Table
> ---
>
> Key: HIVE-14626
> URL: https://issues.apache.org/jira/browse/HIVE-14626
> Project: Hive
>  Issue Type: Sub-task
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>Priority: Minor
> Attachments: HIVE-14626.patch
>
>
> Currently Truncate Table (or Partition) is implemented using 
> FileSystem.delete and then recreating the directory, so
> 1. it does not support HDFS Trash
> 2. if the table/partition directory is initially encryption protected, after 
> being deleted and recreated, it is no longer protected.
> The new implementation is to clean the contents of the directory using 
> multi-threaded trashFiles. If Trash is enabled and has a lower encryption 
> level than the data directory, the files under it will be deleted. Otherwise, 
> they will be Trashed.





[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435929#comment-15435929
 ] 

Rajesh Balamohan commented on HIVE-14462:
-

Thanks [~sershe]. Addressed in the recent patch.

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch
>
>






[jira] [Commented] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435967#comment-15435967
 ] 

Prasanth Jayachandran commented on HIVE-14625:
--

+1

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, 
> HIVE-14625.03.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Updated] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14621:

Attachment: HIVE-14621.01.patch

Impl is still needed for the other interfaces that it implements, since we pass it on 
to other objects. Renamed the call to startThreads for clarity... do you think 
we should start threads in the ctor?

As for tests, we don't have any for this mode now. IO is initialized during 
driver init, so we'd need a separate CliDriver.

Btw, I was going to combine the interfaces for buffermanager and cache, since 
in both cases one object is used for both, but I couldn't come up with a name 
better than "CacheAndBufferManager", so I didn't do that.

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.01.patch, HIVE-14621.patch
>
>
> When IO elevator is enabled, but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable 
> cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator w/no cache) meanwhile cause it's pointless and 
> just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}





[jira] [Commented] (HIVE-10485) Create md5 UDF

2016-08-24 Thread Krishna Anisetty (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435707#comment-15435707
 ] 

Krishna Anisetty commented on HIVE-10485:
-

We are using Hive 1.1.0 and don't have any plans to upgrade to 2.0.0. But is 
there any standalone way to just install this function, maybe as a UDF?
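Not speaking for the patch authors, but Hive has supported permanent functions loaded from a jar since 0.13, so on 1.1.0 a standalone md5 implementation could be registered as a custom UDF. The jar path and class name below are placeholders for your own build, not artifacts from this issue:

{code}
-- jar path and class name are placeholders; point them at your own UDF build
CREATE FUNCTION md5 AS 'com.example.udf.Md5UDF'
  USING JAR 'hdfs:///user/hive/udfs/md5-udf.jar';

SELECT md5('udf_md5');
{code}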

> Create md5 UDF
> --
>
> Key: HIVE-10485
> URL: https://issues.apache.org/jira/browse/HIVE-10485
> Project: Hive
>  Issue Type: Task
>  Components: UDF
>Reporter: Alexander Pivovarov
>Assignee: Alexander Pivovarov
> Fix For: 1.3.0, 2.0.0
>
> Attachments: HIVE-10485.1.patch, HIVE-10485.2.patch, 
> HIVE-10485.3.patch
>
>
> MD5(str)
> Calculates an MD5 128-bit checksum for the string. The value is returned as a 
> string of 32 hex digits, or NULL if the argument was NULL. The return value 
> can, for example, be used as a hash key.
> Example:
> {code}
> SELECT MD5('udf_md5');
> 'ce62ef0d2d27dc37b6d488b92f4b24fd'
> {code}
> online md5 generator: http://www.md5.cz/
> MySQL has md5 function: 
> https://dev.mysql.com/doc/refman/5.5/en/encryption-functions.html#function_md5
> PostgreSQL also has md5 function: 
> http://www.postgresql.org/docs/9.1/static/functions-string.html





[jira] [Updated] (HIVE-14612) org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure

2016-08-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14612:
--
Parent Issue: HIVE-14547  (was: HIVE-13503)

> org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
>  failure
> 
>
> Key: HIVE-14612
> URL: https://issues.apache.org/jira/browse/HIVE-14612
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14612.1.patch
>
>
> Failing for some time





[jira] [Commented] (HIVE-13403) Make Streaming API not create empty buckets (at least as an option)

2016-08-24 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435934#comment-15435934
 ] 

Wei Zheng commented on HIVE-13403:
--

The test failure for TestOperationLoggingLayout.testSwitchLogLayout is unrelated 
and will be fixed by HIVE-14612.

> Make Streaming API not create empty buckets (at least as an option)
> ---
>
> Key: HIVE-13403
> URL: https://issues.apache.org/jira/browse/HIVE-13403
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, 
> HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch
>
>
> as of HIVE-11983, when a TransactionBatch is opened in StreamingAPI, a full 
> complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is 
> created on disk even though some may end up receiving no data.
> It would be better to create them on demand and not clog the FS.
> Tez can handle missing (empty) buckets and on MR bucket join algorithms will 
> check if all buckets are there and bail out if not.  





[jira] [Updated] (HIVE-14536) Unit test code cleanup

2016-08-24 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14536:
--
Attachment: (was: HIVE-14536.patch)

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code.





[jira] [Updated] (HIVE-14536) Unit test code cleanup

2016-08-24 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14536:
--
Attachment: HIVE-14536.patch

I have tested them on my machine (at least 20 runs for every Driver); the results 
seem consistent.

Adding it here first to validate every query.

> Unit test code cleanup
> --
>
> Key: HIVE-14536
> URL: https://issues.apache.org/jira/browse/HIVE-14536
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14536.patch
>
>
> Clean up the itest infrastructure to create readable, easy-to-understand 
> code.





[jira] [Commented] (HIVE-14612) org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout failure

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435734#comment-15435734
 ] 

Hive QA commented on HIVE-14612:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825320/HIVE-14612.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10459 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/975/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/975/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-975/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825320 - PreCommit-HIVE-MASTER-Build

> org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
>  failure
> 
>
> Key: HIVE-14612
> URL: https://issues.apache.org/jira/browse/HIVE-14612
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Hari Sankar Sivarama Subramaniyan
>Assignee: Hari Sankar Sivarama Subramaniyan
> Attachments: HIVE-14612.1.patch
>
>
> Failing for some time





[jira] [Commented] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435788#comment-15435788
 ] 

Prasanth Jayachandran commented on HIVE-14625:
--

Setting PerfLogger to INFO will cause failures for operation logging tests.
https://github.com/apache/hive/blob/master/itests/hive-unit/src/test/java/org/apache/hive/service/cli/operation/TestOperationLoggingAPIWithMr.java

Why do we want to change the perf logger level to INFO? I don't think it 
contributes a huge percentage of the logging. I have seen log lines from blockreaders 
that are more common than perflogger. I think we should set the log level for the hive 
package to DEBUG and all others to INFO.

Also, junit has a RunListener that we can implement for logging, computing time, etc. I 
guess that would be cleaner?

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Commented] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435890#comment-15435890
 ] 

Siddharth Seth commented on HIVE-14625:
---

Saw a lot of PerfLogger noise while debugging. I'm fine leaving it at debug if 
it's useful. At some point, the logger can be enabled for the specific test 
that will fail. Removing any log changes for now.

bq. Also, junit has a RunListener that we can implement for logging, computing time, 
etc. I guess that would be cleaner?
RunListener requires a custom test runner. It also does not provide hooks for 
individual sections.

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Updated] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14621:

Status: Patch Available  (was: Open)

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.patch
>
>
> When IO elevator is enabled, but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable 
> cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator w/no cache) meanwhile cause it's pointless and 
> just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}





[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435970#comment-15435970
 ] 

Ashutosh Chauhan commented on HIVE-14418:
-

You can also reset a specific config, e.g. {{reset 
hive.auto.convert.join.noconditionaltask;}}. How is unset different from that?

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.





[jira] [Commented] (HIVE-14609) HS2 cannot drop a function whose associated jar file has been removed

2016-08-24 Thread Yibing Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435978#comment-15435978
 ] 

Yibing Shi commented on HIVE-14609:
---

To drop a function, Hive first gets the function definition:
https://github.com/cloudera/hive/blob/cdh5-1.1.0_5.8.0/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java#L99
{code}
FunctionInfo info = FunctionRegistry.getFunctionInfo(functionName);
if (info == null) {
  if (throwException) {
throw new 
SemanticException(ErrorMsg.INVALID_FUNCTION.getMsg(functionName));
  } else {
// Fail silently
return;
  }
} else if (info.isBuiltIn()) {
  throw new 
SemanticException(ErrorMsg.DROP_NATIVE_FUNCTION.getMsg(functionName));
}
{code}

Unfortunately {{FunctionRegistry.getFunctionInfo}} tries to load the function 
into the registry after getting its definition, which includes the step of downloading 
jars and causes the failure. We should be able to fix this by adding a 
parameter to the getFunctionInfo method to control whether to add the function 
to the registry.

As for why Hive fails silently: "hive.exec.drop.ignorenonexistent" is set to 
true by default, so Hive doesn't throw any exception when the failure happens.

> HS2 cannot drop a function whose associated jar file has been removed
> -
>
> Key: HIVE-14609
> URL: https://issues.apache.org/jira/browse/HIVE-14609
> Project: Hive
>  Issue Type: Bug
>Reporter: Yibing Shi
>Assignee: Chaoyu Tang
>
> Create a permanent function with below command:
> {code:sql}
> create function yshi.dummy as 'com.yshi.hive.udf.DummyUDF' using jar 
> 'hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar';
> {code}
> After that, delete the HDFS file 
> {{hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar}}, and 
> *restart HS2 to remove the loaded class*.
> Now the function cannot be dropped:
> {noformat}
> 0: jdbc:hive2://10.17.81.144:1/default> show functions yshi.dummy;
> INFO  : Compiling 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded): 
> show functions yshi.dummy
> INFO  : Semantic Analysis Completed
> INFO  : Returning Hive schema: 
> Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from 
> deserializer)], properties:null)
> INFO  : Completed compiling 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded); 
> Time taken: 1.259 seconds
> INFO  : Executing 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded): 
> show functions yshi.dummy
> INFO  : Starting task [Stage-0:DDL] in serial mode
> INFO  : SHOW FUNCTIONS is deprecated, please use SHOW FUNCTIONS LIKE instead.
> INFO  : Completed executing 
> command(queryId=hive_20160821213434_d0271d77-84d8-45ba-8d92-3da1c143bded); 
> Time taken: 0.024 seconds
> INFO  : OK
> +-+--+
> |  tab_name   |
> +-+--+
> | yshi.dummy  |
> +-+--+
> 1 row selected (3.877 seconds)
> 0: jdbc:hive2://10.17.81.144:1/default> drop function yshi.dummy;
> INFO  : Compiling 
> command(queryId=hive_20160821213434_47d14df5-59b3-4ebc-9a48-5e1d9c60c1fc): 
> drop function yshi.dummy
> INFO  : converting to local 
> hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar
> ERROR : Failed to read external resource 
> hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar
> java.lang.RuntimeException: Failed to read external resource 
> hdfs://host-10-17-81-142.coe.cloudera.com:8020/hive/jars/yshi.jar
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1200)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1136)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1126)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.addFunctionResources(FunctionTask.java:304)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:470)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.getQualifiedFunctionInfo(Registry.java:456)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.getFunctionInfo(Registry.java:245)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:455)
>   at 
> org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeDropFunction(FunctionSemanticAnalyzer.java:99)
>   at 
> org.apache.hadoop.hive.ql.parse.FunctionSemanticAnalyzer.analyzeInternal(FunctionSemanticAnalyzer.java:61)
>   at 
> 

[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060
 ] 

Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:02 AM:
--

This basically creates the nodes in ZK for "slots" in the cluster. The nodes 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as a node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting, without knowing 
about each other, the predecessor's location (if restarted in a different 
place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay the 3rd and retain cache locality.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 
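The lowest-available-slot scheme described above can be modeled in a few lines. This is a simplified single-process sketch; the actual implementation coordinates slot claims through ZK nodes and must handle races between concurrently joining workers:

```python
def lowest_free_slot(taken):
    """Return the lowest slot number not currently taken.

    Models a node joining the cluster: scan upward from 0 and pick the
    first unused slot, so surviving nodes keep their positions while
    restarted nodes fill the gaps left by departed ones.
    """
    slot = 0
    while slot in taken:
        slot += 1
    return slot


def join(taken):
    """Claim the lowest free slot, recording it in the taken set."""
    slot = lowest_free_slot(taken)
    taken.add(slot)
    return slot
```

For example, with slots 0-3 occupied, killing the nodes in slots 0, 1, and 3 and restarting them hands the newcomers slots 0, 1, and 3 back (in whatever order they arrive), while the surviving node keeps slot 2 and its cache locality.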


was (Author: sershe):
This basically creates the nodes in ZK for "slots" in the cluster. The nodes 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sort by 
the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting, without knowing 
about each other, their predecessors location, or the total count of nodes in 
the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it doesn't matter as much 
because they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay 3rd and retain cache locality.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060
 ] 

Sergey Shelukhin edited comment on HIVE-14589 at 8/25/16 1:03 AM:
--

This basically creates the nodes in ZK for "slots" in the cluster. The nodes 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting; the nodes need 
no knowledge of each other, of their predecessors' locations (if restarted in a 
different place), or of the total number of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay the 3rd and retain cache locality.

This also handles size increase, as new nodes will always be added to the end 
of the sequence, which is what consistent hashing needs.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 


was (Author: sershe):
This basically creates the nodes in ZK for "slots" in the cluster. The nodes 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting, without knowing 
about each other, the predecessors location (if restarted in a different 
place), or the total count of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it shouldn't matter because 
they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay the 3rd and retain cache locality.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574





[jira] [Commented] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435973#comment-15435973
 ] 

Hive QA commented on HIVE-14625:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825356/HIVE-14625.02.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 7117 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestContribCliDriver.org.apache.hadoop.hive.cli.TestContribCliDriver
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.org.apache.hadoop.hive.cli.TestMiniLlapCliDriver
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.org.apache.hadoop.hive.cli.TestMiniTezCliDriver
org.apache.hadoop.hive.cli.TestMinimrCliDriver.org.apache.hadoop.hive.cli.TestMinimrCliDriver
org.apache.hadoop.hive.cli.TestSparkCliDriver.org.apache.hadoop.hive.cli.TestSparkCliDriver
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/977/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/977/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-977/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825356 - PreCommit-HIVE-MASTER-Build

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, 
> HIVE-14625.03.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Updated] (HIVE-14627) Improvements to MiniMr tests

2016-08-24 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-14627:
-
Description: 
Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr, and here is 
the execution time breakdown:

Total time - 13m59s
Junit reported time for testcase - 50s
Most of the time is spent in creating/loading/analyzing initial tables - ~12m
Cleanup - ~1m

There is a huge overhead for running MiniMr tests compared to the actual test 
runtime. 

Ran the same test without init script.
Total time - 2m17s
Junit reported time for testcase - 52s

Also, I noticed some tests that don't have to run on MiniMr (like udf_using.q, 
which just reads/writes to HDFS, something we can do in MiniTez/MiniLlap, which 
are way faster). Most tests access only a few initial tables to read a few rows 
from them. We can fix those tests to load just the tables that are required for 
the test instead of all initial tables. We can also remove the q_init_script.sql 
initialization for MiniMr after rewriting and moving over the unwanted tests, 
which should cut down the runtime a lot.  


  was:
Currently MiniMr is extremely slow, I ran udf_using.q on MiniMr and following 
are the execution time breakdown

Total time - 13m59s
Junit reported time for testcase - 50s
Most of the time is spent in creating/loading/analyzing initial tables - ~12m
Cleanup - ~1m

There is huge overhead for running MiniMr tests when compared to the actual 
test runtime. 

Also I noticed some tests that doesn't have to run on MiniMr (like udf_using.q 
that does not require MiniMr. It just reads/write to hdfs which we can do in 
MiniTez/MiniLlap which are way faster). Most tests access only very few initial 
tables to read few rows from it. We can fix those tests to load just the table 
that is required for the table instead of all initial tables. Also we can 
remove q_init_script.sql initialization for MiniMr after rewriting and moving 
over the unwanted tests which should cut down the runtime a lot.  



> Improvements to MiniMr tests
> 
>
> Key: HIVE-14627
> URL: https://issues.apache.org/jira/browse/HIVE-14627
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> Currently MiniMr is extremely slow. I ran udf_using.q on MiniMr, and here is 
> the execution time breakdown:
> Total time - 13m59s
> Junit reported time for testcase - 50s
> Most of the time is spent in creating/loading/analyzing initial tables - ~12m
> Cleanup - ~1m
> There is a huge overhead for running MiniMr tests compared to the actual 
> test runtime. 
> Ran the same test without init script.
> Total time - 2m17s
> Junit reported time for testcase - 52s
> Also, I noticed some tests that don't have to run on MiniMr (like 
> udf_using.q, which just reads/writes to HDFS, something we can do in 
> MiniTez/MiniLlap, which are way faster). Most tests access only a few initial 
> tables to read a few rows from them. We can fix those tests to load just the 
> tables that are required for the test instead of all initial tables. We can 
> also remove the q_init_script.sql initialization for MiniMr after rewriting 
> and moving over the unwanted tests, which should cut down the runtime a lot.  





[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-24 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435999#comment-15435999
 ] 

Ashutosh Chauhan commented on HIVE-14418:
-

How is removing an override different from setting it to the default value?

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.01.patch

Rebased the patch. [~sseth] [~prasanth_j] ping? ;)

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574





[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436008#comment-15436008
 ] 

Sergey Shelukhin commented on HIVE-14418:
-

Overrides are the ones specified in system properties, on the command line, and 
via set... commands. The things set in configuration files (and via whatever 
other means there may be) still stay. Do you think this should instead add some 
argument to the reset command?
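As a hedged illustration of the distinction being drawn (not Hive's actual configuration code; the property name and values are made up for the example), a layered config where unsetting removes only the session override and exposes whatever the config files set:

```python
from collections import ChainMap

# Hypothetical values loaded from configuration files.
file_conf = {"hive.exec.parallel": "true"}
# Overrides set via system properties, the command line, or set commands.
overrides = {}

# ChainMap consults the override layer first, then the file layer.
conf = ChainMap(overrides, file_conf)

overrides["hive.exec.parallel"] = "false"    # session-level override
assert conf["hive.exec.parallel"] == "false"

del overrides["hive.exec.parallel"]          # "unset": drop the override only
assert conf["hive.exec.parallel"] == "true"  # the file-configured value reappears
```

This is why unsetting differs from setting a value to the compiled-in default: after unsetting, the file-configured value wins, which may not be the default at all.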

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.





[jira] [Commented] (HIVE-14617) NPE in UDF MapValues() if input is null

2016-08-24 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436010#comment-15436010
 ] 

Chao Sun commented on HIVE-14617:
-

+1 LGTM
nit: add parentheses on line 64 of GenericUDFMapValues.java and remove the 
leading whitespace on line 49 of TestGenericUDFMapValues.java.

> NPE in UDF MapValues() if input is null
> ---
>
> Key: HIVE-14617
> URL: https://issues.apache.org/jira/browse/HIVE-14617
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-14617.patch
>
>
> For query
> {code}
> select exploded_traits from hdrone.vehiclestore_udr_vehicle 
> lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as 
> exploded_traits 
> where datestr > '2016-08-22' LIMIT 100
> {code}
> Job fails with error msg as follows:
> {code}
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at 
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) 
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... 
> 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating map_values(vehicle_traits.vehicle_traits) at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>  at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) 
> ... 9 more Caused by: java.lang.NullPointerException at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>  at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>  ... 15 more 
> {code}
> It appears that null is not properly handled in 
> GenericUDFMapValues.evaluate() method.
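The actual fix is in Hive's Java UDF; the null handling it needs can be sketched as follows (an illustrative Python stand-in, not the real patch):

```python
def map_values(m):
    """Return the values of a map, or None for a null input.

    Mirrors the semantics a null-safe map_values() UDF needs: a null
    map must yield null rather than raising, so downstream operators
    (such as a lateral-view explode) can skip the row.
    """
    if m is None:
        return None
    return list(m.values())

assert map_values(None) is None              # null in, null out -- no NPE
assert map_values({"a": 1, "b": 2}) == [1, 2]
```

The stack trace in the description shows the failure at the point where the map's values are extracted, which is exactly where the null check belongs.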





[jira] [Commented] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436060#comment-15436060
 ] 

Sergey Shelukhin commented on HIVE-14589:
-

This basically creates the nodes in ZK for "slots" in the cluster. The nodes 
try to take the lowest available slot, starting from 0. Unlike worker-... 
nodes, the slots are reused, which is the intent. The nodes are always sorted 
by the slot number for splits.
The idea is that as long as the node is running, it will retain the same 
position in the ordering, regardless of other nodes restarting; the nodes need 
no knowledge of each other, of their predecessors' locations, or of the total 
number of nodes in the cluster. 
The restarting nodes may not take the same positions as their predecessors 
(i.e. if two nodes restart they can swap slots) but it doesn't matter as much 
because they have lost their cache anyway.
I.e. if you have nodes 1-2-3-4 and I nuke and restart 1, 2, and 4, they will 
take whatever spots, but 3 will stay 3rd and retain cache locality.

One case it doesn't handle is permanent cluster size reduction. There will be a 
permanent gap if nodes are removed that have the slots in the middle; until 
some nodes restart, it will result in misses. 

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574





[jira] [Commented] (HIVE-14623) add CREATE TABLE FROM FILE command for self-describing formats

2016-08-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435686#comment-15435686
 ] 

Gopal V commented on HIVE-14623:


Why not jump on "IMPORT" instead of CREATE?

> add CREATE TABLE FROM FILE command for self-describing formats
> --
>
> Key: HIVE-14623
> URL: https://issues.apache.org/jira/browse/HIVE-14623
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> For self-describing formats like ORC, it should be possible to create a table 
> from a file without explicitly specifying the schema. It would be useful for 
> debugging, but also for all kinds of ad-hoc activities with data (and I bet 
> someone will also use it for ETL, sadly ;)). 
> The schema should be established in metastore as the final schema for the 
> table; it should not be an attached+derived schema, like for Avro.





[jira] [Comment Edited] (HIVE-14623) add CREATE TABLE FROM FILE command for self-describing formats

2016-08-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435686#comment-15435686
 ] 

Gopal V edited comment on HIVE-14623 at 8/24/16 9:06 PM:
-

Why not jump on "IMPORT" instead of CREATE?

Instead of reading the _metadata/ folder, it could go to the self-describing 
input format.


was (Author: gopalv):
Why not jump on "IMPORT" instead of CREATE?

> add CREATE TABLE FROM FILE command for self-describing formats
> --
>
> Key: HIVE-14623
> URL: https://issues.apache.org/jira/browse/HIVE-14623
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>
> For self-describing formats like ORC, it should be possible to create a table 
> from a file without explicitly specifying the schema. It would be useful for 
> debugging, but also for all kinds of ad-hoc activities with data (and I bet 
> someone will also use it for ETL, sadly ;)). 
> The schema should be established in metastore as the final schema for the 
> table; it should not be an attached+derived schema, like for Avro.





[jira] [Updated] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14625:
--
Attachment: HIVE-14625.01.patch

Patch to address the 3 items mentioned in the description.

[~prasanth_j] - could you please review.

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Commented] (HIVE-14437) Vectorization: Optimize key misses in VectorMapJoinFastBytesHashTable

2016-08-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435769#comment-15435769
 ] 

Matt McCline commented on HIVE-14437:
-

+1 LGTM.

> Vectorization: Optimize key misses in VectorMapJoinFastBytesHashTable
> -
>
> Key: HIVE-14437
> URL: https://issues.apache.org/jira/browse/HIVE-14437
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 2.2.0
>Reporter: Gopal V
>Assignee: Gopal V
> Attachments: HIVE-14437.1.patch
>
>
> Currently, the lookup in VectorMapJoinFastBytesHashTable proceeds until the 
> max number of metric put conflicts have been reached.
> This can have a fast-exit when encountering the first empty slot during the 
> probe, to speed up looking for non-existent keys.
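The proposed fast-exit can be sketched with a toy open-addressing table (illustrative only, not the actual VectorMapJoinFastBytesHashTable code; `probe` and `insert` are hypothetical helpers):

```python
EMPTY = None

def insert(table, key, value):
    """Linear-probe insert into a fixed-size open-addressing table."""
    n = len(table)
    i = hash(key) % n
    for _ in range(n):
        if table[i] is EMPTY or table[i][0] == key:
            table[i] = (key, value)
            return
        i = (i + 1) % n          # put conflict: move to the next slot
    raise RuntimeError("table full")

def probe(table, key):
    """Lookup with fast-exit on the first empty slot.

    Because insertion never probes past an empty slot, hitting one
    during lookup proves the key is absent -- there is no need to keep
    probing up to the maximum number of put conflicts.
    """
    n = len(table)
    i = hash(key) % n
    for _ in range(n):
        entry = table[i]
        if entry is EMPTY:
            return None          # fast exit: key cannot be present
        if entry[0] == key:
            return entry[1]      # hit
        i = (i + 1) % n
    return None                  # table full and key absent

table = [EMPTY] * 8
insert(table, b"k1", 1)
insert(table, b"k2", 2)
assert probe(table, b"k1") == 1
assert probe(table, b"missing") is None  # stops at the first empty slot
```

The win is largest for non-existent keys in a sparse table, where the old behavior would keep probing long after the answer was already determined.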





[jira] [Updated] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14621:

Attachment: HIVE-14621.patch

The reason is that the code relies on the cache for the refcount increment; no 
cache means no refcount.
[~prasanth_j] can you take a look?

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.patch
>
>
> When IO elevator is enabled, but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable 
> cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator w/no cache) meanwhile cause it's pointless and 
> just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}





[jira] [Comment Edited] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435908#comment-15435908
 ] 

Sergey Shelukhin edited comment on HIVE-14621 at 8/24/16 11:08 PM:
---

The reason is that the code relies on cache for refcount increment; no cache 
means no refcount
[~prasanth_j] can you take a look?


was (Author: sershe):
The reason is that the code relies on cache for refcount increment; no cache 
means no refcount
[~prasanth_j] can you take a look?

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.patch
>
>
> When IO elevator is enabled, but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable 
> cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator w/no cache) meanwhile cause it's pointless and 
> just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}





[jira] [Updated] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14625:
--
Attachment: HIVE-14625.02.patch

Updated patch without the log changes.

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Updated] (HIVE-11667) Support Trash and Snapshot in Truncate Table

2016-08-24 Thread Chaoyu Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chaoyu Tang updated HIVE-11667:
---
  Priority: Major  (was: Minor)
Issue Type: Task  (was: Improvement)

Separated the Trash and Snapshot support into subtasks.

> Support Trash and Snapshot in Truncate Table
> 
>
> Key: HIVE-11667
> URL: https://issues.apache.org/jira/browse/HIVE-11667
> Project: Hive
>  Issue Type: Task
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
>
> Currently Truncate Table (or Partition) is implemented with FileSystem.delete 
> followed by recreating the directory. It does not honor HDFS Trash even when 
> Trash is turned on, and the table/partition cannot be truncated if it has a 
> snapshot.
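The Trash-friendly direction can be sketched filesystem-agnostically: move the directory aside instead of deleting it, then recreate it empty. The sketch below uses java.nio.file in place of the HDFS FileSystem/Trash API; the class and method names are illustrative, not Hive's actual implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TrashTruncateSketch {
    /**
     * Truncate a table/partition directory by moving it into a trash
     * directory instead of deleting it, then recreating the empty data
     * directory. Mirrors the idea of HDFS Trash support for TRUNCATE TABLE.
     */
    public static void truncate(Path dataDir, Path trashDir) throws IOException {
        Files.createDirectories(trashDir);
        // Move the whole directory into trash under a unique name, analogous
        // to what Hadoop's Trash.moveToAppropriateTrash would do on HDFS.
        Path target = trashDir.resolve(dataDir.getFileName() + "." + System.nanoTime());
        Files.move(dataDir, target);
        // Recreate the now-empty data directory for the truncated table.
        Files.createDirectories(dataDir);
    }
}
```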





[jira] [Commented] (HIVE-14572) Investigate jenkins test report timings

2016-08-24 Thread Zoltan Haindrich (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435924#comment-15435924
 ] 

Zoltan Haindrich commented on HIVE-14572:
-

Created an INFRA ticket.

> Investigate jenkins test report timings
> ---
>
> Key: HIVE-14572
> URL: https://issues.apache.org/jira/browse/HIVE-14572
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>
> [~sseth] has noticed some odd timings in the jenkins reports.
> I've created a sample project to emulate a clidriver run during qtest:
> the testclass:
> * 1 sec beforeclass
> * 3x 0.2s test
> created using junit4 parameterized.
> The project is checked out twice; the second project runs different tests... 
> or at least they have different names.
> here are my preliminary findings:
> || thing || expected || 2.16 || 2.19.1 ||
> | total time | ~3.4s | 1.2s | 3.4s |
> | package time | ~3.4s | 0.61s | 1.7s |
> | class time | ~3.4s | 0.61s | 1.7s |
> | testcase times | ~0.2s | ~0.2s | ~0.2s |
> notes:
> * using 2.16, beforeclass timings are totally hidden or lost
> * 2.19.1 does account for beforeclass, but still fails to correctly aggregate 
> the two runs of the similarly named testclasses
> it might be worth a try to look at the bleeding edge of this jenkins plugin...





[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435938#comment-15435938
 ] 

Sergey Shelukhin commented on HIVE-14418:
-

[~ashutoshc] ping?

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.
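One plausible fix shape, sketched below (this is simplified stand-in code, not the actual HiveConf implementation): treat an empty or "null" value as a request to drop the session override and fall back to the default, before type validation kicks in.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfUnsetSketch {
    private final Map<String, String> defaults = new HashMap<>();
    private final Map<String, String> overrides = new HashMap<>();

    public ConfUnsetSketch(Map<String, String> defaults) {
        this.defaults.putAll(defaults);
    }

    /** Set a FLOAT-typed variable, allowing "" or "null" to unset it. */
    public void setFloatVar(String name, String value) {
        // Empty or "null" means: remove the override and fall back to the
        // default, instead of failing FLOAT validation on an empty string.
        if (value == null || value.isEmpty() || value.equalsIgnoreCase("null")) {
            overrides.remove(name);
            return;
        }
        Float.parseFloat(value);  // validate only real values
        overrides.put(name, value);
    }

    public String get(String name) {
        return overrides.getOrDefault(name, defaults.get(name));
    }
}
```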





[jira] [Updated] (HIVE-13403) Make Streaming API not create empty buckets

2016-08-24 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13403:
-
Summary: Make Streaming API not create empty buckets  (was: Make Streaming 
API not create empty buckets (at least as an option))

> Make Streaming API not create empty buckets
> ---
>
> Key: HIVE-13403
> URL: https://issues.apache.org/jira/browse/HIVE-13403
> Project: Hive
>  Issue Type: Bug
>  Components: HCatalog, Transactions
>Affects Versions: 1.3.0
>Reporter: Eugene Koifman
>Assignee: Wei Zheng
>Priority: Critical
> Attachments: HIVE-13403.1.patch, HIVE-13403.2.patch, 
> HIVE-13403.3.patch, HIVE-13403.4.patch, HIVE-13403.5.patch
>
>
> As of HIVE-11983, when a TransactionBatch is opened in the Streaming API, a full 
> complement of bucket files (AbstractRecordWriter.createRecordUpdaters()) is 
> created on disk even though some may end up receiving no data.
> It would be better to create them on demand and not clog the FS.
> Tez can handle missing (empty) buckets; on MR, bucket join algorithms will 
> check whether all buckets are present and bail out if not.
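The on-demand approach can be sketched as follows (a simplified stand-in; AbstractRecordWriter's real record-updater logic differs): create a bucket's writer only when the first record for that bucket arrives, so buckets that receive no data never materialize a file.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LazyBucketWriters {
    /** Stand-in for a per-bucket RecordUpdater; just collects rows here. */
    static class BucketWriter {
        final List<String> rows = new ArrayList<>();
        void write(String row) { rows.add(row); }
    }

    private final Map<Integer, BucketWriter> writers = new HashMap<>();
    private final int numBuckets;

    public LazyBucketWriters(int numBuckets) { this.numBuckets = numBuckets; }

    /** Create the writer for a bucket only on first use. */
    public void write(String row) {
        int bucket = Math.floorMod(row.hashCode(), numBuckets);
        writers.computeIfAbsent(bucket, b -> new BucketWriter()).write(row);
    }

    /** Buckets that actually received data (files that would exist on disk). */
    public int openWriters() { return writers.size(); }
}
```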





[jira] [Updated] (HIVE-14574) use consistent hashing for LLAP consistent splits to alleviate impact from cluster changes

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14574:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!


> use consistent hashing for LLAP consistent splits to alleviate impact from 
> cluster changes
> --
>
> Key: HIVE-14574
> URL: https://issues.apache.org/jira/browse/HIVE-14574
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-14574.01.patch, HIVE-14574.02.patch, 
> HIVE-14574.patch
>
>
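The idea named in the title can be sketched with a hash ring (illustrative only; LLAP's actual split-location code differs): each node gets many virtual points on a ring, and a split is owned by the first point at or after its hash. When a node joins or leaves the cluster, only the splits adjacent to its points move, preserving cache locality for everything else.

```java
import java.util.SortedMap;
import java.util.TreeMap;

public class ConsistentHashSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>();
    private static final int REPLICAS = 64;  // virtual points per node

    private static int hash(String s) {
        // Any stable hash works; spread String.hashCode a bit.
        int h = s.hashCode();
        h ^= (h >>> 16);
        return h * 0x45d9f3b;
    }

    public void addNode(String node) {
        for (int i = 0; i < REPLICAS; i++) ring.put(hash(node + "#" + i), node);
    }

    public void removeNode(String node) {
        for (int i = 0; i < REPLICAS; i++) ring.remove(hash(node + "#" + i));
    }

    /** Pick the node owning this split: first ring point at/after its hash. */
    public String locate(String splitPath) {
        SortedMap<Integer, String> tail = ring.tailMap(hash(splitPath));
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }
}
```

With a naive hash(split) % nodeCount scheme, removing one node reassigns nearly every split; with the ring, only splits whose owner left are relocated.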






[jira] [Commented] (HIVE-14418) Hive config validation prevents unsetting the settings

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435981#comment-15435981
 ] 

Sergey Shelukhin commented on HIVE-14418:
-

reset only removes overrides in the session; unset sets the variable back to its default value.

> Hive config validation prevents unsetting the settings
> --
>
> Key: HIVE-14418
> URL: https://issues.apache.org/jira/browse/HIVE-14418
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14418.01.patch, HIVE-14418.02.patch, 
> HIVE-14418.03.patch, HIVE-14418.patch
>
>
> {noformat}
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> hive> set hive.tez.task.scale.memory.reserve.fraction.max=null;
> Query returned non-zero code: 1, cause: 'SET 
> hive.tez.task.scale.memory.reserve.fraction.max=null' FAILED because 
> hive.tez.task.scale.memory.reserve.fraction.max expects FLOAT type value.
> {noformat}
> unset also doesn't work.





[jira] [Commented] (HIVE-14617) NPE in UDF MapValues() if input is null

2016-08-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436117#comment-15436117
 ] 

Xuefu Zhang commented on HIVE-14617:


Thanks for the review, Chao. I will take care of those after the test results 
come back, but before committing to git.

> NPE in UDF MapValues() if input is null
> ---
>
> Key: HIVE-14617
> URL: https://issues.apache.org/jira/browse/HIVE-14617
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 2.1.0
>Reporter: Xuefu Zhang
>Assignee: Xuefu Zhang
> Attachments: HIVE-14617.patch
>
>
> For query
> {code}
> select exploded_traits from hdrone.vehiclestore_udr_vehicle 
> lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as 
> exploded_traits 
> where datestr > '2016-08-22' LIMIT 100
> {code}
> Job fails with error msg as follows:
> {code}
> Error: java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179) at 
> org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54) at 
> org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453) at 
> org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) at 
> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) at 
> java.security.AccessController.doPrivileged(Native Method) at 
> javax.security.auth.Subject.doAs(Subject.java:422) at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused by: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
> processing row 
> {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507) 
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170) ... 
> 8 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error 
> evaluating map_values(vehicle_traits.vehicle_traits) at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37)
>  at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815) at 
> org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
>  at 
> org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
>  at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497) 
> ... 9 more Caused by: java.lang.NullPointerException at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
>  at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
>  at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
>  ... 15 more 
> {code}
> It appears that null is not properly handled in the 
> GenericUDFMapValues.evaluate() method.
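The likely fix shape, sketched with a simplified evaluator (not the actual GenericUDFMapValues code): return SQL NULL when the input map is null for a row, instead of dereferencing it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class MapValuesSketch {
    /**
     * Simplified map_values(): a UDF must propagate SQL NULL rather than
     * throw an NPE when the input column is null for a given row.
     */
    public static List<Object> mapValues(Map<?, ?> input) {
        if (input == null) {
            return null;  // null in, null out -- the missing guard
        }
        return new ArrayList<>(input.values());
    }
}
```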





[jira] [Updated] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Ke Jia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ke Jia updated HIVE-13589:
--
Attachment: HIVE-13589.4.patch

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch, HIVE-13589.4.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.
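On the JVM, such a prompt would typically use Console.readPassword, which suppresses echo. A minimal sketch under that assumption (beeline's actual option handling is more involved than this):

```java
import java.io.Console;

public class PasswordPromptSketch {
    /**
     * Prompt for a password without echoing it. Returns an empty array when
     * no interactive console is attached (e.g. piped stdin), so callers can
     * fall back to other authentication mechanisms.
     */
    public static char[] promptPassword(Console console) {
        if (console == null) {
            return new char[0];
        }
        char[] pw = console.readPassword("Enter password: ");
        return pw != null ? pw : new char[0];
    }

    public static void main(String[] args) {
        char[] pw = promptPassword(System.console());
        // Zero the buffer after use so the password does not linger in memory.
        java.util.Arrays.fill(pw, '\0');
    }
}
```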





[jira] [Commented] (HIVE-14625) Minor qtest fixes

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436184#comment-15436184
 ] 

Hive QA commented on HIVE-14625:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825376/HIVE-14625.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10459 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/979/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/979/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-979/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825376 - PreCommit-HIVE-MASTER-Build

> Minor qtest fixes
> -
>
> Key: HIVE-14625
> URL: https://issues.apache.org/jira/browse/HIVE-14625
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
>Assignee: Siddharth Seth
> Attachments: HIVE-14625.01.patch, HIVE-14625.02.patch, 
> HIVE-14625.03.patch
>
>
> Log times for CoreCliDriver
> Exit early if cleanup and createsSources fails
> Turn PerfLogger off for ptests





[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds

2016-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14249:
---
Status: Patch Available  (was: In Progress)

> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10459.2.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds

2016-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14249:
---
Attachment: HIVE-14249.05.patch

> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10459.2.patch, HIVE-14249.05.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Work started] (HIVE-14249) Add simple materialized views with manual rebuilds

2016-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-14249 started by Jesus Camacho Rodriguez.
--
> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10459.2.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds

2016-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14249:
---
Status: Open  (was: Patch Available)

> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10459.2.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds

2016-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14249:
---
Attachment: (was: HIVE-14249.04.patch)

> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10459.2.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Updated] (HIVE-14249) Add simple materialized views with manual rebuilds

2016-08-24 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-14249:
---
Attachment: (was: HIVE-14249.03.patch)

> Add simple materialized views with manual rebuilds
> --
>
> Key: HIVE-14249
> URL: https://issues.apache.org/jira/browse/HIVE-14249
> Project: Hive
>  Issue Type: New Feature
>  Components: Materialized views, Parser
>Reporter: Alan Gates
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-10459.2.patch
>
>
> This patch is a start at implementing simple views. It doesn't have enough 
> testing yet (e.g. there's no negative testing). And I know it fails in the 
> partitioned case. I suspect things like security and locking don't work 
> properly yet either. But I'm posting it as a starting point.
> In this initial patch I'm just handling simple materialized views with manual 
> rebuilds. In later JIRAs we can add features such as allowing the optimizer 
> to rewrite queries to use materialized views rather than tables named in the 
> queries, giving the optimizer the ability to determine when a materialized 
> view is stale, etc.
> Also, I didn't rebase this patch against trunk after the migration from 
> svn->git so it may not apply cleanly.





[jira] [Commented] (HIVE-14398) import database.tablename from path error

2016-08-24 Thread Yechao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436126#comment-15436126
 ] 

Yechao Chen commented on HIVE-14398:


[~xuefuz]
Yes, I checked out the latest trunk; the code has changed a lot.
I tested it and it is already fixed. Thanks for your answer. Should I just 
cancel this patch, or do something else?

> import database.tablename from path error
> -
>
> Key: HIVE-14398
> URL: https://issues.apache.org/jira/browse/HIVE-14398
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.1.0
>Reporter: Yechao Chen
>Assignee: Yechao Chen
> Fix For: 1.1.0
>
> Attachments: HIVE-14398.1.patch
>
>
> hive>create table a(id int,name string);
> hive>export table a to '/tmp/a';
> hive> import table test.a from '/tmp/a';
> Copying data from hdfs://test:8020/tmp/a/data
> Loading data to table default.test.a
> Failed with exception Invalid table name default.test.a
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> The table name should be test.a, not default.test.a.
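The bug pattern here is prepending the current database to a name that is already qualified. A sketch of the intended resolution (illustrative only, not Hive's actual import/parser code):

```java
public class TableNameSketch {
    /**
     * Qualify a table reference: keep the explicit db when the name already
     * contains one, otherwise fall back to the current database. The bug was
     * producing "default.test.a" by unconditionally prepending "default".
     */
    public static String qualify(String name, String currentDb) {
        if (name.contains(".")) {
            return name;  // already db-qualified, leave untouched
        }
        return currentDb + "." + name;
    }
}
```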





[jira] [Commented] (HIVE-14233) Improve vectorization for ACID by eliminating row-by-row stitching

2016-08-24 Thread Saket Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436236#comment-15436236
 ] 

Saket Saurabh commented on HIVE-14233:
--

Thanks [~ekoifman] for the comments, working now on fixing them.

> Improve vectorization for ACID by eliminating row-by-row stitching
> --
>
> Key: HIVE-14233
> URL: https://issues.apache.org/jira/browse/HIVE-14233
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions, Vectorization
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14233.01.patch, HIVE-14233.02.patch, 
> HIVE-14233.03.patch, HIVE-14233.04.patch, HIVE-14233.05.patch, 
> HIVE-14233.06.patch, HIVE-14233.07.patch, HIVE-14233.08.patch, 
> HIVE-14233.09.patch
>
>
> This JIRA proposes to improve vectorization for ACID by eliminating 
> row-by-row stitching when reading back ACID files. In the current 
> implementation, a vectorized row batch is created by populating the batch one 
> row at a time, before the vectorized batch is passed up along the operator 
> pipeline. This row-by-row stitching limitation was because of the fact that 
> the ACID insert/update/delete events from various delta files needed to be 
> merged together before the actual version of a given row was found out. 
> HIVE-14035 has enabled us to break away from that limitation by splitting 
> ACID update events into a combination of delete+insert. In fact, it has now 
> enabled us to create splits on delta files.
> Building on top of HIVE-14035, this JIRA proposes to solve this earlier 
> bottleneck in the vectorized code path for ACID by now directly reading row 
> batches from the underlying ORC files and avoiding any stitching altogether. 
> Once a row batch is read from the split (which may be on a base/delta file), 
> the deleted rows will be found by cross-referencing them against a data 
> structure that will just keep track of deleted events (found in the 
> deleted_delta files). This will lead to a large performance gain when reading 
> ACID files in vectorized fashion, while enabling further optimizations in 
> future that can be done on top of that.
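The cross-referencing step described above can be sketched as a membership test against a set of deleted row keys while a batch passes through (simplified; the real delete-event structure is merge/interval based rather than a flat set):

```java
import java.util.Set;

public class DeleteFilterSketch {
    /**
     * Mark surviving rows of a vectorized batch by probing each row's id key
     * against the delete-delta set, instead of stitching rows one at a time
     * through a merge. Returns the new selected-in-use size.
     */
    public static int filterBatch(long[] rowIds, Set<Long> deletedRowIds,
                                  int[] selected) {
        int size = 0;
        for (int i = 0; i < rowIds.length; i++) {
            if (!deletedRowIds.contains(rowIds[i])) {
                selected[size++] = i;  // row survives the deletes
            }
        }
        return size;
    }
}
```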





[jira] [Assigned] (HIVE-14582) Add trunc(numeric) udf

2016-08-24 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam reassigned HIVE-14582:
---

Assignee: Chinna Rao Lalam

> Add trunc(numeric) udf
> --
>
> Key: HIVE-14582
> URL: https://issues.apache.org/jira/browse/HIVE-14582
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Ashutosh Chauhan
>Assignee: Chinna Rao Lalam
>
> https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions200.htm





[jira] [Commented] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436098#comment-15436098
 ] 

Hive QA commented on HIVE-14621:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825358/HIVE-14621.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10461 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.jdbc.TestJdbcWithMiniHS2.testAddJarConstructorUnCaching
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/978/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/978/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-978/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825358 - PreCommit-HIVE-MASTER-Build

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.01.patch, HIVE-14621.patch
>
>
> When the IO elevator is enabled but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable the 
> cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator with no cache) in the meantime, since it's pointless 
> and just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}





[jira] [Updated] (HIVE-14618) beeline fetch logging delays before query completion

2016-08-24 Thread Tao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tao Li updated HIVE-14618:
--
Attachment: HIVE-14618.2.patch

> beeline fetch logging delays before query completion
> 
>
> Key: HIVE-14618
> URL: https://issues.apache.org/jira/browse/HIVE-14618
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14618.1.patch, HIVE-14618.2.patch
>
>
> Beeline has a thread that fetches logs from HS2. However, it uses the same 
> HiveStatement object to also wait for query completion using a long-poll 
> (with a default interval of 5 seconds).
> The JDBC client has a lock around the Thrift API calls, so the getLogs API 
> blocks on the query-completion check, i.e. the logs are shown only every 5 
> seconds by default.
> cc [~vgumashta] [~gopalv] [~thejas]
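The contention described above can be sketched as follows. This is an illustrative model, not the actual HiveStatement code: {{transportLock}} stands in for the JDBC client's per-connection lock, {{pollStatus}} for the query-completion long-poll, and {{fetchLogsBlockedMillis}} for the log fetch that ends up waiting out the poll interval.

```java
import java.util.concurrent.CountDownLatch;

public class LogFetchContention {
    private final Object transportLock = new Object(); // one lock guards every thrift call
    private final CountDownLatch lockHeld = new CountDownLatch(1);

    // Stand-in for the completion check: long-polls while holding the lock.
    void pollStatus(long pollMillis) throws InterruptedException {
        synchronized (transportLock) {
            lockHeld.countDown();       // signal that the long-poll now owns the lock
            Thread.sleep(pollMillis);   // simulated poll interval (5s by default in beeline)
        }
    }

    // Stand-in for the log fetch: needs the same lock, so it waits out the poll.
    long fetchLogsBlockedMillis() throws InterruptedException {
        lockHeld.await();               // ensure the poller already holds the lock
        long start = System.nanoTime();
        synchronized (transportLock) { /* the thrift log-fetch call would run here */ }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        LogFetchContention c = new LogFetchContention();
        Thread poller = new Thread(() -> {
            try { c.pollStatus(500); } catch (InterruptedException ignored) { }
        });
        poller.start();
        System.out.println("log fetch blocked ~" + c.fetchLogsBlockedMillis() + " ms");
        poller.join();
    }
}
```

Fetching logs on a separate connection, or releasing the lock between poll iterations, would avoid this delay.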



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436153#comment-15436153
 ] 

Ferdinand Xu commented on HIVE-13589:
-

 [~vihangk1] Good idea. Let's do it this way. [~Jk_Self], thank you for your 
update. LGTM +1

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch, HIVE-13589.4.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14618) beeline fetch logging delays before query completion

2016-08-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436160#comment-15436160
 ] 

Gopal V commented on HIVE-14618:


LGTM - +1 tests pending.

> beeline fetch logging delays before query completion
> 
>
> Key: HIVE-14618
> URL: https://issues.apache.org/jira/browse/HIVE-14618
> Project: Hive
>  Issue Type: Bug
>Reporter: Tao Li
>Assignee: Tao Li
> Attachments: HIVE-14618.1.patch, HIVE-14618.2.patch
>
>
> Beeline has a thread that fetches logs from HS2. However, it uses the same 
> HiveStatement object to also wait for query completion using a long-poll 
> (with a default interval of 5 seconds).
> The JDBC client has a lock around the Thrift API calls, so the getLogs API 
> blocks on the query-completion check, i.e. the logs are shown only every 5 
> seconds by default.
> cc [~vgumashta] [~gopalv] [~thejas]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14628) Flaky tests: MiniYarnClusterDir in /tmp - TestPigHBaseStorageHandler

2016-08-24 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14628:
--
Attachment: hive.log

> Flaky tests: MiniYarnClusterDir in /tmp - TestPigHBaseStorageHandler
> 
>
> Key: HIVE-14628
> URL: https://issues.apache.org/jira/browse/HIVE-14628
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Siddharth Seth
> Attachments: hive.log
>
>
> Configure the MiniYarnCluster to work within a test specific directory, 
> instead of using /tmp
> https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/979/testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436271#comment-15436271
 ] 

Hive QA commented on HIVE-14462:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12825362/HIVE-14462.7.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10461 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.org.apache.hadoop.hive.cli.TestCliDriver
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_mapjoin]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[add_part_exist]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[create_view_partitioned]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ctas]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[partitions_json]
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_join_part_col_char]
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[acid_bucket_pruning]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3]
org.apache.hive.service.cli.operation.TestOperationLoggingLayout.testSwitchLogLayout
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/980/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/980/console
Test logs: 
http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-980/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12825362 - PreCommit-HIVE-MASTER-Build

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch, HIVE-14462.7.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14398) import database.tablename from path error

2016-08-24 Thread Yechao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436126#comment-15436126
 ] 

Yechao Chen edited comment on HIVE-14398 at 8/25/16 2:06 AM:
-

[~xuefuz]
Yes, I checked out the latest trunk; the code has changed a lot.
I tested it, and it is already fixed. Thanks for your answer. Should I just 
cancel this patch, or do something else?


was (Author: chenyechao):
[~xuefuz]
Yes,I checkout the latest trunk,the code has change a lot;
I test it, It is already fixed.thanks for your answer,so i just cancel this 
patch or else?

> import database.tablename from path error
> -
>
> Key: HIVE-14398
> URL: https://issues.apache.org/jira/browse/HIVE-14398
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.1.0
>Reporter: Yechao Chen
>Assignee: Yechao Chen
> Fix For: 1.1.0
>
> Attachments: HIVE-14398.1.patch
>
>
> hive>create table a(id int,name string);
> hive>export table a to '/tmp/a';
> hive> import table test.a from '/tmp/a';
> Copying data from hdfs://test:8020/tmp/a/data
> Loading data to table default.test.a
> Failed with exception Invalid table name default.test.a
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> tablename  should be test.a not default.test.a 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14614) Insert overwrite local directory fails with IllegalStateException

2016-08-24 Thread Mohit Sabharwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436106#comment-15436106
 ] 

Mohit Sabharwal commented on HIVE-14614:


Nit: I'd just define a new HADOOP_LOCAL_FS_SCHEME and do
{code}
+isLocal = path.toUri().getScheme().equals(HADOOP_LOCAL_FS_SCHEME);
{code}
for readability.

Otherwise LGTM pending tests. [~spena] should take a look since he worked on 
HIVE-14270. 
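The nit above can be sketched as follows. HADOOP_LOCAL_FS_SCHEME is the hypothetical constant proposed in the review, not an existing Hive field, and a null-scheme guard is added here since schemeless paths are common:

```java
import java.net.URI;

public class LocalFsCheck {
    // Hypothetical constant from the review comment; "file" is the
    // standard scheme Hadoop's LocalFileSystem reports.
    static final String HADOOP_LOCAL_FS_SCHEME = "file";

    static boolean isLocal(URI uri) {
        // equalsIgnoreCase on the constant also guards against a null scheme
        return HADOOP_LOCAL_FS_SCHEME.equalsIgnoreCase(uri.getScheme());
    }

    public static void main(String[] args) {
        System.out.println(isLocal(URI.create("file:///tmp/out")));    // true
        System.out.println(isLocal(URI.create("hdfs://nn:8020/tmp"))); // false
    }
}
```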

> Insert overwrite local directory fails with IllegalStateException
> -
>
> Key: HIVE-14614
> URL: https://issues.apache.org/jira/browse/HIVE-14614
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14614.2.patch
>
>
> insert overwrite local directory  select * from table; fails with 
> "java.lang.IllegalStateException: Cannot create staging directory" when the 
> path sent to the getTempDirForPath(Path path)  is a local fs path.
> This is a regression caused by the fix for HIVE-14270



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14621) LLAP: memory.mode = none has NPE

2016-08-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436111#comment-15436111
 ] 

Sergey Shelukhin commented on HIVE-14621:
-

Test failures appear unrelated (mostly the usual stats/hashtable order changes)

> LLAP: memory.mode = none has NPE
> 
>
> Key: HIVE-14621
> URL: https://issues.apache.org/jira/browse/HIVE-14621
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14621.01.patch, HIVE-14621.patch
>
>
> When IO elevator is enabled, but cache and allocator are both disabled, NPEs 
> happen. It's not really a recommended mode, but it's the only way to disable 
> cache, so we probably need to fix it. I am also going to nuke the 
> intermediate mode (allocator w/no cache) meanwhile cause it's pointless and 
> just creates a zoo of configurations.
> {noformat}
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.llap.cache.LlapDataBuffer.getByteBufferDup(LlapDataBuffer.java:59)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createDiskRangeInfo(StreamUtils.java:63)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.StreamUtils.createSettableUncompressedStream(StreamUtils.java:48)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory$LongStreamReader$StreamReaderBuilder.build(EncodedTreeReaderFactory.java:514)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1737)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:162)
> at 
> org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:55)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:76)
> at 
> org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:30)
> at 
> org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:408)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:424)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:227)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:224)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:224)
> at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:93)
> ... 6 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14233) Improve vectorization for ACID by eliminating row-by-row stitching

2016-08-24 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436113#comment-15436113
 ] 

Eugene Koifman commented on HIVE-14233:
---

[~saketj] more comments on RB

> Improve vectorization for ACID by eliminating row-by-row stitching
> --
>
> Key: HIVE-14233
> URL: https://issues.apache.org/jira/browse/HIVE-14233
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions, Vectorization
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14233.01.patch, HIVE-14233.02.patch, 
> HIVE-14233.03.patch, HIVE-14233.04.patch, HIVE-14233.05.patch, 
> HIVE-14233.06.patch, HIVE-14233.07.patch, HIVE-14233.08.patch, 
> HIVE-14233.09.patch
>
>
> This JIRA proposes to improve vectorization for ACID by eliminating 
> row-by-row stitching when reading back ACID files. In the current 
> implementation, a vectorized row batch is created by populating the batch one 
> row at a time, before the vectorized batch is passed up along the operator 
> pipeline. This row-by-row stitching limitation was because of the fact that 
> the ACID insert/update/delete events from various delta files needed to be 
> merged together before the actual version of a given row was found out. 
> HIVE-14035 has enabled us to break away from that limitation by splitting 
> ACID update events into a combination of delete+insert. In fact, it has now 
> enabled us to create splits on delta files.
> Building on top of HIVE-14035, this JIRA proposes to solve this earlier 
> bottleneck in the vectorized code path for ACID by now directly reading row 
> batches from the underlying ORC files and avoiding any stitching altogether. 
> Once a row batch is read from the split (which may be on a base/delta file), 
> the deleted rows will be found by cross-referencing them against a data 
> structure that will just keep track of deleted events (found in the 
> deleted_delta files). This will lead to a large performance gain when reading 
> ACID files in vectorized fashion, while enabling further optimizations in 
> future that can be done on top of that.
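The cross-referencing step described above can be sketched as follows, assuming a simplified row-id key (the real implementation keys delete events on ORC's (originalTransaction, bucket, rowId) triple, and the batch comes straight from the vectorized ORC reader):

```java
import java.util.Set;

public class DeleteMask {
    // Marks each row in the batch as selected unless its id appears in the
    // set built from the deleted_delta files; returns the surviving row count.
    static int applyDeletes(long[] batchRowIds, boolean[] selected, Set<Long> deletedIds) {
        int remaining = 0;
        for (int i = 0; i < batchRowIds.length; i++) {
            selected[i] = !deletedIds.contains(batchRowIds[i]);
            if (selected[i]) {
                remaining++;
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        Set<Long> deleted = Set.of(2L, 5L);   // delete events from deleted_delta
        long[] batch = {1, 2, 3, 4, 5};       // one vectorized batch of row ids
        boolean[] sel = new boolean[batch.length];
        System.out.println(applyDeletes(batch, sel, deleted)); // prints 3
    }
}
```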



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13930) upgrade Hive to latest Hadoop version

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13930:

Attachment: HIVE-13930.07.patch

Actually I cannot repro those (other than maybe some ordering changes if I run 
with java7). Will run again and try to catch the logs/test report this time

> upgrade Hive to latest Hadoop version
> -
>
> Key: HIVE-13930
> URL: https://issues.apache.org/jira/browse/HIVE-13930
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13930.01.patch, HIVE-13930.02.patch, 
> HIVE-13930.03.patch, HIVE-13930.04.patch, HIVE-13930.05.patch, 
> HIVE-13930.06.patch, HIVE-13930.07.patch, HIVE-13930.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14398) import database.tablename from path error

2016-08-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436127#comment-15436127
 ] 

Xuefu Zhang commented on HIVE-14398:


I closed this as "not reproducible".

> import database.tablename from path error
> -
>
> Key: HIVE-14398
> URL: https://issues.apache.org/jira/browse/HIVE-14398
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.1.0
>Reporter: Yechao Chen
>Assignee: Yechao Chen
> Fix For: 1.1.0
>
> Attachments: HIVE-14398.1.patch
>
>
> hive>create table a(id int,name string);
> hive>export table a to '/tmp/a';
> hive> import table test.a from '/tmp/a';
> Copying data from hdfs://test:8020/tmp/a/data
> Loading data to table default.test.a
> Failed with exception Invalid table name default.test.a
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> tablename  should be test.a not default.test.a 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14398) import database.tablename from path error

2016-08-24 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-14398:
---
Resolution: Cannot Reproduce
Status: Resolved  (was: Patch Available)

> import database.tablename from path error
> -
>
> Key: HIVE-14398
> URL: https://issues.apache.org/jira/browse/HIVE-14398
> Project: Hive
>  Issue Type: Bug
>  Components: Import/Export
>Affects Versions: 1.1.0
>Reporter: Yechao Chen
>Assignee: Yechao Chen
> Fix For: 1.1.0
>
> Attachments: HIVE-14398.1.patch
>
>
> hive>create table a(id int,name string);
> hive>export table a to '/tmp/a';
> hive> import table test.a from '/tmp/a';
> Copying data from hdfs://test:8020/tmp/a/data
> Loading data to table default.test.a
> Failed with exception Invalid table name default.test.a
> FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.MoveTask
> tablename  should be test.a not default.test.a 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434353#comment-15434353
 ] 

Ferdinand Xu commented on HIVE-13589:
-

[~vihangk1], I took a look at the BeeLine code. Currently "-p" requires a 
password argument. If we make the argument optional, the parser will inspect 
the string following "-p" to decide whether it is a password. That makes it 
possible to treat "--" as a password, since "-" is not a BeeLine option. One 
approach I can think of is to add a new option that takes no password and 
prompts the user to enter it. Any thoughts?

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434353#comment-15434353
 ] 

Ferdinand Xu edited comment on HIVE-13589 at 8/24/16 7:27 AM:
--

[~vihangk1], I took a look at the BeeLine code. Currently "-p" requires a 
password argument. If we make the argument optional, the parser will inspect 
the string following "-p" to decide whether it is a password. That makes it 
possible to treat "--" as a password, since "-" is not a BeeLine option. One 
approach I can think of is to add a new option that takes no password and 
prompts the user to enter it. Any thoughts?


was (Author: ferd):
[~vihangk1], I take a look at the BeeLine code. Now -p has been associated with 
a password. If we make it optional, it will parse the string next to "-p" to 
see whether it exists a password. In this way, it's possible to treat "--" 
as a password since - doesn't exist in Beeline options. One way I can think 
of is that we could add a new option which has no password. And prompt user to 
enter the password. Any thoughputs?

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14446) Adjust bloom filter for hybrid grace hash join when row count exceeds certain limit

2016-08-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434394#comment-15434394
 ] 

Gopal V commented on HIVE-14446:


LGTM - +1.

> Adjust bloom filter for hybrid grace hash join when row count exceeds certain 
> limit
> ---
>
> Key: HIVE-14446
> URL: https://issues.apache.org/jira/browse/HIVE-14446
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.3.0, 2.2.0, 2.1.1
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-14446.1.patch, HIVE-14446.2.patch
>
>
> When row count exceeds certain limit, it doesn't make sense to generate a 
> bloom filter, since its size will be a few hundred MB or even a few GB.
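The size claim checks out against the standard optimal-size formula for a Bloom filter, m = -n ln(p) / (ln 2)^2 bits for n entries at false-positive probability p. A rough sketch (the row count and FPP here are illustrative, not Hive's actual defaults):

```java
public class BloomSize {
    // Optimal number of bits for n entries at false-positive probability p:
    // m = -n * ln(p) / (ln 2)^2
    static long optimalBits(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    public static void main(String[] args) {
        long bits = optimalBits(1_000_000_000L, 0.05); // 1B rows, 5% FPP
        System.out.println(bits / 8 / (1024 * 1024) + " MB"); // prints 743 MB
    }
}
```

At a billion rows even a loose 5% FPP already costs hundreds of MB, which is why the patch stops generating the filter past a row-count limit.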



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Ferdinand Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434292#comment-15434292
 ] 

Ferdinand Xu commented on HIVE-13589:
-

[~Jk_Self], we should not move those options into BeeLine, as that would break 
backwards compatibility. An alternative is to keep this option required: if the 
user doesn't enter a password, we pass in an empty value, and when the value is 
empty, we prompt the user to enter their password.

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Ke Jia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434269#comment-15434269
 ] 

Ke Jia commented on HIVE-13589:
---

Hi [~vihangk1], [~Ferd], the current patch can prompt for the password when the 
password is null or an empty string. However, if the "-p" option is followed by 
an option that is not among those registered in Beeline.java [L290-L391], 
Apache Commons CLI will treat that option as the value of the "-p" argument. 
For example, with "hive --service beeline -u jdbc:hive2://localhost:1  -n root 
-p --force=true -e 'show tables;'", because the "--force" option is not 
registered in Beeline.java, Apache Commons CLI sets "--force=true" as the value 
of "-p". So when the code you mentioned above executes, the password is not 
null and beeline does not prompt for it. To avoid this bad user experience, do 
you think we need to register all the options in Beeline.java? 
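The ambiguity discussed in this thread can be sketched with a hand-rolled parser (illustrative only; BeeLine actually parses through Apache Commons CLI): treat the argument of "-p" as optional, and refuse to consume the next token as a password when it looks like another option.

```java
public class PasswordArg {
    // Returns the password, or null to signal "prompt the user".
    static String parsePassword(String[] args) {
        for (int i = 0; i < args.length; i++) {
            if ("-p".equals(args[i])) {
                // Only consume the next token when it does not look like an option.
                boolean hasValue = i + 1 < args.length && !args[i + 1].startsWith("-");
                return hasValue ? args[i + 1] : null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] cmd = {"-u", "jdbc:hive2://localhost:1", "-n", "root",
                        "-p", "--force=true"};
        // "--force=true" starts with '-', so it is not eaten as the password
        System.out.println(parsePassword(cmd)); // prints null -> prompt
    }
}
```

The trade-off, as noted above, is that a legitimate password beginning with "-" could no longer be passed on the command line, which is one reason a separate prompt-only option was proposed.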

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-14462:

Status: Patch Available  (was: Open)

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14462) Reduce number of partition check calls in add_partitions

2016-08-24 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-14462:

Attachment: HIVE-14462.6.patch

> Reduce number of partition check calls in add_partitions
> 
>
> Key: HIVE-14462
> URL: https://issues.apache.org/jira/browse/HIVE-14462
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-14462.1.patch, HIVE-14462.2.patch, 
> HIVE-14462.3.patch, HIVE-14462.4.patch, HIVE-14462.6.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14362) Support explain analyze in Hive

2016-08-24 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435149#comment-15435149
 ] 

Pengcheng Xiong commented on HIVE-14362:


Thanks [~gopalv] for the detailed performance analysis. I have addressed the 
local file and vectorization issues. I still have a few other small issues to 
address before I submit another patch. Thanks.

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> compare_on_cluster.pdf
>
>
> Right now all the explain levels only support stats before query runs. We 
> would like to have an explain analyze similar to Postgres for real stats 
> after query runs. This will help to identify the major gap between 
> estimated/real stats and make not only query optimization better but also 
> query performance debugging easier.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Updated] (HIVE-13680) HiveServer2: Provide a way to compress ResultSets

2016-08-24 Thread Kevin Liew (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Liew updated HIVE-13680:
--
Attachment: HIVE-13680.4.patch

> HiveServer2: Provide a way to compress ResultSets
> -
>
> Key: HIVE-13680
> URL: https://issues.apache.org/jira/browse/HIVE-13680
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, JDBC
>Reporter: Vaibhav Gumashta
>Assignee: Kevin Liew
> Attachments: HIVE-13680.2.patch, HIVE-13680.3.patch, 
> HIVE-13680.4.patch, HIVE-13680.patch, SnappyCompDe.zip, proposal.pdf
>
>
> With HIVE-12049 in, we can provide an option to compress ResultSets before 
> writing to disk. The user can specify a compression library via a config 
> param which can be used in the tasks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14617) NPE in UDF MapValues() if input is null

2016-08-24 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-14617:
---
Description: 
For query
{code}
select exploded_traits from hdrone.vehiclestore_udr_vehicle 
lateral view explode(map_values(vehicle_traits.vehicle_traits)) traits as 
exploded_traits 
where datestr > '2016-08-22' LIMIT 100
{code}
The job fails with the following error:
{code}
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"ts":null,"_max_added_id":null,"identity_info":null,"vehicle_specs":null,"tracking_info":null,"color_info":null,"vehicle_traits":null,"detail_info":null,"_row_key":null,"_shard":null,"image_info":null,"vehicle_tags":null,"activation_info":null,"flavor_info":null,"sounds":null,"legacy_info":null,"images":null,"datestr":"2016-08-24"}
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
    at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
    ... 8 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating map_values(vehicle_traits.vehicle_traits)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:82)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
    ... 9 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFMapValues.evaluate(GenericUDFMapValues.java:64)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator._evaluate(ExprNodeGenericFuncEvaluator.java:185)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:77)
    at org.apache.hadoop.hive.ql.exec.ExprNodeEvaluator.evaluate(ExprNodeEvaluator.java:65)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:77)
    ... 15 more
{code}
It appears that null is not handled properly in the GenericUDFMapValues.evaluate() 
method.
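A minimal sketch of the kind of null guard such a fix would likely need. The class and method below are illustrative only, not Hive's actual GenericUDF API (which operates on ObjectInspectors rather than plain java.util.Map):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: the real fix belongs in
// GenericUDFMapValues.evaluate(), which works with ObjectInspectors.
public class NullSafeMapValues {

    // Returns the map's values, or null (propagating SQL NULL) when the
    // map itself is null, instead of throwing a NullPointerException.
    public static <K, V> List<V> mapValues(Map<K, V> input) {
        if (input == null) {
            return null;
        }
        return new ArrayList<>(input.values());
    }
}
```

With a guard like this, a NULL `vehicle_traits` column would propagate NULL instead of killing the task.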


[jira] [Resolved] (HIVE-14601) Altering table/partition file format with preexisting data should not be allowed

2016-08-24 Thread Barna Zsombor Klara (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara resolved HIVE-14601.

Resolution: Not A Bug

Altering the file format of a table that already contains data is indeed useful 
when the user forgot to specify the correct format and uploaded data files 
directly to the table location. In that case the user should be able to alter 
the file format to the correct value to fix the table setup.
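A minimal HiveQL sketch of that recovery path, with illustrative table and format names:

```sql
-- Suppose text-format data files sit in a table whose declared format was
-- (mistakenly) altered to parquet; reads fail. Altering the declared format
-- back to match the files fixes the table without rewriting any data.
ALTER TABLE test SET FILEFORMAT textfile;
SELECT * FROM test;
```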

> Altering table/partition file format with preexisting data should not be 
> allowed
> 
>
> Key: HIVE-14601
> URL: https://issues.apache.org/jira/browse/HIVE-14601
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Barna Zsombor Klara
>Assignee: Barna Zsombor Klara
>Priority: Minor
>
> The file format of a table or a partition can be changed using an alter 
> statement. However, this only affects the metadata; the data in HDFS is not 
> changed, leaving a table that can no longer be selected from. 
> Changing the file format back fixes the issue, but a better approach would be 
> to prevent altering the file format if the table already contains data.
> The issue is reproducible by executing the following commands:
> {code}
> create table test (id int);
> insert into test values (1);
> alter table test set fileformat parquet;
> insert into test values (2);
> select * from test;
> {code}
> Will result in:
> {code}
> java.lang.RuntimeException: .../00_0 is not a Parquet file (too small) 
> (state=,code=0)
> {code}





[jira] [Updated] (HIVE-14362) Support explain analyze in Hive

2016-08-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14362:
---
Attachment: HIVE-14362.03.patch

> Support explain analyze in Hive
> ---
>
> Key: HIVE-14362
> URL: https://issues.apache.org/jira/browse/HIVE-14362
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-14362.01.patch, HIVE-14362.02.patch, 
> HIVE-14362.03.patch, compare_on_cluster.pdf
>
>
> Right now, all the explain levels only report statistics estimated before the 
> query runs. We would like an explain analyze, similar to Postgres, that 
> reports real statistics after the query runs. This will help identify the 
> major gaps between estimated and real stats, making query optimization 
> better and query performance debugging easier.
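For comparison, the Postgres-style usage this proposal mirrors might look like the following in HiveQL (the syntax here is illustrative, not the final design):

```sql
-- Hypothetical: execute the query, then annotate each operator in the plan
-- with actual row counts, so estimated vs. real statistics can be compared.
EXPLAIN ANALYZE
SELECT key, count(*) FROM src GROUP BY key;
```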





[jira] [Updated] (HIVE-14362) Support explain analyze in Hive

2016-08-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14362:
---
Status: Open  (was: Patch Available)






[jira] [Updated] (HIVE-14362) Support explain analyze in Hive

2016-08-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14362:
---
Status: Patch Available  (was: Open)






[jira] [Commented] (HIVE-13589) beeline - support prompt for password with '-u' option

2016-08-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435315#comment-15435315
 ] 

Vihang Karajgaonkar commented on HIVE-13589:


[~Ferd] I agree that we should not change the behavior of the -p argument, since 
that would break backwards compatibility. Adding another option to achieve the 
same purpose seems like overkill. Beeline should be smart enough to prompt for 
the password if no password is given on the command line.

e.g.:
1. beeline -u "jdbc:hive2://localhost:1" -n username
> beeline should prompt for the password.

2. beeline -u "jdbc:hive2://localhost:1" -n username -p 

If we can achieve (1) above within beeline, I think that should be sufficient 
to solve this issue without any of the work-arounds mentioned by [~thejas] in 
the first comment.
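The rule described above can be sketched as a tiny decision helper. The class and method names here are illustrative, not Beeline's actual code:

```java
// Illustrative sketch: decide whether Beeline should interactively prompt
// for a password. Prompt only when a username was supplied on the command
// line but no usable password accompanied it.
public class PasswordPromptPolicy {

    public static boolean shouldPrompt(String user, String password) {
        return user != null && (password == null || password.isEmpty());
    }
}
```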

> beeline - support prompt for password with '-u' option
> --
>
> Key: HIVE-13589
> URL: https://issues.apache.org/jira/browse/HIVE-13589
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Reporter: Thejas M Nair
>Assignee: Ke Jia
> Attachments: HIVE-13589.1.patch, HIVE-13589.2.patch, 
> HIVE-13589.3.patch
>
>
> Specifying connection string using commandline options in beeline is 
> convenient, as it gets saved in shell command history, and it is easy to 
> retrieve it from there.
> However, specifying the password in command prompt is not secure as it gets 
> displayed on screen and saved in the history.
> It should be possible to specify '-p' without an argument to make beeline 
> prompt for password.





[jira] [Commented] (HIVE-14614) Insert overwrite local directory fails with IllegalStateException

2016-08-24 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435335#comment-15435335
 ] 

Vihang Karajgaonkar commented on HIVE-14614:


[~spena] and [~mohitsabharwal] Can you please review the patch when you get a 
chance? Thanks!

> Insert overwrite local directory fails with IllegalStateException
> -
>
> Key: HIVE-14614
> URL: https://issues.apache.org/jira/browse/HIVE-14614
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>Priority: Minor
> Attachments: HIVE-14614.2.patch
>
>
> insert overwrite local directory  select * from table; fails with 
> "java.lang.IllegalStateException: Cannot create staging directory" when the 
> path passed to getTempDirForPath(Path path) is a local filesystem path.
> This is a regression caused by the fix for HIVE-14270.
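A sketch of the scheme check such a fix presumably needs. The names are illustrative; Hive's getTempDirForPath works with Hadoop Path objects rather than raw strings:

```java
import java.net.URI;

// Illustrative sketch: a staging-directory helper must recognize local
// filesystem paths ("file:" scheme, or no scheme at all) and handle them
// differently from HDFS paths when creating temp directories.
public class StagingPathCheck {

    public static boolean isLocalFsPath(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null || "file".equals(scheme);
    }
}
```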




