[jira] [Commented] (HIVE-14957) HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union

2016-10-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584820#comment-15584820
 ] 

Lefty Leverenz commented on HIVE-14957:
---

Shouldn't fix version include 2.1.1?

> HiveSortLimitPullUpConstantsRule misses branches when parent operator is Union
> --
>
> Key: HIVE-14957
> URL: https://issues.apache.org/jira/browse/HIVE-14957
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.2.0
>
> Attachments: HIVE-14957.01.patch, HIVE-14957.02.patch
>
>
> {code}
> call.transformTo(parent.copy(parent.getTraitSet(), 
> ImmutableList.of(relBuilder.build())));
> {code}
> When parent is a Union operator that has 2 inputs, parent.copy will 
> copy only the input that has the SortLimit and ignore the other branches.
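
The problem described above can be sketched in plain Java. This is only an illustration and does not use Calcite: Node, copy, and replaceInput are hypothetical stand-ins for the planner's operator classes, chosen to show why handing a single-element input list to copy drops the remaining Union branches, and how replacing one input in place keeps them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a relational operator; this is not Calcite's RelNode.
class Node {
    final String name;
    final List<Node> inputs;

    Node(String name, List<Node> inputs) {
        this.name = name;
        this.inputs = Collections.unmodifiableList(new ArrayList<>(inputs));
    }

    // Like a planner-style copy: same operator, new input list.
    Node copy(List<Node> newInputs) {
        return new Node(name, newInputs);
    }

    // Replaces only the input at the given index and keeps every other branch.
    Node replaceInput(int index, Node replacement) {
        List<Node> newInputs = new ArrayList<>(inputs);
        newInputs.set(index, replacement);
        return copy(newInputs);
    }
}

class UnionCopyDemo {
    public static void main(String[] args) {
        Node sortLimit = new Node("SortLimit", Collections.<Node>emptyList());
        Node otherBranch = new Node("TableScan", Collections.<Node>emptyList());
        Node union = new Node("Union", Arrays.asList(sortLimit, otherBranch));
        Node rewritten = new Node("Project(SortLimit)", Collections.<Node>emptyList());

        // Pattern from the issue: the copy receives only the rewritten branch.
        Node buggy = union.copy(Collections.singletonList(rewritten));
        // Alternative: swap the rewritten branch in and keep the rest.
        Node fixed = union.replaceInput(0, rewritten);

        System.out.println(buggy.inputs.size()); // 1
        System.out.println(fixed.inputs.size()); // 2
    }
}
```

Whether the actual patch replaces one input or rebuilds the full input list is not stated in this thread; the sketch only shows that all branches have to be passed to the copy.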



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14642) handle insert overwrite for MM tables

2016-10-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584794#comment-15584794
 ] 

Hive QA commented on HIVE-14642:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833872/HIVE-14642.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1614/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1614/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1614/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2016-10-18 07:59:02.305
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-1614/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2016-10-18 07:59:02.307
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 4b7f373 HIVE-14940: MiniTezCliDriver - switch back to SQL 
metastore as default (Prasanth Jayachandran reviewed by Siddharth Seth)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 4b7f373 HIVE-14940: MiniTezCliDriver - switch back to SQL 
metastore as default (Prasanth Jayachandran reviewed by Siddharth Seth)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2016-10-18 07:59:03.451
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: common/src/java/org/apache/hadoop/hive/common/ValidWriteIds.java: No 
such file or directory
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java:314
error: ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java: patch 
does not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java:258
error: ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java: patch does not 
apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java:3810
error: ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java: patch does 
not apply
error: patch failed: 
ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:1832
error: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java: patch does not 
apply
error: ql/src/test/queries/clientpositive/mm_all.q: No such file or directory
error: ql/src/test/queries/clientpositive/mm_all2.q: No such file or directory
error: ql/src/test/queries/clientpositive/mm_current.q: No such file or 
directory
error: ql/src/test/results/clientpositive/llap/mm_all.q.out: No such file or 
directory
error: ql/src/test/results/clientpositive/llap/mm_all2.q.out: No such file or 
directory
error: ql/src/test/results/clientpositive/llap/mm_current.q.out: No such file 
or directory
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833872 - PreCommit-HIVE-Build

> handle insert overwrite for MM tables
> -
>
> Key: HIVE-14642
> URL: https://issues.apache.org/jira/browse/HIVE-14642
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: hive-14535
>
> Attachments: HIVE-14642.patch
>
>






[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584787#comment-15584787
 ] 

Hive QA commented on HIVE-14797:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833869/HIVE-14797.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 9 failed/errored test(s), 10592 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver[hbase_bulk] 
(batchId=89)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1613/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1613/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1613/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 9 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833869 - PreCommit-HIVE-Build

> reducer number estimating may lead to data skew
> ---
>
> Key: HIVE-14797
> URL: https://issues.apache.org/jira/browse/HIVE-14797
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Reporter: roncenzhao
>Assignee: roncenzhao
> Attachments: HIVE-14797.2.patch, HIVE-14797.3.patch, 
> HIVE-14797.4.patch, HIVE-14797.patch
>
>
> HiveKey's hash code is generated by multiplying by 31 key by key, as 
> implemented in the method `ObjectInspectorUtils.getBucketHashCode()`:
> for (int i = 0; i < bucketFields.length; i++) {
>   int fieldHash = ObjectInspectorUtils.hashCode(bucketFields[i], 
> bucketFieldInspectors[i]);
>   hashCode = 31 * hashCode + fieldHash;
> }
> The following example leads to data skew:
> I have two tables, tbl1 and tbl2, with the same columns: a int, b 
> string. The values of column 'a' are not skewed in either table, but the 
> values of column 'b' are skewed in both.
> When my SQL is "select * from tbl1 join tbl2 on tbl1.a=tbl2.a and 
> tbl1.b=tbl2.b" and the estimated reducer number is 31, it leads to data 
> skew.
> As we know, the HiveKey's hash code is generated by `hash(a)*31 + hash(b)`. 
> When the reducer number is 31, the reducer No. of each row is `hash(b)%31`, 
> because the `hash(a)*31` term is a multiple of 31. As a result, the job 
> will be skewed.
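
The arithmetic in the description can be checked with a small standalone Java sketch. It reuses the combining step quoted above (hashCode = 31 * hashCode + fieldHash). The partitioning rule in reducerFor (non-negative hash modulo the reducer count) and the class and method names are assumptions made for this illustration, and the field hashes are kept small so that no int overflow occurs.

```java
import java.util.HashSet;
import java.util.Set;

class ReducerSkewDemo {
    // Same combining step as the loop quoted in the issue.
    static int combinedHash(int... fieldHashes) {
        int hashCode = 0;
        for (int fieldHash : fieldHashes) {
            hashCode = 31 * hashCode + fieldHash;
        }
        return hashCode;
    }

    // Assumed partitioning rule for this sketch: non-negative hash modulo reducer count.
    static int reducerFor(int hashCode, int numReducers) {
        return (hashCode & Integer.MAX_VALUE) % numReducers;
    }

    // Counts the distinct reducers hit when hash(a) varies and hash(b) stays fixed.
    static int distinctReducers(int fixedHashB, int numReducers) {
        Set<Integer> seen = new HashSet<>();
        for (int hashA = 0; hashA < 1000; hashA++) {
            seen.add(reducerFor(combinedHash(hashA, fixedHashB), numReducers));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctReducers(7, 31)); // 1: every row lands on reducer 7
        System.out.println(distinctReducers(7, 32)); // 32: rows spread over all reducers
    }
}
```

With 31 reducers, a single hot value of column b sends every row to the same reducer regardless of column a; with a reducer count that is not a multiple of 31, column a contributes to the spread again.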





[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584774#comment-15584774
 ] 

Lefty Leverenz commented on HIVE-11394:
---

Removed the TODOC2.2 label for now because this issue was reverted.

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch
>
>
> Add detail to the EXPLAIN output showing why a Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows detailed vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> By default, ONLY is not applied and the level is SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> 

[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization

2016-10-18 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-11394:
--
Labels:   (was: TODOC2.2)

> Enhance EXPLAIN display for vectorization
> -
>
> Key: HIVE-11394
> URL: https://issues.apache.org/jira/browse/HIVE-11394
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, 
> HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, 
> HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, 
> HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, 
> HIVE-11394.093.patch
>
>
> Add detail to the EXPLAIN output showing why a Map and Reduce work is not 
> vectorized.
> New syntax is: EXPLAIN VECTORIZATION \[ONLY\] 
> \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (is vectorization 
> enabled) and a summary of Map and Reduce work.
> OPERATOR shows vectorization information for operators.  E.g. Filter 
> Vectorization.  It includes all information of SUMMARY, too.
> EXPRESSION shows vectorization information for expressions.  E.g. 
> predicateExpression.  It includes all information of SUMMARY and OPERATOR, 
> too.
> DETAIL shows detailed vectorization information.
> It includes all information of SUMMARY, OPERATOR, and EXPRESSION too.
> By default, ONLY is not applied and the level is SUMMARY.
> ---
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization 
> sections)
> Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION 
> SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see
> notVectorizedReason: Aggregation Function UDF avg parameter expression for 
> GROUPBY operator: Data type struct of 
> Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": 
> "false" which says a node has a GROUP BY with an AVG or some other aggregator 
> that outputs a non-PRIMITIVE type (e.g. STRUCT) and all downstream operators 
> are row-mode.  I.e. not vector output.
> If "usesVectorUDFAdaptor:": "false" were true, it would say that at 
> least one vectorized expression is using VectorUDFAdaptor.
> And, "allNative:": "false" will be true when all operators are native.  
> Today, GROUP BY and FILE SINK are not native.  MAP JOIN and REDUCE SINK are 
> conditionally native.  FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
> Tez
> ...
>   Edges:
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>   Vertices:
> Map 1 
> Map Operator Tree:
> TableScan
>   alias: alltypesorc
>   Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
>   Select Operator
> expressions: cint (type: int)
> outputColumnNames: cint
> Statistics: Num rows: 12288 Data size: 36696 Basic stats: 
> COMPLETE Column stats: COMPLETE
> Group By Operator
>   keys: cint (type: int)
>   mode: hash
>   outputColumnNames: _col0
>   Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
>   Reduce Output Operator
> key expressions: _col0 (type: int)
> sort order: +
> Map-reduce partition columns: _col0 (type: int)
> Statistics: Num rows: 5775 Data size: 17248 Basic 
> stats: COMPLETE Column stats: COMPLETE
> Execution mode: vectorized, llap
> LLAP IO: all inputs
> Map Vectorization:
> enabled: true
> enabledConditionsMet: 
> hive.vectorized.use.vectorized.input.format IS true
> groupByVectorOutput: true
> inputFileFormats: 
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
> allNative: false
> usesVectorUDFAdaptor: false
> vectorized: true
> Reducer 2 
> Execution 

[jira] [Resolved] (HIVE-14458) change relative data refernces in qfiles

2016-10-18 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-14458.
-
Resolution: Won't Fix

It seems we don't need this to happen ;)

> change relative data refernces in qfiles
> 
>
> Key: HIVE-14458
> URL: https://issues.apache.org/jira/browse/HIVE-14458
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>
> There are many relative ({{../..}}) references inside qfiles and q.out files.
> Because these references depend heavily on the current working directory,  
> they should be changed to
> * either use properties like {{test.data.dir}} or {{hive.root}} ...
> * or any other reliable method to access those files.





[jira] [Commented] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584719#comment-15584719
 ] 

Lefty Leverenz commented on HIVE-14822:
---

Here's the link to the config doc:

* [hive.server2.job.credential.provider.path | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.server2.job.credential.provider.path]

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch, 
> HIVE-14822.07.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to have an encrypted 
> password file separate from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
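
As a rough illustration of the job configuration changes listed in the description, the following standalone Java sketch builds the four overrides. It is not the patch's implementation: the class and method names are hypothetical, the password is passed in as a plain argument, and appending a NAME=value entry to a comma-separated list for the two MR2 *.user.env settings is an assumption about how existing values would be preserved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JobCredentialEnvSketch {
    static final String PASSWORD_VAR = "HADOOP_CREDSTORE_PASSWORD";

    // Appends NAME=value to a comma-separated environment list, keeping existing entries.
    static String appendEnv(String existing, String name, String value) {
        String entry = name + "=" + value;
        return (existing == null || existing.isEmpty()) ? entry : existing + "," + entry;
    }

    // Returns the job-level overrides named in the issue description.
    static Map<String, String> jobOverrides(Map<String, String> current, String password) {
        Map<String, String> out = new LinkedHashMap<>();
        String[] mr2Keys = {"yarn.app.mapreduce.am.admin.user.env", "mapreduce.admin.user.env"};
        for (String key : mr2Keys) {
            out.put(key, appendEnv(current.get(key), PASSWORD_VAR, password));
        }
        out.put("spark.yarn.appMasterEnv." + PASSWORD_VAR, password);
        out.put("spark.executorEnv." + PASSWORD_VAR, password);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> current = new LinkedHashMap<>();
        current.put("mapreduce.admin.user.env", "LD_LIBRARY_PATH=/opt/native");
        // "example-password" is a placeholder; a real server would read it from its own environment.
        jobOverrides(current, "example-password")
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```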





[jira] [Commented] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-18 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584710#comment-15584710
 ] 

Jianguo Tian commented on HIVE-14679:
-

What you said about "not affect the csv2/tsv2 formats" is correct, and that is 
exactly what I'm working toward. Thanks for your input! Please wait for 
my updated patch. 

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default; this should be there and 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.





[jira] [Comment Edited] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-18 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584702#comment-15584702
 ] 

Jianguo Tian edited comment on HIVE-14679 at 10/18/16 7:13 AM:
---

Hi, Kenneth MacArthur. It looks difficult to implement "there should simply be 
no quote character at all when quoting is disabled". As we can see from the 
code below, the first parameter of the *Builder* constructor is a character, but 
unfortunately Java has no empty character analogous to the empty String *""*.
{code:borderStyle=solid}
unquotedCsvPreference = new CsvPreference.Builder('\0', separator, "").build();
{code}
What do you think about this?
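
The point about characters can be seen in a few lines of standalone Java. The class and method names below are made up for this illustration; the sketch only shows that '\0' is a real character (NUL, code point 0) rather than an absence of one.

```java
class EmptyCharDemo {
    // Length of the String made from a single char; always 1, even for NUL.
    static int lengthAsString(char c) {
        return String.valueOf(c).length();
    }

    public static void main(String[] args) {
        String empty = "";
        char nul = '\0';
        // char none = '';  // would not compile: Java has no empty character literal

        System.out.println(empty.length());      // 0
        System.out.println(lengthAsString(nul)); // 1
        System.out.println((int) nul);           // 0
    }
}
```

So passing '\0' as the quote character does not mean "no quote character"; it means the NUL character is used as the quote character.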


was (Author: jonnyr):
Hi, [~Kenneth MacArthur]. It looks difficult to implement "there should simply 
be no quote character at all when quoting is disabled". As we can see from the 
code below, the first parameter of the *Builder* constructor is a character, but 
unfortunately Java has no empty character analogous to the empty String *""*.
{code:borderStyle=solid}
unquotedCsvPreference = new CsvPreference.Builder('\0', separator, "").build();
{code}
What do you think about this?

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default; this should be there and 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.





[jira] [Commented] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-18 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584702#comment-15584702
 ] 

Jianguo Tian commented on HIVE-14679:
-

Hi, [~Kenneth MacArthur]. It looks difficult to implement "there should simply 
be no quote character at all when quoting is disabled". As we can see from the 
code below, the first parameter of the *Builder* constructor is a character, but 
unfortunately Java has no empty character analogous to the empty String *""*.
{code:borderStyle=solid}
unquotedCsvPreference = new CsvPreference.Builder('\0', separator, "").build();
{code}
What do you think about this?

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * JIRA doc doesn't mention it's disabled by default; this should be there and 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.





[jira] [Updated] (HIVE-14985) Remove UDF-s created during test runs

2016-10-18 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14985:
--
Attachment: HIVE-14985.2.patch

Resubmitting because of QA problems; the tests did not run.

> Remove UDF-s created during test runs
> -
>
> Key: HIVE-14985
> URL: https://issues.apache.org/jira/browse/HIVE-14985
> Project: Hive
>  Issue Type: Sub-task
>  Components: Testing Infrastructure
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14985.2.patch, HIVE-14985.patch
>
>
> When I tried to run llap_udf.q repeatedly from my IDE, the first run 
> passed, but the following runs failed. 
> The query file does not remove the functions it creates, which could 
> cause problems for follow-up tests.
> The same problem could happen if a query test fails in the middle of the 
> script, and even though the file contains the removal sql commands, those are 
> not executed.
> It might be a good idea to clean up not just tables and keys, but functions 
> created during the test run.





[jira] [Commented] (HIVE-9941) sql std authorization on partitioned table: truncate and insert

2016-10-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584689#comment-15584689
 ] 

Hive QA commented on HIVE-9941:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833864/HIVE-9941.3.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10596 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1612/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1612/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1612/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833864 - PreCommit-HIVE-Build

> sql std authorization on partitioned table: truncate and insert
> ---
>
> Key: HIVE-9941
> URL: https://issues.apache.org/jira/browse/HIVE-9941
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization
>Affects Versions: 1.0.0, 1.2.0
>Reporter: Olaf Flebbe
>Assignee: Sushanth Sowmyan
> Attachments: HIVE-9941.2.patch, HIVE-9941.3.patch, HIVE-9941.patch
>
>
> SQL std authorization works as expected.
> However, if a table is partitioned, any user can truncate it.
> User foo:
> {code}
> create table bla (a string) partitioned by (b string);
> #.. loading values ...
> {code}
> Admin:
> {code}
> 0: jdbc:hive2://localhost:1/default> set role admin;
> No rows affected (0,074 seconds)
> 0: jdbc:hive2://localhost:1/default> show grant on bla;
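> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+--+
> | database  | table  | partition  | column  | principal_name  | principal_type  | privilege  | grant_option  |   grant_time   | grantor  |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+--+
> | default   | bla    |            |         | foo             | USER            | DELETE     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | INSERT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | SELECT     | true          | 1426158997000  | foo      |
> | default   | bla    |            |         | foo             | USER            | UPDATE     | true          | 1426158997000  | foo      |
> +-----------+--------+------------+---------+-----------------+-----------------+------------+---------------+----------------+----------+--+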
> {code}
> now user olaf
> {code}
> 0: jdbc:hive2://localhost:1/default> select * from bla;
> Error: Error while compiling statement: FAILED: HiveAccessControlException 
> Permission denied: Principal [name=olaf, type=USER] does not have following 
> privileges for operation QUERY [[SELECT] on Object [type=TABLE_OR_VIEW, 
> name=default.bla]] (state=42000,code=4)
> {code}
> works as expected.
> _BUT_
> {code}
> 0: jdbc:hive2://localhost:1/default> truncate table bla;
> No rows affected (0,18 seconds)
> {code}
> _And table is empty afterwards_.
> Similarly: {{insert into table}} works, too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584682#comment-15584682
 ] 

Lefty Leverenz commented on HIVE-14822:
---

Actually [~vihangk1] already documented 
*hive.server2.job.credential.provider.path* (thanks!) so all we need now is 
usage documentation similar to that found in this issue's description.

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch, 
> HIVE-14822.07.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to use a different 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.
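
Editor's sketch (not part of the patch): the fallback rules in the description above can be illustrated roughly as follows. The helper name `build_job_overrides`, the plain dicts standing in for the HiveServer2 configuration and environment, and the `NAME=value` form written into the MR2 settings are assumptions made for illustration. The property name `hive.server2.job.credential.provider.path` is the one mentioned in the comment above; the actual change lives in HiveServer2's Java code and may differ.

```python
# Property and variable names taken from the issue text and the comment above.
HADOOP_PATH_KEY = "hadoop.security.credential.provider.path"
JOB_PATH_KEY = "hive.server2.job.credential.provider.path"
JOB_PASSWORD_ENV = "HIVE_JOB_KEYSTORE_PASSWORD"
DEFAULT_PASSWORD_ENV = "HADOOP_CREDSTORE_PASSWORD"


def build_job_overrides(hs2_conf, env, engine):
    """Hypothetical helper: settings a server could add to a job's configuration.

    hs2_conf and env are plain dicts standing in for the server configuration
    and process environment; engine is "mr" or "spark".
    """
    # Job-specific store first, then the server-wide store as a fallback.
    path = hs2_conf.get(JOB_PATH_KEY) or hs2_conf.get(HADOOP_PATH_KEY)
    # Job-specific password variable first, then the standard Hadoop one.
    password = env.get(JOB_PASSWORD_ENV) or env.get(DEFAULT_PASSWORD_ENV)

    overrides = {}
    if path:
        overrides[HADOOP_PATH_KEY] = path
    if password is None:
        return overrides

    if engine == "mr":
        # Assumption: the value is a NAME=value pair. This sketch overwrites
        # the setting; a real implementation would have to preserve any
        # variables that are already configured there.
        setting = f"{DEFAULT_PASSWORD_ENV}={password}"
        overrides["yarn.app.mapreduce.am.admin.user.env"] = setting
        overrides["mapreduce.admin.user.env"] = setting
    elif engine == "spark":
        overrides["spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD"] = password
        overrides["spark.executorEnv.HADOOP_CREDSTORE_PASSWORD"] = password
    return overrides
```

The point of the two `or` fallbacks is that a deployment with no job-specific store or password keeps working with the server-wide ones, as the description proposes.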



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13316) Upgrade to Calcite 1.10

2016-10-18 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-13316:
---
Attachment: HIVE-13316.11.patch

> Upgrade to Calcite 1.10
> ---
>
> Key: HIVE-13316
> URL: https://issues.apache.org/jira/browse/HIVE-13316
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-13316.01.patch, HIVE-13316.02.patch, 
> HIVE-13316.05.patch, HIVE-13316.07.patch, HIVE-13316.08.patch, 
> HIVE-13316.09.patch, HIVE-13316.10.patch, HIVE-13316.11.patch, 
> HIVE-13316.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14913) Add new unit tests

2016-10-18 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584672#comment-15584672
 ] 

Ashutosh Chauhan commented on HIVE-14913:
-

Can you update the RB ?

> Add new unit tests
> --
>
> Key: HIVE-14913
> URL: https://issues.apache.org/jira/browse/HIVE-14913
> Project: Hive
>  Issue Type: Task
>  Components: Tests
>Reporter: Vineet Garg
>Assignee: Vineet Garg
> Attachments: HIVE-14913.1.patch, HIVE-14913.2.patch, 
> HIVE-14913.3.patch, HIVE-14913.4.patch, HIVE-14913.5.patch
>
>
> Moving a bunch of tests from system tests to Hive unit tests to reduce testing 
> overhead



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14679) csv2/tsv2 output format disables quoting by default and it's difficult to enable

2016-10-18 Thread Jianguo Tian (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15580939#comment-15580939
 ] 

Jianguo Tian edited comment on HIVE-14679 at 10/18/16 6:57 AM:
---

Thanks for your suggestions. I have finished the part of "Disabling quoting 
should be possible using a beeline argument". Next, I'll resolve your 3rd 
suggestion.



was (Author: jonnyr):
Thanks for your suggestions. I have finished the part of "Disabling quoting 
should be possible using a beeline argument". Next, I'll resolved your 3rd 
suggestion.

> csv2/tsv2 output format disables quoting by default and it's difficult to 
> enable
> 
>
> Key: HIVE-14679
> URL: https://issues.apache.org/jira/browse/HIVE-14679
> Project: Hive
>  Issue Type: Bug
>Reporter: Brock Noland
>Assignee: Jianguo Tian
>
> Over in HIVE-9788 we made quoting optional for csv2/tsv2.
> However I see the following issues:
> * The JIRA doc doesn't mention that it's disabled by default; this should be stated there and 
> in the output of beeline help.
> * The JIRA says the property is {{--disableQuotingForSV}} but it's actually a 
> system property. We should not use a system property as it's non-standard so 
> extremely hard for users to set. For example I must do: {{env 
> HADOOP_CLIENT_OPTS="-Ddisable.quoting.for.sv=false" beeline ...}}
> * The arg {{--disableQuotingForSV}} should be documented in beeline help.
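
Editor's sketch: to show why the quoting default matters for separated-value output, the snippet below formats one row with and without quoting. It uses Python's standard `csv` module purely as a stand-in; the function name `format_row` is hypothetical and this is not how beeline implements csv2/tsv2.

```python
import csv
import io


def format_row(fields, delimiter=",", quote=True):
    """Format one output row, optionally quoting fields that need it."""
    if not quote:
        # Quoting disabled: fields are simply joined, so an embedded
        # delimiter becomes indistinguishable from a column boundary.
        return delimiter.join(fields)
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter,
                        quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerow(fields)
    return buf.getvalue().rstrip("\n")


row = ["1", "a,b"]
quoted = format_row(row)                 # the second field is wrapped in quotes
unquoted = format_row(row, quote=False)  # reads back as three columns
```

With quoting disabled, a consumer splitting on the delimiter sees three columns for a two-column row, which is why the default and the way to change it need to be easy to discover.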



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14459) TestBeeLineDriver - migration and re-enable

2016-10-18 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14459:
--
Attachment: HIVE-14459.patch

Kept the changes minimal, with the sole goal of being able to run the test 
successfully multiple times.
- Enabled the driver
- Modified the regexps to hide when comparing the results
- Configured to run only 1 qtest file, so we can test now and decide later 
on the scope of beeline testing
- Added required dependencies to pom
- Added specific results dir for beeline q.out-s

After running on my rig several times, this is the first test against QA, so it 
might require some adjustments.

> TestBeeLineDriver - migration and re-enable
> ---
>
> Key: HIVE-14459
> URL: https://issues.apache.org/jira/browse/HIVE-14459
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
> Attachments: HIVE-14459.patch
>
>
> this test has been left behind in HIVE-1 because it had some compile 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-14459) TestBeeLineDriver - migration and re-enable

2016-10-18 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-14459:
-

Assignee: Peter Vary

> TestBeeLineDriver - migration and re-enable
> ---
>
> Key: HIVE-14459
> URL: https://issues.apache.org/jira/browse/HIVE-14459
> Project: Hive
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Zoltan Haindrich
>Assignee: Peter Vary
>
> this test has been left behind in HIVE-1 because it had some compile 
> issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-18 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584632#comment-15584632
 ] 

Lefty Leverenz commented on HIVE-14822:
---

Doc note:  This adds *hive.server2.job.credential.provider.path* to 
HiveConf.java, so it needs to be documented in the wiki.

General usage information is also needed -- should that go in the HS2 setup doc 
or user doc?

* [Setting Up HiveServer2 -- Authentication/SecurityConfiguration | 
https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Authentication/SecurityConfiguration]
* [HiveServer2 Clients | 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients]
* [Configuration Properties -- HiveServer2 | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-HiveServer2]

Added a TODOC2.2 label.

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch, 
> HIVE-14822.07.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to use a different 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14822) Add support for credential provider for jobs launched from Hiveserver2

2016-10-18 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-14822:
--
Labels: TODOC2.2  (was: )

> Add support for credential provider for jobs launched from Hiveserver2
> --
>
> Key: HIVE-14822
> URL: https://issues.apache.org/jira/browse/HIVE-14822
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
>  Labels: TODOC2.2
> Fix For: 2.2.0
>
> Attachments: HIVE-14822.01.patch, HIVE-14822.02.patch, 
> HIVE-14822.03.patch, HIVE-14822.05.patch, HIVE-14822.06.patch, 
> HIVE-14822.07.patch
>
>
> When using encrypted passwords via the Hadoop Credential Provider, 
> HiveServer2 currently does not correctly forward enough information to the 
> job configuration for jobs to read those secrets. If your job needs to access 
> any secrets, like S3 credentials, then there's no convenient and secure way 
> to configure this today.
> You could specify the decryption key in files like mapred-site.xml that 
> HiveServer2 uses, but this would place the encryption password on local disk 
> in plaintext, which can be a security concern.
> To solve this problem, HiveServer2 should modify job configuration to include 
> the environment variable settings needed to decrypt the passwords. 
> Specifically, it will need to modify:
> * For MR2 jobs:
> ** yarn.app.mapreduce.am.admin.user.env
> ** mapreduce.admin.user.env
> * For Spark jobs:
> ** spark.yarn.appMasterEnv.HADOOP_CREDSTORE_PASSWORD
> ** spark.executorEnv.HADOOP_CREDSTORE_PASSWORD
> HiveServer2 can get the decryption password from its own environment, the 
> same way it does for its own credential provider store today.
> Additionally, it can be desirable for HiveServer2 to use a different 
> encrypted password file from the one used by the job. HiveServer2 may have 
> secrets that the job should not have, such as the metastore database password 
> or the password to decrypt its private SSL certificate. It is also best 
> practice to keep separate passwords in separate files. To facilitate this, 
> Hive will also accept:
> * A configuration for a path to a credential store to use for jobs. This 
> should already be uploaded in HDFS. (hive.server2.job.keystore.location or a 
> better name) If this is not specified, then HS2 will simply use the value of 
> hadoop.security.credential.provider.path.
> * An environment variable for the password to decrypt the credential store 
> (HIVE_JOB_KEYSTORE_PASSWORD or better). If this is not specified, then HS2 
> will simply use the standard environment variable for decrypting the Hadoop 
> Credential Provider.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13873) Column pruning for nested fields

2016-10-18 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584606#comment-15584606
 ] 

Hive QA commented on HIVE-13873:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833863/HIVE-13873.4.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 10602 tests 
executed
*Failed tests:*
{noformat}
TestBeelineWithHS2ConnectionFile - did not produce a TEST-*.xml file (likely 
timed out) (batchId=197)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[acid_globallimit] 
(batchId=27)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[case_sensitivity] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input_testxpath] 
(batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[order_null] (batchId=18)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[union_fast_stats] 
(batchId=46)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJarWithoutAddDriverClazz[0]
 (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[0] (batchId=155)
org.apache.hive.beeline.TestBeelineArgParsing.testAddLocalJar[1] (batchId=155)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation 
(batchId=157)
org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthorization.testBlackListedUdfUsage
 (batchId=204)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1611/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1611/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-1611/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 11 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833863 - PreCommit-HIVE-Build

> Column pruning for nested fields
> 
>
> Key: HIVE-13873
> URL: https://issues.apache.org/jira/browse/HIVE-13873
> Project: Hive
>  Issue Type: New Feature
>  Components: Logical Optimizer
>Reporter: Xuefu Zhang
>Assignee: Ferdinand Xu
> Attachments: HIVE-13873.1.patch, HIVE-13873.2.patch, 
> HIVE-13873.3.patch, HIVE-13873.4.patch, HIVE-13873.patch, HIVE-13873.wip.patch
>
>
> Some columnar file formats such as Parquet also store the fields of a struct 
> column by column, using the encoding described in the Google Dremel paper. It is very 
> common in big data for data to be stored in structs while queries need only 
> a subset of the fields in those structs. However, Hive presently still 
> needs to read the whole struct regardless of whether all fields are selected. 
> Therefore, pruning unwanted sub-fields in structs or nested fields at file 
> reading time would be a big performance boost for such scenarios.
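
Editor's sketch: the idea of keeping only the requested sub-fields of a struct can be illustrated on an in-memory record as below. The function name `prune` and the dotted-path notation are assumptions for illustration. The real benefit described above comes from applying the same selection inside the file reader, so that unselected columns are never read from disk, which this toy example does not model.

```python
def prune(record, paths):
    """Return a copy of a nested dict containing only the dotted paths given."""
    pruned = {}
    for path in paths:
        keys = path.split(".")
        # Walk down the record to find the requested value.
        value = record
        try:
            for key in keys:
                value = value[key]
        except (KeyError, TypeError):
            continue  # path not present in this record; skip it
        # Recreate the enclosing structs in the output, then store the value.
        target = pruned
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = value
    return pruned


row = {"id": 1, "user": {"name": "ann", "address": {"city": "Oslo", "zip": "0150"}}}
selected = prune(row, ["id", "user.address.city"])
```

Here `selected` keeps `id` and `user.address.city` and drops `user.name` and `user.address.zip`, mirroring a query that touches only those two fields.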



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

