[jira] [Commented] (HIVE-18346) Beeline could not launch because of size of history

2017-12-28 Thread Jumping (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306010#comment-16306010
 ] 

Jumping commented on HIVE-18346:


How to do that?

> Beeline could not launch because of size of history
> ---
>
> Key: HIVE-18346
> URL: https://issues.apache.org/jira/browse/HIVE-18346
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: Jumping
>Assignee: Madhudeep Petwal
>Priority: Minor
>
> Beeline version 1.2.1 fails to launch when ${user.home}/.beeline/history is 
> larger than 39 MB, which triggers a "java.lang.OutOfMemoryError".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18346) Beeline could not launch because of size of history

2017-12-28 Thread Madhudeep Petwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305993#comment-16305993
 ] 

Madhudeep Petwal commented on HIVE-18346:
-

[~JQu] Can you print the complete stack trace?

> Beeline could not launch because of size of history
> ---
>
> Key: HIVE-18346
> URL: https://issues.apache.org/jira/browse/HIVE-18346
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: Jumping
>Assignee: Madhudeep Petwal
>Priority: Minor
>
> Beeline version 1.2.1 fails to launch when ${user.home}/.beeline/history is 
> larger than 39 MB, which triggers a "java.lang.OutOfMemoryError".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14531) Support deep nested struct for INSERT OVER DIRECTORY

2017-12-28 Thread priyanka gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305898#comment-16305898
 ] 

priyanka gupta commented on HIVE-14531:
---

Did you find any solution? I am facing the same issue and setting 
hive.serialization.extend.nesting.levels is not working.

> Support deep nested struct for INSERT OVER DIRECTORY
> 
>
> Key: HIVE-14531
> URL: https://issues.apache.org/jira/browse/HIVE-14531
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor, Serializers/Deserializers
>Affects Versions: 2.2.0
>Reporter: Chao Sun
>Assignee: Wei Yan
>
> Currently if we do something similar to:
> {code}
> INSERT OVERWRITE DIRECTORY <dir> SELECT * FROM <table_with_deeply_nested_struct>
> {code}
> Then Hive may fail with an error message like this:
> {code}
> Error: Error while compiling statement: FAILED: SemanticException 
> org.apache.hadoop.hive.serde2.SerDeException: Number of levels of nesting 
> supported for LazySimpleSerde is 7 Unable to work with level 8. Use 
> hive.serialization.extend.nesting.levels serde property for tables using 
> LazySimpleSerde. (state=42000,code=4)
> {code}
> It seems there's no way to set serde properties in this case. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet

2017-12-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305879#comment-16305879
 ] 

Hive QA commented on HIVE-18323:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12903958/HIVE-18323.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 11542 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_sortmerge_join_2] (batchId=48)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_join5] (batchId=35)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[bucketsortoptimize_insert_2] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_lineage2] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[hybridgrace_hashjoin_2] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage2] (batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[lineage3] (batchId=160)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid] (batchId=168)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mergejoin] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_reduce_groupby_duplicate_cols] (batchId=158)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[authorization_part] (batchId=93)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[ppd_join5] (batchId=120)
org.apache.hadoop.hive.metastore.TestEmbeddedHiveMetaStore.testTransactionalValidation (batchId=213)
org.apache.hadoop.hive.metastore.security.TestHadoopAuthBridge23.testDelegationTokenSharedStore (batchId=238)
org.apache.hadoop.hive.ql.TestAcidOnTez.testMapJoinOnTez (batchId=222)
org.apache.hadoop.hive.ql.io.TestDruidRecordWriter.testWrite (batchId=253)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=225)
org.apache.hive.jdbc.TestSSL.testConnectionMismatch (batchId=231)
org.apache.hive.jdbc.TestSSL.testConnectionWrongCertCN (batchId=231)
org.apache.hive.jdbc.TestSSL.testMetastoreConnectionWrongCertCN (batchId=231)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/8387/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/8387/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-8387/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12903958 - PreCommit-HIVE-Build

> Vectorization: add the support of timestamp in 
> VectorizedPrimitiveColumnReader for parquet
> --
>
> Key: HIVE-18323
> URL: https://issues.apache.org/jira/browse/HIVE-18323
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18323.1.patch
>
>
> {noformat}
> CREATE TABLE `t1`(
>   `ts` timestamp,
>   `s1` string)
> STORED AS PARQUET;
> set hive.vectorized.execution.enabled=true;
> SELECT * from t1 SORT BY s1;
> {noformat}
> This query will throw an exception since timestamp is not supported here yet.
> {noformat}
> Caused by: java.io.IOException: java.io.IOException: Unsupported type: 
> optional int96 ts
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
> {noformat}
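
For context, a sketch of the int96 decoding such a reader needs (this is not 
the attached patch; class and method names are illustrative): the 12-byte 
value holds nanoseconds-of-day followed by a Julian day number, little-endian.
{code:java}
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.sql.Timestamp;

public class Int96Timestamps {
  private static final int JULIAN_DAY_OF_EPOCH = 2440588; // Julian day of 1970-01-01

  // bytes: the 12-byte INT96 value as stored in the Parquet file
  static Timestamp fromInt96(byte[] bytes) {
    ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong();  // bytes 0-7: nanoseconds within the day
    int julianDay = buf.getInt();     // bytes 8-11: Julian day number
    long seconds = (julianDay - JULIAN_DAY_OF_EPOCH) * 86400L
        + nanosOfDay / 1_000_000_000L;
    Timestamp ts = new Timestamp(seconds * 1000L);
    ts.setNanos((int) (nanosOfDay % 1_000_000_000L)); // keep sub-second precision
    return ts;
  }
}
{code}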



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet

2017-12-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305838#comment-16305838
 ] 

Hive QA commented on HIVE-18323:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 7m 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red} 0m 32s{color} | {color:red} ql: The patch generated 51 new + 41 unchanged - 0 fixed = 92 total (was 41) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 12s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 13m 32s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u1 (2016-09-03) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 035eca3 |
| Default Java | 1.8.0_111 |
| checkstyle | http://104.198.109.242/logs//PreCommit-HIVE-Build-8387/yetus/diff-checkstyle-ql.txt |
| modules | C: ql U: ql |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-8387/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: add the support of timestamp in 
> VectorizedPrimitiveColumnReader for parquet
> --
>
> Key: HIVE-18323
> URL: https://issues.apache.org/jira/browse/HIVE-18323
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18323.1.patch
>
>
> {noformat}
> CREATE TABLE `t1`(
>   `ts` timestamp,
>   `s1` string)
> STORED AS PARQUET;
> set hive.vectorized.execution.enabled=true;
> SELECT * from t1 SORT BY s1;
> {noformat}
> This query will throw an exception since timestamp is not supported here yet.
> {noformat}
> Caused by: java.io.IOException: java.io.IOException: Unsupported type: 
> optional int96 ts
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet

2017-12-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18323:

Status: Patch Available  (was: Open)

> Vectorization: add the support of timestamp in 
> VectorizedPrimitiveColumnReader for parquet
> --
>
> Key: HIVE-18323
> URL: https://issues.apache.org/jira/browse/HIVE-18323
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18323.1.patch
>
>
> {noformat}
> CREATE TABLE `t1`(
>   `ts` timestamp,
>   `s1` string)
> STORED AS PARQUET;
> set hive.vectorized.execution.enabled=true;
> SELECT * from t1 SORT BY s1;
> {noformat}
> This query will throw an exception since timestamp is not supported here yet.
> {noformat}
> Caused by: java.io.IOException: java.io.IOException: Unsupported type: 
> optional int96 ts
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet

2017-12-28 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305829#comment-16305829
 ] 

Aihua Xu commented on HIVE-18323:
-

Added support for parsing timestamps.

> Vectorization: add the support of timestamp in 
> VectorizedPrimitiveColumnReader for parquet
> --
>
> Key: HIVE-18323
> URL: https://issues.apache.org/jira/browse/HIVE-18323
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18323.1.patch
>
>
> {noformat}
> CREATE TABLE `t1`(
>   `ts` timestamp,
>   `s1` string)
> STORED AS PARQUET;
> set hive.vectorized.execution.enabled=true;
> SELECT * from t1 SORT BY s1;
> {noformat}
> This query will throw an exception since timestamp is not supported here yet.
> {noformat}
> Caused by: java.io.IOException: java.io.IOException: Unsupported type: 
> optional int96 ts
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18323) Vectorization: add the support of timestamp in VectorizedPrimitiveColumnReader for parquet

2017-12-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-18323:

Attachment: HIVE-18323.1.patch

> Vectorization: add the support of timestamp in 
> VectorizedPrimitiveColumnReader for parquet
> --
>
> Key: HIVE-18323
> URL: https://issues.apache.org/jira/browse/HIVE-18323
> Project: Hive
>  Issue Type: Improvement
>  Components: Vectorization
>Affects Versions: 3.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-18323.1.patch
>
>
> {noformat}
> CREATE TABLE `t1`(
>   `ts` timestamp,
>   `s1` string)
> STORED AS PARQUET;
> set hive.vectorized.execution.enabled=true;
> SELECT * from t1 SORT BY s1;
> {noformat}
> This query will throw an exception since timestamp is not supported here yet.
> {noformat}
> Caused by: java.io.IOException: java.io.IOException: Unsupported type: 
> optional int96 ts
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:365)
> at 
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:116)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14792) AvroSerde reads the remote schema-file at least once per mapper, per table reference.

2017-12-28 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305677#comment-16305677
 ] 

Aihua Xu commented on HIVE-14792:
-

[~mithun] Thanks for the fix. Can we still keep the default behavior set to 
false? Do you have a sample query to reproduce the exception? 

> AvroSerde reads the remote schema-file at least once per mapper, per table 
> reference.
> -
>
> Key: HIVE-14792
> URL: https://issues.apache.org/jira/browse/HIVE-14792
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1, 2.1.0
>Reporter: Mithun Radhakrishnan
>Assignee: Mithun Radhakrishnan
>  Labels: TODOC2.2, TODOC2.4
> Fix For: 3.0.0, 2.4.0, 2.2.1
>
> Attachments: HIVE-14792.1.patch, HIVE-14792.3.patch
>
>
> Avro tables that use "external" schema files stored on HDFS can cause 
> excessive calls to {{FileSystem::open()}}, especially for queries that spawn 
> large numbers of mappers.
> This is because of the following code in {{AvroSerDe::initialize()}}:
> {code:title=AvroSerDe.java|borderStyle=solid}
> public void initialize(Configuration configuration, Properties properties) throws SerDeException {
>   // ...
>   if (hasExternalSchema(properties)
>       || columnNameProperty == null || columnNameProperty.isEmpty()
>       || columnTypeProperty == null || columnTypeProperty.isEmpty()) {
>     schema = determineSchemaOrReturnErrorSchema(configuration, properties);
>   } else {
>     // Get column names and sort order
>     columnNames = Arrays.asList(columnNameProperty.split(","));
>     columnTypes = TypeInfoUtils.getTypeInfosFromTypeString(columnTypeProperty);
>     schema = getSchemaFromCols(properties, columnNames, columnTypes, columnCommentProperty);
>     properties.setProperty(AvroSerdeUtils.AvroTableProperties.SCHEMA_LITERAL.getPropName(), schema.toString());
>   }
>   // ...
> }
> {code}
> For tables using {{avro.schema.url}}, every time the SerDe is initialized 
> (i.e. at least once per mapper), the schema file is read remotely. For 
> queries with thousands of mappers, this leads to a stampede to the handful 
> (3?) datanodes that host the schema-file. In the best case, this causes 
> slowdowns.
> It would be preferable to distribute the Avro-schema to all mappers as part 
> of the job-conf. The alternatives aren't exactly appealing:
> # One can't rely solely on the {{column.list.types}} stored in the Hive 
> metastore. (HIVE-14789).
> # {{avro.schema.literal}} might not always be usable, because of the 
> size-limit on table-parameters. The typical size of the Avro-schema file is 
> between 0.5-3MB, in my limited experience. Bumping the max table-parameter 
> size isn't a great solution.
> If the {{avro.schema.file}} were read during query-planning, and made 
> available as part of table-properties (but not serialized into the 
> metastore), the downstream logic will remain largely intact. I have a patch 
> that does this.
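
A sketch of the proposed approach (illustrative names, not the attached 
patch): resolve avro.schema.url once at planning time and inline the result as 
avro.schema.literal in the query-scoped table properties, without persisting 
it to the metastore.
{code:java}
import java.io.InputStream;
import java.util.Properties;

import org.apache.avro.Schema;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AvroSchemaInliner {
  /** Read the remote schema once so each mapper's SerDe finds a literal and skips the remote read. */
  public static void inlineSchema(Configuration conf, Properties tblProps) throws Exception {
    String url = tblProps.getProperty("avro.schema.url");
    if (url == null || tblProps.getProperty("avro.schema.literal") != null) {
      return; // no external schema, or a literal is already present
    }
    Path schemaPath = new Path(url);
    FileSystem fs = schemaPath.getFileSystem(conf);
    try (InputStream in = fs.open(schemaPath)) {
      Schema schema = new Schema.Parser().parse(in);
      // per-query properties only; deliberately not written back to the metastore
      tblProps.setProperty("avro.schema.literal", schema.toString());
    }
  }
}
{code}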



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-13000) Hive returns useless parsing error

2017-12-28 Thread Amruth S (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305600#comment-16305600
 ] 

Amruth S commented on HIVE-13000:
-

[~alina.abramova]/[~ekoifman] *bump* Let me know if any action item is needed 
here. I can take it up.

> Hive returns useless parsing error 
> ---
>
> Key: HIVE-13000
> URL: https://issues.apache.org/jira/browse/HIVE-13000
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.13.0, 1.0.0, 1.2.1, 2.2.0
>Reporter: Alina Abramova
>Assignee: Alina Abramova
>Priority: Minor
> Attachments: HIVE-13000.1.patch, HIVE-13000.2.patch, 
> HIVE-13000.3.patch, HIVE-13000.4.patch, HIVE-13000.5.patch
>
>
> When I run a query like the one below, I receive an unclear exception:
> hive> SELECT record FROM ctest GROUP BY record.instance_id;
> FAILED: SemanticException Error in parsing 
> It would be clearer if it were:
> hive> SELECT record FROM ctest GROUP BY record.instance_id;
> FAILED: SemanticException  Expression not in GROUP BY key record



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18346) Beeline could not launch because of size of history

2017-12-28 Thread Madhudeep Petwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhudeep Petwal reassigned HIVE-18346:
---

Assignee: Madhudeep Petwal  (was: Andrew Sherman)

> Beeline could not launch because of size of history
> ---
>
> Key: HIVE-18346
> URL: https://issues.apache.org/jira/browse/HIVE-18346
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: JQu
>Assignee: Madhudeep Petwal
>Priority: Minor
>
> Beeline version 1.2.1 fails to launch when ${user.home}/.beeline/history is 
> larger than 39 MB, which triggers a "java.lang.OutOfMemoryError".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14615) Temp table leaves behind insert command

2017-12-28 Thread Madhudeep Petwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305596#comment-16305596
 ] 

Madhudeep Petwal commented on HIVE-14615:
-

Thanks, [~asherman]. I will look into that.

> Temp table leaves behind insert command
> ---
>
> Key: HIVE-14615
> URL: https://issues.apache.org/jira/browse/HIVE-14615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Andrew Sherman
>
> {code}
> create table test (key int, value string);
> insert into test values (1, 'val1');
> show tables;
> test
> values__tmp__table__1
> {code}
> the temp table values__tmp__table__1 resulted from insert into ... values 
> and persists until the session is closed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18338) [Client, JDBC] Expose async interface through hive JDBC.

2017-12-28 Thread Amruth S (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305589#comment-16305589
 ] 

Amruth S commented on HIVE-18338:
-

[~ashutoshc]/[~ngangam]/[~thejas]/[~vgumashta], can one of you kindly 
review this?

> [Client, JDBC] Expose async interface through hive JDBC.
> 
>
> Key: HIVE-18338
> URL: https://issues.apache.org/jira/browse/HIVE-18338
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Affects Versions: 2.3.2
>Reporter: Amruth S
>Assignee: Amruth S
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-18338.patch, HIVE-18338.patch.1, 
> HIVE-18338.patch.2, HIVE-18338.patch.3
>
>
> A lot of users are struggling and rewriting a lot of boilerplate over Thrift 
> to get purely asynchronous capability. 
> The idea is to expose the operation handle, so that clients can persist it and 
> later latch on to the same execution.
> *Problem statement*
> Hive JDBC currently exposes 2 methods related to asynchronous execution:
> *executeAsync()* - triggers a query execution and returns immediately.
> *waitForOperationToComplete()* - waits until the current execution is 
> complete, *blocking the user thread*.
> This has one problem:
> if the client process goes down, there is no way to resume queries, although 
> the hive server is completely asynchronous.
> *Proposal*
> If the operation handle were exposed, we could latch on to an active execution 
> of a query.
> *Code changes*
> The operation handle is exposed, so the client can keep a copy.
> latchSync() and latchAsync() take an operation handle and try to 
> latch on to the current execution in the hive server, if present.
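
A usage sketch of the proposed API: executeAsync() exists in HiveStatement 
today, while getOperationHandle(), latchSync(), and the handle type are 
assumptions taken from the patch description, not released Hive JDBC API.
{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import org.apache.hive.jdbc.HiveStatement;
import org.apache.hive.service.rpc.thrift.TOperationHandle;

public class AsyncLatchExample {
  public static void main(String[] args) throws Exception {
    Connection conn = DriverManager.getConnection("jdbc:hive2://host:10000/default");

    HiveStatement stmt = (HiveStatement) conn.createStatement();
    stmt.executeAsync("INSERT OVERWRITE TABLE t SELECT * FROM s"); // returns immediately
    TOperationHandle handle = stmt.getOperationHandle(); // proposed accessor: persist this

    // ... the client process may die here; a new client can later resume:
    HiveStatement resumed = (HiveStatement) conn.createStatement();
    resumed.latchSync(handle); // proposed: latch on and block until the operation completes
    ResultSet rs = resumed.getResultSet();
    while (rs.next()) {
      System.out.println(rs.getString(1));
    }
  }
}
{code}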



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18338) [Client, JDBC] Expose async interface through hive JDBC.

2017-12-28 Thread Amruth S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amruth S updated HIVE-18338:

Description: 
A lot of users are struggling and rewriting a lot of boilerplate over Thrift to 
get purely asynchronous capability. 

The idea is to expose the operation handle, so that clients can persist it and 
later latch on to the same execution.

*Problem statement*

Hive JDBC currently exposes 2 methods related to asynchronous execution:
*executeAsync()* - triggers a query execution and returns immediately.
*waitForOperationToComplete()* - waits until the current execution is 
complete, *blocking the user thread*.

This has one problem:

if the client process goes down, there is no way to resume queries, although 
the hive server is completely asynchronous.

*Proposal*

If the operation handle were exposed, we could latch on to an active execution 
of a query.

*Code changes*

The operation handle is exposed, so the client can keep a copy.
latchSync() and latchAsync() take an operation handle and try to latch 
on to the current execution in the hive server, if present.

  was:
A lot of users are struggling and rewriting a lot of boilerplate over Thrift to 
get purely asynchronous capability. 

The idea is to expose the operation handle, so that clients can persist it and 
later latch on to the same execution.

Let me know your ideas around this. We have already solved this at our org by 
tweaking HiveStatement.java.


> [Client, JDBC] Expose async interface through hive JDBC.
> 
>
> Key: HIVE-18338
> URL: https://issues.apache.org/jira/browse/HIVE-18338
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Affects Versions: 2.3.2
>Reporter: Amruth S
>Assignee: Amruth S
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-18338.patch, HIVE-18338.patch.1, 
> HIVE-18338.patch.2, HIVE-18338.patch.3
>
>
> A lot of users are struggling and rewriting a lot of boilerplate over Thrift 
> to get purely asynchronous capability. 
> The idea is to expose the operation handle, so that clients can persist it and 
> later latch on to the same execution.
> *Problem statement*
> Hive JDBC currently exposes 2 methods related to asynchronous execution:
> *executeAsync()* - triggers a query execution and returns immediately.
> *waitForOperationToComplete()* - waits until the current execution is 
> complete, *blocking the user thread*.
> This has one problem:
> if the client process goes down, there is no way to resume queries, although 
> the hive server is completely asynchronous.
> *Proposal*
> If the operation handle were exposed, we could latch on to an active execution 
> of a query.
> *Code changes*
> The operation handle is exposed, so the client can keep a copy.
> latchSync() and latchAsync() take an operation handle and try to 
> latch on to the current execution in the hive server, if present.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18338) [Client, JDBC] Expose async interface through hive JDBC.

2017-12-28 Thread Amruth S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amruth S updated HIVE-18338:

Summary: [Client, JDBC] Expose async interface through hive JDBC.  (was: 
[Client, JDBC] Asynchronous interface through hive JDBC.)

> [Client, JDBC] Expose async interface through hive JDBC.
> 
>
> Key: HIVE-18338
> URL: https://issues.apache.org/jira/browse/HIVE-18338
> Project: Hive
>  Issue Type: Improvement
>  Components: Clients, JDBC
>Affects Versions: 2.3.2
>Reporter: Amruth S
>Assignee: Amruth S
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-18338.patch, HIVE-18338.patch.1, 
> HIVE-18338.patch.2, HIVE-18338.patch.3
>
>
> A lot of users are struggling and rewriting a lot of boilerplate over Thrift 
> to get purely asynchronous capability. 
> The idea is to expose the operation handle, so that clients can persist it and 
> later latch on to the same execution.
> Let me know your ideas around this. We have already solved this at our org by 
> tweaking HiveStatement.java.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-14615) Temp table leaves behind insert command

2017-12-28 Thread Andrew Sherman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305582#comment-16305582
 ] 

Andrew Sherman commented on HIVE-14615:
---

Thanks, [~minions]. I am taking this back, as it sounds like I am a lot further 
along than you. 

Welcome to Hive! I hope you can be successful. 

I see a new JIRA, [HIVE-18346], which looks like it might be interesting for 
someone starting out. Do you want to take a look? I've assigned it to myself 
for now, but if you are interested, assign it to yourself and take a look.

> Temp table leaves behind insert command
> ---
>
> Key: HIVE-14615
> URL: https://issues.apache.org/jira/browse/HIVE-14615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Andrew Sherman
>
> {code}
> create table test (key int, value string);
> insert into test values (1, 'val1');
> show tables;
> test
> values__tmp__table__1
> {code}
> the temp table values__tmp__table__1 resulted from insert into ... values 
> and persists until the session is closed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18346) Beeline could not launch because of size of history

2017-12-28 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-18346:
-

Assignee: Andrew Sherman

> Beeline could not launch because of size of history
> ---
>
> Key: HIVE-18346
> URL: https://issues.apache.org/jira/browse/HIVE-18346
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: JQu
>Assignee: Andrew Sherman
>Priority: Minor
>
> Beeline version 1.2.1 fails to launch when ${user.home}/.beeline/history is 
> larger than 39 MB, which triggers a "java.lang.OutOfMemoryError".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-18346) Beeline could not launch because of size of history

2017-12-28 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-18346:
-

Assignee: Madhudeep Petwal

> Beeline could not launch because of size of history
> ---
>
> Key: HIVE-18346
> URL: https://issues.apache.org/jira/browse/HIVE-18346
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 1.2.1
>Reporter: JQu
>Assignee: Madhudeep Petwal
>Priority: Minor
>
> Beeline version 1.2.1 fails to launch when ${user.home}/.beeline/history is 
> larger than 39 MB, which triggers a "java.lang.OutOfMemoryError".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HIVE-14615) Temp table leaves behind insert command

2017-12-28 Thread Andrew Sherman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Sherman reassigned HIVE-14615:
-

Assignee: Andrew Sherman  (was: Madhudeep Petwal)

> Temp table leaves behind insert command
> ---
>
> Key: HIVE-14615
> URL: https://issues.apache.org/jira/browse/HIVE-14615
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Chaoyu Tang
>Assignee: Andrew Sherman
>
> {code}
> create table test (key int, value string);
> insert into test values (1, 'val1');
> show tables;
> test
> values__tmp__table__1
> {code}
> the temp table values__tmp__table__1 resulted from insert into ... values 
> and persists until the session is closed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18325) Config to do case unaware schema evolution to ORC reader.

2017-12-28 Thread Amruth S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amruth S updated HIVE-18325:

Summary: Config to do case unaware schema evolution to ORC reader.  (was: 
sending flag to do case unaware schema evolution to reader.)

> Config to do case unaware schema evolution to ORC reader.
> -
>
> Key: HIVE-18325
> URL: https://issues.apache.org/jira/browse/HIVE-18325
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: piyush mukati
>Assignee: piyush mukati
>Priority: Critical
>
> For ORC data, the reader schema passed by Hive is all lowercase, so if a 
> column name stored in the file has any uppercase characters, the reader 
> returns null values for those columns even if the data is present in the file. 
> Column-name matching during schema evolution should be case-unaware, so we 
> need to pass a config for this from Hive. The ORC config 
> (orc.schema.evolution.case.sensitive) will be exposed by 
> https://issues.apache.org/jira/browse/ORC-264 
>  
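
A sketch of the consuming side, assuming ORC-264 lands (the file path and 
class name are illustrative; the config name comes from the description above):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.orc.OrcFile;
import org.apache.orc.Reader;

public class CaseUnawareOrcRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // false = match column names case-insensitively during schema evolution
    conf.setBoolean("orc.schema.evolution.case.sensitive", false);
    Reader reader = OrcFile.createReader(
        new Path("/warehouse/mytable/000000_0"),   // illustrative file path
        OrcFile.readerOptions(conf));
    System.out.println(reader.getSchema());
  }
}
{code}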



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18325) sending flag to do case unaware schema evolution to reader.

2017-12-28 Thread Amruth S (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amruth S updated HIVE-18325:

Priority: Critical  (was: Major)

> sending flag to do case unaware schema evolution to reader.
> ---
>
> Key: HIVE-18325
> URL: https://issues.apache.org/jira/browse/HIVE-18325
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: piyush mukati
>Assignee: piyush mukati
>Priority: Critical
>
> For ORC data, the reader schema passed by Hive is all lowercase, so if a 
> column name stored in the file has any uppercase characters, the reader 
> returns null values for those columns even if the data is present in the file. 
> Column-name matching during schema evolution should be case-unaware, so we 
> need to pass a config for this from Hive. The ORC config 
> (orc.schema.evolution.case.sensitive) will be exposed by 
> https://issues.apache.org/jira/browse/ORC-264 
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2017-12-28 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305203#comment-16305203
 ] 

anishek commented on HIVE-18341:


The /.reserved/raw virtual path seems to be documented only in relation to 
copying files via distcp. I think there will be some extra work w.r.t. copying 
the XATTR attributes if we want normal filesystem copies to use this 
optimization as well. 



> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will change 
> the files being copied by distcp to have this "/.reserved/raw/" prefix on 
> their paths.
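
A sketch of the path rewrite the option implies (an illustrative helper, not 
the patch); it is only valid when source and target share encryption-zone keys:
{code:java}
import org.apache.hadoop.fs.Path;

public class RawPaths {
  /** /warehouse/db/table/f -> /.reserved/raw/warehouse/db/table/f */
  static Path toRawPath(Path p) {
    // drop hdfs://host:port so the prefix lands at the root of the path
    Path noScheme = Path.getPathWithoutSchemeAndAuthority(p);
    return new Path("/.reserved/raw" + noScheme.toString());
  }
}
{code}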



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2017-12-28 Thread Shwetha G S (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305193#comment-16305193
 ] 

Shwetha G S commented on HIVE-18341:


{quote}
One thing to note: since regular file copies use the FileSystem copy, even TDE 
deployments with the same keys won't be able to leverage the optimization that 
distcp does
{quote}
If the filesystem read uses /.reserved/raw in the path, even a plain file copy 
should copy the encrypted bytes, right?

> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will change 
> the files being copied by distcp to have this "/.reserved/raw/" prefix on 
> their paths.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2017-12-28 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305167#comment-16305167
 ] 

anishek edited comment on HIVE-18341 at 12/28/17 8:38 AM:
--

I have also tried the _skipcrccheck_ option in distcp.options, but that fails 
as well, leaving files with incomprehensible data in the target warehouse.


was (Author: anishek):
I have also tried with _ skipcrccheck_ option in distcp.options but that fails 
as well with files having incomprehensible data on target warehouse.

> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will change 
> the files being copied by distcp to have this "/.reserved/raw/" prefix on 
> their paths.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2017-12-28 Thread anishek (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305167#comment-16305167
 ] 

anishek commented on HIVE-18341:


I have also tried the _ skipcrccheck_ option in distcp.options, but that fails 
as well, leaving files with incomprehensible data in the target warehouse.

> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will change 
> the files being copied by distcp to have this "/.reserved/raw/" prefix on 
> their paths.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-18341) Add repl load support for adding "raw" namespace for TDE with same encryption keys

2017-12-28 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-18341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-18341:
---
Attachment: HIVE-18341.1.patch

[~thejas] I have included the changes as provided in the distcp page for 
"/.reserved/raw", however it looks like the distcp copy fails with a 
checksum-mismatch exception. This shouldn't have happened, since the two 
different zones are using the same keys. Output from the logs:
{code}
Check-sum mismatch between hdfs://localhost:53536/.reserved/raw/warehouse0/targetandsourcehavesameencryptionzonekeys_1514449998552.db/encrypted_table/00_0_copy_1 and hdfs://localhost:53536/.reserved/raw/warehouse1/replicated_targetandsourcehavesameencryptionzonekeys_1514449998552.db/encrypted_table/.hive-staging_hive_2017-12-28_00-33-30_893_6165151359381350374-1/-ext-10001/.distcp.tmp.attempt_local327098851_0003_m_00_0
{code}
The test case is "targetAndSourceHaveSameEncryptionZoneKeys".

Additionally, I have included changes so that regular file copies (when there 
is just one file, or when the file size is small) run under *doAs*, using the 
user configuration provided for distcp ("hive.distcp.privileged.doAs").

One thing to note: since regular file copies use the FileSystem copy, even TDE 
deployments with the same keys won't be able to leverage the optimization that 
distcp does. This will be of particular interest for ACID table replication, 
where we will mostly transfer one delta file per table within a transaction.
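
A sketch of what such a privileged fallback copy could look like (the config 
name appears above; the class, method, and default user are illustrative):
{code:java}
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class PrivilegedCopy {
  /** Copy a single small file as the privileged distcp user instead of the session user. */
  static boolean copyAsPrivilegedUser(Configuration conf, Path src, Path dst) throws Exception {
    String privUser = conf.get("hive.distcp.privileged.doAs", "hdfs"); // illustrative default
    UserGroupInformation proxy =
        UserGroupInformation.createProxyUser(privUser, UserGroupInformation.getLoginUser());
    return proxy.doAs((PrivilegedExceptionAction<Boolean>) () -> {
      FileSystem srcFs = src.getFileSystem(conf);
      FileSystem dstFs = dst.getFileSystem(conf);
      return FileUtil.copy(srcFs, src, dstFs, dst, /* deleteSource */ false, conf);
    });
  }
}
{code}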

> Add repl load support for adding "raw" namespace for TDE with same encryption 
> keys
> --
>
> Key: HIVE-18341
> URL: https://issues.apache.org/jira/browse/HIVE-18341
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-18341.0.patch, HIVE-18341.1.patch
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html#Running_as_the_superuser
> "a new virtual path prefix, /.reserved/raw/, that gives superusers direct 
> access to the underlying block data in the filesystem. This allows superusers 
> to distcp data without needing having access to encryption keys, and also 
> avoids the overhead of decrypting and re-encrypting data."
> We need to introduce a new option in the "Repl Load" command that will change 
> the files being copied by distcp to have this "/.reserved/raw/" prefix on 
> their paths.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)