[jira] [Updated] (HIVE-14144) Permanent functions are showing up in show functions, but describe says it doesn't exist

2016-07-05 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-14144:
---
Target Version/s: 2.1.1

> Permanent functions are showing up in show functions, but describe says it 
> doesn't exist
> 
>
> Key: HIVE-14144
> URL: https://issues.apache.org/jira/browse/HIVE-14144
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-14144.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14167) Use work directories provided by Tez instead of directly using YARN local dirs

2016-07-05 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated HIVE-14167:
--
Affects Version/s: 2.1.0
 Target Version/s: 2.2.0

> Use work directories provided by Tez instead of directly using YARN local dirs
> --
>
> Key: HIVE-14167
> URL: https://issues.apache.org/jira/browse/HIVE-14167
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Siddharth Seth
>
> HIVE-13303 fixed things to use multiple directories instead of a single tmp 
> directory. However, it's using yarn-local-dirs directly.
> I'm not sure how well using the yarn-local-dir will work on a secure cluster.
> It would be better to use Tez*Context.getWorkDirs, which provides an 
> app-specific directory writable by the user.
> cc [~sershe]





[jira] [Commented] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363805#comment-15363805
 ] 

Hive QA commented on HIVE-14035:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816335/HIVE-14035.08.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10299 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hive.spark.client.TestSparkClient.testJobSubmission
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/377/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/377/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-377/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816335 - PreCommit-HIVE-MASTER-Build

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.patch
>
>
> In the current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a delete event followed by 
> a new insert event, which enables predicate pushdown to all delta files 
> without breaking correctness. To support backward compatibility, this JIRA 
> also proposes adding versioning to ACID so that different versions of ACID 
> transactions can co-exist.
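The proposed split can be illustrated with a small sketch (plain Python, not Hive's actual ACID code; the event tuples and function names are invented for illustration). An update becomes a delete of the old row version plus an insert of the new one, so collapsing never has to overwrite an earlier insert in place:

```python
# Illustrative sketch only: events are (kind, row_id, row) tuples.

def split_update(row_id, old_row, new_row):
    """Rewrite one update event as a (delete, insert) pair."""
    return [("delete", row_id, old_row), ("insert", row_id, new_row)]

def collapse(events):
    """Keep the latest surviving version of each row id."""
    live = {}
    for kind, row_id, row in events:
        if kind == "insert":
            live[row_id] = row
        elif kind == "delete":
            live.pop(row_id, None)
    return live

events = [("insert", 1, "a")] + split_update(1, "a", "b")
assert collapse(events) == {1: "b"}
```

Because every event is now either a pure insert or a pure delete, each delta file can be filtered independently by a pushed-down predicate.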





[jira] [Comment Edited] (HIVE-13749) Memory leak in Hive Metastore

2016-07-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363705#comment-15363705
 ] 

Thejas M Nair edited comment on HIVE-13749 at 7/6/16 3:46 AM:
--

+1 looks good.
Thanks for chasing this down!


was (Author: thejas):
+1 looks good.


> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-13749.1.patch, HIVE-13749.patch, Top_Consumers7.html
>
>
> In a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained 
> sets, occupy about 95% of the heap space. This leads to HMS crashes every 
> few days.
> I will attach an exported snapshot from the Eclipse MAT.





[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore

2016-07-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363705#comment-15363705
 ] 

Thejas M Nair commented on HIVE-13749:
--

+1 looks good.


> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-13749.1.patch, HIVE-13749.patch, Top_Consumers7.html
>
>
> In a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained 
> sets, occupy about 95% of the heap space. This leads to HMS crashes every 
> few days.
> I will attach an exported snapshot from the Eclipse MAT.





[jira] [Commented] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363696#comment-15363696
 ] 

Hive QA commented on HIVE-12646:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816307/HIVE-12646.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 24 failed/errored test(s), 10033 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-cbo_rp_join1.q-union_top_level.q-insert_update_delete.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-create_func1.q-bucketmapjoin3.q-enforce_order.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-encryption_join_with_different_encryption_keys.q-bucketcontext_3.q-udf_smallint.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-groupby4.q-convert_enum_to_string.q-mapjoin_filter_on_outerjoin.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-index_compact.q-merge_dynamic_partition2.q-cbo_rp_subq_exists.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-nullscript.q-vector_char_mapjoin1.q-load_dyn_part3.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-parquet_ppd_decimal.q-cluster.q-groupby_sort_6.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-ptf_general_queries.q-unionDistinct_1.q-udf_version.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-sample_islocalmode_hook_use_metadata.q-cbo_rp_semijoin.q-udf_when.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-schema_evol_text_vec_mapwork_part_all_complex.q-metadataonly1.q-deleteAnalyze.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-stats13.q-join_parse.q-sort_merge_join_desc_2.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-stats_publisher_error_1.q-auto_join1.q-cast_to_int.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-tez_joins_explain.q-rename_column.q-varchar_serde.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-tez_smb_empty.q-char_2.q-udf_date_sub.q-and-12-more - did not 
produce a TEST-*.xml file
TestCliDriver-udf_locate.q-join32_lessSize.q-correlationoptimizer8.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-vector_complex_join.q-interval_udf.q-udf_classloader_dynamic_dependency_resolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-vector_distinct_2.q-cte_mat_1.q-update_after_multiple_inserts_special_characters.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-vector_partition_diff_num_cols.q-stats2.q-union11.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
org.apache.hive.beeline.cli.TestHiveCli.testInValidCmd
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/376/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/376/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-376/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 24 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816307 - PreCommit-HIVE-MASTER-Build

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>Assignee: Sahil Takiar
> Attachments: HIVE-12646.patch
>
>
> Beeline and the Hive CLI require escaping ';' inside quotes, while most 
> other shells do not. For example:
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> ++
> | ;foo
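The behavior the issue asks for (a semicolon inside a quoted string must not terminate the statement) can be sketched with a small quote-aware splitter. This is an illustrative sketch in Python, not Beeline's actual parser:

```python
def split_statements(text):
    """Split SQL text on ';' while ignoring semicolons inside
    single- or double-quoted strings (illustration only; no escape
    handling)."""
    stmts, buf, quote = [], [], None
    for ch in text:
        if quote:                      # inside a quoted string
            buf.append(ch)
            if ch == quote:
                quote = None           # closing quote
        elif ch in ("'", '"'):
            quote = ch                 # opening quote
            buf.append(ch)
        elif ch == ";":
            stmts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    if "".join(buf).strip():
        stmts.append("".join(buf).strip())
    return stmts

assert split_statements("select ';' from tlb1;") == ["select ';' from tlb1"]
```

A splitter along these lines would let the failing example run without escaping the semicolon.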

[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363688#comment-15363688
 ] 

Thejas M Nair commented on HIVE-7224:
-

bq. To clarify, what would happen if Beeline uses the first 1000 rows to 
calculate the width, but then row 1001th is longer than that width. 
If the 1001st row has a column larger than the precomputed column width, that 
particular row would get a wider column to accommodate it. This would mean 
some rows have the separator "|" out of alignment with previous rows. 
However, even if we recompute every 1000 rows, we could still have 
misalignment every 1000 rows.

I looked at where the row width gets used. The width is used only when 
--outputformat=table (i.e. the TableOutputFormat class) is used.
If someone is working on very large outputs, they are likely to be processed 
by other applications rather than human eyes, and a *sv (e.g. csv) format is 
likely to be used. It doesn't make sense to waste CPU cycles computing the 
width in those cases. This is also the case where the performance impact of 
this computation would be most visible.

That is, if we can selectively enable buffering and width calculation only 
for TableOutputFormat, I don't think it matters whether we stick to column 
widths based on the first 1000 rows or recompute every 1000 rows.
It looks like the Row subclasses have access to the Beeline options and would 
be able to determine what the output format is.
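The idea of computing widths only for the table format can be sketched as follows (illustrative Python with invented names, not Beeline's actual classes): sample a bounded number of rows for widths when the format is "table", and stream everything else without buffering:

```python
# Illustrative sketch: width computation gated on output format.

def column_widths(rows, sample_size=1000):
    """Max cell width per column over the first sample_size rows."""
    widths = []
    for row in rows[:sample_size]:
        for i, cell in enumerate(row):
            if i >= len(widths):
                widths.append(len(cell))
            else:
                widths[i] = max(widths[i], len(cell))
    return widths

def emit(rows, output_format="csv"):
    if output_format != "table":
        # stream incrementally: no buffering, no width computation
        return [",".join(row) for row in rows]
    widths = column_widths(rows)
    return ["|".join(cell.ljust(w) for cell, w in zip(row, widths))
            for row in rows]

assert emit([["a", "bb"]], "csv") == ["a,bb"]
assert emit([["a", "bb"], ["ccc", "d"]], "table")[0] == "a  |bb"
```

A row longer than the sampled widths would simply render wider, producing the occasional "|" misalignment described above.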


> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental printing. We should make that the 
> default.





[jira] [Commented] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark

2016-07-05 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363656#comment-15363656
 ] 

Rui Li commented on HIVE-13278:
---

My understanding is {{HiveOutputFormatImpl.checkOutputSpecs}} only looks for 
the FS (FileSink) operator. FS indicates the end of a job, so if we find an 
FS in MapWork, it means this is a map-only job and we don't have to look for 
ReduceWork.
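The check described above amounts to a search of the operator tree. A minimal sketch (plain Python with simplified stand-ins for Hive's operator classes; "FS", "TS", "SEL", "RS" are the usual operator abbreviations):

```python
# Illustrative sketch: detect a map-only job by finding a FileSink (FS)
# in the MapWork operator tree.

def contains_file_sink(op):
    """Recursively search an operator tree for an FS operator."""
    if op["type"] == "FS":
        return True
    return any(contains_file_sink(c) for c in op.get("children", []))

# A plan ending in FS inside MapWork: map-only, skip ReduceWork.
map_only = {"type": "TS", "children": [
    {"type": "SEL", "children": [{"type": "FS"}]}]}
# A plan ending in a ReduceSink: reduce side follows.
map_reduce = {"type": "TS", "children": [{"type": "RS", "children": []}]}

assert contains_file_sink(map_only)
assert not contains_file_sink(map_reduce)
```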

> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> 
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
>Reporter: Xin Hao
>Assignee: Sahil Takiar
>Priority: Minor
>
> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark.
> It doesn't prevent the query from running successfully, so it is marked as 
> Minor for now.
> Error message example:
> {noformat}
> 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: 
> /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}





[jira] [Commented] (HIVE-14139) NPE dropping permanent function

2016-07-05 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363624#comment-15363624
 ] 

Rui Li commented on HIVE-14139:
---

[~sershe] - do you mean we should just return if {{refCount}} is null?
{code}
  private void removePersistentFunctionUnderLock(FunctionInfo fi) {
String className = fi.getClassName();
Integer refCount = persistent.get(className);
assert refCount != null;
if (refCount == 1) {
  persistent.remove(className);
} else {
  persistent.put(className, Integer.valueOf(refCount - 1));
}
  }
{code}
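The suggested guard can be sketched like this (a Python mirror of the Java above, not Hive's actual code; the registry is modeled as a plain dict and the class names are made up): return early when there is no recorded reference count instead of tripping the assert/NPE.

```python
# Sketch of the early-return variant of removePersistentFunctionUnderLock.

def remove_persistent_function(persistent, class_name):
    """Decrement the ref count for class_name; drop the entry at zero.
    If nothing is registered (e.g. a fresh session), do nothing."""
    ref_count = persistent.get(class_name)
    if ref_count is None:
        return  # nothing registered in this session; nothing to remove
    if ref_count == 1:
        del persistent[class_name]
    else:
        persistent[class_name] = ref_count - 1

registry = {"com.example.MyUDF": 2}       # hypothetical class name
remove_persistent_function(registry, "com.example.MyUDF")
assert registry == {"com.example.MyUDF": 1}
remove_persistent_function(registry, "com.example.Unknown")  # no error
assert registry == {"com.example.MyUDF": 1}
```

This matches the reproduction steps in the issue: a new CLI session dropping a function it never registered would otherwise hit a null refCount.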

> NPE dropping permanent function
> ---
>
> Key: HIVE-14139
> URL: https://issues.apache.org/jira/browse/HIVE-14139
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14139.1.patch, HIVE-14139.2.patch, 
> HIVE-14139.3.patch
>
>
> To reproduce:
> 1. Start a CLI session and create a permanent function.
> 2. Exit current CLI session.
> 3. Start a new CLI session and drop the function.
> Stack trace:
> {noformat}
> FAILED: error during drop function: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.removePersistentFunctionUnderLock(Registry.java:513)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.unregisterFunction(Registry.java:501)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.unregisterPermanentFunction(FunctionRegistry.java:1532)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.dropPermanentFunction(FunctionTask.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:95)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1860)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1564)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1316)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1085)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1073)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> {noformat}





[jira] [Commented] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2016-07-05 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363618#comment-15363618
 ] 

Rui Li commented on HIVE-14156:
---

Thanks [~xuefuz] and [~niklaus.xiao] for your inputs.
Besides configuring the underlying DB, I think we also need to change how we 
create the database and tables in Hive's metastore schema scripts. This 
[doc|https://dev.mysql.com/doc/refman/5.7/en/charset-applications.html] 
provides a way to set the character set per database.
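The "??" symptom in the issue is consistent with a UTF-8 value being forced through a non-UTF-8 (e.g. latin-1) column or connection, which substitutes '?' for each character it cannot represent. A tiny sketch reproduces the effect:

```python
# Illustration of the mojibake: encoding "北京" into a charset that
# cannot represent it yields one '?' per character, matching what the
# reporter sees in the MySQL-backed metastore.

value = "北京"
stored = value.encode("latin-1", errors="replace")
assert stored == b"??"
```

Hence the fix direction in the comment: both the MySQL database and the metastore schema need a character set that can hold these values.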

> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When checking the partition value in MySQL, it shows ?? instead of "北京".
> When running "drop table t1", it hangs.





[jira] [Updated] (HIVE-14163) LLAP: use different kerberized/unkerberized zk paths for registry

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14163:

Attachment: HIVE-14163.01.patch

Added

> LLAP: use different kerberized/unkerberized zk paths for registry
> -
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.01.patch, HIVE-14163.patch
>
>






[jira] [Updated] (HIVE-12052) automatically populate file metadata to HBase metastore based on config or table properties

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12052:

Attachment: HIVE-12052.WIP.patch

WIP patch... close to finishing; then I need to test. Adding updates to methods 
other than alter/create table will probably be in a separate JIRA.


> automatically populate file metadata to HBase metastore based on config or 
> table properties
> ---
>
> Key: HIVE-12052
> URL: https://issues.apache.org/jira/browse/HIVE-12052
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12052.WIP.patch
>
>
> As discussed in HIVE-11500
> Should use a table property similar to auto.purge.
> Then, when this setting is set, partitions are added (convertToMPart is a 
> good source to find all the paths for that), after compactions, after 
> load/non-ACID insert, and periodically (configurable), the storage locations 
> should be scanned for new files and cache updated accordingly. All the 
> updates should probably be in the background thread and taken from queue 
> (high pri from most ops, low pri from enabling the property and from periodic 
> updates) to avoid high load on HDFS from metastore.
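The proposed queueing can be sketched with a priority queue drained by a single background worker (an illustrative sketch with invented paths and priority values, not the actual metastore design): high-priority entries from DDL/compaction jump ahead of low-priority periodic scans, so HDFS is never hit in bursts.

```python
# Sketch: one background queue for file-metadata cache updates.

import heapq
import itertools

HIGH, LOW = 0, 1                    # smaller number = higher priority
counter = itertools.count()         # FIFO tie-break within a priority
queue = []

def enqueue(priority, path):
    heapq.heappush(queue, (priority, next(counter), path))

def drain():
    """Process entries in priority order (what the worker would do)."""
    order = []
    while queue:
        _, _, path = heapq.heappop(queue)
        order.append(path)
    return order

enqueue(LOW, "/warehouse/t/periodic_scan")   # hypothetical paths
enqueue(HIGH, "/warehouse/t/part=1")
assert drain() == ["/warehouse/t/part=1", "/warehouse/t/periodic_scan"]
```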





[jira] [Commented] (HIVE-14163) LLAP: use different kerberized/unkerberized zk paths for registry

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363570#comment-15363570
 ] 

Hive QA commented on HIVE-14163:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816296/HIVE-14163.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10295 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskSchedulerService.testDelayedLocalityNodeCommErrorImmediateAllocation
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/375/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/375/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-375/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816296 - PreCommit-HIVE-MASTER-Build

> LLAP: use different kerberized/unkerberized zk paths for registry
> -
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Commented] (HIVE-14145) Too small length of column 'PARAM_VALUE' in table 'SERDE_PARAMS'

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363560#comment-15363560
 ] 

niklaus xiao commented on HIVE-14145:
-

Met the same issue. Can anyone review the patch?

> Too small length of column 'PARAM_VALUE' in table 'SERDE_PARAMS'
> 
>
> Key: HIVE-14145
> URL: https://issues.apache.org/jira/browse/HIVE-14145
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
> Fix For: 2.1.1
>
> Attachments: HIVE-14145.1.patch, HIVE-14145.2.patch
>
>
> Customer has following table
> {code}
> create external table hive_hbase_test(
> HBASE_KEY string,
> ENTITY_NAME string,
> ENTITY_ID string,
> CLAIM_HEADER_ID string,
> CLAIM_LINE_ID string,
> MEDICAL_CLAIM_SOURCE_SYSTEM string,
> UNIQUE_MEMBER_ID string,
> MEMBER_SOURCE_SYSTEM string,
> SUBSCRIBER_ID string,
> COVERAGE_CLASS_CODE string,
> SERVICING_PROVIDER_ID string,
> PROVIDER_SOURCE_SYSTEM string,
> SERVICING_PROVIDER_SPECIALTY string,
> SERVICING_STANDARD_PROVIDER_SPECIALTY string,
> SERVICING_PROVIDER_TYPE_CODE string,
> REFERRING_PROVIDER_ID string,
> ADMITTING_PROVIDER_ID string,
> ATTENDING_PROVIDER_ID string,
> OPERATING_PROVIDER_ID string,
> BILLING_PROVIDER_ID string,
> ORDERING_PROVIDER_ID string,
> HEALTH_PLAN_SOURCE_ID string,
> HEALTH_PLAN_PAYER_NAME string,
> BUSINESS_UNIT string,
> OPERATING_UNIT string,
> PRODUCT string,
> MARKET string,
> DEPARTMENT string,
> IPA string,
> SUPPLEMENTAL_DATA_TYPE string,
> PSEUDO_CLAIM_FLAG string,
> CLAIM_STATUS string,
> CLAIM_LINE_STATUS string,
> CLAIM_DENIED_FLAG string,
> SERVICE_LINE_DENIED_FLAG string,
> DENIED_REASON_CODE string,
> SERVICE_LINE_DENIED_REASON_CODE string,
> DAYS_DENIED int,
> DIAGNOSIS_DATE timestamp,
> SERVICE_DATE TIMESTAMP,
> SERVICE_FROM_DATE TIMESTAMP,
> SERVICE_TO_DATE TIMESTAMP,
> ADMIT_DATE TIMESTAMP,
> ADMIT_TYPE string,
> ADMIT_SOURCE_TYPE string,
> DISCHARGE_DATE TIMESTAMP,
> DISCHARGE_STATUS_CODE string,
> SERVICE_LINE_TYPE_OF_SERVICE string,
> TYPE_OF_BILL_CODE string,
> INPATIENT_FLAG string,
> PLACE_OF_SERVICE_CODE string,
> FACILITY_CODE string,
> AUTHORIZATION_NUMBER string,
> CLAIM_REFERRAL_NUMBER string,
> CLAIM_TYPE string,
> CLAIM_ADJUSTMENT_TYPE string,
> ICD_DIAGNOSIS_CODE_1 string,
> PRESENT_ON_ADMISSION_FLAG_1 string,
> ICD_DIAGNOSIS_CODE_2 string,
> PRESENT_ON_ADMISSION_FLAG_2 string,
> ICD_DIAGNOSIS_CODE_3 string,
> PRESENT_ON_ADMISSION_FLAG_3 string,
> ICD_DIAGNOSIS_CODE_4 string,
> PRESENT_ON_ADMISSION_FLAG_4 string,
> ICD_DIAGNOSIS_CODE_5 string,
> PRESENT_ON_ADMISSION_FLAG_5 string,
> ICD_DIAGNOSIS_CODE_6 string,
> PRESENT_ON_ADMISSION_FLAG_6 string,
> ICD_DIAGNOSIS_CODE_7 string,
> PRESENT_ON_ADMISSION_FLAG_7 string,
> ICD_DIAGNOSIS_CODE_8 string,
> PRESENT_ON_ADMISSION_FLAG_8 string,
> ICD_DIAGNOSIS_CODE_9 string,
> PRESENT_ON_ADMISSION_FLAG_9 string,
> ICD_DIAGNOSIS_CODE_10 string,
> PRESENT_ON_ADMISSION_FLAG_10 string,
> ICD_DIAGNOSIS_CODE_11 string,
> PRESENT_ON_ADMISSION_FLAG_11 string,
> ICD_DIAGNOSIS_CODE_12 string,
> PRESENT_ON_ADMISSION_FLAG_12 string,
> ICD_DIAGNOSIS_CODE_13 string,
> PRESENT_ON_ADMISSION_FLAG_13 string,
> ICD_DIAGNOSIS_CODE_14 string,
> PRESENT_ON_ADMISSION_FLAG_14 string,
> ICD_DIAGNOSIS_CODE_15 string,
> PRESENT_ON_ADMISSION_FLAG_15 string,
> ICD_DIAGNOSIS_CODE_16 string,
> PRESENT_ON_ADMISSION_FLAG_16 string,
> ICD_DIAGNOSIS_CODE_17 string,
> PRESENT_ON_ADMISSION_FLAG_17 string,
> ICD_DIAGNOSIS_CODE_18 string,
> PRESENT_ON_ADMISSION_FLAG_18 string,
> ICD_DIAGNOSIS_CODE_19 string,
> PRESENT_ON_ADMISSION_FLAG_19 string,
> ICD_DIAGNOSIS_CODE_20 string,
> PRESENT_ON_ADMISSION_FLAG_20 string,
> ICD_DIAGNOSIS_CODE_21 string,
> PRESENT_ON_ADMISSION_FLAG_21 string,
> ICD_DIAGNOSIS_CODE_22 string,
> PRESENT_ON_ADMISSION_FLAG_22 string,
> ICD_DIAGNOSIS_CODE_23 string,
> PRESENT_ON_ADMISSION_FLAG_23 string,
> ICD_DIAGNOSIS_CODE_24 string,
> PRESENT_ON_ADMISSION_FLAG_24 string,
> ICD_DIAGNOSIS_CODE_25 string,
> PRESENT_ON_ADMISSION_FLAG_25 string,
> QUANTITY_OF_SERVICES decimal(10,2),
> REVENUE_CODE string,
> PROCEDURE_CODE string,
> PROCEDURE_CODE_MODIFIER_1 string,
> PROCEDURE_CODE_MODIFIER_2 string,
> PROCEDURE_CODE_MODIFIER_3 string,
> PROCEDURE_CODE_MODIFIER_4 string,
> ICD_VERSION_CODE_TYPE string,
> ICD_PROCEDURE_CODE_1 string,
> ICD_PROCEDURE_CODE_2 string,
> ICD_PROCEDURE_CODE_3 string,
> ICD_PROCEDURE_CODE_4 string,
> ICD_PROCEDURE_CODE_5 string,
> ICD_PROCEDURE_CODE_6 string,
> ICD_PROCEDURE_CODE_7 string,
> ICD_PROCEDURE_CODE_8 string,
> ICD_PROCEDURE_CODE_9 string,
> ICD_PROCEDURE_CODE_10 string,
> ICD_PROCEDURE_CODE_11 string,
> ICD_PROCEDURE_CODE_12 string,
> ICD_PROCEDURE_CODE_13 string,
> ICD_PROCEDURE_CODE_14 string,
> ICD_PROCEDURE_CODE_15 string,
> ICD_PROCEDURE_CODE_16 

[jira] [Commented] (HIVE-14081) Appending a variable value into the hive query inside java code gives me an error

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363552#comment-15363552
 ] 

niklaus xiao commented on HIVE-14081:
-

Try this:
{code}
res=stm.executeQuery("select * from dataset where c_name = 
'"+jComboBox1.getSelectedItem() + "'");
{code}
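The quoted concatenation above fixes the immediate parse error, but a bound parameter sidesteps quoting entirely (and SQL injection with it). A sketch of the same query with a placeholder, using Python's built-in sqlite3 as a stand-in for a JDBC PreparedStatement (table contents are made up):

```python
# Sketch: parameterized query instead of string concatenation.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table dataset (c_name text)")
conn.execute("insert into dataset values ('AAPL')")

selected = "AAPL"  # e.g. the value from jComboBox1.getSelectedItem()
rows = conn.execute(
    "select * from dataset where c_name = ?", (selected,)
).fetchall()
assert rows == [("AAPL",)]
```

In JDBC the equivalent would be a PreparedStatement with a `?` placeholder and `setString`, so the driver handles quoting.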

> Appending a variable value into the hive query inside java code gives me an 
> error
> -
>
> Key: HIVE-14081
> URL: https://issues.apache.org/jira/browse/HIVE-14081
> Project: Hive
>  Issue Type: Bug
>  Components: API
>Affects Versions: 0.13.0
>Reporter: Amey D
>
> New to this forum; please help or guide me to where I can find the solution 
> to this error.
> Query inside java :
>  res=stm.executeQuery("select * from dataset where c_name = 
> "+jComboBox1.getSelectedItem());
> Error :
> FAILED: SemanticException [Error 10004]: Line 1:35 Invalid table alias or 
> column reference 'AAPL': (possible column names are:
> I cannot get past this issue.





[jira] [Updated] (HIVE-14158) deal with derived column names

2016-07-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14158:
---
Status: Patch Available  (was: Open)

> deal with derived column names
> --
>
> Key: HIVE-14158
> URL: https://issues.apache.org/jira/browse/HIVE-14158
> Project: Hive
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-14158.01.patch, HIVE-14158.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14158) deal with derived column names

2016-07-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14158:
---
Status: Open  (was: Patch Available)

> deal with derived column names
> --
>
> Key: HIVE-14158
> URL: https://issues.apache.org/jira/browse/HIVE-14158
> Project: Hive
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-14158.01.patch, HIVE-14158.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14158) deal with derived column names

2016-07-05 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-14158:
---
Attachment: HIVE-14158.02.patch

> deal with derived column names
> --
>
> Key: HIVE-14158
> URL: https://issues.apache.org/jira/browse/HIVE-14158
> Project: Hive
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-14158.01.patch, HIVE-14158.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14044) Newlines in Avro maps cause external table to return corrupt values

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363512#comment-15363512
 ] 

Sahil Takiar commented on HIVE-14044:
-

[~Sh4pe] is there any more environment information you can provide?

I tried to reproduce this on CDH 5.5.1 but I can't reproduce it there either.

> Newlines in Avro maps cause external table to return corrupt values
> ---
>
> Key: HIVE-14044
> URL: https://issues.apache.org/jira/browse/HIVE-14044
> Project: Hive
>  Issue Type: Bug
> Environment: Hive version: 1.1.0-cdh5.5.1 (bundled with cloudera 
> 5.5.1)
>Reporter: David Nies
>Assignee: Sahil Takiar
>Priority: Critical
> Attachments: test.json, test.schema
>
>
> When {{\n}} characters are contained in Avro files that are used as data 
> bases for an external table, the result of {{SELECT}} queries may be corrupt. 
> I encountered this error when querying hive both from {{beeline}} and from 
> JDBC.
> h3. Steps to reproduce (used files are attached to ticket)
> # Create an {{.avro}} file that contains newline characters in a value of a 
> map:
> {code}
> avro-tools fromjson --schema-file test.schema test.json > test.avro
> {code}
> # Copy {{.avro}} file to HDFS
> {code}
> hdfs dfs -copyFromLocal test.avro /some/location/
> {code}
> # Create an external table in beeline containing this {{.avro}}:
> {code}
> beeline> CREATE EXTERNAL TABLE broken_newline_map
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION '/some/location/'
> TBLPROPERTIES ('avro.schema.literal'='
> {
>   "type" : "record",
>   "name" : "myEntry",
>   "namespace" : "myNamespace",
>   "fields" : [ {
> "name" : "foo",
> "type" : "long"
>   }, {
> "name" : "bar",
> "type" : {
>   "type" : "map",
>   "values" : "string"
> }
>   } ]
> }
> ');
> {code}
> # Now, selecting may return corrupt results:
> {code}
> jdbc:hive2://my-server:1/> select * from broken_newline_map;
> +------------------------+----------------------------------------------------+
> | broken_newline_map.foo |               broken_newline_map.bar               |
> +------------------------+----------------------------------------------------+
> | 1                      | {"key2":"value2","key1":"value1\nafter newline"}   |
> | 2                      | {"key2":"new value2","key1":"new value"}           |
> +------------------------+----------------------------------------------------+
> 2 rows selected (1.661 seconds)
> jdbc:hive2://my-server:1/> select foo, map_keys(bar), map_values(bar) 
> from broken_newline_map;
> +-------+------------------+-----------------------------+
> |  foo  |       _c1        |             _c2             |
> +-------+------------------+-----------------------------+
> | 1     | ["key2","key1"]  | ["value2","value1"]         |
> | NULL  | NULL             | NULL                        |
> | 2     | ["key2","key1"]  | ["new value2","new value"]  |
> +-------+------------------+-----------------------------+
> 3 rows selected (28.05 seconds)
> {code}
> Obviously, the last result set contains corrupt entries (line 2) and 
> incorrect entries (line 1). I also encountered this when doing this query 
> with JDBC. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14007) Replace ORC module with ORC release

2016-07-05 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363497#comment-15363497
 ] 

Owen O'Malley commented on HIVE-14007:
--

I wasn't sure whether deleting variables out of HiveConf would break any users.

I believe all of the ORC documentation has already moved to 
http://orc.apache.org . In particular, it should be:

The Hive Configuration Properties - ORC File Format section is at 
http://orc.apache.org/docs/hive-config.html

The Hive Language Manual - ORC Files section is at 
http://orc.apache.org/docs/spec-intro.html .



> Replace ORC module with ORC release
> ---
>
> Key: HIVE-14007
> URL: https://issues.apache.org/jira/browse/HIVE-14007
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.2.0
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 2.2.0
>
> Attachments: HIVE-14007.patch
>
>
> This completes moving the core ORC reader & writer to the ORC project.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13958) hive.strict.checks.type.safety should apply to decimals, as well as IN... and BETWEEN... ops

2016-07-05 Thread Takuma Wakamori (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363490#comment-15363490
 ] 

Takuma Wakamori commented on HIVE-13958:


@Sergey Shelukhin
Thank you for reviewing.
I apologize in advance that it will take a long time to fix this, because I 
recently lost my laptop and am buying a new one.
If that is unacceptable, please unassign me from this issue.
Best,

> hive.strict.checks.type.safety should apply to decimals, as well as IN... and 
> BETWEEN... ops
> 
>
> Key: HIVE-13958
> URL: https://issues.apache.org/jira/browse/HIVE-13958
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Takuma Wakamori
>  Labels: patch
> Attachments: HIVE-13958.01.patch, HIVE-13958.02.patch, 
> HIVE-13958.03.patch, HIVE-13958.04.patch
>
>
> String to decimal auto-casts should be prohibited for compares



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13934) Configure Tez to make noconditional task size memory available for the Processor

2016-07-05 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13934:
-
Attachment: HIVE-13934.3.patch

Patch 3, reserving more memory for DAGs that have Mapjoins

> Configure Tez to make noconditional task size memory available for the 
> Processor
> ---
>
> Key: HIVE-13934
> URL: https://issues.apache.org/jira/browse/HIVE-13934
> Project: Hive
>  Issue Type: Bug
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13934.1.patch, HIVE-13934.2.patch, 
> HIVE-13934.3.patch
>
>
> Currently, noconditionaltasksize is not validated against the container size, 
> the reservations made in the container by Tez for Inputs / Outputs etc.
> Check this at compile time to see if enough memory is available, or set up 
> the vertex to reserve additional memory for the Processor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14044) Newlines in Avro maps cause external table to return corrupt values

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363467#comment-15363467
 ] 

Sahil Takiar commented on HIVE-14044:
-

I checked and this issue is no longer present in the master branch.

The query: {{select foo, map_keys(bar), map_values(bar) from 
broken_newline_map}}

Prints:

{code}
+------+------------------+-------------------------------------+
| foo  |       _c1        |                 _c2                 |
+------+------------------+-------------------------------------+
| 1    | ["key1","key2"]  | ["value1\nafter newline","value2"]  |
| 2    | ["key1","key2"]  | ["new value","new value2"]          |
+------+------------------+-------------------------------------+
{code}
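The corruption in the original report comes from a raw newline splitting one text row into two; escaping it keeps the row intact. An illustrative sketch of that mechanism (not Hive's actual SerDe code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration: a map value containing a raw '\n' splits one row into two when
// rows are newline-delimited; escaping the character keeps the row intact.
public class NewlineRows {
    static String toRow(long foo, Map<String, String> bar, boolean escape) {
        StringBuilder sb = new StringBuilder().append(foo).append('\t');
        for (Map.Entry<String, String> e : bar.entrySet()) {
            String v = escape ? e.getValue().replace("\n", "\\n") : e.getValue();
            sb.append(e.getKey()).append(':').append(v).append(',');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> bar = new LinkedHashMap<>();
        bar.put("key1", "value1\nafter newline");
        // Unescaped: a consumer splitting on '\n' sees two fragments, the second corrupt.
        System.out.println(toRow(1, bar, false).split("\n").length); // 2 fragments
        // Escaped: one intact row.
        System.out.println(toRow(1, bar, true).split("\n").length);  // 1 row
    }
}
```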

> Newlines in Avro maps cause external table to return corrupt values
> ---
>
> Key: HIVE-14044
> URL: https://issues.apache.org/jira/browse/HIVE-14044
> Project: Hive
>  Issue Type: Bug
> Environment: Hive version: 1.1.0-cdh5.5.1 (bundled with cloudera 
> 5.5.1)
>Reporter: David Nies
>Assignee: Sahil Takiar
>Priority: Critical
> Attachments: test.json, test.schema
>
>
> When {{\n}} characters are contained in Avro files that are used as data 
> bases for an external table, the result of {{SELECT}} queries may be corrupt. 
> I encountered this error when querying hive both from {{beeline}} and from 
> JDBC.
> h3. Steps to reproduce (used files are attached to ticket)
> # Create an {{.avro}} file that contains newline characters in a value of a 
> map:
> {code}
> avro-tools fromjson --schema-file test.schema test.json > test.avro
> {code}
> # Copy {{.avro}} file to HDFS
> {code}
> hdfs dfs -copyFromLocal test.avro /some/location/
> {code}
> # Create an external table in beeline containing this {{.avro}}:
> {code}
> beeline> CREATE EXTERNAL TABLE broken_newline_map
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION '/some/location/'
> TBLPROPERTIES ('avro.schema.literal'='
> {
>   "type" : "record",
>   "name" : "myEntry",
>   "namespace" : "myNamespace",
>   "fields" : [ {
> "name" : "foo",
> "type" : "long"
>   }, {
> "name" : "bar",
> "type" : {
>   "type" : "map",
>   "values" : "string"
> }
>   } ]
> }
> ');
> {code}
> # Now, selecting may return corrupt results:
> {code}
> jdbc:hive2://my-server:1/> select * from broken_newline_map;
> +------------------------+----------------------------------------------------+
> | broken_newline_map.foo |               broken_newline_map.bar               |
> +------------------------+----------------------------------------------------+
> | 1                      | {"key2":"value2","key1":"value1\nafter newline"}   |
> | 2                      | {"key2":"new value2","key1":"new value"}           |
> +------------------------+----------------------------------------------------+
> 2 rows selected (1.661 seconds)
> jdbc:hive2://my-server:1/> select foo, map_keys(bar), map_values(bar) 
> from broken_newline_map;
> +-------+------------------+-----------------------------+
> |  foo  |       _c1        |             _c2             |
> +-------+------------------+-----------------------------+
> | 1     | ["key2","key1"]  | ["value2","value1"]         |
> | NULL  | NULL             | NULL                        |
> | 2     | ["key2","key1"]  | ["new value2","new value"]  |
> +-------+------------------+-----------------------------+
> 3 rows selected (28.05 seconds)
> {code}
> Obviously, the last result set contains corrupt entries (line 2) and 
> incorrect entries (line 1). I also encountered this when doing this query 
> with JDBC. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-14044) Newlines in Avro maps cause external table to return corrupt values

2016-07-05 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-14044:
---

Assignee: Sahil Takiar

> Newlines in Avro maps cause external table to return corrupt values
> ---
>
> Key: HIVE-14044
> URL: https://issues.apache.org/jira/browse/HIVE-14044
> Project: Hive
>  Issue Type: Bug
> Environment: Hive version: 1.1.0-cdh5.5.1 (bundled with cloudera 
> 5.5.1)
>Reporter: David Nies
>Assignee: Sahil Takiar
>Priority: Critical
> Attachments: test.json, test.schema
>
>
> When {{\n}} characters are contained in Avro files that are used as data 
> bases for an external table, the result of {{SELECT}} queries may be corrupt. 
> I encountered this error when querying hive both from {{beeline}} and from 
> JDBC.
> h3. Steps to reproduce (used files are attached to ticket)
> # Create an {{.avro}} file that contains newline characters in a value of a 
> map:
> {code}
> avro-tools fromjson --schema-file test.schema test.json > test.avro
> {code}
> # Copy {{.avro}} file to HDFS
> {code}
> hdfs dfs -copyFromLocal test.avro /some/location/
> {code}
> # Create an external table in beeline containing this {{.avro}}:
> {code}
> beeline> CREATE EXTERNAL TABLE broken_newline_map
> ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> LOCATION '/some/location/'
> TBLPROPERTIES ('avro.schema.literal'='
> {
>   "type" : "record",
>   "name" : "myEntry",
>   "namespace" : "myNamespace",
>   "fields" : [ {
> "name" : "foo",
> "type" : "long"
>   }, {
> "name" : "bar",
> "type" : {
>   "type" : "map",
>   "values" : "string"
> }
>   } ]
> }
> ');
> {code}
> # Now, selecting may return corrupt results:
> {code}
> jdbc:hive2://my-server:1/> select * from broken_newline_map;
> +------------------------+----------------------------------------------------+
> | broken_newline_map.foo |               broken_newline_map.bar               |
> +------------------------+----------------------------------------------------+
> | 1                      | {"key2":"value2","key1":"value1\nafter newline"}   |
> | 2                      | {"key2":"new value2","key1":"new value"}           |
> +------------------------+----------------------------------------------------+
> 2 rows selected (1.661 seconds)
> jdbc:hive2://my-server:1/> select foo, map_keys(bar), map_values(bar) 
> from broken_newline_map;
> +-------+------------------+-----------------------------+
> |  foo  |       _c1        |             _c2             |
> +-------+------------------+-----------------------------+
> | 1     | ["key2","key1"]  | ["value2","value1"]         |
> | NULL  | NULL             | NULL                        |
> | 2     | ["key2","key1"]  | ["new value2","new value"]  |
> +-------+------------------+-----------------------------+
> 3 rows selected (28.05 seconds)
> {code}
> Obviously, the last result set contains corrupt entries (line 2) and 
> incorrect entries (line 1). I also encountered this when doing this query 
> with JDBC. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-07-05 Thread Saket Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Status: Patch Available  (was: Open)

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.
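The proposed rewrite can be sketched abstractly (this is a conceptual illustration, not Hive's actual ACID event format):

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch: rewrite each UPDATE event as a DELETE of the old row
// version followed by an INSERT of the new one, so every event in a delta file
// is self-contained and predicate pushdown no longer has to account for
// updates overwriting earlier inserts.
public class SplitUpdate {
    static class Event {
        final String type;   // "INSERT", "DELETE", or "UPDATE"
        final long rowId;
        final String row;    // new row contents, null for DELETE
        Event(String type, long rowId, String row) {
            this.type = type; this.rowId = rowId; this.row = row;
        }
    }

    static List<Event> split(List<Event> events) {
        List<Event> out = new ArrayList<>();
        for (Event e : events) {
            if (e.type.equals("UPDATE")) {
                out.add(new Event("DELETE", e.rowId, null));   // drop old version
                out.add(new Event("INSERT", e.rowId, e.row));  // write new version
            } else {
                out.add(e);
            }
        }
        return out;
    }
}
```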



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-07-05 Thread Saket Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Attachment: HIVE-14035.08.patch

Allow transactional_properties to be set when converting non-acid tables to 
acid tables and add more unit test cases

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.08.patch, HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363417#comment-15363417
 ] 

Hive QA commented on HIVE-14146:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816201/HIVE-14146.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 12 failed/errored test(s), 10281 tests 
executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-vector_non_string_partition.q-delete_where_non_partitioned.q-auto_sortmerge_join_16.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_concatenate_indexed_table
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_auth
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_indexes_edge_cases
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_indexes_syntax
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_concatenate_indexed_table
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/373/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/373/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-373/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 12 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816201 - PreCommit-HIVE-MASTER-Build

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.patch
>
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+------------------------+
> |     col_name      | data_type  |        comment         |
> +-------------------+------------+------------------------+
> | first_nm          | string     | Indicates First name   |
> | of an individual  | NULL       | NULL                   |
> +-------------------+------------+------------------------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14035) Enable predicate pushdown to delta files created by ACID Transactions

2016-07-05 Thread Saket Saurabh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saket Saurabh updated HIVE-14035:
-
Status: Open  (was: Patch Available)

> Enable predicate pushdown to delta files created by ACID Transactions
> -
>
> Key: HIVE-14035
> URL: https://issues.apache.org/jira/browse/HIVE-14035
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: HIVE-14035.02.patch, HIVE-14035.03.patch, 
> HIVE-14035.04.patch, HIVE-14035.05.patch, HIVE-14035.06.patch, 
> HIVE-14035.07.patch, HIVE-14035.patch
>
>
> In current Hive version, delta files created by ACID transactions do not 
> allow predicate pushdown if they contain any update/delete events. This is 
> done to preserve correctness when following a multi-version approach during 
> event collapsing, where an update event overwrites an existing insert event. 
> This JIRA proposes to split an update event into a combination of a delete 
> event followed by a new insert event, that can enable predicate push down to 
> all delta files without breaking correctness. To support backward 
> compatibility for this feature, this JIRA also proposes to add some sort of 
> versioning to ACID that can allow different versions of ACID transactions to 
> co-exist together.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363377#comment-15363377
 ] 

Sahil Takiar commented on HIVE-13278:
-

[~lirui] you are right. I debugged this some more, and found out that the 
{{FileNotFoundException}} is caused by {{HiveOutputFormatImpl.checkOutputSpecs 
-> Utilities.getMapRedWork}}. This happens even when running in MR mode.

I can work on a fix, but I'm not entirely sure how to modify the 
{{HiveOutputFormatImpl}} class to fix this. Given a {{JobConf}} object, is 
there any way to know if the current job corresponds to a Map-Only job?
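One possible answer to that question, sketched with a plain {{Properties}} object standing in for a Hadoop {{JobConf}} (with a real {{JobConf}} this would be {{jobConf.getNumReduceTasks() == 0}}, so {{checkOutputSpecs}} could skip loading {{reduce.xml}} for such jobs). This is a hypothetical sketch, not Hive's code:

```java
import java.util.Properties;

public class MapOnlyCheck {
    // Treat a job configured with zero reduce tasks as map-only.
    // "mapreduce.job.reduces" is the standard Hadoop key for the reducer count;
    // Hadoop's default is 1 when the key is unset.
    static boolean isMapOnly(Properties conf) {
        return Integer.parseInt(conf.getProperty("mapreduce.job.reduces", "1")) == 0;
    }
}
```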

> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> 
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
>Reporter: Xin Hao
>Assignee: Sahil Takiar
>Priority: Minor
>
> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark.
> Certainly, it doesn't prevent the query from running successfully. So mark it 
> as Minor currently.
> Error message example:
> {noformat}
> 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: 
> /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14140) LLAP: package codec jars

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14140:

   Resolution: Fixed
Fix Version/s: 2.1.1
   2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master and branch-2.1. Thanks for the review!

> LLAP: package codec jars
> 
>
> Key: HIVE-14140
> URL: https://issues.apache.org/jira/browse/HIVE-14140
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-14140.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363295#comment-15363295
 ] 

Sahil Takiar commented on HIVE-7224:


{quote}
I think it is better to keep the column width to be consistent across all rows 
in the output, rather than adjust it every 1000 rows.
{quote}

Thanks for the input [~thejas]! To clarify: what would happen if Beeline used 
the first 1000 rows to calculate the width, but the 1001st row was longer than 
that width? In that case the width would have to be widened for that row, 
right? I think that is what the current implementation of {{--incremental}} is 
doing.

I like the idea of keeping the width the same for all rows, even in incremental 
mode, but I'm not sure how it would work. Thoughts?
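The width handling under discussion could be sketched as follows. This is hypothetical helper code, not Beeline's actual implementation: derive widths from the first buffered batch, then widen whenever a later row exceeds them.

```java
import java.util.List;

public class IncrementalWidths {
    // Compute per-column widths from a buffered batch of rows.
    static int[] computeWidths(List<String[]> rows, int numCols) {
        int[] widths = new int[numCols];
        for (String[] row : rows) {
            for (int c = 0; c < numCols; c++) {
                widths[c] = Math.max(widths[c], row[c].length());
            }
        }
        return widths;
    }

    // Widen in place when a later row (e.g. the 1001st) exceeds the widths
    // derived from the first batch; returns true if any column changed,
    // which is exactly when the printed column boundaries would shift.
    static boolean widen(int[] widths, String[] row) {
        boolean changed = false;
        for (int c = 0; c < widths.length; c++) {
            if (row[c].length() > widths[c]) {
                widths[c] = row[c].length();
                changed = true;
            }
        }
        return changed;
    }
}
```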

> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14164) JDBC: Add retry in JDBC driver when reading config values from ZK

2016-07-05 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14164:

Affects Version/s: 1.2.1
   2.1.0
   2.0.1

> JDBC: Add retry in JDBC driver when reading config values from ZK
> -
>
> Key: HIVE-14164
> URL: https://issues.apache.org/jira/browse/HIVE-14164
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0, 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> Sometimes ZK may intermittently experience network partitioning. During this 
> time, clients trying to open a JDBC connection get an exception. To improve 
> user experience, we should implement retry logic and fail only after retrying.
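The proposed behavior can be sketched generically (the actual Hive patch may differ): retry the ZooKeeper read a bounded number of times with backoff, and surface the failure only once the retries are exhausted.

```java
import java.util.concurrent.Callable;

public class RetryingRead {
    // Run the operation up to maxAttempts times, sleeping with linear backoff
    // between attempts; rethrow the last exception only after all retries fail.
    static <T> T withRetries(Callable<T> op, int maxAttempts, long backoffMs)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;                          // e.g. a transient ZK partition
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMs * attempt);
                }
            }
        }
        throw last;
    }
}
```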



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14164) JDBC: Add retry in JDBC driver when reading config values from ZK

2016-07-05 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-14164:

Component/s: JDBC

> JDBC: Add retry in JDBC driver when reading config values from ZK
> -
>
> Key: HIVE-14164
> URL: https://issues.apache.org/jira/browse/HIVE-14164
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 1.2.1, 2.1.0, 2.0.1
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>
> Sometimes ZK may intermittently experience network partitioning. During this 
> time, clients trying to open a JDBC connection get an exception. To improve 
> user experience, we should implement a retry logic and fail after retrying.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-05 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-12646:

Attachment: HIVE-12646.patch

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>Assignee: Sahil Takiar
> Attachments: HIVE-12646.patch
>
>
> Beeline and the CLI require ';' inside quotes to be escaped, while most other 
> shells do not. For example:
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> +--------------------+
> | CONCAT(';', 'foo') |
> +--------------------+
> | ;foo               |
> | ;foo               |
> | ;foo               |
> +--------------------+
> 3 rows in set (0.00 sec)
> {noformat}
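The quoting behavior described above can be sketched as a small splitter that treats ';' as a statement separator only when it appears outside quotes. This is an illustrative sketch, not Beeline's actual parser:

```java
import java.util.ArrayList;
import java.util.List;

public class StatementSplitter {
    // Split a command line on ';' only outside single or double quotes,
    // so a statement like: select ';' from tlb1;  stays in one piece.
    static List<String> split(String line) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        char quote = 0;                         // current open quote, or 0 if none
        for (char ch : line.toCharArray()) {
            if (quote != 0) {
                if (ch == quote) quote = 0;     // closing quote
                cur.append(ch);
            } else if (ch == '\'' || ch == '"') {
                quote = ch;                     // opening quote
                cur.append(ch);
            } else if (ch == ';') {
                out.add(cur.toString());        // statement boundary
                cur.setLength(0);
            } else {
                cur.append(ch);
            }
        }
        if (cur.length() > 0) out.add(cur.toString());
        return out;
    }
}
```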



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-05 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-12646:

Status: Patch Available  (was: Open)

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>Assignee: Sahil Takiar
> Attachments: HIVE-12646.patch
>
>
> Beeline and the CLI require ';' inside quotes to be escaped, while most other 
> shells do not. For example:
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> +--------------------+
> | CONCAT(';', 'foo') |
> +--------------------+
> | ;foo               |
> | ;foo               |
> | ;foo               |
> +--------------------+
> 3 rows in set (0.00 sec)
> {noformat}





[jira] [Commented] (HIVE-14163) LLAP: use different kerberized/unkerberized zk paths for registry

2016-07-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363260#comment-15363260
 ] 

Siddharth Seth commented on HIVE-14163:
---

Shared ZK - admins may want to control the paths.
If the same cluster is changed from secure to unsecure, or the other way around, 
there are alternate ways to fix this rather than selecting different paths. 
(The secure path breaks if the security settings are changed.)

> LLAP: use different kerberized/unkerberized zk paths for registry
> -
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Commented] (HIVE-9756) LLAP: use log4j 2 for llap (log to separate files, etc.)

2016-07-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363252#comment-15363252
 ] 

Siddharth Seth commented on HIVE-9756:
--

[~prasanth_j] - along with this, I think we need to add an option to the 'hive 
--service llap' script to provide the logger to be used; it's hardcoded to RFA 
at the moment.
Also, it'll be useful to use either the dagId or the queryId in the filename - 
maybe as a follow-up.

> LLAP: use log4j 2 for llap (log to separate files, etc.)
> 
>
> Key: HIVE-9756
> URL: https://issues.apache.org/jira/browse/HIVE-9756
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.0.0
>Reporter: Gunther Hagleitner
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-9756.1.patch, HIVE-9756.2.patch, HIVE-9756.3.patch, 
> HIVE-9756.4.patch, HIVE-9756.4.patch, HIVE-9756.5.patch, HIVE-9756.6.patch
>
>
> For the INFO logging, we'll need to use the log4j-jcl 2.x upgrade-path to get 
> throughput friendly logging.
> http://logging.apache.org/log4j/2.0/manual/async.html#Performance





[jira] [Commented] (HIVE-14163) LLAP: use different kerberized/unkerberized zk paths for registry

2016-07-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363245#comment-15363245
 ] 

Sergey Shelukhin commented on HIVE-14163:
-

I dunno. Why?

> LLAP: use different kerberized/unkerberized zk paths for registry
> -
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Commented] (HIVE-6329) Support column level encryption/decryption

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363231#comment-15363231
 ] 

Hive QA commented on HIVE-6329:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12664315/HIVE-6329.11.patch.txt

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/372/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/372/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-372/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.8.0_25 ]]
+ export JAVA_HOME=/usr/java/jdk1.8.0_25
+ JAVA_HOME=/usr/java/jdk1.8.0_25
+ export 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.8.0_25/bin/:/usr/lib64/qt-3.3/bin:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-MASTER-Build-372/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 23fd2ae HIVE-14119: LLAP external recordreader not returning 
non-ascii string properly (Jason Dere, reviewed by Sergey Shelukhin)
+ git clean -f -d
Removing ql/src/test/org/apache/hadoop/hive/ql/exec/TestRegistry.java
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at 23fd2ae HIVE-14119: LLAP external recordreader not returning 
non-ascii string properly (Jason Dere, reviewed by Sergey Shelukhin)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12664315 - PreCommit-HIVE-MASTER-Build

> Support column level encryption/decryption
> --
>
> Key: HIVE-6329
> URL: https://issues.apache.org/jira/browse/HIVE-6329
> Project: Hive
>  Issue Type: New Feature
>  Components: Security, Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.11.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, 
> HIVE-6329.4.patch.txt, HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, 
> HIVE-6329.7.patch.txt, HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> Receiving some requirements on encryption recently but hive is not supporting 
> it. Before the full implementation via HIVE-5207, this might be useful for 
> some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010--','Seoul, Seocho' from src tablesample (1 rows);
> ..
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}
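The stored values above are plain Base64 of the column text. As an illustration 
of what a write-only Base64 column codec like the {{Base64WriteOnly}} class in 
the example does conceptually, here is a minimal Python sketch; the 
{{ENCODED_COLUMNS}} set and function names are illustrative stand-ins, not 
Hive's actual SerDe code:

```python
import base64

# Columns listed in 'column.encode.columns' are encoded on write;
# all other columns pass through unchanged (illustrative assumption).
ENCODED_COLUMNS = {"phone", "address"}

def encode_row(row):
    """Base64-encode the configured columns of a row dict on write."""
    return {
        col: base64.b64encode(val.encode("utf-8")).decode("ascii")
        if col in ENCODED_COLUMNS else val
        for col, val in row.items()
    }

row = {"id": "100", "name": "navis", "address": "Seoul, Seocho"}
# The address column is stored encoded, matching the SELECT output above.
print(encode_row(row)["address"])  # → U2VvdWwsIFNlb2Nobw==
```

Decoding on read is not provided by a write-only codec, which is what makes 
this useful as a lightweight obfuscation rather than real encryption.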





[jira] [Commented] (HIVE-14139) NPE dropping permanent function

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363230#comment-15363230
 ] 

Hive QA commented on HIVE-14139:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816188/HIVE-14139.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10296 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/371/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/371/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-371/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816188 - PreCommit-HIVE-MASTER-Build

> NPE dropping permanent function
> ---
>
> Key: HIVE-14139
> URL: https://issues.apache.org/jira/browse/HIVE-14139
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14139.1.patch, HIVE-14139.2.patch, 
> HIVE-14139.3.patch
>
>
> To reproduce:
> 1. Start a CLI session and create a permanent function.
> 2. Exit current CLI session.
> 3. Start a new CLI session and drop the function.
> Stack trace:
> {noformat}
> FAILED: error during drop function: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.removePersistentFunctionUnderLock(Registry.java:513)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.unregisterFunction(Registry.java:501)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.unregisterPermanentFunction(FunctionRegistry.java:1532)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.dropPermanentFunction(FunctionTask.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:95)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1860)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1564)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1316)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1085)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1073)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> {noformat}





[jira] [Commented] (HIVE-14163) LLAP: use different kerberized/unkerberized zk paths for registry

2016-07-05 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363229#comment-15363229
 ] 

Siddharth Seth commented on HIVE-14163:
---

[~sershe] - should we make the namespace configurable?

> LLAP: use different kerberized/unkerberized zk paths for registry
> -
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Commented] (HIVE-14140) LLAP: package codec jars

2016-07-05 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363196#comment-15363196
 ] 

Gopal V commented on HIVE-14140:


LGTM - +1

I assume this is for LzoCodec, which is not shipped with Tez today.

> LLAP: package codec jars
> 
>
> Key: HIVE-14140
> URL: https://issues.apache.org/jira/browse/HIVE-14140
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14140.patch
>
>






[jira] [Updated] (HIVE-14163) LLAP: use different kerberized/unkerberized zk paths for registry

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14163:

Summary: LLAP: use different kerberized/unkerberized zk paths for registry  
(was: LLAP: use different kerberized/unkerberized paths for registry)

> LLAP: use different kerberized/unkerberized zk paths for registry
> -
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363180#comment-15363180
 ] 

Thejas M Nair commented on HIVE-7224:
-

bq. I think a better approach for the IncrementalRows class would be to instead 
buffer 1000 rows at a time (by default, this value can be configurable), this 
way it can optimally set the column width for each set of 1000 rows.
I think it is better to keep the column width consistent across all rows in the 
output, rather than adjusting it every 1000 rows. The purpose of using an 
optimal display size is primarily to make the output easier to read; if we 
change the column width periodically, it doesn't help with that. Also, if there 
is some application that assumes the column width remains the same across rows, 
that might break.

That is, Beeline could buffer only the first 1000 rows, use those to determine 
the optimal column width, and then output the rest unbuffered. Not having to 
buffer the remaining rows could also be better for performance.
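The scheme Thejas describes - buffer only an initial sample of rows to pick 
fixed column widths, then stream everything else unbuffered with those widths - 
can be sketched as follows. The function and parameter names are illustrative, 
not Beeline's actual IncrementalRows API, and rows are assumed to be simple 
tuples:

```python
def print_incremental(rows, header, sample_size=1000):
    """Buffer the first `sample_size` rows to compute column widths,
    then emit all rows with those fixed widths, streaming the rest."""
    rows = iter(rows)
    buffered = []
    for row in rows:
        buffered.append(row)
        if len(buffered) >= sample_size:
            break
    # Column width = widest value seen in the header and the sample.
    widths = [len(h) for h in header]
    for row in buffered:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(str(cell)))
    fmt = " | ".join("{:<%d}" % w for w in widths)
    out = [fmt.format(*header)]
    for row in buffered:   # flush the buffered sample...
        out.append(fmt.format(*row))
    for row in rows:       # ...then stream the rest with the same fixed widths
        out.append(fmt.format(*row))
    return "\n".join(out)

print(print_incremental([("a", 1), ("longer-value", 22)], ("col1", "col2")))
```

A row wider than anything in the sample would still print (unaligned) rather 
than be truncated, which is the usual trade-off of sizing from a prefix of the 
result set.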



> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.





[jira] [Updated] (HIVE-14163) LLAP: use different kerberized/unkerberized paths for registry

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14163:

Status: Patch Available  (was: Open)

> LLAP: use different kerberized/unkerberized paths for registry
> --
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Updated] (HIVE-14163) LLAP: use different kerberized/unkerberized paths for registry

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14163:

Attachment: HIVE-14163.patch

[~prasanth_j] [~sseth] can you take a look? I assume nobody is using these 
paths directly at the moment, so no additional changes are needed.

> LLAP: use different kerberized/unkerberized paths for registry
> --
>
> Key: HIVE-14163
> URL: https://issues.apache.org/jira/browse/HIVE-14163
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14163.patch
>
>






[jira] [Updated] (HIVE-14111) better concurrency handling for TezSessionState - part I

2016-07-05 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14111:

Attachment: HIVE-14111.04.patch

Fixing one more path.

> better concurrency handling for TezSessionState - part I
> 
>
> Key: HIVE-14111
> URL: https://issues.apache.org/jira/browse/HIVE-14111
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14111.01.patch, HIVE-14111.02.patch, 
> HIVE-14111.03.patch, HIVE-14111.04.patch, HIVE-14111.patch, 
> sessionPoolNotes.txt
>
>






[jira] [Commented] (HIVE-14140) LLAP: package codec jars

2016-07-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363128#comment-15363128
 ] 

Sergey Shelukhin commented on HIVE-14140:
-

[~gopalv] ping?

> LLAP: package codec jars
> 
>
> Key: HIVE-14140
> URL: https://issues.apache.org/jira/browse/HIVE-14140
> Project: Hive
>  Issue Type: Bug
>Reporter: Takahiko Saito
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14140.patch
>
>






[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363091#comment-15363091
 ] 

Vaibhav Gumashta commented on HIVE-7224:


[~stakiar] I agree that buffering would improve usability. Would you like to 
take a shot at it?

> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.





[jira] [Commented] (HIVE-14138) CBO failed for select current_database()

2016-07-05 Thread Jesus Camacho Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363060#comment-15363060
 ] 

Jesus Camacho Rodriguez commented on HIVE-14138:


[~pvary], could you regenerate the q file for {{explainuser_1}}?

{noformat}
< Plan not optimized by CBO.
---
> Plan not optimized by CBO due to missing feature [Others].
{noformat}

Rest looks good, +1.

> CBO failed for select current_database()
> 
>
> Key: HIVE-14138
> URL: https://issues.apache.org/jira/browse/HIVE-14138
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14138.patch
>
>
> When issuing the following query, with hive.cbo.enable set to true:
> select current_database();
> The following exception is printed to the Hiveserver2 logs:
> 2016-06-30T09:58:24,146 ERROR [HiveServer2-Handler-Pool: Thread-33] 
> parse.CalcitePlanner: CBO failed, skipping CBO. 
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Unsupported
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3136)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:940)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:894)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:969)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:712)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:280)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10795)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:239)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:329)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1159)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1146)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276)
>   at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:324)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:464)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:451)
>   at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:295)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1422)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)





[jira] [Commented] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363058#comment-15363058
 ] 

Sahil Takiar commented on HIVE-12646:
-

There seem to be a lot of JIRAs related to the handling of semicolons in query 
strings. Here is a brief summary:

HIVE-11100 - added support for Beeline handling of escaped semicolons - e.g. 
{{\;}}
HIVE-9877 - added support for Beeline handling of multiple queries on the same 
line, each terminated by a semicolon
HIVE-12259 - added support for non-escaped semicolons in Beeline commands (e.g. 
!cmd)
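The fix these JIRAs circle around is quote-aware statement splitting: a 
semicolon should only terminate a statement when it appears outside single or 
double quotes. A minimal sketch of that idea (not Beeline's actual parser, and 
deliberately ignoring escaped quotes and comments):

```python
def split_statements(line):
    """Split a line on semicolons that are not inside quotes."""
    statements, current, quote = [], [], None
    for ch in line:
        if quote:
            current.append(ch)
            if ch == quote:
                quote = None          # closing quote ends the quoted region
        elif ch in ("'", '"'):
            current.append(ch)
            quote = ch                # opening quote starts a quoted region
        elif ch == ";":
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    if "".join(current).strip():
        statements.append("".join(current).strip())
    return statements

# The semicolon inside the string literal is preserved, not treated as a
# statement terminator - the behavior the bug report asks for.
print(split_statements("select ';' from tlb1; select 1"))
# → ["select ';' from tlb1", "select 1"]
```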

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>Assignee: Sahil Takiar
>
> Beeline and the Hive CLI have to escape ; inside quotes, while most other 
> shells do not. For example:
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> +--------------------+
> | CONCAT(';', 'foo') |
> +--------------------+
> | ;foo               |
> | ;foo               |
> | ;foo               |
> +--------------------+
> 3 rows in set (0.00 sec)
> {noformat}





[jira] [Assigned] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-05 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar reassigned HIVE-12646:
---

Assignee: Sahil Takiar

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>Assignee: Sahil Takiar
>
> Beeline and the Hive CLI have to escape ; inside quotes, while most other 
> shells do not. For example:
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> +--------------------+
> | CONCAT(';', 'foo') |
> +--------------------+
> | ;foo               |
> | ;foo               |
> | ;foo               |
> +--------------------+
> 3 rows in set (0.00 sec)
> {noformat}





[jira] [Commented] (HIVE-12646) beeline and HIVE CLI do not parse ; in quote properly

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363038#comment-15363038
 ] 

Sahil Takiar commented on HIVE-12646:
-

[~ychena] I wanted to pick up working on this if that is ok with you. If you 
have any other insight into this issue that would also be very helpful.

Thanks

> beeline and HIVE CLI do not parse ; in quote properly
> -
>
> Key: HIVE-12646
> URL: https://issues.apache.org/jira/browse/HIVE-12646
> Project: Hive
>  Issue Type: Bug
>  Components: CLI, Clients
>Reporter: Yongzhi Chen
>
> Beeline and the Hive CLI have to escape ; inside quotes, while most other 
> shells do not. For example:
> in Beeline:
> {noformat}
> 0: jdbc:hive2://localhost:1> select ';' from tlb1;
> select ';' from tlb1;
> 15/12/10 10:45:26 DEBUG TSaslTransport: writing data length: 115
> 15/12/10 10:45:26 DEBUG TSaslTransport: CLIENT: reading data length: 3403
> Error: Error while compiling statement: FAILED: ParseException line 1:8 
> cannot recognize input near '' '
> {noformat}
> while in mysql shell:
> {noformat}
> mysql> SELECT CONCAT(';', 'foo') FROM test limit 3;
> +--------------------+
> | CONCAT(';', 'foo') |
> +--------------------+
> | ;foo               |
> | ;foo               |
> | ;foo               |
> +--------------------+
> 3 rows in set (0.00 sec)
> {noformat}





[jira] [Commented] (HIVE-11402) HS2 - disallow parallel query execution within a single Session

2016-07-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363028#comment-15363028
 ] 

Sergey Shelukhin commented on HIVE-11402:
-

Yeah, there's a single-entry semaphore that controls the execution of most 
operations (not e.g. FetchResults, or cancel operation :))
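The mechanism described - a single permit per session, so operations execute 
serially and later submissions block until the running one finishes - can be 
sketched like this. The class and method names are illustrative; HS2's actual 
implementation lives in Java in HiveSessionImpl:

```python
import threading

class SerializedSession:
    """Runs at most one operation at a time; callers block on the
    single-entry semaphore until the previous operation releases it."""
    def __init__(self):
        self._permit = threading.Semaphore(1)  # one permit = no parallelism
        self.log = []

    def execute(self, name, work):
        with self._permit:        # blocks while another operation holds it
            self.log.append(name)
            return work()

session = SerializedSession()
threads = [
    threading.Thread(target=session.execute, args=(f"op{i}", lambda: None))
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(session.log))  # → ['op0', 'op1', 'op2']
```

Note that long-running or non-blocking calls (like FetchResults or cancel) 
would be kept outside the permit so they remain responsive.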

> HS2 - disallow parallel query execution within a single Session
> ---
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has  an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, ie it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for single session is not 
> straightforward  with jdbc, you need to spawn another thread as the 
> Statement.execute calls are blocking. I believe ODBC has non blocking query 
> execution API, and Hue is another well known application that shares sessions 
> for all queries that a user runs.





[jira] [Commented] (HIVE-11402) HS2 - disallow parallel query execution within a single Session

2016-07-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363011#comment-15363011
 ] 

Aihua Xu commented on HIVE-11402:
-

hmm. Yeah. Saw your comments in HiveSessionImpl and we do have issues  that 
some variables in SessionState are not protected for multiple threads.

Seems to make sense to have such option and continue fixing the thread-safe 
issue. Just wondering if the option is set to false, what kind of effect to the 
new operation in the same session? will it just get blocked to get the previous 
one finished?  



> HS2 - disallow parallel query execution within a single Session
> ---
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has  an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, ie it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for single session is not 
> straightforward  with jdbc, you need to spawn another thread as the 
> Statement.execute calls are blocking. I believe ODBC has non blocking query 
> execution API, and Hue is another well known application that shares sessions 
> for all queries that a user runs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13744) LLAP IO - add complex types support

2016-07-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363006#comment-15363006
 ] 

Sergey Shelukhin commented on HIVE-13744:
-

Hmm.. unless there's documentation to the contrary, I don't think so.

> LLAP IO - add complex types support
> ---
>
> Key: HIVE-13744
> URL: https://issues.apache.org/jira/browse/HIVE-13744
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.2.0
>Reporter: Sergey Shelukhin
>Assignee: Prasanth Jayachandran
>  Labels: llap, orc
> Fix For: 2.2.0, 2.1.1
>
> Attachments: HIVE-13744.1.patch, HIVE-13744.2.patch
>
>
> Recently, complex type column vectors were added to Hive. We should use them 
> in IO elevator.
> Vectorization itself doesn't support complex types (yet), but this would be 
> useful when it does, also it will enable LLAP IO elevator to be used in 
> non-vectorized context with complex types after HIVE-13617



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363001#comment-15363001
 ] 

Sahil Takiar commented on HIVE-7224:


[~vgumashta] it seems the behavior you are seeing is by design. Looking at 
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-BeelineCommandOptions
 the following explanation of the {{--incremental}} property suggests that this 
is expected:

{quote}
Defaults to false. When set to false, the entire result set is fetched and 
buffered before being displayed, yielding optimal display column sizing. When 
set to true, result rows are displayed immediately as they are fetched, 
yielding lower latency and memory usage at the price of extra display column 
padding. Setting --incremental=true is recommended if you encounter an 
OutOfMemory on the client side (due to the fetched result set size being large).
{quote}

So it seems there is a tradeoff when using {{--incremental}} that the column 
padding won't be optimal, but memory usage will be better. This makes sense 
since the {{IncrementalRows}} class that controls this logic doesn't do any 
buffering of rows, so it cannot predict what the optimal column width should be 
since it only looks at one row at a time.

I think a better approach for the {{IncrementalRows}} class would be to buffer 
1000 rows at a time (by default; the value can be configurable). That way it can 
optimally set the column width for each batch of 1000 rows. This shouldn't 
introduce memory issues unless each row is huge, in which case the user can 
decrease the buffer size to, say, 100 or 10.

What do you think?
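The batching idea above can be sketched roughly as follows. This is a minimal illustration, not Beeline's actual {{IncrementalRows}} implementation: the class and method names (`BufferedIncrementalRows`, `format`, `flush`) and the plain `String[]` row type are assumptions for the sake of a self-contained example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of the proposal: accumulate up to BUFFER_SIZE rows, compute optimal
// column widths for that chunk only, print it, and repeat. Memory stays
// bounded by BUFFER_SIZE rows instead of the whole result set.
public class BufferedIncrementalRows {
    static final int BUFFER_SIZE = 1000; // could be made configurable

    public static String format(Iterator<String[]> rows) {
        StringBuilder out = new StringBuilder();
        List<String[]> buffer = new ArrayList<>();
        while (rows.hasNext()) {
            buffer.add(rows.next());
            if (buffer.size() == BUFFER_SIZE) {
                flush(buffer, out);
            }
        }
        flush(buffer, out); // print any remaining partial chunk
        return out.toString();
    }

    private static void flush(List<String[]> buffer, StringBuilder out) {
        if (buffer.isEmpty()) return;
        int cols = buffer.get(0).length;
        int[] width = new int[cols];
        java.util.Arrays.fill(width, 1); // format width must be >= 1
        for (String[] row : buffer)
            for (int c = 0; c < cols; c++)
                width[c] = Math.max(width[c], row[c].length());
        // Padding is optimal within a chunk but may differ between chunks --
        // the tradeoff for bounded memory.
        for (String[] row : buffer) {
            for (int c = 0; c < cols; c++)
                out.append("| ").append(String.format("%-" + width[c] + "s ", row[c]));
            out.append("|\n");
        }
        buffer.clear();
    }
}
```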

> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Sahil Takiar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362947#comment-15362947
 ] 

Sahil Takiar commented on HIVE-7224:


Thanks [~vgumashta] I'll dig into it some more and see what I can find.

> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11402) HS2 - disallow parallel query execution within a single Session

2016-07-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362941#comment-15362941
 ] 

Sergey Shelukhin commented on HIVE-11402:
-

This patch has been changed to allow it by default. This is just a safety flag 
if someone sees issues.
Last time I checked, the runtime parallel operations in the same session work 
purely by magic (I think I left a comment somewhere).
I.e. I believe they may work but there's no good reason why they do, because 
non-thread-safe objects appear to be used without synchronization; there may be 
some bugs.

cc [~thejas]

> HS2 - disallow parallel query execution within a single Session
> ---
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has  an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, ie it is not thread safe.
> There are many places where SessionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for single session is not 
> straightforward  with jdbc, you need to spawn another thread as the 
> Statement.execute calls are blocking. I believe ODBC has non blocking query 
> execution API, and Hue is another well known application that shares sessions 
> for all queries that a user runs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14094) Remove unused function closeFs from Warehouse.java

2016-07-05 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362937#comment-15362937
 ] 

Chao Sun commented on HIVE-14094:
-

+1

> Remove unused function closeFs from Warehouse.java
> --
>
> Key: HIVE-14094
> URL: https://issues.apache.org/jira/browse/HIVE-14094
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Reporter: zhihai xu
>Assignee: zhihai xu
>Priority: Trivial
> Attachments: HIVE-14094.000.patch
>
>
> Remove unused function closeFs from Warehouse.java
> after HIVE-10922, no one will call Warehouse.closeFs. It will be good to 
> delete this function to prevent people from using it. Normally closing 
> FileSystem is not safe because most of the time FileSystem will be shared.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14139) NPE dropping permanent function

2016-07-05 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362930#comment-15362930
 ] 

Sergey Shelukhin commented on HIVE-14139:
-

It is logically consistent: we know about the function but don't load it until 
requested. We can just add a safety check on removal and not assume it's loaded; 
does that make sense?
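The safety check being discussed could look something like the sketch below. The class and field names here (`SafeFunctionRegistry`, `loadedFunctions`) are hypothetical stand-ins, not Hive's actual {{Registry}} internals:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: when unregistering a permanent function, don't assume it was ever
// loaded into this session's in-memory registry.
public class SafeFunctionRegistry {
    private final Map<String, Object> loadedFunctions = new HashMap<>();

    public void register(String name, Object functionInfo) {
        loadedFunctions.put(name.toLowerCase(), functionInfo);
    }

    // Returns true if the function was loaded (and is now removed). A
    // function created in an earlier session is known to the metastore but
    // may never have been loaded here, so a null lookup is normal, not an
    // error -- the unguarded path dereferenced null and threw the NPE.
    public boolean removePersistentFunction(String name) {
        Object fn = loadedFunctions.remove(name.toLowerCase());
        return fn != null;
    }
}
```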

> NPE dropping permanent function
> ---
>
> Key: HIVE-14139
> URL: https://issues.apache.org/jira/browse/HIVE-14139
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14139.1.patch, HIVE-14139.2.patch, 
> HIVE-14139.3.patch
>
>
> To reproduce:
> 1. Start a CLI session and create a permanent function.
> 2. Exit current CLI session.
> 3. Start a new CLI session and drop the function.
> Stack trace:
> {noformat}
> FAILED: error during drop function: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.removePersistentFunctionUnderLock(Registry.java:513)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.unregisterFunction(Registry.java:501)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.unregisterPermanentFunction(FunctionRegistry.java:1532)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.dropPermanentFunction(FunctionTask.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:95)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1860)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1564)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1316)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1085)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1073)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10100) Warning "yarn jar" instead of "hadoop jar" in hadoop 2.7.0

2016-07-05 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362914#comment-15362914
 ] 

Prasanth Jayachandran commented on HIVE-10100:
--

The test failures look related. I will look at them shortly. 

> Warning "yarn jar" instead of "hadoop jar" in hadoop 2.7.0
> --
>
> Key: HIVE-10100
> URL: https://issues.apache.org/jira/browse/HIVE-10100
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.3.0, 2.2.0
>Reporter: Gunther Hagleitner
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-10100.1.patch, HIVE-10100.2.patch, 
> HIVE-10100.3.patch, yarn_bin.patch
>
>
> HADOOP-11257 adds a warning to stdout
> {noformat}
> WARNING: Use "yarn jar" to launch YARN applications.
> {noformat}
> which will cause issues if untreated with folks that programmatically parse 
> stdout for query results (i.e.: CLI, silent mode, etc).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14119) LLAP external recordreader not returning non-ascii string properly

2016-07-05 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-14119:
--
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master

> LLAP external recordreader not returning non-ascii string properly
> --
>
> Key: HIVE-14119
> URL: https://issues.apache.org/jira/browse/HIVE-14119
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 2.2.0
>
> Attachments: HIVE-14119.1.patch, HIVE-14119.2.patch
>
>
> Strings with non-ascii chars showing up with "\�\�\� "



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Vaibhav Gumashta (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Gumashta updated HIVE-7224:
---
Component/s: Beeline

> Set incremental printing to true by default in Beeline
> --
>
> Key: HIVE-7224
> URL: https://issues.apache.org/jira/browse/HIVE-7224
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline, Clients, JDBC
>Affects Versions: 0.13.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Vaibhav Gumashta
>Assignee: Sahil Takiar
> Attachments: HIVE-7224.1.patch, HIVE-7224.2.patch, HIVE-7224.2.patch, 
> HIVE-7224.3.patch
>
>
> See HIVE-7221.
> By default beeline tries to buffer the entire output relation before printing 
> it on stdout. This can cause OOM when the output relation is large. However, 
> beeline has the option of incremental prints. We should keep that as the 
> default.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362826#comment-15362826
 ] 

Vaibhav Gumashta commented on HIVE-7224:


[~stakiar] There seems to be an issue with the column width estimation when 
incremental printing is enabled by default. An example:
With incremental true:
{code}
| search_engine   | string  | |
| exclude_hit | string  | |
| hier1   | string  | |
| hier2   | string  | |
| hier3   | string  | |
| hier4   | string  | |
| hier5   | string  | |
| browser | string  | |
| post_browser_height | string  | |
| post_browser_width | string  | |
| post_cookies| string  | |
| post_java_enabled | string  | |
| post_persistent_cookie | string  | |
| color   | string  | |
| connection_type | string  | |
| country | string  | |
| domain  | string  | |
| post_t_time_info | string  | |
| javascript  | string  | |
| language| string  | |
| os  | string  | |
| plugins | string  | |
| resolution  | string  | |
| last_hit_time_gmt | string  | |
| first_hit_time_gmt | string  | |
| visit_start_time_gmt | string  | |
| last_purchase_time_gmt | string  | |
+-+-+-+--+
|col_name |data_type| comment |
+-+-+-+--+
| last_purchase_num | string  | |
| first_hit_page_url | string  | |
| first_hit_pagename | string  | |
| visit_start_page_url | string  | |
| visit_start_pagename | string  | |
| first_hit_referrer | string  | |
| visit_referrer  | string  | |
| visit_search_engine | string  | |
| visit_num   | string  | |
| visit_page_num  | string  | |
| prev_page   | string  | |
| geo_city| string  | |
| geo_country | string  | |
| geo_region  | string  | |
| duplicate_purchase | string  | |
{code}

With incremental false:
{code}
| search_engine| string|   |
| exclude_hit  | string|   |
| hier1| string|   |
| hier2| string|   |
| hier3| string|   |
| hier4| string|   |
| hier5| string|   |
| browser  | string|   |
| post_browser_height  | string|   |
| post_browser_width   | string|   |
| post_cookies | string|   |
| post_java_enabled| string|   |
| post_persistent_cookie   | string|   |
| color| string|   |
| connection_type  | string|   |
| country  | string|   |
| domain   | string|   |
| post_t_time_info | string|   |
| javascript   | string|   |
| language | string|   |
| os   | string|   |
| plugins  | string|   |
| resolution   | string|   |
| last_hit_time_gmt| string|   |
| first_hit_time_gmt   | string|   |
| visit_start_time_gmt | string 

[jira] [Comment Edited] (HIVE-7224) Set incremental printing to true by default in Beeline

2016-07-05 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362826#comment-15362826
 ] 

Vaibhav Gumashta edited comment on HIVE-7224 at 7/5/16 5:30 PM:


[~stakiar] There seems to be an issue with the column width estimation when 
incremental printing is enabled by default. An example (output of describe 
table):
With incremental true:
{code}
| search_engine   | string  | |
| exclude_hit | string  | |
| hier1   | string  | |
| hier2   | string  | |
| hier3   | string  | |
| hier4   | string  | |
| hier5   | string  | |
| browser | string  | |
| post_browser_height | string  | |
| post_browser_width | string  | |
| post_cookies| string  | |
| post_java_enabled | string  | |
| post_persistent_cookie | string  | |
| color   | string  | |
| connection_type | string  | |
| country | string  | |
| domain  | string  | |
| post_t_time_info | string  | |
| javascript  | string  | |
| language| string  | |
| os  | string  | |
| plugins | string  | |
| resolution  | string  | |
| last_hit_time_gmt | string  | |
| first_hit_time_gmt | string  | |
| visit_start_time_gmt | string  | |
| last_purchase_time_gmt | string  | |
+-+-+-+--+
|col_name |data_type| comment |
+-+-+-+--+
| last_purchase_num | string  | |
| first_hit_page_url | string  | |
| first_hit_pagename | string  | |
| visit_start_page_url | string  | |
| visit_start_pagename | string  | |
| first_hit_referrer | string  | |
| visit_referrer  | string  | |
| visit_search_engine | string  | |
| visit_num   | string  | |
| visit_page_num  | string  | |
| prev_page   | string  | |
| geo_city| string  | |
| geo_country | string  | |
| geo_region  | string  | |
| duplicate_purchase | string  | |
{code}

With incremental false:
{code}
| search_engine| string|   |
| exclude_hit  | string|   |
| hier1| string|   |
| hier2| string|   |
| hier3| string|   |
| hier4| string|   |
| hier5| string|   |
| browser  | string|   |
| post_browser_height  | string|   |
| post_browser_width   | string|   |
| post_cookies | string|   |
| post_java_enabled| string|   |
| post_persistent_cookie   | string|   |
| color| string|   |
| connection_type  | string|   |
| country  | string|   |
| domain   | string|   |
| post_t_time_info | string|   |
| javascript   | string|   |
| language | string|   |
| os   | string|   |
| plugins  | string|   |
| resolution   | string|   |
| last_hit_time_gmt| string|   |
| first_hit_time_gmt   | string   

[jira] [Commented] (HIVE-14138) CBO failed for select current_database()

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362740#comment-15362740
 ] 

Hive QA commented on HIVE-14138:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816147/HIVE-14138.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10294 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/369/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/369/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-369/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816147 - PreCommit-HIVE-MASTER-Build

> CBO failed for select current_database()
> 
>
> Key: HIVE-14138
> URL: https://issues.apache.org/jira/browse/HIVE-14138
> Project: Hive
>  Issue Type: Bug
>  Components: CBO
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14138.patch
>
>
> When issuing the following query, with hive.cbo.enable set to true:
> select current_database();
> The following exception is printed to the Hiveserver2 logs:
> 2016-06-30T09:58:24,146 ERROR [HiveServer2-Handler-Pool: Thread-33] 
> parse.CalcitePlanner: CBO failed, skipping CBO. 
> org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSemanticException: 
> Unsupported
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.genLogicalPlan(CalcitePlanner.java:3136)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:940)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:894)
>   at org.apache.calcite.tools.Frameworks$1.apply(Frameworks.java:113)
>   at 
> org.apache.calcite.prepare.CalcitePrepareImpl.perform(CalcitePrepareImpl.java:969)
>   at org.apache.calcite.tools.Frameworks.withPrepare(Frameworks.java:149)
>   at org.apache.calcite.tools.Frameworks.withPlanner(Frameworks.java:106)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:712)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:280)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10795)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:239)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:329)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1159)
>   at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1146)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:191)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:276)
>   at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:324)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:464)
>   at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:451)
>   at 
> org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:295)
>   at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:509)
>   at 
> org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1437)
>   at 
> 

[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362693#comment-15362693
 ] 

Peter Vary commented on HIVE-14146:
---

Good catch [~aihuaxu]. I realised that myself, and am in the process of 
updating the patch :)
Thanks anyway!!!

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.patch
>
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +---++---+--+
> | col_name  | data_type  |comment|
> +---++---+--+
> | first_nm | string   | Indicates First name  |
> | of an individual  | NULL   | NULL  |
> +---++---+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2016-07-05 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362677#comment-15362677
 ] 

Xuefu Zhang commented on HIVE-14156:


Since the partition names are stored in the underlying metadata store DB, that 
DB has to support Unicode in order to support Unicode partition names. This in 
turn might require some configuration changes on the DB (to enable Unicode 
values).

Here is the doc for mysql: 
http://dev.mysql.com/doc/refman/5.7/en/charset-unicode.html
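One client-side piece of that configuration is requesting UTF-8 on the metastore's JDBC connection; `useUnicode` and `characterEncoding` are standard MySQL Connector/J URL parameters. The sketch below only builds the URL -- the host, port, and database name are placeholders, and the server plus metastore tables must also use a Unicode charset (see the MySQL doc above), so this alone may not suffice:

```java
// Minimal helper that assembles a metastore JDBC URL asking Connector/J
// for UTF-8. Placeholder host/db values; not Hive's actual config code.
public class MetastoreJdbcUrl {
    public static String build(String host, int port, String db) {
        return "jdbc:mysql://" + host + ":" + port + "/" + db
                + "?useUnicode=true&characterEncoding=UTF-8";
    }
}
```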

> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When check the partition value in MySQL, it shows ?? instead of "北京".
> When run "drop table t1", it will hang.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362648#comment-15362648
 ] 

Aihua Xu commented on HIVE-14146:
-

[~pvary] It seems the "show create table" output doesn't escape the "\n" in the 
test. Is that expected?

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.patch
>
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +---++---+--+
> | col_name  | data_type  |comment|
> +---++---+--+
> | first_nm | string   | Indicates First name  |
> | of an individual  | NULL   | NULL  |
> +---++---+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14146:
--
Attachment: HIVE-14146.patch

The proposed patch

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.patch
>
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +---++---+--+
> | col_name  | data_type  |comment|
> +---++---+--+
> | first_nm | string   | Indicates First name  |
> | of an individual  | NULL   | NULL  |
> +---++---+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14146:
--
Status: Patch Available  (was: Open)

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
> Attachments: HIVE-14146.patch
>
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +---++---+--+
> | col_name  | data_type  |comment|
> +---++---+--+
> | first_nm | string   | Indicates First name  |
> | of an individual  | NULL   | NULL  |
> +---++---+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread Peter Vary (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362599#comment-15362599
 ] 

Peter Vary commented on HIVE-14146:
---

Thanks [~niklaus.xiao].
For human use, your solution is perfect, but we had to parse the response 
programmatically.
My example above is clearly a bug, which should be fixed. My proposed solution 
is to print an escaped \n in this scenario.
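The proposed escaping could be sketched as below: replace raw newlines (and other control characters) in column comments before rendering the DESCRIBE table, so one logical row never spills across two display rows. `escapeComment` is an illustrative helper name, not Hive's actual method:

```java
// Escape control characters in a column comment for single-row display.
public class CommentEscaper {
    public static String escapeComment(String comment) {
        if (comment == null) return null;
        return comment.replace("\\", "\\\\") // escape backslashes first
                      .replace("\n", "\\n")
                      .replace("\r", "\\r")
                      .replace("\t", "\\t");
    }
}
```

With this, the comment from the example would render on one row as `Indicates First name\nof an individual`, which a program can parse and a human can still read.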

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +---++---+--+
> | col_name  | data_type  |comment|
> +---++---+--+
> | first_nm | string   | Indicates First name  |
> | of an individual  | NULL   | NULL  |
> +---++---+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14152) datanucleus.autoStartMechanismMode should set to 'Ignored' to allow rolling downgrade

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362577#comment-15362577
 ] 

Hive QA commented on HIVE-14152:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816138/HIVE-14152.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 29 failed/errored test(s), 9958 tests 
executed
*Failed tests:*
{noformat}
TestCliDriver-bool_literal.q-authorization_cli_createtab.q-explain_ddl.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-cbo_rp_join1.q-union_top_level.q-insert_update_delete.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-create_func1.q-bucketmapjoin3.q-enforce_order.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-describe_xpath.q-autogen_colalias.q-udf_named_struct.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-encryption_join_with_different_encryption_keys.q-bucketcontext_3.q-udf_smallint.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-groupby4.q-convert_enum_to_string.q-mapjoin_filter_on_outerjoin.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-index_compact.q-merge_dynamic_partition2.q-cbo_rp_subq_exists.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-nullscript.q-vector_char_mapjoin1.q-load_dyn_part3.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-parquet_ppd_decimal.q-cluster.q-groupby_sort_6.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-ptf_general_queries.q-unionDistinct_1.q-udf_version.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-sample_islocalmode_hook_use_metadata.q-cbo_rp_semijoin.q-udf_when.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-schema_evol_text_vec_mapwork_part_all_complex.q-metadataonly1.q-deleteAnalyze.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-skewjoinopt3.q-rcfile_merge1.q-multigroupby_singlemr.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-stats13.q-join_parse.q-sort_merge_join_desc_2.q-and-12-more - did 
not produce a TEST-*.xml file
TestCliDriver-stats_publisher_error_1.q-auto_join1.q-cast_to_int.q-and-12-more 
- did not produce a TEST-*.xml file
TestCliDriver-tez_joins_explain.q-rename_column.q-varchar_serde.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-udf_double.q-join11.q-join18.q-and-12-more - did not produce a 
TEST-*.xml file
TestCliDriver-udf_locate.q-join32_lessSize.q-correlationoptimizer8.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-udf_to_float.q-decimal_precision2.q-ppd_gby_join.q-and-12-more - 
did not produce a TEST-*.xml file
TestCliDriver-unicode_notation.q-gen_udf_example_add10.q-ppd_join4.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-vector_complex_join.q-interval_udf.q-udf_classloader_dynamic_dependency_resolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-vector_distinct_2.q-cte_mat_1.q-update_after_multiple_inserts_special_characters.q-and-12-more
 - did not produce a TEST-*.xml file
TestCliDriver-vector_partition_diff_num_cols.q-stats2.q-union11.q-and-12-more - 
did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.metastore.TestMetastoreVersion.testMetastoreVersion
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/368/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/368/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-368/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 29 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816138 - PreCommit-HIVE-MASTER-Build

> datanucleus.autoStartMechanismMode should set to 'Ignored' to allow rolling 
> downgrade 
> --
>
> Key: HIVE-14152
> URL: https://issues.apache.org/jira/browse/HIVE-14152
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Daniel Dai
>Assignee: Daniel Dai
> Attachments: HIVE-14152.1.patch

[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore

2016-07-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362576#comment-15362576
 ] 

Aihua Xu commented on HIVE-13749:
-

+1. The change makes sense to me.

> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-13749.1.patch, HIVE-13749.patch, Top_Consumers7.html
>
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained set, 
> occupy about 95% of the heap space. This leads to HMS crashes every few 
> days.
> I will attach an exported snapshot from the eclipse MAT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14162) Allow disabling of long running job on Hive On Spark On YARN

2016-07-05 Thread Thomas Scott (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362559#comment-15362559
 ] 

Thomas Scott commented on HIVE-14162:
-

This is equivalent to running:

set hive.execution.engine=spark; 

set hive.execution.engine=mr; 

> Allow disabling of long running job on Hive On Spark On YARN
> 
>
> Key: HIVE-14162
> URL: https://issues.apache.org/jira/browse/HIVE-14162
> Project: Hive
>  Issue Type: New Feature
>  Components: Spark
>Reporter: Thomas Scott
>Priority: Minor
>
> Hive On Spark launches a long running process on the first query to handle 
> all queries for that user session. In some use cases this is not desired, for 
> instance when using Hue with large intervals between query executions.
> Could we have a property that would cause long running spark jobs to be 
> terminated after each query execution and started again for the next one?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13749) Memory leak in Hive Metastore

2016-07-05 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362557#comment-15362557
 ] 

Naveen Gangam commented on HIVE-13749:
--

The test failures above do not appear to be related to the patch. So +1 from me.

> Memory leak in Hive Metastore
> -
>
> Key: HIVE-13749
> URL: https://issues.apache.org/jira/browse/HIVE-13749
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.1.0
>Reporter: Naveen Gangam
>Assignee: Naveen Gangam
> Attachments: HIVE-13749.1.patch, HIVE-13749.patch, Top_Consumers7.html
>
>
> Looking at a 10GB heap dump, a large number of Configuration objects (> 66k 
> instances) are being retained. These objects, along with their retained set, 
> occupy about 95% of the heap space. This leads to HMS crashes every few 
> days.
> I will attach an exported snapshot from the eclipse MAT.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13208) Properties setting from the client in Hive.conf may not pass to HMS properly

2016-07-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu resolved HIVE-13208.
-
Resolution: Cannot Reproduce

The issue should have been fixed by HIVE-13424.

> Properties setting from the client in Hive.conf may not pass to HMS properly
> 
>
> Key: HIVE-13208
> URL: https://issues.apache.org/jira/browse/HIVE-13208
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Aihua Xu
>
> There seem to be some issues with HiveConf within the Hive class. We compare 
> the current conf against the conf saved in the thread-local Hive; if it has 
> changed, we need to recreate the HMS client to pass the new conf to the HMS.
> However, in some places we pass a reference of the conf to the Hive object 
> and then update it, so the comparison does not trigger the creation of a new 
> HMS client since it is the same conf object.
> We also call db.getConf().set() in QTestUtils.java, but it may not work as 
> expected since an HMS client may already exist.
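The reference-comparison pitfall described above can be sketched in isolation. This is an illustrative stand-in only: java.util.Properties plays the role of HiveConf, and the class and method names (ConfCompareDemo, needsNewClient) are made up for the sketch, not taken from the Hive code.

```java
import java.util.Properties;

public class ConfCompareDemo {
    private static Properties savedConf;

    // Mimics the check: recreate the "client" only if the conf changed.
    // Uses an identity comparison, like comparing saved references.
    static boolean needsNewClient(Properties conf) {
        boolean changed = (savedConf != conf);
        savedConf = conf;
        return changed;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        needsNewClient(conf);  // first call saves the conf reference

        // Caller mutates the SAME object the callee saved:
        conf.setProperty("hive.some.key", "new-value");

        // The identity comparison cannot observe the mutation:
        System.out.println(needsNewClient(conf));  // prints false
    }
}
```

Because the caller mutated the very object the callee saved, a reference comparison can never observe the change; comparing by content, or cloning the conf on save, would.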



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13208) Properties setting from the client in Hive.conf may not pass to HMS properly

2016-07-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13208:

Assignee: (was: Aihua Xu)

> Properties setting from the client in Hive.conf may not pass to HMS properly
> 
>
> Key: HIVE-13208
> URL: https://issues.apache.org/jira/browse/HIVE-13208
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore
>Reporter: Aihua Xu
>
> There seem to be some issues with HiveConf within the Hive class. We compare 
> the current conf against the conf saved in the thread-local Hive; if it has 
> changed, we need to recreate the HMS client to pass the new conf to the HMS.
> However, in some places we pass a reference of the conf to the Hive object 
> and then update it, so the comparison does not trigger the creation of a new 
> HMS client since it is the same conf object.
> We also call db.getConf().set() in QTestUtils.java, but it may not work as 
> expected since an HMS client may already exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6329) Support column level encryption/decryption

2016-07-05 Thread Konstantin Ryakhovskiy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362530#comment-15362530
 ] 

Konstantin Ryakhovskiy commented on HIVE-6329:
--

No updates for more than one year. Is there any chance that this feature will 
make it into a production version?


> Support column level encryption/decryption
> --
>
> Key: HIVE-6329
> URL: https://issues.apache.org/jira/browse/HIVE-6329
> Project: Hive
>  Issue Type: New Feature
>  Components: Security, Serializers/Deserializers
>Reporter: Navis
>Assignee: Navis
>Priority: Minor
> Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.11.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, 
> HIVE-6329.4.patch.txt, HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, 
> HIVE-6329.7.patch.txt, HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> Receiving some requirements on encryption recently but hive is not supporting 
> it. Before the full implementation via HIVE-5207, this might be useful for 
> some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
> > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
> > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010--','Seoul, Seocho' from src tablesample (1 rows);
> ..
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11402) HS2 - disallow parallel query execution within a single Session

2016-07-05 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362523#comment-15362523
 ] 

Aihua Xu commented on HIVE-11402:
-

[~sershe] Do you still observe parallel operation issues in a single session? 
I have separated QueryState out of SessionState in HIVE-13424. From previous 
comments, it also seems common to share operations in the same session, as in 
HUE. Should we continue to fix parallel-execution issues as we find them, 
rather than disallowing parallel execution?

> HS2 - disallow parallel query execution within a single Session
> ---
>
> Key: HIVE-11402
> URL: https://issues.apache.org/jira/browse/HIVE-11402
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Thejas M Nair
>Assignee: Sergey Shelukhin
> Attachments: HIVE-11402.01.patch, HIVE-11402.02.patch, 
> HIVE-11402.patch
>
>
> HiveServer2 currently allows concurrent queries to be run in a single 
> session. However, every HS2 session has  an associated SessionState object, 
> and the use of SessionState in many places assumes that only one thread is 
> using it, ie it is not thread safe.
> There are many places where SesssionState thread safety needs to be 
> addressed, and until then we should serialize all query execution for a 
> single HS2 session. -This problem can become more visible with HIVE-4239 now 
> allowing parallel query compilation.-
> Note that running queries in parallel for single session is not 
> straightforward  with jdbc, you need to spawn another thread as the 
> Statement.execute calls are blocking. I believe ODBC has non blocking query 
> execution API, and Hue is another well known application that shares sessions 
> for all queries that a user runs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14139) NPE dropping permanent function

2016-07-05 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-14139:
--
Attachment: HIVE-14139.3.patch

Just realised I missed some updates to the new test. Uploading patch v3 for that.

> NPE dropping permanent function
> ---
>
> Key: HIVE-14139
> URL: https://issues.apache.org/jira/browse/HIVE-14139
> Project: Hive
>  Issue Type: Bug
>Reporter: Rui Li
>Assignee: Rui Li
> Attachments: HIVE-14139.1.patch, HIVE-14139.2.patch, 
> HIVE-14139.3.patch
>
>
> To reproduce:
> 1. Start a CLI session and create a permanent function.
> 2. Exit current CLI session.
> 3. Start a new CLI session and drop the function.
> Stack trace:
> {noformat}
> FAILED: error during drop function: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.removePersistentFunctionUnderLock(Registry.java:513)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.unregisterFunction(Registry.java:501)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.unregisterPermanentFunction(FunctionRegistry.java:1532)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.dropPermanentFunction(FunctionTask.java:228)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:95)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:197)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1860)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1564)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1316)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1085)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1073)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14142) java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path for Hive on Spark

2016-07-05 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-14142:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Pushed to master. Thanks Ferdinand for reviewing.

> java.lang.ClassNotFoundException for the jar in hive.reloadable.aux.jars.path 
> for Hive on Spark
> ---
>
> Key: HIVE-14142
> URL: https://issues.apache.org/jira/browse/HIVE-14142
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.2.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Fix For: 2.2.0
>
> Attachments: HIVE-14142.1.patch
>
>
> Similar to HIVE-14037, seems HOS also has the same issue. The jars in 
> hive.reloadable.aux.jars.path are not available during runtime.
> {noformat}
> java.lang.RuntimeException: Reduce operator initialization failed
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:232)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
>   at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>   at 
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>   at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>   at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>   at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>   at org.apache.spark.scheduler.Task.run(Task.scala:89)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: 
> xudf.XAdd
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:134)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.isStateful(FunctionRegistry.java:1365)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.isDeterministic(FunctionRegistry.java:1328)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.isDeterministic(ExprNodeGenericFuncEvaluator.java:153)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.iterate(ExprNodeEvaluatorFactory.java:100)
>   at 
> org.apache.hadoop.hive.ql.exec.ExprNodeEvaluatorFactory.toCachedEvals(ExprNodeEvaluatorFactory.java:74)
>   at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:59)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:469)
>   at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:425)
>   at 
> org.apache.hadoop.hive.ql.exec.GroupByOperator.initializeOp(GroupByOperator.java:406)
>   at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:217)
>   ... 15 more
> Caused by: java.lang.ClassNotFoundException: xudf.XAdd
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:270)
>   at 
> org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.getUdfClass(GenericUDFBridge.java:132)
>   ... 27 more
> {noformat}
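One plausible shape of a fix is to wrap the session classloader in a URLClassLoader that also sees the reloadable aux jars and install it as the thread context classloader. The sketch below only demonstrates the wrapping mechanics with the JDK API; the class and method names (ReloadJarsDemo, withJars) are made up for illustration, and it does not address shipping the jars to the Spark executors, which is where the stack trace above says the class is missing.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class ReloadJarsDemo {
    // Wrap a parent classloader with one that also sees the given jar URLs.
    static ClassLoader withJars(URL[] jars, ClassLoader parent) {
        return new URLClassLoader(jars, parent);
    }

    public static void main(String[] args) throws Exception {
        ClassLoader parent = Thread.currentThread().getContextClassLoader();
        // No extra jars in this demo; a real fix would pass the
        // hive.reloadable.aux.jars.path entries here.
        ClassLoader loader = withJars(new URL[0], parent);
        Thread.currentThread().setContextClassLoader(loader);

        // Parent delegation still resolves ordinary classes:
        Class<?> c = Class.forName("java.lang.String", true, loader);
        System.out.println(c == String.class);  // prints true
    }
}
```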



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362443#comment-15362443
 ] 

niklaus xiao edited comment on HIVE-14156 at 7/5/16 12:53 PM:
--

I tried this on postgres, seems not an issue.

{code}
create table foo (name string, age int) partitioned by (city string) row format 
delimited fields terminated by ',';
alter table foo add partition(city='深圳');
show partitions foo;
+------------+--+
| partition  |
+------------+--+
| city=深圳   |
+------------+--+
1 row selected (0.355 seconds)
{code}


was (Author: niklaus.xiao):
I tried this on postgres, seems not an issue.

{quote}
create table foo (name string, age int) partitioned by (city string) row format 
delimited fields terminated by ',';
alter table foo add partition(city='深圳');
show partitions foo;
+------------+--+
| partition  |
+------------+--+
| city=深圳   |
+------------+--+
1 row selected (0.355 seconds)
{quote}

> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When checking the partition value in MySQL, it shows ?? instead of "北京".
> When running "drop table t1", it will hang.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14146) Column comments with "\n" character "corrupts" table metadata

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362446#comment-15362446
 ] 

niklaus xiao commented on HIVE-14146:
-

You can try this:
{code}
desc pretty commtest;
+---------------------------------------------+------------+----------+--+
| col_name                                    | data_type  | comment  |
+---------------------------------------------+------------+----------+--+
| col_name   data_type   comment              | NULL       | NULL     |
|                                             | NULL       | NULL     |
| first_nm   string      Indicates First name | NULL       | NULL     |
|                        of an individual     | NULL       | NULL     |
+---------------------------------------------+------------+----------+--+
{code}
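Until the metadata handling is fixed, a client-side workaround is to escape control characters before they go into a column comment, so the comment stays on one row in tabular output. This is a sketch of that idea only; the class and method names are hypothetical and not part of Hive.

```java
public class CommentEscapeDemo {
    // Escape newlines and tabs so a column comment renders on a single row.
    static String escapeComment(String comment) {
        return comment.replace("\n", "\\n").replace("\t", "\\t");
    }

    public static void main(String[] args) {
        System.out.println(escapeComment("Indicates First name\nof an individual"));
        // prints: Indicates First name\nof an individual
    }
}
```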

> Column comments with "\n" character "corrupts" table metadata
> -
>
> Key: HIVE-14146
> URL: https://issues.apache.org/jira/browse/HIVE-14146
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>
> Create a table with the following(noting the \n in the COMMENT):
> {noformat}
> CREATE TABLE commtest(first_nm string COMMENT 'Indicates First name\nof an 
> individual');
> {noformat}
> Describe shows that now the metadata is messed up:
> {noformat}
> beeline> describe commtest;
> +-------------------+------------+-----------------------+--+
> | col_name          | data_type  | comment               |
> +-------------------+------------+-----------------------+--+
> | first_nm          | string     | Indicates First name  |
> | of an individual  | NULL       | NULL                  |
> +-------------------+------------+-----------------------+--+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362443#comment-15362443
 ] 

niklaus xiao commented on HIVE-14156:
-

I tried this on postgres, seems not an issue.

{quote}
create table foo (name string, age int) partitioned by (city string) row format 
delimited fields terminated by ',';
alter table foo add partition(city='深圳');
show partitions foo;
+------------+--+
| partition  |
+------------+--+
| city=深圳   |
+------------+--+
1 row selected (0.355 seconds)
{quote}

> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When checking the partition value in MySQL, it shows ?? instead of "北京".
> When running "drop table t1", it will hang.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-12154) Load data inpath 'PATTERN' into table should only check files match the PATTERN

2016-07-05 Thread niklaus xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

niklaus xiao updated HIVE-12154:

Comment: was deleted

(was: Use 
{quote}
fs.globStatus(pattern); 
{quote}
instead of 
{quote}
fs.listStatus(path);
{quote}


Attached the initial patch.)

> Load data inpath 'PATTERN' into table should only check files match the 
> PATTERN
> ---
>
> Key: HIVE-12154
> URL: https://issues.apache.org/jira/browse/HIVE-12154
> Project: Hive
>  Issue Type: Bug
>  Components: SQLStandardAuthorization
>Affects Versions: 0.13.1, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>Reporter: niklaus xiao
>Priority: Minor
>
> We are using Flume to sink data to the HDFS directory '/tmp/test/'. 
> Temporary files that Flume is actively writing have a .tmp suffix; after the 
> write finishes, the file is renamed to SAMPLE.data.
> Hive periodic task execute script like 
> {quote}
> load data inpath '/tmp/test/*.data' into table t1;
> {quote}
> This exception happens sometimes
> {quote}
> 2015-10-12 19:38:00,133 | ERROR | HiveServer2-Handler-Pool: Thread-57 | 
> FAILED: HiveAuthzPluginException Error getting permissions for 
> hdfs://hacluster/tmp/test/*.data: null
> org.apache.hadoop.hive.ql.security.authorization.plugin.HiveAuthzPluginException:
>  Error getting permissions for hdfs://hacluster/tmp/test/*.data: null
> ...
> Caused by: java.io.FileNotFoundException: Path not found
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAccess(FSNamesystem.java:8175)
> {quote}
> I dug into the code and found that SQLStdHiveAuthorizationValidator checks 
> all the files in the /tmp/test/ directory, but by the time it checks the 
> permission of a .tmp file, the file has been renamed to .data, so HDFS 
> cannot find it.
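A deleted comment in this thread suggested matching only the files that fit the pattern (fs.globStatus) instead of listing the whole directory (fs.listStatus). The same glob semantics can be illustrated with the JDK's PathMatcher. Note this is plain java.nio, not the Hadoop FileSystem API, and the class and helper names are made up:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobDemo {
    // Only files matching the load pattern should be permission-checked
    // and loaded, so in-flight Flume temp files (*.tmp) are never touched.
    static boolean matchesLoadPattern(String fileName) {
        PathMatcher m = FileSystems.getDefault().getPathMatcher("glob:*.data");
        return m.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        System.out.println(matchesLoadPattern("SAMPLE.data"));        // true
        System.out.println(matchesLoadPattern(".templeton123.tmp"));  // false
    }
}
```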



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13883) WebHCat leaves token crc file never gets deleted

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362428#comment-15362428
 ] 

niklaus xiao commented on HIVE-13883:
-

[~sushanth] Could you take a look?

> WebHCat leaves token crc file never gets deleted
> 
>
> Key: HIVE-13883
> URL: https://issues.apache.org/jira/browse/HIVE-13883
> Project: Hive
>  Issue Type: Bug
>  Components: WebHCat
>Affects Versions: 1.2.0, 1.1.1, 1.2.1, 2.0.1
>Reporter: niklaus xiao
>Priority: Minor
> Attachments: HIVE-13883.patch
>
>
> In one of our long run environment, there are thousands of 
> /tmp/.templeton*.tmp.crc files, 
> {quote}
> omm@szxciitslx17645:/> ll /tmp/.templeton*.tmp.crc 
> ...
> -rw-r--r-- 1 omm  wheel 12 May 26 18:15 
> /tmp/.templeton6676048390600607654.tmp.crc
> -rw-r--r-- 1 omm  wheel 12 May 26 18:14 
> /tmp/.templeton2733383617337556503.tmp.crc
> -rw-r--r-- 1 omm  wheel 12 May 26 18:12 
> /tmp/.templeton2183121761801669064.tmp.crc
> -rw-r--r-- 1 omm  wheel 12 May 26 18:11 
> /tmp/.templeton2689764046140543879.tmp.crc
> ...
> {quote}
> {quote}
> omm@szxciitslx17645:/> ll /tmp/.templeton*.tmp.crc  | wc -l
> 17986
> {quote}
> It's created by webhcat, 
> [https://github.com/apache/hive/blob/master/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java#L193]
>   and never gets deleted 
> [https://github.com/apache/hive/blob/master/hcatalog/webhcat/svr/src/main/java/org/apache/hive/hcatalog/templeton/SecureProxySupport.java#L110]
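Hadoop's local checksum filesystem names the sibling checksum file "." + fileName + ".crc" in the same directory, which matches the listing above. A cleanup sketch with plain java.io — the class and method names are illustrative, not the actual WebHCat code:

```java
import java.io.File;
import java.io.IOException;

public class CrcCleanupDemo {
    // Name of the checksum sibling the local Hadoop filesystem creates.
    static String crcSiblingName(String fileName) {
        return "." + fileName + ".crc";
    }

    // Delete a temp token file together with its .crc sibling, if present.
    static void deleteWithCrc(File f) {
        File crc = new File(f.getParentFile(), crcSiblingName(f.getName()));
        f.delete();
        crc.delete();  // harmless no-op when the sibling does not exist
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        File token = File.createTempFile("templeton", ".tmp", dir);
        File crc = new File(dir, crcSiblingName(token.getName()));
        crc.createNewFile();
        deleteWithCrc(token);
        System.out.println(!token.exists() && !crc.exists());  // prints true
    }
}
```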



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14144) Permanent functions are showing up in show functions, but describe says it doesn't exist

2016-07-05 Thread Rajat Khandelwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362418#comment-15362418
 ] 

Rajat Khandelwal commented on HIVE-14144:
-

The fix for HIVE-13903 caused this. Because of that patch, some registration 
steps get skipped for permanent functions: although they *are* present in 
memory, some info is missing from the FunctionInfo objects for those 
functions. Hence, this patch reverts those changes and adds a check just in 
the download step, so the repeated-download issue (HIVE-13903) stays solved.

> Permanent functions are showing up in show functions, but describe says it 
> doesn't exist
> 
>
> Key: HIVE-14144
> URL: https://issues.apache.org/jira/browse/HIVE-14144
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-14144.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14158) deal with derived column names

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362406#comment-15362406
 ] 

Hive QA commented on HIVE-14158:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816136/HIVE-14158.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10296 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_masking_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/367/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/367/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-367/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816136 - PreCommit-HIVE-MASTER-Build

> deal with derived column names
> --
>
> Key: HIVE-14158
> URL: https://issues.apache.org/jira/browse/HIVE-14158
> Project: Hive
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Fix For: 2.1.0
>
> Attachments: HIVE-14158.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14113) Create function failed but function in show function list

2016-07-05 Thread niklaus xiao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362403#comment-15362403
 ] 

niklaus xiao commented on HIVE-14113:
-

The test failures are unrelated.

> Create function failed but function in show function list
> -
>
> Key: HIVE-14113
> URL: https://issues.apache.org/jira/browse/HIVE-14113
> Project: Hive
>  Issue Type: Bug
>  Components: UDF
>Affects Versions: 1.2.0
>Reporter: niklaus xiao
>Assignee: Navis
> Fix For: 1.3.0
>
> Attachments: HIVE-14113.1.patch
>
>
> 1. create function with invalid hdfs path, /udf/udf-test.jar does not exists
> {quote}
> create function my_lower as 'com.tang.UDFLower' using jar 
> 'hdfs:///udf/udf-test.jar';
> {quote}
> Failed with following exception:
> {quote}
> 0: jdbc:hive2://189.39.151.44:1/> create function my_lower as 
> 'com.tang.UDFLower' using jar 'hdfs:///udf/udf-test.jar';
> INFO  : converting to local hdfs:///udf/udf-test.jar
> ERROR : Failed to read external resource hdfs:///udf/udf-test.jar
> java.lang.RuntimeException: Failed to read external resource 
> hdfs:///udf/udf-test.jar
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1384)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.resolveAndDownload(SessionState.java:1340)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1264)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.add_resources(SessionState.java:1250)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.addFunctionResources(FunctionTask.java:306)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerToSessionRegistry(Registry.java:466)
>   at 
> org.apache.hadoop.hive.ql.exec.Registry.registerPermanentFunction(Registry.java:206)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerPermanentFunction(FunctionRegistry.java:1551)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.createPermanentFunction(FunctionTask.java:136)
>   at 
> org.apache.hadoop.hive.ql.exec.FunctionTask.execute(FunctionTask.java:75)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:158)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1965)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1723)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1475)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1283)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1278)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:167)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation.access$200(SQLOperation.java:75)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:245)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1711)
>   at 
> org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:258)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException: File does not exist: 
> hdfs:/udf/udf-test.jar
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1391)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1383)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1383)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:340)
>   at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:292)
>   at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2034)
>   at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2003)
>   at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1979)
>   at 
> org.apache.hadoop.hive.ql.session.SessionState.downloadResource(SessionState.java:1370)
>   ... 28 more
> ERROR : Failed to register default.my_lower using class com.tang.UDFLower
> Error: Error while processing statement: FAILED: Execution Error, return 

[jira] [Updated] (HIVE-14100) current_user() returns invalid information

2016-07-05 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14100:
--
Status: Patch Available  (was: Open)

The patch is ready

> current_user() returns invalid information
> --
>
> Key: HIVE-14100
> URL: https://issues.apache.org/jira/browse/HIVE-14100
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Beeline
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14100.patch
>
>
> Using HadoopDefaultAuthenticator, current_user() returns the username of 
> the unix user running HiveServer2.
> Using SessionStateAuthenticator, current_user() returns the username 
> provided when the connection was started.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14100) current_user() returns invalid information

2016-07-05 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14100:
--
Attachment: HIVE-14100.patch

Since there was no answer from the original author, and we do not want to break 
backward compatibility, a new function is provided to return the logged-in 
user.
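
To illustrate the intended distinction, the behavior could be exercised with 
queries along these lines (logged_in_user() is an assumed name for the new 
function, based on the description above, not confirmed by the patch):

{noformat}
-- current_user() keeps its existing (authenticator-dependent) semantics
SELECT current_user();
-- hypothetical new function returning the user who opened the connection
SELECT logged_in_user();
{noformat}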

> current_user() returns invalid information
> --
>
> Key: HIVE-14100
> URL: https://issues.apache.org/jira/browse/HIVE-14100
> Project: Hive
>  Issue Type: Bug
>  Components: Authentication, Beeline
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14100.patch
>
>
> Using HadoopDefaultAuthenticator, current_user() returns the username of 
> the unix user running HiveServer2.
> Using SessionStateAuthenticator, current_user() returns the username 
> provided when the connection was started.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-12548) Hive metastore goes down in Kerberos,sentry enabled CDH5.5 cluster

2016-07-05 Thread weiqiang chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362377#comment-15362377
 ] 

weiqiang chen commented on HIVE-12548:
--

Thanks Andrew, I have figured out the reason. I am using Spark and writing 
code to get the token myself.



On Fri, Jul 1, 2016 at 10:16 PM +0800, "Andrew Olson (JIRA)" 
> wrote:


[ 
https://issues.apache.org/jira/browse/HIVE-12548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15359017#comment-15359017
 ]

Andrew Olson commented on HIVE-12548:
-

[~dbacwq] It is difficult to suggest a solution for the problem of "it doesn't 
work". An exception stack trace or otherwise more details would be helpful.

Can you confirm that you are using Oozie, with a Java action? I don't know much 
about Sentry. This code was only relevant for Oozie + Java action + Metastore + 
Kerberos. It should be applicable for CDH 5.x.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


> Hive metastore goes down in Kerberos,sentry enabled CDH5.5 cluster
> --
>
> Key: HIVE-12548
> URL: https://issues.apache.org/jira/browse/HIVE-12548
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
> Environment: RHEL 6.5 CLOUDERA CDH 5.5
>Reporter: narendra reddy ganesana
>
> [pool-3-thread-10]: Error occurred during processing of message.
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: 
> Invalid status -128
>   at 
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:356)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
>   at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TTransportException: Invalid status 
> -128
>   at 
> org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
>   at 
> org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184)
>   at 
> org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>   at 
> org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:271)
>   at 
> org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>   at 
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>   ... 10 more



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14004) Minor compaction produces ArrayIndexOutOfBoundsException: 7 in SchemaEvolution.getFileType

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362330#comment-15362330
 ] 

Hive QA commented on HIVE-14004:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816140/HIVE-14004.01.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 10295 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_table_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateDatabaseWithTableNonDefaultNameNode
org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/366/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/366/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-366/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816140 - PreCommit-HIVE-MASTER-Build

> Minor compaction produces ArrayIndexOutOfBoundsException: 7 in 
> SchemaEvolution.getFileType
> --
>
> Key: HIVE-14004
> URL: https://issues.apache.org/jira/browse/HIVE-14004
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Matt McCline
> Attachments: HIVE-14004.01.patch
>
>
> Easiest way to repro is to add the following test
> {noformat}
>   @Test
>   public void testCompactWithDelete() throws Exception {
> int[][] tableData = {{1,2},{3,4}};
> runStatementOnDriver("insert into " + Table.ACIDTBL + "(a,b) " + 
> makeValuesClause(tableData));
> runStatementOnDriver("alter table "+ Table.ACIDTBL + " compact 'MAJOR'");
> Worker t = new Worker();
> t.setThreadId((int) t.getId());
> t.setHiveConf(hiveConf);
> AtomicBoolean stop = new AtomicBoolean();
> AtomicBoolean looped = new AtomicBoolean();
> stop.set(true);
> t.init(stop, looped);
> t.run();
> runStatementOnDriver("delete from " + Table.ACIDTBL + " where b = 4");
> runStatementOnDriver("update " + Table.ACIDTBL + " set b = -2 where b = 
> 2");
> runStatementOnDriver("alter table "+ Table.ACIDTBL + " compact 'MINOR'");
> t.run();
>   }
> {noformat}
> to TestTxnCommands2 and run it.
> The test won't fail, but look in target/tmp/log/hive.log for the following 
> exception (from the minor compaction).
> {noformat}
> 2016-06-09T18:36:39,071 WARN  [Thread-190[]]: mapred.LocalJobRunner 
> (LocalJobRunner.java:run(560)) - job_local1233973168_0005
> java.lang.Exception: java.lang.ArrayIndexOutOfBoundsException: 7
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) 
> ~[hadoop-mapreduce-client-common-2.6.1.jar:?]
> at 
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522) 
> [hadoop-mapreduce-client-common-2.6.1.jar:?]
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 7
> at 
> org.apache.orc.impl.SchemaEvolution.getFileType(SchemaEvolution.java:67) 
> ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.orc.impl.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2031)
>  ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:1716)
>  ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.orc.impl.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2077)
>  ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.orc.impl.TreeReaderFactory$StructTreeReader.<init>(TreeReaderFactory.java:1716)
>  ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> org.apache.orc.impl.TreeReaderFactory.createTreeReader(TreeReaderFactory.java:2077)
>  ~[hive-orc-2.2.0-SNAPSHOT.jar:2.2.0-SNAPSHOT]
> at 
> 

[jira] [Updated] (HIVE-14123) Add beeline configuration option to show database in the prompt

2016-07-05 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-14123:
--
Description: 
There are several JIRA issues complaining that Beeline does not respect 
hive.cli.print.current.db.

This is partially true: in embedded mode it has used hive.cli.print.current.db 
to change the prompt since HIVE-10511.

In beeline mode, I think this function should use a beeline command line 
option instead, like the showHeader option, emphasizing that this is a 
client-side option.

  was:
There are several JIRA issues complaining that Beeline does not respect 
hive.cli.print.current.db.

This is partially true: in embedded mode it has used hive.cli.print.current.db 
to change the prompt since HIVE-10511.

In remote mode, I think this function should use a beeline command line 
option instead, like the showHeader option, emphasizing that this is a 
client-side option.


> Add beeline configuration option to show database in the prompt
> ---
>
> Key: HIVE-14123
> URL: https://issues.apache.org/jira/browse/HIVE-14123
> Project: Hive
>  Issue Type: Improvement
>  Components: Beeline, CLI
>Affects Versions: 2.2.0
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Minor
> Attachments: HIVE-14123.2.patch, HIVE-14123.3.patch, HIVE-14123.patch
>
>
> There are several JIRA issues complaining that Beeline does not respect 
> hive.cli.print.current.db.
> This is partially true: in embedded mode it has used hive.cli.print.current.db 
> to change the prompt since HIVE-10511.
> In beeline mode, I think this function should use a beeline command line 
> option instead, like the showHeader option, emphasizing that this is a 
> client-side option.
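
For comparison, client-side Beeline options are already passed on the command 
line; the proposed option would presumably follow the same pattern (the option 
name --showDbInPrompt below is a hypothetical illustration modeled on 
--showHeader, not necessarily the name the patch uses):

{noformat}
beeline -u jdbc:hive2://localhost:10000 --showHeader=true --showDbInPrompt=true
{noformat}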



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2016-07-05 Thread Bing Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362286#comment-15362286
 ] 

Bing Li commented on HIVE-14156:


Hi, [~xiaobingo]
I noticed that you fixed HIVE-8550 on Windows, and mentioned that it should 
work on Linux.
I ran a similar query, but it failed with MySQL.

In order to make it work, besides the changes in the Hive schema script, I also 
needed to update MySQL's configuration file (my.cnf).

When you ran it on Windows, did you change the configuration for the database? 
Did you have a chance to run it on Linux as well?

Thank you.
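
For reference, the my.cnf change mentioned above is typically along these 
lines (a sketch under the assumption that the server and metastore database 
default to a non-UTF-8 character set; exact option names vary by MySQL 
version):

{noformat}
[mysqld]
character-set-server = utf8
collation-server = utf8_general_ci
{noformat}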


> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When checking the partition value in MySQL, it shows ?? instead of "北京".
> When running "drop table t1", it hangs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14156) Problem with Chinese characters as partition value when using MySQL

2016-07-05 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362278#comment-15362278
 ] 

Rui Li commented on HIVE-14156:
---

Hi [~xuefuz], do you have any idea on this one? Are users allowed to use 
Chinese characters for partition values? Thanks.

> Problem with Chinese characters as partition value when using MySQL
> ---
>
> Key: HIVE-14156
> URL: https://issues.apache.org/jira/browse/HIVE-14156
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 1.2.1, 2.0.0
>Reporter: Bing Li
>Assignee: Bing Li
>
> Steps to reproduce:
> create table t1 (name string, age int) partitioned by (city string) row 
> format delimited fields terminated by ',';
> load data local inpath '/tmp/chn-partition.txt' overwrite into table t1 
> partition (city='北京');
> The content of /tmp/chn-partition.txt:
> 小明,20
> 小红,15
> 张三,36
> 李四,50
> When checking the partition value in MySQL, it shows ?? instead of "北京".
> When running "drop table t1", it hangs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13825) Using JOIN in 2 tables that has same path locations, but different colum names fail wtih an error exception

2016-07-05 Thread Venkat Sambath (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362244#comment-15362244
 ] 

Venkat Sambath commented on HIVE-13825:
---

For some use cases this workaround wouldn't work. 
For example, if the use case involves mapping multiple tables onto the same 
file:

CREATE TABLE t1 ( a string, b string) location '/user/hive/warehouse/test1'; 
CREATE TABLE t2 ( d string, e string) location '/user/hive/warehouse/test1';

t1 and t2 map to different columns of the same file. In this case, creating 
view t3 on t2 and joining t3 with t1 will result in the same error as shown 
in the case description.

> Using JOIN in 2 tables that has same path locations, but different colum 
> names fail wtih an error exception
> ---
>
> Key: HIVE-13825
> URL: https://issues.apache.org/jira/browse/HIVE-13825
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergio Peña
>Assignee: Vihang Karajgaonkar
>
> The following scenario of 2 tables with same locations cannot be used on a 
> JOIN query:
> {noformat}
> hive> create table t1 (a string, b string) location 
> '/user/hive/warehouse/test1';
> OK
> hive> create table t2 (c string, d string) location 
> '/user/hive/warehouse/test1';
> OK
> hive> select t1.a from t1 join t2 on t1.a = t2.c;
> ...
> 2016-05-23 16:39:57 Starting to launch local task to process map join;
>   maximum memory = 477102080
> Execution failed with exit status: 2
> Obtaining error information
> Task failed!
> Task ID:
>   Stage-4
> Logs:
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
> {noformat}
> The logs contain this error exception:
> {noformat}
> 2016-05-23T16:39:58,163 ERROR [main]: mr.MapredLocalTask (:()) - Hive Runtime 
> Error: Map local work failed
> java.lang.RuntimeException: cannot find field a from [0:c, 1:d]
> at 
> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:485)
> at 
> org.apache.hadoop.hive.serde2.BaseStructObjectInspector.getStructFieldRef(BaseStructObjectInspector.java:133)
> at 
> org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:55)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:973)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:999)
> at 
> org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:75)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:355)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:504)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:457)
> at 
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:365)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.initializeOperators(MapredLocalTask.java:499)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.startForward(MapredLocalTask.java:403)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInProcess(MapredLocalTask.java:383)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.main(ExecDriver.java:751)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-14159) sorting of tuple array using multiple field[s]

2016-07-05 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15362203#comment-15362203
 ] 

Hive QA commented on HIVE-14159:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12816126/HIVE-14159.1.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 10301 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_show_functions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_multiinsert
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_all
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vector_complex_join
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/365/testReport
Console output: 
https://builds.apache.org/job/PreCommit-HIVE-MASTER-Build/365/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-365/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12816126 - PreCommit-HIVE-MASTER-Build

> sorting of tuple array using multiple field[s]
> --
>
> Key: HIVE-14159
> URL: https://issues.apache.org/jira/browse/HIVE-14159
> Project: Hive
>  Issue Type: Improvement
>  Components: UDF
>Reporter: Simanchal Das
>Assignee: Simanchal Das
>  Labels: patch
> Attachments: HIVE-14159.1.patch
>
>
> Problem Statement:
> When working with complex data structures such as Avro, we often encounter an 
> array that contains multiple tuples, where each tuple has a struct schema.
> Suppose the struct schema looks like the following:
> {noformat}
> {
>   "name": "employee",
>   "type": [{
>   "type": "record",
>   "name": "Employee",
>   "namespace": "com.company.Employee",
>   "fields": [{
>   "name": "empId",
>   "type": "int"
>   }, {
>   "name": "empName",
>   "type": "string"
>   }, {
>   "name": "age",
>   "type": "int"
>   }, {
>   "name": "salary",
>   "type": "double"
>   }]
>   }]
> }
> {noformat}
> Then, while running a Hive query, the complex array looks like an array of 
> Employee objects.
> {noformat}
> Example: 
> //(array<struct<empId:int,empName:string,age:int,salary:double>>)
>   
> Array[Employee(100,Foo,20,20990),Employee(500,Boo,30,50990),Employee(700,Harry,25,40990),Employee(100,Tom,35,70990)]
> {noformat}
> When implementing day-to-day business use cases, we often encounter problems 
> like sorting a tuple array by specific field[s] such as empId, empName, 
> salary, etc.
> Proposal:
> I have developed a UDF 'sort_array_field' which will sort a tuple array by 
> one or more fields in natural order.
> {noformat}
> Example:
>   1.Select 
> sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Salary");
>   output: 
> array[struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(500,Boo,30,50990),struct(100,Tom,35,70990)]
>   
>   2.Select 
> sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,80990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary");
>   output: 
> array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
>   3.Select 
> sort_array_field(array[struct(100,Foo,20,20990),struct(500,Boo,30,50990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)],"Name","Salary","Age");
>   output: 
> array[struct(500,Boo,30,50990),struct(500,Boo,30,80990),struct(100,Foo,20,20990),struct(700,Harry,25,40990),struct(100,Tom,35,70990)]
> {noformat}
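
The comparator chain at the heart of such a UDF can be sketched in plain Java 
(a hedged illustration only; the class and method names below are invented for 
the example and are not the patch's actual implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of the comparator logic sort_array_field would apply internally.
public class SortArrayFieldSketch {
    // Employee mirrors the struct<empId,empName,age,salary> schema above.
    static final class Employee {
        final int empId; final String empName; final int age; final double salary;
        Employee(int empId, String empName, int age, double salary) {
            this.empId = empId; this.empName = empName;
            this.age = age; this.salary = salary;
        }
    }

    // Sort by salary in natural (ascending) order and return the names,
    // mirroring example 1 above.
    static List<String> namesBySalary() {
        List<Employee> emps = new ArrayList<>(Arrays.asList(
                new Employee(100, "Foo", 20, 20990),
                new Employee(500, "Boo", 30, 50990),
                new Employee(700, "Harry", 25, 40990),
                new Employee(100, "Tom", 35, 70990)));
        // A single sort field; several fields would chain with .thenComparing(...).
        emps.sort(Comparator.comparingDouble((Employee e) -> e.salary));
        return emps.stream().map(e -> e.empName).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(namesBySalary()); // [Foo, Harry, Boo, Tom]
    }
}
```

Multiple sort fields, as in examples 2 and 3 above, would simply chain further 
comparators with thenComparing.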



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

