[jira] [Updated] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-28 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-5672:

Attachment: HIVE-5672.7.patch.tar.gz

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch, HIVE-5672.6.patch.tar.gz, HIVE-5672.7.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10507) Expose RetryingMetastoreClient to other external users of metastore client like Flume and Storm.

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517045#comment-14517045
 ] 

Hive QA commented on HIVE-10507:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728609/HIVE-10507.1.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8818 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3624/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3624/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3624/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728609 - PreCommit-HIVE-TRUNK-Build

 Expose  RetryingMetastoreClient to other external users of metastore client 
 like Flume and Storm.
 -

 Key: HIVE-10507
 URL: https://issues.apache.org/jira/browse/HIVE-10507
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10507.1.patch


 HiveMetastoreClient is now being relied upon by external clients like Flume 
 and Storm for streaming.
 When the thrift connection between MetaStoreClient and the metastore is 
 broken (due to intermittent network issues or a metastore restart), the 
 client does not handle the connection error and automatically re-establish 
 the connection. Currently the client process needs to be restarted to 
 re-establish the connection.
 The request here is to consider supporting the following behavior: for each 
 API invocation on the MetastoreClient, it should try to re-establish the 
 connection (if needed) once, and if that does not work, throw a specific 
 exception indicating the failure. The client could then handle the issue by 
 retrying the same API after some delay. By catching the specific connection 
 exception, the client could decide how many times to retry before aborting.
 Hive does this internally using RetryingMetastoreClient. This jira is supposed 
 to expose that mechanism to other users of the interface, which is useful both 
 for those users and from a metastore HA point of view.
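 A minimal Java sketch of the retry-once-then-surface behavior described above. 
 The wrapper and exception names here are illustrative assumptions, not Hive's 
 actual RetryingMetastoreClient API:
 {code}
 // Illustration only: retry a metastore call once after reconnecting, then throw
 // a specific exception so the caller can decide how often to retry and with what
 // delay. MetaStoreCall and MetastoreConnectionException are hypothetical names.
 import org.apache.thrift.TException;
 import org.apache.thrift.transport.TTransportException;

 public final class RetryOnceSketch {

   /** A unit of work against the metastore client. */
   interface MetaStoreCall<T> {
     T call() throws TException;
   }

   /** Thrown after the single automatic reconnect attempt also fails. */
   static class MetastoreConnectionException extends Exception {
     MetastoreConnectionException(String msg, Throwable cause) { super(msg, cause); }
   }

   static <T> T withReconnectRetry(MetaStoreCall<T> work, Runnable reconnect)
       throws TException, MetastoreConnectionException {
     try {
       return work.call();
     } catch (TTransportException broken) {
       reconnect.run();                 // re-establish the thrift connection once
       try {
         return work.call();
       } catch (TTransportException stillBroken) {
         throw new MetastoreConnectionException("metastore unreachable", stillBroken);
       }
     }
   }
 }
 {code}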



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10447) Beeline JDBC Driver to support 2 way SSL

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10447:
-
Labels: TODOC1.2  (was: )

 Beeline JDBC Driver to support 2 way SSL
 

 Key: HIVE-10447
 URL: https://issues.apache.org/jira/browse/HIVE-10447
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10447.1.patch, HIVE-10447.2.patch, 
 HIVE-10447.2.patch


 This jira should cover 2-way SSL authentication between the JDBC client and 
 the server, which requires the driver to support it.
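 A hedged sketch of what a 2-way SSL JDBC connection might look like from the 
 client side. The ssl/sslTrustStore/trustStorePassword URL parameters are the 
 existing one-way SSL options; the client-keystore parameter names needed for 
 2-way SSL are driver-specific and deliberately not guessed here:
 {code}
 // Sketch only: open a HiveServer2 JDBC connection over SSL. Host, paths and
 // credentials are placeholders.
 import java.sql.Connection;
 import java.sql.DriverManager;

 public class SslJdbcSketch {
   public static void main(String[] args) throws Exception {
     Class.forName("org.apache.hive.jdbc.HiveDriver");
     String url = "jdbc:hive2://hs2-host:10000/default"
         + ";ssl=true"
         + ";sslTrustStore=/path/to/truststore.jks;trustStorePassword=changeit";
     // 2-way SSL additionally requires presenting a client certificate/keystore;
     // the URL options for that are what this jira's driver change covers.
     try (Connection conn = DriverManager.getConnection(url, "user", "")) {
       System.out.println("connected: " + !conn.isClosed());
     }
   }
 }
 {code}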



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10504) ORC date column statistics should return primitive object instead of writable

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516800#comment-14516800
 ] 

Hive QA commented on HIVE-10504:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728530/HIVE-10504.1.patch

{color:red}ERROR:{color} -1 due to 18 failed/errored test(s), 8818 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableInBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testDateWritableNullSafeEqualsBloomFilter
org.apache.hadoop.hive.ql.io.orc.TestRecordReaderImpl.testPredEvalWithDateStats
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3622/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3622/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3622/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 18 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728530 - PreCommit-HIVE-TRUNK-Build

 ORC date column statistics should return primitive object instead of writable
 -

 Key: HIVE-10504
 URL: https://issues.apache.org/jira/browse/HIVE-10504
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0, 1.3.0
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10504.1.patch


 Date column statistics are inconsistent with the other column statistics: 
 they return DateWritable as opposed to the primitive variant, Date.
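 A small illustration of the mismatch from a caller's point of view: DateWritable 
 has to be unwrapped to get the primitive java.sql.Date that the other column 
 statistics already expose directly (standalone sketch, not the patch itself):
 {code}
 import java.sql.Date;
 import org.apache.hadoop.hive.serde2.io.DateWritable;

 public class DateStatsSketch {
   public static void main(String[] args) {
     DateWritable writable = new DateWritable(Date.valueOf("2015-04-28"));
     Date primitive = writable.get();   // the extra unwrapping step this jira removes
     System.out.println(primitive);     // prints 2015-04-28
   }
 }
 {code}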



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10428) NPE in RegexSerDe using HCat

2015-04-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10428:
--
Attachment: HIVE-10428.2.patch

Attaching patch v2, which makes the fix in InternalUtil.getSerdeProperties()

 NPE in RegexSerDe using HCat
 

 Key: HIVE-10428
 URL: https://issues.apache.org/jira/browse/HIVE-10428
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10428.1.patch, HIVE-10428.2.patch


 When HCatalog reads a table that uses org.apache.hadoop.hive.serde2.RegexSerDe, 
 the read call throws an exception:
 {noformat}
 15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; 
 Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: 
 (HDFS_DELEGATION_TOKEN token 1478 for haha)
 15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 Splits len : 1
 SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, 
 hdpseca05.seca.hwxsup.com]
 15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing 
 org.apache.hadoop.hive.serde2.RegexSerDe with properties 
 {name=casetest.regex_table, numFiles=1, columns.types=string,string, 
 serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, 
 output.format.string=%1$s %2$s, 
 serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, 
 COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, 
 input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172}
 15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been 
 deprecated
 Exception in thread "main" java.lang.NullPointerException
   at 
 com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
   at com.google.common.base.Splitter.split(Splitter.java:371)
   at 
 org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155)
   at 
 org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518)
   at 
 org.apache.hive.hcatalog.mapreduce.InternalUtil.initializeDeserializer(InternalUtil.java:156)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.createDeserializer(HCatRecordReader.java:127)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.initialize(HCatRecordReader.java:92)
   at HCatalogSQLMR.main(HCatalogSQLMR.java:81)
 {noformat}
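 For reference, a standalone sketch that initializes RegexSerDe directly with the 
 table properties from the log above; if the HCat path drops one of the required 
 properties, initialization fails, in the case reported here with a 
 NullPointerException out of Guava:
 {code}
 import java.util.Properties;
 import org.apache.hadoop.conf.Configuration;
 import org.apache.hadoop.hive.serde2.RegexSerDe;

 public class RegexSerDeInitSketch {
   public static void main(String[] args) throws Exception {
     Properties tbl = new Properties();
     tbl.setProperty("columns", "id,name");
     tbl.setProperty("columns.types", "string,string");
     tbl.setProperty("input.regex", "([^ ]*) ([^ ]*)");

     RegexSerDe serde = new RegexSerDe();
     serde.initialize(new Configuration(), tbl);  // fails if required keys are missing
     System.out.println("RegexSerDe initialized");
   }
 }
 {code}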



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10508) Strip out password information from config passed to Tez/MR in cases where password encryption is not used

2015-04-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517929#comment-14517929
 ] 

Thejas M Nair commented on HIVE-10508:
--

Thanks for the patch Hari!
 # Looks like additional changes would be needed for the MR case.
 # Setting the value to null would result in an error; see Configuration.set. 
Maybe set it to an empty string instead (see the sketch below)? If I remember 
right there was not a good way to remove the configuration that works in both 
1.x and 2.x.
 # conf.setVar(HiveConf.ConfVars.METASTOREPWD, null) is more concise/readable 
than HiveConf.setVar(conf, HiveConf.ConfVars.METASTOREPWD, null);
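A sketch of the suggestion in point 2, assuming the change is applied to the 
copy of the configuration that is shipped to Tez/MR (the surrounding code is 
not shown here):
{code}
import org.apache.hadoop.hive.conf.HiveConf;

public class StripMetastorePwdSketch {
  static HiveConf withoutPassword(HiveConf conf) {
    HiveConf copy = new HiveConf(conf);              // never mutate the live conf
    copy.setVar(HiveConf.ConfVars.METASTOREPWD, ""); // empty string, since null throws
    // equivalent static form:
    // HiveConf.setVar(copy, HiveConf.ConfVars.METASTOREPWD, "");
    return copy;
  }
}
{code}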


 Strip out password information from config passed to Tez/MR in cases where 
 password encryption is not used
 --

 Key: HIVE-10508
 URL: https://issues.apache.org/jira/browse/HIVE-10508
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10508.1.patch


 Remove password information from configuration copy that is sent to Yarn/Tez. 
 We don't need it there. The config entries can potentially be visible to 
 other users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10478) resolved

2015-04-28 Thread anna ken (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anna ken resolved HIVE-10478.
-
Resolution: Fixed

 resolved
 

 Key: HIVE-10478
 URL: https://issues.apache.org/jira/browse/HIVE-10478
 Project: Hive
  Issue Type: Task
  Components: Hive
Reporter: anna ken
  Labels: hadoop, hive, hue, kryo

 resolved



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10478) resolved

2015-04-28 Thread anna ken (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anna ken updated HIVE-10478:

Description: 

resolved

  was:
I am writing a simple hive query and receiving the following error 
intermittently. The error presents itself for 30min-2hr and then goes away.

Error: java.lang.RuntimeException: 
org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered 
unregistered class ID: 380

On the hive server the following Hive jar is installed:
hive-exec-0.13.1-cdh5.2.1.jar

I am using kryo version 2.22.

Please help me on this.

2015-04-23 10:35:55,496 WARN [main] org.apache.hadoop.conf.Configuration: 
job.xml:an attempt to override final parameter: hadoop.ssl.require.client.cert; 
 Ignoring.
2015-04-23 10:35:55,505 WARN [main] org.apache.hadoop.conf.Configuration: 
job.xml:an attempt to override final parameter: 
mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2015-04-23 10:35:55,507 WARN [main] org.apache.hadoop.conf.Configuration: 
job.xml:an attempt to override final parameter: hadoop.ssl.client.conf;  
Ignoring.
2015-04-23 10:35:55,512 WARN [main] org.apache.hadoop.conf.Configuration: 
job.xml:an attempt to override final parameter: 
hadoop.ssl.keystores.factory.class;  Ignoring.
2015-04-23 10:35:55,523 WARN [main] org.apache.hadoop.conf.Configuration: 
job.xml:an attempt to override final parameter: hadoop.ssl.server.conf;  
Ignoring.
2015-04-23 10:35:55,582 WARN [main] org.apache.hadoop.conf.Configuration: 
job.xml:an attempt to override final parameter: 
mapreduce.job.end-notification.max.attempts;  Ignoring.
2015-04-23 10:35:55,901 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
hadoop-metrics2.properties
2015-04-23 10:35:55,978 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 
10 second(s).
2015-04-23 10:35:55,979 INFO [main] 
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: MapTask metrics system 
started
2015-04-23 10:35:55,990 INFO [main] org.apache.hadoop.mapred.YarnChild: 
Executing with tokens:
2015-04-23 10:35:55,990 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
mapreduce.job, Service: job_1429749818660_0140, Ident: 
(org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@4c26add0)
2015-04-23 10:35:56,111 INFO [main] org.apache.hadoop.mapred.YarnChild: 
Sleeping for 0ms before retrying again. Got null now.
2015-04-23 10:35:58,290 ERROR [main] org.apache.hadoop.hive.ql.exec.Utilities: 
Failed to load plan: 
hdfs://nameservice1/tmp/hive-dks0344135/hive_2015-04-23_10-34-25_211_2333815521043174815-4/-mr-10016/b7fe67d9-5471-4e66-bdf4-6280f840f5ec/map.xml
org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered 
unregistered class ID: -848874534
Serialization trace:
startTimes (org.apache.hadoop.hive.ql.log.PerfLogger)
perfLogger (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.FilterOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.UnionOperator)
childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
at 
org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:119)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
at 
org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
at 
org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
2015-04-23 10:35:58,293 WARN [main] org.apache.hadoop.mapred.YarnChild: 
Exception running child : java.lang.RuntimeException: 
org.apache.hive.com.esotericsoftware.kryo.KryoException: Encountered 
unregistered class ID: -848874534
Serialization trace:
startTimes (org.apache.hadoop.hive.ql.log.PerfLogger)
perfLogger (org.apache.hadoop.hive.ql.exec.MapJoinOperator)
parentOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
parentOperators 

[jira] [Updated] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10514:
-
Attachment: HIVE-10514.3.patch

[~szehon] Good point, I have refactored the code to avoid redundancy. Please 
take a look at patch#3.

Thanks
Hari

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch, 
 HIVE-10514.3.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518433#comment-14518433
 ] 

Hive QA commented on HIVE-10454:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728785/HIVE-10454.patch

{color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 8825 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_alter_view_failure6
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_input_part0_neg
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3633/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3633/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3633/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 17 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728785 - PreCommit-HIVE-TRUNK-Build

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10454.patch


 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2  to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with No partition predicate found for alias t1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-04-28 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-9152:
---
Attachment: HIVE-9152.5-spark.patch

 Dynamic Partition Pruning [Spark Branch]
 

 Key: HIVE-9152
 URL: https://issues.apache.org/jira/browse/HIVE-9152
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Chao Sun
 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.2-spark.patch, 
 HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch


 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518244#comment-14518244
 ] 

Szehon Ho commented on HIVE-10514:
--

OK, got it.  I think last patch looks great, wondering if we can have a central 
method in QtestUtil for these methods (for writing the file, and reading)?

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10526) CBO (Calcite Return Path): HiveCost epsilon comparison should take row count in to account

2015-04-28 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518393#comment-14518393
 ] 

Laljo John Pullokkaran commented on HIVE-10526:
---

[~ashutoshc] Could you review this?

 CBO (Calcite Return Path): HiveCost epsilon comparison should take row count 
 in to account
 --

 Key: HIVE-10526
 URL: https://issues.apache.org/jira/browse/HIVE-10526
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10526.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10403) Add n-way join support for Hybrid Grace Hash Join

2015-04-28 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10403:
-
Attachment: HIVE-10403.06.patch

Replace patch 06 to fix a bug

 Add n-way join support for Hybrid Grace Hash Join
 -

 Key: HIVE-10403
 URL: https://issues.apache.org/jira/browse/HIVE-10403
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Wei Zheng
Assignee: Wei Zheng
 Attachments: HIVE-10403.01.patch, HIVE-10403.02.patch, 
 HIVE-10403.03.patch, HIVE-10403.04.patch, HIVE-10403.06.patch


 Currently Hybrid Grace Hash Join only supports 2-way join (one big table and 
 one small table). This task will enable n-way join (one big table and 
 multiple small tables).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10403) Add n-way join support for Hybrid Grace Hash Join

2015-04-28 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-10403:
-
Attachment: (was: HIVE-10403.06.patch)

 Add n-way join support for Hybrid Grace Hash Join
 -

 Key: HIVE-10403
 URL: https://issues.apache.org/jira/browse/HIVE-10403
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Wei Zheng
Assignee: Wei Zheng
 Attachments: HIVE-10403.01.patch, HIVE-10403.02.patch, 
 HIVE-10403.03.patch, HIVE-10403.04.patch, HIVE-10403.06.patch


 Currently Hybrid Grace Hash Join only supports 2-way join (one big table and 
 one small table). This task will enable n-way join (one big table and 
 multiple small tables).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10481) ACID table update finishes but values not really updated if column names are not all lower case

2015-04-28 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518307#comment-14518307
 ] 

Eugene Koifman commented on HIVE-10481:
---

The failures are not related. I tried a random handful from this list locally; 
they all pass.

 ACID table update finishes but values not really updated if column names are 
 not all lower case
 ---

 Key: HIVE-10481
 URL: https://issues.apache.org/jira/browse/HIVE-10481
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10481.2.patch, HIVE-10481.patch


 A column in the table is defined with upper case or mixed case. When an update 
 command uses the column name verbatim, the update doesn't change the value; 
 when the update uses the all-lower-case column name, it works.
 STEPS TO REPRODUCE:
 create table testable( a string, Bb string, c string)
 clustered by (c) into 3 buckets
 stored as orc
 tblproperties(transactional=true);
 insert into table testable values ('a1','b1','c1'), ('a2','b2','c2'), 
 ('a3','b3','c3');
 update table testable set Bb='bb';
 The job finishes, but the values are not really updated.
 update table testable set bb='bb'; it works.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10520) LLAP: Must reset small table result columns for Native Vectorization of Map Join

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518323#comment-14518323
 ] 

Hive QA commented on HIVE-10520:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728734/HIVE-10520.01.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8822 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3632/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3632/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3632/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728734 - PreCommit-HIVE-TRUNK-Build

 LLAP: Must reset small table result columns for Native Vectorization of Map 
 Join
 

 Key: HIVE-10520
 URL: https://issues.apache.org/jira/browse/HIVE-10520
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Fix For: 1.2.0, 1.3.0

 Attachments: HIVE-10520.01.patch


 Scratch columns are not getting reset by the input source, so the native 
 vector map join operators must manually reset the small table result columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10526) CBO (Calcite Return Path): HiveCost epsilon comparison should take row count in to account

2015-04-28 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518440#comment-14518440
 ] 

Laljo John Pullokkaran commented on HIVE-10526:
---

I am not sure about that. We are trying to say that the latency (CPU + IO) 
difference is less than epsilon and that the row count difference is less than 
epsilon.

By separating CPU and IO we may lose the additive effect.
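A compact sketch of the two alternatives under discussion: the additive form 
(cpu + io compared together) versus the separated form proposed in the other 
comment on this jira. RelOptUtil.EPSILON is Calcite's tolerance constant; the 
cost components are passed as plain doubles to keep the sketch self-contained:
{code}
import org.apache.calcite.plan.RelOptUtil;

public class EpsilonCompareSketch {
  // Additive: compare total latency (cpu + io) and row count against epsilon.
  static boolean additiveEquals(double cpu, double io, double rows,
                                double otherCpu, double otherIo, double otherRows) {
    return Math.abs((cpu + io) - (otherCpu + otherIo)) < RelOptUtil.EPSILON
        && Math.abs(rows - otherRows) < RelOptUtil.EPSILON;
  }

  // Separated: compare cpu, io and row count each against epsilon.
  static boolean separatedEquals(double cpu, double io, double rows,
                                 double otherCpu, double otherIo, double otherRows) {
    return Math.abs(cpu - otherCpu) < RelOptUtil.EPSILON
        && Math.abs(io - otherIo) < RelOptUtil.EPSILON
        && Math.abs(rows - otherRows) < RelOptUtil.EPSILON;
  }
}
{code}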

 CBO (Calcite Return Path): HiveCost epsilon comparison should take row count 
 in to account
 --

 Key: HIVE-10526
 URL: https://issues.apache.org/jira/browse/HIVE-10526
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10526.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10516) Measure Hive CLI's performance difference before and after implementation is switched

2015-04-28 Thread Ferdinand Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ferdinand Xu updated HIVE-10516:

Attachment: HIVE-10516.patch

 Measure Hive CLI's performance difference before and after implementation is 
 switched
 -

 Key: HIVE-10516
 URL: https://issues.apache.org/jira/browse/HIVE-10516
 Project: Hive
  Issue Type: Sub-task
  Components: CLI
Affects Versions: 0.10.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
 Attachments: HIVE-10516.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10526) CBO (Calcite Return Path): HiveCost epsilon comparison should take row count in to account

2015-04-28 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518417#comment-14518417
 ] 

Ashutosh Chauhan commented on HIVE-10526:
-

Wondering if it's better to separate io & cpu cost as well:
{code}
return (this == other) ||
  ((Math.abs(this.io - other.getIo()) < RelOptUtil.EPSILON) &&
   (Math.abs(this.cpu - other.getCpu()) < RelOptUtil.EPSILON) &&
   (Math.abs(this.rowCount - other.getRows()) < RelOptUtil.EPSILON));
{code}

 CBO (Calcite Return Path): HiveCost epsilon comparison should take row count 
 in to account
 --

 Key: HIVE-10526
 URL: https://issues.apache.org/jira/browse/HIVE-10526
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10526.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10522) CBO (Calcite Return Path): fix the wrong needed column names when TS is created

2015-04-28 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10522:
---
Attachment: HIVE-10522.02.patch

It seems that it needs more work.

 CBO (Calcite Return Path): fix the wrong needed column names when TS is 
 created
 ---

 Key: HIVE-10522
 URL: https://issues.apache.org/jira/browse/HIVE-10522
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
Priority: Critical
 Fix For: 1.2.0

 Attachments: HIVE-10522.01.patch, HIVE-10522.02.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9152) Dynamic Partition Pruning [Spark Branch]

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518594#comment-14518594
 ] 

Hive QA commented on HIVE-9152:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728999/HIVE-9152.5-spark.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8722 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_spark_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/846/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/846/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-846/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728999 - PreCommit-HIVE-SPARK-Build

 Dynamic Partition Pruning [Spark Branch]
 

 Key: HIVE-9152
 URL: https://issues.apache.org/jira/browse/HIVE-9152
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Chao Sun
 Attachments: HIVE-9152.1-spark.patch, HIVE-9152.2-spark.patch, 
 HIVE-9152.3-spark.patch, HIVE-9152.4-spark.patch, HIVE-9152.5-spark.patch


 Tez implemented dynamic partition pruning in HIVE-7826. This is a nice 
 optimization and we should implement the same in HOS.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10506) CBO (Calcite Return Path): Disallow return path to be enable if CBO is off

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516961#comment-14516961
 ] 

Hive QA commented on HIVE-10506:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728742/HIVE-10506.01.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8818 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchAbort
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3623/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3623/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3623/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728742 - PreCommit-HIVE-TRUNK-Build

 CBO (Calcite Return Path): Disallow return path to be enable if CBO is off
 --

 Key: HIVE-10506
 URL: https://issues.apache.org/jira/browse/HIVE-10506
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10506.01.patch, HIVE-10506.patch


 If hive.cbo.enable=false and hive.cbo.returnpath=true then some optimizations 
 would kick in. It's quite possible that customers might end up in this 
 scenario in their environments; we should prevent it.
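 A hedged sketch of the guard being requested, using plain Configuration lookups 
 with the property names taken verbatim from this description (the actual patch 
 may use the corresponding HiveConf.ConfVars constants and property names 
 instead):
 {code}
 import org.apache.hadoop.conf.Configuration;

 public class CboReturnPathGuardSketch {
   static void validate(Configuration conf) {
     boolean cboEnabled = conf.getBoolean("hive.cbo.enable", true);
     boolean returnPath = conf.getBoolean("hive.cbo.returnpath", false);
     if (!cboEnabled && returnPath) {
       // Disallow the combination instead of letting return-path logic kick in.
       throw new IllegalStateException(
           "hive.cbo.returnpath requires hive.cbo.enable=true");
     }
   }
 }
 {code}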



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-28 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516983#comment-14516983
 ] 

Nemon Lou commented on HIVE-5672:
-

The query plan for local directory is simple:
{quote}
explain insert overwrite local directory '/tmp/xxx' select * from src;
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: src
            Statistics: Num rows: 38 Data size: 11900 Basic stats: COMPLETE Column stats: NONE
            Select Operator
              expressions: id (type: string), starttime (type: bigint), callerno (type: string), note (type: string)
              outputColumnNames: _col0, _col1, _col2, _col3
              Statistics: Num rows: 38 Data size: 11900 Basic stats: COMPLETE Column stats: NONE
              File Output Operator
                compressed: false
                Statistics: Num rows: 38 Data size: 11900 Basic stats: COMPLETE Column stats: NONE
                table:
                    input format: org.apache.hadoop.mapred.TextInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                    serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Move Operator
      files:
          hdfs directory: false
          destination: /tmp/xxx

29 rows selected (5.957 seconds)
{quote}

While the query plan for DFS directory is complicated:
{quote}
explain insert overwrite directory '/tmp/xxx' select * from src;
STAGE DEPENDENCIES:

[jira] [Updated] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10454:

Attachment: HIVE-10454.patch

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu
 Attachments: HIVE-10454.patch


 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2  to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with No partition predicate found for alias t1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10442) HIVE-10098 broke hadoop-1 build

2015-04-28 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517028#comment-14517028
 ] 

Yongzhi Chen commented on HIVE-10442:
-

Thanks [~prasanth_j] for the review.

 HIVE-10098 broke hadoop-1 build
 ---

 Key: HIVE-10442
 URL: https://issues.apache.org/jira/browse/HIVE-10442
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Yongzhi Chen
 Fix For: 1.2.0, 1.3.0

 Attachments: HIVE-10442.1.patch, HIVE-10442.1.patch, 
 HIVE-10442.1.patch


 fs.addDelegationTokens() method does not seem to exist in hadoop 1.2.1. This 
 breaks the hadoop-1 builds.
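 For illustration only (not the fix in the attached patch), a reflection probe 
 that shows the incompatibility: the hadoop-2 FileSystem.addDelegationTokens 
 signature is simply absent on hadoop-1.2.1:
 {code}
 import java.lang.reflect.Method;
 import org.apache.hadoop.fs.FileSystem;
 import org.apache.hadoop.security.Credentials;

 public class DelegationTokenProbe {
   public static void main(String[] args) {
     try {
       Method m = FileSystem.class.getMethod(
           "addDelegationTokens", String.class, Credentials.class);
       System.out.println("hadoop-2 style API available: " + m);
     } catch (NoSuchMethodException e) {
       System.out.println("hadoop-1: addDelegationTokens is absent");
     }
   }
 }
 {code}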



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10454) Query against partitioned table in strict mode failed with No partition predicate found even if partition predicate is specified.

2015-04-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-10454:

Attachment: (was: HIVE-10454.patch)

 Query against partitioned table in strict mode failed with No partition 
 predicate found even if partition predicate is specified.
 ---

 Key: HIVE-10454
 URL: https://issues.apache.org/jira/browse/HIVE-10454
 Project: Hive
  Issue Type: Bug
Reporter: Aihua Xu
Assignee: Aihua Xu

 The following queries fail:
 {noformat}
 create table t1 (c1 int) PARTITIONED BY (c2 string);
 set hive.mapred.mode=strict;
 select * from t1 where t1.c2  to_date(date_add(from_unixtime( 
 unix_timestamp() ),1));
 {noformat}
 The query failed with No partition predicate found for alias t1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-6852) JDBC client connections hang at TSaslTransport

2015-04-28 Thread Vladimir Kovalchuk (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516998#comment-14516998
 ] 

Vladimir Kovalchuk commented on HIVE-6852:
--

100% reproducible.
...
"main" #1 prio=5 os_prio=0 tid=0x01e78000 nid=0x255c runnable 
[0x02bee000]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:170)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked 0xd63658d8 (a java.io.BufferedInputStream)
at 
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at 
org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
at 
org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:288)
at 
org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
at 
org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:190)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
...

 JDBC client connections hang at TSaslTransport
 --

 Key: HIVE-6852
 URL: https://issues.apache.org/jira/browse/HIVE-6852
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: jay vyas

 I've noticed that when there is an underlying issue in connecting a client to 
 the JDBC interface of the HiveServer2 to run queries, you get a hang after 
 the thrift portion, at least in certain scenarios: 
 Turning log4j to DEBUG, you can see the following when trying to get a 
 connection using:
 {noformat}
 Connection jdbc = 
 DriverManager.getConnection(this.con, "hive", "password");
 jdbc:hive2://localhost:1/default,
 {noformat}
 The logs get to here before the hang :
 {noformat}
 0[main] DEBUG org.apache.thrift.transport.TSaslTransport  - opening 
 transport org.apache.thrift.transport.TSaslClientTransport@219ba640
 0 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - opening 
 transport org.apache.thrift.transport.TSaslClientTransport@219ba640
 3[main] DEBUG org.apache.thrift.transport.TSaslClientTransport  - Sending 
 mechanism name PLAIN and initial response of length 14
 3 [main] DEBUG org.apache.thrift.transport.TSaslClientTransport  - Sending 
 mechanism name PLAIN and initial response of length 14
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: 
 Writing message with status START and payload length 5
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Writing 
 message with status START and payload length 5
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: 
 Writing message with status COMPLETE and payload length 14
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Writing 
 message with status COMPLETE and payload length 14
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Start 
 message handled
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Start 
 message handled
 5[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Main 
 negotiation loop complete
 5 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: Main 
 negotiation loop complete
 6[main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: SASL 
 Client receiving last message
 6 [main] DEBUG org.apache.thrift.transport.TSaslTransport  - CLIENT: SASL 
 Client receiving last message
 {noformat}
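 One mitigation sketch for the hang itself: bound how long the connect can block 
 via JDBC's standard login timeout. Whether the Hive driver of this era applies 
 that timeout to the SASL handshake shown above is an assumption, not something 
 this issue confirms; host, port and credentials below are placeholders:
 {code}
 import java.sql.Connection;
 import java.sql.DriverManager;

 public class BoundedConnectSketch {
   public static void main(String[] args) throws Exception {
     DriverManager.setLoginTimeout(30);  // seconds; fail instead of hanging forever
     try (Connection jdbc = DriverManager.getConnection(
         "jdbc:hive2://localhost:10000/default", "hive", "password")) {
       System.out.println("connected");
     }
   }
 }
 {code}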



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10526) CBO (Calcite Return Path): HiveCost epsilon comparison should take row count in to account

2015-04-28 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-10526:
--
Attachment: HIVE-10526.patch

 CBO (Calcite Return Path): HiveCost epsilon comparison should take row count 
 in to account
 --

 Key: HIVE-10526
 URL: https://issues.apache.org/jira/browse/HIVE-10526
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Fix For: 1.2.0

 Attachments: HIVE-10526.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10512) CBO (Calcite Return Path): SMBJoin conversion throws ClassCastException

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517157#comment-14517157
 ] 

Hive QA commented on HIVE-10512:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728748/HIVE-10512.01.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8817 tests 
executed
*Failed tests:*
{noformat}
TestCustomAuthentication - did not produce a TEST-*.xml file
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3625/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3625/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3625/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728748 - PreCommit-HIVE-TRUNK-Build

 CBO (Calcite Return Path): SMBJoin conversion throws ClassCastException
 ---

 Key: HIVE-10512
 URL: https://issues.apache.org/jira/browse/HIVE-10512
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10512.01.patch, HIVE-10512.patch


 When return path is on, SMB conversion is throwing an Exception in some cases.
 The problem can be reproduced with auto_join32.q. The Exception with the 
 following stacktrace is thrown:
 {noformat}
 java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.FilterOperator 
 cannot be cast to org.apache.hadoop.hive.ql.exec.TableScanOperator
 at 
 org.apache.hadoop.hive.ql.parse.TableAccessAnalyzer.genRootTableScan(TableAccessAnalyzer.java:243)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractBucketJoinProc.checkConvertBucketMapJoin(AbstractBucketJoinProc.java:226)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.canConvertJoinToBucketMapJoin(AbstractSMBJoinProc.java:497)
 at 
 org.apache.hadoop.hive.ql.optimizer.AbstractSMBJoinProc.canConvertJoinToSMBJoin(AbstractSMBJoinProc.java:414)
 at 
 org.apache.hadoop.hive.ql.optimizer.SortedMergeJoinProc.process(SortedMergeJoinProc.java:45)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:133)
 at 
 

[jira] [Commented] (HIVE-10428) NPE in RegexSerDe using HCat

2015-04-28 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517293#comment-14517293
 ] 

Ashutosh Chauhan commented on HIVE-10428:
-

[~jdere] Actually the bug is in 
o.a.hive.hcatalog.mapreduce.InternalUtil::getSerdeProperties(), which does not 
set the column comments in the Properties object it passes to the serde in 
initializeDeserializer() of the same class. It's better to fix the bug there so 
that we don't have to repeat the current patch's change in every possible serde.
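For illustration, a minimal sketch of that direction (hypothetical helper name, not the actual patch): make sure the Properties handed to the serde always carries a non-null columns.comments value, so RegexSerDe's split of that property never sees null. The '\0' delimiter between per-column comments is the convention Hive uses for this property.
{code}
import java.util.List;
import java.util.Properties;

public class SerdePropsSketch {
  // Hypothetical stand-in for the spot in InternalUtil.getSerdeProperties()
  // where the serde Properties are assembled from the HCat schema.
  static Properties withColumnComments(Properties props, List<String> columnNames) {
    // One (empty) comment per column, '\0'-delimited, so the value is never null.
    StringBuilder comments = new StringBuilder();
    for (int i = 0; i < columnNames.size(); i++) {
      if (i > 0) {
        comments.append('\0');
      }
    }
    props.setProperty("columns.comments", comments.toString());
    return props;
  }
}
{code}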
 

 NPE in RegexSerDe using HCat
 

 Key: HIVE-10428
 URL: https://issues.apache.org/jira/browse/HIVE-10428
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10428.1.patch


 When HCatalog calls to table with org.apache.hadoop.hive.serde2.RegexSerDe, 
 when doing Hcatalog call to get read the table, it throws exception:
 {noformat}
 15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; 
 Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: 
 (HDFS_DELEGATION_TOKEN token 1478 for haha)
 15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 Splits len : 1
 SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, 
 hdpseca05.seca.hwxsup.com]
 15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing 
 org.apache.hadoop.hive.serde2.RegexSerDe with properties 
 {name=casetest.regex_table, numFiles=1, columns.types=string,string, 
 serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, 
 output.format.string=%1$s %2$s, 
 serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, 
 COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, 
 input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172}
 15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been 
 deprecated
 Exception in thread main java.lang.NullPointerException
   at 
 com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
   at com.google.common.base.Splitter.split(Splitter.java:371)
   at 
 org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155)
   at 
 org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518)
   at 
 org.apache.hive.hcatalog.mapreduce.InternalUtil.initializeDeserializer(InternalUtil.java:156)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.createDeserializer(HCatRecordReader.java:127)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.initialize(HCatRecordReader.java:92)
   at HCatalogSQLMR.main(HCatalogSQLMR.java:81)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10527) NPE in SparkUtilities::isDedicatedCluster [Spark Branch]

2015-04-28 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10527:
--
Attachment: HIVE-10527.1-spark.patch

cc [~jxiang]

 NPE in SparkUtilities::isDedicatedCluster [Spark Branch]
 

 Key: HIVE-10527
 URL: https://issues.apache.org/jira/browse/HIVE-10527
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-10527.1-spark.patch


 We should add {{spark.master}} to HiveConf when it doesn't exist.
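 A minimal sketch of the guard being described (the property name comes from Spark; the default value chosen here is an assumption and should match the deployment):
 {code}
 import org.apache.hadoop.hive.conf.HiveConf;

 public class SparkMasterGuard {
   // Ensure spark.master is present before SparkUtilities inspects it,
   // so isDedicatedCluster-style checks don't NPE on a missing value.
   static void ensureSparkMaster(HiveConf conf) {
     if (conf.get("spark.master") == null) {
       conf.set("spark.master", "yarn-cluster"); // assumed default
     }
   }
 }
 {code}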



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10529) Cleanup tezcontext reference in org.apache.hadoop.hive.ql.exec.tez.HashTableLoader

2015-04-28 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-10529:

Attachment: HIVE-10529.1.patch

[~gopalv], [~hagleitn] - Please review.

 Cleanup tezcontext reference in 
 org.apache.hadoop.hive.ql.exec.tez.HashTableLoader
 --

 Key: HIVE-10529
 URL: https://issues.apache.org/jira/browse/HIVE-10529
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: HIVE-10529.1.patch, hive_hashtable_loader.png






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9674) *DropPartitionEvent should handle partition-sets.

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518624#comment-14518624
 ] 

Hive QA commented on HIVE-9674:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728832/HIVE-9674.5.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8825 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3635/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3635/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3635/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728832 - PreCommit-HIVE-TRUNK-Build

 *DropPartitionEvent should handle partition-sets.
 -

 Key: HIVE-9674
 URL: https://issues.apache.org/jira/browse/HIVE-9674
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9674.2.patch, HIVE-9674.3.patch, HIVE-9674.4.patch, 
 HIVE-9674.5.patch


 Dropping a set of N partitions from a table currently results in N 
 DropPartitionEvents (and N PreDropPartitionEvents) being fired serially. This 
 is wasteful, especially so for large N. It also makes it impossible to even 
 try to run authorization-checks on all partitions in a batch.
 Taking the cue from HIVE-9609, we should compose an {{Iterable<Partition>}} 
 in the event, and expose them via an {{Iterator}}.
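 A rough sketch of the batched event shape being described (class and method names are illustrative, not the actual patch):
 {code}
 import java.util.Collections;
 import java.util.Iterator;
 import java.util.List;

 import org.apache.hadoop.hive.metastore.api.Partition;

 // Illustrative only: a drop-partition event that carries the whole partition-set,
 // so listeners and authorizers can iterate the batch instead of receiving N events.
 public class BatchedDropPartitionEventSketch {
   private final List<Partition> partitions;

   public BatchedDropPartitionEventSketch(List<Partition> partitions) {
     this.partitions = Collections.unmodifiableList(partitions);
   }

   public Iterator<Partition> getPartitionIterator() {
     return partitions.iterator();
   }
 }
 {code}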



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10121) Implement a hive --service udflint command to check UDF jars for common shading mistakes

2015-04-28 Thread Abdelrahman Shettia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abdelrahman Shettia updated HIVE-10121:
---
Attachment: HIVE-10121.2.patch

Fixing minor formatting.

 Implement a hive --service udflint command to check UDF jars for common 
 shading mistakes
 

 Key: HIVE-10121
 URL: https://issues.apache.org/jira/browse/HIVE-10121
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Gopal V
Assignee: Abdelrahman Shettia
 Fix For: 1.2.0

 Attachments: HIVE-10121.1.patch, HIVE-10121.2.patch, bad_udfs.out, 
 bad_udfs_verbose.out, good_udfs.out, good_udfs_verbose.out


 Several SerDe and UDF jars tend to shade in various parts of the dependencies 
 including hadoop-common or guava without relocation.
 Implement a simple udflint tool which automates some part of the class path 
 and shaded resources audit process required when upgrading a hive install 
 from an old version to a new one.
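 As a rough illustration of the kind of check such a tool could run (the package prefixes and jar path are placeholders, not the actual udflint rules): scan the jar for classes that look like unrelocated hadoop-common or Guava.
 {code}
 import java.util.Enumeration;
 import java.util.jar.JarEntry;
 import java.util.jar.JarFile;

 public class UdfLintSketch {
   public static void main(String[] args) throws Exception {
     // args[0]: path to the UDF/SerDe jar under audit
     try (JarFile jar = new JarFile(args[0])) {
       Enumeration<JarEntry> entries = jar.entries();
       while (entries.hasMoreElements()) {
         String name = entries.nextElement().getName();
         // Classes under these prefixes inside a UDF jar usually mean
         // dependencies were shaded in without relocation.
         if (name.endsWith(".class")
             && (name.startsWith("org/apache/hadoop/")
                 || name.startsWith("com/google/common/"))) {
           System.out.println("suspicious entry: " + name);
         }
       }
     }
   }
 }
 {code}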



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518731#comment-14518731
 ] 

Szehon Ho commented on HIVE-10514:
--

+1 looks good to me, thanks for taking care of this.

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch, 
 HIVE-10514.3.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by the following, run the command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 And the following exception comes:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10483) insert overwrite partition deadlocks on itself with DbTxnManager

2015-04-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10483:
--
Attachment: HIVE-10483.2.patch

Patch 2 has a minor change to make sure it applies cleanly after HIVE-10481.

 insert overwrite partition deadlocks on itself with DbTxnManager
 

 Key: HIVE-10483
 URL: https://issues.apache.org/jira/browse/HIVE-10483
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10483.2.patch, HIVE-10483.patch


 insert overwrite ta partition(part=) select xxx from tb join ta where 
 part=
 It seems like the Shared lock conflicts with the Exclusive lock for Insert 
 Overwrite even though both are part of the same txn.
 More precisely insert overwrite requires X lock on partition and the read 
 side needs an S lock on the query.
 A simpler case is
 insert overwrite ta partition(part=) select * from ta



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-28 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518637#comment-14518637
 ] 

Nemon Lou commented on HIVE-5672:
-

Thanks to Xuefu and Sushanth for your review.
There is one more thing that I need to update:
removing occurrences of TOK_LOCAL_DIR, as you mentioned above.
Will update soon.

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch, HIVE-5672.6.patch.tar.gz, HIVE-5672.7.patch, 
 HIVE-5672.7.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great but non local 
 directory don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10520) LLAP: Must reset small table result columns for Native Vectorization of Map Join

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10520:

Attachment: HIVE-10520.02.patch

 LLAP: Must reset small table result columns for Native Vectorization of Map 
 Join
 

 Key: HIVE-10520
 URL: https://issues.apache.org/jira/browse/HIVE-10520
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Fix For: 1.2.0, 1.3.0

 Attachments: HIVE-10520.01.patch, HIVE-10520.02.patch


 Scratch columns not getting reset by input source, so native vector map join 
 operators must manually reset small table result columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10483) insert overwrite partition deadlocks on itself with DbTxnManager

2015-04-28 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman resolved HIVE-10483.
---
   Resolution: Fixed
Fix Version/s: 1.3.0
   1.2.0

committed to branch-1.2 and master. Thanks [~alangates] for the review

 insert overwrite partition deadlocks on itself with DbTxnManager
 

 Key: HIVE-10483
 URL: https://issues.apache.org/jira/browse/HIVE-10483
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Fix For: 1.2.0, 1.3.0

 Attachments: HIVE-10483.2.patch, HIVE-10483.patch


 insert overwrite ta partition(part=) select xxx from tb join ta where 
 part=
 It seems like the Shared lock conflicts with the Exclusive lock for Insert 
 Overwrite even though both are part of the same txn.
 More precisely insert overwrite requires X lock on partition and the read 
 side needs an S lock on the query.
 A simpler case is
 insert overwrite ta partition(part=) select * from ta



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10286) SARGs: Type Safety via PredicateLeaf.type

2015-04-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518652#comment-14518652
 ] 

Prasanth Jayachandran commented on HIVE-10286:
--

I will go over the test cases again to make sure the changes are correct.

 SARGs: Type Safety via PredicateLeaf.type
 -

 Key: HIVE-10286
 URL: https://issues.apache.org/jira/browse/HIVE-10286
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Reporter: Gopal V
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10286.1.patch, HIVE-10286.2.patch


 The Sargs impl today converts the statsObj to the type of the predicate 
 object before doing any comparisons.
 To satisfy the PPD requirements, the conversion has to be coerced to the type 
 specified in PredicateLeaf.type.
 The type conversions in Hive are standard and have a fixed promotion order.
 Therefore the PredicateLeaf has to do type changes which match the exact 
 order of type coercions offered by the FilterOperator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10286) SARGs: Type Safety via PredicateLeaf.type

2015-04-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10286:
-
Attachment: HIVE-10286.2.patch

Added changes for HIVE-10504 along with this patch, as PPD is already broken for 
the date case.

 SARGs: Type Safety via PredicateLeaf.type
 -

 Key: HIVE-10286
 URL: https://issues.apache.org/jira/browse/HIVE-10286
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Reporter: Gopal V
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10286.1.patch, HIVE-10286.2.patch


 The Sargs impl today converts the statsObj to the type of the predicate 
 object before doing any comparisons.
 To satisfy the PPD requirements, the conversion has to be coerced to the type 
 specified in PredicateLeaf.type.
 The type conversions in Hive are standard and have a fixed promotion order.
 Therefore the PredicateLeaf has to do type changes which match the exact 
 order of type coercions offered by the FilterOperator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10528) Hiveserver2 in HTTP mode is not applying auth_to_local rules

2015-04-28 Thread Abdelrahman Shettia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abdelrahman Shettia updated HIVE-10528:
---
Assignee: Abdelrahman Shettia

 Hiveserver2 in HTTP mode is not applying auth_to_local rules
 

 Key: HIVE-10528
 URL: https://issues.apache.org/jira/browse/HIVE-10528
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 0.14.0
 Environment: Centos 6
Reporter: Abdelrahman Shettia
Assignee: Abdelrahman Shettia

 PROBLEM: Authenticating to HS2 in HTTP mode with Kerberos, auth_to_local 
 mappings do not get applied.  Because of this various permissions checks 
 which rely on the local cluster name for a user are going to fail.
 STEPS TO REPRODUCE:
 1.  Create  kerberos cluster  and HS2 in HTTP mode
 2.  Create a new user, test, along with a kerberos principal for this user
 3.  Create a separate principal, mapped-test
 4.  Create an auth_to_local rule to make sure that mapped-test is mapped to 
 test
 5.  As the test user, connect to HS2 with beeline and create a simple table:
 {code}
 CREATE TABLE permtest (field1 int);
 {code}
 There is no need to load anything into this table.
 6.  Establish that it works as the test user:
 {code}
 show create table permtest;
 {code}
 7.  Drop the test identity and become mapped-test
 8.  Re-connect to HS2 with beeline, re-run the above command:
 {code}
 show create table permtest;
 {code}
 You will find that when this is done in HTTP mode, you will get an HDFS error 
 (because of StorageBasedAuthorization doing a HDFS permissions check) and the 
 user will be mapped-test and NOT test as it should be.
 ANALYSIS:  This appears to be HTTP specific and the problem seems to come in 
 {{ThriftHttpServlet$HttpKerberosServerAction.getPrincipalWithoutRealmAndHost()}}:
 {code}
   try {
 fullKerberosName = 
 ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
   } catch (IOException e) {
 throw new HttpAuthenticationException(e);
   }
   return fullKerberosName.getServiceName();
 {code}
 getServiceName applies no auth_to_local rules.  Seems like maybe this should 
 be getShortName()?
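 A sketch of that suggestion, mirroring the snippet quoted above; it assumes the KerberosNameShim returned by the shim also exposes getShortName() (which applies auth_to_local rules), and that ShimLoader and HttpAuthenticationException come from the surrounding servlet code:
 {code}
   // Sketch only: same shape as the snippet above, but return the short name
   // so auth_to_local rules get applied before downstream permission checks.
   try {
     fullKerberosName = ShimLoader.getHadoopShims().getKerberosNameShim(fullPrincipal);
   } catch (IOException e) {
     throw new HttpAuthenticationException(e);
   }
   return fullKerberosName.getShortName();
 {code}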



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9736) StorageBasedAuthProvider should batch namenode-calls where possible.

2015-04-28 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518695#comment-14518695
 ] 

Chris Nauroth commented on HIVE-9736:
-

Hi [~mithun].  Thank you for uploading a new patch.

I was unable to apply patch v3 to the master branch.  Does it need to be 
rebased, or should I be working with a different branch?

There was one suggestion I made on Review Board that still isn't implemented.  
In {{Hadoop23Shims#checkFileAccess}}, we can combine the multiple {{actions}} 
by using {{FsAction#or}}, and then call {{accessMethod.invoke}} just once to do 
the check in a single RPC (per file).  Were you planning to make this change, 
or is there a reason you decided not to do it?
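A sketch of the combination being suggested (FsAction.or is a real Hadoop API; the single access check that would consume the combined action is left abstract here):
{code}
import org.apache.hadoop.fs.permission.FsAction;

public class CombineActionsSketch {
  // Fold several requested actions into one FsAction so a single
  // access check (one RPC per file) can cover all of them.
  static FsAction combine(Iterable<FsAction> actions) {
    FsAction combined = FsAction.NONE;
    for (FsAction action : actions) {
      combined = combined.or(action);
    }
    return combined;
  }
}
{code}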

Aside from that, I can see all of my other feedback has been addressed.  Thanks 
again!

 StorageBasedAuthProvider should batch namenode-calls where possible.
 

 Key: HIVE-9736
 URL: https://issues.apache.org/jira/browse/HIVE-9736
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9736.1.patch, HIVE-9736.2.patch, HIVE-9736.3.patch


 Consider a table partitioned by 2 keys (dt, region). Say a dt partition could 
 have 1 associated regions. Consider that the user does:
 {code:sql}
 ALTER TABLE my_table DROP PARTITION (dt='20150101');
 {code}
 As things stand now, {{StorageBasedAuthProvider}} will make individual 
 {{DistributedFileSystem.listStatus()}} calls for each partition-directory, 
 and authorize each one separately. It'd be faster to batch the calls, and 
 examine multiple FileStatus objects at once.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10527) NPE in SparkUtilities::isDedicatedCluster [Spark Branch]

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518719#comment-14518719
 ] 

Hive QA commented on HIVE-10527:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12729039/HIVE-10527.1-spark.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8721 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/847/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/847/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-847/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12729039 - PreCommit-HIVE-SPARK-Build

 NPE in SparkUtilities::isDedicatedCluster [Spark Branch]
 

 Key: HIVE-10527
 URL: https://issues.apache.org/jira/browse/HIVE-10527
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li
 Attachments: HIVE-10527.1-spark.patch


 We should add {{spark.master}} to HiveConf when it doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-28 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-5672:

Attachment: HIVE-5672.8.patch.tar.gz

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch, HIVE-5672.6.patch.tar.gz, HIVE-5672.7.patch, 
 HIVE-5672.7.patch.tar.gz, HIVE-5672.8.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great but non local 
 directory don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10527) NPE in SparkUtilities::isDedicatedCluster [Spark Branch]

2015-04-28 Thread Rui Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Li updated HIVE-10527:
--
Summary: NPE in SparkUtilities::isDedicatedCluster [Spark Branch]  (was: 
NPE in SparkUtilities::isDedicatedCluster)

 NPE in SparkUtilities::isDedicatedCluster [Spark Branch]
 

 Key: HIVE-10527
 URL: https://issues.apache.org/jira/browse/HIVE-10527
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Rui Li
Assignee: Rui Li

 We should add {{spark.master}} to HiveConf when it doesn't exist.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10286) SARGs: Type Safety via PredicateLeaf.type

2015-04-28 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518674#comment-14518674
 ] 

Prasanth Jayachandran commented on HIVE-10286:
--

Added comments in the RB for all the test case changes to make the review easy 
:)

[~gopalv] FYI..

 SARGs: Type Safety via PredicateLeaf.type
 -

 Key: HIVE-10286
 URL: https://issues.apache.org/jira/browse/HIVE-10286
 Project: Hive
  Issue Type: Bug
  Components: File Formats, Serializers/Deserializers
Reporter: Gopal V
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10286.1.patch, HIVE-10286.2.patch


 The Sargs impl today converts the statsObj to the type of the predicate 
 object before doing any comparisons.
 To satisfy the PPD requirements, the conversion has to be coerced to the type 
 specified in PredicateLeaf.type.
 The type conversions in Hive are standard and have a fixed promotion order.
 Therefore the PredicateLeaf has to do type changes which match the exact 
 order of type coercions offered by the FilterOperator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10529) Cleanup tezcontext reference in org.apache.hadoop.hive.ql.exec.tez.HashTableLoader

2015-04-28 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated HIVE-10529:

Attachment: hive_hashtable_loader.png

 Cleanup tezcontext reference in 
 org.apache.hadoop.hive.ql.exec.tez.HashTableLoader
 --

 Key: HIVE-10529
 URL: https://issues.apache.org/jira/browse/HIVE-10529
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: hive_hashtable_loader.png






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10453) HS2 leaking open file descriptors when using UDFs

2015-04-28 Thread Yongzhi Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517324#comment-14517324
 ] 

Yongzhi Chen commented on HIVE-10453:
-

The file is opened when:
Resource res = ucp.getResource(path, false);
{noformat}
URLClassLoader$1.run() line: 358 [local variables unavailable]  
URLClassLoader$1.run() line: 355
AccessController.doPrivileged(PrivilegedExceptionAction<T>, AccessControlContext) line: not available [native method]
URLClassLoader.findClass(String) line: 354
URLClassLoader(ClassLoader).loadClass(String, boolean) line: 425
URLClassLoader(ClassLoader).loadClass(String) line: 358
Class<T>.forName0(String, boolean, ClassLoader) line: not available [native method]
Class<T>.forName(String, boolean, ClassLoader) line: 270
Registry.registerToSessionRegistry(String, FunctionInfo) line: 460
Registry.getQualifiedFunctionInfo(String) line: 438
Registry.getFunctionInfo(String) line: 250
FunctionRegistry.getFunctionInfo(String) line: 465
FunctionRegistry.impliesOrder(String) line: 1523
CalcitePlanner(SemanticAnalyzer).doPhase1GetAllAggregations(ASTNode, HashMap<String, ASTNode>, List<ASTNode>) line: 526

{noformat}


And closed by:
List<IOException> errors = ucp.closeLoaders();
{noformat}
URLClassLoader.close() line: 286 [local variables unavailable]  
JavaUtils.closeClassLoader(ClassLoader) line: 110   
JavaUtils.closeClassLoadersTo(ClassLoader, ClassLoader) line: 87
SessionState.close() line: 1450 
HiveSessionImplwithUGI(HiveSessionImpl).close() line: 566   
HiveSessionImplwithUGI.close() line: 110
NativeMethodAccessorImpl.invoke0(Method, Object, Object[]) line: not available 
[native method]  
NativeMethodAccessorImpl.invoke(Object, Object[]) line: 57  
DelegatingMethodAccessorImpl.invoke(Object, Object[]) line: 43  
Method.invoke(Object, Object...) line: 606  
HiveSessionProxy.invoke(Method, Object[]) line: 78  
HiveSessionProxy.access$000(HiveSessionProxy, Method, Object[]) line: 36
HiveSessionProxy$1.run() line: 63   
AccessController.doPrivileged(PrivilegedExceptionAction<T>, AccessControlContext) line: not available [native method]
Subject.doAs(Subject, PrivilegedExceptionAction<T>) line: 415
UserGroupInformation.doAs(PrivilegedExceptionAction<T>) line: 1628
HiveSessionProxy.invoke(Object, Method, Object[]) line: 59  
$Proxy23.close() line: not available
SessionManager.closeSession(SessionHandle) line: 279
CLIService.closeSession(SessionHandle) line: 237
{noformat}

The test sometimes fails and sometimes succeeds; these are standard Java APIs, 
and it is not clear why the behavior is non-deterministic.
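For reference, a minimal standalone illustration (not HiveServer2 code) of the pattern involved: a URLClassLoader keeps the jar open until close() is called, which is what JavaUtils.closeClassLoader ultimately relies on. The jar path and class name are placeholders.
{code}
import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class UdfJarHandleSketch {
  public static void main(String[] args) throws Exception {
    // args[0]: path to a local copy of the UDF jar (placeholder)
    URL jarUrl = new File(args[0]).toURI().toURL();
    URLClassLoader loader = new URLClassLoader(new URL[] { jarUrl },
        Thread.currentThread().getContextClassLoader());
    try {
      // Loading a class (or resource) from the jar opens a file handle
      // that stays cached inside the loader.
      loader.loadClass("com.example.MyUdf"); // hypothetical class name
    } catch (ClassNotFoundException expectedForThisSketch) {
      // ignore: the point is the open handle, not the class itself
    } finally {
      // Without this close(), the jar file descriptor stays open for the
      // lifetime of the loader, which is the leak described above.
      loader.close();
    }
  }
}
{code}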


 HS2 leaking open file descriptors when using UDFs
 -

 Key: HIVE-10453
 URL: https://issues.apache.org/jira/browse/HIVE-10453
 Project: Hive
  Issue Type: Bug
Reporter: Yongzhi Chen
Assignee: Yongzhi Chen

 1. create a custom function by
 CREATE FUNCTION myfunc AS 'someudfclass' using jar 'hdfs:///tmp/myudf.jar';
 2. Create a simple jdbc client, just do 
 connect, 
 run a simple query that uses the function, such as:
 select myfunc(col1) from sometable
 3. Disconnect.
 Check open files for HiveServer2 by:
 lsof -p HSProcID | grep myudf.jar
 You will see the leak as:
 {noformat}
 java  28718 ychen  txt  REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 java  28718 ychen  330r REG1,4741 212977666 
 /private/var/folders/6p/7_njf13d6h144wldzbbsfpz8gp/T/1bfe3de0-ac63-4eba-a725-6a9840f1f8d5_resources/myudf.jar
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.

2015-04-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517340#comment-14517340
 ] 

Alan Gates commented on HIVE-10165:
---

A clarifying question before I review the rest of the patch.  I was assuming 
that the record identifier struct would not be in the incoming records, but if 
I read the patch correctly you're assuming that it is.

If the record identifier struct is not in the incoming data you would need to 
buffer up records from a single transaction in memory.  I assume that the data 
would have a primary key (user generated, not the record id) that you'd use to 
match updates and deletes against.  Then when the user called 
BaseTransactionBatch.commit the partition would be scanned, updates and deletes 
applied, and the results written out.  I don't see something like this 
happening in the code.

If the record identifier struct is in the incoming data, how is the user 
supposed to get that information?  I had assumed the use case here was loading 
data off of transactional systems that I assume would have their own primary 
key which is not related to Hive's record identifier.

If you are assuming the record identifier struct is in the incoming data you 
will still need to batch up the data and sort it before commit.  That's because 
the merger assumes that the rows are ordered by (originaltxnid, lasttxnid, 
rowid).
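For illustration, a sketch of the kind of pre-commit ordering being described; the field names follow the comment above and the record type is hypothetical:
{code}
import java.util.Comparator;
import java.util.List;

public class MutationOrderSketch {
  // Hypothetical carrier for a buffered mutation and its record identifier.
  static class Mutation {
    long originalTxnId;
    long lastTxnId;
    long rowId;
  }

  // Order buffered mutations the way the merger expects before writing them out.
  static void sortForCommit(List<Mutation> buffered) {
    buffered.sort(Comparator
        .comparingLong((Mutation m) -> m.originalTxnId)
        .thenComparingLong(m -> m.lastTxnId)
        .thenComparingLong(m -> m.rowId));
  }
}
{code}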

Note that I'm not saying it should be one way or another, I'm just trying to 
understand the intent.

 Improve hive-hcatalog-streaming extensibility and support updates and deletes.
 --

 Key: HIVE-10165
 URL: https://issues.apache.org/jira/browse/HIVE-10165
 Project: Hive
  Issue Type: Improvement
  Components: HCatalog
Reporter: Elliot West
Assignee: Elliot West
  Labels: streaming_api
 Fix For: 1.2.0

 Attachments: HIVE-10165.0.patch


 h3. Overview
 I'd like to extend the 
 [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest]
  API so that it also supports the writing of record updates and deletes in 
 addition to the already supported inserts.
 h3. Motivation
 We have many Hadoop processes outside of Hive that merge changed facts into 
 existing datasets. Traditionally we achieve this by: reading in a 
 ground-truth dataset and a modified dataset, grouping by a key, sorting by a 
 sequence and then applying a function to determine inserted, updated, and 
 deleted rows. However, in our current scheme we must rewrite all partitions 
 that may potentially contain changes. In practice the number of mutated 
 records is very small when compared with the records contained in a 
 partition. This approach results in a number of operational issues:
 * Excessive amount of write activity required for small data changes.
 * Downstream applications cannot robustly read these datasets while they are 
 being updated.
 * Due to scale of the updates (hundreds or partitions) the scope for 
 contention is high. 
 I believe we can address this problem by instead writing only the changed 
 records to a Hive transactional table. This should drastically reduce the 
 amount of data that we need to write and also provide a means for managing 
 concurrent access to the data. Our existing merge processes can read and 
 retain each record's {{ROW_ID}}/{{RecordIdentifier}} and pass this through to 
 an updated form of the hive-hcatalog-streaming API which will then have the 
 required data to perform an update or insert in a transactional manner. 
 h3. Benefits
 * Enables the creation of large-scale dataset merge processes  
 * Opens up Hive transactional functionality in an accessible manner to 
 processes that operate outside of Hive.
 h3. Implementation
 Our changes do not break the existing API contracts. Instead our approach has 
 been to consider the functionality offered by the existing API and our 
 proposed API as fulfilling separate and distinct use-cases. The existing API 
 is primarily focused on the task of continuously writing large volumes of new 
 data into a Hive table for near-immediate analysis. Our use-case however, is 
 concerned more with the frequent but not continuous ingestion of mutations to 
 a Hive table from some ETL merge process. Consequently we feel it is 
 justifiable to add our new functionality via an alternative set of public 
 interfaces and leave the existing API as is. This keeps both APIs clean and 
 focused at the expense of presenting additional options to potential users. 
 Wherever possible, shared implementation concerns have been factored out into 
 abstract base classes that are open to third-party extension. A detailed 
 breakdown of the changes is as follows:
 * We've introduced a public {{RecordMutator}} 

[jira] [Commented] (HIVE-10517) HCatPartition should not be created with "" as location in tests

2015-04-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517356#comment-14517356
 ] 

Sushanth Sowmyan commented on HIVE-10517:
-

Test failures reported here are unrelated.

 HCatPartition should not be created with "" as location in tests
 

 Key: HIVE-10517
 URL: https://issues.apache.org/jira/browse/HIVE-10517
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 1.2.0
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-10517.patch


 Tests in TestHCatClient and TestCommands wind up instantiating HCatPartition 
 with a dummy empty String as location. This causes test failures when run 
 against an existing metastore, as introduced by HIVE-10074.
 We need to instantiate actual values instead of dummy "" strings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10445) Report error when dynamic partition insert is not following the correct syntax

2015-04-28 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10445:

Attachment: (was: HIVE-10045.1.patch)

 Report error when dynamic partition insert is not following the correct syntax
 --

 Key: HIVE-10445
 URL: https://issues.apache.org/jira/browse/HIVE-10445
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.1.0
Reporter: Chao Sun
Assignee: Chao Sun

 With dynamic partition insert, user should follow the syntax as specified in: 
 https://cwiki.apache.org/confluence/display/Hive/DynamicPartitions.
 However, this is purely enforced on the user side, and there's no checking in 
 Hive. As a result, this could cause unexpected results for user queries, or 
 confusing error messages.
 I think we need to display information about which input column is used as a DP 
 column.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10445) Report error when dynamic partition insert is not following the correct syntax

2015-04-28 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10445:

Attachment: HIVE-10445.1.patch

Tests seem flaky. Reattach the same patch and try again.

 Report error when dynamic partition insert is not following the correct syntax
 --

 Key: HIVE-10445
 URL: https://issues.apache.org/jira/browse/HIVE-10445
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 1.1.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10445.1.patch


 With dynamic partition insert, user should follow the syntax as specified in: 
 https://cwiki.apache.org/confluence/display/Hive/DynamicPartitions.
 However, this is purely enforced on the user side, and there's no checking in 
 Hive. As a result, this could cause unexpected results for user queries, or 
 confusing error messages.
 I think we need to display information about which input column is used as a DP 
 column.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8165) Annotation changes for replication

2015-04-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517418#comment-14517418
 ] 

Sushanth Sowmyan commented on HIVE-8165:


Given that we intend to introduce an HDFS-based event log that'll take over from 
the metastore-based one (so that events can be kept for longer than 1 day), it's 
possible that we may want to disable the metastore-client interfaces at some 
point - that's why I wanted to mark those as LimitedPrivate - so as not to have 
to support them after they're disabled (if they're disabled).

As to ReplicationTask and the HCatClient calls themselves, I guess we can skip 
LimitedPrivate as long as we have Evolving - they were an extension of the fact 
that they relied on the metastore-side event log.

 Annotation changes for replication
 --

 Key: HIVE-8165
 URL: https://issues.apache.org/jira/browse/HIVE-8165
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-8165.patch


 We need to make a couple of changes for annotating the recent changes.
 a) Marking old notification listener in HCatalog as @Deprecated, linking 
 instead to the new repl/ module.
 b) Mark the new interfaces as @Evolving @Unstable
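 A sketch of what (b) looks like in code, using Hive's classification annotations; the interface name below is hypothetical and only the annotation shape matters:
 {code}
 import org.apache.hadoop.hive.common.classification.InterfaceAudience;
 import org.apache.hadoop.hive.common.classification.InterfaceStability;

 // Hypothetical replication-facing interface; annotated as limited-private and evolving.
 @InterfaceAudience.LimitedPrivate({"Hive"})
 @InterfaceStability.Evolving
 public interface ReplicationTaskFactorySketch {
   // ... factory methods elided ...
 }
 {code}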



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10499) Ensure Session/ZooKeeperClient instances are closed

2015-04-28 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517410#comment-14517410
 ] 

Jimmy Xiang commented on HIVE-10499:


The test failures are not related. They fail in other pre-commit tests too.

 Ensure Session/ZooKeeperClient instances are closed
 ---

 Key: HIVE-10499
 URL: https://issues.apache.org/jira/browse/HIVE-10499
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-10499.patch


 Some Session/ZooKeeperClient instances are not closed in some scenario. We 
 need to make sure they are always closed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9674) *DropPartitionEvent should handle partition-sets.

2015-04-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-9674:
---
Attachment: HIVE-9674.5.patch

 *DropPartitionEvent should handle partition-sets.
 -

 Key: HIVE-9674
 URL: https://issues.apache.org/jira/browse/HIVE-9674
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-9674.2.patch, HIVE-9674.3.patch, HIVE-9674.4.patch, 
 HIVE-9674.5.patch


 Dropping a set of N partitions from a table currently results in N 
 DropPartitionEvents (and N PreDropPartitionEvents) being fired serially. This 
 is wasteful, especially so for large N. It also makes it impossible to even 
 try to run authorization-checks on all partitions in a batch.
 Taking the cue from HIVE-9609, we should compose an {{Iterable<Partition>}} 
 in the event, and expose them via an {{Iterator}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10525) loading data into list bucketing table when null in skew column

2015-04-28 Thread Gabriel C Balan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel C Balan updated HIVE-10525:
---
Description: 
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

{code:title=has-null.csv|borderStyle=solid}
1
2
\N
3
{code}

{code:title=no-null.csv|borderStyle=solid}
1
2
3
{code}

{code}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}

I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 
(LocalJobRunner.java:run(560)) - job_local402607316_0001
java.lang.Exception: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {x:null}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 10 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.generateListBucketingDirName(FileSinkOperator.java:833)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:615)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)


  was:
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

 more *null.csv
::
has-null.csv
::
1
2
\N
3
::
no-null.csv
::
1
2
3

{code}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}

I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 

[jira] [Updated] (HIVE-10521) TxnHandler.timeOutTxns only times out some of the expired transactions

2015-04-28 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-10521:
--
Attachment: HIVE-10521.patch

Attaching a patch that changes timeOutTxns to loop through and time out all old 
transactions instead of only the first 20 it finds.

[~ekoifman], can you review this?

[~sushanth], can I commit this to the 1.2 branch?

 TxnHandler.timeOutTxns only times out some of the expired transactions
 --

 Key: HIVE-10521
 URL: https://issues.apache.org/jira/browse/HIVE-10521
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Alan Gates
Assignee: Alan Gates
 Attachments: HIVE-10521.patch


 {code}
   for (int i = 0; i < 20 && rs.next(); i++) deadTxns.add(rs.getLong(1));
   // We don't care whether all of the transactions get deleted or not,
   // if some didn't it most likely means someone else deleted them in the 
 interim
   if (deadTxns.size() > 0) abortTxns(dbConn, deadTxns);
 {code}
 While it makes sense to limit the number of transactions aborted in one pass 
 (since this gets translated to an IN clause), we should still make sure all 
 are timed out.  Also, 20 seems pretty small as a batch size.
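
For illustration only, here is a minimal plain-JDBC sketch of the batching approach described above. It is not the attached patch; the TXNS table and column names, the batch size, and the abortTxns signature are assumptions. It drains the result set batch by batch so that every expired transaction is eventually aborted:

{code}
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class TimeOutAllTxnsSketch {
  // Batch size for each IN clause; 20 in the snippet above, larger here.
  private static final int TIMEOUT_BATCH_SIZE = 1000;

  /** Abort every transaction older than the timeout, batch by batch. */
  static void timeOutTxns(Connection dbConn, long maxAgeMillis) throws SQLException {
    long cutoff = System.currentTimeMillis() - maxAgeMillis;
    try (Statement stmt = dbConn.createStatement();
         ResultSet rs = stmt.executeQuery(
             "select txn_id from TXNS where txn_last_heartbeat < " + cutoff)) {
      List<Long> deadTxns = new ArrayList<>(TIMEOUT_BATCH_SIZE);
      while (rs.next()) {
        deadTxns.add(rs.getLong(1));
        if (deadTxns.size() >= TIMEOUT_BATCH_SIZE) {
          abortTxns(dbConn, deadTxns);   // keep each IN clause bounded
          deadTxns.clear();
        }
      }
      if (!deadTxns.isEmpty()) {
        abortTxns(dbConn, deadTxns);     // abort the final partial batch
      }
    }
  }

  // Placeholder for TxnHandler.abortTxns(): builds the IN list and runs the delete/update.
  static void abortTxns(Connection dbConn, List<Long> txnIds) throws SQLException {
    // omitted in this sketch
  }
}
{code}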



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10464) resolved

2015-04-28 Thread ankush (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ankush updated HIVE-10464:
--
Description: resolved  (was: Could you please let me know how I can find the 
Kryo version that I am using?

Please help me on this,

We are just running HQL (Hive) queries)
Summary: resolved  (was: How do I find the Kryo version)

 resolved
 

 Key: HIVE-10464
 URL: https://issues.apache.org/jira/browse/HIVE-10464
 Project: Hive
  Issue Type: Improvement
Reporter: ankush

 resolved



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10524) Add utility method ExprNodeDescUtils.forwardTrack()

2015-04-28 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-10524:
--
Attachment: HIVE-10524.1.patch

initial patch

 Add utility method ExprNodeDescUtils.forwardTrack()
 ---

 Key: HIVE-10524
 URL: https://issues.apache.org/jira/browse/HIVE-10524
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10524.1.patch


 ExprNodeDescUtils has a method backtrack(), which is able to take an 
 ExprNodeDesc from an operator and convert it to an equivalent expression 
 based on the columns of a parent operator. Adding a forwardTrack() method to 
 do something similar, but for a child operator.
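
As a rough illustration of the idea (not the attached patch; Expr, ColumnExpr, and FuncExpr below are simplified stand-ins for ExprNodeDesc and its subclasses), forward-tracking rewrites column references through the child operator's column mapping and gives up when a column is not forwarded:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for ExprNodeDesc: a column reference or a function call. */
abstract class Expr { }

class ColumnExpr extends Expr {
  final String name;
  ColumnExpr(String name) { this.name = name; }
}

class FuncExpr extends Expr {
  final String func;
  final List<Expr> children;
  FuncExpr(String func, List<Expr> children) { this.func = func; this.children = children; }
}

public class ForwardTrackSketch {
  /**
   * Rewrite an expression over the current operator's columns into the equivalent
   * expression over a child operator's output columns, given a mapping
   * parent column name -> child output column name.
   */
  static Expr forwardTrack(Expr expr, Map<String, String> parentToChildColumn) {
    if (expr instanceof ColumnExpr) {
      String childCol = parentToChildColumn.get(((ColumnExpr) expr).name);
      // If the child does not forward this column, the expression cannot be tracked.
      return childCol == null ? null : new ColumnExpr(childCol);
    }
    FuncExpr f = (FuncExpr) expr;
    List<Expr> mapped = new ArrayList<>();
    for (Expr child : f.children) {
      Expr m = forwardTrack(child, parentToChildColumn);
      if (m == null) return null;
      mapped.add(m);
    }
    return new FuncExpr(f.func, mapped);
  }
}
{code}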



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10500) Repeated deadlocks in underlying RDBMS cause transaction or lock failure

2015-04-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518233#comment-14518233
 ] 

Alan Gates commented on HIVE-10500:
---

[~ekoifman], I didn't change the log level as that was already that way.  This 
patch doesn't make it worse.  If we find it to be a problem we can change it 
separately.

 Repeated deadlocks in underlying RDBMS cause transaction or lock failure
 

 Key: HIVE-10500
 URL: https://issues.apache.org/jira/browse/HIVE-10500
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 0.14.0, 1.0.0, 1.1.0
Reporter: Alan Gates
Assignee: Alan Gates
 Fix For: 1.2.0

 Attachments: HIVE-10050.patch


 In some cases in a busy system, deadlocks in the metastore RDBMS can cause 
 failures in Hive locks and transactions when using DbTxnManager
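
For context, a minimal sketch of the usual remedy, retrying the metastore DB operation when the RDBMS reports a deadlock instead of surfacing the failure to the lock/transaction caller. This is not the committed patch; the retry count, sleep, and SQL-state check are assumptions:

{code}
import java.sql.SQLException;
import java.util.concurrent.Callable;

public class DeadlockRetrySketch {
  private static final int MAX_RETRIES = 10;
  private static final long SLEEP_MS = 1000;

  /** Run a DB operation, retrying a bounded number of times on deadlock. */
  static <T> T withDeadlockRetry(Callable<T> dbOp) throws Exception {
    SQLException last = null;
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
      try {
        return dbOp.call();
      } catch (SQLException e) {
        if (!isDeadlock(e)) throw e;      // only deadlocks are retried
        last = e;
        Thread.sleep(SLEEP_MS);           // back off before retrying
      }
    }
    throw last;                            // give up after MAX_RETRIES deadlocks
  }

  // SQL state 40001 is the standard "serialization failure / deadlock" code;
  // real databases may report vendor-specific codes as well.
  static boolean isDeadlock(SQLException e) {
    return "40001".equals(e.getSQLState());
  }
}
{code}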



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10499) Ensure Session/ZooKeeperClient instances are closed

2015-04-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518021#comment-14518021
 ] 

Szehon Ho commented on HIVE-10499:
--

Thanks +1

 Ensure Session/ZooKeeperClient instances are closed
 ---

 Key: HIVE-10499
 URL: https://issues.apache.org/jira/browse/HIVE-10499
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-10499.patch


 Some Session/ZooKeeperClient instances are not closed in some scenario. We 
 need to make sure they are always closed.
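
As a general illustration of the pattern such a fix relies on (not the patch itself; ZooKeeperClientHandle below is a hypothetical stand-in), try-with-resources guarantees the handle is released even on the error path:

{code}
/** Hypothetical client handle; the real class would wrap a ZooKeeper/Curator client. */
class ZooKeeperClientHandle implements AutoCloseable {
  void register(String path) { /* talk to ZooKeeper */ }
  @Override public void close() { /* release the ZooKeeper connection */ }
}

public class EnsureCloseSketch {
  static void doWork() throws Exception {
    // try-with-resources runs close() even if register() throws, which is the
    // property the patch is after for Session/ZooKeeperClient instances.
    try (ZooKeeperClientHandle zk = new ZooKeeperClientHandle()) {
      zk.register("/hiveserver2");
    }
  }
}
{code}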



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-10140) Window boundary is not compared correctly

2015-04-28 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu reassigned HIVE-10140:
---

Assignee: Aihua Xu

 Window boundary is not compared correctly
 -

 Key: HIVE-10140
 URL: https://issues.apache.org/jira/browse/HIVE-10140
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 1.0.0
Reporter: Yi Zhang
Assignee: Aihua Xu
Priority: Minor

 “ROWS between 10 preceding and 2 preceding” is not handled correctly.
 Underlying error: Window range invalid, start boundary is greater than end 
 boundary: window(start=range(10 PRECEDING), end=range(2 PRECEDING))
 If I change it to “2 preceding and 10 preceding”, the syntax works but the 
 results are 0 of course.
 Reason for the feature: during analysis, it is sometimes desirable to design 
 the window so that it filters out the most recent events, for the case where the events' 
 responses are not available yet. There is a workaround for this, but it is 
 better to fix the bug properly. 
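
To make the comparison issue concrete, a small sketch (simplified stand-ins, not Hive's PTF code) of why 10 PRECEDING is a valid start for an end of 2 PRECEDING: a larger PRECEDING amount lies earlier in the partition, so boundaries should be compared by relative position rather than by raw amount:

{code}
public class WindowBoundarySketch {
  enum Direction { PRECEDING, CURRENT, FOLLOWING }

  /** A window frame boundary: a direction plus an offset in rows. */
  static final class Boundary {
    final Direction direction;
    final int amount;
    Boundary(Direction direction, int amount) { this.direction = direction; this.amount = amount; }

    /** Position relative to the current row: PRECEDING offsets are negative. */
    int relativePosition() {
      switch (direction) {
        case PRECEDING: return -amount;
        case FOLLOWING: return amount;
        default:        return 0;
      }
    }
  }

  /** Valid iff the start boundary is not after the end boundary. */
  static boolean isValidFrame(Boundary start, Boundary end) {
    return start.relativePosition() <= end.relativePosition();
  }

  public static void main(String[] args) {
    Boundary start = new Boundary(Direction.PRECEDING, 10); // 10 PRECEDING -> -10
    Boundary end   = new Boundary(Direction.PRECEDING, 2);  //  2 PRECEDING -> -2
    System.out.println(isValidFrame(start, end));           // true: -10 <= -2
  }
}
{code}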



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10140) Window boundary is not compared correctly

2015-04-28 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518017#comment-14518017
 ] 

Aihua Xu commented on HIVE-10140:
-

Let me take a look. 

 Window boundary is not compared correctly
 -

 Key: HIVE-10140
 URL: https://issues.apache.org/jira/browse/HIVE-10140
 Project: Hive
  Issue Type: Bug
  Components: PTF-Windowing
Affects Versions: 1.0.0
Reporter: Yi Zhang
Assignee: Aihua Xu
Priority: Minor

 “ROWS between 10 preceding and 2 preceding” is not handled correctly.
 Underlying error: Window range invalid, start boundary is greater than end 
 boundary: window(start=range(10 PRECEDING), end=range(2 PRECEDING))
 If I change it to “2 preceding and 10 preceding”, the syntax works but the 
 results are 0 of course.
 Reason for the feature: during analysis, it is sometimes desirable to design 
 the window so that it filters out the most recent events, for the case where the events' 
 responses are not available yet. There is a workaround for this, but it is 
 better to fix the bug properly. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10525) loading data into list bucketing table when null in skew column

2015-04-28 Thread Gabriel C Balan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel C Balan updated HIVE-10525:
---
Description: 
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

 more *null.csv
::
has-null.csv
::
1
2
\N
3
::
no-null.csv
::
1
2
3

{code}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}

I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 
(LocalJobRunner.java:run(560)) - job_local402607316_0001
java.lang.Exception: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {x:null}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 10 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.generateListBucketingDirName(FileSinkOperator.java:833)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:615)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)


  was:
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

 more *null.csv
::
has-null.csv
::
1
2
\N
3
::
no-null.csv
::
1
2
3


set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds

I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 

[jira] [Updated] (HIVE-10520) LLAP: Must reset small table result columns for Native Vectorization of Map Join

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10520:

Affects Version/s: 1.2.0

 LLAP: Must reset small table result columns for Native Vectorization of Map 
 Join
 

 Key: HIVE-10520
 URL: https://issues.apache.org/jira/browse/HIVE-10520
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Fix For: 1.2.0

 Attachments: HIVE-10520.01.patch


 Scratch columns are not getting reset by the input source, so native vector map join 
 operators must manually reset the small table result columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8165) Annotation changes for replication

2015-04-28 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517964#comment-14517964
 ] 

Hive QA commented on HIVE-8165:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728845/HIVE-8165.2.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8822 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3630/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3630/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3630/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728845 - PreCommit-HIVE-TRUNK-Build

 Annotation changes for replication
 --

 Key: HIVE-8165
 URL: https://issues.apache.org/jira/browse/HIVE-8165
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-8165.2.patch, HIVE-8165.patch


 We need to make a couple of changes for annotating the recent changes.
 a) Marking old notification listener in HCatalog as @Deprecated, linking 
 instead to the new repl/ module.
 b) Mark the new interfaces as @Evolving @Unstable



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10508) Strip out password information from config passed to Tez/MR in cases where password encryption is not used

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10508:
-
Attachment: (was: HIVE-10508.2.patch)

 Strip out password information from config passed to Tez/MR in cases where 
 password encryption is not used
 --

 Key: HIVE-10508
 URL: https://issues.apache.org/jira/browse/HIVE-10508
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10508.1.patch, HIVE-10508.2.patch


 Remove password information from configuration copy that is sent to Yarn/Tez. 
 We don't need it there. The config entries can potentially be visible to 
 other users.
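
For illustration, a minimal sketch of the idea (not the actual patch; the substring-based filter is an assumption): copy the Configuration and drop password-like entries before the copy is handed to the execution engine:

{code}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

public class StripPasswordsSketch {
  /** Return a copy of conf with password-like entries removed. */
  static Configuration copyWithoutPasswords(Configuration conf) {
    Configuration copy = new Configuration(conf);   // do not mutate the caller's conf
    List<String> toRemove = new ArrayList<>();
    for (Map.Entry<String, String> entry : copy) {
      // Heuristic filter: drop anything whose key mentions "password".
      if (entry.getKey().toLowerCase().contains("password")) {
        toRemove.add(entry.getKey());
      }
    }
    for (String key : toRemove) {
      copy.unset(key);                              // remove after iterating
    }
    return copy;
  }
}
{code}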



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10508) Strip out password information from config passed to Tez/MR in cases where password encryption is not used

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10508:
-
Attachment: HIVE-10508.3.patch

Added this to SparkTask as well in patch #3

 Strip out password information from config passed to Tez/MR in cases where 
 password encryption is not used
 --

 Key: HIVE-10508
 URL: https://issues.apache.org/jira/browse/HIVE-10508
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10508.1.patch, HIVE-10508.2.patch, 
 HIVE-10508.3.patch


 Remove password information from configuration copy that is sent to Yarn/Tez. 
 We don't need it there. The config entries can potentially be visible to 
 other users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518216#comment-14518216
 ] 

Sushanth Sowmyan commented on HIVE-10450:
-

+1 for inclusion into branch-1.2

 More than one TableScan in MapWork not supported in Vectorization -- causes  
 query to fail during vectorization
 ---

 Key: HIVE-10450
 URL: https://issues.apache.org/jira/browse/HIVE-10450
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
 HIVE-10450.02.patch, HIVE-10450.03.patch, HIVE-10450.04.patch


 [~gopalv] found an error with this query:
 {noformat}
 explain select
 s_state, count(1)
  from store_sales,
  store,
  date_dim
  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
store_sales.ss_store_sk = store.s_store_sk and
store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
  group by s_state
  order by s_state
  limit 100;
 {noformat}
 Stack trace:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.reflect.InvocationTargetException
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
   at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
   at 
 

[jira] [Updated] (HIVE-10525) loading data into list bucketing table when null in skew column

2015-04-28 Thread Gabriel C Balan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel C Balan updated HIVE-10525:
---
Description: 
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

{code:title=has-null.csv|borderStyle=solid}
1
2
\N
3
{code}

{code:title=no-null.csv|borderStyle=solid}
1
2
3
{code}

{code:title=hive cli|borderStyle=solid}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}

I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 
(LocalJobRunner.java:run(560)) - job_local402607316_0001
java.lang.Exception: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {x:null}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 10 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.generateListBucketingDirName(FileSinkOperator.java:833)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:615)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)


  was:
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

{code:title=has-null.csv|borderStyle=solid}
1
2
\N
3
{code}

{code:title=no-null.csv|borderStyle=solid}
1
2
3
{code}

{code}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}

I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  

[jira] [Updated] (HIVE-10525) loading data into list bucketing table when null in skew column

2015-04-28 Thread Gabriel C Balan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel C Balan updated HIVE-10525:
---
Description: 
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

{code:title=has-null.csv}
1
2
\N
3
{code}

{code:title=no-null.csv}
1
2
3
{code}

{code:title=hive cli|borderStyle=solid}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}

{noformat:nopanel=true}I see this in ${hive.log.dir}/hive.log{noformat}

2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 
(LocalJobRunner.java:run(560)) - job_local402607316_0001
java.lang.Exception: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
processing row {x:null}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error 
while processing row {x:null}
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 10 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.generateListBucketingDirName(FileSinkOperator.java:833)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:615)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at 
org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)


  was:
I'm trying to load data into a list bucketing table.
The insert statement fails when there are nulls going into the skew column.
If this is the expected behavior, there is no mention of this restriction in 
the doc.

{code:title=has-null.csv}
1
2
\N
3
{code}

{code:title=no-null.csv}
1
2
3
{code}

{code:title=hive cli|borderStyle=solid}
set hive.mapred.supports.subdirectories=true;
set hive.optimize.listbucketing=true;
set mapred.input.dir.recursive=true;
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

create table src_with_null (x int);
load data local inpath 'has-null.csv' overwrite into table src_with_null;

create table src_no_null (x int);
load data local inpath 'no-null.csv' overwrite into table src_no_null;

create table lb (x int) partitioned by (p string) 
skewed by ( x ) on (1) STORED AS DIRECTORIES
stored as rcfile;

insert overwrite table lb partition (p = 'foo') select * from src_with_null;
--fails

insert overwrite table lb partition (p = 'foo') select * from src_no_null;
--succeeds
{code}


I see this in ${hive.log.dir}/hive.log

2015-04-28 13:43:47,646 WARN  [Thread-82]: 

[jira] [Commented] (HIVE-10507) Expose RetryingMetastoreClient to other external users of metastore client like Flume and Storm.

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518154#comment-14518154
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10507:
--

The test failures are unrelated to this change.

Thanks
Hari

 Expose  RetryingMetastoreClient to other external users of metastore client 
 like Flume and Storm.
 -

 Key: HIVE-10507
 URL: https://issues.apache.org/jira/browse/HIVE-10507
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10507.1.patch


 HiveMetastoreClient is now being relied upon by external clients like Flume 
 and Storm for streaming.
 When the thrift connection between MetaStoreClient and the meta store is 
 broken (due to intermittent network issues or restarting of metastore) the 
 Metastore does not handle the connection error and automatically re-establish 
 the connection. Currently the client process needs to be restarted to 
 re-establish the connection.
 The request here is to consider supporting the following behavior: for each API 
 invocation on the MetastoreClient, it should try to re-establish the connection 
 (if needed) once, and if that does not work out, throw a specific 
 exception indicating the same. The client could then handle the issue by 
 retrying the same API after some delay. By catching the specific connection 
 exception, the client could decide how many times to retry before aborting.
 Hive does this internally using RetryingMetastoreClient. This jira is supposed 
 to expose this mechanism to other users of that interface. This is useful for 
 users of this interface, and from a metastore HA point of view.
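
To illustrate the kind of wrapper being requested (a sketch, not Hive's RetryingMetaStoreClient; the MetastoreClient interface and exception name below are stand-ins), a dynamic proxy can re-establish the connection once per call and throw a specific exception when that also fails:

{code}
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

/** Stand-in for the metastore client interface used by Flume/Storm. */
interface MetastoreClient {
  void heartbeat() throws Exception;
}

/** Thrown when the single reconnect attempt also fails; callers decide how to retry. */
class MetastoreConnectionLostException extends Exception {
  MetastoreConnectionLostException(Throwable cause) { super(cause); }
}

public class RetryingClientSketch implements InvocationHandler {
  private final MetastoreClient delegate;

  private RetryingClientSketch(MetastoreClient delegate) { this.delegate = delegate; }

  public static MetastoreClient wrap(MetastoreClient delegate) {
    return (MetastoreClient) Proxy.newProxyInstance(
        MetastoreClient.class.getClassLoader(),
        new Class<?>[] { MetastoreClient.class },
        new RetryingClientSketch(delegate));
  }

  @Override
  public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
    try {
      return method.invoke(delegate, args);
    } catch (InvocationTargetException first) {
      reconnect();                              // try to re-establish the connection once
      try {
        return method.invoke(delegate, args);   // retry the same API call once
      } catch (InvocationTargetException second) {
        // Surface a specific exception so the caller can retry after a delay.
        throw new MetastoreConnectionLostException(second.getCause());
      }
    }
  }

  private void reconnect() {
    // omitted: close the broken thrift transport and open a new one
  }
}
{code}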



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10233) Hive on LLAP: Memory manager

2015-04-28 Thread Gunther Hagleitner (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518152#comment-14518152
 ] 

Gunther Hagleitner commented on HIVE-10233:
---

I'm still reviewing, but there are some changes in this that I think are 
unnecessary. I think you've renamed the llap memory manager to 
MemoryManagerInterface to make room for another MemoryManager 
(ql/exec/MemoryManager). But that one isn't used. You really use the 
ExecMemoryManager. 

So - you could roll back the changes to the llap cache, remove the old memory 
manager and just use the exec one. That simplifies the patch.

I also think you don't need a memory manager class at all. All it does is 
remember a field per operator. It seems cleaner to add memInfo to the operator 
base class with some facilities to track memory. (or introduce a class between 
operator and gby/join/rs).
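
A rough sketch of that alternative (simplified stand-ins, not Hive's Operator hierarchy): keep a per-operator memory field in the base class and let memory-hungry operators report into it:

{code}
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for Hive's Operator base class. */
abstract class SketchOperator {
  private long usedMemoryBytes;          // the "memInfo" field suggested above

  protected void updateMemoryUsage(long bytes) { usedMemoryBytes = bytes; }
  long getMemoryUsage() { return usedMemoryBytes; }
}

/** A memory-hungry operator (e.g. group-by, join, reduce-sink) reports its usage. */
class SketchGroupByOperator extends SketchOperator {
  private final List<Object> hashTable = new ArrayList<>();

  void process(Object row, long estimatedRowBytes) {
    hashTable.add(row);
    // Report usage as rows accumulate so a shared pool can decide when to flush or spill.
    updateMemoryUsage(hashTable.size() * estimatedRowBytes);
  }
}
{code}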

 Hive on LLAP: Memory manager
 

 Key: HIVE-10233
 URL: https://issues.apache.org/jira/browse/HIVE-10233
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: llap
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-10233-WIP-2.patch, HIVE-10233-WIP-3.patch, 
 HIVE-10233-WIP-4.patch, HIVE-10233-WIP-5.patch


 We need a memory manager in llap/tez to manage the usage of memory across 
 threads. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518161#comment-14518161
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10514:
--

The above test failures are unrelated to the change. On the positive side, no 
more TestMiniMrCliDriver failures.

Thanks
Hari

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by the following, run the command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 And the following exception comes:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-9908) Runtime exception with Binary data when vectorization is enabled.

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline reassigned HIVE-9908:
--

Assignee: Matt McCline

 Runtime exception with Binary data when vectorization is enabled.
 -

 Key: HIVE-9908
 URL: https://issues.apache.org/jira/browse/HIVE-9908
 Project: Hive
  Issue Type: Bug
Reporter: Priyesh Raj
Assignee: Matt McCline

 I am observing a runtime exception with binary data when vectorization is 
 enabled and a binary column appears in the GROUP BY clause.
 The exception is unsupported type: binary.
 As per the documentation, the exception should not occur; execution should 
 instead continue in the normal way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9908) vectorization error binary type not supported, group by with binary columns

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-9908:
---
Summary: vectorization error binary type not supported, group by with 
binary columns  (was: Runtime exception with Binary data when vectorization is 
enabled.)

 vectorization error binary type not supported, group by with binary columns
 ---

 Key: HIVE-9908
 URL: https://issues.apache.org/jira/browse/HIVE-9908
 Project: Hive
  Issue Type: Bug
Reporter: Priyesh Raj
Assignee: Matt McCline

 I am observing a runtime exception with binary data when vectorization is 
 enabled and a binary column appears in the GROUP BY clause.
 The exception is unsupported type: binary.
 As per the documentation, the exception should not occur; execution should 
 instead continue in the normal way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10525) loading data into list bucketing table fails when nulls in skew column

2015-04-28 Thread Gabriel C Balan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabriel C Balan updated HIVE-10525:
---
Summary: loading data into list bucketing table fails when nulls in skew 
column  (was: loading data into list bucketing table when null in skew column)

 loading data into list bucketing table fails when nulls in skew column
 --

 Key: HIVE-10525
 URL: https://issues.apache.org/jira/browse/HIVE-10525
 Project: Hive
  Issue Type: Bug
  Components: Hive
Affects Versions: 1.1.0
 Environment: linux
Reporter: Gabriel C Balan
Priority: Minor

 I'm trying to load data into a list bucketing table.
 The insert statement fails when there are nulls going into the skew column.
 If this is the expected behavior, there is no mention of this restriction in 
 the doc.
 {code:title=has-null.csv}
 1
 2
 \N
 3
 {code}
 {code:title=no-null.csv}
 1
 2
 3
 {code}
 {code:title=hive cli|borderStyle=solid}
 set hive.mapred.supports.subdirectories=true;
 set hive.optimize.listbucketing=true;
 set mapred.input.dir.recursive=true;  
 set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
 create table src_with_null (x int);
 load data local inpath 'has-null.csv' overwrite into table src_with_null;
 create table src_no_null (x int);
 load data local inpath 'no-null.csv' overwrite into table src_no_null;
 create table lb (x int) partitioned by (p string) 
 skewed by ( x ) on (1) STORED AS DIRECTORIES
 stored as rcfile;
 insert overwrite table lb partition (p = 'foo') select * from src_with_null;
 --fails
 insert overwrite table lb partition (p = 'foo') select * from src_no_null;
 --succeeds
 {code}
 {noformat:nopanel=true}I see this in ${hive.log.dir}/hive.log{noformat}
 2015-04-28 13:43:47,646 WARN  [Thread-82]: mapred.LocalJobRunner 
 (LocalJobRunner.java:run(560)) - job_local402607316_0001
 java.lang.Exception: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {x:null}
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
 Caused by: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while 
 processing row {x:null}
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
   at 
 org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
   at java.lang.Thread.run(Thread.java:722)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error while processing row {x:null}
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
   at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
   ... 10 more
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.generateListBucketingDirName(FileSinkOperator.java:833)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:615)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
   at 
 org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
   at 
 org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
   at 
 org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
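
For reference, a minimal sketch of the kind of null-safe handling the NPE points at (simplified, not FileSinkOperator's actual code; the default directory constant is an assumption): map a null skew value to a default directory name instead of dereferencing it:

{code}
import java.util.Arrays;
import java.util.List;

public class ListBucketingDirNameSketch {
  // Stand-in for the default-dir name used for non-listed or missing skew values.
  private static final String DEFAULT_DIR_NAME = "HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME";

  /** Build the subdirectory name col1=val1/col2=val2, tolerating null skew values. */
  static String generateListBucketingDirName(List<String> skewedCols, List<Object> skewedValues) {
    StringBuilder dir = new StringBuilder();
    for (int i = 0; i < skewedCols.size(); i++) {
      if (i > 0) dir.append('/');
      Object value = skewedValues.get(i);
      // A null skew value (the failing row {x:null}) falls back to the default dir
      // rather than triggering a NullPointerException.
      dir.append(skewedCols.get(i)).append('=')
         .append(value == null ? DEFAULT_DIR_NAME : value.toString());
    }
    return dir.toString();
  }

  public static void main(String[] args) {
    System.out.println(generateListBucketingDirName(
        Arrays.asList("x"), Arrays.asList((Object) null)));
  }
}
{code}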



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10470) LLAP: NPE in IO when returning 0 rows with no projection

2015-04-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10470.
--
   Resolution: Fixed
Fix Version/s: llap

Committed to llap branch

 LLAP: NPE in IO when returning 0 rows with no projection
 

 Key: HIVE-10470
 URL: https://issues.apache.org/jira/browse/HIVE-10470
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Prasanth Jayachandran
 Fix For: llap

 Attachments: HIVE-10470.1.patch


 Looks like a trivial fix, unless I'm missing something. I may do it later if 
 you don't ;)
 {noformat}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1764)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:92)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:39)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:116)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:36)
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:329)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:299)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:55)
   at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
   ... 4 more
 {noformat}
 Running q file
 {noformat}
 SET hive.vectorized.execution.enabled=true;
 SET hive.llap.io.enabled=false;
 SET hive.exec.orc.default.row.index.stride=1000;
 SET hive.optimize.index.filter=true;
 DROP TABLE orc_llap;
 CREATE TABLE orc_llap(
 ctinyint TINYINT,
 csmallint SMALLINT,
 cint INT,
 cbigint BIGINT,
 cfloat FLOAT,
 cdouble DOUBLE,
 cstring1 STRING,
 cstring2 STRING,
 ctimestamp1 TIMESTAMP,
 ctimestamp2 TIMESTAMP,
 cboolean1 BOOLEAN,
 cboolean2 BOOLEAN)
 STORED AS ORC tblproperties ("orc.compress"="ZLIB");
 insert into table orc_llap
 select ctinyint, csmallint, cint, cbigint, cfloat, cdouble, cstring1, 
 cstring2, ctimestamp1, ctimestamp2, cboolean1, cboolean2
 from alltypesorc limit 10;
 SET hive.llap.io.enabled=true;
 select count(*) from orc_llap where cint  6000;
 DROP TABLE orc_llap;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10470) LLAP: NPE in IO when returning 0 rows with no projection

2015-04-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-10470:
-
Attachment: HIVE-10470.1.patch

 LLAP: NPE in IO when returning 0 rows with no projection
 

 Key: HIVE-10470
 URL: https://issues.apache.org/jira/browse/HIVE-10470
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Prasanth Jayachandran
 Attachments: HIVE-10470.1.patch


 Looks like a trivial fix, unless I'm missing something. I may do it later if 
 you don't ;)
 {noformat}
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1764)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:92)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:39)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:116)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:36)
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:329)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:299)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:55)
   at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
   ... 4 more
 {noformat}
 Running q file
 {noformat}
 SET hive.vectorized.execution.enabled=true;
 SET hive.llap.io.enabled=false;
 SET hive.exec.orc.default.row.index.stride=1000;
 SET hive.optimize.index.filter=true;
 DROP TABLE orc_llap;
 CREATE TABLE orc_llap(
 ctinyint TINYINT,
 csmallint SMALLINT,
 cint INT,
 cbigint BIGINT,
 cfloat FLOAT,
 cdouble DOUBLE,
 cstring1 STRING,
 cstring2 STRING,
 ctimestamp1 TIMESTAMP,
 ctimestamp2 TIMESTAMP,
 cboolean1 BOOLEAN,
 cboolean2 BOOLEAN)
 STORED AS ORC tblproperties ("orc.compress"="ZLIB");
 insert into table orc_llap
 select ctinyint, csmallint, cint, cbigint, cfloat, cdouble, cstring1, 
 cstring2, ctimestamp1, ctimestamp2, cboolean1, cboolean2
 from alltypesorc limit 10;
 SET hive.llap.io.enabled=true;
 select count(*) from orc_llap where cint  6000;
 DROP TABLE orc_llap;
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10428) NPE in RegexSerDe using HCat

2015-04-28 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517983#comment-14517983
 ] 

Ashutosh Chauhan commented on HIVE-10428:
-

+1

 NPE in RegexSerDe using HCat
 

 Key: HIVE-10428
 URL: https://issues.apache.org/jira/browse/HIVE-10428
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10428.1.patch, HIVE-10428.2.patch


 When HCatalog reads a table that uses org.apache.hadoop.hive.serde2.RegexSerDe, 
 the read call throws an exception:
 {noformat}
 15/04/21 14:07:31 INFO security.TokenCache: Got dt for hdfs://hdpsecahdfs; 
 Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:hdpsecahdfs, Ident: 
 (HDFS_DELEGATION_TOKEN token 1478 for haha)
 15/04/21 14:07:31 INFO mapred.FileInputFormat: Total input paths to process : 
 1
 Splits len : 1
 SplitInfo : [hdpseca03.seca.hwxsup.com, hdpseca04.seca.hwxsup.com, 
 hdpseca05.seca.hwxsup.com]
 15/04/21 14:07:31 INFO mapreduce.InternalUtil: Initializing 
 org.apache.hadoop.hive.serde2.RegexSerDe with properties 
 {name=casetest.regex_table, numFiles=1, columns.types=string,string, 
 serialization.format=1, columns=id,name, rawDataSize=0, numRows=0, 
 output.format.string=%1$s %2$s, 
 serialization.lib=org.apache.hadoop.hive.serde2.RegexSerDe, 
 COLUMN_STATS_ACCURATE=true, totalSize=25, serialization.null.format=\N, 
 input.regex=([^ ]*) ([^ ]*), transient_lastDdlTime=1429590172}
 15/04/21 14:07:31 WARN serde2.RegexSerDe: output.format.string has been 
 deprecated
 Exception in thread main java.lang.NullPointerException
   at 
 com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
   at com.google.common.base.Splitter.split(Splitter.java:371)
   at 
 org.apache.hadoop.hive.serde2.RegexSerDe.initialize(RegexSerDe.java:155)
   at 
 org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:49)
   at 
 org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:518)
   at 
 org.apache.hive.hcatalog.mapreduce.InternalUtil.initializeDeserializer(InternalUtil.java:156)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.createDeserializer(HCatRecordReader.java:127)
   at 
 org.apache.hive.hcatalog.mapreduce.HCatRecordReader.initialize(HCatRecordReader.java:92)
   at HCatalogSQLMR.main(HCatalogSQLMR.java:81)
 {noformat}
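
For context, the trace shows Splitter.split receiving a null property value (hence Preconditions.checkNotNull). A minimal sketch of the defensive pattern (not the committed fix; treating the missing property as an empty string is an assumption):

{code}
import com.google.common.base.Splitter;
import com.google.common.collect.Lists;
import java.util.List;
import java.util.Properties;

public class NullSafeSplitSketch {
  /**
   * Split a table property that callers (e.g. HCatalog) may not have set.
   * Splitter.split(null) throws NullPointerException via Preconditions.checkNotNull,
   * so default the value first.
   */
  static List<String> splitProperty(Properties tbl, String key, char separator) {
    String raw = tbl.getProperty(key);
    if (raw == null) {
      raw = "";                       // treat a missing property as empty, not fatal
    }
    return Lists.newArrayList(Splitter.on(separator).split(raw));
  }
}
{code}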



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10508) Strip out password information from config passed to Tez/MR in cases where password encryption is not used

2015-04-28 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518176#comment-14518176
 ] 

Thejas M Nair commented on HIVE-10508:
--

+1

 Strip out password information from config passed to Tez/MR in cases where 
 password encryption is not used
 --

 Key: HIVE-10508
 URL: https://issues.apache.org/jira/browse/HIVE-10508
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10508.1.patch, HIVE-10508.2.patch, 
 HIVE-10508.3.patch


 Remove password information from configuration copy that is sent to Yarn/Tez. 
 We don't need it there. The config entries can potentially be visible to 
 other users.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8165) Annotation changes for replication

2015-04-28 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518175#comment-14518175
 ] 

Alan Gates commented on HIVE-8165:
--

I suspect that even if we move to an HDFS-based logger we'll want to hide that 
behind the scenes and still have a metastore call to grab the event.  

+1

 Annotation changes for replication
 --

 Key: HIVE-8165
 URL: https://issues.apache.org/jira/browse/HIVE-8165
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-8165.2.patch, HIVE-8165.patch


 We need to make a couple of changes for annotating the recent changes.
 a) Marking old notification listener in HCatalog as @Deprecated, linking 
 instead to the new repl/ module.
 b) Mark the new interfaces as @Evolving @Unstable



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10505) LLAP: Empty PPD splits from Cache throw error

2015-04-28 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran resolved HIVE-10505.
--
Resolution: Duplicate

Duplicate of HIVE-10470

 LLAP: Empty PPD splits from Cache throw error
 -

 Key: HIVE-10505
 URL: https://issues.apache.org/jira/browse/HIVE-10505
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gopal V

 {code}
 hive> select count(1) from store_sales where ss_sold_time_sk = 1;
 ...
 Caused by: java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedTreeReaderFactory.createEncodedTreeReader(EncodedTreeReaderFactory.java:1764)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:92)
   at 
 org.apache.hadoop.hive.llap.io.decode.OrcEncodedDataConsumer.decodeBatch(OrcEncodedDataConsumer.java:39)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:116)
   at 
 org.apache.hadoop.hive.llap.io.decode.EncodedDataConsumer.consumeData(EncodedDataConsumer.java:36)
   at 
 org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:329)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:294)
   at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:56)
   at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
   ... 4 more
 {code}
 This was observed with the 10 TB scale data set, because PPD filtering can 
 remove all row groups from a given split.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9863) Querying parquet tables fails with IllegalStateException [Spark Branch]

2015-04-28 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518088#comment-14518088
 ] 

Xuefu Zhang commented on HIVE-9863:
---

[~spena], given that we already upgraded the version (HIVE-10372), could you 
please verify if that fixes the problem here and close this JIRA if so? Thanks.

 Querying parquet tables fails with IllegalStateException [Spark Branch]
 ---

 Key: HIVE-9863
 URL: https://issues.apache.org/jira/browse/HIVE-9863
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang

 This does not necessarily happen only in the Spark branch: queries such as select count(*) 
 from table_name fail with this error:
 {code}
 hive> select * from content limit 2;
 OK
 Failed with exception java.io.IOException:java.lang.IllegalStateException: 
 All the offsets listed in the split should be found in the file. expected: 
 [4, 4] found: [BlockMetaData{69644, 881917418 [ColumnMetaData{GZIP [guid] 
 BINARY  [PLAIN, BIT_PACKED], 4}, ColumnMetaData{GZIP [collection_name] BINARY 
  [PLAIN_DICTIONARY, BIT_PACKED], 389571}, ColumnMetaData{GZIP [doc_type] 
 BINARY  [PLAIN_DICTIONARY, BIT_PACKED], 389790}, ColumnMetaData{GZIP [stage] 
 INT64  [PLAIN_DICTIONARY, BIT_PACKED], 389887}, ColumnMetaData{GZIP 
 [meta_timestamp] INT64  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 397673}, 
 ColumnMetaData{GZIP [doc_timestamp] INT64  [RLE, PLAIN_DICTIONARY, 
 BIT_PACKED], 422161}, ColumnMetaData{GZIP [meta_size] INT32  [RLE, 
 PLAIN_DICTIONARY, BIT_PACKED], 460215}, ColumnMetaData{GZIP [content_size] 
 INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 521728}, ColumnMetaData{GZIP 
 [source] BINARY  [RLE, PLAIN, BIT_PACKED], 683740}, ColumnMetaData{GZIP 
 [delete_flag] BOOLEAN  [RLE, PLAIN, BIT_PACKED], 683787}, ColumnMetaData{GZIP 
 [meta] BINARY  [RLE, PLAIN, BIT_PACKED], 683834}, ColumnMetaData{GZIP 
 [content] BINARY  [RLE, PLAIN, BIT_PACKED], 6992365}]}] out of: [4, 
 129785482, 260224757] in range 0, 134217728
 Time taken: 0.253 seconds
 hive> 
 {code}
 I can reproduce the problem with either local or yarn-cluster mode. It seems 
 to happen with MR also, so I suspect this is a Parquet problem.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518170#comment-14518170
 ] 

Szehon Ho commented on HIVE-10514:
--

Thanks a lot for looking at this! I'm just curious: is writing the qfile names 
out to a temporary file necessary? What is the difference compared with 
#foreach ($qf in $qfiles)?
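
For context, the "code too large" failure quoted below is javac hitting the JVM's 64 KB bytecode limit for a single method, so anything that expands per-qfile code inline makes the generated method grow with the list, while a runtime lookup keeps it a fixed size. A minimal sketch of the runtime-lookup idea, with an illustrative class and a plain newline-separated list file (not the actual generated TestMinimrCliDriver code):

{code}
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class QFileListLoader {

  /**
   * Loads qfile names from a newline-separated text file at runtime, so the
   * generated test method needs only this one call and its bytecode size stays
   * constant regardless of how many qfiles are configured. Expanding one
   * statement per qfile in the template is what can push a generated method
   * past the 64 KB limit.
   */
  static List<String> loadQFiles(String listFile) throws IOException {
    return Files.readAllLines(Paths.get(listFile), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) throws IOException {
    for (String qfile : loadQFiles(args[0])) {
      System.out.println("would run: " + qfile);
    }
  }
}
{code}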

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10520) LLAP: Must reset small table result columns for Native Vectorization of Map Join

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10520:

Fix Version/s: 1.3.0

 LLAP: Must reset small table result columns for Native Vectorization of Map 
 Join
 

 Key: HIVE-10520
 URL: https://issues.apache.org/jira/browse/HIVE-10520
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Fix For: 1.2.0, 1.3.0

 Attachments: HIVE-10520.01.patch


 Scratch columns are not getting reset by the input source, so the native 
 vectorized map join operators must manually reset the small-table result columns.
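
As a minimal, self-contained sketch of the reset pattern described above (the tiny column class is an illustrative stand-in, not Hive's actual ColumnVector API): before writing the small-table side of a new output batch, the operator clears any per-batch state left over from the previous batch.

{code}
import java.util.Arrays;

public class SmallTableColumnReset {

  /** Simplified stand-in for a vectorized output column; Hive's classes differ. */
  static class LongColumn {
    long[] vector = new long[1024];
    boolean[] isNull = new boolean[1024];
    boolean noNulls = true;
    boolean isRepeating = false;

    /** Clears per-batch state so values and flags from the prior batch cannot leak. */
    void reset() {
      Arrays.fill(isNull, false);
      noNulls = true;
      isRepeating = false;
    }
  }

  public static void main(String[] args) {
    LongColumn smallTableResult = new LongColumn();

    // Batch 1 left the column in a "repeating null" state.
    smallTableResult.isRepeating = true;
    smallTableResult.noNulls = false;
    smallTableResult.isNull[0] = true;

    // Without an explicit reset, batch 2 would inherit those stale flags.
    smallTableResult.reset();
    System.out.println("isRepeating=" + smallTableResult.isRepeating
        + ", noNulls=" + smallTableResult.noNulls);
  }
}
{code}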



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10520) LLAP: Must reset small table result columns for Native Vectorization of Map Join

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10520:

Fix Version/s: 1.2.0

 LLAP: Must reset small table result columns for Native Vectorization of Map 
 Join
 

 Key: HIVE-10520
 URL: https://issues.apache.org/jira/browse/HIVE-10520
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 1.2.0
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Fix For: 1.2.0

 Attachments: HIVE-10520.01.patch


 Scratch columns are not getting reset by the input source, so the native 
 vectorized map join operators must manually reset the small-table result columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10520) LLAP: Must reset small table result columns for Native Vectorization of Map Join

2015-04-28 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10520:

Attachment: HIVE-10520.01.patch

 LLAP: Must reset small table result columns for Native Vectorization of Map 
 Join
 

 Key: HIVE-10520
 URL: https://issues.apache.org/jira/browse/HIVE-10520
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Blocker
 Attachments: HIVE-10520.01.patch


 Scratch columns are not getting reset by the input source, so the native 
 vectorized map join operators must manually reset the small-table result columns.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10253) Parquet PPD support DATE

2015-04-28 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-10253:
-
Attachment: HIVE-10253-parquet.patch

Rebased to parquet branch.

 Parquet PPD support DATE
 

 Key: HIVE-10253
 URL: https://issues.apache.org/jira/browse/HIVE-10253
 Project: Hive
  Issue Type: Sub-task
Reporter: Dong Chen
Assignee: Dong Chen
 Attachments: HIVE-10253-parquet.patch, HIVE-10253.patch


 Hive should handle the DATE data type when generating and pushing the 
 predicate to Parquet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10514:
-
Attachment: HIVE-10514.2.patch

[~sushanth] Thanks for the review. In patch #2:
1. Made a minor modification to the previous patch to generate the txt file.
2. Identified several places where we might run into similar issues in the 
future. I have made changes in all such places, which might be overkill for now 
given that we currently do not run into any issues for tests other than 
MiniMrCliDriver; however, in the long run, I believe this is a safer approach 
than getting blocked on unit tests.

Please review and let me know which one of the patches is good to go.

Thanks
Hari

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10399) from_unixtime_millis() Hive UDF

2015-04-28 Thread Hari Sekhon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sekhon resolved HIVE-10399.

Resolution: Not A Problem

Yes, that does exactly what I need, and it is an easier workaround than the one 
I had implemented. Thanks!

I'm going to close this ticket, as the only other reason to extend 
from_unixtime() support for millis is custom formatting, which seems to have 
been added in 1.2 anyway with date_format().

 from_unixtime_millis() Hive UDF
 ---

 Key: HIVE-10399
 URL: https://issues.apache.org/jira/browse/HIVE-10399
 Project: Hive
  Issue Type: New Feature
  Components: UDF
 Environment: HDP 2.2
Reporter: Hari Sekhon
Priority: Minor

 Feature request for a
 {code}from_unixtime_millis(){code}
 Hive UDF. from_unixtime() accepts only seconds since the epoch, and right now 
 the solution is to create a custom UDF, but supporting millisecond-precision 
 dates natively in Hive seems like quite a standard thing.
 Hari Sekhon
 http://www.linkedin.com/in/harisekhon
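
For readers unfamiliar with the custom-UDF route mentioned above, here is a minimal sketch using Hive's simple UDF interface. It is an illustration, not the reporter's actual workaround; a production UDF would reuse the formatter and handle time zones explicitly.

{code}
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

/** Sketch of a from_unixtime_millis()-style UDF: formats a millisecond epoch value. */
public class FromUnixtimeMillis extends UDF {

  public Text evaluate(LongWritable epochMillis) {
    if (epochMillis == null) {
      return null;
    }
    // Formats in the JVM's default time zone; a real UDF would make this configurable.
    SimpleDateFormat formatter = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
    return new Text(formatter.format(new Date(epochMillis.get())));
  }
}
{code}

Once compiled into a jar, a class like this would be registered in the usual way with ADD JAR and CREATE TEMPORARY FUNCTION.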



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10506) CBO (Calcite Return Path): Disallow return path to be enabled if CBO is off

2015-04-28 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-10506:
---
Attachment: HIVE-10506.01.patch

[~jpullokkaran], you are right; sorry, I didn't understand before what you 
meant. I have just uploaded a new patch that checks those cases dynamically. 
Could you take a look? Thanks

 CBO (Calcite Return Path): Disallow return path to be enabled if CBO is off
 --

 Key: HIVE-10506
 URL: https://issues.apache.org/jira/browse/HIVE-10506
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10506.01.patch, HIVE-10506.patch


 If hive.cbo.enable=false and hive.cbo.returnpath=true, then some optimizations 
 would still kick in. It's quite possible that customers might end up in this 
 scenario in their environments; we should prevent it.
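
A minimal sketch of the kind of dynamic guard being asked for, written against Hadoop's generic Configuration API and using the property names exactly as they appear in the description above (Hive's real ConfVars names and defaults may differ):

{code}
import org.apache.hadoop.conf.Configuration;

public class ReturnPathGuard {

  /**
   * Treats the return path as usable only when CBO itself is enabled.
   * Property names are taken from the issue description; defaults are assumptions.
   */
  static boolean returnPathEnabled(Configuration conf) {
    boolean cboEnabled = conf.getBoolean("hive.cbo.enable", true);
    boolean returnPath = conf.getBoolean("hive.cbo.returnpath", false);
    return cboEnabled && returnPath;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration(false);
    conf.setBoolean("hive.cbo.enable", false);
    conf.setBoolean("hive.cbo.returnpath", true);
    // With CBO off, return-path optimizations should not kick in.
    System.out.println("return path active: " + returnPathEnabled(conf));
  }
}
{code}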



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10514:
-
Attachment: HIVE-10514.1.patch

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14516633#comment-14516633
 ] 

Sushanth Sowmyan commented on HIVE-10514:
-

Good work, Hari. From looking at the diff between 
./itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/ 
before and after your patch, I see what it's doing, and it makes sense. We'll 
wait on the test run to see if it succeeds properly, and whether or not this 
gets us around the 64kb method problem.

Also, on my local box, the command specified in the bug description works after 
your patch:

{noformat}
thunderfall:itests sush$ mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
-Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
[INFO] Scanning for projects...
...
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
hive-it-qfile ---
[INFO] Compiling 14 source files to 
/Users/sush/dev/hive.git/itests/qtest/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-it-qfile ---
[INFO] Surefire report directory: 
/Users/sush/dev/hive.git/itests/qtest/target/surefire-reports

---
 T E S T S
---

---
 T E S T S
---
Running org.apache.hadoop.hive.cli.TestMinimrCliDriver
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 706.564 sec - 
in org.apache.hadoop.hive.cli.TestMinimrCliDriver

Results :

Tests run: 5, Failures: 0, Errors: 0, Skipped: 0
...
[INFO] Reactor Summary:
[INFO] 
[INFO] Hive Integration - Parent . SUCCESS [  1.036 s]
[INFO] Hive Integration - Custom Serde ... SUCCESS [  1.845 s]
[INFO] Hive Integration - HCatalog Unit Tests  SUCCESS [  1.919 s]
[INFO] Hive Integration - Testing Utilities .. SUCCESS [  1.636 s]
[INFO] Hive Integration - Unit Tests . SUCCESS [  3.606 s]
[INFO] Hive Integration - Test Serde . SUCCESS [  0.316 s]
[INFO] Hive Integration - QFile Tests  SUCCESS [11:49 min]
[INFO] JMH benchmark: Hive ... SUCCESS [  0.789 s]
[INFO] Hive Integration - Unit Tests - Hadoop 2 .. SUCCESS [  0.791 s]
[INFO] Hive Integration - Unit Tests with miniKdc  SUCCESS [  1.120 s]
[INFO] Hive Integration - QFile Spark Tests .. SUCCESS [  3.514 s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 12:07 min
[INFO] Finished at: 2015-04-28T01:17:25-08:00
[INFO] Final Memory: 90M/466M
[INFO] 
{noformat}

+1 on intent and change, waiting on test results.

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10514:
-
Attachment: (was: HIVE-10514.1.patch)

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch, 
 HIVE-10514.2.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10514) Fix MiniCliDriver tests failure

2015-04-28 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10514:
-
Attachment: (was: HIVE-10514.2.patch)

 Fix MiniCliDriver tests failure
 ---

 Key: HIVE-10514
 URL: https://issues.apache.org/jira/browse/HIVE-10514
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Reporter: Szehon Ho
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10514.1.patch, HIVE-10514.2.patch


 The MinimrCliDriver tests always fail to run.
 This can be reproduced by running the following command:
 {noformat}
 mvn -B test -Phadoop-2 -Dtest=TestMinimrCliDriver 
 -Dminimr.query.files=infer_bucket_sort_map_operators.q,join1.q,bucketmapjoin7.q,udf_using.q
 {noformat}
 The following exception is thrown:
 {noformat}
 [ERROR] Failed to execute goal 
 org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
 (default-testCompile) on project hive-it-qfile: Compilation failure
 [ERROR] 
 /Users/szehon/repos/apache-hive-git/hive/itests/qtest/target/generated-test-sources/java/org/apache/hadoop/hive/cli/TestCliDriver.java:[100,22]
  code too large
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8165) Annotation changes for replication

2015-04-28 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-8165:
---
Attachment: HIVE-8165.2.patch

Attached a new patch. Weighing things up, I removed the 
InterfaceAudience.LimitedPrivate annotation from ReplicationTask and 
HCatClient.getReplicationTasks.

I did, however, keep the LimitedPrivate on HCatClient.getNextNotification and 
HCatClient.getCurrentNotificationEventId, and limited them further to just 
Hive, instead of Hive & Falcon, since there is no place where they use these 
explicitly; they only use them through getReplicationTasks, and that then 
guards our ability to change the event log implementation freely. Thoughts?

I'm willing to remove the InterfaceAudience annotation entirely if you still 
disagree with it, since I'm almost on the fence about it.
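
For readers not familiar with these markers, a small illustrative sketch of how such audience/stability annotations are applied, using the Hadoop classification annotations (Hive may carry its own copies, and the actual classes being annotated differ):

{code}
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

public class AnnotationSketch {

  /** Illustrative stand-in for a replication task API surface. */
  @InterfaceAudience.LimitedPrivate({"Hive"})
  @InterfaceStability.Evolving
  public interface ReplicationTaskLike {
    void execute();
  }

  public static void main(String[] args) {
    // Nothing to execute: the annotations document who may depend on the
    // interface (a limited-private audience) and how stable it is (evolving).
    System.out.println("Annotated API sketch: " + ReplicationTaskLike.class.getName());
  }
}
{code}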

 Annotation changes for replication
 --

 Key: HIVE-8165
 URL: https://issues.apache.org/jira/browse/HIVE-8165
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-8165.2.patch, HIVE-8165.patch


 We need to make a couple of annotation changes for the recent work:
 a) Mark the old notification listener in HCatalog as @Deprecated, linking 
 instead to the new repl/ module.
 b) Mark the new interfaces as @Evolving and @Unstable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10403) Add n-way join support for Hybrid Grace Hash Join

2015-04-28 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517512#comment-14517512
 ] 

Vikram Dixit K commented on HIVE-10403:
---

Left a few comments on RB.

 Add n-way join support for Hybrid Grace Hash Join
 -

 Key: HIVE-10403
 URL: https://issues.apache.org/jira/browse/HIVE-10403
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.2.0
Reporter: Wei Zheng
Assignee: Wei Zheng
 Attachments: HIVE-10403.01.patch, HIVE-10403.02.patch, 
 HIVE-10403.03.patch, HIVE-10403.04.patch


 Currently, Hybrid Grace Hash Join supports only 2-way joins (one big table and 
 one small table). This task will enable n-way joins (one big table and 
 multiple small tables).
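
To make the 2-way vs. n-way distinction concrete, a minimal sketch using plain Java maps (not Hive's hash table, partitioning, or spilling machinery, and assuming all tables join on the same key):

{code}
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NWayProbeSketch {

  /**
   * Probes one big-table key against several small-table hash maps in turn,
   * which is the basic shape of an n-way map join. The real Hybrid Grace
   * implementation additionally partitions each small table and spills
   * partitions that do not fit in memory.
   */
  static boolean matchesAllSmallTables(long bigTableKey, List<Map<Long, String>> smallTables) {
    for (Map<Long, String> smallTable : smallTables) {
      if (!smallTable.containsKey(bigTableKey)) {
        return false; // inner-join semantics: every small table must have the key
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<Long, String> dim1 = new HashMap<>();
    Map<Long, String> dim2 = new HashMap<>();
    dim1.put(42L, "dim1-row");
    dim2.put(42L, "dim2-row");
    System.out.println(matchesAllSmallTables(42L, Arrays.asList(dim1, dim2))); // true
    System.out.println(matchesAllSmallTables(7L, Arrays.asList(dim1, dim2)));  // false
  }
}
{code}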



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

