[jira] [Created] (HIVE-21864) LlapBaseInputFormat#closeAll() throws ConcurrentModificationException

2019-06-11 Thread Shubham Chaurasia (JIRA)
Shubham Chaurasia created HIVE-21864:


 Summary: LlapBaseInputFormat#closeAll() throws 
ConcurrentModificationException
 Key: HIVE-21864
 URL: https://issues.apache.org/jira/browse/HIVE-21864
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 3.1.1
Reporter: Shubham Chaurasia
Assignee: Shubham Chaurasia
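
A purely hypothetical sketch (not the actual LlapBaseInputFormat code) of how a ConcurrentModificationException can surface in a closeAll()-style method, assuming the open handles are tracked in a shared map that close() also removes from:

{code:java}
// Hypothetical illustration (assumed structure, not LlapBaseInputFormat itself):
// closeAll() iterates the handle map while close() removes entries from it,
// which makes the iterator throw ConcurrentModificationException.
import java.util.HashMap;
import java.util.Map;

public class CloseAllCmeDemo {
  private static final Map<String, AutoCloseable> handles = new HashMap<>();

  static void close(String handleId) throws Exception {
    AutoCloseable c = handles.remove(handleId);   // mutates the map
    if (c != null) {
      c.close();
    }
  }

  static void closeAll() throws Exception {
    for (String handleId : handles.keySet()) {    // iterates the live key set ...
      close(handleId);                            // ... while close() removes entries -> CME
    }
  }

  public static void main(String[] args) throws Exception {
    handles.put("h1", () -> { });
    handles.put("h2", () -> { });
    closeAll();   // throws java.util.ConcurrentModificationException
  }
}
{code}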






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21863) Fix HIVE-21742 for non-cbo path

2019-06-11 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-21863:
--

 Summary: Fix HIVE-21742 for non-cbo path
 Key: HIVE-21863
 URL: https://issues.apache.org/jira/browse/HIVE-21863
 Project: Hive
  Issue Type: Sub-task
  Components: Vectorization
Affects Versions: 4.0.0
Reporter: Vineet Garg


The fix done in HIVE-21742 applies to the CBO path only (i.e. when hive.cbo.enable = 
true). This JIRA is to fix the issue in the non-CBO path as well.
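
For illustration, a hedged sketch of how the non-CBO path can be exercised when verifying the fix (the actual repro query from HIVE-21742 is not restated here):

{code:sql}
-- Hedged sketch: force the non-CBO planner before running the HIVE-21742 repro.
set hive.cbo.enable=false;
{code}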



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21862) ORC ppd produces wrong result with timestamp

2019-06-11 Thread Vineet Garg (JIRA)
Vineet Garg created HIVE-21862:
--

 Summary: ORC ppd produces wrong result with timestamp
 Key: HIVE-21862
 URL: https://issues.apache.org/jira/browse/HIVE-21862
 Project: Hive
  Issue Type: Bug
  Components: ORC
Affects Versions: 4.0.0
Reporter: Vineet Garg
Assignee: Vineet Garg


Note that ORC-491 and ORC-477 are required to reproduce this.

{code:sql}
set hive.vectorized.execution.enabled=false;
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.tez.bucket.pruning=true;
set hive.optimize.index.filter=true;
set hive.metastore.disallow.incompatible.col.type.changes=false;

create table change_allowincompatible_vectorization_false_date (ts date) 
partitioned by (s string) clustered by (ts) into 32 buckets stored as orc 
tblproperties ('transactional'='true');

alter table change_allowincompatible_vectorization_false_date add 
partition(s='aaa');

alter table change_allowincompatible_vectorization_false_date add 
partition(s='bbb');

insert into table change_allowincompatible_vectorization_false_date partition 
(s='aaa') select ctimestamp1 from alltypesorc where ctimestamp1 > '2000-01-01' 
limit 50;

insert into table change_allowincompatible_vectorization_false_date partition 
(s='bbb') select ctimestamp1 from alltypesorc where ctimestamp1 < '2000-01-01' 
limit 50;

select count(*) from change_allowincompatible_vectorization_false_date;

alter table change_allowincompatible_vectorization_false_date change column ts 
ts timestamp;

select count(*) from change_allowincompatible_vectorization_false_date;

insert into table change_allowincompatible_vectorization_false_date partition 
(s='aaa') values ('2038-03-22 07:26:48.0');

select ts from change_allowincompatible_vectorization_false_date where 
ts='2038-03-22 07:26:48.0' and s='aaa';
{code}

The expected result is one row, but the actual result is zero rows.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21861) ClassCastException during CTAS over external table using KafkaStorageHandler

2019-06-11 Thread Justin Leet (JIRA)
Justin Leet created HIVE-21861:
--

 Summary: ClassCastException during CTAS over external table using 
KafkaStorageHandler
 Key: HIVE-21861
 URL: https://issues.apache.org/jira/browse/HIVE-21861
 Project: Hive
  Issue Type: Bug
  Components: kafka integration
Affects Versions: 0.3.0
Reporter: Justin Leet


To reproduce, create a table similar to the following:
{code}
CREATE EXTERNAL TABLE <table_name> (raw_value STRING)
ROW FORMAT DELIMITED
LINES TERMINATED BY '\n'
STORED BY 'org.apache.hadoop.hive.kafka.KafkaStorageHandler'
TBLPROPERTIES(
 "kafka.topic"="",
 "kafka.bootstrap.servers"="",
 "kafka.consumer.security.protocol"="PLAINTEXT",
 "kafka.serde.class"="org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe");
{code}

Note the SerDe isn't the default SerDe.  Additionally, this error occurs when 
vectorization is enabled.

Basic queries work fine:
{code}
SELECT * FROM <table_name> LIMIT 1;
{code}

Doing a CTAS to bring it into a managed table fails:
{code}
CREATE TABLE <managed_table> AS
SELECT * FROM <table_name>;
{code}

The exception is: 
{code}
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.lazy.LazyString cannot be cast to org.apache.hadoop.io.Text
 at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:471)
 at org.apache.hadoop.hive.ql.exec.vector.VectorAssignRow.assignRowColumn(VectorAssignRow.java:350)
 at org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader.readNextBatch(VectorizedKafkaRecordReader.java:159)
 at org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader.next(VectorizedKafkaRecordReader.java:113)
 at org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader.next(VectorizedKafkaRecordReader.java:47)
 at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:360)
 ... 24 more
{code}

A workaround to this is to disable vectorization via: 
{code}
set hive.vectorized.execution.enabled = false;
{code}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 70832: HIVE-21815

2019-06-11 Thread Krisztian Kasa

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70832/
---

Review request for hive, Ashutosh Chauhan, Gopal V, and Prasanth_J.


Bugs: HIVE-21815
https://issues.apache.org/jira/browse/HIVE-21815


Repository: hive-git


Description
---

Stats in ORC file are parsed twice
==
ORC record reader unnecessarily parses stats twice

```
  if (orcTail == null) {
Reader orcReader = OrcFile.createReader(file.getPath(),
OrcFile.readerOptions(context.conf)
.filesystem(fs)
.maxLength(AcidUtils.getLogicalLength(fs, file)));
orcTail = new OrcTail(orcReader.getFileTail(), 
orcReader.getSerializedFileFooter(),
file.getModificationTime());
if (context.cacheStripeDetails) {
  context.footerCache.put(new FooterCacheKey(fsFileId, file.getPath()), 
orcTail);
}
  }
  stripes = orcTail.getStripes();
  stripeStats = orcTail.getStripeStatistics();
```

We go from Reader -> OrcTail -> StripeStatistics.

stripeStats is read out of the orcTail, although it is already read inside 
orcReader.getStripeStatistics().
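
A rough sketch of the idea, not necessarily the patch itself (it assumes the statistics the Reader has already parsed can simply be reused):

```
// Sketch only (assumed API usage): reuse the statistics the Reader has already
// parsed instead of re-deserializing them from the serialized footer via OrcTail.
List<StripeStatistics> stripeStats;
if (orcReader != null) {
  // the reader was just created above and has parsed the tail once already
  stripeStats = orcReader.getStripeStatistics();
} else {
  // footer cache hit: only the OrcTail is available
  stripeStats = orcTail.getStripeStatistics();
}
stripes = orcTail.getStripes();
```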


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 3878bba4d3 


Diff: https://reviews.apache.org/r/70832/diff/1/


Testing
---

Ran the TestInputOutputFormat tests.


Thanks,

Krisztian Kasa



Review Request 70830: HIVE-21860 Incorrect FQDN of HadoopThriftAuthBridge23 in ShimLoader

2019-06-11 Thread Oleksiy Sayankin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70830/
---

Review request for hive, Zoltan Haindrich, and Sergey Shelukhin.


Repository: hive-git


Description
---

Initial commit


Diffs
-

  shims/common/src/main/java/org/apache/hadoop/hive/shims/ShimLoader.java 
12dbaced3a 


Diff: https://reviews.apache.org/r/70830/diff/1/


Testing
---


Thanks,

Oleksiy Sayankin



[jira] [Created] (HIVE-21860) Incorrect FQDN of HadoopThriftAuthBridge23 in ShimLoader

2019-06-11 Thread Oleksiy Sayankin (JIRA)
Oleksiy Sayankin created HIVE-21860:
---

 Summary: Incorrect FQDN of HadoopThriftAuthBridge23 in ShimLoader
 Key: HIVE-21860
 URL: https://issues.apache.org/jira/browse/HIVE-21860
 Project: Hive
  Issue Type: Bug
Reporter: Oleksiy Sayankin
Assignee: Oleksiy Sayankin






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 70819: Break up DDLTask - extract rest of the Alter Table operations

2019-06-11 Thread Zoltan Haindrich

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70819/#review215787
---




ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AlterTableTypes.java
Lines 30 (patched)


I think this should be AlterTableType and not with "s"



ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableArchiveUtils.java
Lines 100 (patched)


these methods seem to be not that closely related to archiving...

looking at these methods: it seems like they are just changing exception 
types/argument order/etc...



ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableCompactOperation.java
Lines 104 (patched)


5 minutes? seems like a lot



ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableUnarchiveOperation.java
Lines 266 (patched)


I think these TODO messages are pointless... catalog/tablename etc. should 
be addressed more with a tableref or something
"adding" another argument would not really make things better



ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableUnarchiveOperation.java
Lines 295 (patched)


I know it's hard to find places for these kinds of methods... probably 
ArchiveUtils?



ql/src/test/results/clientpositive/druid/druidmini_dynamic_partition.q.out
Lines 491-496 (original), 489-492 (patched)


It might seem odd at first to see this "Unset properties" happening during 
an INSERT statement.

Can't we add a different task to invalidate stats?
(followup?)


- Zoltan Haindrich


On June 8, 2019, 3:09 p.m., Miklos Gergely wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70819/
> ---
> 
> (Updated June 8, 2019, 3:09 p.m.)
> 
> 
> Review request for hive and Zoltan Haindrich.
> 
> 
> Bugs: HIVE-21830
> https://issues.apache.org/jira/browse/HIVE-21830
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> DDLTask is a huge class, more than 5000 lines long. The related DDLWork is 
> also a huge class, which has a field for each DDL operation it supports. The 
> goal is to refactor these in order to have everything cut into more 
> manageable classes under the package org.apache.hadoop.hive.ql.exec.ddl:
> 
> - have a separate class for each operation
> - have a package for each operation group (database ddl, table ddl, etc.), so 
>   the amount of classes under a package is more manageable
> - make all the requests (DDLDesc subclasses) immutable
> - DDLTask should be agnostic to the actual operations
> - for now, ignore the issue of having some operations handled by DDLTask 
>   which are not actual DDL operations (lock, unlock, desc...)
> 
> In the interim, while there are two DDLTask and DDLWork classes in the 
> code base, the new ones in the new package are called DDLTask2 and DDLWork2, 
> thus avoiding the use of fully qualified class names where both the old and 
> the new classes are in use.
> 
> Step #10: extract the alter table operations that are left from the old 
> DDLTask, and move them under the new packages.
> 
> 
> Diffs
> -
> 
>   accumulo-handler/src/test/results/positive/accumulo_queries.q.out 
> c5379c7348 
>   
> accumulo-handler/src/test/results/positive/accumulo_single_sourced_multi_insert.q.out
>  1a5dde0602 
>   
> druid-handler/src/java/org/apache/hadoop/hive/druid/DruidStorageHandler.java 
> 254d0a39a6 
>   hbase-handler/src/test/results/negative/hbase_ddl.q.out 4646def667 
>   hbase-handler/src/test/results/positive/hbase_ddl.q.out 779ca4d16a 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out adf8864363 
>   
> hbase-handler/src/test/results/positive/hbase_single_sourced_multi_insert.q.out
>  d474b4d065 
>   hbase-handler/src/test/results/positive/hbasestats.q.out 367b479556 
>   ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 554df3c6bf 
>   ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractAlterTableDesc.java 
> 432779b3f4 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractAlterTableOperation.java
>  baf98da37a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AbstractAlterTableWithConstraintsDesc.java
>  9babf2a1a9 
>   ql/src/java/org/apache/hadoop/hive/ql/ddl/table/AlterTableTypes.java 
> PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/table/column/AlterTableAddColumnsDesc.java
>  e40ba1819d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/table/column/AlterTableChangeColumnDesc.java
>  ce3b97eb68 
>  

[jira] [Created] (HIVE-21859) Backport HIVE-17466 to branch-2.3

2019-06-11 Thread Piotr Findeisen (JIRA)
Piotr Findeisen created HIVE-21859:
--

 Summary: Backport HIVE-17466 to branch-2.3
 Key: HIVE-21859
 URL: https://issues.apache.org/jira/browse/HIVE-21859
 Project: Hive
  Issue Type: Improvement
Reporter: Piotr Findeisen


Please backport HIVE-17466, which adds {{get_partition_values}}, to {{branch-2.3}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21858) Default to store runtime statistics in the metastore

2019-06-11 Thread Zoltan Haindrich (JIRA)
Zoltan Haindrich created HIVE-21858:
---

 Summary: Default to store runtime statistics in the metastore
 Key: HIVE-21858
 URL: https://issues.apache.org/jira/browse/HIVE-21858
 Project: Hive
  Issue Type: Improvement
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


Right now the reuse scope of runtime statistics is limited to re-running the 
same query.
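
For illustration, a hedged sketch of what the change would amount to, assuming the existing hive.query.reexecution.stats.persist.scope property (values query / hiveserver / metastore):

{code:sql}
-- Hedged sketch (assumed property name and values): persist runtime statistics
-- in the metastore by default so they can be reused beyond re-running the same
-- query in the same session.
set hive.query.reexecution.stats.persist.scope=metastore;
{code}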



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)