[jira] [Created] (HIVE-21953) Enable CLUSTERED ON/DISTRIBUTED ON+SORTED ON in incremental rebuild of materialized views

2019-07-03 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21953:
--

 Summary: Enable CLUSTERED ON/DISTRIBUTED ON+SORTED ON in 
incremental rebuild of materialized views
 Key: HIVE-21953
 URL: https://issues.apache.org/jira/browse/HIVE-21953
 Project: Hive
  Issue Type: Bug
  Components: Materialized views
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez


Follow-up of HIVE-18842. For an insert, and for the insert branch of a merge, 
we can introduce an RS (ReduceSink) operator to enforce these properties, as we 
do when we create the materialized view or execute a full rebuild. This makes 
the delta files created for the insert obey the same organization. If the 
increments are large enough, this may improve query execution performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21952) Hive should allow to delete serde properties too, not just add them

2019-07-03 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created HIVE-21952:


 Summary: Hive should allow to delete serde properties too, not 
just add them
 Key: HIVE-21952
 URL: https://issues.apache.org/jira/browse/HIVE-21952
 Project: Hive
  Issue Type: Improvement
Affects Versions: 2.3.5, 3.0.0, 4.0.0
Reporter: Ruslan Dautkhanov


Hive should allow deleting serde properties, not just adding or changing them.

We have a use case where the presence of a certain serde property 
causes issues, and we want to delete just that one serde property. 

This is not currently possible.

Thanks.

 





[jira] [Created] (HIVE-21951) Llap query on external table with header or footer returns incorrect row count.

2019-07-03 Thread Sankar Hariappan (JIRA)
Sankar Hariappan created HIVE-21951:
---

 Summary: Llap query on external table with header or footer 
returns incorrect row count.
 Key: HIVE-21951
 URL: https://issues.apache.org/jira/browse/HIVE-21951
 Project: Hive
  Issue Type: Bug
  Components: llap, Query Processor
Affects Versions: 2.4.0, 4.0.0
Reporter: Sankar Hariappan
Assignee: Sankar Hariappan


Create a table with header and footer line counts as follows.
{code}
CREATE EXTERNAL TABLE IF NOT EXISTS externaltableOpenCSV (eid int, name String, 
salary String, destination String)
 COMMENT 'Employee details'
 ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
 STORED AS TEXTFILE
 LOCATION '/externaltableOpenCSV'
 tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2");
{code}

A query on this table through LLAP then returns an incorrect row count because 
the header and footer lines are not skipped.
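
For reference, the expected behaviour can be sketched in a few lines of Java. 
This is only an illustration of what skipping header and footer lines means for 
the row count; the class and method names are hypothetical and are not taken 
from Hive or LLAP.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not Hive code.
class HeaderFooterSketch {
  // Returns the data rows left after dropping headerCount leading lines
  // and footerCount trailing lines. Returns an empty list if nothing is left.
  static List<String> dataRows(List<String> lines, int headerCount, int footerCount) {
    int from = Math.min(headerCount, lines.size());
    int to = Math.max(from, lines.size() - footerCount);
    return lines.subList(from, to);
  }

  public static void main(String[] args) {
    // One header line and two footer lines, matching the table properties above.
    List<String> lines = Arrays.asList("eid,name", "1,a", "2,b", "total", "end");
    System.out.println(dataRows(lines, 1, 2).size()); // prints 2
  }
}
```

With skip.header.line.count=1 and skip.footer.line.count=2, a five-line file 
should therefore contribute two rows to the count.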






Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT

2019-07-03 Thread Karen Coppage via Review Board


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > Thanks a lot Karen for the patch!
> > I have some questions, but otherwise the change looks good to me.

Thanks very much for the review!


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
> > Line 223 (original), 224 (patched)
> > 
> >
> > Why did you change the type of this variable to ArrayList from List?

I thought it was necessary in order to make the class serializable but now I 
see it's not. I'll fix this.
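
To illustrate why the declared type does not matter here: Java serialization 
works on the runtime object, so a field declared as List still serializes when 
it holds an ArrayList. A minimal sketch, with a hypothetical FormatterSketch 
class standing in for the real formatter:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; not the real HiveSqlDateTimeFormatter.
class FormatterSketch implements Serializable {
  private static final long serialVersionUID = 1L;

  // Declared as the List interface; the ArrayList instance is what gets written.
  private final List<String> tokens = new ArrayList<>();

  void addToken(String token) {
    tokens.add(token);
  }

  List<String> getTokens() {
    return tokens;
  }

  // Serializes the object to a byte array and reads it back.
  static FormatterSketch roundTrip(FormatterSketch in) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(in);
      }
      try (ObjectInputStream ois =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (FormatterSketch) ois.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Serialization round trip failed", e);
    }
  }

  public static void main(String[] args) {
    FormatterSketch f = new FormatterSketch();
    f.addToken("yyyy");
    System.out.println(roundTrip(f).getTokens()); // prints [yyyy]
  }
}
```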


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
> > Lines 59 (patched)
> > 
> >
> > Do the CastDateToString, CastDateToChar and CastDateToVarchar udfs use 
> > this method, or is this just a typo and the CastDateToStringWithFormat, ... 
> > udfs use this?

They do, all 3 classes eventually inherit from CastDateToString.


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
> > Line 200 (original), 202 (patched)
> > 
> >
> > Is the formattedOutput variable never going to be null after this 
> > change? If there is a scenario where it can be null, it will cause problems 
> > when trying to cast it.

The output of format(Timestamp|Date) is a .toString() which I 
believe will never return null.


> On July 3, 2019, 10:51 a.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
> > Line 217 (original), 220 (patched)
> > 
> >
> > The same question about being null (previous comment) applies to the t 
> > and d variable as well.

This is not the case now but may be later on (if timestamp range of - 
is ever strictly enforced). I will include a null check here and above just in 
case.
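
Roughly, the null check I have in mind looks like the sketch below. The class 
and method names are hypothetical and java.time.LocalDate is used only as a 
stand-in for the Hive date type; this is not the code from the patch.

```java
import java.time.LocalDate;

// Hypothetical sketch; names are not taken from GenericUDFCastFormat.
class NullSafeFormatSketch {
  // Returns null (SQL NULL) when there is no value instead of dereferencing it.
  static String formatOrNull(LocalDate d) {
    if (d == null) {
      return null;
    }
    return d.toString();
  }

  public static void main(String[] args) {
    System.out.println(formatOrNull(LocalDate.of(2019, 7, 3))); // prints 2019-07-03
    System.out.println(formatOrNull(null)); // prints null
  }
}
```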


- Karen


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70920/#review216334
---



Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table

2019-07-03 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70963/#review216337
---


Ship it!

- Marta Kuczora


On July 1, 2019, 9:20 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70963/
> ---
> 
> (Updated July 1, 2019, 9:20 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21874: Implement add partitions related methods on temporary table
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  957ebb12725e9deac7e7644709521a998df4dbb4 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  a15f5ea0453c7459217d229fa373cc1fec2f4d7a 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  25643495b53e1ede473c48a90b208b43070ee6aa 
> 
> 
> Diff: https://reviews.apache.org/r/70963/diff/2/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientAddPartitionsTempTable, 
> TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table

2019-07-03 Thread Marta Kuczora via Review Board


> On June 28, 2019, 2:42 p.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
> > Line 1046 (original), 1049-1050 (patched)
> > 
> >
> > Why do you need to make the DB and Table name lower case?
> 
> Laszlo Pinter wrote:
> Partition properties like table and db name must be stored in lower case. 
> This is the same in HiveMetaStore as well. 
> Other properties are case sensitive.

Ah, I see, thanks for the explanation.
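
For anyone reading along, the normalization described above amounts to 
something like the following. This is a minimal sketch that assumes a plain, 
locale-independent lower-casing; the class and method names are hypothetical 
and not taken from SessionHiveMetaStoreClient.

```java
import java.util.Locale;

// Hypothetical sketch; not the code under review.
class PartitionNameSketch {
  // Lower-cases a db or table name independently of the JVM default locale.
  static String normalizeIdentifier(String name) {
    return name == null ? null : name.toLowerCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(normalizeIdentifier("MyDB")); // prints mydb
  }
}
```

Locale.ROOT avoids surprises such as the Turkish dotless i when the JVM runs 
under a different default locale.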


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70963/#review216227
---





Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT

2019-07-03 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70920/#review216334
---



Thanks a lot Karen for the patch!
I have some questions, but otherwise the change looks good to me.


common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Line 223 (original), 224 (patched)


Why did you change the type of this variable to ArrayList from List?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
Lines 59 (patched)


Do the CastDateToString, CastDateToChar and CastDateToVarchar udfs use this 
method, or is this just a typo and the CastDateToStringWithFormat, ... udfs use 
this?



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
Line 200 (original), 202 (patched)


Is the formattedOutput variable never going to be null after this change? 
If there is a scenario where it can be null, it will cause problems when trying 
to cast it.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
Line 217 (original), 220 (patched)


The same question about being null (previous comment) applies to the t and 
d variable as well.


- Marta Kuczora


On June 26, 2019, 8:44 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70920/
> ---
> 
> (Updated June 26, 2019, 8:44 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21868
> https://issues.apache.org/jira/browse/HIVE-21868
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Vectorize UDFs for CAST ( AS STRING/CHAR/VARCHAR FORMAT 
> ) and CAST ( AS TIMESTAMP/DATE FORMAT ).
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  4e024a357b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
> fa9d1e9783 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
>  dfa9f8a00d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java
>  a6dff12e1a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java
>  b48b0136eb 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java
>  adc3a9d7b9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 
> 16742eee9b 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java
>  663237739e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
>  58fd7b030e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java
>  PRE-CREATION 
>   ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> 269edf6da6 
>   ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> 4a502b9700 
> 
> 
> Diff: https://reviews.apache.org/r/70920/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



[jira] [Created] (HIVE-21950) Optimizer complicates execution plan of queries with SUBSTR function in EXISTS clause

2019-07-03 Thread Murshid Chalaev (JIRA)
Murshid Chalaev created HIVE-21950:
--

 Summary: Optimizer complicates execution plan of queries with 
SUBSTR function in EXISTS clause
 Key: HIVE-21950
 URL: https://issues.apache.org/jira/browse/HIVE-21950
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.1, 2.3.0
Reporter: Murshid Chalaev


Queries with the SUBSTR function in an EXISTS clause have a much more 
complicated execution plan in Hive-2.3 than they had in Hive-1.2. The query 
below has 8 stages which submit 4 MR jobs in Hive-2.3, while in Hive-1.2 it has 
4 stages and submits 1 MR job. Without the SUBSTR function, or with CBO 
disabled, the Hive-2.3 execution plan is the same as in Hive-1.2 with CBO 
enabled.

 *STEPS TO REPRODUCE:*
{code:sql}
CREATE TABLE i1122 (id STRING);
INSERT INTO i1122 VALUES (1),(1001);

EXPLAIN
SELECT *
FROM i1122 AS t1
WHERE EXISTS (
  SELECT 1
  FROM i1122 AS t2
  WHERE t2.id = substr(t1.id,4)
);
{code}





[jira] [Created] (HIVE-21949) Revert HIVE-21232 LLAP: Add a cache-miss friendly split affinity provider

2019-07-03 Thread Antal Sinkovits (JIRA)
Antal Sinkovits created HIVE-21949:
--

 Summary: Revert HIVE-21232 LLAP: Add a cache-miss friendly split 
affinity provider
 Key: HIVE-21949
 URL: https://issues.apache.org/jira/browse/HIVE-21949
 Project: Hive
  Issue Type: Bug
Reporter: Antal Sinkovits
Assignee: Antal Sinkovits
 Attachments: HIVE-21949.01.patch







[jira] [Created] (HIVE-21948) Implement parallel processing in Pre Upgrade Tool

2019-07-03 Thread Krisztian Kasa (JIRA)
Krisztian Kasa created HIVE-21948:
-

 Summary: Implement parallel processing in Pre Upgrade Tool
 Key: HIVE-21948
 URL: https://issues.apache.org/jira/browse/HIVE-21948
 Project: Hive
  Issue Type: Improvement
  Components: Transactions
Affects Versions: 3.1.0
Reporter: Krisztian Kasa
Assignee: Krisztian Kasa
 Fix For: 4.0.0


The Pre Upgrade Tool scans all databases and tables in the warehouse 
sequentially, which can be very slow when there are many tables.

Example: the process took 8-10 hours to complete on ~500k tables.
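
One possible shape for the parallel version is sketched below: submit one task 
per table to a fixed-size thread pool and collect the results in input order. 
This is only a sketch under assumptions; the class, the method, and the 
per-table function are hypothetical and do not reflect the Pre Upgrade Tool's 
actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch; not the Pre Upgrade Tool implementation.
class ParallelScanSketch {
  // Applies processTable to every table name on a fixed-size thread pool and
  // returns the results in the same order as the input list.
  static <R> List<R> processAll(List<String> tables, int threads,
      Function<String, R> processTable) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<R>> futures = new ArrayList<>();
      for (String table : tables) {
        Callable<R> task = () -> processTable.apply(table);
        futures.add(pool.submit(task));
      }
      List<R> results = new ArrayList<>();
      for (Future<R> future : futures) {
        results.add(future.get()); // blocks until that table is done
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while processing tables", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Table processing failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    // String::length stands in for the real per-table work.
    List<String> tables = Arrays.asList("db1.t1", "db1.t2", "db2.orders");
    System.out.println(processAll(tables, 2, String::length)); // prints [6, 6, 10]
  }
}
```

The pool size would presumably need to be configurable, since each task is 
likely to hit the metastore and the file system.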


