[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17433:

Status: Patch Available  (was: In Progress)

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch, HIVE-17433.06.patch, HIVE-17433.07.patch
>
>
> Provide partial support for Decimal64 within Hive.  By partial I mean that 
> our current decimal has a large surface area of features (rounding, multiply, 
> divide, remainder, power, big precision, and many more), but only a small 
> number have been identified as performance hotspots.
> Those are small decimals with precision <= 18, which fit within a 64-bit 
> long; we are calling these Decimal64.  Just as we optimize row-mode 
> execution engine hotspots by selectively adding new vectorization code, we 
> can treat the current decimal as the full-featured one and add 
> Decimal64 optimizations where query benchmarks really show they help.
> This change creates a Decimal64ColumnVector.
> This change currently detects small decimals for the vectorized text 
> input format and uses some new Decimal64 vectorized classes for comparison 
> and addition; a few GroupBy aggregations such as sum, avg, min, and max may 
> follow later.
> The patch also supports a new annotation that can mark a 
> VectorizedInputFormat as supporting Decimal64 (it is called DECIMAL_64).  
> Other formats such as ORC and Parquet can then be handled separately in 
> later JIRAs so that they participate in the Decimal64 performance optimization.
> The idea is that when you annotate your input format with:
> @VectorizedInputFormatSupports(supports = {DECIMAL_64})
> the Vectorizer in Hive will plan usage of Decimal64ColumnVector instead of 
> DecimalColumnVector.  When an input format sees that Decimal64ColumnVector is 
> being used, it can fill that column vector with decimal64 longs 
> instead of the HiveDecimalWritable objects of DecimalColumnVector.
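
A minimal sketch of how a planner might test for such an annotation by reflection. The annotation and enum below are local stand-ins defined only for this example; the real declarations are in the patch and may differ in package, retention, and element names. `TextLikeInputFormat`, `LegacyInputFormat`, and `SupportCheck` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

// Stand-in feature list; DECIMAL_64 is the only feature named in the description.
enum Support { DECIMAL_64 }

// Stand-in annotation, assumed to be retained at runtime so it can be read reflectively.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface VectorizedInputFormatSupports {
  Support[] supports() default {};
}

// Hypothetical input format that opts in to Decimal64.
@VectorizedInputFormatSupports(supports = {Support.DECIMAL_64})
class TextLikeInputFormat { }

// Hypothetical input format that does not opt in.
class LegacyInputFormat { }

class SupportCheck {
  // True when the class carries the annotation and lists the given feature.
  static boolean supports(Class<?> inputFormat, Support feature) {
    VectorizedInputFormatSupports a =
        inputFormat.getAnnotation(VectorizedInputFormatSupports.class);
    return a != null && Arrays.asList(a.supports()).contains(feature);
  }

  public static void main(String[] args) {
    System.out.println(supports(TextLikeInputFormat.class, Support.DECIMAL_64));
    System.out.println(supports(LegacyInputFormat.class, Support.DECIMAL_64));
  }
}
```

In this sketch an unannotated input format simply falls back to the regular decimal path, which is in keeping with the opt-in wording of the description.
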
> There will be a Hive configuration variable 
> hive.vectorized.input.format.supports.enabled that holds a string list of 
> supported features.  The default will start as "decimal_64".  It can be 
> turned off to allow for performance comparisons and testing.
> The query SELECT * FROM DECIMAL_6_1_txt WHERE key - 100BD < 200BD ORDER BY 
> key, value
> will have a vectorized explain plan looking like:
> ...
> Filter Operator
>   Filter Vectorization:
>   className: VectorFilterOperator
>   native: true
>   predicateExpression: FilterDecimal64ColLessDecimal64Scalar(col 2, val 2000)(children: Decimal64ColSubtractDecimal64Scalar(col 0, val 1000, outputDecimal64AbsMax 999) -> 2:decimal(11,5)/DECIMAL_64) -> boolean
>   predicate: ((key - 100) < 200) (type: boolean)
> ...
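
To illustrate why the precision <= 18 limit matters: a decimal at a fixed scale can be held as an unscaled long, so subtracting a scalar and comparing against a scalar become plain long operations over an array. The sketch below assumes a scale of 1 purely for illustration; `Decimal64Sketch` and its methods are hypothetical and are not the vectorized classes from the patch.

```java
import java.math.BigDecimal;

class Decimal64Sketch {
  // Largest precision whose unscaled value always fits in a signed 64-bit long.
  static final int MAX_PRECISION = 18;

  // Convert a decimal literal to its unscaled long at the given scale.
  // Throws ArithmeticException if rounding would be needed or the value does not fit a long.
  static long toDecimal64(String literal, int scale) {
    return new BigDecimal(literal).setScale(scale).unscaledValue().longValueExact();
  }

  // Evaluate "col - subtrahend < bound" over unscaled longs that all share one scale.
  // Writes the indices of matching rows into 'selected' and returns how many matched.
  static int filterColMinusScalarLess(long[] col, long subtrahend, long bound, int[] selected) {
    int n = 0;
    for (int i = 0; i < col.length; i++) {
      // With at most 18 digits per operand, the difference stays within long range.
      if (col[i] - subtrahend < bound) {
        selected[n++] = i;
      }
    }
    return n;
  }

  public static void main(String[] args) {
    int scale = 1; // illustrative scale only
    long[] key = {
      toDecimal64("250.5", scale),
      toDecimal64("300.0", scale),
      toDecimal64("-12.3", scale)
    };
    int[] selected = new int[key.length];
    int n = filterColMinusScalarLess(
        key, toDecimal64("100", scale), toDecimal64("200", scale), selected);
    System.out.println(n + " rows selected");
  }
}
```

At scale 1 the literals 100 and 200 become the longs 1000 and 2000. The sketch makes no claim about how the planner derives the scalar values shown in the plan above.
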



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17433:

Attachment: HIVE-17433.07.patch



[jira] [Updated] (HIVE-17433) Vectorization: Support Decimal64 in Hive Query Engine

2017-10-25 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-17433:

Status: In Progress  (was: Patch Available)

> Vectorization: Support Decimal64 in Hive Query Engine
> -
>
> Key: HIVE-17433
> URL: https://issues.apache.org/jira/browse/HIVE-17433
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-17433.03.patch, HIVE-17433.04.patch, 
> HIVE-17433.05.patch, HIVE-17433.06.patch


[jira] [Commented] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219959#comment-16219959
 ] 

Hive QA commented on HIVE-17458:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893964/HIVE-17458.09.patch

{color:green}SUCCESS:{color} +1 due to 4 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 11328 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[acid_vectorization_original] (batchId=164)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[acid_vectorization_original] (batchId=101)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=222)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes (batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7487/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7487/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7487/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893964 - PreCommit-HIVE-Build

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch, HIVE-17458.09.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()
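
The offset calculation described above can be sketched as follows: for the files of one bucket taken in a stable order, the first row number of a file is the sum of the row counts of the files before it. This is only an illustration under stated assumptions. `OriginalFileOffsets` and `FileInfo` are hypothetical, ordering by file name is an assumption, and a real reader would take row counts from ORC file metadata rather than from a list like this one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class OriginalFileOffsets {
  // Hypothetical descriptor: a file name and the number of rows the file holds.
  static final class FileInfo {
    final String name;
    final long rowCount;

    FileInfo(String name, long rowCount) {
      this.name = name;
      this.rowCount = rowCount;
    }
  }

  // Row number of the first row of 'target' within its bucket, assuming rows are
  // numbered consecutively across the bucket's files in file-name order.
  static long firstRowNumber(List<FileInfo> bucketFiles, String target) {
    List<FileInfo> sorted = new ArrayList<>(bucketFiles);
    sorted.sort(Comparator.comparing((FileInfo f) -> f.name));
    long offset = 0;
    for (FileInfo f : sorted) {
      if (f.name.equals(target)) {
        return offset;
      }
      offset += f.rowCount;
    }
    throw new IllegalArgumentException("file not in bucket: " + target);
  }

  public static void main(String[] args) {
    List<FileInfo> bucket = Arrays.asList(
        new FileInfo("000000_0_copy_1", 40),
        new FileInfo("000000_0", 100),
        new FileInfo("000000_0_copy_2", 7));
    System.out.println(firstRowNumber(bucket, "000000_0_copy_2"));
  }
}
```

A split that starts partway through a file would add its position within the file to this offset to obtain the row number for the first row of its batch.
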





[jira] [Updated] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16827:

Attachment: HIVE-16827.05wip07.patch

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip01.patch, HIVE-16827.04wip02.patch, 
> HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, HIVE-16827.04wip05.patch, 
> HIVE-16827.04wip06.patch, HIVE-16827.04wip07.patch, HIVE-16827.04wip08.patch, 
> HIVE-16827.04wip09.patch, HIVE-16827.04wip10.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.05wip02.patch, HIVE-16827.05wip03.patch, HIVE-16827.05wip04.patch, 
> HIVE-16827.05wip05.patch, HIVE-16827.05wip06.patch, HIVE-16827.05wip07.patch, 
> HIVE-16827.05wip07.patch, HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only or column 
> stats only or both.





[jira] [Resolved] (HIVE-17129) Increase usage of InterfaceAudience and InterfaceStability annotations

2017-10-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar resolved HIVE-17129.
-
Resolution: Fixed

> Increase usage of InterfaceAudience and InterfaceStability annotations 
> ---
>
> Key: HIVE-17129
> URL: https://issues.apache.org/jira/browse/HIVE-17129
> Project: Hive
>  Issue Type: Improvement
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>
> The {{InterfaceAudience}} and {{InterfaceStability}} annotations were added a 
> while ago to mark certain classes as available for public use. However, they 
> were only added to a few classes. The annotations are largely missing for 
> major APIs such as the SerDe and UDF APIs. We should update these interfaces 
> to use these annotations.
> When done in conjunction with HIVE-17130, we should have an automated way to 
> prevent backwards incompatible changes to Hive APIs.





[jira] [Updated] (HIVE-17743) Add InterfaceAudience and InterfaceStability annotations for Thrift generated APIs

2017-10-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-17743:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Thanks for the review Aihua, pushed to master.

> Add InterfaceAudience and InterfaceStability annotations for Thrift generated 
> APIs
> --
>
> Key: HIVE-17743
> URL: https://issues.apache.org/jira/browse/HIVE-17743
> Project: Hive
>  Issue Type: Sub-task
>  Components: Thrift API
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-17743.1.patch, HIVE-17743.2.patch
>
>
> The Thrift generated files don't have {{InterfaceAudience}} or 
> {{InterfaceStability}} annotations on them, mainly because all the files are 
> auto-generated.
> We should add some code that auto-tags all the Java Thrift generated files 
> with these annotations. This way even when they are re-generated, they still 
> contain the annotations.
> We should be able to do this using the 
> {{com.google.code.maven-replacer-plugin}} similar to what we do in 
> {{standalone-metastore/pom.xml}}.
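
As a rough illustration of the replacement idea only (this is not the plugin configuration), the sketch below prefixes the two annotations to top-level declarations in a generated source string. The fully qualified annotation names and the regular expression are assumptions, and `ThriftAnnotationTagger` is a hypothetical name; a real build would apply the substitution to the generated files through the plugin.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ThriftAnnotationTagger {
  // Assumed fully qualified annotation names; adjust to the classes the project uses.
  static final String ANNOTATIONS =
      "@org.apache.hadoop.classification.InterfaceAudience.Public "
      + "@org.apache.hadoop.classification.InterfaceStability.Stable ";

  // Matches a declaration that begins a line with "public class", "public enum",
  // or "public interface".
  private static final Pattern DECL =
      Pattern.compile("(?m)^public (class|enum|interface) ");

  // Prefix the annotations to each matching declaration. Sources that already
  // contain the annotation string are returned unchanged.
  static String tag(String source) {
    if (source.contains(ANNOTATIONS)) {
      return source;
    }
    Matcher m = DECL.matcher(source);
    return m.replaceAll(Matcher.quoteReplacement(ANNOTATIONS) + "public $1 ");
  }

  public static void main(String[] args) {
    String generated = "package example;\npublic class Foo {\n}\n";
    System.out.print(tag(generated));
  }
}
```

Because regeneration produces untagged sources again, running the same substitution after every generation step keeps the annotations in place.
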





[jira] [Commented] (HIVE-17765) expose Hive keywords

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219895#comment-16219895
 ] 

Lefty Leverenz commented on HIVE-17765:
---

Should this be documented somewhere in the wiki?

> expose Hive keywords 
> -
>
> Key: HIVE-17765
> URL: https://issues.apache.org/jira/browse/HIVE-17765
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, 
> HIVE-17765.03.patch, HIVE-17765.nogen.patch, HIVE-17765.patch
>
>
> This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on 
> SQL capabilities of Hive





[jira] [Commented] (HIVE-17832) Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in metastore

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219888#comment-16219888
 ] 

Lefty Leverenz commented on HIVE-17832:
---

Should the ability to reset 
*hive.metastore.disallow.incompatible.col.type.changes* in the metastore be 
documented in the wiki?

* [Configuration Properties -- 
hive.metastore.disallow.incompatible.col.type.changes | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.metastore.disallow.incompatible.col.type.changes]
* [Alter Table -- Change Column Name/Type/Position/Comment | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment]

> Allow hive.metastore.disallow.incompatible.col.type.changes to be changed in 
> metastore
> --
>
> Key: HIVE-17832
> URL: https://issues.apache.org/jira/browse/HIVE-17832
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE17832.1.patch, HIVE17832.2.patch
>
>
> When hive.metastore.disallow.incompatible.col.type.changes is set to true, it 
> disallows incompatible column type changes through ALTER TABLE.  However, this 
> parameter is not modifiable in HMS.  If HMS is not embedded into HS2, the 
> value cannot be changed.  





[jira] [Comment Edited] (HIVE-17116) Vectorization: Add infrastructure for vectorization of ROW__ID struct

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16097134#comment-16097134
 ] 

Lefty Leverenz edited comment on HIVE-17116 at 10/26/17 2:47 AM:
-

Doc note:  This adds *hive.vectorized.row.identifier.enabled* to HiveConf.java, 
so it will need to be documented in the wiki.

* [Configuration Properties -- Vectorization | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization]

Added a TODOC3.0 label.

Doc update 25/Oct/17:  HIVE-17471 changes the default to true, also in release 
3.0.0.


was (Author: le...@hortonworks.com):
Doc note:  This adds *hive.vectorized.row.identifier.enabled* to HiveConf.java, 
so it will need to be documented in the wiki.

* [Configuration Properties -- Vectorization | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization]

Added a TODOC3.0 label.

> Vectorization: Add infrastructure for vectorization of ROW__ID struct
> -
>
> Key: HIVE-17116
> URL: https://issues.apache.org/jira/browse/HIVE-17116
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17116.01.patch, HIVE-17116.02.patch, 
> HIVE-17116.03.patch
>
>
> Supports new ACID work.





[jira] [Commented] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219883#comment-16219883
 ] 

Lefty Leverenz commented on HIVE-17471:
---

Doc note:  This changes the default value of 
*hive.vectorized.row.identifier.enabled*, which was added by HIVE-17116 in the 
same release.  It will need to be documented in the wiki.

* [Configuration Properties -- Vectorization | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Vectorization]

Added a TODOC3.0 label.

> Vectorization: Enable hive.vectorized.row.identifier.enabled to true by 
> default
> ---
>
> Key: HIVE-17471
> URL: https://issues.apache.org/jira/browse/HIVE-17471
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Matt McCline
>Assignee: Sergey Shelukhin
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17471.01.patch, HIVE-17471.patch
>
>
> We disabled it in https://issues.apache.org/jira/browse/HIVE-17116 
> "Vectorization: Add infrastructure for vectorization of ROW__ID struct"
> but forgot to enable it by default in Teddy's ACID ROW__ID work... 





[jira] [Commented] (HIVE-3108) SELECT count(DISTINCT col) ... returns 0 if "col" is a partition column

2017-10-25 Thread Mamta Chawla (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219882#comment-16219882
 ] 

Mamta Chawla commented on HIVE-3108:


Hi,
Before running the count query, run the following:
hive> ANALYZE TABLE stocks PARTITION() COMPUTE STATISTICS;
hive> select count(*) from stocks where ;

Regards
Mamta Chawla

> SELECT count(DISTINCT col) ... returns 0 if "col" is a partition column
> ---
>
> Key: HIVE-3108
> URL: https://issues.apache.org/jira/browse/HIVE-3108
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.8.0, 0.9.0
> Environment: Mac OSX running Apache distribution of hadoop and hive 
> natively.
>Reporter: Dean Wampler
>  Labels: Hive
>
> Suppose "stocks" is a managed OR external table, partitioned by "exchange" 
> and "symbol". "count(DISTINCT x)" returns 0 for either "exchange", "symbol", 
> or both:
> hive> SELECT count(DISTINCT exchange), count(DISTINCT symbol) from stocks;
> 0  0





[jira] [Commented] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-25 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219880#comment-16219880
 ] 

Jason Dere commented on HIVE-17908:
---

[~aplusplus] [~sseth] can you review?

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17908.1.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors, but they also occur because 
> the client does not correctly handle the killTask notification when the 
> request has been accepted but is still waiting for the first task heartbeat. In 
> this situation the client should retry the request, similar to what the LLAP AM 
> does. The current logic ignores the killTask in this situation, which results 
> in a heartbeat timeout, because LLAP sends no heartbeats following the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received reader event error: Timed out waiting for heartbeat for task ID attempt_7739111832518812959_0005_0_00_10_0
>     at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
>     at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
>     at org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
>     at org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
>     at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
>     at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
>     at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
>     at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>     at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>     at org.apache.spark.scheduler.Task.run(Task.scala:99)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0): Error while attempting to read chunk length
>     at org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.FilterInputStream.read(FilterInputStream.java:83)
>     at org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
>     at org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
>     ... 22 more
> Caused by: java.net.SocketException: Socket closed
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}
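
The handling proposed in the description can be sketched as a small decision on the request's state when a killTask arrives. Everything here is hypothetical: the state and action names are invented for illustration, and only the accepted-but-no-heartbeat case comes from the description. What the other states should do is an assumption of this sketch, not a statement about the patch.

```java
class KillTaskHandling {
  // Hypothetical request states on the external client side.
  enum RequestState { SUBMITTED, ACCEPTED_NO_HEARTBEAT, RUNNING }

  // Hypothetical outcomes of receiving a killTask notification.
  enum Action { RETRY_REQUEST, REPORT_KILLED }

  static Action onKillTask(RequestState state) {
    switch (state) {
      case ACCEPTED_NO_HEARTBEAT:
        // Case from the description: no heartbeats will follow the kill, so
        // waiting would only end in a heartbeat timeout. Resubmit instead.
        return Action.RETRY_REQUEST;
      default:
        // Assumption for this sketch only: other states surface the kill to the caller.
        return Action.REPORT_KILLED;
    }
  }

  public static void main(String[] args) {
    for (RequestState s : RequestState.values()) {
      System.out.println(s + " -> " + onKillTask(s));
    }
  }
}
```

A real client would presumably also bound the number of retries; that is left out here to keep the decision itself visible.
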





[jira] [Updated] (HIVE-17471) Vectorization: Enable hive.vectorized.row.identifier.enabled to true by default

2017-10-25 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-17471:
--
Labels: TODOC3.0  (was: )



[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
Status: Patch Available  (was: Open)



[jira] [Updated] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere updated HIVE-17908:
--
Attachment: HIVE-17908.1.patch

> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Attachments: HIVE-17908.1.patch
>
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors, however it is also occurring because 
> the client is not correctly handling the killTask notification when the 
> request is accepted but still waiting for the first task heartbeat. In this 
> situation the client should retry the request, similar to what the LLAP AM 
> does. Current logic is ignoring the killTask in this situation, which results 
> in a heartbeat timeout - no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, 
> cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received 
> reader event error: Timed out waiting for heartbeat for task ID 
> attempt_7739111832518812959_0005_0_00_10_0
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0):
>  Error while attempting to read chunk length
> at 
> org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}
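
The retry behaviour requested above can be sketched as a small client-side state machine. This is only an illustration under stated assumptions: the class, enum, and method names are hypothetical and do not come from the Hive code base, and the choice to fail a request that is killed after its first heartbeat is an assumption, since the issue only describes the accepted-but-no-heartbeat case.

```java
/** Hypothetical sketch; none of these names are taken from the LLAP external client. */
final class PendingRequestSketch {
    enum State { ACCEPTED, RUNNING, FAILED }
    enum Action { RETRY, FAIL, IGNORE }

    // A request starts out accepted but has not yet seen its first heartbeat.
    private State state = State.ACCEPTED;
    private int attempts = 1;
    private final int maxAttempts;

    PendingRequestSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** The first heartbeat moves the request from ACCEPTED to RUNNING. */
    void onHeartbeat() {
        if (state == State.ACCEPTED) {
            state = State.RUNNING;
        }
    }

    /** Decide what to do with a killTask notification instead of ignoring it. */
    Action onKillTask() {
        switch (state) {
            case ACCEPTED:
                // Killed before the first heartbeat: resubmit while attempts remain.
                if (attempts < maxAttempts) {
                    attempts++;
                    return Action.RETRY;
                }
                state = State.FAILED;
                return Action.FAIL;
            case RUNNING:
                // Assumption: a kill after the task started is reported as a failure.
                state = State.FAILED;
                return Action.FAIL;
            default:
                return Action.IGNORE;
        }
    }

    State state() { return state; }
    int attempts() { return attempts; }
}
```

The point of the sketch is that a killTask received in the ACCEPTED state yields an explicit decision (retry or fail) rather than being dropped, so the reader does not sit waiting for a heartbeat that will never arrive.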





[jira] [Assigned] (HIVE-17908) LLAP External client not correctly handling killTask for pending requests

2017-10-25 Thread Jason Dere (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Dere reassigned HIVE-17908:
-


> LLAP External client not correctly handling killTask for pending requests
> -
>
> Key: HIVE-17908
> URL: https://issues.apache.org/jira/browse/HIVE-17908
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
>
> Hitting "Timed out waiting for heartbeat for task ID" errors with the LLAP 
> external client.
> HIVE-17393 fixed some of these errors, however it is also occurring because 
> the client is not correctly handling the killTask notification when the 
> request is accepted but still waiting for the first task heartbeat. In this 
> situation the client should retry the request, similar to what the LLAP AM 
> does. Current logic is ignoring the killTask in this situation, which results 
> in a heartbeat timeout - no heartbeats are sent by LLAP because of the 
> killTask notification.
> {noformat}
> 17/08/09 05:36:02 WARN TaskSetManager: Lost task 10.0 in stage 4.0 (TID 14, 
> cn114-10.l42scl.hortonworks.com, executor 5): java.io.IOException: Received 
> reader event error: Timed out waiting for heartbeat for task ID 
> attempt_7739111832518812959_0005_0_00_10_0
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:178)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:50)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:121)
> at 
> org.apache.hadoop.hive.llap.LlapRowRecordReader.next(LlapRowRecordReader.java:68)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:266)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:211)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at 
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown
>  Source)
> at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
> at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
> at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:126)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:322)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: 
> LlapTaskUmbilicalExternalClient(attempt_7739111832518812959_0005_0_00_10_0):
>  Error while attempting to read chunk length
> at 
> org.apache.hadoop.hive.llap.io.ChunkedInputStream.read(ChunkedInputStream.java:82)
> at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
> at java.io.FilterInputStream.read(FilterInputStream.java:83)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.hasInput(LlapBaseRecordReader.java:267)
> at 
> org.apache.hadoop.hive.llap.LlapBaseRecordReader.next(LlapBaseRecordReader.java:142)
> ... 22 more
> Caused by: java.net.SocketException: Socket closed
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
> {noformat}





[jira] [Commented] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219861#comment-16219861
 ] 

Lefty Leverenz commented on HIVE-15104:
---

Doc note:  This adds *hive.spark.optimize.shuffle.serde* to HiveConf.java, so 
it needs to be documented in the wiki.

* [Configuration Properties -- Spark | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark]

Added a TODOC3.0 label.

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, run on the Spark and MR engines, generates different amounts of 
> shuffle data.
> I think this is because Hive on MR serializes only part of HiveKey, whereas 
> Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?





[jira] [Updated] (HIVE-15104) Hive on Spark generate more shuffle data than hive on mr

2017-10-25 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-15104:
--
Labels: TODOC3.0  (was: )

> Hive on Spark generate more shuffle data than hive on mr
> 
>
> Key: HIVE-15104
> URL: https://issues.apache.org/jira/browse/HIVE-15104
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 1.2.1
>Reporter: wangwenli
>Assignee: Rui Li
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-15104.1.patch, HIVE-15104.10.patch, 
> HIVE-15104.2.patch, HIVE-15104.3.patch, HIVE-15104.4.patch, 
> HIVE-15104.5.patch, HIVE-15104.6.patch, HIVE-15104.7.patch, 
> HIVE-15104.8.patch, HIVE-15104.9.patch, TPC-H 100G.xlsx
>
>
> The same SQL, run on the Spark and MR engines, generates different amounts of 
> shuffle data.
> I think this is because Hive on MR serializes only part of HiveKey, whereas 
> Hive on Spark, which uses Kryo, serializes the full HiveKey object.
> What is your opinion?





[jira] [Updated] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17839:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the patch and the testing.

> Cannot generate thrift definitions in standalone-metastore
> --
>
> Key: HIVE-17839
> URL: https://issues.apache.org/jira/browse/HIVE-17839
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Jaiprakash
>Assignee: Alan Gates
> Fix For: 3.0.0
>
> Attachments: HIVE-17839.patch
>
>
> mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift 
> sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 
> fix.





[jira] [Updated] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17839:

Summary: Cannot generate thrift definitions in standalone-metastore  (was: 
Cannot generate thrift definitions in standalone-metastore.)

> Cannot generate thrift definitions in standalone-metastore
> --
>
> Key: HIVE-17839
> URL: https://issues.apache.org/jira/browse/HIVE-17839
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Jaiprakash
>Assignee: Alan Gates
> Attachments: HIVE-17839.patch
>
>
> mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift 
> sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 
> fix.





[jira] [Commented] (HIVE-17839) Cannot generate thrift definitions in standalone-metastore.

2017-10-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219843#comment-16219843
 ] 

Sergey Shelukhin commented on HIVE-17839:
-

+1

> Cannot generate thrift definitions in standalone-metastore.
> ---
>
> Key: HIVE-17839
> URL: https://issues.apache.org/jira/browse/HIVE-17839
> Project: Hive
>  Issue Type: Bug
>Reporter: Harish Jaiprakash
>Assignee: Alan Gates
> Attachments: HIVE-17839.patch
>
>
> mvn clean install -Pthriftif -Dthrift.home=... does not regenerate the thrift 
> sources. This is after the https://issues.apache.org/jira/browse/HIVE-17506 
> fix.





[jira] [Commented] (HIVE-12357) Allow user to set tez job name

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219835#comment-16219835
 ] 

Lefty Leverenz commented on HIVE-12357:
---

Doc update:  HIVE-16601 adds "Used by Spark to set the query name, will show up 
in the Spark UI" to the description (and function) of *hive.query.name* in 
release 3.0.0.

> Allow user to set tez job name
> --
>
> Key: HIVE-12357
> URL: https://issues.apache.org/jira/browse/HIVE-12357
> Project: Hive
>  Issue Type: Bug
>Reporter: Gunther Hagleitner
>Assignee: Gunther Hagleitner
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-12357.04.patch, HIVE-12357.05.patch, 
> HIVE-12357.06.patch, HIVE-12357.1.patch, HIVE-12357.2.patch, 
> HIVE-12357.3.patch
>
>
> Need something like mapred.job.name.





[jira] [Commented] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-25 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219830#comment-16219830
 ] 

Lefty Leverenz commented on HIVE-16601:
---

Doc note:  This adds to the description of *hive.query.name* in HiveConf.java, 
so it needs to be documented in the wiki.  (HIVE-12357 introduced 
*hive.query.name* but it isn't documented yet.)

Apparently *hive.query.name* is for Tez and Spark only.  It could be documented 
in one section with a cross-reference from the other, or it could go in the 
general query section with cross-references from both Tez and Spark sections.

* [Configuration Properties -- Query and DDL Execution | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution]
* [Configuration Properties -- Spark | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Spark]
* [Configuration Properties -- Tez | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-Tez]

Since Tez already has a list of additional configs but Spark does not, I 
suggest putting it in Spark and adding it to the Tez list.

Added a TODOC3.0 label.

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, 
> HIVE-16601.6.patch, HIVE-16601.7.patch, HIVE-16601.8.patch, Spark UI 
> Applications List.png, Spark UI Jobs List.png
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-25 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-16601:
--
Labels: TODOC3.0  (was: )

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, 
> HIVE-16601.6.patch, HIVE-16601.7.patch, HIVE-16601.8.patch, Spark UI 
> Applications List.png, Spark UI Jobs List.png
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Commented] (HIVE-17778) Add support for custom counters in trigger expression

2017-10-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219790#comment-16219790
 ] 

Sergey Shelukhin commented on HIVE-17778:
-

+1 pending tests. Why do the test timeouts have to be so long (e.g. 240s)? Are 
the tests timing dependent?

> Add support for custom counters in trigger expression
> -
>
> Key: HIVE-17778
> URL: https://issues.apache.org/jira/browse/HIVE-17778
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, 
> HIVE-17778.3.patch, HIVE-17778.4.patch, HIVE-17778.5.patch, HIVE-17778.6.patch
>
>
> HIVE-17508 only supports limited counters. This ticket is to extend it to 
> support custom counters (counters that are not supported by execution engine 
> will be dropped).





[jira] [Commented] (HIVE-17778) Add support for custom counters in trigger expression

2017-10-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219793#comment-16219793
 ] 

Prasanth Jayachandran commented on HIVE-17778:
--

Those tests run many queries. I allowed 60s per query. That specific test runs 
4 queries (hence 240s), each building on the previous one. Except for the first 
session/query, the subsequent sessions should be reused and subsequent queries 
should be faster; the timeout leaves some leeway in case session reuse takes time.

> Add support for custom counters in trigger expression
> -
>
> Key: HIVE-17778
> URL: https://issues.apache.org/jira/browse/HIVE-17778
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, 
> HIVE-17778.3.patch, HIVE-17778.4.patch, HIVE-17778.5.patch, HIVE-17778.6.patch
>
>
> HIVE-17508 only supports limited counters. This ticket is to extend it to 
> support custom counters (counters that are not supported by execution engine 
> will be dropped).





[jira] [Commented] (HIVE-17833) Publish split generation counters

2017-10-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219789#comment-16219789
 ] 

Prasanth Jayachandran commented on HIVE-17833:
--

The unit test case for triggers based on input counters has to be enabled after 
HIVE-17778 and TEZ-3856.

> Publish split generation counters
> -
>
> Key: HIVE-17833
> URL: https://issues.apache.org/jira/browse/HIVE-17833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17833.1.patch, HIVE-17833.2.patch, 
> HIVE-17833.3.patch
>
>
> With TEZ-3856, tez counters are exposed via input initializers which can be 
> used to publish split generation counters. 





[jira] [Updated] (HIVE-17833) Publish split generation counters

2017-10-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17833:
-
Attachment: HIVE-17833.3.patch

> Publish split generation counters
> -
>
> Key: HIVE-17833
> URL: https://issues.apache.org/jira/browse/HIVE-17833
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17833.1.patch, HIVE-17833.2.patch, 
> HIVE-17833.3.patch
>
>
> With TEZ-3856, tez counters are exposed via input initializers which can be 
> used to publish split generation counters. 





[jira] [Updated] (HIVE-16601) Display Session Id and Query Name / Id in Spark UI

2017-10-25 Thread Sahil Takiar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HIVE-16601:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

> Display Session Id and Query Name / Id in Spark UI
> --
>
> Key: HIVE-16601
> URL: https://issues.apache.org/jira/browse/HIVE-16601
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
> Fix For: 3.0.0
>
> Attachments: HIVE-16601.1.patch, HIVE-16601.2.patch, 
> HIVE-16601.3.patch, HIVE-16601.4.patch, HIVE-16601.5.patch, 
> HIVE-16601.6.patch, HIVE-16601.7.patch, HIVE-16601.8.patch, Spark UI 
> Applications List.png, Spark UI Jobs List.png
>
>
> We should display the session id for each HoS Application Launched, and the 
> Query Name / Id and Dag Id for each Spark job launched. Hive-on-MR does 
> something similar via the {{mapred.job.name}} parameter. The query name is 
> displayed in the Job Name of the MR app.
> The changes here should also allow us to leverage the config 
> {{hive.query.name}} for HoS.
> This should help with debuggability of HoS applications. The Hive-on-Tez UI 
> does something similar.
> Related issues for Hive-on-Tez: HIVE-12357, HIVE-12523





[jira] [Updated] (HIVE-17730) Queries can be closed automatically

2017-10-25 Thread Alexander Kolbasov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Kolbasov updated HIVE-17730:
--
Attachment: HIVE-17730.08.patch

> Queries can be closed automatically
> ---
>
> Key: HIVE-17730
> URL: https://issues.apache.org/jira/browse/HIVE-17730
> Project: Hive
>  Issue Type: Bug
>Reporter: Alexander Kolbasov
>Assignee: Alexander Kolbasov
> Attachments: HIVE-17730.07.patch, HIVE-17730.08.patch
>
>
> HIVE-16213 made QueryWrapper AutoCloseable, but queries are still closed 
> manually rather than with try-with-resources. Query itself is now 
> AutoCloseable, so we don't need the wrapper at all.
> We should therefore get rid of QueryWrapper and use try-with-resources to 
> create queries.
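
The pattern proposed above can be illustrated with a minimal, self-contained sketch. `FakeQuery` is a hypothetical stand-in for any query object that implements `AutoCloseable`; it is not the real query class used by the metastore, and the returned rows are made up for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of replacing manual close() calls with try-with-resources. */
final class TryWithResourcesSketch {

    /** Stand-in for an AutoCloseable query; not the actual metastore Query type. */
    static final class FakeQuery implements AutoCloseable {
        private boolean closed;

        List<String> execute() {
            if (closed) {
                throw new IllegalStateException("query already closed");
            }
            return Arrays.asList("db1", "db2"); // made-up result rows
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    /** The query is closed when the try block exits, normally or via an exception. */
    static int countRows(FakeQuery query) {
        try (FakeQuery q = query) {
            return q.execute().size();
        }
    }
}
```

Because the resource is closed by the language construct itself, no wrapper object or `finally` block is needed to guarantee cleanup.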



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-16722) Converting bucketed non-acid table to acid should perform validation

2017-10-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16722:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

patch 4 committed to master
thanks Alan for the review

> Converting bucketed non-acid table to acid should perform validation
> 
>
> Key: HIVE-16722
> URL: https://issues.apache.org/jira/browse/HIVE-16722
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Fix For: 3.0.0
>
> Attachments: HIVE-16722.01.patch, HIVE-16722.02.patch, 
> HIVE-16722.03.patch, HIVE-16722.04.patch, HIVE-16722.WIP.patch
>
>
> Converting a non acid table to acid only performs metadata validation (in 
> _TransactionalValidationListener_).
> The data read code path only understands certain directory layouts and file 
> names and ignores (generally) files that don't match the expected format.
> In Hive, directory layout and bucket file naming (especially older releases) 
> is poorly enforced.
> Need to add a validation step on 
> {noformat}
> alter table T SET TBLPROPERTIES ('transactional'='true')
> {noformat}
> to 
> scan the file system and report any possible data loss scenarios.
> Currently Acid understands bucket file names like "0_0" and (with 
> HIVE-16177) "0_0_copy1" etc. at the root of the partition.
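
A validation step of the kind described above could, as a rough sketch, flag file names that the reader would not recognise. Everything here is an assumption made for illustration: the class and method names are hypothetical, and the regular expression is derived only from the two example names quoted in the description ("0_0" and "0_0_copy1"), not from the actual naming rules in Hive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch; the pattern is an assumption based on the quoted examples only. */
final class BucketNameCheckSketch {

    // Accepts names such as "0_0", "000001_0" and "0_0_copy1" (also "0_0_copy_1").
    private static final Pattern BUCKET_FILE =
            Pattern.compile("[0-9]+_[0-9]+(_copy_?[0-9]+)?");

    /** Returns the names that do not look like bucket files and might be ignored on read. */
    static List<String> findUnrecognized(List<String> fileNames) {
        List<String> unrecognized = new ArrayList<>();
        for (String name : fileNames) {
            if (!BUCKET_FILE.matcher(name).matches()) {
                unrecognized.add(name);
            }
        }
        return unrecognized;
    }
}
```

In a real check, a non-empty result would be reported to the user before the table is marked transactional, since those files are the ones at risk of being silently skipped.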





[jira] [Updated] (HIVE-17778) Add support for custom counters in trigger expression

2017-10-25 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-17778:
-
Attachment: HIVE-17778.6.patch

Addressed review comments and fixed test failures.

> Add support for custom counters in trigger expression
> -
>
> Key: HIVE-17778
> URL: https://issues.apache.org/jira/browse/HIVE-17778
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-17778.1.patch, HIVE-17778.2.patch, 
> HIVE-17778.3.patch, HIVE-17778.4.patch, HIVE-17778.5.patch, HIVE-17778.6.patch
>
>
> HIVE-17508 only supports limited counters. This ticket is to extend it to 
> support custom counters (counters that are not supported by execution engine 
> will be dropped).





[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Description: 
Enabling and applying the RP should only be runnable in HS2 with active WM. 
Both should validate the full resource plan (or at least enable should; users 
cannot modify the RP via normal means once enabled, but it might be worth 
double checking since we have to fetch it anyway to apply).
Then, apply should propagate the resource plan to the WM instance.

  was:
Enabling and applying the RP should only be runnable in HS2 with active WM. 
Both should validate the full resource plan (or at least enable; users cannot 
modify the RP via normal means once enabled, but it might be worth double 
checking since we have to fetch it anyway to apply).
Then, apply should propagate the command to WM instance.


> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable should; users 
> cannot modify the RP via normal means once enabled, but it might be worth 
> double checking since we have to fetch it anyway to apply).
> Then, apply should propagate the resource plan to the WM instance.





[jira] [Updated] (HIVE-17907) enable and apply resource plan commands in HS2

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17907:

Summary: enable and apply resource plan commands in HS2  (was: enable and 
apply commands in HS2)

> enable and apply resource plan commands in HS2
> --
>
> Key: HIVE-17907
> URL: https://issues.apache.org/jira/browse/HIVE-17907
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>
> Enabling and applying the RP should only be runnable in HS2 with active WM. 
> Both should validate the full resource plan (or at least enable; users cannot 
> modify the RP via normal means once enabled, but it might be worth double 
> checking since we have to fetch it anyway to apply).
> Then, apply should propagate the command to WM instance.





[jira] [Updated] (HIVE-16722) Converting bucketed non-acid table to acid should perform validation

2017-10-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-16722:
--
Attachment: HIVE-16722.04.patch

patch 4 incorporates Alan's suggestion

> Converting bucketed non-acid table to acid should perform validation
> 
>
> Key: HIVE-16722
> URL: https://issues.apache.org/jira/browse/HIVE-16722
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Affects Versions: 1.0.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
> Attachments: HIVE-16722.01.patch, HIVE-16722.02.patch, 
> HIVE-16722.03.patch, HIVE-16722.04.patch, HIVE-16722.WIP.patch
>
>
> Converting a non acid table to acid only performs metadata validation (in 
> _TransactionalValidationListener_).
> The data read code path only understands certain directory layouts and file 
> names and ignores (generally) files that don't match the expected format.
> In Hive, directory layout and bucket file naming (especially older releases) 
> is poorly enforced.
> Need to add a validation step on 
> {noformat}
> alter table T SET TBLPROPERTIES ('transactional'='true')
> {noformat}
> to 
> scan the file system and report any possible data loss scenarios.
> Currently Acid understands bucket file names like "0_0" and (with 
> HIVE-16177) "0_0_copy1" etc. at the root of the partition.





[jira] [Assigned] (HIVE-17902) add a notions of default pool and unmanaged mapping

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-17902:
---


> add a notions of default pool and unmanaged mapping
> ---
>
> Key: HIVE-17902
> URL: https://issues.apache.org/jira/browse/HIVE-17902
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> This is needed to map queries between WM and non-WM execution





[jira] [Updated] (HIVE-14731) Use Tez cartesian product edge in Hive (unpartitioned case only)

2017-10-25 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang updated HIVE-14731:

Attachment: HIVE-14731.addendum.patch

The previous patch breaks qtest subquery_multi on SparkCliDriver, but the same 
test failed on my local machine before the patch was committed. There seems to 
be some non-determinism in this test on SparkCliDriver (reopened HIVE-17823 for 
this). Let's revert the relevant change here to unblock the Jenkins run. CC [~hagleitn]

> Use Tez cartesian product edge in Hive (unpartitioned case only)
> 
>
> Key: HIVE-14731
> URL: https://issues.apache.org/jira/browse/HIVE-14731
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhiyuan Yang
>Assignee: Zhiyuan Yang
> Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch, 
> HIVE-14731.11.patch, HIVE-14731.12.patch, HIVE-14731.13.patch, 
> HIVE-14731.14.patch, HIVE-14731.15.patch, HIVE-14731.16.patch, 
> HIVE-14731.17.patch, HIVE-14731.18.patch, HIVE-14731.19.patch, 
> HIVE-14731.2.patch, HIVE-14731.20.patch, HIVE-14731.21.patch, 
> HIVE-14731.22.patch, HIVE-14731.23.patch, HIVE-14731.3.patch, 
> HIVE-14731.4.patch, HIVE-14731.5.patch, HIVE-14731.6.patch, 
> HIVE-14731.7.patch, HIVE-14731.8.patch, HIVE-14731.9.patch, 
> HIVE-14731.addendum.patch
>
>
> Given cartesian product edge is available in Tez now (see TEZ-3230), let's 
> integrate it into Hive on Tez. This allows us to have more than one reducer 
> in cross product queries.





[jira] [Commented] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2017-10-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219664#comment-16219664
 ] 

Sergey Shelukhin commented on HIVE-11553:
-

I think the reason is that it is a metastore dependency.

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.03.patch, HIVE-11553.04.patch, HIVE-11553.06.patch, 
> HIVE-11553.07.patch, HIVE-11553.patch
>
>
> This is the first step; uses the simple footer-getting API, without PPD.





[jira] [Reopened] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-25 Thread Zhiyuan Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhiyuan Yang reopened HIVE-17823:
-

I'll revert the relevant changes in HIVE-14731 to unblock the Jenkins run, but this 
still needs investigating.

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17823.001.patch
>
>
> This JIRA targets the Qtest file failures of HoS caused by the subquery fix 
> introduced in HIVE-17726.





[jira] [Comment Edited] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-25 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219646#comment-16219646
 ] 

Zhiyuan Yang edited comment on HIVE-17823 at 10/25/17 10:16 PM:


subquery_multi on TestSparkCliDriver seems to generate a different output order 
on different machines, specifically for this query:
{code}
select * from part_null where p_size IN (select p_size from part_null) AND 
p_brand IN (select p_brand from part_null)
{code}

It failed on my local machine (before HIVE-14731 was committed) like this:
{code}
237d236
< 78487 NULLManufacturer#6  Brand#52LARGE BRUSHED BRASS 23  MED BAG 1464.48 hely blith
238a238
> 78487 NULLManufacturer#6  Brand#52LARGE BRUSHED BRASS 23  MED BAG 1464.48 hely blith
{code}
After I overwrote it with my local result, it failed on Apache Jenkins with a 
similar diff.


was (Author: aplusplus):
subquery_multi on TestSparkCliDriver seems to generate a different output order 
on different machines, specifically for this query:
{code}
select * from part_null where p_size IN (select p_size from part_null) AND 
p_brand IN (select p_brand from part_null)
{code}

It failed on my local machine (before HIVE-14731 was committed) like this:
{code}
237d236
< 78487 NULLManufacturer#6  Brand#52LARGE BRUSHED BRASS 23  MED BAG 1464.48 hely blith
238a238
> 78487 NULLManufacturer#6  Brand#52LARGE BRUSHED BRASS 23  MED BAG 1464.48 hely blith
{code}
After I overwrote it with my local result, it failed on Apache Jenkins.

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17823.001.patch
>
>
> This JIRA targets the Qtest file failures of HoS caused by the subquery fix 
> introduced in HIVE-17726.





[jira] [Commented] (HIVE-17823) Fix subquery Qtest of Hive on Spark

2017-10-25 Thread Zhiyuan Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219646#comment-16219646
 ] 

Zhiyuan Yang commented on HIVE-17823:
-

subquery_multi on TestSparkCliDriver seems to generate a different output order 
on different machines, specifically for this query:
{code}
select * from part_null where p_size IN (select p_size from part_null) AND 
p_brand IN (select p_brand from part_null)
{code}

It failed on my local machine (before HIVE-14731 was committed) like this:
{code}
237d236
< 78487 NULLManufacturer#6  Brand#52LARGE BRUSHED BRASS 23  MED BAG 1464.48 hely blith
238a238
> 78487 NULLManufacturer#6  Brand#52LARGE BRUSHED BRASS 23  MED BAG 1464.48 hely blith
{code}
After I overwrote it with my local result, it failed on Apache Jenkins.

> Fix subquery Qtest of Hive on Spark
> ---
>
> Key: HIVE-17823
> URL: https://issues.apache.org/jira/browse/HIVE-17823
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Dapeng Sun
>Assignee: Dapeng Sun
> Attachments: HIVE-17823.001.patch
>
>
> This JIRA targets the Qtest file failures of HoS caused by the subquery fix 
> introduced in HIVE-17726.





[jira] [Commented] (HIVE-17482) External LLAP client: acquire locks for tables queried directly by LLAP

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219623#comment-16219623
 ] 

Eugene Koifman commented on HIVE-17482:
---

[~jdere]
It occurred to me that this may lead to deadlocks.
Suppose Spark is running a query (S join S).

So the 1st fragment will get a Shared lock on S.  
Then some other query will try an X lock on S and block.
Then the 2nd fragment will try to get an S lock and will block behind the X 
lock.
Will the 1st frag ever release its S lock?
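
The scenario above can be replayed with a small single-threaded model. This is 
only a sketch: the class FifoLockSketch and its FIFO rule (a new request never 
overtakes an earlier waiter) are assumptions made for illustration and are not 
Hive's lock manager.

```java
import java.util.ArrayList;
import java.util.List;

public class FifoLockSketch {
    enum Mode { S, X }

    // Modes of locks currently held and of requests still waiting, in arrival order.
    private final List<Mode> granted = new ArrayList<>();
    private final List<Mode> waiting = new ArrayList<>();

    // Returns true if the lock is granted immediately, false if the request must wait.
    boolean request(Mode mode) {
        // Anything is compatible with an empty table; S is compatible with other S locks.
        boolean compatible = granted.isEmpty()
                || (mode == Mode.S && !granted.contains(Mode.X));
        // Assumed FIFO fairness: nobody overtakes an earlier waiter.
        if (compatible && waiting.isEmpty()) {
            granted.add(mode);
            return true;
        }
        waiting.add(mode);
        return false;
    }

    public static void main(String[] args) {
        FifoLockSketch table = new FifoLockSketch();
        System.out.println("fragment 1, S: " + table.request(Mode.S));   // granted
        System.out.println("other query, X: " + table.request(Mode.X));  // waits for fragment 1
        System.out.println("fragment 2, S: " + table.request(Mode.S));   // queued behind the X request
    }
}
```

Under these assumptions fragment 2 waits behind the X request, which in turn 
waits for fragment 1. If fragment 1 keeps its S lock until the whole query 
finishes, none of the three can make progress.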

> External LLAP client: acquire locks for tables queried directly by LLAP
> ---
>
> Key: HIVE-17482
> URL: https://issues.apache.org/jira/browse/HIVE-17482
> Project: Hive
>  Issue Type: Sub-task
>  Components: llap
>Reporter: Jason Dere
>Assignee: Jason Dere
> Fix For: 3.0.0
>
> Attachments: HIVE-17482.1.patch, HIVE-17482.2.patch, 
> HIVE-17482.3.patch, HIVE-17482.4.patch, HIVE-17482.5.patch, HIVE-17482.6.patch
>
>
> When using the LLAP external client with simple queries (filter/project of 
> single table), the appropriate locks should be taken on the table being read 
> like they are for normal Hive queries. This is important in the case of 
> transactional tables being queried, since the compactor relies on the 
> presence of table locks to determine whether it can safely delete old 
> versions of compacted files without affecting currently running queries.
> This does not have to happen in the complex query case, since a query is used 
> (with the appropriate locking mechanisms) to create/populate the temp table 
> holding the results to the complex query.





[jira] [Commented] (HIVE-17841) implement applying the resource plan

2017-10-25 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219614#comment-16219614
 ] 

Prasanth Jayachandran commented on HIVE-17841:
--

Left some comments. I still haven't completely gone through the epic changes to 
WorkloadManager.java. Will finish it later.

> implement applying the resource plan
> 
>
> Key: HIVE-17841
> URL: https://issues.apache.org/jira/browse/HIVE-17841
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17841.01.patch, HIVE-17841.02.patch, 
> HIVE-17841.patch
>
>






[jira] [Commented] (HIVE-17695) collapse union all produced directories into delta directory name suffix for MM

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219592#comment-16219592
 ] 

Eugene Koifman commented on HIVE-17695:
---

This would also help AcidUtils.getAcidState().  It would just work.


> collapse union all produced directories into delta directory name suffix for 
> MM
> ---
>
> Key: HIVE-17695
> URL: https://issues.apache.org/jira/browse/HIVE-17695
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Priority: Minor
>
> This has special handling for writes resulting from a Union All query.
> In the full Acid case at least, these subdirs get collapsed in favor of 
> statementId-based dir names (delta_x_y_stmtId).  It would be cleaner/simpler 
> to make MM follow the same logic.  (Full Acid does it in Hive.moveFiles(), I 
> think.)





[jira] [Commented] (HIVE-17673) JavaUtils.extractTxnId() etc

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219574#comment-16219574
 ] 

Eugene Koifman commented on HIVE-17673:
---

AcidUtils.extractInsertOnlyTxnId(Path file) should probably use 
AcidUtils.parsedDelta()

otherwise LGTM
+1

> JavaUtils.extractTxnId() etc
> 
>
> Key: HIVE-17673
> URL: https://issues.apache.org/jira/browse/HIVE-17673
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
>Priority: Minor
> Attachments: HIVE-17673.patch, HIVE-17673.patch
>
>
> these should be in AcidUtils





[jira] [Commented] (HIVE-11553) use basic file metadata cache in ETLSplitStrategy-related paths

2017-10-25 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219572#comment-16219572
 ] 

Naveen Gangam commented on HIVE-11553:
--

Hey [~sershe], quick follow up on this. Was there a reason we are packaging 
thrift classes *fb303* into hive-exec.jar as part of this change? In the older 
releases, we do not include these classes. Thanks

> use basic file metadata cache in ETLSplitStrategy-related paths
> ---
>
> Key: HIVE-11553
> URL: https://issues.apache.org/jira/browse/HIVE-11553
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: TODOC2.0
> Fix For: 2.0.0
>
> Attachments: HIVE-11553.01.patch, HIVE-11553.02.patch, 
> HIVE-11553.03.patch, HIVE-11553.04.patch, HIVE-11553.06.patch, 
> HIVE-11553.07.patch, HIVE-11553.patch
>
>
> This is the first step; uses the simple footer-getting API, without PPD.





[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-10-25 Thread Steve Yeom (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219558#comment-16219558
 ] 

Steve Yeom commented on HIVE-17856:
---

Thanks for the info, Eugene. 

FYI, 

The above comments by Eugene are one item of the current plan for MM for this 
JIRA:
1. Keep delta dirs (so that readers in open transactions can keep accessing the 
delta dirs).
2. base dir processing
   2.1 Writer for IOW statement creates base dir for new rows. 
   2.2 Subsequent reader for a MM table after successful IOW commit shall use 
base dir only when it encounters a complete base dir.
3. Cleaner of the compactor cleans as in the case of Full ACID tables.  

I am actively working on the items.

> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}





[jira] [Updated] (HIVE-17897) "repl load" in bootstrap phase fails when partitions have whitespace

2017-10-25 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-17897:
-
Status: Patch Available  (was: Open)

> "repl load" in bootstrap phase fails when partitions have whitespace
> 
>
> Key: HIVE-17897
> URL: https://issues.apache.org/jira/browse/HIVE-17897
> Project: Hive
>  Issue Type: Sub-task
>  Components: repl
>Reporter: Sankar Hariappan
>Assignee: Thejas M Nair
>Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-17897.1.patch
>
>
> The issue is that Path.toURI().toString() is being used to serialize the 
> location, while new Path(String) is used to deserialize it. URI escapes chars 
> such as space, so the deserialized location doesn't point to the correct file 
> location.
> Following exception is seen - 
> {code}
> 2017-10-24T11:58:34,451 ERROR [d5606640-8174-4584-8b54-936b0f5628fa main] 
> exec.Task: Failed with exception null
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hive.ql.parse.repl.CopyUtils.regularCopy(CopyUtils.java:211)
> at 
> org.apache.hadoop.hive.ql.parse.repl.CopyUtils.copyAndVerify(CopyUtils.java:71)
> at 
> org.apache.hadoop.hive.ql.exec.ReplCopyTask.execute(ReplCopyTask.java:137)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:206)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2276)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1906)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1623)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1362)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1352)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
> at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
> at 
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:409)
> at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:827)
> at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:765)
> at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:692)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> {code}
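
The escaping mismatch described above can be reproduced with the JDK alone. 
This is a minimal sketch, assuming java.net.URI as a stand-in for the Hadoop 
Path round trip; the class UriRoundTripSketch, its method names, and the 
partition path are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriRoundTripSketch {
    // Serialize via URI.toString(), which percent-escapes characters such as space.
    static String serialize(String path) {
        try {
            return new URI("file", null, path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Hypothetical naive reader: uses the serialized text as a literal path.
    static String readAsLiteral(String serialized) {
        return serialized.substring(serialized.indexOf(':') + 1);
    }

    // Reader that parses the text as a URI, which decodes the escapes again.
    static String readAsUri(String serialized) {
        return URI.create(serialized).getPath();
    }

    public static void main(String[] args) {
        String location = "/warehouse/t/ds=2017 10 24"; // made-up partition path with spaces
        String wire = serialize(location);
        System.out.println(wire);                                  // spaces appear as %20
        System.out.println(readAsLiteral(wire).equals(location));  // false: %20 kept literally
        System.out.println(readAsUri(wire).equals(location));      // true: escapes decoded
    }
}
```

The point of the sketch is only that the writer and the reader have to agree 
on whether the stored string is an escaped URI or a raw path.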





[jira] [Commented] (HIVE-17856) MM tables - IOW is not ACID compliant

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219526#comment-16219526
 ] 

Eugene Koifman commented on HIVE-17856:
---

The general idea of how full Acid IOW works is as follows. Given
Insert into T select * from Foo

we create a new /warehouse/db/T/base_N where N is the current txn id in which 
the above SQL is running.
Then every Acid reader uses AcidUtils.getAcidState(Path rootOfT,)  to 
create an AcidUtils.Directory object which will have the base_N/ files but will 
put all other files in Directory.getObsolete().  This way it will only see what 
is written to base_N.

All Acid readers use getAcidState() to filter the input file list wrt 
ValidTxnList for each operation.

It would make sense for MM to follow the same approach.  I don't think the 
current MM code is quite there yet.
This would also make the Cleaner understand that deltas from before N can be 
removed (but only after all the readers for them have gone away).
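
The filtering idea can be sketched as follows. This is a simplified model and 
not the real AcidUtils.getAcidState(): it ignores ValidTxnList, statement-id 
suffixes and delete deltas, and the class AcidStateSketch and its methods are 
hypothetical. It assumes directory names of the form base_N and delta_min_max.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AcidStateSketch {
    // Highest N among names of the form base_N, or -1 if there is no base directory.
    static long newestBase(List<String> dirs) {
        long best = -1;
        for (String d : dirs) {
            if (d.startsWith("base_")) {
                best = Math.max(best, Long.parseLong(d.substring("base_".length())));
            }
        }
        return best;
    }

    // True if a reader should skip this directory because the newest base supersedes it.
    static boolean isObsolete(String dir, long base) {
        if (dir.startsWith("base_")) {
            return Long.parseLong(dir.substring("base_".length())) < base;
        }
        // delta_min_max is covered by the base when max <= base.
        String[] parts = dir.split("_");
        return Long.parseLong(parts[2]) <= base;
    }

    // Directories a reader would still look at.
    static List<String> visible(List<String> dirs) {
        long base = newestBase(dirs);
        List<String> out = new ArrayList<>();
        for (String d : dirs) {
            if (!isObsolete(d, base)) {
                out.add(d);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList("delta_3_3", "delta_4_4", "base_5", "delta_6_6");
        System.out.println(visible(dirs)); // [base_5, delta_6_6]
    }
}
```

In this model the directories that visible() drops are the ones a cleaner 
could remove once no reader depends on them.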


> MM tables - IOW is not ACID compliant
> -
>
> Key: HIVE-17856
> URL: https://issues.apache.org/jira/browse/HIVE-17856
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Steve Yeom
>  Labels: mm-gap-1
>
> The following tests were removed from mm_all during "integration"... I should 
> have never allowed such manner of integration.
> MM logic should have been kept intact until ACID logic could catch up. Alas, 
> here we are.
> {noformat}
> drop table iow0_mm;
> create table iow0_mm(key int) tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow0_mm select key from intermediate;
> insert into table iow0_mm select key + 1 from intermediate;
> select * from iow0_mm order by key;
> insert overwrite table iow0_mm select key + 2 from intermediate;
> select * from iow0_mm order by key;
> drop table iow0_mm;
> drop table iow1_mm; 
> create table iow1_mm(key int) partitioned by (key2 int)  
> tblproperties("transactional"="true", 
> "transactional_properties"="insert_only");
> insert overwrite table iow1_mm partition (key2)
> select key as k1, key from intermediate union all select key as k1, key from 
> intermediate;
> insert into table iow1_mm partition (key2)
> select key + 1 as k1, key from intermediate union all select key as k1, key 
> from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key from intermediate union all select key + 4 as k1, 
> key from intermediate;
> select * from iow1_mm order by key, key2;
> insert overwrite table iow1_mm partition (key2)
> select key + 3 as k1, key + 3 from intermediate union all select key + 2 as 
> k1, key + 2 from intermediate;
> select * from iow1_mm order by key, key2;
> drop table iow1_mm;
> {noformat}
> {noformat}
> drop table simple_mm;
> create table simple_mm(key int) stored as orc tblproperties 
> ("transactional"="true", "transactional_properties"="insert_only");
> insert into table simple_mm select key from intermediate;
> -insert overwrite table simple_mm select key from intermediate;
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17858) MM - some union cases are broken

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219499#comment-16219499
 ] 

Eugene Koifman commented on HIVE-17858:
---

+1 pending tests

> MM - some union cases are broken
> 
>
> Key: HIVE-17858
> URL: https://issues.apache.org/jira/browse/HIVE-17858
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>  Labels: mm-gap-1
> Attachments: HIVE-17858.01.patch, HIVE-17858.patch
>
>
> mm_all test no longer runs on LLAP; if it's executed in LLAP, one can see 
> that some union cases no longer work.
> Queries on partunion_mm, skew_dp_union_mm produce no results.
> I'm not sure what part of "integration" broke it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17895) Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)

2017-10-25 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219465#comment-16219465
 ] 

Matt McCline commented on HIVE-17895:
-

I usually target 'vector.*', but there are a number of vectorized Q files that 
don't have the 'vector' prefix, so for complete coverage you have to run all of them.

> Vectorization: Wrong results for schema_evol_text_vec_table.q (LLAP)
> 
>
> Key: HIVE-17895
> URL: https://issues.apache.org/jira/browse/HIVE-17895
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
>
> NonVec: 103   NULL0.0 NULLoriginal
> Vec: 103  NULLNULLNULLoriginal





[jira] [Updated] (HIVE-17765) expose Hive keywords

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17765:

Fix Version/s: 2.4.0

Also branch-2


> expose Hive keywords 
> -
>
> Key: HIVE-17765
> URL: https://issues.apache.org/jira/browse/HIVE-17765
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, 
> HIVE-17765.03.patch, HIVE-17765.nogen.patch, HIVE-17765.patch
>
>
> This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on 
> SQL capabilities of Hive





[jira] [Updated] (HIVE-17765) expose Hive keywords

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17765:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> expose Hive keywords 
> -
>
> Key: HIVE-17765
> URL: https://issues.apache.org/jira/browse/HIVE-17765
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 3.0.0
>
> Attachments: HIVE-17765.01.patch, HIVE-17765.02.patch, 
> HIVE-17765.03.patch, HIVE-17765.nogen.patch, HIVE-17765.patch
>
>
> This could be useful e.g. for BI tools (via ODBC/JDBC drivers) to decide on 
> SQL capabilities of Hive





[jira] [Updated] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16827:

Attachment: HIVE-16827.05wip07.patch

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip01.patch, HIVE-16827.04wip02.patch, 
> HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, HIVE-16827.04wip05.patch, 
> HIVE-16827.04wip06.patch, HIVE-16827.04wip07.patch, HIVE-16827.04wip08.patch, 
> HIVE-16827.04wip09.patch, HIVE-16827.04wip10.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.05wip02.patch, HIVE-16827.05wip03.patch, HIVE-16827.05wip04.patch, 
> HIVE-16827.05wip05.patch, HIVE-16827.05wip06.patch, HIVE-16827.05wip07.patch, 
> HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only or column 
> stats only or both.





[jira] [Commented] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-25 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219347#comment-16219347
 ] 

Naveen Gangam commented on HIVE-17891:
--

+1 Looks good to me.

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17891.01.patch, HIVE-17891.02.patch
>
>
> HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the 
> script will fail for older versions of postgres.





[jira] [Commented] (HIVE-17881) LLAP: Text cache NPE when hive.llap.io.memory.mode=none

2017-10-25 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219343#comment-16219343
 ] 

Sergey Shelukhin commented on HIVE-17881:
-

I don't recall why I added this setting but it was a mistake... it's not really 
a good use case and just complicates the code. For testing it should be 
possible to flush the cache manually or something. 

> LLAP: Text cache NPE when hive.llap.io.memory.mode=none
> ---
>
> Key: HIVE-17881
> URL: https://issues.apache.org/jira/browse/HIVE-17881
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> With LLAP IO enabled and hive.llap.io.memory.mode set to none, the text cache 
> throws an NPE for the following query:
> {code}
> select t1.k,t1.v from src t1 join src t2 on t1.k>=t2.k;
> {code}
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.readFileWithCache(SerDeEncodedDataReader.java:763)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.performDataRead(SerDeEncodedDataReader.java:668)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:259)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:256)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:256)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:107)
> {code}





[jira] [Updated] (HIVE-17881) LLAP: Text cache NPE when hive.llap.io.memory.mode=none

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17881:

Summary: LLAP: Text cache NPE when hive.llap.io.memory.mode=none  (was: 
LLAP: Text cache NPE)

> LLAP: Text cache NPE when hive.llap.io.memory.mode=none
> ---
>
> Key: HIVE-17881
> URL: https://issues.apache.org/jira/browse/HIVE-17881
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Prasanth Jayachandran
>
> With LLAP IO enabled and hive.llap.io.memory.mode set to none, the text cache 
> throws an NPE for the following query:
> {code}
> select t1.k,t1.v from src t1 join src t2 on t1.k>=t2.k;
> {code}
> {code}
> Caused by: java.lang.NullPointerException
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.readFileWithCache(SerDeEncodedDataReader.java:763)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.performDataRead(SerDeEncodedDataReader.java:668)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:259)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader$5.run(SerDeEncodedDataReader.java:256)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1889)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:256)
>   at 
> org.apache.hadoop.hive.llap.io.encoded.SerDeEncodedDataReader.callInternal(SerDeEncodedDataReader.java:107)
> {code}





[jira] [Updated] (HIVE-17698) FileSinkDesk.getMergeInputDirName() uses stmtId=0

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-17698:

Attachment: HIVE-17698.01.patch

For HiveQA again

> FileSinkDesk.getMergeInputDirName() uses stmtId=0
> -
>
> Key: HIVE-17698
> URL: https://issues.apache.org/jira/browse/HIVE-17698
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17698.01.patch, HIVE-17698.patch, HIVE-17698.patch
>
>
> This is certainly wrong for multi-statement transactions, but it may also 
> affect writes from Union All queries if these are made to follow the full 
> ACID convention:
> _return new Path(root, AcidUtils.deltaSubdir(txnId, txnId, 0));_
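To make the concern concrete, here is a minimal sketch of how a hard-coded statement id leads to colliding directory names. The helper below is a hypothetical stand-in for AcidUtils.deltaSubdir, and the zero-padded delta_<min>_<max>_<stmt> layout is an assumption made for illustration rather than a copy of the Hive code.

```java
// Hypothetical stand-in for AcidUtils.deltaSubdir; the padding widths are assumed.
final class DeltaDirSketch {
    static String deltaSubdir(long minTxnId, long maxTxnId, int stmtId) {
        return String.format("delta_%07d_%07d_%04d", minTxnId, maxTxnId, stmtId);
    }

    public static void main(String[] args) {
        long txnId = 5;
        // Two statements of one transaction that both pass stmtId=0 share a name.
        System.out.println(deltaSubdir(txnId, txnId, 0));
        System.out.println(deltaSubdir(txnId, txnId, 0));
        // Passing the real statement id keeps the directories apart.
        System.out.println(deltaSubdir(txnId, txnId, 1));
    }
}
```

Under these assumptions, any caller that always passes 0 cannot distinguish the output of different statements within the same transaction.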





[jira] [Commented] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-25 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219311#comment-16219311
 ] 

Vihang Karajgaonkar commented on HIVE-17891:


[~ngangam] Can you please take a look? I will wait for precommit to complete 
but I don't think it tests schema changes anyway currently.

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17891.01.patch, HIVE-17891.02.patch
>
>
> HIVE-13076 adds a new table to the schema but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add the new table. The issue is that the {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards. So the 
> script will fail for older versions of postgres.
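As an illustration of the version boundary described above, the following sketch decides whether a server version is new enough for the IF NOT EXISTS clause. The class and method names are hypothetical, and the parser assumes a plain dotted numeric version string such as "9.0.4"; it is not part of the Hive schema tooling.

```java
// Hypothetical helper: checks whether a PostgreSQL version string is new enough
// for CREATE TABLE IF NOT EXISTS, which is available from 9.1 onwards.
final class PostgresVersionSketch {
    static boolean supportsCreateTableIfNotExists(String version) {
        // Assumes a plain dotted numeric string, e.g. "9.0.4" or "10.3".
        String[] parts = version.trim().split("\\.");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major > 9 || (major == 9 && minor >= 1);
    }
}
```

A script that must run on older servers would need a different approach for the new table, since the clause is rejected by the parser there.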





[jira] [Updated] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true

2017-10-25 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17764:
---
   Resolution: Fixed
Fix Version/s: 2.4.0
   Status: Resolved  (was: Patch Available)

> alter view fails when hive.metastore.disallow.incompatible.col.type.changes 
> set to true
> ---
>
> Key: HIVE-17764
> URL: https://issues.apache.org/jira/browse/HIVE-17764
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17764-branch-2.01.patch, HIVE17764.1.patch, 
> HIVE17764.2.patch
>
>
> A view is a virtual structure that derives its type information from the 
> table(s) the view is based on. If the view definition is altered, the 
> corresponding column types should be updated. Whether the new types are 
> compatible with the previous structure of the view is irrelevant.





[jira] [Commented] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true

2017-10-25 Thread Vihang Karajgaonkar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219306#comment-16219306
 ] 

Vihang Karajgaonkar commented on HIVE-17764:


There was no code change required for the branch-2 patch. A new line was needed 
in q.out file so that diff would match. Merged to branch-2 as well. Thanks for 
your contribution [~janulatha]

> alter view fails when hive.metastore.disallow.incompatible.col.type.changes 
> set to true
> ---
>
> Key: HIVE-17764
> URL: https://issues.apache.org/jira/browse/HIVE-17764
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HIVE-17764-branch-2.01.patch, HIVE17764.1.patch, 
> HIVE17764.2.patch
>
>
> A view is a virtual structure that derives its type information from the 
> table(s) the view is based on. If the view definition is altered, the 
> corresponding column types should be updated. Whether the new types are 
> compatible with the previous structure of the view is irrelevant.





[jira] [Commented] (HIVE-17748) ReplCopyTask doesn't support multi-file CopyWork

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219299#comment-16219299
 ] 

Eugene Koifman commented on HIVE-17748:
---

+1

> ReplCopyTask doesn't support multi-file CopyWork
> 
>
> Key: HIVE-17748
> URL: https://issues.apache.org/jira/browse/HIVE-17748
> Project: Hive
>  Issue Type: Sub-task
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17748.patch, HIVE-17748.patch
>
>
> has 
> {noformat}
>   Path fromPath = work.getFromPaths()[0];
>   toPath = work.getToPaths()[0];
> {noformat}
> should this throw if from/to paths have > 1 element?
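One way to answer the question is a fail-fast guard, sketched below. The helper name is hypothetical and it works on any array type, so it is independent of the Hadoop Path class; whether ReplCopyTask should throw or instead iterate over all paths is a design decision this sketch does not settle.

```java
import java.util.Arrays;

// Hypothetical guard: makes the single-file assumption explicit instead of
// silently ignoring every element after index 0.
final class SingleFileCopyGuard {
    // Returns the only element, or throws if the array does not hold exactly one.
    static <T> T onlyElement(T[] paths, String label) {
        if (paths == null || paths.length != 1) {
            throw new IllegalStateException(
                "Expected exactly one " + label + " path but got " + Arrays.toString(paths));
        }
        return paths[0];
    }
}
```

With such a guard, the two assignments quoted above would become calls like onlyElement(work.getFromPaths(), "from"), and a multi-file CopyWork would fail loudly rather than copy only the first file.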





[jira] [Commented] (HIVE-17750) add a flag to automatically create most tables as MM

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219286#comment-16219286
 ] 

Eugene Koifman commented on HIVE-17750:
---

+1

> add a flag to automatically create most tables as MM 
> -
>
> Key: HIVE-17750
> URL: https://issues.apache.org/jira/browse/HIVE-17750
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-17750.patch
>
>
> After merge we are going to do another round of gap identification... similar 
> to HIVE-14990.
> However the approach used there is a huge PITA. It'd be much better to make 
> tables MM by default at create time, not pretend they are MM at check time, 
> from the perspective of spurious error elimination.





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-10-25 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Status: Patch Available  (was: Open)

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks
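The benefit of logging parameterization can be sketched as follows. To stay dependency-free, this example uses java.util.logging as a stand-in; the logging framework actually used by the Utilities class may differ, and all class and method names here are hypothetical.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class LoggingParamSketch {
    private static final Logger LOG = Logger.getLogger(LoggingParamSketch.class.getName());

    static {
        // Pin the level so that FINE messages are disabled in this sketch.
        LOG.setLevel(Level.INFO);
    }

    // Counts how often toString() runs, to make the formatting cost visible.
    static final class Expensive {
        int toStringCalls = 0;

        @Override
        public String toString() {
            toStringCalls++;
            return "expensive-value";
        }
    }

    // Eager style: the message string is built even though FINE is disabled.
    static void logWithConcatenation(Expensive value) {
        LOG.fine("Processing " + value);
    }

    // Parameterized style: the argument is only formatted if the record is published.
    static void logWithParameter(Expensive value) {
        LOG.log(Level.FINE, "Processing {0}", value);
    }
}
```

With the level pinned at INFO, the concatenating variant still calls toString() once, while the parameterized variant never does.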





[jira] [Assigned] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-10-25 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR reassigned HIVE-17901:
--

Assignee: BELUGA BEHR

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Updated] (HIVE-17901) org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and More

2017-10-25 Thread BELUGA BEHR (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

BELUGA BEHR updated HIVE-17901:
---
Attachment: HIVE-17901.1.patch

> org.apache.hadoop.hive.ql.exec.Utilities - Use Logging Parameterization and 
> More
> 
>
> Key: HIVE-17901
> URL: https://issues.apache.org/jira/browse/HIVE-17901
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Affects Versions: 3.0.0
>Reporter: BELUGA BEHR
>Priority: Minor
> Attachments: HIVE-17901.1.patch
>
>
> {{org.apache.hadoop.hive.ql.exec.Utilities}}
> # Remove unused imports
> # Remove unused variables
> # Modify logging to use logging parameterization
> # Other small tweaks





[jira] [Commented] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219267#comment-16219267
 ] 

Hive QA commented on HIVE-16827:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893956/HIVE-16827.05wip06.patch

{color:green}SUCCESS:{color} +1 due to 17 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 97 failed/errored test(s), 11331 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_2] 
(batchId=242)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=242)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_join1] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_udaf_percentile_approx_23]
 (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_date] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_precision] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf2] 
(batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing_no_cbo]
 (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[drop_table_with_index] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_cond_pushdown2] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[gen_udf_example_add10] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby10] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_id3] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping]
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_serde] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input22] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input3_limit] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nullscript] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge5] (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge6] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat1] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ptf_matchpath] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_25] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_2] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_3] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_in_having] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_percentile_approx_23]
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_trunc_number] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_const] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr_2] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_like_2] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_expressions]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_multipartitioning]
 (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_order_null]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_range_multiorder]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_rank] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_streaming]
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_windowspec]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] 
(batchId=23)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer4]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[count] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1]
 

[jira] [Resolved] (HIVE-17771) Implement commands to manage resource plan

2017-10-25 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-17771.
-
Resolution: Fixed

Fixed. In future, it's probably better to handle this as a separate jira

> Implement commands to manage resource plan
> --
>
> Key: HIVE-17771
> URL: https://issues.apache.org/jira/browse/HIVE-17771
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Harish Jaiprakash
>Assignee: Harish Jaiprakash
>  Labels: TODOC3.0
> Fix For: 3.0.0
>
> Attachments: HIVE-17771.01.patch, HIVE-17771.02.patch, 
> HIVE-17771.03.patch, HIVE-17771.04.patch
>
>
> Please see parent jira about llap workload management.
> This jira is to implement create and show resource plan commands in hive to 
> configure resource plans for llap workload. The following are the proposed 
> commands implemented as part of the jira:
> CREATE RESOURCE PLAN plan_name WITH QUERY_PARALLELISM parallelism;
> SHOW RESOURCE PLAN plan_name;
> SHOW RESOURCE PLANS;
> ALTER RESOURCE PLAN plan_name SET QUERY_PARALLELISM = parallelism;
> ALTER RESOURCE PLAN plan_name RENAME TO new_name;
> ALTER RESOURCE PLAN plan_name ACTIVATE;
> ALTER RESOURCE PLAN plan_name DISABLE;
> ALTER RESOURCE PLAN plan_name ENABLE;
> DROP RESOURCE PLAN plan_name;
> It will be followed up with more jiras to manage pools, triggers and copy 
> resource plans.





[jira] [Updated] (HIVE-15305) Add tests for METASTORE_EVENT_LISTENERS

2017-10-25 Thread Mohit Sabharwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mohit Sabharwal updated HIVE-15305:
---
Attachment: HIVE-15305.2.patch

> Add tests for METASTORE_EVENT_LISTENERS
> ---
>
> Key: HIVE-15305
> URL: https://issues.apache.org/jira/browse/HIVE-15305
> Project: Hive
>  Issue Type: Bug
>Reporter: Mohit Sabharwal
>Assignee: Mohit Sabharwal
> Attachments: HIVE-15305.1.patch, HIVE-15305.1.patch, 
> HIVE-15305.1.patch, HIVE-15305.1.patch, HIVE-15305.1.patch, 
> HIVE-15305.1.patch, HIVE-15305.2.patch, HIVE-15305.patch
>
>
> HIVE-15232 reused TestDbNotificationListener to test 
> METASTORE_TRANSACTIONAL_EVENT_LISTENERS and removed unit testing of 
> METASTORE_EVENT_LISTENERS config. We should test both. 





[jira] [Updated] (HIVE-17764) alter view fails when hive.metastore.disallow.incompatible.col.type.changes set to true

2017-10-25 Thread Janaki Lahorani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Janaki Lahorani updated HIVE-17764:
---
Attachment: HIVE-17764-branch-2.01.patch

> alter view fails when hive.metastore.disallow.incompatible.col.type.changes 
> set to true
> ---
>
> Key: HIVE-17764
> URL: https://issues.apache.org/jira/browse/HIVE-17764
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.1
>Reporter: Janaki Lahorani
>Assignee: Janaki Lahorani
> Fix For: 3.0.0
>
> Attachments: HIVE-17764-branch-2.01.patch, HIVE17764.1.patch, 
> HIVE17764.2.patch
>
>
> A view is a virtual structure that derives its type information from the 
> table(s) the view is based on. If the view definition is altered, the 
> corresponding column types should be updated. Whether the new types are 
> compatible with the previous structure of the view is irrelevant.





[jira] [Commented] (HIVE-17820) Add buckets.q test for blobstores

2017-10-25 Thread Ran Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219172#comment-16219172
 ] 

Ran Gu commented on HIVE-17820:
---

Updated CR

> Add buckets.q test for blobstores
> -
>
> Key: HIVE-17820
> URL: https://issues.apache.org/jira/browse/HIVE-17820
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
>Assignee: Ran Gu
> Attachments: HIVE-17820.patch
>
>






[jira] [Updated] (HIVE-17820) Add buckets.q test for blobstores

2017-10-25 Thread Ran Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Gu updated HIVE-17820:
--
Attachment: (was: HIVE-17820.patch)

> Add buckets.q test for blobstores
> -
>
> Key: HIVE-17820
> URL: https://issues.apache.org/jira/browse/HIVE-17820
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
>Assignee: Ran Gu
> Attachments: HIVE-17820.patch
>
>






[jira] [Updated] (HIVE-17820) Add buckets.q test for blobstores

2017-10-25 Thread Ran Gu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Gu updated HIVE-17820:
--
Attachment: HIVE-17820.patch

Updated patch to use "DROP TABLE IF EXISTS"

> Add buckets.q test for blobstores
> -
>
> Key: HIVE-17820
> URL: https://issues.apache.org/jira/browse/HIVE-17820
> Project: Hive
>  Issue Type: Test
>  Components: Tests
>Reporter: Ran Gu
>Assignee: Ran Gu
> Attachments: HIVE-17820.patch
>
>






[jira] [Commented] (HIVE-17900) analyze stats on columns triggered by Compactor generates malformed SQL with > 1 partition column

2017-10-25 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219113#comment-16219113
 ] 

Eugene Koifman commented on HIVE-17900:
---

Worker.StatsUpdater.gatherStats() is missing a "," when building partition 
clause
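A minimal sketch of the intended behaviour, assuming the partition name has the form shown in the log (month=201608/dates=9) and that every value is quoted as a string. The class and method are hypothetical, values are not escaped, and this is not the code in Worker.StatsUpdater.gatherStats().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds the body of a PARTITION(...) clause.
final class PartitionClauseSketch {
    // Turns "month=201608/dates=9" into "month='201608',dates='9'".
    // Assumes each segment is a well-formed key=value pair.
    static String build(String partName) {
        List<String> parts = new ArrayList<>();
        for (String kv : partName.split("/")) {
            int eq = kv.indexOf('=');
            parts.add(kv.substring(0, eq) + "='" + kv.substring(eq + 1) + "'");
        }
        // The comma separator is the piece the comment says is missing.
        return String.join(",", parts);
    }
}
```

Without the separator the two pairs run together as month='201608'dates='9', which matches the parser complaint about input 'dates' near '201608' in the log below.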

> analyze stats on columns triggered by Compactor generates malformed SQL with 
> > 1 partition column
> -
>
> Key: HIVE-17900
> URL: https://issues.apache.org/jira/browse/HIVE-17900
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> {noformat}
> 2017-10-16 09:01:51,255 ERROR [haddl0007.mycenterpointenergy.com-51]: 
> ql.Driver (SessionState.java:printError(993)) - FAILED: ParseException line 
> 1:70 mismatched input 'dates' expecting ) near ''201608'' in analyze statement
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:70 mismatched input 
> 'dates' expecting ) near ''201608'' in analyze statement
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:205)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1221)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1158)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1148)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:294)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:168)
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(177)) -  start=1508162511253 end=1508162511255 duration=2 
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> ql.Driver (Driver.java:compile(559)) - We are resetting the hadoop caller 
> context to
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(149)) -  method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(177)) -  method=releaseLocks start=1508162511255 end=1508162511255 duration=0 
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-10-16 09:01:51,256 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> tez.TezSessionPoolManager (TezSessionPoolManager.java:close(183)) - Closing 
> tez session default? false
> 2017-10-16 09:01:51,256 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> tez.TezSessionState (TezSessionState.java:close(294)) - Closing Tez Session
> 2017-10-16 09:01:51,256 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> client.TezClient (TezClient.java:stop(518)) - Shutting down Tez Session, 
> sessionName=HIVE-ae652f03-72c7-4ca8-a2d8-05dcc7392f4f, 
> applicationId=application_1507779664083_0159
> 2017-10-16 09:01:51,279 ERROR [haddl0007.mycenterpointenergy.com-51]: 
> compactor.Worker (Worker.java:run(191)) - Caught exception while trying to 
> compact 
> id:3723,dbname:mobiusad,tableName:zces_img_data_small_pt,partName:month=201608/dates=9,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.io.IOException: Could not 
> update stats for table mobiusad.zces_img_data_small_pt/month=201608/dates=9 
> due to: (4,FAILED: ParseException line 1:70 mismatched input 'dates' 
> expecting ) near ''201608'' in analyze statement,42000line 1:70 mismatched 
> input 'dates' expecting ) near ''201608'' in analyze statement)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:296)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:168)
> {noformat}





[jira] [Assigned] (HIVE-17900) analyze stats on columns triggered by Compactor generates malformed SQL with > 1 partition column

2017-10-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman reassigned HIVE-17900:
-

Assignee: Eugene Koifman

> analyze stats on columns triggered by Compactor generates malformed SQL with 
> > 1 partition column
> -
>
> Key: HIVE-17900
> URL: https://issues.apache.org/jira/browse/HIVE-17900
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>
> {noformat}
> 2017-10-16 09:01:51,255 ERROR [haddl0007.mycenterpointenergy.com-51]: 
> ql.Driver (SessionState.java:printError(993)) - FAILED: ParseException line 
> 1:70 mismatched input 'dates' expecting ) near ''201608'' in analyze statement
> org.apache.hadoop.hive.ql.parse.ParseException: line 1:70 mismatched input 
> 'dates' expecting ) near ''201608'' in analyze statement
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:205)
> at 
> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:438)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:321)
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1221)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1262)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1158)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1148)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:294)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:168)
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(177)) -  start=1508162511253 end=1508162511255 duration=2 
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> ql.Driver (Driver.java:compile(559)) - We are resetting the hadoop caller 
> context to
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> log.PerfLogger (PerfLogger.java:PerfLogBegin(149)) -  method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
> 2017-10-16 09:01:51,255 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> log.PerfLogger (PerfLogger.java:PerfLogEnd(177)) -  method=releaseLocks start=1508162511255 end=1508162511255 duration=0 
> from=org.apache.hadoop.hive.ql.Driver>
> 2017-10-16 09:01:51,256 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> tez.TezSessionPoolManager (TezSessionPoolManager.java:close(183)) - Closing 
> tez session default? false
> 2017-10-16 09:01:51,256 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> tez.TezSessionState (TezSessionState.java:close(294)) - Closing Tez Session
> 2017-10-16 09:01:51,256 INFO  [haddl0007.mycenterpointenergy.com-51]: 
> client.TezClient (TezClient.java:stop(518)) - Shutting down Tez Session, 
> sessionName=HIVE-ae652f03-72c7-4ca8-a2d8-05dcc7392f4f, 
> applicationId=application_1507779664083_0159
> 2017-10-16 09:01:51,279 ERROR [haddl0007.mycenterpointenergy.com-51]: 
> compactor.Worker (Worker.java:run(191)) - Caught exception while trying to 
> compact 
> id:3723,dbname:mobiusad,tableName:zces_img_data_small_pt,partName:month=201608/dates=9,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestTxnId:0.
>   Marking failed to avoid repeated failures, java.io.IOException: Could not 
> update stats for table mobiusad.zces_img_data_small_pt/month=201608/dates=9 
> due to: (4,FAILED: ParseException line 1:70 mismatched input 'dates' 
> expecting ) near ''201608'' in analyze statement,42000line 1:70 mismatched 
> input 'dates' expecting ) near ''201608'' in analyze statement)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.Worker$StatsUpdater.gatherStats(Worker.java:296)
> at 
> org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:265)
> at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:168)
> {noformat}





[jira] [Commented] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219046#comment-16219046
 ] 

Hive QA commented on HIVE-16827:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893956/HIVE-16827.05wip06.patch

{color:green}SUCCESS:{color} +1 due to 17 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 97 failed/errored test(s), 11331 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_2] 
(batchId=242)
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[smb_mapjoin_3] 
(batchId=242)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_join1] 
(batchId=69)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[cbo_rp_udaf_percentile_approx_23]
 (batchId=34)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[compute_stats_date] 
(batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_precision] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf2] 
(batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[decimal_udf] (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_windowing_no_cbo]
 (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[drop_table_with_index] 
(batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[filter_cond_pushdown2] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[gen_udf_example_add10] 
(batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby10] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_id3] 
(batchId=26)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_grouping_sets_grouping]
 (batchId=3)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[index_serde] (batchId=16)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input22] (batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[input3_limit] 
(batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[llap_acid] (batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nullscript] (batchId=35)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge5] (batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge6] (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat1] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge_incompat2] 
(batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ptf_matchpath] 
(batchId=14)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_25] 
(batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_2] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[smb_mapjoin_3] 
(batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[subquery_in_having] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udaf_percentile_approx_23]
 (batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[udf_trunc_number] 
(batchId=11)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_const] 
(batchId=60)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_if_expr_2] 
(batchId=59)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_like_2] 
(batchId=78)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_expressions]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_multipartitioning]
 (batchId=43)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_order_null]
 (batchId=36)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_range_multiorder]
 (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_rank] 
(batchId=56)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_streaming]
 (batchId=50)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_windowing_windowspec]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_mapjoin2] 
(batchId=23)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] 
(batchId=145)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[correlationoptimizer4]
 (batchId=149)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[count] 
(batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization2]
 (batchId=148)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization]
 (batchId=159)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[dynpart_sort_optimization_acid]
 

[jira] [Updated] (HIVE-17891) HIVE-13076 uses create table if not exists for the postgres script

2017-10-25 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17891:
---
Attachment: HIVE-17891.02.patch

precommit didn't trigger for this patch. Attaching again.

> HIVE-13076 uses create table if not exists for the postgres script
> --
>
> Key: HIVE-17891
> URL: https://issues.apache.org/jira/browse/HIVE-17891
> Project: Hive
>  Issue Type: Bug
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17891.01.patch, HIVE-17891.02.patch
>
>
> HIVE-13076 adds a new table to the schema, but the patch script uses {{CREATE 
> TABLE IF NOT EXISTS}} syntax to add it. The {{IF 
> NOT EXISTS}} clause is only available from postgres 9.1 onwards, so the 
> script will fail on older versions of postgres.
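
One way a script could stay usable on older servers is to pick the DDL form from the server version. The following is only a minimal sketch of that idea and not necessarily what the attached patch does; the helper name create_table_ddl, the table name and the column list are hypothetical, and the only fact taken from the description is the 9.1 cutoff.

```python
def create_table_ddl(server_version, table, columns_sql):
    """Return a CREATE TABLE statement suited to the given postgres version.

    server_version is a (major, minor) tuple, e.g. (9, 0).
    IF NOT EXISTS is emitted only for 9.1 and later, per the description above.
    """
    if tuple(server_version) >= (9, 1):
        return 'CREATE TABLE IF NOT EXISTS %s (%s)' % (table, columns_sql)
    # Older servers reject IF NOT EXISTS, so fall back to a plain CREATE TABLE;
    # the caller must then make sure the table does not exist yet.
    return 'CREATE TABLE %s (%s)' % (table, columns_sql)


# Hypothetical table and column, for illustration only.
new_ddl = create_table_ddl((9, 4), '"EXAMPLE_TABLE"', '"ID" bigint NOT NULL')
old_ddl = create_table_ddl((9, 0), '"EXAMPLE_TABLE"', '"ID" bigint NOT NULL')
```

With the plain form, rerunning the script against a schema that already has the table would fail, so an upgrade script that takes this route has to run exactly once.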



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-17874) Parquet vectorization fails on tables with complex columns when there are no projected columns

2017-10-25 Thread Vihang Karajgaonkar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vihang Karajgaonkar updated HIVE-17874:
---
Attachment: HIVE-17874.05.patch

Fixed TestVectorizedColumnReader test failure.

> Parquet vectorization fails on tables with complex columns when there are no 
> projected columns
> --
>
> Key: HIVE-17874
> URL: https://issues.apache.org/jira/browse/HIVE-17874
> Project: Hive
>  Issue Type: Sub-task
>Affects Versions: 2.2.0
>Reporter: Vihang Karajgaonkar
>Assignee: Vihang Karajgaonkar
> Attachments: HIVE-17874.01-branch-2.patch, HIVE-17874.01.patch, 
> HIVE-17874.02.patch, HIVE-17874.03.patch, HIVE-17874.04.patch, 
> HIVE-17874.05.patch
>
>
> When a parquet table contains an unsupported type like {{Map}}, {{LIST}} or 
> {{UNION}}, simple queries like {{select count(*) from table}} fail with an 
> {{unsupported type exception}} even though the vectorized reader doesn't really 
> need to read the complex type into batches.





[jira] [Updated] (HIVE-17842) Run checkstyle on ptest2 module with proper configuration

2017-10-25 Thread Peter Vary (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-17842:
--
   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for your contribution [~szita]!

> Run checkstyle on ptest2 module with proper configuration
> -
>
> Key: HIVE-17842
> URL: https://issues.apache.org/jira/browse/HIVE-17842
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Fix For: 3.0.0
>
> Attachments: HIVE-17842.0.patch
>
>
> Maven module ptest2 is not connected to the Hive root pom, so if someone 
> (or an automated Yetus check) runs {{mvn checkstyle}}, it will not consider 
> Hive-specific checkstyle settings (e.g. it validates row lengths against 80, not 
> 100).
> We need to make sure the ptest2 pom has the proper checkstyle configuration.





[jira] [Updated] (HIVE-17458) VectorizedOrcAcidRowBatchReader doesn't handle 'original' files

2017-10-25 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-17458:
--
Attachment: HIVE-17458.09.patch

> VectorizedOrcAcidRowBatchReader doesn't handle 'original' files
> ---
>
> Key: HIVE-17458
> URL: https://issues.apache.org/jira/browse/HIVE-17458
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.2.0
>Reporter: Eugene Koifman
>Assignee: Eugene Koifman
>Priority: Critical
> Attachments: HIVE-17458.01.patch, HIVE-17458.02.patch, 
> HIVE-17458.03.patch, HIVE-17458.04.patch, HIVE-17458.05.patch, 
> HIVE-17458.06.patch, HIVE-17458.07.patch, HIVE-17458.07.patch, 
> HIVE-17458.08.patch, HIVE-17458.09.patch
>
>
> VectorizedOrcAcidRowBatchReader will not be used for original files.  This 
> will likely look like a perf regression when converting a table from non-acid 
> to acid until it runs through a major compaction.
> With Load Data support, if large files are added via Load Data, the read ops 
> will not vectorize until major compaction.  
> There is no reason why this should be the case.  Just like 
> OrcRawRecordMerger, VectorizedOrcAcidRowBatchReader can look at the other 
> files in the logical tranche/bucket and calculate the offset for the RowBatch 
> of the split.  (Presumably getRecordReader().getRowNumber() works the same in 
> vector mode).
> In this case we don't even need OrcSplit.isOriginal() - the reader can infer 
> it from file path... which in particular simplifies 
> OrcInputFormat.determineSplitStrategies()
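
The offset calculation described above can be pictured as a running sum over the files that precede the split's file in the same bucket. This is a rough sketch only, not the reader's actual code: the function name first_row_offset, the file names, the row counts and the choice to order files by name are all assumptions made for illustration.

```python
def first_row_offset(files_in_bucket, target):
    """Return the synthetic row id of the first row of target.

    files_in_bucket maps file name -> number of rows. Files are ordered by
    name (an assumption of this sketch), and rows are numbered consecutively
    across that ordering.
    """
    if target not in files_in_bucket:
        raise KeyError(target)
    offset = 0
    for name in sorted(files_in_bucket):
        if name == target:
            return offset
        # Every row of an earlier file comes before the target's first row.
        offset += files_in_bucket[name]


# Hypothetical bucket contents: file name -> row count.
bucket = {"000000_0": 100, "000000_0_copy_1": 40, "000000_0_copy_2": 7}
offset = first_row_offset(bucket, "000000_0_copy_2")
```

A split starting at row r inside that file would then begin at offset + r, which is the kind of value the row number of the underlying record reader would be added to.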





[jira] [Updated] (HIVE-17617) Rollup of an empty resultset should contain the grouping of the empty grouping set

2017-10-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-17617:

   Resolution: Fixed
Fix Version/s: 3.0.0
   Status: Resolved  (was: Patch Available)

test failures unrelated.
pushed to master. Thank you Ashutosh for the review!

> Rollup of an empty resultset should contain the grouping of the empty 
> grouping set
> --
>
> Key: HIVE-17617
> URL: https://issues.apache.org/jira/browse/HIVE-17617
> Project: Hive
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
> Fix For: 3.0.0
>
> Attachments: HIVE-17617.01.patch, HIVE-17617.03.patch, 
> HIVE-17617.04.patch, HIVE-17617.05.patch, HIVE-17617.06.patch, 
> HIVE-17617.07.patch, HIVE-17617.07.patch, HIVE-17617.08.patch
>
>
> running
> {code}
> drop table if exists tx1;
> create table tx1 (a integer,b integer,c integer);
> select  sum(c),
> grouping(b)
> from    tx1
> group by rollup (b);
> {code}
> returns 0 rows; however 
> according to the standard:
> The  is regarded as the shortest such initial sublist. 
> For example, “ROLLUP ( (A, B), (C, D) )”
> is equivalent to “GROUPING SETS ( (A, B, C, D), (A, B), () )”.
> so I think the totals row (the grouping for {{()}}) should be present; psql 
> returns it.
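
To make the quoted equivalence concrete, here is a small sketch that expands a ROLLUP into grouping sets and evaluates SUM over them in plain Python. It is an illustration of the semantics argued for above, not Hive code; the function names rollup_grouping_sets and sum_by_grouping_sets are made up for this example.

```python
def rollup_grouping_sets(groups):
    """Expand ROLLUP(g1, ..., gn) into its equivalent list of grouping sets.

    Each element of groups is a tuple of column names. The result holds every
    initial sublist of groups, longest first, ending with the empty set ().
    """
    sets = []
    for n in range(len(groups), -1, -1):
        flattened = tuple(col for group in groups[:n] for col in group)
        sets.append(flattened)
    return sets


def sum_by_grouping_sets(rows, grouping_sets, value_col):
    """Compute SUM(value_col) per grouping set over rows (a list of dicts).

    The empty grouping set always yields exactly one totals row, even when
    there are no input rows; its SUM is None (SQL NULL) in that case.
    """
    out = []
    for gs in grouping_sets:
        totals = {}
        if not gs:
            totals[()] = None  # the single group of the empty grouping set
        for row in rows:
            key = tuple(row[c] for c in gs)
            totals.setdefault(key, None)
            value = row[value_col]
            if value is not None:  # SUM ignores NULL inputs
                totals[key] = value if totals[key] is None else totals[key] + value
        out.extend((gs, key, total) for key, total in totals.items())
    return out


example = rollup_grouping_sets([("A", "B"), ("C", "D")])
empty_result = sum_by_grouping_sets([], rollup_grouping_sets([("b",)]), "c")
```

Here example reproduces the three grouping sets from the quoted text, and empty_result holds a single row for {{()}} with a NULL sum, which is the totals row the query on the empty tx1 table is expected to return.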





[jira] [Updated] (HIVE-16827) Merge stats task and column stats task into a single task

2017-10-25 Thread Zoltan Haindrich (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-16827:

Attachment: HIVE-16827.05wip06.patch

> Merge stats task and column stats task into a single task
> -
>
> Key: HIVE-16827
> URL: https://issues.apache.org/jira/browse/HIVE-16827
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Pengcheng Xiong
>Assignee: Zoltan Haindrich
> Attachments: HIVE-16827.01.patch, HIVE-16827.02.patch, 
> HIVE-16827.03.patch, HIVE-16827.04wip01.patch, HIVE-16827.04wip02.patch, 
> HIVE-16827.04wip03.patch, HIVE-16827.04wip04.patch, HIVE-16827.04wip05.patch, 
> HIVE-16827.04wip06.patch, HIVE-16827.04wip07.patch, HIVE-16827.04wip08.patch, 
> HIVE-16827.04wip09.patch, HIVE-16827.04wip10.patch, HIVE-16827.05wip01.patch, 
> HIVE-16827.05wip02.patch, HIVE-16827.05wip03.patch, HIVE-16827.05wip04.patch, 
> HIVE-16827.05wip05.patch, HIVE-16827.05wip06.patch, HIVE-16827.4.patch
>
>
> Within the task, we can specify whether to compute basic stats only or column 
> stats only or both.





[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Santhosh B Gowda
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>
> Replicating a drop partition event on a table partitioned on a 
> timestamp type column fails in REPL LOAD.
> *Scenario:*
> 1. create table with partition on timestamp column.
> 2. bootstrap dump/load.
> 3. insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. drop the same partition(p="2001-11-09 00:00:00.0").
> 5. incremental dump/load
> -- REPL LOAD throws the exception below
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at 
> org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at 
> org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218781#comment-16218781
 ] 

ASF GitHub Bot commented on HIVE-17887:
---

Github user sankarh closed the pull request at:

https://github.com/apache/hive/pull/267







[jira] [Commented] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Sankar Hariappan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218779#comment-16218779
 ] 

Sankar Hariappan commented on HIVE-17887:
-

Test failures are irrelevant to the patch.
Committed the patch to master!
Thanks [~thejas] for the review!






[jira] [Commented] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218700#comment-16218700
 ] 

Hive QA commented on HIVE-17887:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893937/HIVE-17887.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11323 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=205)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=222)
org.apache.hadoop.hive.ql.parse.authorization.plugin.sqlstd.TestOperation2Privilege.checkHiveOperationTypeMatch
 (batchId=270)
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable (batchId=232)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=229)
org.apache.hive.jdbc.TestTriggersWorkloadManager.testTriggerHighShuffleBytes 
(batchId=229)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7484/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7484/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7484/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893937 - PreCommit-HIVE-Build


[jira] [Commented] (HIVE-17842) Run checkstyle on ptest2 module with proper configuration

2017-10-25 Thread Adam Szita (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16218673#comment-16218673
 ] 

Adam Szita commented on HIVE-17842:
---

Precommit test results seem to indicate errors _unrelated_ to this patch. 
[~pvary] can you please go ahead and commit it, if you agree?

> Run checkstyle on ptest2 module with proper configuration
> -
>
> Key: HIVE-17842
> URL: https://issues.apache.org/jira/browse/HIVE-17842
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Adam Szita
>Assignee: Adam Szita
> Attachments: HIVE-17842.0.patch
>
>
> Maven module ptest2 is not connected to the Hive root pom, so if someone 
> (or an automated Yetus check) runs {{mvn checkstyle}}, it will not consider 
> Hive-specific checkstyle settings (e.g. it validates row lengths against 80, not 
> 100).
> We need to make sure the ptest2 pom has the proper checkstyle configuration.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2017-10-25 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-9447:
-
Status: Patch Available  (was: In Progress)

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0, 1.2.0, 1.0.0, 0.14.0
>Reporter: Selina Zhang
>Assignee: Adam Szita
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch, HIVE-9447.3.patch, 
> HIVE-9447.4.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Metastore needs to remove unused column descriptors when dropping/adding partitions 
> or tables. To query for the unused column descriptor, the current implementation 
> uses DataNucleus' range function, which is basically equivalent to LIMIT syntax. 
> However, Oracle does not support LIMIT, so the query is converted to  
> {quote}
> SQL> SELECT * FROM (SELECT subq.*,ROWNUM rn FROM (SELECT
> 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS
> NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,
> A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM drhcat.SDS A0 
> WHERE A0.CD_ID = ? ) subq ) WHERE  rn <= 1;
> {quote}
> Given that CD_ID is not very selective, this query may have to access a large 
> number of rows (depending on how many partitions the table has; millions of rows 
> in our case). The Metastore may become unresponsive because of this. 
> Since the Metastore only needs to know whether the specific CD_ID is referenced in 
> the SDS table and does not need to access the whole row, we can use 
> {quote}
> select count(1) from SDS where SDS.CD_ID=?
> {quote}
> CD_ID is an indexed column, so the above query will do an index range scan, which is 
> faster. 
> For other DBs that support the LIMIT syntax, such as MySQL, this problem does not 
> exist. However, the new query does no harm there.
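
The existence check proposed in the description can be sketched as follows. This is a minimal illustration, not the Metastore's implementation: SQLite stands in for Oracle, the SDS table is trimmed to three columns, and the index name, the sample rows, and the helper names (is_cd_referenced, build_demo_db) are all hypothetical. Only the query text comes from the issue.

```python
import sqlite3


def is_cd_referenced(conn, cd_id):
    """Return True if at least one SDS row references the given CD_ID."""
    # Query text taken from the issue description: only a count is fetched,
    # so no storage descriptor row has to be read in full.
    (count,) = conn.execute(
        "select count(1) from SDS where SDS.CD_ID=?", (cd_id,)
    ).fetchone()
    return count > 0


def build_demo_db():
    """Build a tiny in-memory stand-in for the SDS table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE SDS (SD_ID INTEGER PRIMARY KEY, CD_ID INTEGER, LOCATION TEXT)"
    )
    # Hypothetical index name; the issue only states that CD_ID is indexed.
    conn.execute("CREATE INDEX SDS_CD_ID_IDX ON SDS (CD_ID)")
    # Many partitions sharing one column descriptor, as in the reported case.
    rows = [(i, 1, f"/warehouse/t/part={i}") for i in range(1000)]
    conn.executemany("INSERT INTO SDS VALUES (?, ?, ?)", rows)
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(is_cd_referenced(demo, 1))  # CD_ID 1 is referenced by every row
    print(is_cd_referenced(demo, 2))  # CD_ID 2 is referenced by none
```

A column descriptor would be a candidate for removal only when the helper returns False for its CD_ID.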





[jira] [Updated] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2017-10-25 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-9447:
-
Attachment: HIVE-9447.4.patch

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Selina Zhang
>Assignee: Adam Szita
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch, HIVE-9447.3.patch, 
> HIVE-9447.4.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>


[jira] [Updated] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2017-10-25 Thread Adam Szita (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Szita updated HIVE-9447:
-
Status: In Progress  (was: Patch Available)

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 1.1.0, 1.2.0, 1.0.0, 0.14.0
>Reporter: Selina Zhang
>Assignee: Adam Szita
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch, HIVE-9447.3.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>


[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218616#comment-16218616
 ] 

Hive QA commented on HIVE-16748:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m  8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m  8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  9s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m  8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m  8s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m  8s{color} | {color:red} testutils/ptest2: The patch generated 1 new + 94 unchanged - 0 fixed = 95 total (was 94) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m  9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 11s{color} | {color:red} The patch generated 11 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}  5m  9s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux zsombor-ptest-server 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u2 (2017-06-26) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus/dev-support/hive-personality.sh |
| git revision | master / 954f832 |
| Default Java | 1.8.0_131 |
| checkstyle | http://35.199.162.129/logs/Precommit-HIVE-Test-Build-1/yetus/diff-checkstyle-testutils_ptest2.txt |
| asflicense | http://35.199.162.129/logs/Precommit-HIVE-Test-Build-1/yetus/patch-asflicense-problems.txt |
| modules | C: testutils/ptest2 U: testutils/ptest2 |
| Console output | http://35.199.162.129/logs/Precommit-HIVE-Test-Build-1/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Integreate YETUS to Pre-Commit
> --
>
> Key: HIVE-16748
> URL: https://issues.apache.org/jira/browse/HIVE-16748
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Peter Vary
>Assignee: Adam Szita
> Attachments: HIVE-16748.0.patch, dummytest.patch
>
>
> After HIVE-15051, we should automate the yetus run for the Pre-Commit tests, 
> so the results are added in comments like 
> https://issues.apache.org/jira/browse/YARN-6363?focusedCommentId=15937570&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15937570





[jira] [Commented] (HIVE-16748) Integreate YETUS to Pre-Commit

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218605#comment-16218605
 ] 

Hive QA commented on HIVE-16748:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12892602/dummytest.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/Precommit-HIVE-Test-Build/1/testReport
Console output: 
https://builds.apache.org/job/Precommit-HIVE-Test-Build/1/console
Test logs: http://35.199.162.129/logsPrecommit-HIVE-Test-Build-1/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Tests exited with: IllegalArgumentException: null
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12892602 - Precommit-HIVE-Test-Build

> Integreate YETUS to Pre-Commit
> --
>
> Key: HIVE-16748
> URL: https://issues.apache.org/jira/browse/HIVE-16748
> Project: Hive
>  Issue Type: Sub-task
>Reporter: Peter Vary
>Assignee: Adam Szita
> Attachments: HIVE-16748.0.patch, dummytest.patch
>
>


[jira] [Commented] (HIVE-9447) Metastore: inefficient Oracle query for removing unused column descriptors when add/drop table/partition

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218579#comment-16218579
 ] 

Hive QA commented on HIVE-9447:
---



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893047/HIVE-9447.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7483/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7483/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7483/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2017-10-25 13:04:39.782
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-7483/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2017-10-25 13:04:39.785
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at 954f832 HIVE-15104: Hive on Spark generate more shuffle data 
than hive on mr (Rui reviewed by Xuefu)
+ git clean -f -d
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/DependencyCollectionFunction.java
Removing ql/src/java/org/apache/hadoop/hive/ql/exec/util/
Removing standalone-metastore/src/gen/org/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at 954f832 HIVE-15104: Hive on Spark generate more shuffle data 
than hive on mr (Rui reviewed by Xuefu)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2017-10-25 13:04:44.573
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
error: patch failed: 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:3771
error: 
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java:
 patch does not apply
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12893047 - PreCommit-HIVE-Build

> Metastore: inefficient Oracle query for removing unused column descriptors 
> when add/drop table/partition
> 
>
> Key: HIVE-9447
> URL: https://issues.apache.org/jira/browse/HIVE-9447
> Project: Hive
>  Issue Type: Improvement
>  Components: Metastore
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0
>Reporter: Selina Zhang
>Assignee: Adam Szita
> Attachments: HIVE-9447.1.patch, HIVE-9447.2.patch, HIVE-9447.3.patch
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>

[jira] [Commented] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load

2017-10-25 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218571#comment-16218571
 ] 

Hive QA commented on HIVE-17595:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12893923/HIVE-17595.0.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 56 failed/errored test(s), 11322 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[infer_bucket_sort_convert_join]
 (batchId=52)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast]
 (batchId=156)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin]
 (batchId=91)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] 
(batchId=110)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut 
(batchId=205)
org.apache.hadoop.hive.ql.exec.TestUtilities.testGetTasksHaveNoRepeats 
(batchId=281)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testAlters 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBasic (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBasicWithCM 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBootstrapLoadOnExistingDb
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBootstrapWithConcurrentDropPartition
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBootstrapWithConcurrentDropTable
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBootstrapWithConcurrentRename
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testBootstrapWithDropPartitionedTable
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testCMConflict 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConcatenatePartitionedTable
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConcatenateTable 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testDrops (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testDropsWithCM 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testDumpLimit 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testEventTypesForDynamicAddPartitionByInsert
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testExchangePartition 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIdempotentMoveTaskForInsertFiles
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalAdds 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalInsertDropPartitionedTable
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalInsertDropUnpartitionedTable
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalInsertToPartition
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalInserts 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalLoad 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalLoadFailAndRetry
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalLoadWithVariableLengthEventId
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalRepeatEventOnExistingObject
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testIncrementalRepeatEventOnMissingObject
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testInsertOverwriteOnPartitionedTableWithCM
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testInsertOverwriteOnUnpartitionedTableWithCM
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testInsertToMultiKeyPartition
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testRemoveStats 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testRenamePartitionWithCM
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testRenamePartitionedTableAcrossDatabases
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testRenameTableAcrossDatabases
 (batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testRenameTableWithCM 
(batchId=222)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testStatus 
(batchId=222)

[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Status: Patch Available  (was: Open)

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Santhosh B Gowda
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>
> Replicating a drop partition event on a table partitioned on a timestamp 
> type column fails in REPL LOAD.
> *Scenario:*
> 1. Create a table partitioned on a timestamp column.
> 2. Bootstrap dump/load.
> 3. Insert a record to create partition(p="2001-11-09 00:00:00.0").
> 4. Drop the same partition(p="2001-11-09 00:00:00.0").
> 5. Incremental dump/load.
> -- REPL LOAD throws the exception below
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: Thread-36769]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error parsing partition filter; lexer error: line 1:18 no viable alternative at character ':'; exception MismatchedTokenException(12!=23))
> at org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}
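
The lexer error above points at position 1:18 and the character ':'. The sketch below is one way to sanity-check that position against the partition value from the scenario. It rests on two assumptions that the issue does not confirm: that the filter text looked like "(p = <value>)", and that the reported column is 0-based. The helper name first_colon_column is hypothetical.

```python
# Partition value taken from the scenario in the issue description.
PARTITION_VALUE = "2001-11-09 00:00:00.0"


def first_colon_column(filter_text):
    """Return the 0-based column of the first ':' in the filter text."""
    return filter_text.index(":")


# Assumed rendering with the timestamp left unquoted.
unquoted_filter = f"(p = {PARTITION_VALUE})"
# Assumed rendering with the timestamp wrapped in double quotes.
quoted_filter = f'(p = "{PARTITION_VALUE}")'

if __name__ == "__main__":
    print(unquoted_filter, first_colon_column(unquoted_filter))
    print(quoted_filter, first_colon_column(quoted_filter))
```

Under these assumptions the unquoted form puts its first colon at column 18, which matches the reported position. This is only a consistency check; it does not establish which filter string Hive actually generated.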





[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Attachment: HIVE-17887.01.patch

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Santhosh B Gowda
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
> Attachments: HIVE-17887.01.patch
>
>


[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Attachment: (was: HIVE-17887.01.patch)

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Santhosh B Gowda
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
>


[jira] [Updated] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.

2017-10-25 Thread Sankar Hariappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-17887:

Status: Open  (was: Patch Available)

> Incremental REPL LOAD with Drop partition event on timestamp type partition 
> column fails.
> -
>
> Key: HIVE-17887
> URL: https://issues.apache.org/jira/browse/HIVE-17887
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2, repl
>Affects Versions: 3.0.0
>Reporter: Santhosh B Gowda
>Assignee: Sankar Hariappan
>  Labels: DR, pull-request-available, replication
> Fix For: 3.0.0
>
>
> Replicating a drop partition event fails in REPL LOAD when the table is 
> partitioned on a timestamp type column.
> *Scenario:*
> 1. Create a table partitioned on a timestamp column.
> 2. Bootstrap dump/load.
> 3. Insert a record to create partition (p="2001-11-09 00:00:00.0").
> 4. Drop the same partition (p="2001-11-09 00:00:00.0").
> 5. Incremental dump/load.
> -- REPL LOAD throws the exception below:
> {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: 
> Thread-36769]: metastore.RetryingHMSHandler 
> (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error 
> parsing partition filter; lexer error: line 1:18 no viable alternative at 
> character ':'; exception MismatchedTokenException(12!=23))
> at org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759)
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708)
> at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517)
> at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103)
> at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957)
> at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
> at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source)
> at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200)
> at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178)
> at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source)
> at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562)
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018)
> at org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993)
> at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
> at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
> {quote}





[jira] [Updated] (HIVE-17595) Correct DAG for updating the last.repl.id for a database during bootstrap load

2017-10-25 Thread anishek (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-17595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anishek updated HIVE-17595:
---
Attachment: HIVE-17595.0.patch

> Correct DAG for updating the last.repl.id for a database during bootstrap load
> --
>
> Key: HIVE-17595
> URL: https://issues.apache.org/jira/browse/HIVE-17595
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Affects Versions: 3.0.0
>Reporter: anishek
>Assignee: anishek
> Fix For: 3.0.0
>
> Attachments: HIVE-17595.0.patch
>
>
> We update last.repl.id as a database property. This is done after all the 
> bootstrap tasks that load the relevant data have finished, so it is the last 
> task to run. However, we are currently not setting up the DAG correctly for 
> this task: it is being added as a root task, whereas it should be the last 
> task to run in the DAG. This becomes more important after the inclusion of 
> HIVE-17426, since that change leads to parallel execution, and incorrect DAGs 
> will lead to incorrect results/state of the system.
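
The ordering constraint described above can be illustrated with a minimal sketch. It assumes a simplified task graph in which a child runs only after all of its parents. The `Node` class, `leaves`, and `appendAsLast` are hypothetical names for illustration and are not Hive's `Task` API or the code in the attached patch. The idea shown is that a task that must run last is attached as a child of every current leaf rather than added to the list of roots.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class LastTaskDagSketch {
    /** Hypothetical stand-in for a task node; not Hive's Task class. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        void addChild(Node child) {
            if (!children.contains(child)) {
                children.add(child);
            }
        }
    }

    /** Collects every node reachable from the roots that has no children. */
    static Set<Node> leaves(List<Node> roots) {
        Set<Node> seen = new LinkedHashSet<>();
        Set<Node> result = new LinkedHashSet<>();
        for (Node root : roots) {
            collect(root, seen, result);
        }
        return result;
    }

    private static void collect(Node node, Set<Node> seen, Set<Node> result) {
        if (!seen.add(node)) {
            return; // already visited through another parent
        }
        if (node.children.isEmpty()) {
            result.add(node);
        }
        for (Node child : node.children) {
            collect(child, seen, result);
        }
    }

    /** Makes 'last' a child of every current leaf, so it is ordered after all of them. */
    static void appendAsLast(List<Node> roots, Node last) {
        for (Node leaf : leaves(roots)) {
            leaf.addChild(last);
        }
    }

    public static void main(String[] args) {
        // Illustrative bootstrap graph: one database load followed by two table loads.
        Node loadDb = new Node("load-database");
        Node loadT1 = new Node("load-table-1");
        Node loadT2 = new Node("load-table-2");
        loadDb.addChild(loadT1);
        loadDb.addChild(loadT2);

        List<Node> roots = new ArrayList<>();
        roots.add(loadDb);

        // Attach the property update behind both table loads instead of as a root.
        Node updateReplId = new Node("update-last.repl.id");
        appendAsLast(roots, updateReplId);

        for (Node leaf : leaves(roots)) {
            System.out.println(leaf.name);
        }
    }
}
```

With parallel execution, a root task may start immediately, while a task hanging off every leaf cannot start until all the load tasks it depends on have completed.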




