[jira] [Comment Edited] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047281#comment-17047281
 ] 

Krisztian Kasa edited comment on HIVE-22929 at 2/28/20 7:47 AM:


[~gopalv]
String.replace implementation is:
{code}
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
            this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
{code}
So it also calls Pattern.compile with *target* every time it is called.

The difference between replace and replaceAll is:
{code}
replace
Pattern.compile(target.toString(), Pattern.LITERAL)
{code}
{code}
replaceAll
Pattern.compile(regex)
{code}

I did some testing:
{code}
  @Test
  public void testReplacePerf() {
    long count = 1000;

    long start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replaceAll("am", "b");
    }
    System.out.println("String.replaceAll: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replace("am", "b");
    }
    System.out.println("String.replace: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    final Pattern REGEX = Pattern.compile("am", Pattern.LITERAL);
    for (int i = 0; i < count; ++i) {
      String s = RegExUtils.replaceAll("sample sample", REGEX, "b");
    }
    System.out.println("Precompiled regex + RegExUtils.replaceAll:" + (System.currentTimeMillis() - start));
  }
{code}
{code}
String.replaceAll: 4037
String.replace: 3072
Precompiled regex + RegExUtils.replaceAll:2216
{code}
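For the quoted-identifier case itself, the precompiled-pattern approach could look roughly like this. This is a hypothetical sketch only: the class and method names are illustrative, and it is not the attached HIVE-22929.1.patch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BacktickUnescape {
    // Compile the literal "``" pattern once instead of building a
    // throwaway Pattern on every call to String.replaceAll().
    private static final Pattern ESCAPED_BACKTICK =
            Pattern.compile("``", Pattern.LITERAL);

    /** Strips the surrounding backticks and collapses doubled backticks. */
    static String unquote(String quoted) {
        String body = quoted.substring(1, quoted.length() - 1);
        return ESCAPED_BACKTICK.matcher(body)
                .replaceAll(Matcher.quoteReplacement("`"));
    }
}
```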

Please share your thoughts.


was (Author: kkasa):
[~gopalv]
String.replace implementation is:
{code}
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
            this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
{code}
So it also calls Pattern.compile with *target* every time it is called.

The difference between replace and replaceAll is:
{code}
replace
Pattern.compile(target.toString(), Pattern.LITERAL)
{code}
{code}
replaceAll
Pattern.compile(regex)
{code}

I did some testing:
{code}
  public static final Pattern REGEX = Pattern.compile("am", Pattern.LITERAL);

  @Test
  public void testReplacePerf() {
    long count = 1000;

    long start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replaceAll("am", "b");
    }
    System.out.println("String.replaceAll: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replace("am", "b");
    }
    System.out.println("String.replace: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = RegExUtils.replaceAll("sample sample", REGEX, "b");
    }
    System.out.println("Precompiled regex + RegExUtils.replaceAll:" + (System.currentTimeMillis() - start));
  }
{code}
{code}
String.replaceAll: 3997
String.replace: 3028
Precompiled regex + RegExUtils.replaceAll:2164
{code}

Please share your thoughts.

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}
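A regex-free alternative for the lexer action above would be a single linear scan over the text, avoiding Pattern compilation entirely. This is a hypothetical sketch, not the attached patch:

```java
public class BacktickScan {
    // Collapse each escaped pair "``" to a single "`" in one pass,
    // with no regex machinery involved.
    static String unescape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            sb.append(c);
            // Skip the second backtick of an escaped pair.
            if (c == '`' && i + 1 < text.length() && text.charAt(i + 1) == '`') {
                i++;
            }
        }
        return sb.toString();
    }
}
```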



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Krisztian Kasa (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047281#comment-17047281
 ] 

Krisztian Kasa commented on HIVE-22929:
---

[~gopalv]
String.replace implementation is:
{code}
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern.compile(target.toString(), Pattern.LITERAL).matcher(
            this).replaceAll(Matcher.quoteReplacement(replacement.toString()));
}
{code}
So it also calls Pattern.compile with *target* every time it is called.

The difference between replace and replaceAll is:
{code}
replace
Pattern.compile(target.toString(), Pattern.LITERAL)
{code}
{code}
replaceAll
Pattern.compile(regex)
{code}

I did some testing:
{code}
  public static final Pattern REGEX = Pattern.compile("am", Pattern.LITERAL);

  @Test
  public void testReplacePerf() {
    long count = 1000;

    long start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replaceAll("am", "b");
    }
    System.out.println("String.replaceAll: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = "sample sample".replace("am", "b");
    }
    System.out.println("String.replace: " + (System.currentTimeMillis() - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < count; ++i) {
      String s = RegExUtils.replaceAll("sample sample", REGEX, "b");
    }
    System.out.println("Precompiled regex + RegExUtils.replaceAll:" + (System.currentTimeMillis() - start));
  }
{code}
{code}
String.replaceAll: 3997
String.replace: 3028
Precompiled regex + RegExUtils.replaceAll:2164
{code}

Please share your thoughts.

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ashutosh Chauhan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047279#comment-17047279
 ] 

Ashutosh Chauhan commented on HIVE-22786:
-

+1

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch, 
> HIVE-22786.6.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22903) Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047272#comment-17047272
 ] 

Hive QA commented on HIVE-22903:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
58s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 1 new + 119 unchanged - 0 
fixed = 120 total (was 119) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 54s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20865/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20865/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20865/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20865/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorized row_number() resets the row number after one batch in case of 
> constant expression in partition clause
> 
>
> Key: HIVE-22903
> URL: https://issues.apache.org/jira/browse/HIVE-22903
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22903.01.patch, HIVE-22903.02.patch, 
> HIVE-22903.03.patch, HIVE-22903.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorized row number implementation resets the row number when constant 
> expression is passed in partition clause.
> Repro Query
> {code}
> select row_number() over(partition by 1) r1, t from over10k_n8;
> Or
> select row_number() over() r1, t from over10k_n8;
> {code}
> where table over10k_n8 contains more than 1024 records.
> This happens because currently VectorPTFOperator resets the evaluators when 
> only a partition clause is present.
> {code:java}
> // If we are only processing a PARTITION BY, reset our evaluators.
> if (!isPartitionOrderBy) {
>   groupBatches.resetEvaluators();
> }
> {code}
> To resolve, we should also check if the entire partition 

[jira] [Commented] (HIVE-22865) Include data in replication staging directory

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047255#comment-17047255
 ] 

Hive QA commented on HIVE-22865:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994799/HIVE-22865.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 159 failed/errored test(s), 18075 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_blobstore]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_local]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_blobstore_to_warehouse]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_addpartition_local_to_blobstore]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_blobstore_nonpart]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_local]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_blobstore_to_warehouse_nonpart]
 (batchId=308)
org.apache.hadoop.hive.cli.TestBlobstoreCliDriver.testCliDriver[import_local_to_blobstore]
 (batchId=308)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_01_nonpart] 
(batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_02_part] 
(batchId=58)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_03_nonpart_over_compat]
 (batchId=6)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_04_all_part] 
(batchId=33)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_05_some_part] 
(batchId=85)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_06_one_part] 
(batchId=100)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_07_all_part_over_nonoverlap]
 (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_08_nonpart_rename] 
(batchId=71)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_09_part_spec_nonoverlap]
 (batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_10_external_managed]
 (batchId=80)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_11_managed_external]
 (batchId=81)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_12_external_location]
 (batchId=63)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_13_managed_location]
 (batchId=45)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_14_managed_location_over_existing]
 (batchId=62)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_15_external_part] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_16_part_external] 
(batchId=68)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_17_part_managed] 
(batchId=51)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_18_part_external] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_19_00_part_external_location]
 (batchId=77)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_19_part_external_location]
 (batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_20_part_managed_location]
 (batchId=44)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_22_import_exist_authsuccess]
 (batchId=21)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_23_import_part_authsuccess]
 (batchId=24)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_24_import_nonexist_authsuccess]
 (batchId=22)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[exim_hidden_files] 
(batchId=55)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repl_2_exim_basic] 
(batchId=91)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repl_3_exim_metadata] 
(batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[repl_load_old_version] 
(batchId=26)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[mm_exim] 
(batchId=193)
org.apache.hadoop.hive.ql.TestTxnAddPartition.addPartition (batchId=357)
org.apache.hadoop.hive.ql.TestTxnAddPartition.addPartitionBucketed (batchId=357)
org.apache.hadoop.hive.ql.TestTxnAddPartition.addPartitionMM (batchId=357)
org.apache.hadoop.hive.ql.TestTxnAddPartition.addPartitionMMVectorized 
(batchId=357)
org.apache.hadoop.hive.ql.TestTxnAddPartition.addPartitionRename (batchId=357)
org.apache.hadoop.hive.ql.TestTxnAddPartition.addPartitionVectorized 
(batchId=357)
org.apache.hadoop.hive.ql.TestTxnExIm.testExportBucketed (batchId=341)

[jira] [Updated] (HIVE-22900) Predicate Push Down Of Like Filter While Fetching Partition Data From MetaStore

2020-02-27 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HIVE-22900:
-
Component/s: Standalone Metastore

> Predicate Push Down Of Like Filter While Fetching Partition Data From 
> MetaStore
> ---
>
> Key: HIVE-22900
> URL: https://issues.apache.org/jira/browse/HIVE-22900
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22900.01.patch, HIVE-22900.02.patch, 
> HIVE-22900.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Predicate Push Down is disabled for like filter while fetching partition data 
> from metastore. The following patch introduces PPD for like filters while 
> fetching partition data from the metastore in case of DIRECT-SQL and JDO. The 
> patch also covers all the test cases mentioned in HIVE-5134 because of which 
> Predicate Push Down for like filter was disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047235#comment-17047235
 ] 

László Bodor commented on HIVE-22941:
-

The tez_fixed_bucket_pruning.q failure was related; it's about reverting the 
changed numFiles after HIVE-21714.

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22941.01.patch, HIVE-22941.02.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, and select(*)>0 still held after that. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> / table type, which is not the right approach, as the proper solution would be 
> to completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-22941:

Attachment: HIVE-22941.02.patch

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22941.01.patch, HIVE-22941.02.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, and select(*)>0 still held after that. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> / table type, which is not the right approach, as the proper solution would be 
> to completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22943) Metastore JDO pushdown for DATE constants

2020-02-27 Thread Gopal Vijayaraghavan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal Vijayaraghavan updated HIVE-22943:

Summary: Metastore JDO pushdown for DATE constants  (was: Metastore 
pushdown for DATE constants)

> Metastore JDO pushdown for DATE constants
> -
>
> Key: HIVE-22943
> URL: https://issues.apache.org/jira/browse/HIVE-22943
> Project: Hive
>  Issue Type: Bug
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Gopal Vijayaraghavan
>Priority: Major
>
> https://github.com/apache/hive/blame/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/parser/Filter.g#L461
> {code}
> /* When I figure out how to make lexer backtrack after validating predicate, 
> dates would be able 
> to support single quotes [( '\'' DateString '\'' ) |]. For now, what we do 
> instead is have a hack
> to parse the string in metastore code from StringLiteral. */
> DateLiteral
> :
> KW_DATE? DateString { ExtractDate(getText()) != null }?
> ;
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-22941:

Issue Type: Bug  (was: Improvement)

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, and select(*)>0 still held after that. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> / table type, which is not the right approach, as the proper solution would be 
> to completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-22941:

Fix Version/s: 4.0.0

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, and select(*)>0 still held after that. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> / table type, which is not the right approach, as the proper solution would be 
> to completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22865) Include data in replication staging directory

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047225#comment-17047225
 ] 

Hive QA commented on HIVE-22865:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
16s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
55s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
46s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
45s{color} | {color:red} ql: The patch generated 45 new + 155 unchanged - 4 
fixed = 200 total (was 159) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
23s{color} | {color:red} itests/hive-unit: The patch generated 23 new + 562 
unchanged - 0 fixed = 585 total (was 562) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
4s{color} | {color:green} ql generated 0 new + 1530 unchanged - 1 fixed = 1530 
total (was 1531) {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
46s{color} | {color:green} hive-unit in the patch passed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 56s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20864/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20864/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20864/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20864/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql itests/hive-unit U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20864/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Include data in replication staging directory
> -
>
> Key: HIVE-22865
> URL: https://issues.apache.org/jira/browse/HIVE-22865
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22865.1.patch, 

[jira] [Updated] (HIVE-22865) Include data in replication staging directory

2020-02-27 Thread PRAVIN KUMAR SINHA (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PRAVIN KUMAR SINHA updated HIVE-22865:
--
Attachment: HIVE-22865.6.patch

> Include data in replication staging directory
> -
>
> Key: HIVE-22865
> URL: https://issues.apache.org/jira/browse/HIVE-22865
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22865.1.patch, HIVE-22865.2.patch, 
> HIVE-22865.3.patch, HIVE-22865.4.patch, HIVE-22865.5.patch, HIVE-22865.6.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22840) Race condition in formatters of TimestampColumnVector and DateColumnVector

2020-02-27 Thread Shubham Chaurasia (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-22840:
-
Attachment: HIVE-22840.05.patch

> Race condition in formatters of TimestampColumnVector and DateColumnVector 
> ---
>
> Key: HIVE-22840
> URL: https://issues.apache.org/jira/browse/HIVE-22840
> Project: Hive
>  Issue Type: Bug
>  Components: storage-api
>Reporter: László Bodor
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22840.03.patch, HIVE-22840.04.patch, 
> HIVE-22840.05.patch, HIVE-22840.1.patch, HIVE-22840.2.patch, HIVE-22840.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> HIVE-22405 added support for the proleptic calendar. It uses Java's 
> SimpleDateFormat/Calendar APIs, which are not thread-safe and cause races in 
> some scenarios. 
> As a result of those race conditions, we see some exceptions like
> {code:java}
> 1) java.lang.NumberFormatException: For input string: "" 
> OR 
> java.lang.NumberFormatException: For input string: ".821582E.821582E44"
> OR
> 2) Caused by: java.lang.ArrayIndexOutOfBoundsException: -5325980
>   at 
> sun.util.calendar.BaseCalendar.getCalendarDateFromFixedDate(BaseCalendar.java:453)
>   at 
> java.util.GregorianCalendar.computeFields(GregorianCalendar.java:2397)
> {code}
> This issue is to address those thread-safety issues/race conditions.
> cc [~jcamachorodriguez] [~abstractdog] [~omalley]
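The races quoted above are characteristic of {{java.text.SimpleDateFormat}}, whose instances keep mutable internal state and must not be shared across threads. A minimal sketch of one common mitigation, a per-thread formatter via {{ThreadLocal}} (class and method names here are illustrative, not the actual HIVE-22840 patch):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class SafeTimestampFormatter {
  // SimpleDateFormat is not thread-safe: concurrent format()/parse() calls on
  // a shared instance corrupt its internal Calendar, producing failures like
  // the NumberFormatException / ArrayIndexOutOfBoundsException quoted above.
  // Giving each thread its own instance avoids the race without locking.
  private static final ThreadLocal<SimpleDateFormat> FORMATTER =
      ThreadLocal.withInitial(() -> {
        SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        f.setTimeZone(TimeZone.getTimeZone("UTC"));
        return f;
      });

  static String format(long epochMillis) {
    return FORMATTER.get().format(new Date(epochMillis));
  }
}
```

On Java 8+ another option is {{java.time.format.DateTimeFormatter}}, which is immutable and thread-safe by design.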





[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Status: In Progress  (was: Patch Available)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.02.patch, 
> HIVE-22926.03.patch, HIVE-22926.04.patch, HIVE-22926.patch
>
>






[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Attachment: HIVE-22926.04.patch
Status: Patch Available  (was: In Progress)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.02.patch, 
> HIVE-22926.03.patch, HIVE-22926.04.patch, HIVE-22926.patch
>
>






[jira] [Commented] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047210#comment-17047210
 ] 

Hive QA commented on HIVE-22872:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994794/HIVE-22872.03.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20863/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20863/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20863/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ date '+%Y-%m-%d %T.%3N'
2020-02-28 05:13:03.068
+ [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]]
+ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
+ export 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ 
PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'MAVEN_OPTS=-Xmx1g '
+ MAVEN_OPTS='-Xmx1g '
+ cd /data/hiveptest/working/
+ tee /data/hiveptest/logs/PreCommit-HIVE-Build-20863/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ date '+%Y-%m-%d %T.%3N'
2020-02-28 05:13:03.071
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at ffba5d6 HIVE-22893: Enhance data size estimation for fields 
computed by UDFs (Zoltan Haindrich reviewed by Jesus Camacho Rodriguez)
+ git clean -f -d
Removing standalone-metastore/metastore-server/src/gen/
+ git checkout master
Already on 'master'
Your branch is up-to-date with 'origin/master'.
+ git reset --hard origin/master
HEAD is now at ffba5d6 HIVE-22893: Enhance data size estimation for fields 
computed by UDFs (Zoltan Haindrich reviewed by Jesus Camacho Rodriguez)
+ git merge --ff-only origin/master
Already up-to-date.
+ date '+%Y-%m-%d %T.%3N'
2020-02-28 05:13:03.873
+ rm -rf ../yetus_PreCommit-HIVE-Build-20863
+ mkdir ../yetus_PreCommit-HIVE-Build-20863
+ git gc
+ cp -R . ../yetus_PreCommit-HIVE-Build-20863
+ mkdir /data/hiveptest/logs/PreCommit-HIVE-Build-20863/yetus
+ patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hiveptest/working/scratch/build.patch
+ [[ -f /data/hiveptest/working/scratch/build.patch ]]
+ chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh
+ /data/hiveptest/working/scratch/smart-apply-patch.sh 
/data/hiveptest/working/scratch/build.patch
Trying to apply the patch with -p0
error: patch failed: ql/src/test/queries/clientpositive/schq_ingest.q:39
Falling back to three-way merge...
Applied patch to 'ql/src/test/queries/clientpositive/schq_ingest.q' cleanly.
error: patch failed: 
standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql:56
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql'
 with conflicts.
error: patch failed: 
standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql:30
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql'
 with conflicts.
error: patch failed: 
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql:60
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql'
 with conflicts.
error: patch failed: 
standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql:60
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql'
 with conflicts.
error: patch failed: 
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql:191
Falling back to three-way merge...
Applied patch to 
'standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql'
 with conflicts.
Going to apply patch with: git apply -p0
/data/hiveptest/working/scratch/build.patch:288: trailing whitespace.
!sleep 10; 
/data/hiveptest/working/scratch/build.patch:409: trailing whitespace.
active_execution_id bigint  from deserializer   

[jira] [Commented] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047209#comment-17047209
 ] 

Hive QA commented on HIVE-22941:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994781/HIVE-22941.01.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 18074 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schq_ingest]
 (batchId=184)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_fixed_bucket_pruning]
 (batchId=189)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitions
 (batchId=292)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerCustomCreatedDynamicPartitionsUnionAll
 (batchId=292)
org.apache.hive.jdbc.TestTriggersTezSessionPoolManager.testTriggerHighShuffleBytes
 (batchId=292)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20862/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20862/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20862/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994781 - PreCommit-HIVE-Build

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so select count(*) > 0 still held afterwards. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> and table type, which is not the proper approach; the proper solution would be 
> to avoid writing empty files for Tez entirely (which is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which suggests they somehow 
> rely on the generated empty file. We need to find a proper solution that is 
> applicable to all table types without polluting external tables.
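The condition change discussed above can be isolated as a small predicate; a sketch under the assumption that the decision depends only on these three flags (the method name is hypothetical, not Hive's actual FileSinkOperator code):

```java
class EmptyFileGuard {
  // Write an empty result file only off Tez, and only for the streaming or
  // insert-overwrite cases that historically relied on the file's existence;
  // on Tez the empty file should not be created at all (cf. HIVE-14014).
  static boolean shouldWriteEmptyFile(boolean isTez, boolean isStreaming,
                                      boolean isInsertOverwrite) {
    return !isTez && (isStreaming || isInsertOverwrite);
  }
}
```

The open question in the report is exactly which callers depend on the empty file, so any such guard needs the failing insert_overwrite.q cases re-examined first.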





[jira] [Commented] (HIVE-22527) Hive on Tez : Job of merging small files will be submitted into another queue (default queue)

2020-02-27 Thread zhangbutao (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047206#comment-17047206
 ] 

zhangbutao commented on HIVE-22527:
---

[~ngangam] I have attached a new patch for master, HIVE-22527.01.patch. We use 
this patch in our production environment and it works well. Perhaps you can 
suggest a better approach here. Thanks 

> Hive on Tez : Job of merging small files will be submitted into another queue 
> (default queue)
> -
>
> Key: HIVE-22527
> URL: https://issues.apache.org/jira/browse/HIVE-22527
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: HIVE-22527-branch-3.1.0.patch, HIVE-22527.01.patch, 
> explain with merge files.png, file merge job.png, hive logs.png
>
>
> Hive on Tez. We enable small-file merging with *set hive.merge.tezfiles=true*, 
> so another job is launched to merge files after the SQL job. However, the file 
> merge job is submitted to a different YARN queue, not the queue of the current 
> beeline client session. It seems the merge job starts a new Tez session with a 
> new conf that differs from the current session conf, which sends the merge job 
> to the default queue.
>  
> Attachment *hive logs.png* shows that the current session queue is 
> *root.bdoc.production* (String queueName = session.getQueueName();) while the 
> incoming queue name is *null* (String confQueueName = 
> conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we log in to the same 
> beeline client with *set tez.queue.name=root.bdoc.production*, and all jobs, 
> including the file merge job, should be submitted to the same queue.
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]
>  
> Attachment *explain with merge files.png* shows that stage-4 is the separate 
> file merge job, which is submitted to another YARN queue (the default queue), 
> not the queue root.bdoc.production.
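The fallback behavior the report asks for can be sketched as a small helper (a simplification; resolveQueue is a hypothetical name, not the actual TezSessionPoolManager code linked above):

```java
class QueueResolver {
  // When the incoming conf carries no tez.queue.name, fall back to the queue
  // of the current session rather than letting the file merge job drop into
  // the cluster's default queue.
  static String resolveQueue(String confQueueName, String sessionQueueName) {
    return (confQueueName == null || confQueueName.isEmpty())
        ? sessionQueueName
        : confQueueName;
  }
}
```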





[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Status: In Progress  (was: Patch Available)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.02.patch, 
> HIVE-22926.03.patch, HIVE-22926.patch
>
>






[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Attachment: HIVE-22926.03.patch
Status: Patch Available  (was: In Progress)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.02.patch, 
> HIVE-22926.03.patch, HIVE-22926.patch
>
>






[jira] [Updated] (HIVE-22527) Hive on Tez : Job of merging small files will be submitted into another queue (default queue)

2020-02-27 Thread zhangbutao (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhangbutao updated HIVE-22527:
--
Attachment: HIVE-22527.01.patch

> Hive on Tez : Job of merging small files will be submitted into another queue 
> (default queue)
> -
>
> Key: HIVE-22527
> URL: https://issues.apache.org/jira/browse/HIVE-22527
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: HIVE-22527-branch-3.1.0.patch, HIVE-22527.01.patch, 
> explain with merge files.png, file merge job.png, hive logs.png
>
>
> Hive on Tez. We enable small-file merging with *set hive.merge.tezfiles=true*, 
> so another job is launched to merge files after the SQL job. However, the file 
> merge job is submitted to a different YARN queue, not the queue of the current 
> beeline client session. It seems the merge job starts a new Tez session with a 
> new conf that differs from the current session conf, which sends the merge job 
> to the default queue.
>  
> Attachment *hive logs.png* shows that the current session queue is 
> *root.bdoc.production* (String queueName = session.getQueueName();) while the 
> incoming queue name is *null* (String confQueueName = 
> conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we log in to the same 
> beeline client with *set tez.queue.name=root.bdoc.production*, and all jobs, 
> including the file merge job, should be submitted to the same queue.
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]
>  
> Attachment *explain with merge files.png* shows that stage-4 is the separate 
> file merge job, which is submitted to another YARN queue (the default queue), 
> not the queue root.bdoc.production.





[jira] [Updated] (HIVE-22903) Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause

2020-02-27 Thread Shubham Chaurasia (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-22903:
-
Attachment: HIVE-22903.03.patch

> Vectorized row_number() resets the row number after one batch in case of 
> constant expression in partition clause
> 
>
> Key: HIVE-22903
> URL: https://issues.apache.org/jira/browse/HIVE-22903
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22903.01.patch, HIVE-22903.02.patch, 
> HIVE-22903.03.patch, HIVE-22903.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The vectorized row_number() implementation resets the row number when a 
> constant expression is passed in the partition clause.
> Repro Query
> {code}
> select row_number() over(partition by 1) r1, t from over10k_n8;
> Or
> select row_number() over() r1, t from over10k_n8;
> {code}
> where table over10k_n8 contains more than 1024 records.
> This happens because currently, in VectorPTFOperator, we reset evaluators 
> when only a partition clause is present.
> {code:java}
> // If we are only processing a PARTITION BY, reset our evaluators.
> if (!isPartitionOrderBy) {
>   groupBatches.resetEvaluators();
> }
> {code}
> To resolve this, we should also check whether the entire partition clause is 
> a constant expression; if so, we should not call 
> {{groupBatches.resetEvaluators()}}.
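The proposed fix amounts to one extra guard before the reset; a minimal sketch (partitionClauseIsConstant stands in for the constant-expression check and is hypothetical, not the actual patch):

```java
class PtfResetGuard {
  // Reset per-partition evaluators only for a pure PARTITION BY whose keys
  // are not all constants: a constant partition key puts every row into one
  // logical partition, so the running row_number() must survive batch
  // boundaries instead of restarting after each 1024-row batch.
  static boolean shouldResetEvaluators(boolean isPartitionOrderBy,
                                       boolean partitionClauseIsConstant) {
    return !isPartitionOrderBy && !partitionClauseIsConstant;
  }
}
```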





[jira] [Commented] (HIVE-22903) Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause

2020-02-27 Thread Shubham Chaurasia (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047201#comment-17047201
 ] 

Shubham Chaurasia commented on HIVE-22903:
--

Attaching the patch again as the tests didn't trigger.

> Vectorized row_number() resets the row number after one batch in case of 
> constant expression in partition clause
> 
>
> Key: HIVE-22903
> URL: https://issues.apache.org/jira/browse/HIVE-22903
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22903.01.patch, HIVE-22903.02.patch, 
> HIVE-22903.03.patch, HIVE-22903.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The vectorized row_number() implementation resets the row number when a 
> constant expression is passed in the partition clause.
> Repro Query
> {code}
> select row_number() over(partition by 1) r1, t from over10k_n8;
> Or
> select row_number() over() r1, t from over10k_n8;
> {code}
> where table over10k_n8 contains more than 1024 records.
> This happens because currently, in VectorPTFOperator, we reset evaluators 
> when only a partition clause is present.
> {code:java}
> // If we are only processing a PARTITION BY, reset our evaluators.
> if (!isPartitionOrderBy) {
>   groupBatches.resetEvaluators();
> }
> {code}
> To resolve this, we should also check whether the entire partition clause is 
> a constant expression; if so, we should not call 
> {{groupBatches.resetEvaluators()}}.





[jira] [Commented] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047197#comment-17047197
 ] 

Hive QA commented on HIVE-22941:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
55s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20862/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20862/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql itests U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20862/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so select count(*) > 0 still held afterwards. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> and table type, which is not the 

[jira] [Commented] (HIVE-22527) Hive on Tez : Job of merging small files will be submitted into another queue (default queue)

2020-02-27 Thread Richard Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047194#comment-17047194
 ] 

Richard Zhang commented on HIVE-22527:
--

[~zhangbutao]: has this patch been reviewed? 

> Hive on Tez : Job of merging small files will be submitted into another queue 
> (default queue)
> -
>
> Key: HIVE-22527
> URL: https://issues.apache.org/jira/browse/HIVE-22527
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0, 3.1.1
>Reporter: zhangbutao
>Assignee: zhangbutao
>Priority: Blocker
> Fix For: 3.1.0
>
> Attachments: HIVE-22527-branch-3.1.0.patch, explain with merge 
> files.png, file merge job.png, hive logs.png
>
>
> Hive on Tez. We enable small-file merging with *set hive.merge.tezfiles=true*, 
> so another job is launched to merge files after the SQL job. However, the file 
> merge job is submitted to a different YARN queue, not the queue of the current 
> beeline client session. It seems the merge job starts a new Tez session with a 
> new conf that differs from the current session conf, which sends the merge job 
> to the default queue.
>  
> Attachment *hive logs.png* shows that the current session queue is 
> *root.bdoc.production* (String queueName = session.getQueueName();) while the 
> incoming queue name is *null* (String confQueueName = 
> conf.get(TezConfiguration.TEZ_QUEUE_NAME);). In fact, we log in to the same 
> beeline client with *set tez.queue.name=root.bdoc.production*, and all jobs, 
> including the file merge job, should be submitted to the same queue.
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L445]
> [https://github.com/apache/hive/blob/bcc7df95824831a8d2f1524e4048dfc23ab98c19/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L446]
>  
> Attachment *explain with merge files.png* shows that stage-4 is the separate 
> file merge job, which is submitted to another YARN queue (the default queue), 
> not the queue root.bdoc.production.





[jira] [Commented] (HIVE-22931) HoS dynamic partitioning fails with blobstore optimizations off

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047187#comment-17047187
 ] 

Hive QA commented on HIVE-22931:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994776/HIVE-22931.1.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 18074 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=92)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20861/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20861/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20861/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994776 - PreCommit-HIVE-Build

> HoS dynamic partitioning fails with blobstore optimizations off
> ---
>
> Key: HIVE-22931
> URL: https://issues.apache.org/jira/browse/HIVE-22931
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22931.0.patch, HIVE-22931.1.patch
>
>
> Reproduction steps:
>  - Create s3a backed table and normal table.
> {code:java}
> CREATE TABLE source (
>   a string,
>   b int,
>   c int);
>   
> CREATE TABLE target (
>   a string)
> PARTITIONED BY (
>   b int,
>   c int)
> STORED AS parquet
> LOCATION
>   's3a://somepath';
> {code}
>  - Insert values into normal table.
> {code:java}
> INSERT INTO TABLE source VALUES ("a", "1", "1");
> {code}
>  - Do an insert overwrite with dynamic partitions:
> {code:java}
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.blobstore.optimizations.enabled=false;
> set hive.execution.engine=spark;
> INSERT OVERWRITE TABLE target partition (b,c)
> SELECT *
> FROM source;{code}
> This fails only with the Spark execution engine and blobstore optimizations 
> turned off, with:
> {code}
> 2020-01-16 15:24:56,064 ERROR hive.ql.metadata.Hive: 
> [load-dynamic-partitions-5]: Exception when loading partition with parameters 
>  
> partPath=hdfs://nameservice1/tmp/hive/hive/6bcee075-b637-429e-9bf0-a2658355415e/hive_2020-01-16_15-24-01_156_4299941251929377815-4/-mr-1/.hive-staging_hive_2020-01-16_15-24-01_156_4299941251929377815-4/-ext-10002,
>   table=email_click_base,  partSpec={b=null, c=null},  replace=true,  
> listBucketingEnabled=false,  isAcid=false,  
> hasFollowingStatsTask=trueorg.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:Partition spec is incorrect. {companyid=null, 
> eventmonth=null})
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:1666)
> {code}





[jira] [Commented] (HIVE-22931) HoS dynamic partitioning fails with blobstore optimizations off

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047166#comment-17047166
 ] 

Hive QA commented on HIVE-22931:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m 
11s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
46s{color} | {color:red} ql: The patch generated 2 new + 16 unchanged - 0 fixed 
= 18 total (was 16) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 33s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20861/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20861/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20861/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20861/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> HoS dynamic partitioning fails with blobstore optimizations off
> ---
>
> Key: HIVE-22931
> URL: https://issues.apache.org/jira/browse/HIVE-22931
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22931.0.patch, HIVE-22931.1.patch
>
>
> Reproduction steps:
>  - Create s3a backed table and normal table.
> {code:java}
> CREATE TABLE source (
>   a string,
>   b int,
>   c int);
>   
> CREATE TABLE target (
>   a string)
> PARTITIONED BY (
>   b int,
>   c int)
> STORED AS parquet
> LOCATION
>   's3a://somepath';
> {code}
>  - Insert values into normal table.
> {code:java}
> INSERT INTO TABLE source VALUES ("a", "1", "1");
> {code}
>  - Do an insert overwrite with dynamic partitions:
> {code:java}
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.blobstore.optimizations.enabled=false;
> set hive.execution.engine=spark;
> INSERT OVERWRITE TABLE target partition (b,c)
> SELECT *
> FROM source;{code}
> This fails only with the Spark execution engine and blobstore optimizations 
> turned off, with:
> {code}
> 2020-01-16 15:24:56,064 ERROR hive.ql.metadata.Hive: 
> [load-dynamic-partitions-5]: Exception when loading partition with parameters 
>  
> 

[jira] [Commented] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047139#comment-17047139
 ] 

Hive QA commented on HIVE-22925:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994773/HIVE-22925.3.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 18076 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=92)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20860/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20860/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20860/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994773 - PreCommit-HIVE-Build

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch, HIVE-22925.2.patch, 
> HIVE-22925.3.patch
>
>
> In certain cases the TopNKey filter might work in an inefficient way and add 
> extra CPU overhead. For example, if the rows arrive in descending order but 
> the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be 
> disabled if the ratio forwarded_rows/total_rows is too high.
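A runtime check like the one described could be sketched as follows (a hypothetical illustration; the class name, thresholds, and method are not from the patch):

```java
// Hypothetical sketch of a runtime efficiency check for a top-n-key style
// filter: once enough rows have been seen, disable the filter if it forwards
// nearly everything, since it then adds CPU overhead without filtering.
public class TopNKeyEfficiencySketch {
    private long totalRows;
    private long forwardedRows;
    private final long checkAfterRows;
    private final double maxForwardRatio;
    private boolean disabled;

    TopNKeyEfficiencySketch(long checkAfterRows, double maxForwardRatio) {
        this.checkAfterRows = checkAfterRows;
        this.maxForwardRatio = maxForwardRatio;
    }

    /** Record one row; returns true once the filter should be disabled. */
    boolean onRow(boolean forwarded) {
        totalRows++;
        if (forwarded) {
            forwardedRows++;
        }
        if (!disabled && totalRows >= checkAfterRows
                && (double) forwardedRows / totalRows > maxForwardRatio) {
            disabled = true; // pure overhead: the filter barely filters
        }
        return disabled;
    }

    public static void main(String[] args) {
        TopNKeyEfficiencySketch check = new TopNKeyEfficiencySketch(100, 0.8);
        // descending input vs. top-N-smallest: every row gets forwarded
        boolean disabled = false;
        for (int i = 0; i < 200; i++) {
            disabled = check.onRow(true);
        }
        System.out.println(disabled); // true: detected as inefficient
    }
}
```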





[jira] [Commented] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047115#comment-17047115
 ] 

Hive QA commented on HIVE-22925:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
39s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
53s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
11s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
23s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
16s{color} | {color:red} common: The patch generated 3 new + 371 unchanged - 0 
fixed = 374 total (was 371) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 19 new + 44 unchanged - 1 
fixed = 63 total (was 45) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
59s{color} | {color:red} ql generated 1 new + 99 unchanged - 1 fixed = 100 
total (was 100) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20860/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20860/yetus/diff-checkstyle-common.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20860/yetus/diff-checkstyle-ql.txt
 |
| javadoc | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20860/yetus/diff-javadoc-javadoc-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20860/yetus/patch-asflicense-problems.txt
 |
| modules | C: common ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20860/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch, HIVE-22925.2.patch, 
> HIVE-22925.3.patch
>
>
> In certain cases the TopNKey filter might work in an inefficient 

[jira] [Commented] (HIVE-22453) Describe table unnecessarily fetches partitions

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047091#comment-17047091
 ] 

Hive QA commented on HIVE-22453:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994771/HIVE-22453.4.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18073 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20859/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20859/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20859/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994771 - PreCommit-HIVE-Build

> Describe table unnecessarily fetches partitions
> ---
>
> Key: HIVE-22453
> URL: https://issues.apache.org/jira/browse/HIVE-22453
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 2.3.6
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Attachments: HIVE-22453.2.patch, HIVE-22453.2.patch, 
> HIVE-22453.3.patch, HIVE-22453.4.patch, HIVE-22453.patch
>
>
> The simple describe table command without EXTENDED or FORMATTED (i.e., 
> DESCRIBE table_name) fetches all partitions when no partition is specified, 
> even though it does not display any partition statistics.
> The command should not fetch partitions, since that can take a long time for 
> a table with a large number of partitions.
> For instance, in our environment, the command takes around 8 seconds for a 
> table with 8760 (24 * 365) partitions.
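The fix direction implied above can be sketched as a simple guard (a hypothetical illustration; the method name is not from the actual DDL code):

```java
// Hypothetical sketch: only perform the expensive metastore partition
// listing when the DESCRIBE variant actually needs partition data.
public class DescribePartitionSketch {
    static boolean needsPartitionFetch(boolean extended, boolean formatted, String partSpec) {
        // plain DESCRIBE table_name shows no partition statistics,
        // so fetching all partitions is wasted metastore work
        return extended || formatted || partSpec != null;
    }

    public static void main(String[] args) {
        // DESCRIBE table_name: no partition fetch needed
        System.out.println(needsPartitionFetch(false, false, null));
    }
}
```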





[jira] [Commented] (HIVE-22453) Describe table unnecessarily fetches partitions

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047058#comment-17047058
 ] 

Hive QA commented on HIVE-22453:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 31s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20859/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20859/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20859/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Describe table unnecessarily fetches partitions
> ---
>
> Key: HIVE-22453
> URL: https://issues.apache.org/jira/browse/HIVE-22453
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 2.3.6
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Attachments: HIVE-22453.2.patch, HIVE-22453.2.patch, 
> HIVE-22453.3.patch, HIVE-22453.4.patch, HIVE-22453.patch
>
>
> The simple describe table command without EXTENDED or FORMATTED (i.e., 
> DESCRIBE table_name) fetches all partitions when no partition is specified, 
> even though it does not display any partition statistics.
> The command should not fetch partitions, since that can take a long time for 
> a table with a large number of partitions.
> For instance, in our environment, the command takes around 8 seconds for a 
> table with 8760 (24 * 365) partitions.





[jira] [Commented] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Gopal Vijayaraghavan (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047054#comment-17047054
 ] 

Gopal Vijayaraghavan commented on HIVE-22929:
-

{code}
 QuotedIdentifier 
 :
-'`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
getText().length() -1 ).replaceAll("``", "`")); }
+'`'  ( '``' | ~('`') )* '`' { 
setText(RegExUtils.replaceAll(getText().substring(1, getText().length() -1 ), 
QUOTED_REGEX, "`")); }
{code}

For the single-literal case, what we needed was String.replace(), not a 
regex.
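The precompiled-pattern approach from the patch can be shown in a minimal, self-contained sketch (the class name and constant here are illustrative, not the lexer code itself):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch: the escape sequence "``" is a fixed literal, so the
// pattern can be compiled once instead of on every parsed token.
public class QuotedIdentifierSketch {
    private static final Pattern QUOTED_REGEX = Pattern.compile("``", Pattern.LITERAL);

    static String unquote(String token) {
        // strip the surrounding backticks, then collapse doubled backticks
        String inner = token.substring(1, token.length() - 1);
        return QUOTED_REGEX.matcher(inner).replaceAll(Matcher.quoteReplacement("`"));
    }

    public static void main(String[] args) {
        System.out.println(unquote("`a``b`")); // a`b
    }
}
```

For comparison, `inner.replace("``", "`")` produces the same result, but String.replace compiles a fresh literal Pattern on every call.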

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}





[jira] [Commented] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047031#comment-17047031
 ] 

Hive QA commented on HIVE-22929:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994772/HIVE-22929.1.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18073 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20858/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20858/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20858/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994772 - PreCommit-HIVE-Build

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}





[jira] [Assigned] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22786:
---

Assignee: Rajesh Balamohan  (was: Ramesh Kumar Thangarajan)

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch, 
> HIVE-22786.6.patch
>
>






[jira] [Updated] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22786:

Status: Open  (was: Patch Available)

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch, 
> HIVE-22786.6.patch
>
>






[jira] [Assigned] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22786:
---

Assignee: Ramesh Kumar Thangarajan  (was: Rajesh Balamohan)

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch, 
> HIVE-22786.6.patch
>
>






[jira] [Updated] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22786:

Attachment: HIVE-22786.6.patch
Status: Patch Available  (was: Open)

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch, 
> HIVE-22786.6.patch
>
>






[jira] [Commented] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17047009#comment-17047009
 ] 

Hive QA commented on HIVE-22929:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
7s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
58s{color} | {color:blue} parser in master has 3 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
0s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
49s{color} | {color:red} ql: The patch generated 7 new + 724 unchanged - 16 
fixed = 731 total (was 740) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 31m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20858/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20858/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20858/yetus/patch-asflicense-problems.txt
 |
| modules | C: parser ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20858/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}
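
For illustration, one way to avoid compiling a throwaway Pattern per token is to hoist a precompiled literal pattern into a constant and reuse it (a sketch of the idea only, with assumed names, not the committed fix):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnquoteIdentifier {
    // Compiled once per class load instead of once per lexed identifier.
    private static final Pattern ESCAPED_BACKTICK = Pattern.compile("``", Pattern.LITERAL);

    // Strips the surrounding backticks and collapses the escaped `` to `,
    // mirroring what the HiveLexer.g action does with replaceAll.
    static String unquote(String quoted) {
        String inner = quoted.substring(1, quoted.length() - 1);
        return ESCAPED_BACKTICK.matcher(inner)
                .replaceAll(Matcher.quoteReplacement("`"));
    }

    public static void main(String[] args) {
        System.out.println(unquote("`a``b`")); // prints a`b
    }
}
```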



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046991#comment-17046991
 ] 

Hive QA commented on HIVE-22926:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994779/HIVE-22926.02.patch

{color:green}SUCCESS:{color} +1 due to 16 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 152 failed/errored test(s), 18075 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_dump_requires_admin]
 (batchId=106)
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testCliDriver[repl_load_requires_admin]
 (batchId=106)
org.apache.hadoop.hive.ql.parse.TestCopyUtils.testPrivilegedDistCpWithSameUserAsCurrentDoesNotTryToImpersonate
 (batchId=282)
org.apache.hadoop.hive.ql.parse.TestMetaStoreEventListenerInRepl.testReplEvents 
(batchId=269)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAbortTxnEvent
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidBootstrapReplLoadRetryAfterFailure
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesBootstrapWithConcurrentDropTable
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesBootstrapWithConcurrentWrites
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testAcidTablesMoveOptimizationIncremental
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testMultiDBTxn
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testOpenTxnEvent
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcidTablesWithJsonMessage.testTxnEventNonAcid
 (batchId=276)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootStrapDumpOfWarehouse
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootstrapFunctionReplication
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootstrapLoadRetryAfterFailureForAlterTable
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootstrapReplLoadRetryAfterFailureForFunctions
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootstrapReplLoadRetryAfterFailureForPartitions
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testBootstrapReplLoadRetryAfterFailureForTablesAndConstraints
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testCreateFunctionIncrementalReplication
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testDropFunctionIncrementalReplication
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIfBootstrapReplLoadFailWhenRetryAfterBootstrapComplete
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIfCkptAndSourceOfReplPropsIgnoredByReplDump
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIfCkptPropIgnoredByExport
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIncrementalCreateFunctionWithFunctionBinaryJarsOnHDFS
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIncrementalDumpEmptyDumpDirectory
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIncrementalDumpMultiIteration
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIncrementalMetadataReplication
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIncrementalReplWithDropAndCreateTableDifferentPartitionTypeAndInsert
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testIncrementalReplWithEventsBatchHavingDropCreateTable
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testMoveOptimizationBootstrapReplLoadRetryAfterFailure
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testMoveOptimizationIncrementalFailureAfterCopy
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testMoveOptimizationIncrementalFailureAfterCopyReplace
 (batchId=267)
org.apache.hadoop.hive.ql.parse.TestReplAcrossInstancesWithJsonMessageFormat.testMultipleStagesOfReplicationLoadTask
 (batchId=267)

[jira] [Commented] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046967#comment-17046967
 ] 

Hive QA commented on HIVE-22926:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
0s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
50s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 8s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
55s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
43s{color} | {color:blue} itests/hive-unit in master has 2 extant Findbugs 
warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
43s{color} | {color:red} ql: The patch generated 3 new + 41 unchanged - 0 fixed 
= 44 total (was 41) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
33s{color} | {color:red} itests/hive-unit: The patch generated 65 new + 1121 
unchanged - 1 fixed = 1186 total (was 1122) {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m 
25s{color} | {color:red} ql generated 2 new + 1531 unchanged - 0 fixed = 1533 
total (was 1531) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 33m 27s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Boxed value is unboxed and then immediately reboxed in 
org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.getEventFromPreviousDumpMetadata(Path)
  At ReplDumpTask.java:then immediately reboxed in 
org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.getEventFromPreviousDumpMetadata(Path)
  At ReplDumpTask.java:[line 158] |
|  |  Primitive is boxed to call Long.compareTo(Long):Long.compareTo(Long): use 
Long.compare(long, long) instead  At ReplDumpTask.java:[line 153] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20857/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20857/yetus/diff-checkstyle-ql.txt
 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20857/yetus/diff-checkstyle-itests_hive-unit.txt
 |
| whitespace | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20857/yetus/whitespace-eol.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20857/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20857/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql itests/hive-unit U: . |
| 

[jira] [Commented] (HIVE-22900) Predicate Push Down Of Like Filter While Fetching Partition Data From MetaStore

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046954#comment-17046954
 ] 

Hive QA commented on HIVE-22900:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994759/HIVE-22900.03.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18085 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20856/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20856/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20856/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994759 - PreCommit-HIVE-Build

> Predicate Push Down Of Like Filter While Fetching Partition Data From 
> MetaStore
> ---
>
> Key: HIVE-22900
> URL: https://issues.apache.org/jira/browse/HIVE-22900
> Project: Hive
>  Issue Type: New Feature
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22900.01.patch, HIVE-22900.02.patch, 
> HIVE-22900.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Predicate push down is disabled for LIKE filters while fetching partition data 
> from the metastore. The following patch introduces PPD for LIKE filters while 
> fetching partition data from the metastore for both the DIRECT-SQL and JDO 
> paths. The patch also covers all the test cases mentioned in HIVE-5134, because 
> of which predicate push down for LIKE filters was originally disabled.
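
As an illustration of the kind of translation a JDO push-down needs (a hypothetical helper, not the patch's actual code): JDOQL has no LIKE operator, so a SQL LIKE pattern has to be mapped onto a regex usable with `matches()`.

```java
import java.util.regex.Pattern;

public class LikeFilter {
    // Hypothetical sketch: translate a SQL LIKE pattern ('%' -> any
    // sequence, '_' -> any single character) into a Java regex. All
    // other characters are quoted so regex metacharacters in the
    // pattern are treated literally.
    static String likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A partition value filtered by LIKE '2020-02%':
        System.out.println("2020-02-27".matches(likeToRegex("2020-02%"))); // true
    }
}
```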



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22900) Predicate Push Down Of Like Filter While Fetching Partition Data From MetaStore

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046921#comment-17046921
 ] 

Hive QA commented on HIVE-22900:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
50s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 4s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
22s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
56s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
30s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
23s{color} | {color:red} standalone-metastore/metastore-server: The patch 
generated 2 new + 520 unchanged - 4 fixed = 522 total (was 524) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
16s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 32m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20856/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20856/yetus/diff-checkstyle-standalone-metastore_metastore-server.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20856/yetus/patch-asflicense-problems.txt
 |
| modules | C: standalone-metastore/metastore-server ql U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20856/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Predicate Push Down Of Like Filter While Fetching Partition Data From 
> MetaStore
> ---
>
> Key: HIVE-22900
> URL: https://issues.apache.org/jira/browse/HIVE-22900
> Project: Hive
>  Issue Type: New Feature
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22900.01.patch, HIVE-22900.02.patch, 
> HIVE-22900.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Predicate push down is disabled for LIKE filters while fetching partition data 
> from the metastore. The following patch introduces PPD for LIKE filters while 
> fetching partition data from the 

[jira] [Commented] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046880#comment-17046880
 ] 

Hive QA commented on HIVE-22919:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994758/HIVE-22919.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18074 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20855/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20855/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20855/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994758 - PreCommit-HIVE-Build

> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch, HIVE-22919.4.patch, HIVE-22919.5.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
> <property>
>   <name>hive.security.authorization.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>hive.security.metastore.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
> </property>
> <property>
>   <name>hive.security.authorization.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
> </property>
> <property>
>   <name>hive.security.metastore.authenticator.manager</name>
>   <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
> </property>
> <property>
>   <name>hive.metastore.pre.event.listeners</name>
>   <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
> </property>
> {code}
> 2. Create a few directories, change owners and permissions to it:
> {code:java}hadoop fs -mkdir /tmp/m1
> hadoop fs -mkdir /tmp/m2
> hadoop fs -mkdir /tmp/m3
> hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
> hadoop fs -chmod 700 /tmp/m[1-3]{code}
> 3. Check permissions:
> {code:java}[test@node2 ~]$ hadoop fs -ls /tmp|grep m[1-3]
> drwx--   - testuser1 testuser1  0 2020-02-11 10:25 /tmp/m1
> drwx--   - test  test   0 2020-02-11 10:25 /tmp/m2
> drwx--   - testuser1 testuser1  1 2020-02-11 10:36 /tmp/m3
> [test@node2 ~]$
> {code}
> 4. Log in to the Hive CLI using the embedded Hive Metastore as the 
> *"testuser1"* user, with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
> {code:java}
> sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf 
> hive.metastore.warehouse.dir=/tmp/m1
> {code}
> 5. Perform the next steps:
> {code:sql}-- 1. Check "hive.metastore.warehouse.dir" value:
> SET hive.metastore.warehouse.dir;
> -- 2. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user does not have an access:
> SET hive.metastore.warehouse.dir=/tmp/m2;
> -- 3. Try to create a database:
> CREATE DATABASE m2;
> -- 4. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user has an access:
> SET hive.metastore.warehouse.dir=/tmp/m3;
> -- 5. Try to create a database:
> CREATE DATABASE m3;
> {code}
> *ACTUAL RESULT:*
> Query 5 fails with the exception below; the updated 
> "hive.metastore.warehouse.dir" property is not handled:
> {code:java}
> hive> -- 5. Try to create a database:
> hive> CREATE DATABASE m3;
> FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
> testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
> hive>
> {code}
> *EXPECTED RESULT:*
> Query 5 creates the database.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046842#comment-17046842
 ] 

Hive QA commented on HIVE-22919:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m  
4s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
52s{color} | {color:blue} ql in master has 1531 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
32s{color} | {color:blue} jdbc in master has 16 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
33s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} ql: The patch generated 0 new + 5 unchanged - 1 
fixed = 5 total (was 6) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
11s{color} | {color:red} jdbc: The patch generated 1 new + 2 unchanged - 1 
fixed = 3 total (was 3) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20855/dev-support/hive-personality.sh
 |
| git revision | master / ffba5d6 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20855/yetus/diff-checkstyle-jdbc.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20855/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql jdbc U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20855/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch, HIVE-22919.4.patch, HIVE-22919.5.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
>   hive.security.authorization.enabled
>   true
> 
> 
>   

[jira] [Updated] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-27 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22824:
---
Status: Patch Available  (was: Open)

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch, HIVE-22824.6.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating a plan with a windowing expression 
> within the join condition, which Hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-27 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22824:
---
Status: Open  (was: Patch Available)

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch, HIVE-22824.6.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating a plan with a windowing expression 
> within the join condition, which Hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22824) JoinProjectTranspose rule should skip Projects containing windowing expression

2020-02-27 Thread Vineet Garg (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vineet Garg updated HIVE-22824:
---
Attachment: HIVE-22824.6.patch

> JoinProjectTranspose rule should skip Projects containing windowing expression
> --
>
> Key: HIVE-22824
> URL: https://issues.apache.org/jira/browse/HIVE-22824
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 4.0.0
>Reporter: Vineet Garg
>Assignee: Vineet Garg
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22824.1.patch, HIVE-22824.2.patch, 
> HIVE-22824.3.patch, HIVE-22824.4.patch, HIVE-22824.5.patch, HIVE-22824.6.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Otherwise this rule could end up creating a plan with a windowing expression 
> within the join condition, which Hive doesn't know how to process.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046800#comment-17046800
 ] 

Hive QA commented on HIVE-22832:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994755/HIVE-22832.8.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 18073 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[timestamptz_2] 
(batchId=92)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20854/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20854/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20854/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994755 - PreCommit-HIVE-Build

> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch, 
> HIVE-22832.6.patch, HIVE-22832.7.patch, HIVE-22832.8.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 
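The proposed parallelisation can be illustrated with a small sketch. This is not Hive's actual code: {{clean}} below is a hypothetical stand-in for {{cleanDirectInsertDirectories}}, and the thread-pool wiring is just one plausible shape of the idea — submit one cleanup task per directory and wait for all of them, instead of looping sequentially.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;

public class ParallelCleanupSketch {
    // Hypothetical stand-in for the real per-directory cleanup call;
    // here it only records which path it handled.
    static String clean(String dir) {
        return "cleaned:" + dir;
    }

    // Submit one cleanup task per directory and join all of them.
    static List<String> cleanAll(List<String> dirs, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = dirs.stream()
                .map(d -> pool.submit(() -> clean(d)))
                .collect(Collectors.toList());
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // propagates any cleanup failure
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // prints [cleaned:p=1, cleaned:p=2, cleaned:p=3]
        System.out.println(cleanAll(List.of("p=1", "p=2", "p=3"), 2));
    }
}
```

Joining the futures in submission order keeps failure propagation simple: the first directory whose cleanup threw will surface its exception from {{f.get()}}.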





[jira] [Commented] (HIVE-22903) Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause

2020-02-27 Thread Shubham Chaurasia (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046793#comment-17046793
 ] 

Shubham Chaurasia commented on HIVE-22903:
--

[~rameshkumar] 
Thanks for the suggestions. Yes, it was not related to constants. It's related 
to batch size.

Resetting evaluator only when isLastGroupBatch=true fixed all the cases. Fixed 
https://issues.apache.org/jira/browse/HIVE-22909 as well. I uploaded a new 
patch with this approach.

{code:java}
if (!isPartitionOrderBy) {
  // To keep the row counting correct, don't reset the row_number evaluator
  // if this is not the last batch of the group (isLastGroupBatch == false)
  if (!isLastGroupBatch && isRowNumberFunction()) {
return;
  }
  groupBatches.resetEvaluators();
}
{code}

However, I think this can be safely generalized for all functions, like:
{code:java}
if (!isPartitionOrderBy && isLastGroupBatch) {
  groupBatches.resetEvaluators();
}
{code}
Will give this a try tomorrow.
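To illustrate why the evaluator must survive batch boundaries, here is a minimal self-contained sketch — not the actual VectorPTFOperator code; {{processBatch}} and its arguments are hypothetical — of a row_number counter that is reset only on the last batch of a partition group:

```java
public class RowNumberBatchSketch {
    private int rowNumber = 0;

    // Called once per vectorized batch. The counter is reset only when this
    // batch is the last one of its PARTITION BY group, mirroring the
    // "reset only when isLastGroupBatch" fix described above.
    int[] processBatch(int batchSize, boolean isLastGroupBatch) {
        int[] out = new int[batchSize];
        for (int i = 0; i < batchSize; i++) {
            out[i] = ++rowNumber;
        }
        if (isLastGroupBatch) {
            rowNumber = 0; // next partition starts from 1 again
        }
        return out;
    }

    public static void main(String[] args) {
        RowNumberBatchSketch s = new RowNumberBatchSketch();
        // Two 1024-row batches of the same partition: numbering must continue
        // across the batch boundary instead of restarting at 1.
        int[] first = s.processBatch(1024, false);
        int[] second = s.processBatch(1024, true);
        System.out.println(first[1023] + " " + second[0] + " " + second[1023]); // 1024 1025 2048
        // A new partition restarts at 1.
        System.out.println(s.processBatch(4, true)[0]); // 1
    }
}
```

If the reset instead happened on every batch (the pre-fix behaviour), the second batch above would start from 1 again — exactly the symptom reported for tables with more than 1024 records.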

> Vectorized row_number() resets the row number after one batch in case of 
> constant expression in partition clause
> 
>
> Key: HIVE-22903
> URL: https://issues.apache.org/jira/browse/HIVE-22903
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22903.01.patch, HIVE-22903.02.patch, 
> HIVE-22903.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorized row number implementation resets the row number when constant 
> expression is passed in partition clause.
> Repro Query
> {code}
> select row_number() over(partition by 1) r1, t from over10k_n8;
> Or
> select row_number() over() r1, t from over10k_n8;
> {code}
> where table over10k_n8 contains more than 1024 records.
> This happens because currently in VectorPTFOperator, we reset evaluators if 
> only partition clause is there.
> {code:java}
> // If we are only processing a PARTITION BY, reset our evaluators.
> if (!isPartitionOrderBy) {
>   groupBatches.resetEvaluators();
> }
> {code}
> To resolve, we should also check whether the entire partition clause is a 
> constant expression; if it is, then we should not call 
> {{groupBatches.resetEvaluators()}}.





[jira] [Updated] (HIVE-22903) Vectorized row_number() resets the row number after one batch in case of constant expression in partition clause

2020-02-27 Thread Shubham Chaurasia (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shubham Chaurasia updated HIVE-22903:
-
Attachment: HIVE-22903.02.patch

> Vectorized row_number() resets the row number after one batch in case of 
> constant expression in partition clause
> 
>
> Key: HIVE-22903
> URL: https://issues.apache.org/jira/browse/HIVE-22903
> Project: Hive
>  Issue Type: Bug
>  Components: UDF, Vectorization
>Affects Versions: 4.0.0
>Reporter: Shubham Chaurasia
>Assignee: Shubham Chaurasia
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22903.01.patch, HIVE-22903.02.patch, 
> HIVE-22903.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Vectorized row number implementation resets the row number when constant 
> expression is passed in partition clause.
> Repro Query
> {code}
> select row_number() over(partition by 1) r1, t from over10k_n8;
> Or
> select row_number() over() r1, t from over10k_n8;
> {code}
> where table over10k_n8 contains more than 1024 records.
> This happens because currently in VectorPTFOperator, we reset evaluators if 
> only partition clause is there.
> {code:java}
> // If we are only processing a PARTITION BY, reset our evaluators.
> if (!isPartitionOrderBy) {
>   groupBatches.resetEvaluators();
> }
> {code}
> To resolve, we should also check whether the entire partition clause is a 
> constant expression; if it is, then we should not call 
> {{groupBatches.resetEvaluators()}}.





[jira] [Updated] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-27 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22893:

Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I've added a comment about how it could be used to the interface.

pushed to master. Thank you Jesus for reviewing the changes!

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch, HIVE-22893.13.patch, HIVE-22893.14.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Right now, if we have columnstat on a column, we use that to estimate things 
> about the column; however, if a UDF is executed on a column, the resulting 
> column is treated as an unknown and defaults are assumed.
> An improvement could be to give wide estimation(s) in case of frequently used 
> udfs.
> For example, consider {{substr(c,1,1)}}; no matter what the input is, the 
> output is at most a 1-character-long string.
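The substr example can be sketched as a tiny estimator. This is illustrative only — {{ColStat}} and {{estimateSubstr}} are hypothetical names, not Hive's actual statistics API — but it captures the idea: the output size of {{substr(c, start, len)}} with a literal length argument is capped by that length.

```java
public class SubstrStatSketch {
    // Hypothetical per-column statistic carrying only the max data size.
    static class ColStat {
        final long maxLength;
        ColStat(long maxLength) { this.maxLength = maxLength; }
    }

    // For substr(c, start, len) with a literal len, the output can never be
    // longer than min(input max length, len).
    static ColStat estimateSubstr(ColStat input, int literalLen) {
        return new ColStat(Math.min(input.maxLength, literalLen));
    }

    public static void main(String[] args) {
        ColStat c = new ColStat(100);                         // e.g. a column with max 100 chars
        System.out.println(estimateSubstr(c, 1).maxLength);   // substr(c,1,1) -> 1
        System.out.println(estimateSubstr(c, 500).maxLength); // capped by the input -> 100
    }
}
```

A per-UDF estimator of this shape lets the optimizer replace the "unknown column" default with a much tighter bound, which directly improves the data size estimates downstream.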





[jira] [Work logged] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22893?focusedWorklogId=394300=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394300
 ]

ASF GitHub Bot logged work on HIVE-22893:
-

Author: ASF GitHub Bot
Created on: 27/Feb/20 16:25
Start Date: 27/Feb/20 16:25
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #915: HIVE-22893 
StatEstimate
URL: https://github.com/apache/hive/pull/915
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 394300)
Time Spent: 2h  (was: 1h 50m)

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch, HIVE-22893.13.patch, HIVE-22893.14.patch
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Right now, if we have columnstat on a column, we use that to estimate things 
> about the column; however, if a UDF is executed on a column, the resulting 
> column is treated as an unknown and defaults are assumed.
> An improvement could be to give wide estimation(s) in case of frequently used 
> udfs.
> For example, consider {{substr(c,1,1)}}; no matter what the input is, the 
> output is at most a 1-character-long string.





[jira] [Commented] (HIVE-22938) Investigate possibility of removing empty bucket file creation mechanism in Hive-on-MR

2020-02-27 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046776#comment-17046776
 ] 

Marton Bod commented on HIVE-22938:
---

As an addition: even for MR, the current state is that when inserting into 
full ACID tables, the empty bucket files are only generated inside the staging 
directory and are not moved to the final delta directory. The empty 
bucket files are only kept for MM/insert-only tables.

> Investigate possibility of removing empty bucket file creation mechanism in 
> Hive-on-MR
> --
>
> Key: HIVE-22938
> URL: https://issues.apache.org/jira/browse/HIVE-22938
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Priority: Major
>
> As a follow-up to HIVE-22918, this ticket is to investigate whether the empty 
> bucket file creation mechanism can be removed safely when using MR as the 
> engine. 
> For a bucketed table of N buckets, each insert will generate N bucket files 
> in the delta directory, regardless of how many actual buckets are written to. 
> As an example, if a table has 500 buckets, and we insert a single record, 499 
> empty bucket files are generated alongside the single bucket that contains 
> the actual data. This makes the operation substantially slower in some cases. 
> This behaviour only seems to happen when using MR as the execution engine.
> Some components/parts of the code might depend on this behaviour though, so 
> it needs to be verified that removing this logic does not interfere with 
> anything.





[jira] [Commented] (HIVE-22832) Parallelise direct insert directory cleaning process

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046765#comment-17046765
 ] 

Hive QA commented on HIVE-22832:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
1s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
 7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
1s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
41s{color} | {color:red} ql: The patch generated 5 new + 103 unchanged - 3 
fixed = 108 total (was 106) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m  6s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20854/dev-support/hive-personality.sh
 |
| git revision | master / ff9fa68 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20854/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20854/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20854/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Parallelise direct insert directory cleaning process
> 
>
> Key: HIVE-22832
> URL: https://issues.apache.org/jira/browse/HIVE-22832
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22832.1.patch, HIVE-22832.2.patch, 
> HIVE-22832.3.patch, HIVE-22832.4.patch, HIVE-22832.5.patch, 
> HIVE-22832.6.patch, HIVE-22832.7.patch, HIVE-22832.8.patch
>
>
> Inside Utilities::handleDirectInsertTableFinalPath, the 
> cleanDirectInsertDirectories method is called sequentially for each element 
> of the directInsertDirectories list, which might have a large number of 
> elements depending on how many partitions were written. This current 
> sequential execution could be improved by parallelising the clean up process. 





[jira] [Updated] (HIVE-22865) Include data in replication staging directory

2020-02-27 Thread PRAVIN KUMAR SINHA (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PRAVIN KUMAR SINHA updated HIVE-22865:
--
Attachment: HIVE-22865.5.patch

> Include data in replication staging directory
> -
>
> Key: HIVE-22865
> URL: https://issues.apache.org/jira/browse/HIVE-22865
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22865.1.patch, HIVE-22865.2.patch, 
> HIVE-22865.3.patch, HIVE-22865.4.patch, HIVE-22865.5.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-22938) Investigate possibility of removing empty bucket file creation mechanism in Hive-on-MR

2020-02-27 Thread Marton Bod (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046751#comment-17046751
 ] 

Marton Bod commented on HIVE-22938:
---

[~ashutoshc], [~gopalv] - I'm investigating whether we can stop creating empty 
bucket files when using MR/Spark (seems like we're already not creating them 
with Tez). So far I have not seen a scenario which makes use of these empty 
files: in my local tests, I have manually deleted some of these empty files 
from the delta directories and did not see any anomalies afterwards when 
reading the data back, or running compaction. But I might be missing some other 
area - do you have any ideas where the empty bucket files might become 
important?

> Investigate possibility of removing empty bucket file creation mechanism in 
> Hive-on-MR
> --
>
> Key: HIVE-22938
> URL: https://issues.apache.org/jira/browse/HIVE-22938
> Project: Hive
>  Issue Type: Task
>Reporter: Marton Bod
>Priority: Major
>
> As a follow-up to HIVE-22918, this ticket is to investigate whether the empty 
> bucket file creation mechanism can be removed safely when using MR as the 
> engine. 
> For a bucketed table of N buckets, each insert will generate N bucket files 
> in the delta directory, regardless of how many actual buckets are written to. 
> As an example, if a table has 500 buckets, and we insert a single record, 499 
> empty bucket files are generated alongside the single bucket that contains 
> the actual data. This makes the operation substantially slower in some cases. 
> This behaviour only seems to happen when using MR as the execution engine.
> Some components/parts of the code might depend on this behaviour though, so 
> it needs to be verified that removing this logic does not interfere with 
> anything.





[jira] [Updated] (HIVE-22865) Include data in replication staging directory

2020-02-27 Thread PRAVIN KUMAR SINHA (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PRAVIN KUMAR SINHA updated HIVE-22865:
--
Attachment: HIVE-22865.4.patch

> Include data in replication staging directory
> -
>
> Key: HIVE-22865
> URL: https://issues.apache.org/jira/browse/HIVE-22865
> Project: Hive
>  Issue Type: Task
>Reporter: PRAVIN KUMAR SINHA
>Assignee: PRAVIN KUMAR SINHA
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22865.1.patch, HIVE-22865.2.patch, 
> HIVE-22865.3.patch, HIVE-22865.4.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046739#comment-17046739
 ] 

Hive QA commented on HIVE-22786:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994756/HIVE-22786.5.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 18073 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18] (batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join18_multi_distinct]
 (batchId=30)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join32] (batchId=99)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[distinct_stats] 
(batchId=83)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby8_map_skew] 
(batchId=57)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby9] (batchId=7)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_distinct_samekey]
 (batchId=66)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_multi_insert_common_distinct]
 (batchId=86)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_nocolumnalign] 
(batchId=64)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[groupby_position] 
(batchId=46)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join18] (batchId=102)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[join18_multi_distinct] 
(batchId=32)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_distinct] 
(batchId=9)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[multi_insert_gby3] 
(batchId=87)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nullgroup4] (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[nullgroup4_multi_distinct]
 (batchId=12)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_vectorization_limit]
 (batchId=28)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_count] 
(batchId=15)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorization_limit] 
(batchId=42)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_distinct_gby] 
(batchId=86)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[explainuser_2] 
(batchId=163)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[count] 
(batchId=172)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[metadataonly1]
 (batchId=185)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_union_multiinsert]
 (batchId=175)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_groupby_sort_11]
 (batchId=190)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20853/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20853/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20853/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 25 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994756 - PreCommit-HIVE-Build

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch
>
>






[jira] [Assigned] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor reassigned HIVE-22941:
---

Assignee: László Bodor

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so select count(*) > 0 still held afterwards. 
> HIVE-21714 seems to enable writing empty files regardless of execution engine 
> / table type, which is not the right approach: the proper solution would be 
> to completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.





[jira] [Commented] (HIVE-22942) Replace PTest with an alternative

2020-02-27 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046730#comment-17046730
 ] 

Zoltan Haindrich commented on HIVE-22942:
-

How it works right now:
* we run a [job on the ASF jenkins 
instance|https://builds.apache.org/job/PreCommit-HIVE-Build/] which logs into 
some cloud instance to launch the ptest execution
* the ptest uses a predefined number of executors (16?) 
* the tests are batched by a custom logic into ~200 batches
* every executor runs 2 batches at a time
* there are some specially tailored features; like timeout at batch level and a 
way to run something in "isolation"

Right now I think the following would be the most promising:
* drop in the [parallel-test-executor plugin for 
jenkins|https://plugins.jenkins.io/parallel-test-executor/]
* it basically works by scanning the last result and making roughly equally 
sized test groups, then running those... however, it cannot cope with test 
cases which run longer than the bucket size; this could probably be addressed 
by adding some logic to split the larger cases into ~30m parts
* creating a job which utilizes the plugin is quite straightforward; so by 
adding all the executors as slaves to a jenkins, we will be able to utilize 
the same compute power
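The grouping the plugin performs can be approximated by a greedy longest-processing-time split over the previous run's timings. The following is an illustrative sketch only (not the plugin's actual code): each test, taken in descending order of its last-run duration, is assigned to the currently lightest bucket.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TestSplitSketch {
    // Greedy longest-processing-time split: assign each test (sorted by
    // descending duration from the previous run) to the lightest bucket,
    // approximating equally sized test groups.
    static List<List<String>> split(Map<String, Integer> durations, int buckets) {
        List<List<String>> groups = new ArrayList<>();
        long[] load = new long[buckets];
        for (int i = 0; i < buckets; i++) {
            groups.add(new ArrayList<>());
        }
        durations.entrySet().stream()
            .sorted((a, b) -> b.getValue() - a.getValue())
            .forEach(e -> {
                int min = 0;
                for (int i = 1; i < buckets; i++) {
                    if (load[i] < load[min]) {
                        min = i;
                    }
                }
                groups.get(min).add(e.getKey());
                load[min] += e.getValue();
            });
        return groups;
    }

    public static void main(String[] args) {
        Map<String, Integer> d = new LinkedHashMap<>();
        d.put("TestA", 50); d.put("TestB", 30); d.put("TestC", 20); d.put("TestD", 10);
        // prints [[TestA, TestD], [TestB, TestC]] -- two buckets of 60/50
        System.out.println(split(d, 2));
    }
}
```

This also makes the limitation above concrete: a single test case longer than the target bucket duration dominates its bucket no matter how the rest are arranged, which is why splitting such cases into ~30m parts would be needed.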

> Replace PTest with an alternative
> -
>
> Key: HIVE-22942
> URL: https://issues.apache.org/jira/browse/HIVE-22942
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> I never opened a jira about this...but it might actually help collect ideas 
> and actually start going somewhere sooner than later :D
> Right now we maintain the ptest2 project inside Hive to be able to run Hive 
> tests in a distributed fashion... the drawback of this solution is that we are 
> putting much effort into maintaining a distributed test execution framework...
> I think it would be better if we could find an off the shelf solution for the 
> task and migrate to that instead of putting more efforts into the ptest 
> framework





[jira] [Updated] (HIVE-22872) Support multiple executors for scheduled queries

2020-02-27 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-22872:

Attachment: HIVE-22872.03.patch

> Support multiple executors for scheduled queries
> 
>
> Key: HIVE-22872
> URL: https://issues.apache.org/jira/browse/HIVE-22872
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22872.01.patch, HIVE-22872.02.patch, 
> HIVE-22872.03.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Commented] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046684#comment-17046684
 ] 

Hive QA commented on HIVE-22786:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  
6s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
47s{color} | {color:red} ql: The patch generated 1 new + 404 unchanged - 0 
fixed = 405 total (was 404) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 37s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20853/dev-support/hive-personality.sh
 |
| git revision | master / ff9fa68 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20853/yetus/diff-checkstyle-ql.txt
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20853/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20853/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch
>
>






[jira] [Commented] (HIVE-22942) Replace PTest with an alternative

2020-02-27 Thread Aron Hamvas (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046676#comment-17046676
 ] 

Aron Hamvas commented on HIVE-22942:


Good!
We have already been brainstorming about this a bit with [~b.maidics] and 
[~zchovan], and I heard rumors that [~pvary] had ideas about this last year. 
Might be good to involve those guys as they all seemed heavily interested.

The (informal) discussions were focused on two major topics:
1. Execution engine. E.g. moving to JUnit 5 is probably not a huge effort, and 
it supports parallel execution, which could speed up test runs.
2. Rewriting the PTest framework or replacing it with a more general-purpose 
alternative.
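For reference, JUnit 5 parallel execution is switched on via configuration 
parameters, e.g. in a junit-platform.properties file on the test classpath 
(values shown are the documented JUnit Platform properties; the strategy 
choice is just an example):

```
# junit-platform.properties (on the test classpath)
junit.jupiter.execution.parallel.enabled = true
junit.jupiter.execution.parallel.mode.default = concurrent
junit.jupiter.execution.parallel.config.strategy = dynamic
```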

> Replace PTest with an alternative
> -
>
> Key: HIVE-22942
> URL: https://issues.apache.org/jira/browse/HIVE-22942
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> I never opened a jira about this... but it might actually help collect ideas 
> and start going somewhere sooner rather than later :D
> Right now we maintain the ptest2 project inside Hive to be able to run Hive 
> tests in a distributed fashion... the drawback of this solution is that we are 
> putting a lot of effort into maintaining a distributed test execution framework.
> I think it would be better if we could find an off-the-shelf solution for the 
> task and migrate to that instead of putting more effort into the ptest 
> framework.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046653#comment-17046653
 ] 

Hive QA commented on HIVE-22893:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994754/HIVE-22893.14.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18073 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20852/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20852/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20852/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994754 - PreCommit-HIVE-Build

> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: https://issues.apache.org/jira/browse/HIVE-22893
> Project: Hive
>  Issue Type: Improvement
>  Components: Statistics
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22893.01.patch, HIVE-22893.02.patch, 
> HIVE-22893.03.patch, HIVE-22893.04.patch, HIVE-22893.05.patch, 
> HIVE-22893.06.patch, HIVE-22893.07.patch, HIVE-22893.08.patch, 
> HIVE-22893.09.patch, HIVE-22893.10.patch, HIVE-22893.11.patch, 
> HIVE-22893.12.patch, HIVE-22893.13.patch, HIVE-22893.14.patch
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Right now, if we have column statistics on a column, we use them to estimate 
> things about the column; however, if a UDF is executed on a column, the 
> resulting column is treated as unknown and defaults are assumed.
> An improvement could be to give wide estimations in the case of frequently 
> used UDFs.
> For example, consider {{substr(c,1,1)}}; no matter what the input is, the 
> output is a string of length at most 1.
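The capping idea above can be sketched as follows (class and method names are 
hypothetical illustrations, not Hive's actual StatEstimator interface):

```java
// Hypothetical sketch: cap the estimated output width of
// substr(col, start, len) by the requested length, regardless of how
// wide the input column's statistics say it is.
public class SubstrWidthEstimate {

    // inputAvgWidth: the average-width statistic of the input column;
    // requestedLen: the third argument of substr, when it is a constant.
    public static long estimateOutputWidth(long inputAvgWidth, long requestedLen) {
        // The output can never be longer than the requested substring
        // length, nor longer than the input itself.
        return Math.min(inputAvgWidth, requestedLen);
    }

    public static void main(String[] args) {
        // substr(c, 1, 1) on a column with average width 40 -> at most 1
        System.out.println(estimateOutputWidth(40, 1)); // prints 1
    }
}
```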



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-22941:

Status: Patch Available  (was: Open)

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue where INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> Of these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so a subsequent select count(*) still returned 
> more than 0. HIVE-21714 seems to enable writing empty files regardless of 
> execution engine / table type, which is not the proper way, as the proper 
> solution would be to completely avoid writing empty files for Tez (this is 
> what HIVE-14014 was about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they somehow 
> rely on the empty generated file. We need to find a proper solution that is 
> applicable to all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-22941:

Attachment: HIVE-22941.01.patch

> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
> Attachments: HIVE-22941.01.patch
>
>
> There were multiple patches targeting an issue where INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> Of these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so a subsequent select count(*) still returned 
> more than 0. HIVE-21714 seems to enable writing empty files regardless of 
> execution engine / table type, which is not the proper way, as the proper 
> solution would be to completely avoid writing empty files for Tez (this is 
> what HIVE-14014 was about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they somehow 
> rely on the empty generated file. We need to find a proper solution that is 
> applicable to all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Attachment: HIVE-22926.02.patch
Status: Patch Available  (was: In Progress)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.02.patch, 
> HIVE-22926.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Status: In Progress  (was: Patch Available)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.02.patch, 
> HIVE-22926.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Status: In Progress  (was: Patch Available)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Attachment: HIVE-22926.01.patch
Status: Patch Available  (was: In Progress)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.01.patch, HIVE-22926.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22931) HoS dynamic partitioning fails with blobstore optimizations off

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ádám Szita updated HIVE-22931:
--
Attachment: HIVE-22931.1.patch

> HoS dynamic partitioning fails with blobstore optimizations off
> ---
>
> Key: HIVE-22931
> URL: https://issues.apache.org/jira/browse/HIVE-22931
> Project: Hive
>  Issue Type: Bug
>Reporter: Ádám Szita
>Assignee: Ádám Szita
>Priority: Major
> Attachments: HIVE-22931.0.patch, HIVE-22931.1.patch
>
>
> Reproduction steps:
>  - Create s3a backed table and normal table.
> {code:java}
> CREATE TABLE source (
>   a string,
>   b int,
>   c int);
>   
> CREATE TABLE target (
>   a string)
> PARTITIONED BY (
>   b int,
>   c int)
> STORED AS parquet
> LOCATION
>   's3a://somepath';
> {code}
>  - Insert values into normal table.
> {code:java}
> INSERT INTO TABLE source VALUES ("a", "1", "1");
> {code}
>  - Do an insert overwrite with dynamic partitions:
> {code:java}
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.blobstore.optimizations.enabled=false;
> set hive.execution.engine=spark;
> INSERT OVERWRITE TABLE target partition (b,c)
> SELECT *
> FROM source;{code}
> This fails only with Spark execution engine + blobstorage optimizations being 
> turned off with:
> {code}
> 2020-01-16 15:24:56,064 ERROR hive.ql.metadata.Hive: 
> [load-dynamic-partitions-5]: Exception when loading partition with parameters 
>  
> partPath=hdfs://nameservice1/tmp/hive/hive/6bcee075-b637-429e-9bf0-a2658355415e/hive_2020-01-16_15-24-01_156_4299941251929377815-4/-mr-1/.hive-staging_hive_2020-01-16_15-24-01_156_4299941251929377815-4/-ext-10002,
>   table=email_click_base,  partSpec={b=null, c=null},  replace=true,  
> listBucketingEnabled=false,  isAcid=false,  
> hasFollowingStatsTask=true
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> MetaException(message:Partition spec is incorrect. {companyid=null, 
> eventmonth=null})
> at 
> org.apache.hadoop.hive.ql.metadata.Hive.loadPartitionInternal(Hive.java:1666)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22893) Enhance data size estimation for fields computed by UDFs

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046611#comment-17046611
 ] 

Hive QA commented on HIVE-22893:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
57s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
14s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
35s{color} | {color:blue} common in master has 63 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  1m 
19s{color} | {color:blue} standalone-metastore/metastore-server in master has 
185 extant Findbugs warnings. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
59s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m 
25s{color} | {color:blue} contrib in master has 11 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
43s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
3s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
44s{color} | {color:red} ql: The patch generated 6 new + 127 unchanged - 0 
fixed = 133 total (was 127) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  4m  
1s{color} | {color:red} ql generated 1 new + 1530 unchanged - 0 fixed = 1531 
total (was 1530) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 2 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 38m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:ql |
|  |  Dead store to start in 
org.apache.hadoop.hive.ql.udf.UDFSubstr$SubStrStatEstimator.estimate(List)  At 
UDFSubstr.java:org.apache.hadoop.hive.ql.udf.UDFSubstr$SubStrStatEstimator.estimate(List)
  At UDFSubstr.java:[line 156] |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20852/dev-support/hive-personality.sh
 |
| git revision | master / a846608 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| checkstyle | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20852/yetus/diff-checkstyle-ql.txt
 |
| findbugs | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20852/yetus/new-findbugs-ql.html
 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20852/yetus/patch-asflicense-problems.txt
 |
| modules | C: common standalone-metastore/metastore-server ql contrib U: . |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20852/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Enhance data size estimation for fields computed by UDFs
> 
>
> Key: HIVE-22893
> URL: 

[jira] [Updated] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-27 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-22819:
--
Fix Version/s: 4.0.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Pushed to master.

Thanks for the patch [~Marton Bod], and [~ste...@apache.org] for the review!

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch, 
> HIVE-22819.6.patch, HIVE-22819.7.patch, HIVE-22819.8.patch
>
>
> {color:#ff}Hive::listFilesCreatedByQuery{color} does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant). 
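The single-call pattern described above can be sketched like this (using 
java.nio for a self-contained example; Hive's actual code works against 
Hadoop's FileSystem API):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Instead of exists() + isDir() + list -- three round trips on an object
// store -- issue the listing directly and treat "no such directory" as an
// empty result, so only one remote call is made in the common case.
public class SingleCallListing {

    public static List<Path> listOrEmpty(Path dir) throws IOException {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                files.add(p);
            }
        } catch (NoSuchFileException e) {
            // Directory does not exist: one failed call replaces a
            // separate exists() probe.
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(listOrEmpty(Paths.get("/definitely/missing/dir")));
    }
}
```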



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22453) Describe table unnecessarily fetches partitions

2020-02-27 Thread Toshihiko Uchida (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046593#comment-17046593
 ] 

Toshihiko Uchida commented on HIVE-22453:
-

[~vgarg]
Rebased and uploaded the patch.

> Describe table unnecessarily fetches partitions
> ---
>
> Key: HIVE-22453
> URL: https://issues.apache.org/jira/browse/HIVE-22453
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 2.3.6
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Attachments: HIVE-22453.2.patch, HIVE-22453.2.patch, 
> HIVE-22453.3.patch, HIVE-22453.4.patch, HIVE-22453.patch
>
>
> The simple describe table command without EXTENDED and FORMATTED (i.e., 
> DESCRIBE table_name) fetches all partitions when no partition is specified, 
> although it does not display partition statistics anyway.
> The command should not fetch partitions, since doing so can take a long time 
> for a large number of partitions.
> For instance, in our environment, the command takes around 8 seconds for a 
> table with 8760 (24 * 365) partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-27 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Attachment: HIVE-22925.3.patch

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch, HIVE-22925.2.patch, 
> HIVE-22925.3.patch
>
>
> In certain cases the TopNKey filter might work in an inefficient way and add 
> extra CPU overhead. For example, if the rows are coming in descending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio of forwarded_rows to total_rows is too high.
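A minimal sketch of such a runtime check (class name, threshold, and the 
point at which the decision is made are all illustrative, not the actual 
patch):

```java
// Disable the filter when the ratio of forwarded rows to total rows gets
// too close to 1, i.e. the filter is doing work but dropping almost nothing.
public class TopNKeyEfficiencyCheck {
    private long total;
    private long forwarded;
    private final long checkAfter;   // only decide once enough rows were seen
    private final double maxRatio;   // e.g. 0.95
    private boolean disabled;

    public TopNKeyEfficiencyCheck(long checkAfter, double maxRatio) {
        this.checkAfter = checkAfter;
        this.maxRatio = maxRatio;
    }

    public boolean isDisabled() {
        return disabled;
    }

    public void onRow(boolean wasForwarded) {
        total++;
        if (wasForwarded) {
            forwarded++;
        }
        // Once enough rows were observed, check whether filtering pays off.
        if (total >= checkAfter && (double) forwarded / total > maxRatio) {
            disabled = true;
        }
    }

    public static void main(String[] args) {
        TopNKeyEfficiencyCheck check = new TopNKeyEfficiencyCheck(100, 0.95);
        for (int i = 0; i < 100; i++) {
            check.onRow(true);       // the filter forwards everything
        }
        System.out.println(check.isDisabled()); // prints true
    }
}
```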



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22929:
--
Status: Patch Available  (was: Open)

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}
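The lexer action above recompiles a throwaway regex for every quoted 
identifier. A minimal sketch of the precompiled alternative (class and method 
names are illustrative, not the actual patch):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Compile the literal "``" pattern once and reuse it for every quoted
// identifier, instead of letting String.replaceAll() (or even
// String.replace()) compile a fresh Pattern on each call.
public class QuotedIdUnescape {
    private static final Pattern DOUBLE_BACKTICK =
            Pattern.compile("``", Pattern.LITERAL);

    public static String unescape(String quoted) {
        // Strip the surrounding backticks, then collapse `` to `.
        String inner = quoted.substring(1, quoted.length() - 1);
        return DOUBLE_BACKTICK.matcher(inner)
                .replaceAll(Matcher.quoteReplacement("`"));
    }

    public static void main(String[] args) {
        System.out.println(unescape("`a``b`")); // prints a`b
    }
}
```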



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-27 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Open  (was: Patch Available)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch, HIVE-22925.2.patch, 
> HIVE-22925.3.patch
>
>
> In certain cases the TopNKey filter might work in an inefficient way and add 
> extra CPU overhead. For example, if the rows are coming in descending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio of forwarded_rows to total_rows is too high.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22925) Implement TopNKeyFilter efficiency check

2020-02-27 Thread Attila Magyar (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Attila Magyar updated HIVE-22925:
-
Status: Patch Available  (was: Open)

> Implement TopNKeyFilter efficiency check
> 
>
> Key: HIVE-22925
> URL: https://issues.apache.org/jira/browse/HIVE-22925
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Reporter: Attila Magyar
>Assignee: Attila Magyar
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-22925.1.patch, HIVE-22925.2.patch, 
> HIVE-22925.3.patch
>
>
> In certain cases the TopNKey filter might work in an inefficient way and add 
> extra CPU overhead. For example, if the rows are coming in descending order 
> but the filter wants the top N smallest elements, the filter will forward 
> everything.
> Inefficiency should be detected at runtime so that the filter can be disabled 
> if the ratio of forwarded_rows to total_rows is too high.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22929) Performance: quoted identifier parsing uses throwaway Regex via String.replaceAll()

2020-02-27 Thread Krisztian Kasa (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Kasa updated HIVE-22929:
--
Attachment: HIVE-22929.1.patch

> Performance: quoted identifier parsing uses throwaway Regex via 
> String.replaceAll()
> ---
>
> Key: HIVE-22929
> URL: https://issues.apache.org/jira/browse/HIVE-22929
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal Vijayaraghavan
>Assignee: Krisztian Kasa
>Priority: Major
> Attachments: HIVE-22929.1.patch, String.replaceAll.png
>
>
>  !String.replaceAll.png! 
> https://github.com/apache/hive/blob/master/parser/src/java/org/apache/hadoop/hive/ql/parse/HiveLexer.g#L530
> {code}
> '`'  ( '``' | ~('`') )* '`' { setText(getText().substring(1, 
> getText().length() -1 ).replaceAll("``", "`")); }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22453) Describe table unnecessarily fetches partitions

2020-02-27 Thread Toshihiko Uchida (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiko Uchida updated HIVE-22453:

Attachment: HIVE-22453.4.patch

> Describe table unnecessarily fetches partitions
> ---
>
> Key: HIVE-22453
> URL: https://issues.apache.org/jira/browse/HIVE-22453
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.2, 2.3.6
>Reporter: Toshihiko Uchida
>Assignee: Toshihiko Uchida
>Priority: Minor
> Attachments: HIVE-22453.2.patch, HIVE-22453.2.patch, 
> HIVE-22453.3.patch, HIVE-22453.4.patch, HIVE-22453.patch
>
>
> The simple describe table command without EXTENDED and FORMATTED (i.e., 
> DESCRIBE table_name) fetches all partitions when no partition is specified, 
> although it does not display partition statistics anyway.
> The command should not fetch partitions, since doing so can take a long time 
> for a large number of partitions.
> For instance, in our environment, the command takes around 8 seconds for a 
> table with 8760 (24 * 365) partitions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-22926 started by Aasha Medhi.
--
> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22926) Schedule Repl Dump Task using Hive scheduler

2020-02-27 Thread Aasha Medhi (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aasha Medhi updated HIVE-22926:
---
Attachment: HIVE-22926.patch
Status: Patch Available  (was: In Progress)

> Schedule Repl Dump Task using Hive scheduler
> 
>
> Key: HIVE-22926
> URL: https://issues.apache.org/jira/browse/HIVE-22926
> Project: Hive
>  Issue Type: Bug
>Reporter: Aasha Medhi
>Assignee: Aasha Medhi
>Priority: Major
> Attachments: HIVE-22926.patch
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046565#comment-17046565
 ] 

Hive QA commented on HIVE-22819:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994744/HIVE-22819.8.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:green}SUCCESS:{color} +1 due to 18073 tests passed

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20851/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20851/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20851/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994744 - PreCommit-HIVE-Build

> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch, 
> HIVE-22819.6.patch, HIVE-22819.7.patch, HIVE-22819.8.patch
>
>
> {color:#ff}Hive::listFilesCreatedByQuery{color} does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant). 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22731) Probe MapJoin hashtables for row level filtering

2020-02-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22731?focusedWorklogId=394107=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394107
 ]

ASF GitHub Bot logged work on HIVE-22731:
-

Author: ASF GitHub Bot
Created on: 27/Feb/20 11:52
Start Date: 27/Feb/20 11:52
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #926: HIVE-22731 Probe 
decode using ORC row-level filtering
URL: https://github.com/apache/hive/pull/926
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 394107)
Time Spent: 0.5h  (was: 20m)

> Probe MapJoin hashtables for row level filtering
> 
>
> Key: HIVE-22731
> URL: https://issues.apache.org/jira/browse/HIVE-22731
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22731.1.patch, HIVE-22731.2.patch, 
> HIVE-22731.WIP.patch, decode_time_bars.pdf
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, RecordReaders such as ORC support filtering at coarser-grained 
> levels, namely: file, stripe (64 to 256 MB), and row group (10k rows) level. 
> They only filter sets of rows if they can guarantee that none of the rows can 
> pass a filter (usually given as a searchable argument).
> However, a significant amount of time can be spent decoding rows with 
> multiple columns that are not even used in the final result. See the figure, where 
> "original" is what happens today and in "LazyDecode" we skip decoding rows that 
> do not match the key.
> To enable more fine-grained filtering in the particular case of a MapJoin, 
> we could utilize the key HashTable created from the smaller table to skip 
> deserializing row columns of the larger table that do not match any key, and 
> thus save CPU time. 
> This Jira investigates this direction. 
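The probe-decode idea above can be sketched as follows: probe the build side's key set before materializing the remaining columns of each big-table row. This is a self-contained simulation, not Hive's actual operator code; the types and names are illustrative.

```java
import java.util.*;

public class ProbeDecodeSketch {
    record Row(int key, String payload) {}

    // Simulate row-level filtering: decode only the join-key column first,
    // probe the small-table key set, and fully decode a row only on a match.
    static List<Row> probeJoinSide(List<int[]> encoded, Set<Integer> buildKeys) {
        List<Row> decoded = new ArrayList<>();
        for (int[] raw : encoded) {
            int key = raw[0];              // decode only the join key column
            if (!buildKeys.contains(key)) {
                continue;                  // skip decoding the other columns
            }
            decoded.add(new Row(key, "col:" + raw[1])); // full decode on match
        }
        return decoded;
    }

    public static void main(String[] args) {
        Set<Integer> buildKeys = Set.of(1, 3);                 // small (build) table keys
        List<int[]> bigSide = List.of(
            new int[]{1, 10}, new int[]{2, 20}, new int[]{3, 30});
        System.out.println(probeJoinSide(bigSide, buildKeys).size());
    }
}
```

The CPU saving comes from the skipped decode work on non-matching rows, which the attached decode_time_bars.pdf quantifies for the real implementation.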



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22900) Predicate Push Down Of Like Filter While Fetching Partition Data From MetaStore

2020-02-27 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046531#comment-17046531
 ] 

Syed Shameerur Rahman commented on HIVE-22900:
--

[~jcamachorodriguez]

Thank You for the review. I have addressed your comments in the latest patch 
and the same has been updated in the PR.

Yes, I have covered all the test cases in HIVE-5134.

> Predicate Push Down Of Like Filter While Fetching Partition Data From 
> MetaStore
> ---
>
> Key: HIVE-22900
> URL: https://issues.apache.org/jira/browse/HIVE-22900
> Project: Hive
>  Issue Type: New Feature
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22900.01.patch, HIVE-22900.02.patch, 
> HIVE-22900.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Predicate push down is disabled for LIKE filters while fetching partition data 
> from the metastore. The following patch introduces PPD for LIKE filters when 
> fetching partition data from the metastore, for both direct SQL and JDO. The 
> patch also covers all the test cases mentioned in HIVE-5134, because of which 
> predicate push down for LIKE filters was disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22819) Refactor Hive::listFilesCreatedByQuery to make it faster for object stores

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046524#comment-17046524
 ] 

Hive QA commented on HIVE-22819:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 10m 
20s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  3m 
58s{color} | {color:blue} ql in master has 1530 extant Findbugs warnings. 
{color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m  
5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
15s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 26m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 
3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/data/hiveptest/working/yetus_PreCommit-HIVE-Build-20851/dev-support/hive-personality.sh
 |
| git revision | master / a846608 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.1 |
| asflicense | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20851/yetus/patch-asflicense-problems.txt
 |
| modules | C: ql U: ql |
| Console output | 
http://104.198.109.242/logs//PreCommit-HIVE-Build-20851/yetus.txt |
| Powered by | Apache Yetus http://yetus.apache.org |


This message was automatically generated.



> Refactor Hive::listFilesCreatedByQuery to make it faster for object stores
> --
>
> Key: HIVE-22819
> URL: https://issues.apache.org/jira/browse/HIVE-22819
> Project: Hive
>  Issue Type: Improvement
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
> Attachments: HIVE-22819.1.patch, HIVE-22819.2.patch, 
> HIVE-22819.3.patch, HIVE-22819.4.patch, HIVE-22819.5.patch, 
> HIVE-22819.6.patch, HIVE-22819.7.patch, HIVE-22819.8.patch
>
>
> {color:#ff}Hive::listFilesCreatedByQuery{color} does an exists(), an 
> isDir() and then a listing call. This can be expensive in object stores. We 
> should instead directly list the files in the directory (we'd have to handle 
> an exception if the directory does not exist, but issuing a single call to 
> the object store would most likely still end up being more performant). 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22731) Probe MapJoin hashtables for row level filtering

2020-02-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22731?focusedWorklogId=394100=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394100
 ]

ASF GitHub Bot logged work on HIVE-22731:
-

Author: ASF GitHub Bot
Created on: 27/Feb/20 11:19
Start Date: 27/Feb/20 11:19
Worklog Time Spent: 10m 
  Work Description: pgaref commented on pull request #884: HIVE-22731 Probe 
decode initial patch
URL: https://github.com/apache/hive/pull/884
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 394100)
Time Spent: 20m  (was: 10m)

> Probe MapJoin hashtables for row level filtering
> 
>
> Key: HIVE-22731
> URL: https://issues.apache.org/jira/browse/HIVE-22731
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive, llap
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
> Attachments: HIVE-22731.1.patch, HIVE-22731.2.patch, 
> HIVE-22731.WIP.patch, decode_time_bars.pdf
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, RecordReaders such as ORC support filtering at coarser-grained 
> levels, namely: file, stripe (64 to 256 MB), and row group (10k rows) level. 
> They only filter sets of rows if they can guarantee that none of the rows can 
> pass a filter (usually given as a searchable argument).
> However, a significant amount of time can be spent decoding rows with 
> multiple columns that are not even used in the final result. See the figure, where 
> "original" is what happens today and in "LazyDecode" we skip decoding rows that 
> do not match the key.
> To enable more fine-grained filtering in the particular case of a MapJoin, 
> we could utilize the key HashTable created from the smaller table to skip 
> deserializing row columns of the larger table that do not match any key, and 
> thus save CPU time. 
> This Jira investigates this direction. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22933) Allow connecting kerberos-enabled Hive to connect to a non-kerberos druid cluster

2020-02-27 Thread Hive QA (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-22933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046491#comment-17046491
 ] 

Hive QA commented on HIVE-22933:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12994682/HIVE-22933.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 18073 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_dynamic_partition_pruning]
 (batchId=199)
{noformat}

Test results: 
https://builds.apache.org/job/PreCommit-HIVE-Build/20850/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/20850/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-20850/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12994682 - PreCommit-HIVE-Build

> Allow connecting kerberos-enabled Hive to connect to a non-kerberos druid 
> cluster
> -
>
> Key: HIVE-22933
> URL: https://issues.apache.org/jira/browse/HIVE-22933
> Project: Hive
>  Issue Type: Bug
>Reporter: Nishant Bangarwa
>Assignee: Nishant Bangarwa
>Priority: Major
> Attachments: HIVE-22933.patch
>
>
> Currently, if Kerberos is enabled for Hive, it can only connect to external 
> Druid clusters which are also Kerberos-enabled, since the Druid client used to 
> connect to Druid is always KerberosHTTPClient. This task is to allow a 
> Kerberos-enabled HiveServer2 to connect to a non-kerberized Druid cluster. 
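The fix pattern described above amounts to selecting the client from a configuration flag instead of unconditionally using the Kerberos client whenever Hive itself is kerberized. A minimal sketch, with purely hypothetical interface and method names (not Hive's or Druid's actual classes):

```java
public class DruidClientFactory {
    // Hypothetical client abstraction for illustration only.
    interface HttpClient { String get(String url); }

    static HttpClient plainClient() { return url -> "plain:" + url; }
    static HttpClient kerberosClient() { return url -> "kerberos:" + url; }

    // Key point: the decision follows the *Druid* cluster's security setting,
    // not the Hive server's own Kerberos state.
    static HttpClient forCluster(boolean druidKerberosEnabled) {
        return druidKerberosEnabled ? kerberosClient() : plainClient();
    }

    public static void main(String[] args) {
        System.out.println(forCluster(false).get("http://druid:8082/status"));
    }
}
```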



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046467#comment-17046467
 ] 

László Bodor edited comment on HIVE-22941 at 2/27/20 10:44 AM:
---

issue reproduced:
{code}
export QTEST_LEAVE_FILES=true
mvn test -Dtest.output.overwrite=true -Pitests,hadoop-2 -Denforcer.skip=true 
-pl itests/qtest -Dtest=TestMiniLlapLocalCliDriver 
-Dqfile=empty_files_non_bucketed.q
...
 lbodor@HW12459  ~/repos/hive   HDP-3.1-maint ●  ls -la 
itests/qtest/target/localfs/warehouse/t1/00_0
-rw-r--r--  1 lbodor  staff  0 Feb 25 11:42 
itests/qtest/target/localfs/warehouse/t1/00_0
{code}
https://github.com/abstractdog/hive/commit/7e08a3f654d67848cc2f3a915ebb8294d98e4328


easy fix with acid/mm regression:
https://github.com/abstractdog/hive/commit/8e25b5ce11220e22dbe90958d52c63b52a482931

not necessarily related, but there are other recent jiras about empty files; 
I'm linking them so that they are aware of each other: HIVE-22918 (and HIVE-22938 
for MR)


was (Author: abstractdog):
issue reproduced:
{code}
export QTEST_LEAVE_FILES=true
mvn test -Dtest.output.overwrite=true -Pitests,hadoop-2 -Denforcer.skip=true 
-pl itests/qtest -Dtest=TestMiniLlapLocalCliDriver 
-Dqfile=empty_files_non_bucketed.q
...
 lbodor@HW12459  ~/repos/hive   HDP-3.1-maint ●  ls -la 
itests/qtest/target/localfs/warehouse/t1/00_0
-rw-r--r--  1 lbodor  staff  0 Feb 25 11:42 
itests/qtest/target/localfs/warehouse/t1/00_0
{code}
https://github.com/abstractdog/hive/commit/7e08a3f654d67848cc2f3a915ebb8294d98e4328


easy fix with acid/mm regression:
https://github.com/abstractdog/hive/commit/8e25b5ce11220e22dbe90958d52c63b52a482931


> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so select count(*) > 0 still held after that. HIVE-21714 
> seems to enable writing empty files regardless of execution engine / table 
> type, which is not the proper way, as the proper solution would be to 
> completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.
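The proposed condition from the description can be isolated as a small predicate, which makes the trade-off visible: Tez never writes the empty file, while streaming or insert-overwrite writes on other engines still do. This is only a sketch of the guard logic quoted above, with flag names taken from that snippet.

```java
public class EmptyFileGuard {
    // Sketch of the proposed predicate: create an empty bucket file only when
    // not running on Tez, and only for streaming or insert-overwrite writes.
    static boolean shouldWriteEmptyFile(boolean isTez, boolean isStreaming,
                                        boolean isInsertOverwrite) {
        return !isTez && (isStreaming || isInsertOverwrite);
    }

    public static void main(String[] args) {
        System.out.println(shouldWriteEmptyFile(true, false, true));   // Tez insert overwrite
        System.out.println(shouldWriteEmptyFile(false, false, true));  // non-Tez insert overwrite
    }
}
```

As the description notes, the ACID/MM tests that break under this guard suggest those code paths depend on the empty file existing, so the final fix likely cannot be this simple.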



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22900) Predicate Push Down Of Like Filter While Fetching Partition Data From MetaStore

2020-02-27 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated HIVE-22900:
-
Attachment: HIVE-22900.03.patch

> Predicate Push Down Of Like Filter While Fetching Partition Data From 
> MetaStore
> ---
>
> Key: HIVE-22900
> URL: https://issues.apache.org/jira/browse/HIVE-22900
> Project: Hive
>  Issue Type: New Feature
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22900.01.patch, HIVE-22900.02.patch, 
> HIVE-22900.03.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Predicate push down is disabled for LIKE filters while fetching partition data 
> from the metastore. The following patch introduces PPD for LIKE filters when 
> fetching partition data from the metastore, for both direct SQL and JDO. The 
> patch also covers all the test cases mentioned in HIVE-5134, because of which 
> predicate push down for LIKE filters was disabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22900) Predicate Push Down Of Like Filter While Fetching Partition Data From MetaStore

2020-02-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22900?focusedWorklogId=394088=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-394088
 ]

ASF GitHub Bot logged work on HIVE-22900:
-

Author: ASF GitHub Bot
Created on: 27/Feb/20 10:36
Start Date: 27/Feb/20 10:36
Worklog Time Spent: 10m 
  Work Description: shameersss1 commented on pull request #916: HIVE-22900 
: Predicate Push Down Of Like Filter While Fetching Partition Data From 
MetaStore
URL: https://github.com/apache/hive/pull/916#discussion_r385043264
 
 

 ##
 File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 ##
 @@ -1222,6 +1217,10 @@ public void visit(LeafNode node) throws MetaException {
 params.add(nodeValue);
   }
 
+  if (node.operator == Operator.LIKE) {
+nodeValue0 = nodeValue0 + " ESCAPE '\\' ";
 
 Review comment:
   Added.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 394088)
Time Spent: 0.5h  (was: 20m)

> Predicate Push Down Of Like Filter While Fetching Partition Data From 
> MetaStore
> ---
>
> Key: HIVE-22900
> URL: https://issues.apache.org/jira/browse/HIVE-22900
> Project: Hive
>  Issue Type: New Feature
>Reporter: Syed Shameerur Rahman
>Assignee: Syed Shameerur Rahman
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-22900.01.patch, HIVE-22900.02.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Predicate push down is disabled for LIKE filters while fetching partition data 
> from the metastore. The following patch introduces PPD for LIKE filters when 
> fetching partition data from the metastore, for both direct SQL and JDO. The 
> patch also covers all the test cases mentioned in HIVE-5134, because of which 
> predicate push down for LIKE filters was disabled.
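The `ESCAPE '\\'` addition from the review snippet above can be illustrated in isolation: when a metastore LIKE leaf is translated to a direct-SQL predicate, the ESCAPE clause tells the database which character escapes `%` and `_` wildcards. The names below are illustrative, not Hive's `MetaStoreDirectSql` internals.

```java
public class LikePredicateSketch {
    // Sketch: translate a filter leaf to a parameterized SQL clause, appending
    // an ESCAPE clause for LIKE so backslash-escaped wildcards keep working.
    static String toSql(String column, String operator, String paramPlaceholder) {
        String clause = column + " " + operator + " " + paramPlaceholder;
        if ("LIKE".equals(operator)) {
            clause += " ESCAPE '\\'";  // the SQL text reads: ESCAPE '\'
        }
        return clause;
    }

    public static void main(String[] args) {
        System.out.println(toSql("\"PART_NAME\"", "LIKE", "?"));
    }
}
```

Without the ESCAPE clause, a pattern like `a\%b` would be interpreted with `%` as a wildcard by some backends, which is one of the pitfalls that originally led HIVE-5134 to disable LIKE push down.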



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-27 Thread Oleksiy Sayankin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-22919:

Status: In Progress  (was: Patch Available)

> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch, HIVE-22919.4.patch, HIVE-22919.5.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
>   hive.security.authorization.enabled
>   true
> 
> 
>   hive.security.metastore.authorization.manager
>   
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
> 
> 
>   hive.security.authorization.manager
>   
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
> 
> 
>   hive.security.metastore.authenticator.manager
>   
> org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator
> 
> 
>   hive.metastore.pre.event.listeners
>   
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener
> {code}
> 2. Create a few directories and change their owners and permissions:
> {code:java}hadoop fs -mkdir /tmp/m1
> hadoop fs -mkdir /tmp/m2
> hadoop fs -mkdir /tmp/m3
> hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
> hadoop fs -chmod 700 /tmp/m[1-3]{code}
> 3. Check permissions:
> {code:java}[test@node2 ~]$ hadoop fs -ls /tmp|grep m[1-3]
> drwx--   - testuser1 testuser1  0 2020-02-11 10:25 /tmp/m1
> drwx--   - test  test   0 2020-02-11 10:25 /tmp/m2
> drwx--   - testuser1 testuser1  1 2020-02-11 10:36 /tmp/m3
> [test@node2 ~]$
> {code}
> 4. Log in to the Hive CLI using the embedded Hive Metastore as the *"testuser1"* user, 
> with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
> {code:java}
> sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf 
> hive.metastore.warehouse.dir=/tmp/m1
> {code}
> 5. Perform the next steps:
> {code:sql}-- 1. Check "hive.metastore.warehouse.dir" value:
> SET hive.metastore.warehouse.dir;
> -- 2. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user does not have an access:
> SET hive.metastore.warehouse.dir=/tmp/m2;
> -- 3. Try to create a database:
> CREATE DATABASE m2;
> -- 4. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user has an access:
> SET hive.metastore.warehouse.dir=/tmp/m3;
> -- 5. Try to create a database:
> CREATE DATABASE m3;
> {code}
> *ACTUAL RESULT:*
> Query 5 fails with the exception below. It does not honor the 
> "hive.metastore.warehouse.dir" property:
> {code:java}
> hive> -- 5. Try to create a database:
> hive> CREATE DATABASE m3;
> FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
> testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
> hive>
> {code}
> *EXPECTED RESULT:*
> Query 5 creates a database;



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-27 Thread Oleksiy Sayankin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-22919:

Status: Patch Available  (was: In Progress)

> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch, HIVE-22919.4.patch, HIVE-22919.5.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
>   hive.security.authorization.enabled
>   true
> 
> 
>   hive.security.metastore.authorization.manager
>   
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
> 
> 
>   hive.security.authorization.manager
>   
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
> 
> 
>   hive.security.metastore.authenticator.manager
>   
> org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator
> 
> 
>   hive.metastore.pre.event.listeners
>   
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener
> {code}
> 2. Create a few directories and change their owners and permissions:
> {code:java}hadoop fs -mkdir /tmp/m1
> hadoop fs -mkdir /tmp/m2
> hadoop fs -mkdir /tmp/m3
> hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
> hadoop fs -chmod 700 /tmp/m[1-3]{code}
> 3. Check permissions:
> {code:java}[test@node2 ~]$ hadoop fs -ls /tmp|grep m[1-3]
> drwx--   - testuser1 testuser1  0 2020-02-11 10:25 /tmp/m1
> drwx--   - test  test   0 2020-02-11 10:25 /tmp/m2
> drwx--   - testuser1 testuser1  1 2020-02-11 10:36 /tmp/m3
> [test@node2 ~]$
> {code}
> 4. Log in to the Hive CLI using the embedded Hive Metastore as the *"testuser1"* user, 
> with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
> {code:java}
> sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf 
> hive.metastore.warehouse.dir=/tmp/m1
> {code}
> 5. Perform the next steps:
> {code:sql}-- 1. Check "hive.metastore.warehouse.dir" value:
> SET hive.metastore.warehouse.dir;
> -- 2. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user does not have an access:
> SET hive.metastore.warehouse.dir=/tmp/m2;
> -- 3. Try to create a database:
> CREATE DATABASE m2;
> -- 4. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user has an access:
> SET hive.metastore.warehouse.dir=/tmp/m3;
> -- 5. Try to create a database:
> CREATE DATABASE m3;
> {code}
> *ACTUAL RESULT:*
> Query 5 fails with the exception below. It does not honor the 
> "hive.metastore.warehouse.dir" property:
> {code:java}
> hive> -- 5. Try to create a database:
> hive> CREATE DATABASE m3;
> FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
> testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
> hive>
> {code}
> *EXPECTED RESULT:*
> Query 5 creates a database;



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-22919) StorageBasedAuthorizationProvider does not allow create databases after changing hive.metastore.warehouse.dir

2020-02-27 Thread Oleksiy Sayankin (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksiy Sayankin updated HIVE-22919:

Attachment: HIVE-22919.5.patch

> StorageBasedAuthorizationProvider does not allow create databases after 
> changing hive.metastore.warehouse.dir
> -
>
> Key: HIVE-22919
> URL: https://issues.apache.org/jira/browse/HIVE-22919
> Project: Hive
>  Issue Type: Bug
>Reporter: Oleksiy Sayankin
>Assignee: Oleksiy Sayankin
>Priority: Major
> Attachments: HIVE-22919.1.patch, HIVE-22919.2.patch, 
> HIVE-22919.3.patch, HIVE-22919.4.patch, HIVE-22919.5.patch
>
>
> *ENVIRONMENT:*
> Hive-2.3
> *STEPS TO REPRODUCE:*
> 1. Configure Storage Based Authorization:
> {code:xml}
>   hive.security.authorization.enabled
>   true
> 
> 
>   hive.security.metastore.authorization.manager
>   
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
> 
> 
>   hive.security.authorization.manager
>   
> org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider
> 
> 
>   hive.security.metastore.authenticator.manager
>   
> org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator
> 
> 
>   hive.metastore.pre.event.listeners
>   
> org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener
> {code}
> 2. Create a few directories and change their owners and permissions:
> {code:java}hadoop fs -mkdir /tmp/m1
> hadoop fs -mkdir /tmp/m2
> hadoop fs -mkdir /tmp/m3
> hadoop fs -chown testuser1:testuser1 /tmp/m[1,3]
> hadoop fs -chmod 700 /tmp/m[1-3]{code}
> 3. Check permissions:
> {code:java}[test@node2 ~]$ hadoop fs -ls /tmp|grep m[1-3]
> drwx--   - testuser1 testuser1  0 2020-02-11 10:25 /tmp/m1
> drwx--   - test  test   0 2020-02-11 10:25 /tmp/m2
> drwx--   - testuser1 testuser1  1 2020-02-11 10:36 /tmp/m3
> [test@node2 ~]$
> {code}
> 4. Log in to the Hive CLI using the embedded Hive Metastore as the *"testuser1"* user, 
> with *"hive.metastore.warehouse.dir"* set to *"/tmp/m1"*:
> {code:java}
> sudo -u testuser1 hive --hiveconf hive.metastore.uris= --hiveconf 
> hive.metastore.warehouse.dir=/tmp/m1
> {code}
> 5. Perform the next steps:
> {code:sql}-- 1. Check "hive.metastore.warehouse.dir" value:
> SET hive.metastore.warehouse.dir;
> -- 2. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user does not have an access:
> SET hive.metastore.warehouse.dir=/tmp/m2;
> -- 3. Try to create a database:
> CREATE DATABASE m2;
> -- 4. Set "hive.metastore.warehouse.dir" to the path, to which "testuser1" 
> user has an access:
> SET hive.metastore.warehouse.dir=/tmp/m3;
> -- 5. Try to create a database:
> CREATE DATABASE m3;
> {code}
> *ACTUAL RESULT:*
> Query 5 fails with the exception below. It does not honor the 
> "hive.metastore.warehouse.dir" property:
> {code:java}
> hive> -- 5. Try to create a database:
> hive> CREATE DATABASE m3;
> FAILED: HiveException org.apache.hadoop.security.AccessControlException: User 
> testuser1(user id 5001)  does not have access to hdfs:/tmp/m2/m3.db
> hive>
> {code}
> *EXPECTED RESULT:*
> Query 5 creates a database;



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046467#comment-17046467
 ] 

László Bodor commented on HIVE-22941:
-

issue reproduced:
{code}
export QTEST_LEAVE_FILES=true
mvn test -Dtest.output.overwrite=true -Pitests,hadoop-2 -Denforcer.skip=true 
-pl itests/qtest -Dtest=TestMiniLlapLocalCliDriver 
-Dqfile=empty_files_non_bucketed.q
...
 lbodor@HW12459  ~/repos/hive   HDP-3.1-maint ●  ls -la 
itests/qtest/target/localfs/warehouse/t1/00_0
-rw-r--r--  1 lbodor  staff  0 Feb 25 11:42 
itests/qtest/target/localfs/warehouse/t1/00_0
{code}
https://github.com/abstractdog/hive/commit/7e08a3f654d67848cc2f3a915ebb8294d98e4328


easy fix with acid/mm regression:
https://github.com/abstractdog/hive/commit/8e25b5ce11220e22dbe90958d52c63b52a482931


> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the table 
> survived an insert overwrite, so select count(*) > 0 still held after that. HIVE-21714 
> seems to enable writing empty files regardless of execution engine / table 
> type, which is not the proper way, as the proper solution would be to 
> completely avoid writing empty files for Tez (this is what HIVE-14014 was 
> about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test cases 
> (both full ACID and MM) in insert_overwrite.q, which could mean they rely 
> somehow on the empty generated file. We need to find a proper solution which 
> is applicable for all table types without polluting external tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HIVE-22942) Replace PTest with an alternative

2020-02-27 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-22942:
---


> Replace PTest with an alternative
> -
>
> Key: HIVE-22942
> URL: https://issues.apache.org/jira/browse/HIVE-22942
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> I never opened a jira about this... but it might actually help collect ideas 
> and actually start going somewhere sooner rather than later :D
> Right now we maintain the ptest2 project inside Hive to be able to run Hive 
> tests in a distributed fashion... the drawback of this solution is that we are 
> putting much effort into maintaining a distributed test execution framework.
> I think it would be better if we could find an off-the-shelf solution for the 
> task and migrate to that, instead of putting more effort into the ptest 
> framework.





[jira] [Updated] (HIVE-22941) Empty files are inserted into external tables after HIVE-21714

2020-02-27 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-22941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-22941:

Description: 
There were multiple patches targeting an issue when INSERT OVERWRITE was 
ineffective if the input is empty:
HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
overwriting
HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input is 
empty
HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
input is empty

From these patches, HIVE-21714 seems to have a bad effect on external tables, 
because of this part:
https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268

The original issue before HIVE-21714 was that the original files in the table 
survived an insert overwrite, and a select count(*)>0 check was run after 
that. HIVE-21714 seems to enable writing empty files regardless of execution 
engine / table type, which is not the proper way; the proper solution would be 
to completely avoid writing empty files for Tez (this is what HIVE-14014 was 
about). I found that changing the condition to...
{code}
if (!isTez && (isStreaming || this.isInsertOverwrite)) 
{code}
(which could be an easy solution for external tables) breaks some test cases 
(both full ACID and MM) in insert_overwrite.q, which may mean they somehow 
rely on the generated empty file. We need to find a proper solution that is 
applicable to all table types without polluting external tables.

  was:
There were multiple patches targeting an issue when INSERT OVERWRITE was 
ineffective if the input is empty:
HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
overwriting
HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input is 
empty
HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
input is empty

From these patches, HIVE-21714 seems to have a bad effect on external tables, 
because of this part:
https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268

The issue was that the original files in the table survived an insert 
overwrite, and a select count(*)>0 check was run after that. HIVE-21714 seems 
to enable writing empty files regardless of execution engine, which is not the 
proper way; the proper solution would be to completely avoid writing empty 
files for Tez (this is what HIVE-14014 was about). I found that changing the 
condition to...
{code}
if (!isTez && (isStreaming || this.isInsertOverwrite)) 
{code}
(which could be an easy solution for external tables) breaks some test cases 
(both full ACID and MM) in insert_overwrite.q, which may mean they somehow 
rely on the generated empty file. We need to find a proper solution that is 
applicable to all table types without polluting external tables.


> Empty files are inserted into external tables after HIVE-21714
> --
>
> Key: HIVE-22941
> URL: https://issues.apache.org/jira/browse/HIVE-22941
> Project: Hive
>  Issue Type: Improvement
>Reporter: László Bodor
>Priority: Major
>
> There were multiple patches targeting an issue when INSERT OVERWRITE was 
> ineffective if the input is empty:
> HIVE-18702: INSERT OVERWRITE TABLE doesn't clean the table directory before 
> overwriting
> HIVE-21714: Insert overwrite on an acid/mm table is ineffective if the input 
> is empty
> HIVE-21784: Insert overwrite on an acid (not mm) table is ineffective if the 
> input is empty
> From these patches, HIVE-21714 seems to have a bad effect on external tables, 
> because of this part:
> https://github.com/apache/hive/commit/9a10bc28bee5250c0f667c94a295706a44ed4d7e#diff-9bea2581a1fba611f2c10904857b8823R1268
> The original issue before HIVE-21714 was that the original files in the 
> table survived an insert overwrite, and a select count(*)>0 check was run 
> after that. HIVE-21714 seems to enable writing empty files regardless of 
> execution engine / table type, which is not the proper way; the proper 
> solution would be to completely avoid writing empty files for Tez (this is 
> what HIVE-14014 was about). I found that changing the condition to...
> {code}
> if (!isTez && (isStreaming || this.isInsertOverwrite)) 
> {code}
> (which could be an easy solution for external tables) breaks some test 
> cases (both full ACID and MM) in insert_overwrite.q, which may mean they 
> somehow rely on the generated empty file. We need to find a proper solution 
> that is applicable to all table types without polluting external tables.





[jira] [Updated] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan updated HIVE-22786:

Attachment: HIVE-22786.5.patch
Status: Patch Available  (was: Open)

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch
>
>






[jira] [Assigned] (HIVE-22786) Vectorization: Agg with distinct can be optimised in HASH mode

2020-02-27 Thread Ramesh Kumar Thangarajan (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ramesh Kumar Thangarajan reassigned HIVE-22786:
---

Assignee: Rajesh Balamohan  (was: Ramesh Kumar Thangarajan)

> Vectorization: Agg with distinct can be optimised in HASH mode
> --
>
> Key: HIVE-22786
> URL: https://issues.apache.org/jira/browse/HIVE-22786
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: Rajesh Balamohan
>Assignee: Rajesh Balamohan
>Priority: Minor
> Attachments: HIVE-22786.1.patch, HIVE-22786.2.patch, 
> HIVE-22786.3.patch, HIVE-22786.4.wip.patch, HIVE-22786.5.patch
>
>





