[jira] [Resolved] (HIVE-24048) Harmonise Jackson components to version 2.10.latest - Hive

2020-09-05 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-24048.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Fix has been committed to master. Thank you for the patch [~hemanth619]

> Harmonise Jackson components to version 2.10.latest - Hive
> --
>
> Key: HIVE-24048
> URL: https://issues.apache.org/jira/browse/HIVE-24048
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
> Fix For: 4.0.0
>
>
> Hive uses the following Jackson components, which are not harmonised with 
> jackson-databind's version (2.10.0):
>  * jackson-dataformat-yaml 2.9.8
>  * jackson-jaxrs-base 2.9.8
> To avoid conflicts caused by version mismatches, please harmonise them with 
> jackson-databind's version.
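Harmonising here means pinning every Jackson artifact to the same release line as jackson-databind. A minimal sketch of the consistency check the issue implies — the class and method names are illustrative, not part of Hive's build; only the artifact names and versions come from the description above:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JacksonHarmony {
    // "2.10.0" -> "2.10": the major.minor release line of a version string.
    static String releaseLine(String version) {
        String[] parts = version.split("\\.");
        return parts[0] + "." + parts[1];
    }

    // True when every component shares jackson-databind's release line.
    static boolean harmonised(Map<String, String> componentVersions, String databindVersion) {
        String expected = releaseLine(databindVersion);
        for (String version : componentVersions.values()) {
            if (!releaseLine(version).equals(expected)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> components = new LinkedHashMap<>();
        components.put("jackson-dataformat-yaml", "2.9.8");
        components.put("jackson-jaxrs-base", "2.9.8");
        // 2.9.x components against a 2.10.x databind: not harmonised.
        System.out.println(harmonised(components, "2.10.0"));
    }
}
```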



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (HIVE-24045) No logging related to when default database is created

2020-09-05 Thread Naveen Gangam (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam resolved HIVE-24045.
--
Fix Version/s: 4.0.0
   Resolution: Fixed

Fix has been committed to master. Thank you for the patch [~hemanth619]

> No logging related to when default database is created
> --
>
> Key: HIVE-24045
> URL: https://issues.apache.org/jira/browse/HIVE-24045
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
> Fix For: 4.0.0
>
>
> There do not appear to be any HMS logs related to when the "default" 
> database is first created in Hive. Such logging would be useful for 
> troubleshooting.
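A hedged sketch of the kind of bootstrap hook the request implies: log once, at the moment the default database is actually created. The class and method names below are hypothetical, not the actual HMS code:

```java
import java.util.logging.Logger;

public class DefaultDbBootstrap {
    private static final Logger LOG = Logger.getLogger(DefaultDbBootstrap.class.getName());

    private boolean defaultDbExists = false;

    // Hypothetical bootstrap hook: returns true (and logs) only on the call
    // that actually creates the "default" database.
    public boolean ensureDefaultDb() {
        if (defaultDbExists) {
            return false; // already present, nothing to create or log
        }
        defaultDbExists = true;
        LOG.info("Created the \"default\" database because it did not exist");
        return true;
    }

    public static void main(String[] args) {
        DefaultDbBootstrap bootstrap = new DefaultDbBootstrap();
        System.out.println(bootstrap.ensureDefaultDb()); // first call creates and logs
        System.out.println(bootstrap.ensureDefaultDb()); // later calls are silent
    }
}
```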



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23779) BasicStatsTask Info is not getting printed in beeline console

2020-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23779?focusedWorklogId=479378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479378
 ]

ASF GitHub Bot logged work on HIVE-23779:
-

Author: ASF GitHub Bot
Created on: 06/Sep/20 00:47
Start Date: 06/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1191:
URL: https://github.com/apache/hive/pull/1191#issuecomment-687681750


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 479378)
Time Spent: 50m  (was: 40m)

> BasicStatsTask Info is not getting printed in beeline console
> -
>
> Key: HIVE-23779
> URL: https://issues.apache.org/jira/browse/HIVE-23779
> Project: Hive
>  Issue Type: Bug
>Reporter: Naresh P R
>Assignee: Naresh P R
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> After HIVE-16061, partition basic stats are no longer printed in the beeline 
> console, for example:
> {code:java}
> INFO : Partition {dt=2020-06-29} stats: [numFiles=21, numRows=22, 
> totalSize=14607, rawDataSize=0]{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23665) Rewrite last_value to first_value to enable streaming results

2020-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23665?focusedWorklogId=479380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479380
 ]

ASF GitHub Bot logged work on HIVE-23665:
-

Author: ASF GitHub Bot
Created on: 06/Sep/20 00:47
Start Date: 06/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #1177:
URL: https://github.com/apache/hive/pull/1177#issuecomment-687681754


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 479380)
Time Spent: 1h 20m  (was: 1h 10m)

> Rewrite last_value to first_value to enable streaming results
> -
>
> Key: HIVE-23665
> URL: https://issues.apache.org/jira/browse/HIVE-23665
> Project: Hive
>  Issue Type: Bug
>Reporter: Ramesh Kumar Thangarajan
>Assignee: Ramesh Kumar Thangarajan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: HIVE-23665.1.patch, HIVE-23665.2.patch, 
> HIVE-23665.3.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Rewrite last_value to first_value to enable streaming of results.
> last_value cannot be streamed because the intermediate results need to be 
> buffered to determine the window result until the last row in the window 
> arrives. But if we rewrite it to first_value, we can stream the results, 
> although the order of the results will not be guaranteed (which is also not 
> important here).
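The rewrite rests on the identity that last_value over a window ordered ascending equals first_value over the same window ordered descending. A minimal sketch of that identity in plain Java — the row representation and method names are illustrative, not Hive's implementation:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class WindowRewrite {
    // Rows are {orderKey, value} pairs; both methods compute the same value
    // over the full window, as last_value/first_value would with an
    // unbounded frame.

    // last_value with ascending order: must buffer until the final row is seen.
    static int lastValueAsc(List<int[]> rows) {
        return rows.stream()
                .sorted(Comparator.comparingInt(r -> r[0]))
                .reduce((a, b) -> b)   // keep only the last row
                .get()[1];
    }

    // first_value with descending order: determined by the first row seen,
    // which is what makes the rewritten form streamable.
    static int firstValueDesc(List<int[]> rows) {
        return rows.stream()
                .sorted(Comparator.comparingInt((int[] r) -> r[0]).reversed())
                .findFirst()
                .get()[1];
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(new int[]{1, 10}, new int[]{2, 20}, new int[]{3, 30});
        System.out.println(lastValueAsc(rows) == firstValueDesc(rows)); // prints "true"
    }
}
```

Both calls return the value of the row with the greatest order key, but only the descending first_value form can emit its result as soon as the first row arrives.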



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-22634) Improperly SemanticException when filter is optimized to False on a partition table

2020-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-22634?focusedWorklogId=479379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479379
 ]

ASF GitHub Bot logged work on HIVE-22634:
-

Author: ASF GitHub Bot
Created on: 06/Sep/20 00:47
Start Date: 06/Sep/20 00:47
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #865:
URL: https://github.com/apache/hive/pull/865#issuecomment-687681763


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 479379)
Time Spent: 1h 10m  (was: 1h)

> Improperly SemanticException when filter is optimized to False on a partition 
> table
> ---
>
> Key: HIVE-22634
> URL: https://issues.apache.org/jira/browse/HIVE-22634
> Project: Hive
>  Issue Type: Improvement
>Reporter: EdisonWang
>Assignee: EdisonWang
>Priority: Minor
>  Labels: pull-request-available
> Attachments: HIVE-22634.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When a filter is optimized to false on a partitioned table, it throws an 
> improper SemanticException reporting that no partition predicate was found.
> The steps to reproduce are:
> {code:java}
> set hive.strict.checks.no.partition.filter=true;
> CREATE TABLE test(id int, name string)PARTITIONED BY (`date` string);
> select * from test where `date` = '20191201' and 1<>1;
> {code}
>  
> The above SQL throws a "Queries against partitioned tables without a 
> partition filter" exception.
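A hypothetical sketch of the fix's logic, not Hive's actual planner code: once the filter constant-folds to FALSE, no partition is read at all, so the strict "no partition predicate" error should be suppressed rather than raised.

```java
public class FoldedFilterCheck {
    enum Filter { TRUE, FALSE, NON_CONSTANT }

    // A conjunction containing a contradiction such as 1<>1 folds to FALSE
    // regardless of the remaining conjuncts.
    static Filter simplifyConjunction(boolean hasContradiction) {
        return hasContradiction ? Filter.FALSE : Filter.NON_CONSTANT;
    }

    // Strict-mode decision: a filter folded to FALSE prunes every partition,
    // so raising "no partition predicate" for it would be spurious.
    static boolean shouldRaiseStrictError(Filter filter, boolean hasPartitionPredicate) {
        if (filter == Filter.FALSE) {
            return false; // nothing is read; skip the strict check
        }
        return !hasPartitionPredicate;
    }

    public static void main(String[] args) {
        // The reproducer's predicate `date` = '20191201' and 1<>1 folds to FALSE.
        Filter folded = simplifyConjunction(true);
        System.out.println(shouldRaiseStrictError(folded, true)); // prints "false"
    }
}
```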



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-23454) Querying hive table which has Materialized view fails with HiveAccessControlException

2020-09-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-23454?focusedWorklogId=479369&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-479369
 ]

ASF GitHub Bot logged work on HIVE-23454:
-

Author: ASF GitHub Bot
Created on: 05/Sep/20 21:09
Start Date: 05/Sep/20 21:09
Worklog Time Spent: 10m 
  Work Description: jcamachor commented on a change in pull request #1471:
URL: https://github.com/apache/hive/pull/1471#discussion_r483989906



##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -1718,6 +1722,34 @@ public Table 
apply(org.apache.hadoop.hive.metastore.api.Table table) {
 }
   }
 
+  /**
+   * Validate if given materialized view has SELECT privileges for current user
+   * @param cachedMVTable
+   * @return false if user does not have privilege otherwise true
+   * @throws HiveException
+   */
+  private boolean checkPrivillegeForMV(final Table cachedMVTable) throws 
HiveException{
+List<String> colNames =
+cachedMVTable.getAllCols().stream()
+.map(FieldSchema::getName)
+.collect(Collectors.toList());
+
+HivePrivilegeObject privObject = new 
HivePrivilegeObject(cachedMVTable.getDbName(),
+cachedMVTable.getTableName(), colNames);
+List<HivePrivilegeObject> privObjects = new ArrayList<HivePrivilegeObject>();
+privObjects.add(privObject);
+try {
+  SessionState.get().getAuthorizerV2().
+  checkPrivileges(HiveOperationType.QUERY, privObjects, privObjects, 
new HiveAuthzContext.Builder().build());

Review comment:
   Can we check the privileges for all MVs used by the query at once so we 
do not need multiple round trips?

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -1718,6 +1722,34 @@ public Table 
apply(org.apache.hadoop.hive.metastore.api.Table table) {
 }
   }
 
+  /**
+   * Validate if given materialized view has SELECT privileges for current user
+   * @param cachedMVTable
+   * @return false if user does not have privilege otherwise true
+   * @throws HiveException
+   */
+  private boolean checkPrivillegeForMV(final Table cachedMVTable) throws 
HiveException{

Review comment:
   Can we move this to HiveMaterializedViewUtils?
   
   There is also a typo: `Privillege`

##
File path: ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
##
@@ -1736,6 +1768,10 @@ public boolean 
validateMaterializedViewsFromRegistry(List cachedMateriali
   // Final result
   boolean result = true;
   for (Table cachedMaterializedViewTable : cachedMaterializedViewTables) {
+if (!checkPrivillegeForMV(cachedMaterializedViewTable)) {

Review comment:
   `validateMaterializedViewsFromRegistry` is only called if the MV is 
coming from the registry. However, we need the authorization check in all 
cases, e.g., dummy registry.
   You can simply call the new method from Calcite planner.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 479369)
Time Spent: 20m  (was: 10m)

> Querying hive table which has Materialized view fails with 
> HiveAccessControlException
> -
>
> Key: HIVE-23454
> URL: https://issues.apache.org/jira/browse/HIVE-23454
> Project: Hive
>  Issue Type: Bug
>  Components: Authorization, HiveServer2
>Affects Versions: 3.0.0, 3.2.0
>Reporter: Chiran Ravani
>Assignee: Vineet Garg
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> A query against a table fails with HiveAccessControlException when there is a 
> materialized view pointing to that table which the end user does not have 
> access to, even though the user has all privileges on the actual table.
> From the HiveServer2 logs, it looks like Hive, as part of optimization, uses 
> the materialized view to answer the query instead of the table, and since the 
> end user does not have access to the MV we receive a 
> HiveAccessControlException.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/cost/HiveVolcanoPlanner.java#L99
> The simplest reproducer for this issue is as below.
> 1. Create a table using hive user and insert some data
> {code:java}
> create table db1.testmvtable(id int, name string) partitioned by(year int);
> insert into db1.testmvtable partition(year=2020) values(1,'Name1');
> insert into db1.testmvtable partition(year=2020) values(1,'Name2');
> insert into db1.testmvtable 

[jira] [Comment Edited] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


[ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190804#comment-17190804
 ] 

László Bodor edited comment on HIVE-24111 at 9/5/20, 7:52 AM:
--

For reference:
logs for a good run:  
[^org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-output.txt] 
logs for a hanging run 1:  [^TestCrudCompactorOnTez.log] 
logs for a hanging run 2:  [^TestCrudCompactorOnTez2.log] 

What is strange at first sight is that I cannot see MergeManager-related log 
messages where they are expected, so this could be a shuffle issue.

good run:
{code}
2020-09-03T15:13:19,604  INFO [I/O Setup 0 Start: {Map 1}] 
orderedgrouped.Shuffle: Map_1: Shuffle assigned with 1 inputs, codec: None, 
ifileReadAhead: true
2020-09-03T15:13:19,605  INFO [I/O Setup 0 Start: {Map 1}] 
orderedgrouped.MergeManager: Map 1: MergerManager: memoryLimit=1278984847, 
maxSingleShuffleLimit=319746208, mergeThreshold=844130048, ioSortFactor=10, 
postMergeMem=0, memToMemMergeOutputsThreshold=10
2020-09-03T15:13:19,605  INFO [I/O Setup 0 Start: {Map 1}] 
orderedgrouped.ShuffleScheduler: ShuffleScheduler running for sourceVertex: Map 
1 with configuration: maxFetchFailuresBeforeReporting=5, 
reportReadErrorImmediately=true, maxFailedUniqueFetches=1, 
abortFailureLimit=15, maxTaskOutputAtOnce=20, numFetchers=1, 
hostFailureFraction=0.2, minFailurePerHost=4, 
maxAllowedFailedFetchFraction=0.5, maxStallTimeFraction=0.5, 
minReqProgressFraction=0.5, checkFailedFetchSinceLastCompletion=true
2020-09-03T15:13:19,606  INFO [I/O Setup 0 Start: {Map 1}] 
runtime.LogicalIOProcessorRuntimeTask: Started Input with src edge: Map 1
2020-09-03T15:13:19,606  INFO [TezChild] runtime.LogicalIOProcessorRuntimeTask: 
AutoStartComplete
2020-09-03T15:13:19,606  INFO [ShuffleAndMergeRunner {Map_1}] 
orderedgrouped.MergeManager: Setting merger's parent thread to 
ShuffleAndMergeRunner {Map_1}
2020-09-03T15:13:19,606  INFO [TezChild] task.TaskRunner2Callable: Running 
task, taskAttemptId=attempt_1599171197926_0001_1_01_00_0
2020-09-03T15:13:19,607  INFO 
[TezTaskEventRouter{attempt_1599171197926_0001_1_01_00_0}] 
orderedgrouped.ShuffleInputEventHandlerOrderedGrouped: Map 1: 
numDmeEventsSeen=1, numDmeEventsSeenWithNoData=0, numObsoletionEventsSeen=0
2020-09-03T15:13:19,607  INFO [TezChild] exec.SerializationUtilities: 
Deserializing ReduceWork using kryo
2020-09-03T15:13:19,607  INFO [TezChild] exec.Utilities: Deserialized plan (via 
RPC) - name: Reducer 2 size: 1.87KB
2020-09-03T15:13:19,607  INFO [TezChild] tez.ObjectCache: Caching key: 
lbodor_20200903151317_7f539b53-07fb-4bb1-97db-c37d72aba99d_Reducer 
2__REDUCE_PLAN__
2020-09-03T15:13:19,607  INFO [TezChild] tez.RecordProcessor: conf class path = 
[]
2020-09-03T15:13:19,608  INFO [TezChild] tez.RecordProcessor: thread class path 
= []
2020-09-03T15:13:19,608  INFO [Fetcher_O {Map_1} #0] 
orderedgrouped.MergeManager: close onDiskFile. State: NumOnDiskFiles=1. 
Current: 
path=/Users/lbodor/apache/hive/itests/hive-unit/target/tmp/scratchdir/lbodor/_tez_session_dir/e01fa9d5-36d9-4449-bfa4-d12b5fa290f8/.tez/application_1599171197926_0001_wd/localmode-local-dir/output/attempt_1599171197926_0001_1_00_00_0_10098/file.out,
 len=26
2020-09-03T15:13:19,608  INFO [Fetcher_O {Map_1} #0] ShuffleScheduler.fetch: 
Completed fetch for attempt: {0, 0, 
attempt_1599171197926_0001_1_00_00_0_10098} to DISK_DIRECT, csize=26, 
dsize=22, EndTime=1599171199608, TimeTaken=1, Rate=0.02 MB/s
2020-09-03T15:13:19,608  INFO [Fetcher_O {Map_1} #0] 
orderedgrouped.ShuffleScheduler: All inputs fetched for input vertex : Map 1
{code}

hanging run:
{code}
2020-09-04T02:12:16,392  INFO [I/O Setup 0 Start: {Map 1}] 
orderedgrouped.Shuffle: Map_1: Shuffle assigned with 1 inputs, codec: None, 
ifileReadAhead: true
2020-09-04T02:12:16,392  INFO [I/O Setup 0 Start: {Map 1}] 
orderedgrouped.MergeManager: Map 1: MergerManager: memoryLimit=1278984847, 
maxSingleShuffleLimit=319746208, mergeThreshold=844130048, ioSortFactor=10, 
postMergeMem=0, memToMemMergeOutputsThreshold=10
2020-09-04T02:12:16,394  INFO [I/O Setup 0 Start: {Map 1}] 
orderedgrouped.ShuffleScheduler: ShuffleScheduler running for sourceVertex: Map 
1 with configuration: maxFetchFailuresBeforeReporting=5, 
reportReadErrorImmediately=true, maxFailedUniqueFetches=1, 
abortFailureLimit=15, maxTaskOutputAtOnce=20, numFetchers=1, 
hostFailureFraction=0.2, minFailurePerHost=4, 
maxAllowedFailedFetchFraction=0.5, maxStallTimeFraction=0.5, 
minReqProgressFraction=0.5, checkFailedFetchSinceLastCompletion=true
2020-09-04T02:12:16,398  INFO [I/O Setup 0 Start: {Map 1}] 
runtime.LogicalIOProcessorRuntimeTask: Started Input with src edge: Map 1
2020-09-04T02:12:16,398  INFO [TezChild] runtime.LogicalIOProcessorRuntimeTask: 
AutoStartComplete
2020-09-04T02:12:16,398  INFO [TezChild] task.TaskRunner2Callable: Running 
task, 

[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24111:

Attachment: 
org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-output.txt

> TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 
> 0.10.0 staging artifact
> --
>
> Key: HIVE-24111
> URL: https://issues.apache.org/jira/browse/HIVE-24111
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TestCrudCompactorOnTez.log, TestCrudCompactorOnTez2.log, 
> jstack.log, 
> org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-output.txt
>
>
> I reproduced the issue in a ptest run that I configured to run against the 
> Tez staging artifacts 
> (https://repository.apache.org/content/repositories/orgapachetez-1068/)
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417
> I'm about to investigate this. I think Tez 0.10.0 cannot be released until we 
> confirm whether it's a Hive or Tez bug.
> {code}
> mvn test -Pitests,hadoop-2 -Dtest=TestMmCompactorOnTez -pl ./itests/hive-unit
> {code}
> tez setup:
> https://github.com/apache/hive/commit/92516631ab39f39df5d0692f98ac32c2cd320997#diff-a22bcc9ba13b310c7abfee4a57c4b130R83-R97



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24111:

Attachment: (was: 
org.apache.hadoop.hive.ql.txn.compactor.TestMmCompactorOnTez-output.txt.log)

> TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 
> 0.10.0 staging artifact
> --
>
> Key: HIVE-24111
> URL: https://issues.apache.org/jira/browse/HIVE-24111
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TestCrudCompactorOnTez.log, TestCrudCompactorOnTez2.log, 
> jstack.log
>
>
> I reproduced the issue in a ptest run that I configured to run against the 
> Tez staging artifacts 
> (https://repository.apache.org/content/repositories/orgapachetez-1068/)
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417
> I'm about to investigate this. I think Tez 0.10.0 cannot be released until we 
> confirm whether it's a Hive or Tez bug.
> {code}
> mvn test -Pitests,hadoop-2 -Dtest=TestMmCompactorOnTez -pl ./itests/hive-unit
> {code}
> tez setup:
> https://github.com/apache/hive/commit/92516631ab39f39df5d0692f98ac32c2cd320997#diff-a22bcc9ba13b310c7abfee4a57c4b130R83-R97



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24111:

Attachment: TestCrudCompactorOnTez2.log

> TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 
> 0.10.0 staging artifact
> --
>
> Key: HIVE-24111
> URL: https://issues.apache.org/jira/browse/HIVE-24111
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TestCrudCompactorOnTez.log, TestCrudCompactorOnTez2.log, 
> jstack.log
>
>
> I reproduced the issue in a ptest run that I configured to run against the 
> Tez staging artifacts 
> (https://repository.apache.org/content/repositories/orgapachetez-1068/)
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417
> I'm about to investigate this. I think Tez 0.10.0 cannot be released until we 
> confirm whether it's a Hive or Tez bug.
> {code}
> mvn test -Pitests,hadoop-2 -Dtest=TestMmCompactorOnTez -pl ./itests/hive-unit
> {code}
> tez setup:
> https://github.com/apache/hive/commit/92516631ab39f39df5d0692f98ac32c2cd320997#diff-a22bcc9ba13b310c7abfee4a57c4b130R83-R97





[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24111:

Attachment: (was: TestCrudCompactorTez.log)

> TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 
> 0.10.0 staging artifact
> --
>
> Key: HIVE-24111
> URL: https://issues.apache.org/jira/browse/HIVE-24111
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TestCrudCompactorOnTez.log, jstack.log, 
> org.apache.hadoop.hive.ql.txn.compactor.TestMmCompactorOnTez-output.txt.log
>
>
> Reproduced issue in ptest run which I made to run against tez staging 
> artifacts 
> (https://repository.apache.org/content/repositories/orgapachetez-1068/)
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417
> I'm about to investigate this. I think Tez 0.10.0 cannot be released until we 
> confirm whether this is a Hive or a Tez bug.
> {code}
> mvn test -Pitests,hadoop-2 -Dtest=TestMmCompactorOnTez -pl ./itests/hive-unit
> {code}
> tez setup:
> https://github.com/apache/hive/commit/92516631ab39f39df5d0692f98ac32c2cd320997#diff-a22bcc9ba13b310c7abfee4a57c4b130R83-R97





[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24111:

Attachment: TestCrudCompactorOnTez.log

> TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 
> 0.10.0 staging artifact
> --
>
> Key: HIVE-24111
> URL: https://issues.apache.org/jira/browse/HIVE-24111
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TestCrudCompactorOnTez.log, jstack.log, 
> org.apache.hadoop.hive.ql.txn.compactor.TestMmCompactorOnTez-output.txt.log
>
>
> Reproduced issue in ptest run which I made to run against tez staging 
> artifacts 
> (https://repository.apache.org/content/repositories/orgapachetez-1068/)
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417
> I'm about to investigate this. I think Tez 0.10.0 cannot be released until we 
> confirm whether this is a Hive or a Tez bug.
> {code}
> mvn test -Pitests,hadoop-2 -Dtest=TestMmCompactorOnTez -pl ./itests/hive-unit
> {code}
> tez setup:
> https://github.com/apache/hive/commit/92516631ab39f39df5d0692f98ac32c2cd320997#diff-a22bcc9ba13b310c7abfee4a57c4b130R83-R97





[jira] [Updated] (HIVE-24111) TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 0.10.0 staging artifact

2020-09-05 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HIVE-24111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-24111:

Summary: TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running 
against Tez 0.10.0 staging artifact  (was: TestMmCompactorOnTez hangs when 
running against Tez 0.10.0 staging artifact)

> TestMmCompactorOnTez/TestCrudCompactorOnTez hangs when running against Tez 
> 0.10.0 staging artifact
> --
>
> Key: HIVE-24111
> URL: https://issues.apache.org/jira/browse/HIVE-24111
> Project: Hive
>  Issue Type: Bug
>Reporter: László Bodor
>Assignee: László Bodor
>Priority: Major
> Attachments: TestCrudCompactorTez.log, jstack.log, 
> org.apache.hadoop.hive.ql.txn.compactor.TestMmCompactorOnTez-output.txt.log
>
>
> Reproduced issue in ptest run which I made to run against tez staging 
> artifacts 
> (https://repository.apache.org/content/repositories/orgapachetez-1068/)
> http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-1311/14/pipeline/417
> I'm about to investigate this. I think Tez 0.10.0 cannot be released until we 
> confirm whether this is a Hive or a Tez bug.
> {code}
> mvn test -Pitests,hadoop-2 -Dtest=TestMmCompactorOnTez -pl ./itests/hive-unit
> {code}
> tez setup:
> https://github.com/apache/hive/commit/92516631ab39f39df5d0692f98ac32c2cd320997#diff-a22bcc9ba13b310c7abfee4a57c4b130R83-R97


