[
https://issues.apache.org/jira/browse/HIVE-27142?focusedWorklogId=851265&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-851265
]
ASF GitHub Bot logged work on HIVE-27142:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Mar/23 05:10
Start Date: 16/Mar/23 05:10
Worklog Time Spent: 10m
Work Description: sonarcloud[bot] commented on PR #4120:
URL: https://github.com/apache/hive/pull/4120#issuecomment-1471327469
Kudos, SonarCloud Quality Gate passed! https://sonarcloud.io/dashboard?id=apache_hive&pullRequest=4120
[0 Bugs](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=4120&resolved=false&types=BUG)
[0 Vulnerabilities](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=4120&resolved=false&types=VULNERABILITY)
[0 Security Hotspots](https://sonarcloud.io/project/security_hotspots?id=apache_hive&pullRequest=4120&resolved=false&types=SECURITY_HOTSPOT)
[2 Code Smells](https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=4120&resolved=false&types=CODE_SMELL)
No Coverage information
No Duplication information
Issue Time Tracking
-------------------
Worklog Id: (was: 851265)
Time Spent: 0.5h (was: 20m)
> Map Join not working as expected when joining non-native tables with native
> tables
> -----------------------------------------------------------------------------------
>
> Key: HIVE-27142
> URL: https://issues.apache.org/jira/browse/HIVE-27142
> Project: Hive
> Issue Type: Bug
> Components: Statistics
> Affects Versions: All Versions
> Reporter: Syed Shameerur Rahman
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> *1. Issue:*
> When *_hive.auto.convert.join=true_* and the query joins a large non-native
> Hive table with a small native Hive table, the map join happens on the wrong
> side, i.e. in the map tasks that process the small native Hive table. This
> can lead to OOM when the non-native table is really large and only a few map
> tasks are spawned to scan the small native Hive table.
>
> *2. Why is this happening?*
> This happens due to improper stats collection/computation for non-native Hive
> tables. The data of a non-native Hive table is actually stored in a different
> location that Hive does not know of. The only path visible to Hive is a
> temporary one, set when the non-native table is created, and it does not
> hold the actual data. The stats collection logic therefore tends to
> underestimate the data size/row count, which causes the map join to happen
> on the wrong side.
>
> *3. Potential Solutions*
> 3.1 Turn off automatic map join conversion by setting
> *_hive.auto.convert.join=false_*. This can hurt query performance if the
> same query performs multiple joins, i.e. one join involving non-native
> tables and another join where both tables are native.
> 3.2 Compute stats for the non-native table by running the ANALYZE TABLE <>
> command before joining the native and non-native tables. The user may or may
> not choose to do this.
> 3.3 Do not collect/estimate stats for non-native Hive tables by default
> (Preferred solution)
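
The effect described in sections 2 and 3.3 above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hive's actual planner code: the names TableStats, pick_hash_side and SMALL_TABLE_THRESHOLD are hypothetical, and the byte counts are made up. It shows how an underestimated size for the non-native table makes it look like the best candidate for the in-memory side, and how treating its size as unknown avoids that choice.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical size limit (bytes) for the table that is loaded into memory.
SMALL_TABLE_THRESHOLD = 10_000_000


@dataclass
class TableStats:
    name: str
    # None means "no trustworthy size estimate is available".
    estimated_bytes: Optional[int]


def pick_hash_side(left: TableStats, right: TableStats,
                   threshold: int = SMALL_TABLE_THRESHOLD) -> Optional[str]:
    """Return the name of the table to load in memory, or None for no map join."""
    # Only tables with a known estimate under the threshold are candidates.
    candidates = [t for t in (left, right)
                  if t.estimated_bytes is not None
                  and t.estimated_bytes <= threshold]
    if not candidates:
        return None
    # Prefer the table that is believed to be smallest.
    return min(candidates, key=lambda t: t.estimated_bytes).name


# Estimate taken from the temporary path: tiny, although the real data is huge.
non_native_underestimated = TableStats("non_native", 1_000)
# Same table when no estimate is produced for non-native tables (option 3.3).
non_native_unknown = TableStats("non_native", None)
native_small = TableStats("native_small", 5_000_000)

wrong_choice = pick_hash_side(non_native_underestimated, native_small)
safer_choice = pick_hash_side(non_native_unknown, native_small)
print(wrong_choice)   # the large non-native table is picked for memory
print(safer_choice)   # the small native table is picked instead
```

In this model the misleading estimate is worse than having no estimate at all, which is the reasoning behind preferring option 3.3.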
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
