[
https://issues.apache.org/jira/browse/HIVE-24812?focusedWorklogId=557076&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-557076
]
ASF GitHub Bot logged work on HIVE-24812:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Feb/21 15:57
Start Date: 24/Feb/21 15:57
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on pull request #2006:
URL: https://github.com/apache/hive/pull/2006#issuecomment-785178016
@jcamachor an alternate approach could be, instead of disabling it, to restrict
the optimization to target only the case where:
```
TS_SJ -> [...] -> JOIN
TS_FILTERED -> [...] -> JOIN -> [...]
TS_X -> [...]
```
`TS_X` and `TS_FILTERED` scan the same table; those are the scans being optimized.
`TS_SJ` is the table from which the SJ filter is computed.
* in the above case, if there is no `RS -> MAPJOIN` in the
`TS_FILTERED -> [...] -> JOIN` path, then the optimization might not do much harm...
* or, more generally, if there is no `RS` on that path, it may definitely go
ahead and make the optimization
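
A minimal sketch of the more general check (no `RS` anywhere on the `TS_FILTERED -> [...] -> JOIN` path) could look like the following. The `Op` class, the string operator kinds, and the method names are hypothetical stand-ins for illustration only; this is not Hive's operator API or the actual `SharedWorkOptimizer` code.

```java
import java.util.ArrayList;
import java.util.List;

final class SemijoinRemovalCheck {

    /** Hypothetical, simplified operator node (not Hive's Operator class). */
    static final class Op {
        final String kind;                         // e.g. "TS", "FIL", "RS", "MAPJOIN", "JOIN"
        final List<Op> children = new ArrayList<>();

        Op(String kind) { this.kind = kind; }

        /** Adds a child operator and returns it, so paths can be built by chaining. */
        Op to(Op child) { children.add(child); return child; }
    }

    /**
     * Returns true if at least one path from `op` down to `join` passes
     * through an RS operator. Assumes the operator graph is acyclic.
     */
    static boolean hasReduceSinkOnPath(Op op, Op join, boolean seenRs) {
        if (op == join) {
            return seenRs;
        }
        boolean rs = seenRs || "RS".equals(op.kind);
        for (Op child : op.children) {
            if (hasReduceSinkOnPath(child, join, rs)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Assumed rule from the comment above: the SJ removal may go ahead
     * only when no RS sits between TS_FILTERED and the JOIN.
     */
    static boolean mayRemoveSemijoin(Op tsFiltered, Op join) {
        return !hasReduceSinkOnPath(tsFiltered, join, false);
    }
}
```

For a path `TS -> FIL -> MAPJOIN` this sketch allows the removal, while for `TS -> RS -> JOIN` it does not. It also answers "allowed" when the join is not reachable at all, which a real implementation would likely want to treat separately.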
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 557076)
Time Spent: 20m (was: 10m)
> Disable sharedworkoptimizer remove semijoin by default
> ------------------------------------------------------
>
> Key: HIVE-24812
> URL: https://issues.apache.org/jira/browse/HIVE-24812
> Project: Hive
> Issue Type: Sub-task
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> SJ removal backfired a bit when I was testing: with the additional
> opportunities that parallel edges may enable, it can increase the shuffled
> memory amount and/or even make MJ broadcast inputs larger.
> Set hive.optimize.shared.work.semijoin=false by default for now.
> Right now it's better to leave dppunion to pick up these cases instead of
> removing the SJ fully; after HIVE-24376 we might enable it back.