[
https://issues.apache.org/jira/browse/HIVE-24306?focusedWorklogId=528474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-528474
]
ASF GitHub Bot logged work on HIVE-24306:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 26/Dec/20 00:59
Start Date: 26/Dec/20 00:59
Worklog Time Spent: 10m
Work Description: github-actions[bot] commented on pull request #1602:
URL: https://github.com/apache/hive/pull/1602#issuecomment-751305507
This pull request has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.
Feel free to reach out on the [email protected] list if the patch is in
need of reviews.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 528474)
Time Spent: 20m (was: 10m)
> Launch single copy task for single batch of partitions in repl load for
> managed table
> -------------------------------------------------------------------------------------
>
> Key: HIVE-24306
> URL: https://issues.apache.org/jira/browse/HIVE-24306
> Project: Hive
> Issue Type: Task
> Reporter: Aasha Medhi
> Assignee: Aasha Medhi
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-24306.01.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> For data dumped in the staging location, we run a single distcp at the table
> level for all partitions, since the data is already present in the staging
> location.
> For the _files case, where the data is on the source cluster and staging has
> only the file list, distcp is executed at the individual file level. This
> takes care of the cm case, where we need both the full path and the encoded
> path (for cm). If the table is dropped, a table-level distcp will fail.
> This patch takes care of single copy for staging data.
> However, when running a single distcp at the table level, the file listing in
> distcp might lead to OOM if the number of files is too high. So this needs to
> be fixed at the distcp level before committing this patch.
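The two copy strategies in the description can be illustrated with a small planning sketch. This is not Hive code: `CopyTask`, `plan_copy_tasks`, and all of the paths are hypothetical names chosen for illustration, and the cm-encoded path handling is deliberately left out. It only shows the decision described above: one table-level copy when the data already sits in staging, and one copy per listed file otherwise.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CopyTask:
    """One hypothetical copy invocation: a source path and a destination path."""
    source: str
    destination: str


def plan_copy_tasks(
    table_destination: str,
    staged_table_dir: Optional[str] = None,
    listed_files: Sequence[Tuple[str, str]] = (),
) -> List[CopyTask]:
    """Plan copy tasks for one managed table (illustrative assumption only).

    staged_table_dir: set when the dump already placed the data in staging.
    listed_files: (source_path, relative_path) pairs for the _files case,
    where staging holds only a file list and the data is on the source cluster.
    """
    if staged_table_dir is not None:
        # Data is already in staging: a single table-level copy covers
        # every partition directory underneath the table directory.
        return [CopyTask(staged_table_dir, table_destination)]
    # _files case: one task per file, so each file can be resolved on its own
    # (the real flow also needs the cm-encoded path, not modelled here).
    return [
        CopyTask(source_path, f"{table_destination}/{relative_path}")
        for source_path, relative_path in listed_files
    ]
```

Under these assumptions, a staged table with any number of partitions yields exactly one task, while the _files case yields one task per listed file. The sketch does not address the file-listing memory concern raised above, which sits inside distcp itself.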
--
This message was sent by Atlassian Jira
(v8.3.4#803005)