[jira] [Updated] (PIG-4850) Registered jars do not use submit replication
[ https://issues.apache.org/jira/browse/PIG-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-4850:
-------------------------------
    Resolution: Fixed
    Fix Version/s: 0.16.0
    Status: Resolved (was: Patch Available)

Committed into trunk.

> Registered jars do not use submit replication
> ---------------------------------------------
>
> Key: PIG-4850
> URL: https://issues.apache.org/jira/browse/PIG-4850
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Fix For: 0.16.0
>
> Attachments: PIG-4850.1.patch
>
>
> PIG-4074 added support for mapred.submit.replication, which sets the
> replication factor for files added to the distributed cache. The purpose is
> to avoid a huge number of task attempts downloading the same file in HDFS at
> once during localization and slowing down because of contention over few
> replicas. The replication factor for files was set correctly, but registered
> jars are added to HDFS through a different code path and weren't using the
> submit replication factor. This causes localization time for jobs to increase
> by as much as 10 minutes (at which point the tasks are killed).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
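The contention argument in the issue description can be illustrated with a rough back-of-the-envelope sketch. It assumes, purely for illustration, that concurrent readers spread evenly across the replicas of a file; the function name `readers_per_replica`, the task count of 3000, and the replication values 3 and 10 are hypothetical and do not come from the issue or the patch.

```python
def readers_per_replica(task_attempts: int, replication: int) -> int:
    """Concurrent readers each replica serves, assuming an even spread (an assumption)."""
    if replication < 1:
        raise ValueError("replication must be at least 1")
    # Ceiling division with integers only, so no float rounding is involved.
    return -(-task_attempts // replication)


# Hypothetical job: 3000 task attempts localizing the same registered jar at once.
TASK_ATTEMPTS = 3000

for replication in (3, 10):
    load = readers_per_replica(TASK_ATTEMPTS, replication)
    print(f"replication={replication}: about {load} readers per replica")
```

Under this simplified model, a jar stored with fewer replicas than the submit replication factor leaves each replica serving proportionally more downloads, which is consistent with the slow localization the issue reports.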
[jira] [Updated] (PIG-4850) Registered jars do not use submit replication
[ https://issues.apache.org/jira/browse/PIG-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue updated PIG-4850:
---------------------------
    Status: Patch Available (was: Open)

> Registered jars do not use submit replication
> ---------------------------------------------
>
> Key: PIG-4850
> URL: https://issues.apache.org/jira/browse/PIG-4850
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Attachments: PIG-4850.1.patch
[jira] [Updated] (PIG-4850) Registered jars do not use submit replication
[ https://issues.apache.org/jira/browse/PIG-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan Blue updated PIG-4850:
---------------------------
    Attachment: PIG-4850.1.patch

Attaching a simple patch that fixes the problem.

> Registered jars do not use submit replication
> ---------------------------------------------
>
> Key: PIG-4850
> URL: https://issues.apache.org/jira/browse/PIG-4850
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Attachments: PIG-4850.1.patch