[ https://issues.apache.org/jira/browse/PIG-4069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-4069:
------------------------------------
Attachment: PIG-4069-1.patch
Initial patch. It does not show much performance improvement on my local box;
I am investigating that. Most likely it is because the mappers still have to
finish even though the reducer is done early. I will also run it on bigger data
on a cluster and see.
> Limit reduce task should start as soon as one map task finishes
> ---------------------------------------------------------------
>
> Key: PIG-4069
> URL: https://issues.apache.org/jira/browse/PIG-4069
> Project: Pig
> Issue Type: Sub-task
> Components: tez
> Reporter: Rohini Palaniswamy
> Fix For: 0.14.0
>
> Attachments: PIG-4069-1.patch
>
>
> Set very low values for
> ShuffleVertexManager.TEZ_AM_SHUFFLE_VERTEX_MANAGER_MIN_SRC_FRACTION and
> ShuffleVertexManager.TEZ_AM_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION when a
> LIMIT job does not follow an ORDER BY, so that the reduce task starts as soon
> as one map task finishes.
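
For illustration only, the setting described above could be sketched roughly as
follows. This is not taken from PIG-4069-1.patch. The class and method names
are hypothetical, the two property-name strings are assumptions about what the
ShuffleVertexManager constants resolve to (they differ between Tez releases, so
check them against the Tez version in use), and the value 0.00001 is only a
placeholder for "very low", since the issue does not give a number. A plain Map
stands in for the vertex configuration so that the sketch has no Tez or Hadoop
dependency.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of Pig or Tez.
final class LimitSlowStartSketch {

    // ASSUMPTION: property names behind
    // ShuffleVertexManager.TEZ_AM_SHUFFLE_VERTEX_MANAGER_MIN_SRC_FRACTION and
    // ShuffleVertexManager.TEZ_AM_SHUFFLE_VERTEX_MANAGER_MAX_SRC_FRACTION.
    static final String MIN_SRC_FRACTION_KEY =
            "tez.am.shuffle-vertex-manager.min-src-fraction";
    static final String MAX_SRC_FRACTION_KEY =
            "tez.am.shuffle-vertex-manager.max-src-fraction";

    // ASSUMPTION: placeholder for a "very low" fraction.
    static final String LOW_FRACTION = "0.00001";

    /**
     * Returns the overrides to apply to the reduce vertex of a LIMIT job.
     * The map is empty unless the job is a LIMIT that does not follow an
     * ORDER BY, which is the only case the issue targets.
     */
    static Map<String, String> overridesFor(boolean isLimit, boolean followsOrderBy) {
        Map<String, String> overrides = new LinkedHashMap<>();
        if (isLimit && !followsOrderBy) {
            // Let the reduce task be scheduled after a tiny fraction of the
            // source (map) tasks has completed.
            overrides.put(MIN_SRC_FRACTION_KEY, LOW_FRACTION);
            overrides.put(MAX_SRC_FRACTION_KEY, LOW_FRACTION);
        }
        return overrides;
    }
}
```

In a real change the returned entries would be copied into the configuration
handed to the vertex manager of the LIMIT vertex. A LIMIT that follows an ORDER
BY is left alone here because the issue explicitly excludes that case.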
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)