[ 
https://issues.apache.org/jira/browse/PIG-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sriranjan Manjunath updated PIG-1134:
-------------------------------------

    Attachment: PIG-1134.patch

As a stopgap, I have replaced PoissonSampleLoader with RandomSampleLoader. 
RandomSampleLoader does not obtain the input size; instead, it takes 100 
samples per block. 
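
To illustrate why a fixed per-block sample needs no file size lookup, here 
is a minimal sketch of reservoir sampling over one block's records. It is 
not the code in the attached patch and not Pig's RandomSampleLoader; the 
class name FixedCountSampler and its sample method are hypothetical, chosen 
only for this example. The sampler reads the records it is handed and never 
asks the file system how large the input is.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Pig: draws a fixed-size uniform sample
// from a stream of records without knowing the stream length in advance.
final class FixedCountSampler {

    static <T> List<T> sample(Iterator<T> records, int sampleSize, Random rng) {
        List<T> reservoir = new ArrayList<>(sampleSize);
        long seen = 0;
        while (records.hasNext()) {
            T record = records.next();
            seen++;
            if (reservoir.size() < sampleSize) {
                // Fill the reservoir with the first sampleSize records.
                reservoir.add(record);
            } else {
                // Pick a slot in [0, seen); the record replaces an existing
                // sample only if the slot falls inside the reservoir.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < sampleSize) {
                    reservoir.set((int) slot, record);
                }
            }
        }
        return reservoir;
    }
}
```

Called with sampleSize set to 100 on each block's record iterator, this 
returns at most 100 records per block, and fewer when the block holds fewer 
than 100 records.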

> Skewed Join sampling job overwhelms the name node
> -------------------------------------------------
>
>                 Key: PIG-1134
>                 URL: https://issues.apache.org/jira/browse/PIG-1134
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Sriranjan Manjunath
>            Assignee: Sriranjan Manjunath
>         Attachments: PIG-1134.patch
>
>
> The map tasks of the sampling job estimate the file size. For a large 
> directory and a large number of maps, the file system calls overwhelm the 
> name node. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
