Poisson Sample Loader should compute the number of samples required only once
-----------------------------------------------------------------------------
Key: PIG-1143
URL: https://issues.apache.org/jira/browse/PIG-1143
Project: Pig
Issue Type: Bug
Reporter: Sriranjan Manjunath
Assignee: Sriranjan Manjunath
The current Poisson sampler forces every map task to compute the sample
count itself. This is redundant and causes problems when a large directory is
specified in the join. The sampler should instead calculate the sample count
only once and share the result with the remaining mappers.
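
The proposed change can be illustrated with a minimal sketch. It is not Pig's
actual code: the class SampleCountSketch, the key "sketch.poisson.sample.count",
and both method names are hypothetical, and a plain Map stands in for whatever
job configuration would carry the value to the mappers. The expensive
computation is passed in as a supplier and runs at most once; each mapper then
only reads the stored value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

public class SampleCountSketch {
    // Hypothetical key; not a real Pig property name.
    static final String SAMPLE_COUNT_KEY = "sketch.poisson.sample.count";

    // Counts how often the expensive computation actually ran.
    static int computations = 0;

    /** Front end: compute the sample count once and store it in the shared map. */
    static void publishSampleCount(Map<String, String> conf, LongSupplier expensive) {
        if (!conf.containsKey(SAMPLE_COUNT_KEY)) {
            computations++;
            conf.put(SAMPLE_COUNT_KEY, Long.toString(expensive.getAsLong()));
        }
    }

    /** Mapper side: read the stored value instead of recomputing it. */
    static long readSampleCount(Map<String, String> conf) {
        String value = conf.get(SAMPLE_COUNT_KEY);
        if (value == null) {
            throw new IllegalStateException("sample count was not published");
        }
        return Long.parseLong(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // 1000 is a placeholder; a real sampler would derive it from the input.
        publishSampleCount(conf, () -> 1000L);
        for (int mapper = 0; mapper < 3; mapper++) {
            System.out.println("mapper " + mapper + " uses "
                    + readSampleCount(conf) + " samples");
        }
    }
}
```

With this shape, the cost of computing the count no longer grows with the
number of map tasks, since only the front end touches the expensive path.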