William Watson created PIG-4458:
-----------------------------------
Summary: Support UDFs in a FOREACH Before a Merge Join
Key: PIG-4458
URL: https://issues.apache.org/jira/browse/PIG-4458
Project: Pig
Issue Type: New Feature
Reporter: William Watson
Right now, MapSideMergeValidator outright rejects any FOREACH that contains a
UDF, so such a FOREACH cannot be used before a merge join:
{code}
private boolean isAcceptableForEachOp(Operator lo) throws LogicalToPhysicalTranslatorException {
    if (lo instanceof LOForEach) {
        OperatorPlan innerPlan = ((LOForEach) lo).getInnerPlan();
        validateMapSideMerge(innerPlan.getSinks(), innerPlan);
        return !containsUDFs((LOForEach) lo);
    } else {
        return false;
    }
}
{code}
There is a TODO for this later on in that same class (inside containsUDFs):
{code}
// TODO (dvryaboy): in the future we could relax this rule by tracing what fields
// are being passed into the UDF, and only refusing if the UDF is working on the
// join key. Transforms of other fields should be ok.
{code}
We should either implement the TODO and relax this requirement, or remove the
restriction altogether.
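
To make the relaxed rule concrete, here is a minimal sketch of the check the TODO describes. It is not Pig code and does not use Pig's logical plan classes: RelaxedUdfCheck, GeneratedExpr, usesUdf, inputColumns and joinKeyColumns are all hypothetical names, and it assumes each generated expression can already report which input columns it reads. A real change would have to derive that information from the FOREACH inner plan.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, Pig-independent model of the relaxed check described in the TODO. */
public class RelaxedUdfCheck {

    /** Stand-in for one expression generated by a FOREACH (assumption, not a Pig class). */
    static final class GeneratedExpr {
        final boolean usesUdf;            // true if the expression calls a UDF
        final Set<Integer> inputColumns;  // positions of the input fields it reads

        GeneratedExpr(boolean usesUdf, Integer... inputColumns) {
            this.usesUdf = usesUdf;
            this.inputColumns = new HashSet<>(Arrays.asList(inputColumns));
        }
    }

    /**
     * Refuses the FOREACH only if some UDF reads a join-key column.
     * UDFs that transform other fields are accepted.
     */
    static boolean isAcceptableForEach(List<GeneratedExpr> exprs, Set<Integer> joinKeyColumns) {
        for (GeneratedExpr expr : exprs) {
            if (expr.usesUdf && !Collections.disjoint(expr.inputColumns, joinKeyColumns)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<Integer> joinKey = Collections.singleton(0);

        // The key (column 0) passes through untouched; the UDF only reads column 1.
        List<GeneratedExpr> udfOnOtherField =
                Arrays.asList(new GeneratedExpr(false, 0), new GeneratedExpr(true, 1));

        // The UDF reads the join key itself.
        List<GeneratedExpr> udfOnJoinKey =
                Arrays.asList(new GeneratedExpr(true, 0), new GeneratedExpr(false, 1));

        System.out.println(isAcceptableForEach(udfOnOtherField, joinKey)); // true
        System.out.println(isAcceptableForEach(udfOnJoinKey, joinKey));    // false
    }
}
```

The sketch only looks at which columns a UDF reads. It does not check that the join key still reaches the join in its original sort order, which the real validator would also need to guarantee.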