[ 
https://issues.apache.org/jira/browse/PIG-845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12737377#action_12737377
 ] 

Pradeep Kamath commented on PIG-845:
------------------------------------

Some initial comments on POMergeJoin.java:

If the status is not OK, it should just be returned (no RuntimeException
as thrown in the code below). The same applies to the other places in
POMergeJoin where there is a switch-case on the result of processInput().
Once this change is made, the code in if(processingFE) will also need to
change accordingly.
{code}
        if(firstTime){
            // Do initial setup.
            curLeftInp = processInput();
            switch(curLeftInp.returnStatus){
            case POStatus.STATUS_OK:
                break;

            case POStatus.STATUS_EOP: // Return because we want to fetch next left tuple.
                return curLeftInp;
            default:
                throw new RuntimeException("Unexpected Status");
            }
{code}
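
A minimal sketch of the suggested shape, assuming any non-OK status can simply be handed back to the caller. The class StatusPassThroughSketch, its Result type, the offer() helper and the status values are hypothetical stand-ins so the example compiles on its own; they are not the actual Pig classes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the operator; not Pig's actual POMergeJoin.
final class StatusPassThroughSketch {
    // Illustrative status codes standing in for the POStatus constants.
    static final byte STATUS_OK = 0;
    static final byte STATUS_NULL = 1;
    static final byte STATUS_ERR = 2;
    static final byte STATUS_EOP = 3;

    // Minimal stand-in for Result: a status plus a payload.
    static final class Result {
        final byte returnStatus;
        final Object result;

        Result(byte returnStatus, Object result) {
            this.returnStatus = returnStatus;
            this.result = result;
        }
    }

    private final Deque<Result> pending = new ArrayDeque<>();

    // Helper for the sketch: queue what the next processInput() call returns.
    void offer(Result r) {
        pending.add(r);
    }

    // Stand-in for processInput(): returns queued results, then EOP.
    Result processInput() {
        Result r = pending.poll();
        return r != null ? r : new Result(STATUS_EOP, null);
    }

    Result getNext() {
        Result curLeftInp = processInput();
        if (curLeftInp.returnStatus != STATUS_OK) {
            // EOP, ERR, NULL or anything else: return it unchanged, no exception.
            return curLeftInp;
        }
        // Only the OK case continues with the initial setup.
        return new Result(STATUS_OK, "setup done for " + curLeftInp.result);
    }

    public static void main(String[] args) {
        StatusPassThroughSketch op = new StatusPassThroughSketch();
        op.offer(new Result(STATUS_ERR, "upstream failure"));
        Result r = op.getNext();
        System.out.println("status=" + r.returnStatus + " result=" + r.result);
    }
}
```

With this shape the default branch disappears, and the caller decides what to do with an error or null status.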

All non-RuntimeExceptions should follow the error handling specification
by using the correct exception, created with the constructor that takes
the error code, cause, message, and source. The specification at
http://wiki.apache.org/pig/PigErrorHandlingFunctionalSpecification#head-9f71d78d362c3307711f98ec9db3ee12b55e92f6
should be updated with the new error code numbers.
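
As a rough illustration of that pattern, here is a sketch using a hypothetical stand-in exception. CodedExceptionSketch, SRC_BUG, the readKey() helper and the error code 9999 are placeholders invented for the example; the real exception class, source constants and code numbers come from the Pig code base and the specification.

```java
// Hypothetical stand-in for a Pig-style exception carrying code, source and cause.
final class CodedExceptionSketch extends Exception {
    // Placeholder "error source" constant; not a real Pig value.
    static final byte SRC_BUG = 4;

    private final int errorCode;
    private final byte errorSource;

    CodedExceptionSketch(String message, int errorCode, byte errorSource, Throwable cause) {
        super(message, cause);
        this.errorCode = errorCode;
        this.errorSource = errorSource;
    }

    int getErrorCode() {
        return errorCode;
    }

    byte getErrorSource() {
        return errorSource;
    }

    // Hypothetical helper: wraps a low-level failure instead of letting it escape bare.
    static Object readKey(Object[] tuple, int index) throws CodedExceptionSketch {
        try {
            return tuple[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            int errCode = 9999; // placeholder, to be replaced by a code registered in the spec
            String msg = "Could not read join key at position " + index;
            throw new CodedExceptionSketch(msg, errCode, SRC_BUG, e);
        }
    }

    public static void main(String[] args) {
        try {
            readKey(new Object[] {"k"}, 3);
        } catch (CodedExceptionSketch e) {
            System.out.println(e.getErrorCode() + ": " + e.getMessage());
        }
    }
}
```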

detachInput() is not required in POMergeJoin; processInput() takes care of it.

In the code below, we could cache the key to be used while processFE is
true as processFEKey; then we need not extract the key for each join:
{code}
        // Can't use the prevLeftKey, because we are reading ahead.
        // Need key of current bag. Since we have just finished doing the join,
        // bag must contain at least one element.
        res.returnStatus = POStatus.STATUS_OK;
        res.result = leftTuples.get(0);
        curLeftKey = extractKeysFromTuple(res, 0);
{code}
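
One way the caching could look, as a sketch only. It assumes the key stays the same for as long as the operator is in its FE-processing mode. FeKeyCacheSketch, startFE(), endFE(), currentKey() and the extraction counter are hypothetical names introduced for the example, and tuples are modelled as plain lists.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in showing a key cached once per FE-processing phase.
final class FeKeyCacheSketch {
    private boolean processingFE = false;
    private Object processFEKey;   // key cached while processingFE is true
    int extractions = 0;           // counts real extractions, for illustration only

    // Stand-in for key extraction: here the key is simply field 0 of the tuple.
    Object extractKey(List<Object> tuple) {
        extractions++;
        return tuple.get(0);
    }

    // Entering FE processing: extract the key once and remember it.
    void startFE(List<List<Object>> leftTuples) {
        processFEKey = extractKey(leftTuples.get(0));
        processingFE = true;
    }

    // Leaving FE processing: drop the cached key.
    void endFE() {
        processingFE = false;
        processFEKey = null;
    }

    // Called after each join: reuse the cached key instead of extracting again.
    Object currentKey(List<List<Object>> leftTuples) {
        if (processingFE) {
            return processFEKey;
        }
        return extractKey(leftTuples.get(0));
    }

    public static void main(String[] args) {
        List<Object> tuple = Arrays.<Object>asList("k1", "v1");
        List<List<Object>> bag = Collections.singletonList(tuple);
        FeKeyCacheSketch op = new FeKeyCacheSketch();
        op.startFE(bag);
        for (int i = 0; i < 3; i++) {
            op.currentKey(bag);
        }
        System.out.println("extractions=" + op.extractions);
    }
}
```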


> PERFORMANCE: Merge Join
> -----------------------
>
>                 Key: PIG-845
>                 URL: https://issues.apache.org/jira/browse/PIG-845
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>         Attachments: merge-join-for-review.patch
>
>
> This join would work if the data for both tables is sorted on the join key.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
