Ashutosh Chauhan updated PIG-513:

    Attachment: pig-513_2.patch

I ran into the same wasted work in checkBounds() while profiling the Merge 
Join. Since Java already performs a bounds check before accessing an element 
of an ArrayList, this method call duplicates that work. In this particular 
case, 6% of the total query time is spent in this method call. 
I am attaching a patch generated against the current trunk.
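
As a rough illustration of the idea (not the contents of the attached patch), 
here is a minimal sketch of an accessor that drops the explicit bounds check 
and relies on the one ArrayList.get() already performs. SketchTuple and 
SketchExecException are hypothetical stand-ins for DefaultTuple and 
ExecException, and the error message wording is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Pig's ExecException so the sketch compiles alone.
class SketchExecException extends Exception {
    SketchExecException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical, simplified tuple; not the real DefaultTuple.
class SketchTuple {
    private final List<Object> mFields = new ArrayList<>();

    void append(Object value) {
        mFields.add(value);
    }

    // No explicit checkBounds() call: ArrayList.get() validates the index
    // itself and throws IndexOutOfBoundsException, which is translated here.
    Object get(int fieldNum) throws SketchExecException {
        try {
            return mFields.get(fieldNum);
        } catch (IndexOutOfBoundsException e) {
            throw new SketchExecException(
                "Request for field number " + fieldNum
                    + " exceeds tuple size of " + mFields.size(), e);
        }
    }
}
```

With this shape the common in-range path pays for a single bounds check, and 
the cost of building the exception is only incurred on the error path.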

> PERFORMANCE: optimize some of the code in DefaultTuple
> ------------------------------------------------------
>                 Key: PIG-513
>                 URL: https://issues.apache.org/jira/browse/PIG-513
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>         Attachments: PIG-513.patch, pig-513_2.patch
> The following areas in DefaultTuple.java can be changed:
> The member methods get(), set(), getType() and isNull() all call 
> checkBounds(), which is a redundant call since all four of these methods 
> already throw ExecException. Instead of doing a bounds check, we can catch 
> the IndexOutOfBoundsException in a try-catch and rethrow it as an ExecException.
> The write() method has the following unused object (d in the code below):
> {code}
> for (int i = 0; i < sz; i++) {
>                 try {
>                     Object d = get(i);
>                 } catch (ExecException ee) {
>                     throw new RuntimeException(ee);
>                 }
>                 DataReaderWriter.writeDatum(out, mFields.get(i));
>             }
> {code}
> {noformat}
> The get(i) call in the try should be replaced by the writeDatum call directly 
> since d is never used and the call to get() is unnecessary.
> {noformat}
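
For illustration only, the simplified loop described in the issue might look 
like the sketch below. WriteLoopSketch and its writeDatum() are hypothetical 
stand-ins (the real DataReaderWriter.writeDatum() encodes typed data; this one 
just writes a string form), so treat it as a sketch under those assumptions 
rather than the actual change.

```java
import java.io.DataOutput;
import java.io.IOException;
import java.util.List;

class WriteLoopSketch {
    // Hypothetical stand-in for DataReaderWriter.writeDatum(); it only
    // writes the string form of the datum.
    static void writeDatum(DataOutput out, Object datum) throws IOException {
        out.writeUTF(String.valueOf(datum));
    }

    // The loop body is reduced to the one call that does the work: no
    // unused local variable and no extra get() call per field.
    static void writeFields(DataOutput out, List<Object> mFields)
            throws IOException {
        int sz = mFields.size();
        for (int i = 0; i < sz; i++) {
            writeDatum(out, mFields.get(i));
        }
    }
}
```

Because i always stays within [0, sz), the try/catch around the old get(i) 
call guarded against a condition that cannot occur in this loop.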

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
