[jira] Updated: (PIG-865) Performance: Unnecessary computation in FRJoin

2009-09-15 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated PIG-865:
---

Status: Open  (was: Patch Available)

When I ran PigMix_2 (which does an FR join) with this patch applied, it actually 
ran about 10% slower.

> Performance: Unnecessary computation in FRJoin
> ---
>
> Key: PIG-865
> URL: https://issues.apache.org/jira/browse/PIG-865
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Priority: Minor
> Attachments: pig-865.patch, pig-865_v2.patch
>
>
> In the POFRJoin implementation, POLocalRearrange is used to extract join keys 
> from the input tuples. If the keys match, the input tuples are fed to a 
> Foreach, which does a cross on its inputs, to perform the actual join. After 
> the keys are extracted from the POLocalRearrange output, the function 
> getValueTuple(POLocalRearrange lr, Tuple tuple) is called to reconstruct the 
> input tuple. This call seems unnecessary, since the input tuple is already 
> available at that point. This is not a bug, but because the function is called 
> for every tuple, eliminating it should help improve performance.
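
The redundancy described in the issue can be illustrated with a minimal, self-contained sketch. It is not Pig's POFRJoin code: the class FrJoinSketch, the helpers tuple, rearrange and rebuild, and the use of plain java.util.List values as tuples are all assumptions made for illustration. rearrange stands in for the key-extraction step and rebuild for the reconstruction step that the issue proposes to skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a fragment-replicate join; not Pig's POFRJoin code.
final class FrJoinSketch {

    // Convenience constructor for a "tuple" (here just a list of fields).
    static List<Object> tuple(Object... fields) {
        return Arrays.asList(fields);
    }

    // Splits a tuple into [key, remaining fields], loosely like a local rearrange step.
    static List<Object> rearrange(List<Object> t, int keyIndex) {
        List<Object> rest = new ArrayList<>(t);
        Object key = rest.remove(keyIndex);
        List<Object> out = new ArrayList<>();
        out.add(key);
        out.add(rest);
        return out;
    }

    // Rebuilds the original tuple from the rearranged form.
    // This is the per-tuple work that the issue describes as redundant.
    @SuppressWarnings("unchecked")
    static List<Object> rebuild(List<Object> rearranged, int keyIndex) {
        List<Object> t = new ArrayList<>((List<Object>) rearranged.get(1));
        t.add(keyIndex, rearranged.get(0));
        return t;
    }

    static List<List<Object>> join(List<List<Object>> big, int bigKey,
                                   List<List<Object>> small, int smallKey,
                                   boolean rebuildTuple) {
        // Build phase: hash the replicated (small) input on its join key.
        Map<Object, List<List<Object>>> table = new HashMap<>();
        for (List<Object> s : small) {
            table.computeIfAbsent(s.get(smallKey), k -> new ArrayList<>()).add(s);
        }
        // Probe phase: stream the large input and cross each tuple with its matches.
        List<List<Object>> result = new ArrayList<>();
        for (List<Object> b : big) {
            List<Object> rearranged = rearrange(b, bigKey);
            List<List<Object>> matches = table.get(rearranged.get(0));
            if (matches == null) {
                continue;
            }
            // Either rebuild the tuple from the rearranged form,
            // or reuse the input tuple that is already in hand.
            List<Object> left = rebuildTuple ? rebuild(rearranged, bigKey) : b;
            for (List<Object> m : matches) {
                List<Object> joined = new ArrayList<>(left);
                joined.addAll(m);
                result.add(joined);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Object>> big = Arrays.asList(tuple("a", 1), tuple("b", 2), tuple("a", 3));
        List<List<Object>> small = Arrays.asList(tuple(1, "x"), tuple(3, "y"));
        System.out.println(join(big, 1, small, 0, false));
        System.out.println(join(big, 1, small, 0, true));
    }
}
```

Both calls in main produce the same joined rows; the only difference is whether each probe-side tuple is rebuilt before the cross. Whether skipping the rebuild is actually faster in Pig is a separate, empirical question that this sketch does not answer.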

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (PIG-865) Performance: Unnecessary computation in FRJoin

2009-07-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-865:
-

Attachment: pig-865_v2.patch

In discussions with Pradeep, it came out that creating new objects may be a 
better choice than reusing the same reference, because an object that is 
reused again and again may be promoted to an older generation, where it may 
take longer to get garbage collected. Moreover, the outcome also seems to 
depend on the skew in the distribution of keys in the data. It is therefore not 
clear which choice is better here, so that change is not included in this jira. 
The other suggested changes are incorporated in the patch.
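
To make the two options in this comment concrete, here is a minimal sketch of both allocation strategies. It is not taken from either patch: the class TupleReuseSketch, the methods emitFresh and emitReused, and the use of java.util.List as a tuple are assumptions for illustration only. The sketch shows the behavioural difference between the strategies; it does not measure garbage-collection cost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of the two allocation strategies; not code from the patch.
final class TupleReuseSketch {

    // Strategy 1: allocate a new list for every record.
    // Each list becomes garbage as soon as the sink is done with it.
    static void emitFresh(List<Object[]> records, Consumer<List<Object>> sink) {
        for (Object[] r : records) {
            sink.accept(new ArrayList<>(Arrays.asList(r)));
        }
    }

    // Strategy 2: keep one long-lived buffer and refill it for every record.
    static void emitReused(List<Object[]> records, Consumer<List<Object>> sink) {
        List<Object> buffer = new ArrayList<>();
        for (Object[] r : records) {
            buffer.clear();
            buffer.addAll(Arrays.asList(r));
            sink.accept(buffer); // the sink must copy if it wants to keep the contents
        }
    }

    public static void main(String[] args) {
        List<Object[]> records = Arrays.asList(new Object[] {"a", 1}, new Object[] {"b", 2});

        List<List<Object>> keptFresh = new ArrayList<>();
        emitFresh(records, keptFresh::add);
        System.out.println(keptFresh);   // two distinct lists

        List<List<Object>> keptReused = new ArrayList<>();
        emitReused(records, keptReused::add);
        System.out.println(keptReused);  // the same buffer twice, holding the last record
    }
}
```

Apart from any garbage-collection effect, the reused buffer changes the contract with downstream code: a consumer that retains the reference sees it overwritten by the next record.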

> Performance: Unnecessary computation in FRJoin
> ---
>
> Key: PIG-865
> URL: https://issues.apache.org/jira/browse/PIG-865
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments: pig-865.patch, pig-865_v2.patch



[jira] Updated: (PIG-865) Performance: Unnecessary computation in FRJoin

2009-07-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-865:
-

Status: Patch Available  (was: Open)

> Performance: Unnecessary computation in FRJoin
> ---
>
> Key: PIG-865
> URL: https://issues.apache.org/jira/browse/PIG-865
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Ashutosh Chauhan
> Assignee: Ashutosh Chauhan
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments: pig-865.patch, pig-865_v2.patch



[jira] Updated: (PIG-865) Performance: Unnecessary computation in FRJoin

2009-07-02 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-865:
-

Status: Open  (was: Patch Available)

> Performance: Unnecessary computation in FRJoin
> ---
>
> Key: PIG-865
> URL: https://issues.apache.org/jira/browse/PIG-865
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Ashutosh Chauhan
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments: pig-865.patch



[jira] Updated: (PIG-865) Performance: Unnecessary computation in FRJoin

2009-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-865:
-

Status: Patch Available  (was: Open)

> Performance: Unnecessary computation in FRJoin
> ---
>
> Key: PIG-865
> URL: https://issues.apache.org/jira/browse/PIG-865
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Ashutosh Chauhan
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments: pig-865.patch



[jira] Updated: (PIG-865) Performance: Unnecessary computation in FRJoin

2009-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated PIG-865:
-

Attachment: pig-865.patch

This patch fixes the issue described above. A useful side effect is that it 
removes code duplication, since the function 
getValueTuple(POLocalRearrange lr, Tuple tuple) is also present in 
POPackage.java.

> Performance: Unnecessary computation in FRJoin
> ---
>
> Key: PIG-865
> URL: https://issues.apache.org/jira/browse/PIG-865
> Project: Pig
> Issue Type: Improvement
> Components: impl
> Affects Versions: 0.3.0
> Reporter: Ashutosh Chauhan
> Priority: Minor
> Fix For: 0.4.0
>
> Attachments: pig-865.patch