[jira] Updated: (PIG-865) Performance: Unnnecessary computation in FRJoin
[ https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-865:
---
Status: Open (was: Patch Available)

When I ran PigMix_2 (which does an FR join) with this patch, the patch actually slowed it down by about 10%.

Performance: Unnnecessary computation in FRJoin
---
Key: PIG-865
URL: https://issues.apache.org/jira/browse/PIG-865
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor
Attachments: pig-865.patch, pig-865_v2.patch

In the POFRJoin implementation, POLocalRearrange is used to extract the join keys from the input tuples. If the keys match, the input tuples are fed to a Foreach, which performs the actual join by doing a cross on its inputs. After the keys are extracted using the POLocalRearrange output, the function getValueTuple(POLocalRearrange lr, Tuple tuple) is called to reconstruct the input tuple. This function call seems unnecessary, since we already have the input tuple at that point. This is not a bug, but since the function is called for every tuple, eliminating it should certainly help to improve performance.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
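To make the redundancy described in the issue concrete, here is a minimal sketch of a hash-join probe loop in plain Java. It is not Pig's code: the class FRJoinSketch, the Keyed holder, and the rearrange, rebuild, and join methods are hypothetical stand-ins, and the assumption that key extraction yields a (key, remaining fields) pair is a simplification made only for illustration. The reuseInput flag switches between rebuilding the probe tuple from its rearranged form and using the tuple that is already in hand.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified model of a fragment-replicate (hash) join probe loop.
// All names here are hypothetical; none of this is Pig's actual code.
final class FRJoinSketch {

    // Stand-in for a rearranged tuple: the join key plus the non-key fields.
    static final class Keyed {
        final Object key;
        final List<Object> rest;
        Keyed(Object key, List<Object> rest) { this.key = key; this.rest = rest; }
    }

    // Split a tuple into (key, remaining fields), as a key-extraction step might.
    static Keyed rearrange(List<Object> tuple, int keyIdx) {
        List<Object> rest = new ArrayList<>(tuple);
        Object key = rest.remove(keyIdx);
        return new Keyed(key, rest);
    }

    // Rebuild the original tuple from the rearranged form (the redundant step).
    static List<Object> rebuild(Keyed k, int keyIdx) {
        List<Object> tuple = new ArrayList<>(k.rest);
        tuple.add(keyIdx, k.key);
        return tuple;
    }

    // Hash the small (replicated) input by its join key.
    static Map<Object, List<List<Object>>> buildTable(List<List<Object>> small, int keyIdx) {
        Map<Object, List<List<Object>>> table = new HashMap<>();
        for (List<Object> t : small) {
            table.computeIfAbsent(t.get(keyIdx), k -> new ArrayList<>()).add(t);
        }
        return table;
    }

    // Probe the table with each tuple of the large input.
    // reuseInput == false: rebuild the probe tuple from its rearranged form.
    // reuseInput == true:  use the probe tuple that is already available.
    static List<List<Object>> join(List<List<Object>> big, int bigKey,
                                   List<List<Object>> small, int smallKey,
                                   boolean reuseInput) {
        Map<Object, List<List<Object>>> table = buildTable(small, smallKey);
        List<List<Object>> out = new ArrayList<>();
        for (List<Object> t : big) {
            Keyed k = rearrange(t, bigKey);
            List<List<Object>> matches = table.get(k.key);
            if (matches == null) continue;
            List<Object> left = reuseInput ? t : rebuild(k, bigKey);
            for (List<Object> right : matches) {
                List<Object> joined = new ArrayList<>(left);
                joined.addAll(right);
                out.add(joined);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Object>> big = Arrays.asList(
            Arrays.<Object>asList(1, "a"),
            Arrays.<Object>asList(2, "b"),
            Arrays.<Object>asList(1, "c"));
        List<List<Object>> small = Arrays.asList(Arrays.<Object>asList(1, "x"));
        // Both variants produce the same joined tuples.
        System.out.println(join(big, 0, small, 0, false));
        System.out.println(join(big, 0, small, 0, true));
    }
}
```

The sketch only shows that the two variants are equivalent in output. It says nothing about which one is faster in a real Pig job, which, as the PigMix_2 result reported in this thread suggests, has to be measured rather than assumed.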
[jira] Updated: (PIG-865) Performance: Unnnecessary computation in FRJoin
[ https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-865:
---
Status: Patch Available (was: Open)

Performance: Unnnecessary computation in FRJoin
---
Key: PIG-865
URL: https://issues.apache.org/jira/browse/PIG-865
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Priority: Minor
Fix For: 0.4.0
Attachments: pig-865.patch, pig-865_v2.patch
[jira] Updated: (PIG-865) Performance: Unnnecessary computation in FRJoin
[ https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-865:
---
Attachment: pig-865.patch

Patch that fixes the issue described above. A useful side effect is that it removes code duplication, since the function getValueTuple(POLocalRearrange lr, Tuple tuple) is also present in POPackage.java.

Performance: Unnnecessary computation in FRJoin
---
Key: PIG-865
URL: https://issues.apache.org/jira/browse/PIG-865
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.3.0
Reporter: Ashutosh Chauhan
Priority: Minor
Fix For: 0.4.0
Attachments: pig-865.patch
[jira] Updated: (PIG-865) Performance: Unnnecessary computation in FRJoin
[ https://issues.apache.org/jira/browse/PIG-865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated PIG-865:
---
Status: Patch Available (was: Open)

Performance: Unnnecessary computation in FRJoin
---
Key: PIG-865
URL: https://issues.apache.org/jira/browse/PIG-865
Project: Pig
Issue Type: Improvement
Components: impl
Affects Versions: 0.3.0
Reporter: Ashutosh Chauhan
Priority: Minor
Fix For: 0.4.0
Attachments: pig-865.patch