[
https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antonio Magnaghi updated PIG-32:
--------------------------------
Attachment: TEST.LOG, PATCH.2008.01.31
I have isolated the problem.
During the compilation of MR jobs, in some instances (for example, when a
logical operator is an LOEval; TestPigSplit has a long chain of 500 LOEvals)
the copy method is called on the compiled input. The copy method copies the
input MR job by serializing and then deserializing it.
In the tree representation that we currently use, each physical operator
contains a pointer to the global table of physical operators (the opTable)
that defines the operator tree. In the initial implementation, the copy method
in the Abstraction Layer patch did not avoid an unnecessary
serialization/deserialization of the opTable.
In this specific test case, this was causing a significant time overhead.
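To illustrate the idea, here is a minimal sketch of a serialization-based copy
that skips the shared table. It is not the code in the attached patch: the
class OpCopySketch, the Op type, its fields, and buildChain are hypothetical
names chosen for illustration, and marking the table as transient is only one
possible way to keep it out of the serialized form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class OpCopySketch {

    // Hypothetical stand-in for a physical operator (not Pig's actual class).
    static class Op implements Serializable {
        private static final long serialVersionUID = 1L;

        final String key;       // this operator's key in the table
        final String inputKey;  // key of the upstream operator, or null

        // Shared global table. Declared transient so that Java serialization
        // skips it instead of writing every operator in the tree.
        transient Map<String, Op> opTable;

        Op(String key, String inputKey, Map<String, Op> opTable) {
            this.key = key;
            this.inputKey = inputKey;
            this.opTable = opTable;
        }

        // Copy via serialization/deserialization, then re-attach the shared
        // table by reference rather than duplicating it.
        Op copy() throws IOException, ClassNotFoundException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(this);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Op clone = (Op) in.readObject();
                clone.opTable = this.opTable;
                return clone;
            }
        }
    }

    // Build a linear chain of n operators that all share one table.
    static Map<String, Op> buildChain(int n) {
        Map<String, Op> table = new HashMap<>();
        String previous = null;
        for (int i = 0; i < n; i++) {
            String key = "op" + i;
            table.put(key, new Op(key, previous, table));
            previous = key;
        }
        return table;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Op> table = buildChain(500);
        Op last = table.get("op499");
        Op clone = last.copy();
        System.out.println("distinct object: " + (clone != last));
        System.out.println("table shared:    " + (clone.opTable == table));
    }
}
```

Without the transient marker, each call to copy() in this sketch would
serialize all 500 operators reachable through the table, which is the kind of
repeated work described above.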
I have attached a patch that fixes the problem.
The unit tests pass, and the attached unit test logs show execution times that
seem to be in line with the execution times before the AL patch.
I have also checked that the regression tests still pass:
=== Regression test results ===
tail /tmp/miners_test_harness_log_1201817146
[...]
Results so far, PASSED: 102 FAILED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
Final results, PASSED: 102 FAILED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
Finished test run at 1201821034
> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
> Key: PIG-32
> URL: https://issues.apache.org/jira/browse/PIG-32
> Project: Pig
> Issue Type: New Feature
> Components: impl
> Reporter: Antonio Magnaghi
> Assignee: Antonio Magnaghi
> Attachments: 2008.01.29.patch, DataStorage.diff,
> DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk,
> patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff,
> patch2007_12_26_II.diff, patch2007_12_27.diff, pig.jar, pig.jar.2008.01.16,
> TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an
> abstraction layer for Pig as defined at
> http://wiki.apache.org/pig/PigAbstractionLayer