[ https://issues.apache.org/jira/browse/PIG-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12843527#action_12843527 ]
Hadoop QA commented on PIG-1285:
--------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12438254/PIG-1285.patch
against trunk revision 921185.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac
compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of
release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/231/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/231/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output:
http://hudson.zones.apache.org/hudson/job/Pig-Patch-h7.grid.sp2.yahoo.net/231/console
This message is automatically generated.
> Allow SingleTupleBag to be serialized
> -------------------------------------
>
> Key: PIG-1285
> URL: https://issues.apache.org/jira/browse/PIG-1285
> Project: Pig
> Issue Type: Improvement
> Reporter: Dmitriy V. Ryaboy
> Assignee: Dmitriy V. Ryaboy
> Fix For: 0.7.0
>
> Attachments: PIG-1285.patch
>
>
> Currently, in the Combiner optimization, Pig uses a SingleTupleBag for
> efficiency when a full-blown spillable bag implementation is not needed.
> Unfortunately, this can create problems: the Initial.exec() code below fails
> at run time with the message that a SingleTupleBag cannot be serialized:
> {code}
> @Override
> public Tuple exec(Tuple in) throws IOException {
>     // single record. just copy.
>     if (in == null) return null;
>     try {
>         Tuple resTuple = tupleFactory_.newTuple(in.size());
>         for (int i = 0; i < in.size(); i++) {
>             resTuple.set(i, in.get(i));
>         }
>         return resTuple;
>     } catch (IOException e) {
>         log.warn(e);
>         return null;
>     }
> }
> {code}
> The code below can fix the problem in the UDF, but it seems like something
> that should be handled transparently, not requiring UDF authors to know about
> SingleTupleBags.
> {code}
> @Override
> public Tuple exec(Tuple in) throws IOException {
>     // single record. just copy.
>     if (in == null) return null;
>
>     /*
>      * Unfortunately SingleTupleBags are not serializable. We cache whether
>      * a given index contains a bag in the map below, and copy all bags into
>      * DefaultBags before returning to avoid serialization exceptions.
>      */
>     Map<Integer, Boolean> isBagAtIndex = Maps.newHashMap();
>
>     try {
>         Tuple resTuple = tupleFactory_.newTuple(in.size());
>         for (int i = 0; i < in.size(); i++) {
>             Object obj = in.get(i);
>             if (!isBagAtIndex.containsKey(i)) {
>                 isBagAtIndex.put(i, obj instanceof SingleTupleBag);
>             }
>             if (isBagAtIndex.get(i)) {
>                 DataBag newBag = bagFactory_.newDefaultBag();
>                 newBag.addAll((DataBag) obj);
>                 obj = newBag;
>             }
>             resTuple.set(i, obj);
>         }
>         return resTuple;
>     } catch (IOException e) {
>         log.warn(e);
>         return null;
>     }
> }
> {code}