[
https://issues.apache.org/jira/browse/PIG-1277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-1277:
----------------------------
Attachment: PIG-1277-3.patch
test-patch result:
[exec] -1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 9 new or
modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning
messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number
of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs
warnings.
[exec]
[exec] -1 release audit. The applied patch generated 468 release
audit warnings (more than the trunk's current 464 warnings).
The release audit warnings appear because the change to the NullableBytesWritable
constructor triggered a new jdiff file.
Unit test:
all pass
end-to-end test:
all pass
> Pig should give error message when cogroup on tuple keys of different inner
> type
> --------------------------------------------------------------------------------
>
> Key: PIG-1277
> URL: https://issues.apache.org/jira/browse/PIG-1277
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.6.0
> Reporter: Daniel Dai
> Assignee: Alan Gates
> Fix For: 0.9.0
>
> Attachments: PIG-1277-1.patch, PIG-1277-2.patch, PIG-1277-3.patch
>
>
> When we cogroup on tuple keys and the inner types of the tuples do not match,
> we treat them as different keys. This is confusing. It would be better to give
> an error or warning when this happens.
> Here is one example:
> UDF:
> {code}
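> import java.io.IOException;
> import java.util.HashMap;
> import java.util.Map;
> import org.apache.pig.EvalFunc;
> import org.apache.pig.data.DataType;
> import org.apache.pig.data.Tuple;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
>
> public class MapGenerate extends EvalFunc<Map> {
>     @Override
>     public Map exec(Tuple input) throws IOException {
>         // The map value is an Integer at runtime, although the schema
>         // below declares only an untyped map.
>         Map m = new HashMap();
>         m.put("key", new Integer(input.size()));
>         return m;
>     }
>
>     @Override
>     public Schema outputSchema(Schema input) {
>         return new Schema(new Schema.FieldSchema(null, DataType.MAP));
>     }
> }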
> {code}
> Pig script:
> {code}
> a = load '1.txt' as (a0);
> b = foreach a generate a0, MapGenerate(*) as m:map[];
> c = foreach b generate a0, m#'key' as key;
> d = load '2.txt' as (c0, c1);
> e = cogroup c by (a0, key), d by (c0, c1);
> dump e;
> {code}
> 1.txt
> {code}
> 1
> {code}
> 2.txt
> {code}
> 1 1
> {code}
> User expected result (which is not right):
> {code}
> ((1,1),{(1,1)},{(1,1)})
> {code}
> Actual result:
> {code}
> ((1,1),{(1,1)},{})
> ((1,1),{},{(1,1)})
> {code}
> We should tell the user that the keys cannot be merged because of the type
> mismatch.
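
The behaviour described above can be reproduced outside Pig. The following is a minimal sketch, not the code in the attached patches: it uses plain `java.util.List` values as stand-ins for Pig tuples and `String` as a stand-in for the untyped (bytearray) fields loaded from `2.txt`. The class name `TupleKeyCheck` and the method `firstMismatch` are hypothetical. It shows two keys that print identically but compare as unequal, and one possible shape of a check that reports which inner field differs.

```java
import java.util.Arrays;
import java.util.List;

public class TupleKeyCheck {

    /**
     * Hypothetical helper: returns a description of the first position at
     * which the two keys hold values of different runtime classes, or null
     * if the inner types match. Null fields are skipped.
     */
    public static String firstMismatch(List<?> left, List<?> right) {
        if (left.size() != right.size()) {
            return "key arity differs: " + left.size() + " vs " + right.size();
        }
        for (int i = 0; i < left.size(); i++) {
            Object l = left.get(i);
            Object r = right.get(i);
            if (l == null || r == null) {
                continue; // nothing to compare for a null field
            }
            if (!l.getClass().equals(r.getClass())) {
                return "field " + i + ": " + l.getClass().getSimpleName()
                        + " vs " + r.getClass().getSimpleName();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Stand-in for the key (a0, key): the second field is an Integer from the map.
        List<Object> keyFromC = Arrays.<Object>asList("1", Integer.valueOf(1));
        // Stand-in for the key (c0, c1): both fields are untyped, modelled here as String.
        List<Object> keyFromD = Arrays.<Object>asList("1", "1");

        System.out.println(keyFromC + " vs " + keyFromD);              // both print as [1, 1]
        System.out.println("equal: " + keyFromC.equals(keyFromD));     // equal: false
        System.out.println("mismatch: " + firstMismatch(keyFromC, keyFromD));
    }
}
```

Both keys print the same, which is why the two output rows in the actual result look identical. A check of this kind could only produce the message at runtime, because the script's schema does not reveal the Integer stored in the map.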