Schema of a relation reported by DESCRIBE and allowed operations on the
relation are not compatible
---------------------------------------------------------------------------------------------------
Key: PIG-768
URL: https://issues.apache.org/jira/browse/PIG-768
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.2.0
Reporter: George Mavromatis
Fix For: 0.2.0
The DESCRIBE command in the following script prints:
{s: bytearray, pg: bytearray, wm: bytearray}
However, the script later treats the s field of urlMap as a map instead of a
bytearray, as shown in s#'Url'.
Pig does not complain about this contradiction, and at execution time the s
field is treated as a map, although it was reported as a bytearray at parse time.
Pig should either not report s as a bytearray or exit with a parsing error.
Note that all of the above operations happen before the query executes on the cluster.
register WebDataProcessing.jar;
register opencrawl.jar;
urlMap = LOAD '$input' USING opencrawl.pigudf.WebDataLoader() AS (s, pg, wm);
DESCRIBE urlMap;
-- In fact, the loader in WebDataProcessing.jar populates s and pg as
--   s:map[], pg:bag{t1:(contents:bytearray)}
-- and defines that in determineSchema(), but Pig's DESCRIBE ignores it!
urlMap2 = LIMIT urlMap 20;
urlList2 = FOREACH urlMap2 GENERATE s#'Url', pg;
DESCRIBE urlList2;
STORE urlList2 INTO 'output2' USING BinStorage();
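
A possible workaround is to declare the field types explicitly in the AS clause, so that the schema DESCRIBE reports agrees with how the script uses s. This is a sketch only: it assumes the loader really returns the types named in the script comment above, and it has not been tested against 0.2.0.

```pig
register WebDataProcessing.jar;
register opencrawl.jar;

-- Assumption: these types mirror what the loader's determineSchema() defines,
-- as stated in the comment in the original script.
urlMap = LOAD '$input' USING opencrawl.pigudf.WebDataLoader()
         AS (s:map[], pg:bag{t1:(contents:bytearray)}, wm);
DESCRIBE urlMap;

-- With s declared as a map, the s#'Url' dereference matches the declared schema.
urlList2 = FOREACH urlMap GENERATE s#'Url', pg;
DESCRIBE urlList2;
```

This only sidesteps the symptom. The underlying issue remains that an untyped AS clause reports bytearray while the map dereference is still accepted.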