[
https://issues.apache.org/jira/browse/PIG-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697054#action_12697054
]
Alan Gates commented on PIG-724:
Currently Pig doesn't require that all keys and values in a map share the same
type. There is a proposal to change it so that key types can only be chararray
(see PIG-734), as we don't see anyone using anything but chararray and the
generality is causing us some other issues. But we still wouldn't require that
all values in a given map be of the same type. Are you proposing allowing
users to put a constraint on a given map so that all values in that particular
map must be of that type?
Treating integers and strings in PigStorage
---
Key: PIG-724
URL: https://issues.apache.org/jira/browse/PIG-724
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: 0.2.1
Reporter: Santhosh Srinivasan
Fix For: 0.2.1
Currently, PigStorage treats the materialized string 123 as an integer
with the value 123. If the user intended this to be the string 123,
PigStorage cannot deal with it. This reasoning also applies to doubles. As a
result, when a map contains values that are meant to be of the same type but
some of them manifest the issue described at the beginning of this paragraph,
Pig fails at runtime. An example to illustrate the problem will help.
In the example below a sample row in the data (map.txt) contains the
following:
[key01#35,key02#value01]
When Pig tries to convert the stream to a map, it creates a Map<Object,
Object> where, for key01, the key is a string and the value is an integer.
Running the script shown below results in a run-time error.
{code}
grunt> a = load 'map.txt' as (themap: map[]);
grunt> b = filter a by (chararray)(themap#'key01') == 'hello';
grunt> dump b;
2009-03-18 15:19:03,773 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2009-03-18 15:19:28,797 [main] ERROR
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Map reduce job failed
2009-03-18 15:19:28,817 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR
1081: Cannot cast to chararray. Expected bytearray but received: int
{code}
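To make the failure easier to follow, here is a minimal Java sketch of value-type guessing during map parsing. It is not Pig's actual bytesToMap code: the class MapValueGuess and its methods guessValue and parseMap are hypothetical names, and the sketch assumes a well-formed row of the form [key#value,key#value].

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not Pig's bytesToMap implementation.
final class MapValueGuess {
    // Guess a type from the text alone: integer first, then double, else string.
    static Object guessValue(String text) {
        try {
            return Integer.valueOf(text);
        } catch (NumberFormatException notAnInt) {
            // Not an integer; try double next.
        }
        try {
            return Double.valueOf(text);
        } catch (NumberFormatException notADouble) {
            return text; // Keep the raw text as a string.
        }
    }

    // Parse a well-formed row such as [key01#35,key02#value01].
    static Map<String, Object> parseMap(String row) {
        Map<String, Object> result = new LinkedHashMap<>();
        String body = row.substring(1, row.length() - 1); // strip '[' and ']'
        for (String entry : body.split(",")) {
            String[] kv = entry.split("#", 2);
            result.put(kv[0], guessValue(kv[1]));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> m = parseMap("[key01#35,key02#value01]");
        for (Map.Entry<String, Object> e : m.entrySet()) {
            // Print each key with the Java type chosen for its value.
            System.out.println(e.getKey() + " -> " + e.getValue().getClass().getSimpleName());
        }
    }
}
```

Under this kind of guessing, key01 ends up holding an Integer and key02 a String, so a later cast that expects a bytearray sees an int instead.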
There are two ways to resolve this issue:
1. Change the conversion routine for bytesToMap to return a map where the
value is a bytearray and not the actual type. This change breaks backward
compatibility.
2. Introduce checks in POCast so that conversions that are legal in the type
checking world are allowed, i.e., run-time checks will be made for
compatible casts. In the above example, an int can be converted to a
chararray and the cast will be made. If, on the other hand, it were a
chararray to int conversion, an exception would be thrown.
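The second option can be sketched roughly as follows. This is only an illustration of a run-time compatibility check, not POCast's real code: the class RuntimeCastSketch and its methods toChararray and toInt are hypothetical, chararray is modeled as String, bytearray as byte[], and the exact set of accepted source types is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of run-time cast checks; names do not come from Pig.
final class RuntimeCastSketch {
    // Cast a map value to chararray, accepting any compatible run-time type.
    static String toChararray(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof String) {
            return (String) value;
        }
        if (value instanceof byte[]) {
            // The bytearray case, which is the only one accepted before the change.
            return new String((byte[]) value, StandardCharsets.UTF_8);
        }
        if (value instanceof Integer || value instanceof Long
                || value instanceof Float || value instanceof Double) {
            // Assumption: numeric values are compatible and convert via their text form.
            return value.toString();
        }
        throw new IllegalArgumentException(
                "Cannot cast " + value.getClass().getSimpleName() + " to chararray");
    }

    // Cast a map value to int. A chararray source is rejected, as option 2 describes;
    // rejecting every other non-int type here is a simplification of this sketch.
    static Integer toInt(Object value) {
        if (value instanceof Integer) {
            return (Integer) value;
        }
        String received = (value == null) ? "null" : value.getClass().getSimpleName();
        throw new IllegalArgumentException("Cannot cast " + received + " to int");
    }

    public static void main(String[] args) {
        Object value = Integer.valueOf(35); // what themap#'key01' holds in the example
        String asText = toChararray(value); // the cast now succeeds
        System.out.println(asText.equals("hello")); // prints false
    }
}
```

With a check like this, the filter in the example would evaluate to false for the sample row instead of failing the job with ERROR 1081.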