Setting twoLevelAccessRequired field in a bag schema should not be required to 
access fields in the tuples of the bag

                 Key: PIG-847
             Project: Pig
          Issue Type: Improvement
    Affects Versions: 0.2.1
            Reporter: Pradeep Kamath

Currently Pig interprets the result type of a relation as a bag, but the 
schema of the relation directly contains the field schemas that describe the 
fields in the relation's tuples. In contrast, when a UDF returns a bag, when 
the input data contains a bag, or when the user creates a bag constant, the 
schema of the bag has a single field schema, which is that of the tuple, and 
the tuple's schema in turn holds the types of the fields. To access the fields 
from the bag directly in such a case, using something like 
<bagname>.<fieldname> or <bagname>.<fieldposition>, the schema of the bag must 
have twoLevelAccessRequired set to true so that Pig's type system can traverse 
the tuple schema and get to the field in question.

This is confusing; we should see whether we can avoid needing this extra flag. 
One possible solution is to treat bags the same way, whether they represent 
relations or real bags. Another is to introduce a special "relation" datatype 
for the result type of a relation, so that the bag type would be used only for 
true bags. In this case, a bag schema would always need to contain a tuple 
schema describing the fields.
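
The difference between the two lookups can be sketched with a small toy model. 
This is not Pig's actual Schema code: the class names, the function names 
resolve_with_flag and resolve_always_descend, and the field layout are all 
hypothetical, and only the two_level_access_required attribute is meant to 
mirror the flag discussed in this issue.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FieldSchema:
    """Toy stand-in for a field schema (hypothetical, not Pig's class)."""
    alias: Optional[str]
    kind: str                           # e.g. "int", "chararray", "tuple"
    schema: Optional["Schema"] = None   # nested schema for tuple fields


@dataclass
class Schema:
    """Toy stand-in for a schema; the flag mirrors twoLevelAccessRequired."""
    fields: List[FieldSchema]
    two_level_access_required: bool = False

    def get_field(self, alias: str) -> Optional[FieldSchema]:
        # Look up a field by alias at this level only.
        for f in self.fields:
            if f.alias == alias:
                return f
        return None


def resolve_with_flag(bag_schema: Schema, alias: str) -> Optional[FieldSchema]:
    """Behaviour described in the issue: the lookup steps through the
    tuple level only when the flag is set on the bag schema."""
    if bag_schema.two_level_access_required:
        tuple_field = bag_schema.fields[0]
        return tuple_field.schema.get_field(alias)
    return bag_schema.get_field(alias)


def resolve_always_descend(bag_schema: Schema, alias: str) -> Optional[FieldSchema]:
    """Sketch of the second proposal: a bag schema always wraps exactly
    one tuple schema, so the lookup always descends and needs no flag."""
    if len(bag_schema.fields) != 1 or bag_schema.fields[0].kind != "tuple":
        raise ValueError("bag schema must contain exactly one tuple schema")
    return bag_schema.fields[0].schema.get_field(alias)


# A "true" bag: one tuple field schema that holds the actual fields.
people = Schema([
    FieldSchema("t", "tuple", Schema([
        FieldSchema("name", "chararray"),
        FieldSchema("age", "int"),
    ])),
])
```

With the flag left unset, resolve_with_flag(people, "age") finds nothing, 
because it only searches the outer level that holds the tuple; 
resolve_always_descend(people, "age") reaches the field without any flag, at 
the cost of requiring every bag schema to carry a tuple schema.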
