Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.
The "SemanticsCleanup" page has been changed by AlanGates.
http://wiki.apache.org/pig/SemanticsCleanup?action=diff&rev1=1&rev2=2

--------------------------------------------------

  The bugs have been placed into the following categories:
   * Schema: These are related to schemas that are improperly inferred, etc.
   * Grammar: Places where the grammar is unclear or produces unexpected results.
-  * Two Level Access: The concept of two level access was introduced long ago to deal with oddities in bag schemas. Ideally we will remove this. At least we have to improve it.
+  * Nested Types: Issues dealing with bags, tuples, and maps.
+  * Dynamic Type Binding: In certain situations Pig assumes a value to be of type byte array when it does not know the actual type, and handles whatever actual type it is at runtime. There are situations where this does not work properly.

  == Bug Table ==
- || *JIRA* || *Category* || *Proposed Solution* ||
+ || '''JIRA''' || '''Category''' || '''Proposed Solution''' || '''Backward Compatible''' ||
- || [[https://issues.apache.org/jira/browse/PIG-1627|PIG-1627]] || Schema || Flattening a bag with an unknown schema should produce a record with an unknown schema ||
+ || [[https://issues.apache.org/jira/browse/PIG-1627|PIG-1627]] || Schema || Flattening a bag with an unknown schema should produce a record with an unknown schema || no ||
- || [[https://issues.apache.org/jira/browse/PIG-1584|PIG-1584]] || Grammar || Cogroup inner does not match the semantics of inner join. It is also not clear what value the inner keyword has for cogroup. ||
+ || [[https://issues.apache.org/jira/browse/PIG-1584|PIG-1584]] || Grammar || Cogroup inner does not match the semantics of inner join. It is also not clear what value the inner keyword has for cogroup. Consider removing it. || ||
- || [[https://issues.apache.org/jira/browse/PIG-1538|PIG-1538]] || Two level access || Remove two level access ||
+ || [[https://issues.apache.org/jira/browse/PIG-1538|PIG-1538]] || Nested types || Remove two level access || Maybe, if we can find a way to ignore calls to Schema.isTwoLevelAccessRequired(). ||
- || [[https://issues.apache.org/jira/browse/PIG-1536|PIG-1536]] || Schema || Pig one semantic for schema merges and use it consistently throughout Pig ||
+ || [[https://issues.apache.org/jira/browse/PIG-1536|PIG-1536]] || Schema || Pick one semantic for schema merges and use it consistently throughout Pig || no ||
+ || [[https://issues.apache.org/jira/browse/PIG-1341|PIG-1341]] || Dynamic type binding || Close as won't fix || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-1281|PIG-1281]] || Dynamic type binding || In situations where a Hadoop shuffle key is assumed to be of type bytearray wrap the value in a tuple so that if the type is actually something else Hadoop can still process it. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-1277|PIG-1277]] || Nested types || Unknown || ||
+ || [[https://issues.apache.org/jira/browse/PIG-1188|PIG-1188]] || Schema || Make sure Pig handles missing data in Tuples by returning a null rather than failing. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-1112|PIG-1112]] || Schema || When user provides AS to flatten of undefined bag or tuple, the contents of that AS are taken to be the schema of the bag or tuple. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-1065|PIG-1065]] || Dynamic type binding || In situations where a Hadoop shuffle key is assumed to be of type bytearray wrap the value in a tuple so that if the type is actually something else Hadoop can still process it. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-999|PIG-999]] || Dynamic type binding || In situations where a Hadoop shuffle key is assumed to be of type bytearray wrap the value in a tuple so that if the type is actually something else Hadoop can still process it. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-767|PIG-767]] || Nested types || Remove two level access; bring DUMP and DESCRIBE output into sync. || no ||
+ || [[https://issues.apache.org/jira/browse/PIG-730|PIG-730]] || Nested types || Make sure schema of union is the same as schema before union (suspect his is a two level access issue) || unclear ||
+ || [[https://issues.apache.org/jira/browse/PIG-723|PIG-723]] || Nested types || Suspect this is a two level access issue || unclear ||
+ || [[https://issues.apache.org/jira/browse/PIG-696|PIG-696]] || Dynamic type binding || Class cast exceptions such as this should result in a null value and a warning, not a failure. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-694|PIG-694]] || Nested types || Determine the semantics for merging tuples and bags. || unclear ||
+ || [[https://issues.apache.org/jira/browse/PIG-621|PIG-621]] || Dynamic type binding || Class cast exceptions such as this should result in a null value and a warning, not a failure. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-435|PIG-435]] || Schema || Decide definitely on what it means when users declare a schema for a load. || unclear ||
+ || [[https://issues.apache.org/jira/browse/PIG-333|PIG-333]] || Dynamic type binding || Since it is specified that MIN and MAX treat unknown types as double, all the actual string data should be converted to NULLs, rather than cause errors. || yes ||
+ || [[https://issues.apache.org/jira/browse/PIG-313|PIG-313]] || Grammar || I propose that we continue not supporting this. But we should detect it at compile time rather than at runtime. || yes ||
+ == Discussion ==
+ Removing two level access:
+
+ Schema merging and nested types:
+
+ Declaration of schemas in load:
+
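
Editor's illustration for the "Dynamic type binding" rows of the bug table (PIG-696, PIG-621, PIG-333): the proposal there is that a value of unknown type that cannot be converted should become a null plus a warning rather than a failure. The sketch below shows one way that behaviour could look in plain Java. It is not Pig's code: the class `LenientCast`, its methods, and the `warnings` counter are hypothetical names invented for this example, and treating unknown values as double follows only the MIN/MAX wording in the PIG-333 row.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Pig.
class LenientCast {
    // Stand-in for a warning counter: incremented once per value that fails to convert.
    static int warnings = 0;

    // Treat a value of unknown type as a double.
    // Returns null (and records a warning) instead of throwing on bad data.
    static Double toDoubleOrNull(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof Number) {
            return ((Number) value).doubleValue();
        }
        try {
            return Double.valueOf(value.toString().trim());
        } catch (NumberFormatException e) {
            warnings++;
            return null;
        }
    }

    // MAX over values of unknown type: nulls are skipped,
    // and the result is null when no value converts.
    static Double max(List<?> values) {
        Double best = null;
        for (Object v : values) {
            Double d = toDoubleOrNull(v);
            if (d != null && (best == null || d > best)) {
                best = d;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Object> column = Arrays.asList("3.5", "apple", 7, null);
        System.out.println("max = " + max(column));
        System.out.println("warnings = " + warnings);
    }
}
```

In this sketch the string "apple" is dropped from the aggregate and counted as a warning, while the remaining values are still processed, which is the backward-compatible outcome the table marks as "yes" for these rows.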