[
https://issues.apache.org/jira/browse/PIG-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906932#action_12906932
]
Daniel Dai commented on PIG-1595:
---------------------------------
+1 for the test failure fix.
> casting relation to scalar- problem with handling of data from non PigStorage
> loaders
> -------------------------------------------------------------------------------------
>
> Key: PIG-1595
> URL: https://issues.apache.org/jira/browse/PIG-1595
> Project: Pig
> Issue Type: Bug
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.8.0
>
> Attachments: PIG-1595.1.patch, PIG-1595.2.patch
>
>
> Load functions that do not use the same bytearray format as PigStorage for
> the other supported datatypes, or that do not implement the LoadCaster
> interface, can cause a query to fail or produce incorrect results when they
> are used in 'casting relation to scalar' (PIG-1434).
> The root cause is that there is a real dependency between the ReadScalars
> udf that returns the scalar value and the LogicalOperator that acts as its
> input, but the logical plan does not capture this dependency. As a result,
> the SchemaResetter visitor used by the optimizer does not take the
> dependency into account when deciding the order in which schemas are reset
> and evaluated. If the schema of the input LogicalOperator is not evaluated
> before the ReadScalars udf, the result type of the ReadScalars udf becomes
> bytearray. POUserFunc then converts the input to bytearray using
> 'new DataByteArray(inp.toString().getBytes())'. This bytearray encoding of
> the other supported types might not match the encoding expected by the
> LoadFunction associated with the column, and that can result in problems.
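
The encoding mismatch described above can be illustrated with a minimal, self-contained sketch. It does not use any Pig classes. The class name ScalarEncodingSketch and its methods are hypothetical, and the 4-byte big-endian format stands in for whatever binary format a non-PigStorage loader might use. The sketch only shows why bytes produced by toString().getBytes() are not safe to hand to a caster that expects a different layout.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Pig.
public class ScalarEncodingSketch {

    // Mirrors the conversion quoted in the issue: the value's text form as bytes.
    static byte[] textEncode(Object value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Text-style decoding, which is what a PigStorage-like caster would do.
    static int textDecode(byte[] bytes) {
        return Integer.parseInt(new String(bytes, StandardCharsets.UTF_8));
    }

    // Assumed binary format for a non-PigStorage loader: 4-byte big-endian int.
    static byte[] binaryEncode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Decoder for the assumed binary format; rejects input of the wrong width.
    static int binaryDecode(byte[] bytes) {
        if (bytes.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        int scalar = 1234;
        byte[] viaToString = textEncode(scalar);

        // A text-based caster round-trips the value.
        System.out.println("text caster:   " + textDecode(viaToString));

        // The binary caster reads the ASCII digits as a raw int: an incorrect result.
        System.out.println("binary caster: " + binaryDecode(viaToString));

        // With any other digit count the binary caster fails outright.
        try {
            binaryDecode(textEncode(42));
        } catch (IllegalArgumentException e) {
            System.out.println("binary caster failed: " + e.getMessage());
        }
    }
}
```

Both failure modes named in the description appear here: a silently wrong value when the byte widths happen to line up, and a failure when they do not.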