[ https://issues.apache.org/jira/browse/PIG-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thejas M Nair updated PIG-1595:
-------------------------------

    Attachment: PIG-1595.1.patch

PIG-1595.1.patch 
- This patch adds a new subclass of DependencyOrderWalker, 
DependencyOrderWalkerLPScalar. When it chooses the sink nodes of the plan to 
start the walk, it picks them in the dependency order implied by the 
ReadScalars dependencies.
- The LOCast that was being added after ReadScalars to get the expected type 
is no longer necessary and has been removed. 
- There is also a check in PigServer.mergeScalars() to see whether the LOStore 
that the code attempts to re-use has the same store function, InterStorage, 
which the ReadScalars udf uses to read its input.
- No new unit test case has been added, because the existing test 
TestScalarAliases.testFilteredScalarDollarProj, which was failing without the 
additional cast, now succeeds without the cast.
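
The sink ordering described in the first bullet can be illustrated with a small 
standalone sketch. This is not the DependencyOrderWalkerLPScalar code from the 
patch: the class name, the method names, and the use of plain strings for sinks 
are all assumptions made for illustration. It only shows the idea that a sink 
whose output is read as a scalar is visited before the sink that reads it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and data model are not taken from the Pig code base.
class ScalarSinkOrderSketch {

    // scalarDeps maps a sink to the sinks whose output it reads as a scalar.
    // Returns the sinks so that every dependency precedes its reader.
    static List<String> orderSinks(List<String> sinks,
                                   Map<String, List<String>> scalarDeps) {
        Set<String> done = new LinkedHashSet<>();       // keeps insertion order
        Set<String> inProgress = new LinkedHashSet<>(); // used for cycle detection
        for (String sink : sinks) {
            visit(sink, scalarDeps, done, inProgress);
        }
        return new ArrayList<>(done);
    }

    private static void visit(String sink, Map<String, List<String>> scalarDeps,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(sink)) {
            return;
        }
        if (!inProgress.add(sink)) {
            throw new IllegalStateException("cyclic scalar dependency at " + sink);
        }
        // Visit the sinks this one depends on before emitting it.
        for (String dep : scalarDeps.getOrDefault(sink, Collections.<String>emptyList())) {
            visit(dep, scalarDeps, done, inProgress);
        }
        inProgress.remove(sink);
        done.add(sink);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        // storeB's plan reads the output of storeA as a scalar.
        deps.put("storeB", Arrays.asList("storeA"));
        // prints [storeA, storeB, storeC]
        System.out.println(orderSinks(Arrays.asList("storeB", "storeA", "storeC"), deps));
    }
}
```

Sinks without scalar dependencies keep their original relative order, so the 
walk only changes where a ReadScalars dependency requires it.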

Unit tests have passed. Test-patch results are pasted below. Patch is 
ready for review.
     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.





> casting relation to scalar- problem with handling of data from non PigStorage 
> loaders
> -------------------------------------------------------------------------------------
>
>                 Key: PIG-1595
>                 URL: https://issues.apache.org/jira/browse/PIG-1595
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1595.1.patch
>
>
> If load functions that don't follow the same bytearray format as PigStorage 
> for other supported datatypes, or that don't implement the LoadCaster 
> interface, are used in 'casting relation to scalar' (PIG-1434), the query 
> can fail or produce incorrect results.
> The root cause of the problem is that there is a real dependency between the 
> ReadScalars udf that returns the scalar value and the LogicalOperator that 
> acts as its input, but the logical plan does not capture this dependency. So 
> in the SchemaResetter visitor used by the optimizer, the order in which 
> schemas are reset and evaluated does not take it into account. If the schema 
> of the input LogicalOperator does not get evaluated before the ReadScalars 
> udf, the result type of the ReadScalars udf becomes bytearray. POUserFunc 
> will then convert the input to bytearray using 
> 'new DataByteArray(inp.toString().getBytes())'. But this bytearray encoding 
> of other supported types might not be the same as the one used by the 
> LoadFunction associated with the column, and that can result in problems.
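
The encoding mismatch described above can be shown without any Pig classes. 
The sketch below is an illustration under stated assumptions: the 4-byte 
big-endian format and the names textEncode, binaryEncode and binaryDecode are 
hypothetical stand-ins for a non-PigStorage loader and its caster, and UTF-8 
is used explicitly where the quoted expression relies on the platform default 
charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; none of these methods exist in Pig.
class ByteArrayEncodingSketch {

    // Text encoding, in the spirit of inp.toString().getBytes() from the issue.
    static byte[] textEncode(Object value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Assumed loader format: an int stored as 4 big-endian bytes.
    static byte[] binaryEncode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Assumed caster for that format; rejects input that is not exactly 4 bytes.
    static int binaryDecode(byte[] bytes) {
        if (bytes.length != 4) {
            throw new IllegalArgumentException("expected 4 bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(textEncode(42)));   // [52, 50]
        System.out.println(Arrays.toString(binaryEncode(42))); // [0, 0, 0, 42]
        // Round trip in the loader's own format works.
        System.out.println(binaryDecode(binaryEncode(42)));    // 42
        // Text bytes that happen to be 4 long are silently misread.
        System.out.println(binaryDecode(textEncode(1234)));    // 825373492
        // Text bytes of any other length make the caster fail.
        try {
            binaryDecode(textEncode(42));
        } catch (IllegalArgumentException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

The two outcomes correspond to the two symptoms in the description: a wrong 
value when the text bytes happen to fit the loader's format, and a failure 
when they do not.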

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
