[ https://issues.apache.org/jira/browse/PIG-1595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Thejas M Nair updated PIG-1595:
-------------------------------

    Attachment: PIG-1595.1.patch

PIG-1595.1.patch
- In this patch a new subclass of DependencyOrderWalker has been created (DependencyOrderWalkerLPScalar). When it chooses the sink nodes of the plan to start the walk, it chooses them in the order determined by the ReadScalars dependencies.
- The LOCast that was being added after ReadScalars to get the expected type is no longer necessary and has been removed.
- There is also a check in PigServer.mergeScalars() to see whether the LOStore that the code attempts to re-use has the same store function, InterStorage, that the ReadScalars udf uses to read its input.
- No new unit test case has been added, because the existing test TestScalarAliases.testFilteredScalarDollarProj, which was failing without the additional cast, now succeeds without the cast.

Unit tests have passed. Test-patch results are pasted below.
Patch is ready for review.

     [exec] -1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified tests.
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

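
To make the sink-ordering idea in the patch notes concrete, here is a minimal sketch in plain Java. It is not the code from PIG-1595.1.patch and does not use Pig's classes. The class name ScalarSinkOrder, the string sink names, and the dependsOn map are all hypothetical stand-ins. It only shows one way to pick sinks so that a sink producing a scalar's input is visited before the sink whose branch reads that scalar.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Pig's DependencyOrderWalkerLPScalar.
public class ScalarSinkOrder {

    /**
     * Returns the sinks in an order where every sink appears after the sinks
     * it depends on. dependsOn maps a sink to the sinks whose stored output
     * it reads as a scalar (assumed here to be sinks of the same plan).
     */
    public static List<String> orderSinks(List<String> sinks,
                                          Map<String, List<String>> dependsOn) {
        List<String> ordered = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String sink : sinks) {
            visit(sink, dependsOn, done, inProgress, ordered);
        }
        return ordered;
    }

    // Depth-first visit: emit a sink only after all of its dependencies.
    private static void visit(String sink,
                              Map<String, List<String>> dependsOn,
                              Set<String> done,
                              Set<String> inProgress,
                              List<String> ordered) {
        if (done.contains(sink)) {
            return;
        }
        if (!inProgress.add(sink)) {
            throw new IllegalStateException("Cyclic scalar dependency at " + sink);
        }
        List<String> deps = dependsOn.getOrDefault(sink, Collections.<String>emptyList());
        for (String dep : deps) {
            visit(dep, dependsOn, done, inProgress, ordered);
        }
        inProgress.remove(sink);
        done.add(sink);
        ordered.add(sink);
    }

    public static void main(String[] args) {
        // storeB's branch reads a scalar from the output written by storeA.
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("storeB", Arrays.asList("storeA"));
        // Prints [storeA, storeB] even though storeB was listed first.
        System.out.println(orderSinks(Arrays.asList("storeB", "storeA"), deps));
    }
}
```

With this ordering, a visitor that starts from the sinks would reach the scalar's input branch first, so its schema is available by the time the branch that reads the scalar is processed.
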
> casting relation to scalar- problem with handling of data from non PigStorage loaders
> -------------------------------------------------------------------------------------
>
>                 Key: PIG-1595
>                 URL: https://issues.apache.org/jira/browse/PIG-1595
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>         Attachments: PIG-1595.1.patch
>
>
> If load functions that don't follow the same bytearray format as PigStorage
> for other supported datatypes, or those that don't implement the LoadCaster
> interface, are used in 'casting relation to scalar' (PIG-1434), the query can
> fail or produce incorrect results.
> The root cause of the problem is that there is a real dependency between the
> ReadScalars udf that returns the scalar value and the LogicalOperator that
> acts as its input, but the logical plan does not capture this dependency. So
> the SchemaResetter visitor used by the optimizer does not take it into
> account when deciding the order in which schemas are reset and evaluated. If
> the schema of the input LogicalOperator is not evaluated before the
> ReadScalars udf, the result type of the ReadScalars udf becomes bytearray.
> POUserFunc will then convert the input to bytearray using
> 'new DataByteArray(inp.toString().getBytes())'.
> But this bytearray encoding of the other supported types might not be the same
> as the one used by the LoadFunction associated with the column, and that can
> result in problems.
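
The encoding mismatch described in the issue can be illustrated with a small, self-contained sketch. It does not use Pig's DataByteArray or LoadCaster. The class ScalarEncodingMismatch and its 4-byte big-endian "loader format" are assumptions made up for this example, and UTF-8 is assumed for the text encoding. The point is only that bytes produced from toString() are not what a loader with a binary format expects when it casts a bytearray back to int.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; no Pig classes are used.
public class ScalarEncodingMismatch {

    /** Text encoding, analogous to inp.toString().getBytes() (UTF-8 assumed). */
    public static byte[] textEncode(Object value) {
        return value.toString().getBytes(StandardCharsets.UTF_8);
    }

    /** Assumed binary loader format: an int is stored as 4 big-endian bytes. */
    public static byte[] binaryEncode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    /** Assumed loader-side cast from bytearray to int; requires exactly 4 bytes. */
    public static int binaryDecode(byte[] bytes) {
        if (bytes.length != 4) {
            throw new IllegalArgumentException("Expected 4 bytes, got " + bytes.length);
        }
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        // Round trip through the loader's own format is correct.
        System.out.println(binaryDecode(binaryEncode(1234)));   // prints 1234

        // Text bytes '1','2','3','4' (0x31 0x32 0x33 0x34) decode to a wrong value.
        System.out.println(binaryDecode(textEncode(1234)));     // prints 825373492

        // Text bytes of the wrong length make the cast fail outright.
        try {
            binaryDecode(textEncode(42));
        } catch (IllegalArgumentException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

The two failure modes in the sketch correspond to the two symptoms in the report: a silently incorrect result when the text bytes happen to be decodable, and a failed query when they are not.
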