Adding soft link to plan to solve input file dependency
-------------------------------------------------------

                 Key: PIG-1605
                 URL: https://issues.apache.org/jira/browse/PIG-1605
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.8.0
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.8.0


The scalar implementation needs to deal with implicit dependencies. 
[PIG-1603|https://issues.apache.org/jira/browse/PIG-1603] tries to solve 
the problem by adding a LOScalar operator. Here is a different approach: we 
add a soft link to the plan, and the soft link is visible only to the walkers. 
No other part of the logical plan knows that the soft link exists. The 
benefits are:

1. The logical plan does not need to deal with LOScalar, which keeps the 
logical plan cleaner.
2. Conceptually, a scalar dependency is different. A regular link represents 
data flow in the pipeline. In the scalar case, the dependency means that an 
operator depends on a file generated by another operator. It is a different 
type of data dependency.
3. Soft links can solve other dependency problems in the future. If we 
introduce another UDF that depends on a file generated by another operator, we 
can use this mechanism to solve it.
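
To make the idea concrete, here is a minimal sketch of a plan that keeps soft 
links apart from regular links. It is not Pig's actual logical plan code: the 
class SoftLinkPlan, its method names, and the use of plain strings as 
operators are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified plan; operators are plain strings here.
// This only illustrates the proposal and is not Pig's real API.
class SoftLinkPlan {
    // Regular links: data flowing through the pipeline.
    private final Map<String, List<String>> regular = new LinkedHashMap<>();
    // Soft links: "depends on a file generated by" relationships.
    private final Map<String, List<String>> soft = new LinkedHashMap<>();

    void connect(String from, String to) {
        regular.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    void createSoftLink(String from, String to) {
        soft.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    // What the rest of the plan sees: regular links only.
    List<String> getSuccessors(String op) {
        return regular.getOrDefault(op, Collections.emptyList());
    }

    // What a walker consults in addition to the regular links.
    List<String> getSoftLinkSuccessors(String op) {
        return soft.getOrDefault(op, Collections.emptyList());
    }

    public static void main(String[] args) {
        SoftLinkPlan plan = new SoftLinkPlan();
        plan.connect("load A", "store A_count");
        plan.connect("load B", "foreach (ReadScalar)");
        // The foreach reads the file written by the store, but no data
        // flows between the two operators, so the link is soft.
        plan.createSoftLink("store A_count", "foreach (ReadScalar)");

        System.out.println(plan.getSuccessors("store A_count"));
        System.out.println(plan.getSoftLinkSuccessors("store A_count"));
    }
}
```

In this sketch the store still looks like a leaf to any code that only asks 
for regular successors, which is the separation described in benefit 1.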

Currently, there are two cases where we can use a soft link:
1. Scalar dependency, where the ReadScalar UDF uses a file generated by a 
LOStore.
2. Store-load dependency, where we load a file that is generated by a store in 
the same script. This happens in the multi-store case. Currently we handle it 
with a regular link; a soft link is a better fit.
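
The store-load case can be sketched as a dependency-order walk. The example 
below is an assumption-laden illustration, not the actual Pig walker: the 
class DependencyOrderSketch, its methods, and the operator names are 
hypothetical, and the ordering uses a plain topological sort (Kahn's 
algorithm). The walker can either ignore soft links or treat them as ordering 
constraints.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical walker sketch; operators are plain strings.
class DependencyOrderSketch {
    private final Set<String> ops = new LinkedHashSet<>();
    private final Map<String, List<String>> regular = new LinkedHashMap<>();
    private final Map<String, List<String>> soft = new LinkedHashMap<>();

    void connect(String from, String to) {
        ops.add(from);
        ops.add(to);
        regular.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    void createSoftLink(String from, String to) {
        ops.add(from);
        ops.add(to);
        soft.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
    }

    private List<String> successors(String op, boolean followSoftLinks) {
        List<String> result = new ArrayList<>(
                regular.getOrDefault(op, Collections.emptyList()));
        if (followSoftLinks) {
            result.addAll(soft.getOrDefault(op, Collections.emptyList()));
        }
        return result;
    }

    // Topological order; soft links count as edges only when requested.
    List<String> walk(boolean followSoftLinks) {
        Map<String, Integer> indegree = new LinkedHashMap<>();
        for (String op : ops) {
            indegree.put(op, 0);
        }
        for (String op : ops) {
            for (String succ : successors(op, followSoftLinks)) {
                indegree.merge(succ, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (String op : ops) {
            if (indegree.get(op) == 0) {
                ready.add(op);
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String op = ready.remove();
            order.add(op);
            for (String succ : successors(op, followSoftLinks)) {
                if (indegree.merge(succ, -1, Integer::sum) == 0) {
                    ready.add(succ);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        DependencyOrderSketch plan = new DependencyOrderSketch();
        plan.connect("load raw", "store tmp");
        plan.connect("load tmp", "store out");
        // "load tmp" reads the file written by "store tmp".
        plan.createSoftLink("store tmp", "load tmp");

        System.out.println(plan.walk(false));
        System.out.println(plan.walk(true));
    }
}
```

Without the soft link, "load tmp" has no predecessor and may be visited before 
"store tmp" has written its file. With the soft link followed, the walk places 
"store tmp" first, without a regular link implying that data flows from the 
store into the load.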

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
