[
https://issues.apache.org/jira/browse/PIG-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12584072#action_12584072
]
shravanmn edited comment on PIG-161 at 4/1/08 2:10 AM:
----------------------------------------------------------------------------
Thank you for the input Pi. Responses in line...
* The proposal in wiki says "Make the model entirely push-based" but this
implementation is pull-based. What is the real direction on this?
[shrav] Hmm, Interesting observation. We never realized there could be such a
grave error lurking in the documentation. The consensus was to have a single
threaded, pull based model. The current model is push based. We need to change
this.
@Alan: I am not sure how this happened?!
* This comment is obvious. Those dummy type instances in POFilter can be
made static. If they are used all over the place, we should move them to a more
generic parent class.
[shrav] Its a good idea. I think we can introduce these into PhysicalOperator
as static dummy types. But we should also make this known to anyone
implementing physical operators so that these can be used. I will move them.
* visit() should not throw ParseException. Look at PIG-169 for example. In
some cases we use visit later in the processing after parsing. If we expect
ParseException when parsing then we can just wrap in ParseException.
[shrav] This is a side effect of the Operator's visit method throwing
ParseException. But I think you are right. We should throw some other generic
exception than a ParseException.
@Alan: Could you please comment on this?
* Therefore doAllPredecessors() and doAllPredecessors() also should not
throw ParseException
[shrav] Same as the previous one.
* GreaterThanExpr, LessThanExpr, GTOrEqualToExpr, NotEqualToExpr + a few
more are 90% the same. Is it possible to have a common abstract class for them?
[shrav] Well, if you can think of a better way it would be great. The main
thing that nfluenced this choice was that we wanted to minimize the branching
constructs. The moment we introduce an abstract class, we would need them. This
was done because the physical side assumes that the type checker will provide
appropriate type information and we can use them inside expressions to avoid
branching.
* This is subjective. The name depthFirst() for me is a bit mis-leading
because this specific depthFirst() doesn't walk over already seen nodes.
[shrav] I concur. It is depthFirst in a restricted sense. Again the same search
can be done in both directions either from the top or from the bottom. If we
choose to change this in OperatorPlan, I am fine with this.
@Alan: Your comments needed.
* From this "The input model assumes that it can either be taken from an
operator or can be attached directly to this operator" could you explain more
what attachinput and input in PhysicalOperator do?
[shrav] PhysicalOperator is the base class for all the operators. Since each
operator needed the same input model, I just refactored it into the base class.
So the input model is an artifact of the execution model. Since we decided have
attribute plans, the model that we converged to was that the attribute plans be
separated from the top level operator plan and scoped within the operator that
it is an attribute of. So when the top level operator pulls tuples from its
input operator, it attaches these tuples to the attribute plan and calls the
getNext on the root operator of the plan. The attach of the attribute plan in
turn uses attachInput fn the leaf operators. I hope I am clear. If not please
let me know. I wil rephrase it.
was (Author: shravanmn):
Thank you for the input Pi. Responses in line...
* The proposal in wiki says "Make the model entirely push-based" but this
implementation is pull-based. What is the real direction on this?
Hmm, Interesting observation. We never realized there could be such a grave
error lurking in the documentation. The consensus was to have a single
threaded, pull based model. The current model is push based. We need to change
this.
@Alan: I am not sure how this happened?!
* This comment is obvious. Those dummy type instances in POFilter can be
made static. If they are used all over the place, we should move them to a more
generic parent class.
Its a good idea. I think we can introduce these into PhysicalOperator as static
dummy types. But we should also make this known to anyone implementing physical
operators so that these can be used. I will move them.
* visit() should not throw ParseException. Look at PIG-169 for example. In
some cases we use visit later in the processing after parsing. If we expect
ParseException when parsing then we can just wrap in ParseException.
This is a side effect of the Operator's visit method throwing ParseException.
But I think you are right. We should throw some other generic exception than a
ParseException.
@Alan: Could you please comment on this?
* Therefore doAllPredecessors() and doAllPredecessors() also should not
throw ParseException
Same as the previous one.
* GreaterThanExpr, LessThanExpr, GTOrEqualToExpr, NotEqualToExpr + a few
more are 90% the same. Is it possible to have a common abstract class for them?
Well, if you can think of a better way it would be great. The main thing that
nfluenced this choice was that we wanted to minimize the branching constructs.
The moment we introduce an abstract class, we would need them. This was done
because the physical side assumes that the type checker will provide
appropriate type information and we can use them inside expressions to avoid
branching.
* This is subjective. The name depthFirst() for me is a bit mis-leading
because this specific depthFirst() doesn't walk over already seen nodes.
I concur. It is depthFirst in a restricted sense. Again the same search can be
done in both directions either from the top or from the bottom. If we choose to
change this in OperatorPlan, I am fine with this.
@Alan: Your comments needed.
* From this "The input model assumes that it can either be taken from an
operator or can be attached directly to this operator" could you explain more
what attachinput and input in PhysicalOperator do?
PhysicalOperator is the base class for all the operators. Since each operator
needed the same input model, I just refactored it into the base class. So the
input model is an artifact of the execution model. Since we decided have
attribute plans, the model that we converged to was that the attribute plans be
separated from the top level operator plan and scoped within the operator that
it is an attribute of. So when the top level operator pulls tuples from its
input operator, it attaches these tuples to the attribute plan and calls the
getNext on the root operator of the plan. The attach of the attribute plan in
turn uses attachInput fn the leaf operators. I hope I am clear. If not please
let me know. I wil rephrase it.
> Rework physical plan
> --------------------
>
> Key: PIG-161
> URL: https://issues.apache.org/jira/browse/PIG-161
> Project: Pig
> Issue Type: Sub-task
> Reporter: Alan Gates
> Assignee: Alan Gates
> Attachments: Phy_AbsClass.patch
>
>
> This bug tracks work to rework all of the physical operators as described in
> http://wiki.apache.org/pig/PigTypesFunctionalSpec
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.