[
https://issues.apache.org/jira/browse/FLINK-4257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Greg Hogan updated FLINK-4257:
------------------------------
Description:
A class created by {{ProxyFactory}} can intercept and reinterpret method calls
using its {{MethodHandler}}, but is restricted in that
* the type of the proxy class cannot be changed
* method return types must be honored
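The second restriction can be sketched with the JDK's own {{java.lang.reflect.Proxy}}, used here only as a stand-in for {{ProxyFactory}} (an assumption made so the example needs no extra dependency). The {{Counter}} interface is hypothetical. The handler may answer a call however it likes, but a value that does not match the declared return type is rejected at the call site.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyReturnTypeSketch {
    /** Hypothetical interface whose method has a fixed return type. */
    public interface Counter {
        Integer count();
    }

    /** Builds a proxy whose handler answers every call with the given value. */
    public static Counter proxyReturning(Object value) {
        InvocationHandler handler = (proxy, method, args) -> value;
        return (Counter) Proxy.newProxyInstance(
            Counter.class.getClassLoader(),
            new Class<?>[] {Counter.class},
            handler);
    }

    /** Returns true if the handler's value is rejected with a ClassCastException. */
    public static boolean rejects(Object value) {
        try {
            proxyReturning(value).count();
            return false;
        } catch (ClassCastException e) {
            // The proxy casts the handler's result to the declared return type.
            return true;
        }
    }
}
```

An {{Integer}} passes through unchanged, whereas a {{String}} returned for {{count()}} fails with a {{ClassCastException}}.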
We have algorithms such as {{VertexDegree}} and {{TriangleListing}} that change
return type depending on configuration, even between single and dual input
functions. This can be problematic, e.g. in {{OperatorTranslation}} where we
test {{dataSet instanceof SingleInputOperator}} or {{dataSet instanceof
TwoInputOperator}}.
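A minimal sketch of why the {{instanceof}} tests are fragile, assuming stub classes that only borrow the Flink names (they are not the real Flink types, and {{SingleInputDelegate}} and {{translate}} are hypothetical). The delegate's class is fixed when it is created, so the dispatch keeps seeing a single-input operator even after the wrapped result has become a two-input one.

```java
public class InstanceofDispatchSketch {
    // Hypothetical stand-ins for the Flink classes of the same name.
    public static class DataSet {}
    public static class SingleInputOperator extends DataSet {}
    public static class TwoInputOperator extends DataSet {}

    /** Hypothetical delegate: its own class never changes, only what it wraps. */
    public static class SingleInputDelegate extends SingleInputOperator {
        public DataSet wrapped;

        public SingleInputDelegate(DataSet wrapped) {
            this.wrapped = wrapped;
        }
    }

    /** Dispatches on the runtime class, in the style of the instanceof tests. */
    public static String translate(DataSet dataSet) {
        if (dataSet instanceof SingleInputOperator) {
            return "single-input";
        }
        if (dataSet instanceof TwoInputOperator) {
            return "two-input";
        }
        return "unsupported";
    }
}
```

If the wrapped result is replaced by a {{TwoInputOperator}}, {{translate(delegate)}} still takes the single-input branch while {{translate(delegate.wrapped)}} takes the two-input branch.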
Even simply changing operator can be problematic, e.g.
{{MapOperator.translateToDataFlow}} returns {{MapOperatorBase}} whereas
{{ReduceOperator.translateToDataFlow}} returns {{SingleInputOperator}}.
I see two ways to solve these issues. By adding a simple {{NoOpOperator}} that
is skipped over during {{OperatorTranslation}} we could wrap all algorithm
output and always be proxying the same class.
Alternatively, making changes only within Gelly we can append a "no-op"
pass-through {{MapFunction}} to any algorithm output which is not a
{{SingleInputOperator}}. {{Delegate}} can then walk the superclass
hierarchy so that we are always proxying {{SingleInputOperator}}.
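The Gelly-only alternative might look roughly like the following sketch. The classes are stubs that only borrow Flink names, and {{findAncestor}} and {{normalize}} are hypothetical helpers, not existing API. Creating a new {{MapOperator}} stands in for appending a pass-through {{MapFunction}} to the result.

```java
public class DelegateTargetSketch {
    // Hypothetical stand-ins, not the real Flink classes.
    public static class DataSet {}
    public static class SingleInputOperator extends DataSet {}
    public static class TwoInputOperator extends DataSet {}
    public static class MapOperator extends SingleInputOperator {}
    public static class JoinOperator extends TwoInputOperator {}

    /**
     * Walks up from the concrete class and returns the target class if it is
     * the class itself or one of its superclasses, otherwise null.
     */
    public static Class<?> findAncestor(Class<?> concrete, Class<?> target) {
        for (Class<?> c = concrete; c != null; c = c.getSuperclass()) {
            if (c == target) {
                return c;
            }
        }
        return null;
    }

    /** Ensures the algorithm output can always be proxied as SingleInputOperator. */
    public static SingleInputOperator normalize(DataSet result) {
        if (findAncestor(result.getClass(), SingleInputOperator.class) != null) {
            return (SingleInputOperator) result;
        }
        // Stand-in for appending a no-op pass-through map to the result.
        return new MapOperator();
    }
}
```

With this shape the proxied class is {{SingleInputOperator}} regardless of which concrete operator the algorithm produced.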
There is one additional issue. When we call {{DataSet.output}} the delegate's
{{MethodHandler}} must reinterpret this call to add itself to the list of sinks.
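The reinterpretation of {{output}} can be sketched as follows, again using the JDK's {{java.lang.reflect.Proxy}} as a stand-in for the Javassist proxy. The {{Result}} interface and its {{output(List)}} signature are hypothetical and much simpler than {{DataSet.output}}. The point is only that forwarding the call unchanged would register the wrapped object, so the handler registers the proxy instead.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

public class OutputInterceptSketch {
    /** Hypothetical, simplified view of an algorithm result. */
    public interface Result {
        void output(List<Object> sinks);
        String name();
    }

    /** The wrapped result registers itself when output is called directly. */
    public static class RealResult implements Result {
        public void output(List<Object> sinks) {
            sinks.add(this);
        }

        public String name() {
            return "real";
        }
    }

    /** Creates a delegate that adds itself, not the wrapped result, to the sinks. */
    public static Result delegate(Result real) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("output")) {
                @SuppressWarnings("unchecked")
                List<Object> sinks = (List<Object>) args[0];
                sinks.add(proxy); // reinterpret: register the delegate
                return null;
            }
            // Every other call is forwarded to the wrapped result.
            return method.invoke(real, args);
        };
        return (Result) Proxy.newProxyInstance(
            Result.class.getClassLoader(),
            new Class<?>[] {Result.class},
            handler);
    }
}
```

After {{delegate(real).output(sinks)}} the list holds the delegate, while calls such as {{name()}} still reach the wrapped result.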
was:
A class created by {{ProxyFactory}} can intercept and reinterpret method calls
using its {{MethodHandler}}, but is restricted in that
* the type of the proxy class cannot be changed
* method return types must be honored
We have algorithms such as {{VertexDegree}} and {{TriangleListing}} that change
return type depending on configuration, even between single and dual input
functions. This can be problematic, e.g. in {{OperatorTranslation}} where we
test {{dataSet instanceof SingleInputOperator}} or {{dataSet instanceof
TwoInputOperator}}.
Even simply changing operator can be problematic, e.g.
{{MapOperator.translateToDataFlow}} returns {{MapOperatorBase}} whereas
{{ReduceOperator.translateToDataFlow}} returns {{SingleInputOperator}}.
I see two ways to solve these issues. By adding a simple {{NoOpOperator}} that
is skipped over during {{OperatorTranslation}} we could wrap all algorithm
output and always be proxying the same class.
Alternatively, making changes only within Gelly we can append a "no-op"
pass-through {{MapFunction}} to any algorithm output which is not a
{{SingleInputOperator}}. {{Delegate}} can then walk the superclass
hierarchy so that we are always proxying {{SingleInputOperator}}.
There is one additional issue. When we call {{DataSet.output}} the delegate's
{{MethodHandler}} must reinterpret this call to add itself to the list of sinks.
As part of this issue I will also add manual tests to Gelly for the library
algorithms which do not have integration tests.
> Handle delegating algorithm change of class
> -------------------------------------------
>
> Key: FLINK-4257
> URL: https://issues.apache.org/jira/browse/FLINK-4257
> Project: Flink
> Issue Type: Bug
> Components: Gelly
> Affects Versions: 1.1.0
> Reporter: Greg Hogan
> Assignee: Greg Hogan
>
> A class created by {{ProxyFactory}} can intercept and reinterpret method
> calls using its {{MethodHandler}}, but is restricted in that
> * the type of the proxy class cannot be changed
> * method return types must be honored
> We have algorithms such as {{VertexDegree}} and {{TriangleListing}} that
> change return type depending on configuration, even between single and dual
> input functions. This can be problematic, e.g. in {{OperatorTranslation}}
> where we test {{dataSet instanceof SingleInputOperator}} or {{dataSet
> instanceof TwoInputOperator}}.
> Even simply changing operator can be problematic, e.g.
> {{MapOperator.translateToDataFlow}} returns {{MapOperatorBase}} whereas
> {{ReduceOperator.translateToDataFlow}} returns {{SingleInputOperator}}.
> I see two ways to solve these issues. By adding a simple {{NoOpOperator}}
> that is skipped over during {{OperatorTranslation}} we could wrap all
> algorithm output and always be proxying the same class.
> Alternatively, making changes only within Gelly we can append a "no-op"
> pass-through {{MapFunction}} to any algorithm output which is not a
> {{SingleInputOperator}}. {{Delegate}} can then walk the superclass
> hierarchy so that we are always proxying {{SingleInputOperator}}.
> There is one additional issue. When we call {{DataSet.output}} the delegate's
> {{MethodHandler}} must reinterpret this call to add itself to the list of
> sinks.