[ https://issues.apache.org/jira/browse/FLINK-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14941379#comment-14941379 ]
Greg Hogan commented on FLINK-2716:
-----------------------------------
[~StephanEwen], I would like to use {{TypeComparator.hash}} within a
{{RichFlatMapFunction}} (similar to {{DataSet.count}}) for this implementation.
You had noted an earlier discussion about making serializers available to
{{RichFunction}} implementations, and access to type comparators could be
implemented the same way.
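A minimal sketch of the idea, not the eventual Flink implementation: each partition counts its elements and sums their hashes, and the partial results are then added together, so the checksum does not depend on element order or partitioning. {{Hasher}} is a hypothetical stand-in for the hashing role of {{TypeComparator.hash}}, and {{ChecksumSketch}} is an invented name used only for illustration.

```java
/** Hypothetical stand-in for the hashing role of Flink's TypeComparator. */
interface Hasher<T> {
    int hash(T record);
}

/** Order-independent summary: element count plus the sum of element hashes. */
final class ChecksumSketch {
    final long count;
    final long hashSum;

    ChecksumSketch(long count, long hashSum) {
        this.count = count;
        this.hashSum = hashSum;
    }

    /** Folds the elements of one partition into a partial checksum. */
    static <T> ChecksumSketch of(Iterable<T> elements, Hasher<T> hasher) {
        long count = 0;
        long sum = 0;
        for (T element : elements) {
            count++;
            // Treat the 32-bit hash as unsigned before adding it to the sum.
            sum += hasher.hash(element) & 0xffffffffL;
        }
        return new ChecksumSketch(count, sum);
    }

    /** Combines two partial results; addition is commutative, so order is irrelevant. */
    ChecksumSketch merge(ChecksumSketch other) {
        return new ChecksumSketch(count + other.count, hashSum + other.hashSum);
    }
}
```

Two runs over the same elements, split across partitions differently, should then yield equal {{count}} and {{hashSum}} values.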
Ideally the user would see only as many serializers and type comparators as
the function has inputs: {{getInputSerializer()}} for single-input functions,
and {{getFirstInputSerializer()}} and {{getSecondInputSerializer()}} for
dual-input functions.
Currently {{RichFlatMapFunction}} inherits from {{AbstractRichFunction}}, which
implements access to the {{RuntimeContext}}. We could add a layer and have each
single-input function inherit from an {{AbstractSingleInputRichFunction}}
(similar to how {{FlatMapOperator}} inherits from {{SingleInputUdfOperator}})
that would provide access to serializers and type comparators, and likewise an
{{AbstractTwoInputRichFunction}} for dual-input functions.
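The proposed layering might look roughly like the following sketch. It uses no Flink classes: every type name ending in {{Sketch}} is a hypothetical stand-in, the comparator accessors and the {{setInputTypeInfo}} hooks are assumptions about how the runtime could inject the objects, and only the {{get...Serializer()}} names come from the proposal above.

```java
/** Hypothetical stand-in for Flink's TypeSerializer. */
interface SerializerSketch<T> { }

/** Hypothetical stand-in for the hashing role of Flink's TypeComparator. */
interface ComparatorSketch<T> {
    int hash(T record);
}

/** Stand-in for the existing base class that exposes the RuntimeContext. */
abstract class RichFunctionBaseSketch {
    // RuntimeContext access would stay here, as in AbstractRichFunction.
}

/** Proposed layer: a single-input function sees exactly one serializer and comparator. */
abstract class SingleInputRichFunctionSketch<IN> extends RichFunctionBaseSketch {
    private SerializerSketch<IN> inputSerializer;
    private ComparatorSketch<IN> inputComparator;

    /** Assumed injection hook, called by the runtime before the function runs. */
    void setInputTypeInfo(SerializerSketch<IN> serializer, ComparatorSketch<IN> comparator) {
        this.inputSerializer = serializer;
        this.inputComparator = comparator;
    }

    public SerializerSketch<IN> getInputSerializer() { return inputSerializer; }

    public ComparatorSketch<IN> getInputComparator() { return inputComparator; }
}

/** Proposed layer for dual-input functions: one serializer per input. */
abstract class TwoInputRichFunctionSketch<IN1, IN2> extends RichFunctionBaseSketch {
    private SerializerSketch<IN1> firstInputSerializer;
    private SerializerSketch<IN2> secondInputSerializer;

    /** Assumed injection hook, mirroring the single-input case. */
    void setInputTypeInfo(SerializerSketch<IN1> first, SerializerSketch<IN2> second) {
        this.firstInputSerializer = first;
        this.secondInputSerializer = second;
    }

    public SerializerSketch<IN1> getFirstInputSerializer() { return firstInputSerializer; }

    public SerializerSketch<IN2> getSecondInputSerializer() { return secondInputSerializer; }
}

/** Example user function that hashes a record through the injected comparator. */
class HashingFunctionSketch extends SingleInputRichFunctionSketch<String> {
    int hashOf(String value) {
        return getInputComparator().hash(value);
    }
}
```

With this shape a user function never sees accessors for inputs it does not have, which is the point of splitting the base class by input count.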
> Checksum method for DataSet and Graph
> -------------------------------------
>
> Key: FLINK-2716
> URL: https://issues.apache.org/jira/browse/FLINK-2716
> Project: Flink
> Issue Type: Improvement
> Components: Gelly, Java API, Scala API
> Affects Versions: master
> Reporter: Greg Hogan
> Assignee: Greg Hogan
> Priority: Minor
>
> {{DataSet.count()}}, {{Graph.numberOfVertices()}}, and
> {{Graph.numberOfEdges()}} provide measures of the number of distributed data
> elements. New {{DataSet.checksum()}} and {{Graph.checksum()}} methods will
> summarize the content of data elements and support algorithm validation,
> integration testing, and benchmarking.