[ https://issues.apache.org/jira/browse/FLINK-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661491#comment-14661491 ]
Fabian Hueske commented on FLINK-2444:
--------------------------------------
Hi [~James_cao],
I think for these tests it is not necessary to use a MiniCluster and execute
actual Hadoop InputFormats. What these tests must ensure is that all methods of
Hadoop InputFormats (and InputSplits) are called and are called in the correct
order by Flink's HadoopInputFormat wrapper. To do that, we need to look into
Hadoop's code. The two different Hadoop APIs ({{mapred}} and {{mapreduce}}) make this a bit more difficult.
Also InputFormats and InputSplits can implement a {{Configurable}} or
{{JobConfigurable}} interface. So there are several corner cases that we need
to check. So far, no tests cover this. The InputFormats are
updated whenever a user reports an issue, but we have no tests that check
that a fix does not break something else.
To solve this issue, I would implement mock Hadoop InputFormat (and
InputSplit) classes that check that all methods are called in the correct
order. For example, check that the {{configure()}} method of a configurable
InputFormat or InputSplit is called by the Flink HadoopInputFormat wrapper.
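A minimal sketch of such a recording mock follows. It only illustrates the idea and is not the actual test code. To stay self-contained it uses simplified stand-in interfaces instead of the real Hadoop {{InputFormat}} and {{JobConfigurable}}, and a hypothetical {{runWrapper()}} method in place of Flink's HadoopInputFormat wrapper. All names and signatures in it are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CallOrderSketch {

    // Stand-ins for the Hadoop interfaces (assumption: heavily simplified,
    // NOT the real org.apache.hadoop signatures).
    interface JobConfigurable {
        void configure(String jobConf);
    }

    interface InputFormat {
        String[] getSplits(String jobConf);
        void getRecordReader(String split, String jobConf);
    }

    // Mock InputFormat that records every call it receives, in order.
    static class RecordingInputFormat implements InputFormat, JobConfigurable {
        final List<String> calls = new ArrayList<>();

        @Override
        public void configure(String jobConf) {
            calls.add("configure");
        }

        @Override
        public String[] getSplits(String jobConf) {
            calls.add("getSplits");
            return new String[] {"split-0"};
        }

        @Override
        public void getRecordReader(String split, String jobConf) {
            calls.add("getRecordReader");
        }
    }

    // Hypothetical stand-in for the wrapper under test: configure the format
    // if it is configurable, then create splits, then open each split.
    static void runWrapper(InputFormat format, String jobConf) {
        if (format instanceof JobConfigurable) {
            ((JobConfigurable) format).configure(jobConf);
        }
        for (String split : format.getSplits(jobConf)) {
            format.getRecordReader(split, jobConf);
        }
    }

    // Drives the stand-in wrapper with the mock and returns the recorded calls.
    public static List<String> recordCalls() {
        RecordingInputFormat mock = new RecordingInputFormat();
        runWrapper(mock, "job-conf");
        return mock.calls;
    }

    public static void main(String[] args) {
        List<String> expected =
                Arrays.asList("configure", "getSplits", "getRecordReader");
        List<String> actual = recordCalls();
        if (!expected.equals(actual)) {
            throw new AssertionError("unexpected call order: " + actual);
        }
        System.out.println("call order ok: " + actual);
    }
}
```

In a real test, the mock would implement the actual Hadoop interfaces and be passed to Flink's wrapper. The same pattern would be repeated for the {{Configurable}} case and for InputSplits.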
Please let me know if you have further questions and if you'd like to work on
this issue.
> Add tests for HadoopInputFormats
> --------------------------------
>
> Key: FLINK-2444
> URL: https://issues.apache.org/jira/browse/FLINK-2444
> Project: Flink
> Issue Type: Test
> Components: Hadoop Compatibility, Tests
> Affects Versions: 0.10, 0.9.0
> Reporter: Fabian Hueske
> Labels: starter
>
> The HadoopInputFormats and HadoopInputFormatBase classes are not sufficiently
> covered by unit tests.
> We need tests that ensure that the methods of the wrapped Hadoop InputFormats
> are correctly called.