[ https://issues.apache.org/jira/browse/FLINK-1107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157801#comment-14157801 ]

ASF GitHub Bot commented on FLINK-1107:
---------------------------------------

Github user rmetzger commented on a diff in the pull request:

    https://github.com/apache/incubator-flink/pull/131#discussion_r18385767
  
    --- Diff: docs/hadoop_compatibility.md ---
    @@ -1,5 +1,75 @@
     ---
    -title: "Hadoop Compatability"
    +title: "Hadoop I/O Compatibility"
     ---
     
    -To be written.
    \ No newline at end of file
    +Flink not only supports types that implement Apache Hadoop's `Writable` interface by default, but also provides a compatibility
    +layer that allows for using any class extending `org.apache.hadoop.mapred(uce).InputFormat` as a Flink `InputFormat` as well as any
    +class extending `org.apache.hadoop.mapred(uce).OutputFormat` as a Flink `OutputFormat`.
    +
    +Thus, Flink can handle Hadoop-related formats ranging from the common `TextInputFormat` to third-party components such as Hive via HCatalog's `HCatInputFormat`. Flink supports formats using both the old `org.apache.hadoop.mapred` API and the new `org.apache.hadoop.mapreduce` API.
    +
    +This document explains how to configure your Maven project correctly and shows an example.
    +
    +### Project Configuration
    +
    +The Hadoop Compatibility Layer is part of the *addons* Maven project. All relevant classes are located in the `org.apache.flink.hadoopcompatibility` package, which contains separate sub-packages and classes for the Hadoop `mapred` and `mapreduce` APIs.
    +
    +Add the following dependency to your `pom.xml` to use the Hadoop Compatibility Layer.
    +
    +~~~xml
    +<dependency>
    +   <groupId>org.apache.flink</groupId>
    +   <artifactId>flink-hadoop-compatibility</artifactId>
    +   <version>{{site.FLINK_VERSION_STABLE}}</version>
    +</dependency>
    +~~~
    --- End diff --
    
    Can you add another headline here that quickly explains the usage?
    Just show how to read data from the environment into a DataSet using a
    Hadoop input format (something like a copy-paste skeleton).
    I think it's better if you have a) a general definition of how to use it
    and b) an example so that people see how to use it in context.
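
For illustration, a minimal copy-paste skeleton of the kind requested above, using the new `mapreduce` API. The wrapper class name `org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat` and the `Job.getInstance()` call (Hadoop 2.x) are assumptions based on the addon described in the diff, not a confirmed final API:

~~~java
// Hedged sketch: reading lines from HDFS through Hadoop's TextInputFormat,
// wrapped as a Flink InputFormat. The hadoopcompatibility class and package
// names below are assumptions; paths are placeholders.
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.hadoopcompatibility.mapreduce.HadoopInputFormat;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class HadoopInputSkeleton {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Wrap Hadoop's TextInputFormat: keys are byte offsets, values are lines.
        Job job = Job.getInstance(); // Hadoop 2.x; older Hadoop used "new Job()"
        HadoopInputFormat<LongWritable, Text> hadoopIF =
                new HadoopInputFormat<LongWritable, Text>(
                        new TextInputFormat(), LongWritable.class, Text.class, job);
        TextInputFormat.addInputPath(job, new Path("hdfs:///path/to/input"));

        // The wrapper yields the Hadoop key/value pairs as Flink Tuple2s.
        DataSet<Tuple2<LongWritable, Text>> lines = env.createInput(hadoopIF);

        lines.writeAsText("hdfs:///path/to/output");
        env.execute("Hadoop input format skeleton");
    }
}
~~~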


> Document how to use Avro files and the Hadoop Input Format Wrappers
> -------------------------------------------------------------------
>
>                 Key: FLINK-1107
>                 URL: https://issues.apache.org/jira/browse/FLINK-1107
>             Project: Flink
>          Issue Type: Task
>          Components: Documentation
>    Affects Versions: 0.7-incubating
>            Reporter: Robert Metzger
>            Assignee: Timo Walther
>            Priority: Minor
>
> The documentation lacks any examples or description of how to read from Avro
> files.
> Also, we should document the Hadoop Input Formats a bit.
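
As a rough starting point for the Avro half of this issue, a hedged sketch of reading an Avro file into a DataSet. The `org.apache.flink.api.java.io.AvroInputFormat` package location is an assumption (the class has lived in different modules across Flink versions), and the `User` record type is hypothetical:

~~~java
// Hedged sketch: reading Avro records into a Flink DataSet. The
// AvroInputFormat package name is an assumption, and User is a stand-in for
// a real Avro-generated (or reflect-mapped) record class.
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.io.AvroInputFormat;
import org.apache.flink.core.fs.Path;

public class AvroReadSkeleton {

    // Hypothetical record type; in practice this would be generated from an
    // Avro schema, or be a POJO that Avro's reflect-based reader can map.
    public static class User {
        public String name;
        public int favoriteNumber;

        @Override
        public String toString() {
            return name + " / " + favoriteNumber;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // AvroInputFormat deserializes each record in the file into a User.
        AvroInputFormat<User> avroIF = new AvroInputFormat<User>(
                new Path("hdfs:///path/to/users.avro"), User.class);

        DataSet<User> users = env.createInput(avroIF);
        users.writeAsText("hdfs:///path/to/output");
        env.execute("Avro read skeleton");
    }
}
~~~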


