[ 
https://issues.apache.org/jira/browse/KAFKA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gwen Shapira resolved KAFKA-1705.
---------------------------------
    Resolution: Won't Fix

If I didn't do it by now...
Also, MapReduce is so 2014 :)

> Add MR layer to Kafka
> ---------------------
>
>                 Key: KAFKA-1705
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1705
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Gwen Shapira
>            Assignee: Gwen Shapira
>            Priority: Major
>
> Many NoSQL-type storage systems (HBase, Mongo,
> Cassandra) and file formats (Avro, Parquet) provide a MapReduce
> integration layer - usually an InputFormat, OutputFormat and a utility
> class. Sometimes there's also an abstract Job and Mapper that do more
> setup, which can make things even more convenient.
> This is different from the existing Hadoop contrib project or Camus in that
> an MR layer would provide components for use in MR jobs, not an entire
> job that ingests data from Kafka to HDFS.
> The benefits I see for a MapReduce layer are:
> * Developers can create their own jobs, processing the data as it is
> ingested - rather than having to process it in two steps.
> * There are reusable components for developers looking to integrate with
> Kafka, rather than having everyone implement their own solution.
> * Hadoop developers expect projects to have this layer.
> * Spark reuses Hadoop's InputFormat and OutputFormat - so we get Spark
> integration for free.
> * There's a layer to plug the delegation token code into and make it
> invisible to MapReduce developers. Without this, everyone who writes
> MR jobs will need to think about how to implement authentication.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
