[ https://issues.apache.org/jira/browse/BEAM-549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419570#comment-15419570 ]

ASF GitHub Bot commented on BEAM-549:
-------------------------------------

GitHub user amitsela opened a pull request:

    https://github.com/apache/incubator-beam/pull/822

    [BEAM-549] SparkRunner should support Beam's KafkaIO instead of providing its own.



You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/amitsela/incubator-beam BEAM-549

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-beam/pull/822.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #822
    
----
commit 960b4e9847df3d3fd5b9ab99bcfbeae6b2c24f01
Author: Sela <[email protected]>
Date:   2016-08-12T21:40:09Z

    Spark Kafka io support via an adapting PTransform.

commit be567246ff88f9049d1aa206b7d7b556a3a8d565
Author: Sela <[email protected]>
Date:   2016-08-12T21:41:34Z

    Expose relevant fields as public, they are final anyway.

commit 9414d760fcb1953f55eb99dfce78da56c464c15f
Author: Sela <[email protected]>
Date:   2016-08-12T21:42:26Z

    Remove SparkRunner's implementation of KafkaIO.

commit 0f822ac80dfdca83cc9204e8784561d5f066b43e
Author: Sela <[email protected]>
Date:   2016-08-12T21:43:07Z

    Dependency on Beam's KafkaIO.

commit 89f52131b044e6e99c96f9bb5741050233422ebc
Author: Sela <[email protected]>
Date:   2016-08-12T21:44:23Z

    Translate reading of Kafka using Beam's KafkaIO; let the runner override it
    with its own PTransform.

commit c1965659d5b477fe33c2ab396e9067ce3c2967a7
Author: Sela <[email protected]>
Date:   2016-08-12T21:45:52Z

    Adapt tests.

----


> SparkRunner should support Beam's KafkaIO instead of providing its own.
> -----------------------------------------------------------------------
>
>                 Key: BEAM-549
>                 URL: https://issues.apache.org/jira/browse/BEAM-549
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> For portability, and in the spirit of Apache Beam, the SparkRunner should use
> the Beam implementation of KafkaIO instead of its own.
> That said, the runner will translate the KafkaIO defined in the pipeline into
> its own internal implementation, but it should still map the properties the
> user defined in the pipeline (brokers, topic, etc.) so that the IO behaves the
> same.
> Eventually, the SparkRunner will implement reading from Kafka using Spark's
> KafkaUtils.createDirectStream(), as described here:
> http://spark.apache.org/docs/1.6.2/streaming-kafka-integration.html#approach-2-direct-approach-no-receivers
>  
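
To illustrate the property mapping described in the issue, here is a minimal
sketch, not the code from the pull request. It assumes the user-defined
brokers and topics are available as plain lists of strings. The class
KafkaReadSettings and its method fromPipeline are hypothetical names
introduced only for this example. The "metadata.broker.list" key is the one
the linked Spark 1.6 direct-approach documentation expects in the Kafka
parameters map.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical holder for the settings a Spark direct Kafka stream needs. */
final class KafkaReadSettings {
    /** Kafka parameters in the shape Spark's direct approach expects. */
    final Map<String, String> kafkaParams;
    /** Topics to subscribe to. */
    final Set<String> topics;

    private KafkaReadSettings(Map<String, String> kafkaParams, Set<String> topics) {
        this.kafkaParams = Collections.unmodifiableMap(kafkaParams);
        this.topics = Collections.unmodifiableSet(topics);
    }

    /**
     * Maps the brokers and topics the user defined in the pipeline
     * (assumed here to be plain strings) to Spark-side settings.
     */
    static KafkaReadSettings fromPipeline(List<String> brokers, List<String> topics) {
        if (brokers.isEmpty() || topics.isEmpty()) {
            throw new IllegalArgumentException("brokers and topics must be non-empty");
        }
        Map<String, String> params = new HashMap<>();
        // Spark 1.6's direct approach reads the broker list from this key.
        params.put("metadata.broker.list", String.join(",", brokers));
        return new KafkaReadSettings(params, new HashSet<>(topics));
    }
}
```

Under these assumptions, the resulting kafkaParams map and topics set are
the two values a runner would hand to KafkaUtils.createDirectStream(), so
the Spark-side stream reads from the same brokers and topics that the user
configured on Beam's KafkaIO.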



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
