GitHub user peay opened a pull request:

    https://github.com/apache/beam/pull/2330

    [BEAM-1573] Use Kafka serializers instead of coders in KafkaIO

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [x] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).
    
    ---
    
    
    
    **This is a work in progress, do not merge**.
    
    Modified:
    - Use Kafka serializers and deserializers in KafkaIO
    - Added helper methods `fromAvro` and `toAvro`, to use serialization based 
on `AvroCoder`. This is uniform with other IO such as HDFS.
    - Moved `CoderBasedKafkaSerializer` out, and added 
`CoderBaseKafkaDeserializer`. These are used for `toAvro/fromAvro`, and can be 
useful to port existing code that relies on coder.
    - Added `InstantSerializer` and `InstantDeserializer`, as `Instant` is used 
in some of the tests.
     
    Writer lets Kafka handle serialization itself. Reader uses Kafka byte 
deserializers, and calls the user-provided Kafka deserializer from `advance`. 
Note that Kafka serializers and deserializers are not themselves 
`Serializable`. Hence, I've used a `Class<..>` in the `spec` both for read and 
write.
    
    There is still an issue, though. `Read` still takes **both a deserializer 
and a coder**. This is because the source must implement 
`getDefaultOutputCoder`, and I am not sure how to infer it. Having to provide 
the two is heavy, but I am not sure how to infer the coders in this context. 
Any thoughts?
    
    cc @rangadi @jkff


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/peay/beam BEAM-1573

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2330.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2330
    
----
commit 511a1301190b08b05573e3025d7ade2746d61e5f
Author: peay <[email protected]>
Date:   2017-03-26T14:51:59Z

    [BEAM-1573] Use Kafka serializers instead of coders in KafkaIO

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

Reply via email to