Re: Sink/source tickets

Kam Kasravi Thu, 05 May 2016 21:24:05 -0700

+1 we'll get less layered code IMO

On Thursday, May 5, 2016, Manu Zhang <[email protected]> wrote:


> Hi Kam,
>
> What I mean is like
>
> HDFS ~ kafka-hdfs-connector ~> Kafka ~ KafkaSource ~> Gearpump ~ KafkaSink
> ~> Kafka ~ kafka-jdbc-connector ~> MySQL
>
> Well, I think this will be easier to implement than wrapping Storm
> connectors.
>
> On Fri, May 6, 2016 at 10:09 AM Jiang Weihua <[email protected]> wrote:
>
> > From the usage I know of, many cleaning applications read from Kafka and
> > write to Kafka. But other kinds of apps don’t follow this pattern.
> >
> >
> >
> >
> > On May 6, 2016, at 9:37 AM, "Kam Kasravi" <[email protected]> wrote:
> >
> > >Other benefits? Is there a performance cost? Would we co-locate both our
> > >source and KafkaSource in the same JVM?
> > >
> > >On Thursday, May 5, 2016, Jiang Weihua <[email protected]> wrote:
> > >
> > >> I would say it is a good shortcut for current usage. However, we
> > >> definitely need our own sources and sinks in the long term.
> > >>
> > >> Sent from my iPhone
> > >>
> > >> On May 6, 2016, at 06:49, Manu Zhang <[email protected]> wrote:
> > >>
> > >> Hi Kam and others,
> > >>
> > >> Do you think it makes sense to utilize kafka-connect
> > >> <http://docs.confluent.io/2.0.0/connect/connectors.html> for source/sink?
> > >> The topology would be like source ~> KafkaSource ~> DAG ~> KafkaSink ~>
> > >> sink.
> > >> One benefit is that we always get the at-least-once delivery provided by
> > >> the current KafkaSource.
> > >> Kafka provides HDFS and JDBC connectors out of the box, and other
> > >> connectors are being contributed by the community
> > >> <https://github.com/search?p=1&q=kafka-connect&type=Repositories&utf8=%E2%9C%93>.
> > >>
> > >> On Thu, May 5, 2016 at 11:35 PM Kam Kasravi <[email protected]> wrote:
> > >>
> > >> Hi Karol
> > >>
> > >> Good feedback. I'm not sure if GEARPUMP-116 would allow easy integration
> > >> of Redis, JMS, AMQP from the beam and akka-stream perspectives.
> > >> Huafeng, Manu?
> > >>
> > >>
> > >> On Wed, May 4, 2016 at 10:34 AM, Karol Brejna <[email protected]> wrote:
> > >>
> > >> We have a series of jira tickets regarding Gearpump sinks/sources:
> > >>
> > >> https://issues.apache.org/jira/browse/GEARPUMP-116 - Compatibility layer/adapter for Apache Storm
> > >> https://issues.apache.org/jira/browse/GEARPUMP-115 - Create MQTT source/sink
> > >> https://issues.apache.org/jira/browse/GEARPUMP-106 - Gearpump Redis Integration
> > >> https://issues.apache.org/jira/browse/GEARPUMP-105 - Provide non-persistent Sink Task so that examples like word count can materialize Sum results within the Client
> > >> https://issues.apache.org/jira/browse/GEARPUMP-100 - Source task that emits messages per a schedule (interval or otherwise) should be provided
> > >> https://issues.apache.org/jira/browse/GEARPUMP-95 - Add parquet datasource and datasink connectors
> > >> https://issues.apache.org/jira/browse/GEARPUMP-91 - Apache Cassandra Integration
> > >>
> > >> We also had a ticket for 'Add a HDFS Sink with security'
> > >> (https://github.com/gearpump/gearpump/issues/1547) - I am not sure about
> > >> the outcome of this one.
> > >>
> > >> Most of them concern the medium (MQTT, Redis, Cassandra, ...). Others
> > >> are about the source mechanics (scheduled/repetitive source).
> > >>
> > >> I'd like to discuss the order in which we plan to implement them.
> > >>
> > >> In my opinion, Redis and MQTT (GEARPUMP-106, GEARPUMP-115) seem the most
> > >> important to have.
> > >> Redis is well known and widely used. MQTT is a de facto standard in IoT
> > >> communications.
> > >>
> > >> Then I would like to have an HDFS sink (if we haven't merged this already).
> > >>
> > >> A non-persistent datasink could be very useful for examples/demo purposes.
> > >> (Imagine we have a capped collection that the application can send
> > >> messages to, a kind of application console. In the dashboard there could
> > >> be a section that presents the latest 'console' messages. This way a user
> > >> could "watch" the application's progress, especially if he/she doesn't
> > >> have access to the backend, as often happens in YARN mode. But this is a
> > >> topic for a dedicated discussion, I think.)
> > >>
> > >> On the other hand, if we start working on GEARPUMP-116, we'd probably
> > >> quickly have Redis, JMS, AMQP sources (adapted from Storm).
> > >>
> > >> Please let me know what you think.
> > >>
> > >> Karol
> > >>
> > >>
> > >>
> >
> >
>
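
For reference, the capped "console" collection Karol describes in the quoted message could be sketched roughly as below. This is only an illustration in Python, chosen for brevity; it is not Gearpump code, and the names ConsoleSink, write and latest are hypothetical. It assumes a single in-memory buffer that silently drops the oldest message once the capacity is reached.

```python
from collections import deque


class ConsoleSink:
    """Hypothetical non-persistent sink that keeps only the newest messages."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # A deque with maxlen discards the oldest entry when a new one
        # is appended to a full buffer, which gives the "capped" behaviour.
        self._messages = deque(maxlen=capacity)

    def write(self, message):
        """Record one message from the application."""
        self._messages.append(message)

    def latest(self, n=None):
        """Return up to n of the most recent messages, oldest first."""
        items = list(self._messages)
        if n is None:
            return items
        if n <= 0:
            return []
        return items[-n:]


if __name__ == "__main__":
    sink = ConsoleSink(capacity=3)
    for i in range(5):
        sink.write(f"processed batch {i}")
    # Only the three newest messages are retained for the dashboard to show.
    print(sink.latest())
```

A dashboard section could poll something like latest() to show recent messages without any persistent storage behind it.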
