+1, we'll get less layered code IMO.

On Thursday, May 5, 2016, Manu Zhang <[email protected]> wrote:
> Hi Kam,
>
> What I mean is something like
>
> HDFS ~ kafka-hdfs-connector ~> Kafka ~ KafkaSource ~> Gearpump ~ KafkaSink
> ~> Kafka ~ kafka-jdbc-connector ~ MySQL
>
> Well, I think this will be easier to implement than wrapping Storm
> connectors.
>
> On Fri, May 6, 2016 at 10:09 AM Jiang Weihua <[email protected]> wrote:
>
> > From the usage I know, many cleaning applications read from Kafka and
> > write to Kafka. But other kinds of apps don't follow this pattern.
> >
> > On 16/5/6 at 9:37 AM, "Kam Kasravi" <[email protected]> wrote:
> >
> > >Other benefits? Is there a performance cost? Would we co-locate both our
> > >source and KafkaSource in the same JVM?
> > >
> > >On Thursday, May 5, 2016, Jiang Weihua <[email protected]> wrote:
> > >
> > >> I would say it is a good shortcut for current usage. However, we
> > >> definitely need our own sources and sinks in the long term.
> > >>
> > >> Sent from my iPhone
> > >>
> > >> On May 6, 2016, at 06:49, Manu Zhang <[email protected]> wrote:
> > >>
> > >> Hi Kam and others,
> > >>
> > >> Do you think it makes sense to utilize kafka-connect
> > >> <http://docs.confluent.io/2.0.0/connect/connectors.html> for source/sink?
> > >> The topology would be like source ~> KafkaSource ~> DAG ~> KafkaSink ~>
> > >> sink.
> > >> One benefit is we always get at-least-once delivery provided by the
> > >> current KafkaSource.
> > >> Kafka provides HDFS and JDBC connectors out of the box and other
> > >> connectors are being contributed by the community
> > >> <https://github.com/search?p=1&q=kafka-connect&type=Repositories&utf8=%E2%9C%93>.
> > >>
> > >> On Thu, May 5, 2016 at 11:35 PM Kam Kasravi <[email protected]> wrote:
> > >>
> > >> Hi Karol
> > >>
> > >> Good feedback. I'm not sure if GEARPUMP-116 would allow easy integration
> > >> of Redis, JMS, AMQP from the Beam and akka-stream perspectives.
> > >> Huafeng, Manu?
> > >>
> > >> On Wed, May 4, 2016 at 10:34 AM, Karol Brejna <[email protected]> wrote:
> > >>
> > >> We have a series of JIRA tickets regarding Gearpump sinks/sources:
> > >>
> > >> https://issues.apache.org/jira/browse/GEARPUMP-116 - Compatibility
> > >> layer/adapter for Apache Storm
> > >> https://issues.apache.org/jira/browse/GEARPUMP-115 - Create MQTT
> > >> source/sink
> > >> https://issues.apache.org/jira/browse/GEARPUMP-106 - Gearpump Redis
> > >> Integration
> > >> https://issues.apache.org/jira/browse/GEARPUMP-105 - Provide
> > >> non-persistent Sink Task so that examples like word count can
> > >> materialize Sum results within the Client
> > >> https://issues.apache.org/jira/browse/GEARPUMP-100 - Source task that
> > >> emits messages per a schedule (interval or otherwise) should be provided
> > >> https://issues.apache.org/jira/browse/GEARPUMP-95 - Add parquet
> > >> datasource and datasink connectors
> > >> https://issues.apache.org/jira/browse/GEARPUMP-91 - Apache Cassandra
> > >> Integration
> > >>
> > >> We also had a ticket for 'Add a HDFS Sink with security'
> > >> (https://github.com/gearpump/gearpump/issues/1547). I am not sure about
> > >> the outcome of this one.
> > >>
> > >> Most of them concern the medium (MQTT, Redis, Cassandra, ...). Others
> > >> talk about the source mechanics (scheduled/repetitive source).
> > >>
> > >> I'd like to discuss the order in which we plan to implement them.
> > >>
> > >> In my opinion Redis and MQTT (GEARPUMP-106, GEARPUMP-115) seem most
> > >> important to have.
> > >> Redis is well known and widely used. MQTT is a de facto standard in IoT
> > >> communications.
> > >>
> > >> Then I would like to have an HDFS sink (if we haven't merged this
> > >> already).
> > >>
> > >> A non-persistent datasink could be very useful for examples/demo
> > >> purposes. (Imagine we have a capped collection that the application can
> > >> send messages to, a kind of application console. In the dashboard there
> > >> could be a section that presents the latest 'console' messages. This way
> > >> a user could "watch" the application progress, especially if he/she
> > >> doesn't have access to the backend, as often happens in YARN mode. But
> > >> this is a topic for a dedicated discussion, I think.)
> > >>
> > >> On the other hand, if we start working on GEARPUMP-116, we'd probably
> > >> quickly have Redis, JMS, AMQP sources (adapted from Storm).
> > >>
> > >> Please let me know what you think.
> > >>
> > >> Karol
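
As a rough illustration of the capped "application console" sink Karol describes above, here is a minimal sketch in plain Python. It is only meant to show the capped-collection behaviour (oldest messages are dropped once the buffer is full). `ConsoleSink`, `write`, `latest` and `capacity` are hypothetical names made up for this example; none of this is Gearpump API, and a real sink would still need to be wired into a task and exposed to the dashboard.

```python
from collections import deque
from typing import Deque, List


class ConsoleSink:
    """Hypothetical non-persistent sink: keeps only the newest `capacity` messages in memory."""

    def __init__(self, capacity: int = 100) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # deque(maxlen=...) discards the oldest entry when a new one is
        # appended to a full buffer, which gives the "capped" behaviour.
        self._buffer: Deque[str] = deque(maxlen=capacity)

    def write(self, message: str) -> None:
        """Called by the application for every message it wants to surface."""
        self._buffer.append(message)

    def latest(self, n: int = 10) -> List[str]:
        """Return up to `n` most recent messages, oldest first (e.g. for a dashboard panel)."""
        if n <= 0:
            return []
        return list(self._buffer)[-n:]


# Usage: a word-count style app pushing running totals into a 3-slot console.
sink = ConsoleSink(capacity=3)
for word, count in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    sink.write(f"{word}={count}")

print(sink.latest())
```

With a capacity of 3, the first message ("a=1") has already been evicted by the time the fourth one arrives, so memory use stays bounded no matter how long the application runs.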
