Re: kafka 0.9 support

2019-04-02 Thread Mingmin Xu
We're still using Kafka 0.10 a lot, similar to 0.9 IMO. Supporting multiple
versions in KafkaIO is quite complex now, and it confuses users about which
versions are supported and which are not. I would prefer to support Kafka 2.0+
only in the latest version. For old versions, there are some options:
1) document the supported Kafka-Beam version combinations, like what we do for
FlinkRunner;
2) maintain separate KafkaIOs for old versions.

Option 1) would be easy to maintain, and I assume there should be no issue
using Beam core 3.0 together with KafkaIO 2.0.

Any thoughts?

Mingmin

On Tue, Apr 2, 2019 at 9:56 AM Reuven Lax  wrote:

> KafkaIO is marked as Experimental, and the comment already warns that 0.9
> support might be removed. I think that if users still rely on Kafka 0.9 we
> should leave a fork (renamed) of the IO in the tree for 0.9, but we can
> definitely remove 0.9 support from the main IO if we want, especially if
> it's complicating changes to that IO. If we do, though, we should fail with
> a clear error message telling users to use the Kafka 0.9 IO.
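
A sketch of the fail-fast guard described above. The class and method below
are hypothetical, not actual Beam APIs; the one real call is
AppInfoParser.getVersion(), which reports the kafka-clients version found on
the classpath (the same class that prints "Kafka version : ..." in the
consumer logs).

    import org.apache.kafka.common.utils.AppInfoParser;

    // Hypothetical helper: invoke once at pipeline-construction time.
    final class KafkaClientVersionGuard {
      static void checkSupportedClientVersion() {
        String version = AppInfoParser.getVersion(); // e.g. "0.9.0.1" or "2.2.0"
        if (version.startsWith("0.9.") || version.startsWith("0.10.0")) {
          throw new IllegalStateException(
              "kafka-clients " + version + " is no longer supported by KafkaIO; "
                  + "upgrade the client jar or use the legacy Kafka 0.9 IO instead.");
        }
      }
    }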
>
> On Tue, Apr 2, 2019 at 9:34 AM Alexey Romanenko 
> wrote:
>
>> > How are multiple versions of Kafka supported? Are they all in one
>> > client, or is there a case for forks like ElasticSearchIO?
>>
>> They are supported in one client, but we have an additional “ConsumerSpEL”
>> adapter which unifies interface differences among Kafka client versions
>> (mostly to support the old ones, 0.9-0.10.0).
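
For a sense of what such an adapter does, here is an illustrative
reflection-based sketch. It is not Beam's actual ConsumerSpEL (which
evaluates Spring EL expressions), but it targets a real version difference:
kafka-clients 0.10.1 changed Consumer.assign() to take a Collection where
0.9.x - 0.10.0 took a List, and probing once for the available signature
lets a single binary drive either client.

    import java.lang.reflect.Method;
    import java.util.Collection;
    import java.util.List;
    import org.apache.kafka.clients.consumer.Consumer;
    import org.apache.kafka.common.TopicPartition;

    // Illustrative sketch only, not Beam's ConsumerSpEL implementation.
    public class ConsumerAdapter {
      private final Method assignMethod;

      public ConsumerAdapter() {
        Method m;
        try {
          m = Consumer.class.getMethod("assign", Collection.class); // 0.10.1+
        } catch (NoSuchMethodException e) {
          try {
            m = Consumer.class.getMethod("assign", List.class); // 0.9.x - 0.10.0
          } catch (NoSuchMethodException e2) {
            throw new IllegalStateException("Unsupported kafka-clients version", e2);
          }
        }
        assignMethod = m;
      }

      // A List satisfies both signatures, so one call site serves both versions.
      public void assign(Consumer<?, ?> consumer, List<TopicPartition> partitions) {
        try {
          assignMethod.invoke(consumer, partitions);
        } catch (ReflectiveOperationException e) {
          throw new RuntimeException("assign() failed", e);
        }
      }
    }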
>>
>> On the other hand, we warn users in the Javadoc of KafkaIO (which is
>> Unstable, btw) with the following:
>> “KafkaIO relies on kafka-clients for all its interactions with the Kafka
>> cluster. kafka-clients versions 0.10.1 and newer are supported at runtime.
>> The older versions 0.9.x - 0.10.0.0 are also supported, but are deprecated
>> and likely to be removed in the near future.”
>>
>> Personally, I’d prefer to have only one unified client interface, but
>> since people still use Beam with old Kafka instances, we likely should
>> stick with this till Beam 3.0.
>>
>> WDYT?
>>
>> On 2 Apr 2019, at 02:27, Austin Bennett 
>> wrote:
>>
>> FWIW --
>>
>> On my (desired, not explicitly job-function) roadmap is to tap into a
>> bunch of our corporate Kafka queues to ingest that data to places I can
>> use.  Those are 'stuck' on 0.9, with no upgrade in sight (I am told the
>> upgrade path isn't trivial, these are very critical flows, and they are
>> scared of breaking them, so it all just sits behind firewalls, etc.).  But
>> I wouldn't begin that for probably at least another quarter.
>>
>> I don't contribute to nor understand the burden of maintaining support
>> for the older version, so I can't reasonably lobby for that continued
>> pain.
>>
>> Anecdotally, this could be where many enterprises are (though I also
>> wonder whether many of the people who would be 'stuck' on such versions
>> would have Beam on their current radar).
>>
>>
>> On Mon, Apr 1, 2019 at 2:29 PM Kenneth Knowles  wrote:
>>
>>> This could be a backward-incompatible change, though that notion has
>>> many interpretations. What matters is user pain. Technically if we don't
>>> break the core SDK, users should be able to use Java SDK >=2.11.0 with
>>> KafkaIO 2.11.0 forever.
>>>
>>> How are multiple versions of Kafka supported? Are they all in one
>>> client, or is there a case for forks like ElasticSearchIO?
>>>
>>> Kenn
>>>
>>> On Mon, Apr 1, 2019 at 10:37 AM Jean-Baptiste Onofré 
>>> wrote:
>>>
 +1 to remove 0.9 support.

 I think it's more interesting to test and verify Kafka 2.2.0 than 0.9 ;)

 Regards
 JB

 On 01/04/2019 19:36, David Morávek wrote:
 > Hello,
 >
 > is there still a reason to keep Kafka 0.9 support? It unfortunately
 > adds a lot of complexity to the KafkaIO implementation.
 >
 > Kafka 0.9 was released in Nov 2015.
 >
 > My first shot at removing Kafka 0.9 support would remove the second
 > consumer, which is used for fetching offsets.
 >
 > WDYT? Is this support worth keeping?
 >
 > https://github.com/apache/beam/pull/8186
 >
 > D.

 --
 Jean-Baptiste Onofré
 jbono...@apache.org
 http://blog.nanthrax.net
 Talend - http://www.talend.com

>>>
>>

-- 

Mingmin


Re: Request for invitation to Slack team

2018-01-19 Thread Mingmin Xu
Just sent, welcome!

On Fri, Jan 19, 2018 at 10:29 AM, Andrew Nguonly 
wrote:

> Request for invitation to Slack team
>



-- 

Mingmin


Re: KafkaIO reading from latest offset when pipeline fails on FlinkRunner

2018-01-10 Thread Mingmin Xu
@Sushil, I have several jobs running on KafkaIO+FlinkRunner; hope my
experience can help you a bit.

In short, `ENABLE_AUTO_COMMIT_CONFIG` doesn't meet your requirement; you need
to leverage exactly-once checkpoints/savepoints in Flink. The reason is that
with `ENABLE_AUTO_COMMIT_CONFIG` KafkaIO commits offsets after data is read,
and once the job is restarted KafkaIO reads from the last committed offset.

In my jobs, I enable externalized checkpoints (externalized should be
optional, I think?) in exactly-once mode on the Flink cluster. When the job
auto-restarts on failures, it doesn't lose data. When I redeploy the job
manually, I use a savepoint to cancel and relaunch it.
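
A minimal sketch of the Beam side of that setup. FlinkPipelineOptions and
setCheckpointingInterval exist in the Beam Flink runner, but verify the exact
option names against your Beam version; externalized-checkpoint retention and
the state backend are configured on the Flink cluster itself.

    import org.apache.beam.runners.flink.FlinkPipelineOptions;
    import org.apache.beam.runners.flink.FlinkRunner;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class CheckpointedKafkaPipeline {
      public static void main(String[] args) {
        FlinkPipelineOptions options =
            PipelineOptionsFactory.fromArgs(args).as(FlinkPipelineOptions.class);
        options.setRunner(FlinkRunner.class);
        // Snapshot every 60s: KafkaIO keeps its offsets in checkpointed state,
        // so an auto-restart resumes from the last snapshot instead of 'latest'.
        options.setCheckpointingInterval(60_000L);

        Pipeline p = Pipeline.create(options);
        // ... KafkaIO.read() source, transforms, and sinks go here ...
        p.run();
      }
    }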

Mingmin

On Wed, Jan 10, 2018 at 10:34 AM, Raghu Angadi  wrote:

> How often does your pipeline checkpoint/snapshot? If the failure happens
> before the first checkpoint, the pipeline could restart without any state,
> in which case KafkaIO would read from the latest offset. There is probably
> some way to verify whether the pipeline is restarting from a checkpoint.
>
> On Sun, Jan 7, 2018 at 10:57 PM, Sushil Ks  wrote:
>
>> Hi Aljoscha,
>> The issue is: let's say I consumed 100 elements in a 5-minute fixed
>> window with *GroupByKey* and later applied a *ParDo* to all those
>> elements. If there is an issue while processing element 70 in the *ParDo*
>> and the pipeline restarts with a *UserCodeException*, it skips the
>> remaining 30 elements. I wanted to know if this is expected? If you still
>> have doubts, let me know and I will share a code snippet.
>>
>> Regards,
>> Sushil Ks
>>
>
>


-- 

Mingmin


Re: Beam Slack Channel Invitation Request

2017-11-04 Thread Mingmin Xu
sent, welcome!

On Sat, Nov 4, 2017 at 4:42 PM, Tristan Shephard  wrote:

> Hello,
>
> Can someone please add me to the Beam slack channel?
>
> Thanks in advance,
> Tristan
>



-- 

Mingmin


Re: Request to add to Beam Slack Channel

2017-11-04 Thread Mingmin Xu
sent, welcome to Beam.

On Sat, Nov 4, 2017 at 7:44 PM, Ananth G  wrote:

> Hello,
>
> Could someone please add me to the Beam slack channel?
>
> Regards,
> Ananth
>



-- 

Mingmin


Re: Slack channel

2017-08-12 Thread Mingmin Xu
sent, welcome Andy!

On Sat, Aug 12, 2017 at 10:59 PM, Andy Barron  wrote:

> Hi,
>
> I'd like to join the Slack channel for Apache Beam. I work at Maestro with
> Steve (CC'd), who was recently added (st...@maestro.io). My email is
> a...@maestro.io.
>
> Thanks!
> Andy
>



-- 

Mingmin


Re: slack invite

2017-08-07 Thread Mingmin Xu
sent, welcome!

On Mon, Aug 7, 2017 at 12:44 PM,  wrote:

> hi there,
>
> could i get an invite to the apache beam slack channel please?
>
> thanks!
>
> - Reece
>
>
>
> Get Outlook for iOS 
>



-- 

Mingmin


Re: Slack invite

2017-08-07 Thread Mingmin Xu
sent, welcome!

On Mon, Aug 7, 2017 at 2:20 PM, Akagi Norio 
wrote:

> Could I get a slack invitation?
> Thank you!
>



-- 

Mingmin


Re: Slack invite

2017-07-26 Thread Mingmin Xu
done

On Wed, Jul 26, 2017 at 10:52 AM, Punit Naik  wrote:

> Could I get one slack invite too, please?
>
>
> On Jul 26, 2017 11:20 PM, "Mingmin Xu"  wrote:
>
> sent, welcome @Nathan.
>
> On Wed, Jul 26, 2017 at 10:47 AM, Nathan Deren <
> nathan.de...@zonarsystems.com> wrote:
>
>> Hi,
>>
>> Could I get a slack invite, please?
>>
>> Thanks very much!
>> —Nathan Deren
>>
>
>
>
> --
> 
> Mingmin
>
>
>


-- 

Mingmin


Re: Slack invite

2017-07-26 Thread Mingmin Xu
sent, welcome @Nathan.

On Wed, Jul 26, 2017 at 10:47 AM, Nathan Deren <
nathan.de...@zonarsystems.com> wrote:

> Hi,
>
> Could I get a slack invite, please?
>
> Thanks very much!
> —Nathan Deren
>



-- 

Mingmin


Re: Beam Slack channel

2017-06-27 Thread Mingmin Xu
sent, welcome @Arun @Cao Manh Dat

On Mon, Jun 26, 2017 at 11:52 PM, Đạt Cao Mạnh 
wrote:

> Hi JB,
>
> Can you please send me an invite to the slack channel as well?
>
> Thanks,
> Cao Manh Dat
>
> On Tue, Jun 27, 2017 at 1:47 PM Arun Mahadevan  wrote:
>
>> Hi JB,
>>
>> Can you please send me an invite to the slack channel?
>>
>> Thanks,
>> Arun
>>
>>
>>
>>
>> On 6/27/17, 12:32 AM, "Jean-Baptiste Onofré"  wrote:
>>
>> >Done.
>> >
>> >Welcome !
>> >
>> >Regards
>> >JB
>> >
>> >On 06/26/2017 07:15 PM, Vinay Patil wrote:
>> >> Hi,
>> >>
>> >> I am interested in joining the slack channel. Can you please send me
>> >> an invite?
>> >>
>> >>
>> >> Regards,
>> >> Vinay Patil
>> >>
>> >> On Mon, Jun 26, 2017 at 1:42 PM, Jean-Baptiste Onofré wrote:
>> >>
>> >> Hi Paul,
>> >>
>> >> you should have received an invite.
>> >>
>> >> Welcome !
>> >>
>> >> Regards
>> >> JB
>> >>
>> >> On 06/26/2017 10:03 AM, Paul Findlay wrote:
>> >>
>> >> Hi there,
>> >>
>> >> I would be interested in joining the slack channel as well.
>> >>
>> >> Kind regards,
>> >>
>> >> Paul Findlay
>> >>
>> >> On Mon, Jun 26, 2017 at 1:58 AM, Aleksandr <
>> >> aleksandr...@gmail.com> wrote:
>> >>
>> >>  Thank you!
>> >>
>> >>
>> >>  On 25 June 2017 at 4:55 PM, "Aviem Zur" <aviem...@gmail.com> wrote:
>> >>
>> >>  Done
>> >>
>> >>  On Sun, Jun 25, 2017 at 4:51 PM, Aleksandr
>> >> <aleksandr...@gmail.com> wrote:
>> >>
>> >>  Hello,
>> >>  Can someone  please add me to the slack channel?
>> >>
>> >>  Best regards
>> >>  Aleksandr Gortujev.
>> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Jean-Baptiste Onofré
>> >> jbono...@apache.org 
>> >> http://blog.nanthrax.net
>> >> Talend - http://www.talend.com
>> >>
>> >>
>> >
>> >--
>> >Jean-Baptiste Onofré
>> >jbono...@apache.org
>> >http://blog.nanthrax.net
>> >Talend - http://www.talend.com
>> >
>>
>>


-- 

Mingmin


Re: SQL in Stream Computing: MERGE or INSERT?

2017-06-22 Thread Mingmin Xu
I'd like to share my thoughts from another perspective. IMO this is a typical
scenario for column-based databases like HBase/Cassandra. You may need to
choose the right database if possible.

UPSERT is another alternative, but I wouldn't suggest a customized
check-insert/check-update implementation. The actual work should
be done on the database side.
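
To make the first point concrete, here is a minimal HBase sketch (hbase-client
1.x API; the table and column-family names are made up). An HBase Put is
effectively an upsert per cell, so ten independent writers can each fill one
column of the same rowkey without any exists-check.

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ColumnUpsert {
      public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection();
            Table table = conn.getTable(TableName.valueOf("result"))) {
          // Writer #1 sets col1; another writer can later set col2 on the
          // same rowkey without checking whether the row already exists.
          Put put = new Put(Bytes.toBytes("rowkey-42"));
          put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("col1"), Bytes.toBytes(100));
          table.put(put); // creates the row if absent, updates the cell if present
        }
      }
    }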

On Thu, Jun 22, 2017 at 6:59 PM, James  wrote:

> Hi Tyler,
>
> I think UPSERT is a good alternative: as concise as INSERT, and it has the
> right semantics. It's just that users seem to rarely use UPSERT either
> (maybe because there's no UPDATE in batch big data processing).
>
> By *"INSERT will behave differently in batch & stream processing"* I
> mean: if we use the "INSERT" solution I described above, there will be ten
> INSERTs:
>
> *INSERT INTO result(rowkey, col1) values(...)*
>
> *INSERT INTO result(rowkey, col2) values(...)*
>
> *...*
> *INSERT INTO result(rowkey, col10) values(...)*
>
> Although we issued ten INSERTs, there will be only ONE new record in the
> target table, because 9 of the INSERTs are actually UPDATing the record.
> So in stream computing *INSERT = (INSERT or UPDATE)*, while in batch,
> *INSERT is just INSERT*.
>
> I think the essence of this problem is: there is no UPDATE in batch, but
> streaming requires UPDATE.
>
>
> On Thu, Jun 22, 2017 at 11:35 PM, Tyler Akidau wrote:
>
>> Calcite appears to have UPSERT support; can we just use that instead?
>>
>> Also, I don't understand your statement that "INSERT will behave
>> differently in batch & stream processing". Can you explain further?
>>
>>
>> -Tyler
>>
>>
>> On Thu, Jun 22, 2017 at 7:35 AM Jesse Anderson 
>> wrote:
>>
>>> If I'm understanding correctly, Hive does that with an INSERT INTO
>>> followed by a SELECT statement that does the aggregation.
>>> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries
>>>
>>> On Thu, Jun 22, 2017 at 1:32 AM James  wrote:
>>>
 Hi team,

 I am thinking about a problem related to SQL and stream computing, and I
 want to hear your opinions.

 In stream computing, there is a typical case like this:

 *We want to calculate a big, wide result table, which has one rowkey and
 ten value columns:*

 create table result (
   rowkey varchar(127) PRIMARY KEY,
   col1 int,
   col2 int,
   ...
   col10 int
 );

 Each of the value columns is calculated by a complex query, so there will
 be ten SQL statements to populate this table. Each one must:

 * First check whether there is a row for the specified `rowkey`.
 * If yes, then `update`; otherwise `insert`.

 There is actually a dedicated SQL syntax called `MERGE` designed for
 this (SQL:2008); a sample usage is:

 MERGE INTO result D
   USING (SELECT rowkey, col1 FROM input WHERE flag = 80) S
   ON (D.rowkey = S.rowkey)
   WHEN MATCHED THEN UPDATE SET D.col1 = S.col1
   WHEN NOT MATCHED THEN INSERT (D.rowkey, D.col1) VALUES (S.rowkey, S.col1)


 *The semantics fit perfectly, but it is very verbose, and normal users
 rarely use this syntax.*

 So my colleagues invented a new syntax for this scenario (or, more
 precisely, a new interpretation of the INSERT statement). For the above
 scenario, the user always writes `insert` statements:

 insert into result(rowkey, col1) values(...)
 insert into result(rowkey, col2) values(...)

 The SQL interpreter does a trick behind the scenes: if the `rowkey`
 exists, it updates; otherwise it inserts. This solution is very concise,
 but it violates the semantics of `insert`: with this solution, INSERT
 behaves differently in batch & stream processing.

 What do you guys think? Which do you prefer? What's your reasoning?

 Looking forward to your opinions, thanks in advance.

>>> --
>>> Thanks,
>>>
>>> Jesse
>>>
>>


-- 

Mingmin


Re: Slack invite

2017-05-05 Thread Mingmin Xu
sent

On Fri, May 5, 2017 at 10:06 AM, Michael Luckey  wrote:

> Would you please invite me to the slack group also?
>
> Cheers,
>
> michel
>



-- 

Mingmin


Re: Slack channel invite pls

2017-05-04 Thread Mingmin Xu
sent

On Thu, May 4, 2017 at 1:47 PM, Parker Coleman  wrote:

> Can I also get a slack invite?  adinsx...@gmail.com
>
> On Thu, May 4, 2017 at 9:36 AM, Lukasz Cwik  wrote:
>
>> Sent
>>
>> On Thu, May 4, 2017 at 9:32 AM, Seshadri Raghunathan 
>> wrote:
>>
>>> Hi ,
>>>
>>> Please add me to the Slack channel in the next possible cycle.
>>>
>>> sesh...@gmail.com
>>>
>>> Thanks,
>>> Seshadri
>>>
>>
>>
>


-- 

Mingmin


Re: KafkaIO nothing received?

2017-05-04 Thread Mingmin Xu
@Conrad,

Your code should be good to go; I can run it in my local env. There are two
points you may want to check:
1) does the topic actually have data? You can confirm with the Kafka CLI
'*bin/kafka-console-consumer.sh*';
2) is the port in bootstrapServers right? By default it's 9092.
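
To take the Beam pipeline out of the equation entirely, a bare kafka-clients
consumer works as a quick probe. In this sketch the broker address is an
assumption (substitute yours); the topic name is the one from this thread.

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class TopicProbe {
      public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // default port; use your brokers
        props.put("group.id", "topic-probe");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
          consumer.subscribe(Collections.singletonList("test-http-logs-json"));
          // poll(long) is the pre-2.0 signature; the first poll may return empty
          // while the group rebalances, so rerun or loop if count() is 0.
          ConsumerRecords<String, String> records = consumer.poll(5000);
          System.out.println("fetched " + records.count() + " records");
          for (ConsumerRecord<String, String> r : records) {
            System.out.println(r.offset() + ": " + r.value());
          }
        }
      }
    }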



On Thu, May 4, 2017 at 9:05 AM, Conrad Crampton  wrote:

> Hi,
>
> New to the group – ‘hello’!
>
>
>
> Just starting to look into Beam and I very much like the concepts, but I
> have rather fallen at the first hurdle – that being trying to subscribe to
> a Kafka topic and process the results.
>
> Very simply, the following code doesn't receive any records (the data
> is going into the queue) – I just get nothing.
>
> I have tried both the direct-runner and flink-runner (using the Quickstart
> as a base for options, mvn profile, etc.)
>
>
>
> Code:
>
> Pipeline p = Pipeline.create(options);
> List<String> topics = ImmutableList.of("test-http-logs-json");
>
> PCollection<String> logs = p.apply(KafkaIO.<String, String>read()
>     .withBootstrapServers(
>         "datanode2-cm1.mis-cds.local:6667,datanode3-cm1.mis-cds.local:6667,datanode6-cm1.mis-cds.local:6667")
>     .withTopics(topics)
>     .withKeyCoder(StringUtf8Coder.of())
>     .withValueCoder(StringUtf8Coder.of())
>     .withMaxNumRecords(10)
>     .updateConsumerProperties(ImmutableMap.<String, Object>builder()
>         .put("auto.offset.reset", (Object) "earliest")
>         .put("group.id", (Object) "http-logs-beam-json")
>         .put("enable.auto.commit", (Object) "true")
>         .put("receive.buffer.bytes", 1024 * 1024)
>         .build())
>     // set a Coder for Key and Value
>     .withoutMetadata())
>     .apply("Transform", MapElements.via(new SimpleFunction<KV<String, String>, String>() {
>         @Override
>         public String apply(KV<String, String> input) {
>             log.debug("{}", input.getValue());
>             return input.getKey() + " " + input.getValue();
>         }
>     }));
>
> p.run();
>
>
>
>
>
> Result:
>
> May 04, 2017 5:02:13 PM org.apache.kafka.common.config.AbstractConfig
> logAll
>
> INFO: ConsumerConfig values:
>
> metric.reporters = []
>
> metadata.max.age.ms = 30
>
> value.deserializer = class org.apache.kafka.common.serialization.
> ByteArrayDeserializer
>
> group.id = http-logs-beam-json
>
> partition.assignment.strategy = [org.apache.kafka.clients.
> consumer.RangeAssignor]
>
> reconnect.backoff.ms = 50
>
> sasl.kerberos.ticket.renew.window.factor = 0.8
>
> max.partition.fetch.bytes = 1048576
>
> bootstrap.servers = [datanode2-cm1.mis-cds.local:6667,
> datanode3-cm1.mis-cds.local:6667, datanode6-cm1.mis-cds.local:6667]
>
> retry.backoff.ms = 100
>
> sasl.kerberos.kinit.cmd = /usr/bin/kinit
>
> sasl.kerberos.service.name = null
>
> sasl.kerberos.ticket.renew.jitter = 0.05
>
> ssl.keystore.type = JKS
>
> ssl.trustmanager.algorithm = PKIX
>
> enable.auto.commit = true
>
> ssl.key.password = null
>
> fetch.max.wait.ms = 500
>
> sasl.kerberos.min.time.before.relogin = 6
>
> connections.max.idle.ms = 54
>
> ssl.truststore.password = null
>
> session.timeout.ms = 3
>
> metrics.num.samples = 2
>
> client.id =
>
> ssl.endpoint.identification.algorithm = null
>
> key.deserializer = class org.apache.kafka.common.serialization.
> ByteArrayDeserializer
>
> ssl.protocol = TLS
>
> check.crcs = true
>
> request.timeout.ms = 4
>
>ssl.provider = null
>
> ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
>
> ssl.keystore.location = null
>
> heartbeat.interval.ms = 3000
>
> auto.commit.interval.ms = 5000
>
> receive.buffer.bytes = 1048576
>
> ssl.cipher.suites = null
>
> ssl.truststore.type = JKS
>
> security.protocol = PLAINTEXT
>
> ssl.truststore.location = null
>
> ssl.keystore.password = null
>
> ssl.keymanager.algorithm = SunX509
>
> metrics.sample.window.ms = 3
>
> fetch.min.bytes = 1
>
> send.buffer.bytes = 131072
>
> auto.offset.reset = earliest
>
>
>
> May 04, 2017 5:02:13 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo
> 
>
> INFO: Kafka version : 0.9.0.1
>
> May 04, 2017 5:02:13 PM org.apache.kafka.common.utils.AppInfoParser$AppInfo
> 
>
> INFO: Kafka commitId : 23c69d62a0cabf06
>
> May 04, 2017 5:02:13 PM 
> org.apache.beam.sdk.io.kafka.KafkaIO$UnboundedKafkaSource
> generateInitialSplits
>
> INFO: Partitions assigned to split 0 (total 1): test-http-logs-json-0
>
> May 04, 2017 5:02:13 PM org.apache.kafka.common.config.AbstractConfig
> logAll
>
> INFO: ConsumerConfig values:
>
> metric.reporters = []
>
> metadata.max.age.ms = 30
>
> 

Re: Slack Channel Request

2017-04-13 Thread Mingmin Xu
both sent.

On Thu, Apr 13, 2017 at 10:00 PM, Anant Bhandarkar <
anant.bhandar...@impactanalytics.co> wrote:

> Would love to be part of beam group on slack.
> Also please add anil.b...@impactanaytics.co
> Thanks,
> Anant
>
> On 14-Apr-2017 9:15 AM, "Mingmin Xu"  wrote:
>
>> @James, @Jingsong, @Tom, invite sent.
>>
>> On Thu, Apr 13, 2017 at 8:34 PM, Tom Pollard <
>> tpoll...@flashpoint-intel.com> wrote:
>>
>>> If it's not inconvenient, I'd also like an invitation to the Slack
>>> channel.
>>>
>>> Tom
>>>
>>>
>>> On Apr 13, 2017, at 11:31 PM, JingsongLee 
>>> wrote:
>>>
>>> Please add me too.
>>>
>>> Best,
>>>
>>> JingsongLee
>>>
>>>
>>> --
>>> From:James 
>>> Time:2017 Apr 14 (Fri) 11:00
>>> To:user 
>>> Subject:Re: Slack Channel Request
>>>
>>> Could I also have an invite please?
>>>
>>> On 2017-03-28 08:28 (+0800), Davor Bonaci  wrote:
>>> > Invite sent.
>>> >
>>> > On Sat, Mar 25, 2017 at 2:48 AM, Prabeesh K. wrote:
>>> >
>>> > > Hi Jean,
>>> > >
>>> > > Thank you for your reply. I am eagerly waiting for the other
>>> > > options.
>>> > >
>>> > > Regards,
>>> > > Prabeesh K.
>>> > >
>>> > > On 25 March 2017 at 10:08, Jean-Baptiste Onofré wrote:
>>> > >
>>> > >> Unfortunately we reached the max number of people on Slack (90).
>>> > >>
>>> > >> Let me see what we can do.
>>> > >>
>>> > >> Regards
>>> > >> JB
>>> > >>
>>> > >>
>>> > >> On 03/24/2017 09:49 PM, Prabeesh K. wrote:
>>> > >>
>>> > >>> Hi,
>>> > >>>
>>> > >>> Can someone please add me to the Apache Beam slack channel?
>>> > >>>
>>> > >>> Regards,
>>> > >>>
>>> > >>> Prabeesh K.
>>> > >>>
>>> > >>>
>>> > >> --
>>> > >> Jean-Baptiste Onofré
>>> > >> jbono...@apache.org
>>> > >> http://blog.nanthrax.net
>>> > >> Talend - http://www.talend.com
>>> > >>
>>> > >
>>> > >
>>> >
>>>
>>>
>>>
>>
>>
>> --
>> 
>> Mingmin
>>
>


-- 

Mingmin


Re: Slack Channel Request

2017-04-13 Thread Mingmin Xu
@James, @Jingsong, @Tom, invite sent.

On Thu, Apr 13, 2017 at 8:34 PM, Tom Pollard 
wrote:

> If it's not inconvenient, I'd also like an invitation to the Slack channel.
>
> Tom
>
>
> On Apr 13, 2017, at 11:31 PM, JingsongLee  wrote:
>
> Please add me too.
>
> Best,
>
> JingsongLee
>
>
> --
> From:James 
> Time:2017 Apr 14 (Fri) 11:00
> To:user 
> Subject:Re: Slack Channel Request
>
> Could I also have an invite please?
>
> On 2017-03-28 08:28 (+0800), Davor Bonaci  wrote:
> > Invite sent.
> >
> > On Sat, Mar 25, 2017 at 2:48 AM, Prabeesh K. wrote:
> >
> > > Hi Jean,
> > >
> > > Thank you for your reply. I am eagerly waiting for the other options.
> > >
> > > Regards,
> > > Prabeesh K.
> > >
> > > On 25 March 2017 at 10:08, Jean-Baptiste Onofré wrote:
> > >
> > >> Unfortunately we reached the max number of people on Slack (90).
> > >>
> > >> Let me see what we can do.
> > >>
> > >> Regards
> > >> JB
> > >>
> > >>
> > >> On 03/24/2017 09:49 PM, Prabeesh K. wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> Can someone please add me to the Apache Beam slack channel?
> > >>>
> > >>> Regards,
> > >>>
> > >>> Prabeesh K.
> > >>>
> > >>>
> > >> --
> > >> Jean-Baptiste Onofré
> > >> jbono...@apache.org
> > >> http://blog.nanthrax.net
> > >> Talend - http://www.talend.com
> > >>
> > >
> > >
> >
>
>
>


-- 

Mingmin


Re: Kafka Offset handling for Restart/failure scenarios.

2017-03-21 Thread Mingmin Xu
From KafkaIO itself, it looks like it either starts from the beginning or
starts from the latest offset. It's designed to leverage
`UnboundedSource.CheckpointMark` during initialization, but so far I don't
see it provided by runners. At the moment Flink savepoints are a good
option; I've created a JIRA (BEAM-1775) to handle this in KafkaIO.
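
To make that contract concrete, here is a stripped-down, hypothetical
UnboundedSource. OffsetCheckpointMark is a made-up stand-in rather than a
Beam or KafkaIO class, but the createReader signature is the real Beam API;
a runner that supports resuming passes the last finalized mark back in.

    import java.io.IOException;
    import javax.annotation.Nullable;
    import org.apache.beam.sdk.io.UnboundedSource;
    import org.apache.beam.sdk.options.PipelineOptions;

    abstract class KafkaLikeSource
        extends UnboundedSource<String, KafkaLikeSource.OffsetCheckpointMark> {

      @Override
      public UnboundedReader<String> createReader(
          PipelineOptions options, @Nullable OffsetCheckpointMark checkpoint)
          throws IOException {
        long start = (checkpoint == null)
            ? -1L // no restored state: fall back to the earliest/latest policy
            : checkpoint.offset + 1; // resume just past the finalized offset
        return newReaderAt(start);
      }

      abstract UnboundedReader<String> newReaderAt(long offset);

      static class OffsetCheckpointMark implements UnboundedSource.CheckpointMark {
        final long offset;

        OffsetCheckpointMark(long offset) {
          this.offset = offset;
        }

        @Override
        public void finalizeCheckpoint() {} // e.g. commit offsets to Kafka here
      }
    }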

Mingmin

On Tue, Mar 21, 2017 at 3:40 AM, Aljoscha Krettek 
wrote:

> Hi,
> Are you using Flink savepoints [1] when restoring your application? If you
> use this the Kafka offset should be stored in state and it should restart
> from the correct position.
>
> Best,
> Aljoscha
>
> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/savepoints.html
> > On 21 Mar 2017, at 01:50, Jins George  wrote:
> >
> > Hello,
> >
> > I am writing a Beam pipeline (streaming) with the Flink runner to
> > consume data from Kafka, apply some transformations, and persist to
> > HBase.
> >
> > If I restart the application ( due to failure/manual restart), consumer
> does not resume from the offset where it was prior to restart. It always
> resume from the latest offset.
> >
> > If I enable Flink checkpionting with hdfs state back-end, system appears
> to be resuming from the earliest offset
> >
> > Is there a recommended way to resume from the offset where it was
> stopped ?
> >
> > Thanks,
> > Jins George
>
>


-- 

Mingmin


Re: Slack

2017-03-13 Thread Mingmin Xu
Added.

PS: resent, as it seems the previous one was blocked.

On Mon, Mar 13, 2017 at 1:07 PM, Sunil K Sahu  wrote:

> Could someone add me to slack channel as well.
>
> Thanks,
> Sunil
>
> Sunil Kumar Sahu
> CS Dept - Graduate Student
> BU - Watson School of Engineering
>
> On Mon, Mar 13, 2017 at 9:28 AM, Amit Sela  wrote:
>
>> I'm so well trained, I do it on my phone now!
>>
>> On Mon, Mar 13, 2017, 15:24 Tobias Feldhaus <
>> tobias.feldh...@localsearch.ch> wrote:
>>
>>> Same for me please :)
>>>
>>> Tobi
>>>
>>>
>>>
>>> On 13.03.17, 13:30, "Amit Sela"  wrote:
>>>
>>>
>>>
>>> Done. Welcome!
>>>
>>>
>>>
>>> On Mon, Mar 13, 2017 at 2:29 PM Alexander Gallego <
>>> gallego.al...@gmail.com> wrote:
>>>
>>> same for me please.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> .alex
>>>
>>>
>>>
>>> On Fri, Mar 10, 2017 at 3:01 PM, Amit Sela  wrote:
>>>
>>> Done
>>>
>>>
>>>
>>> On Fri, Mar 10, 2017, 21:59 Devon Meunier 
>>> wrote:
>>>
>>> Hi!
>>>
>>>
>>>
>>> Sorry for the noise but could someone invite me to the slack channel?
>>>
>>>
>>>
>>> Thanks,
>>>
>>>
>>>
>>> Devon
>>>
>>>
>>>
>>>
>


-- 

Mingmin