[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2019-04-24 Thread Ottomata
Ottomata added a comment. Can we close this task? TASK DETAIL https://phabricator.wikimedia.org/T161731 EMAIL PREFERENCES https://phabricator.wikimedia.org/settings/panel/emailpreferences/ To: Ottomata Cc: gerritbot, JAllemandou, Pchelolo, Ladsgroup, Nuria, Anomie, Aklapper, Smalyshev,

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2018-06-25 Thread Ottomata
Ottomata added a comment. OO yes @Smalyshev and in case you didn't see, we also increased retention of mediawiki topics to 31 days in the main kafka clusters.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2018-06-25 Thread Smalyshev
Smalyshev added a comment. @Nuria I don't see any immediate blockers so far.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2018-06-25 Thread Nuria
Nuria added a comment. Ping @Smalyshev now that you have a reliable stream on the new kafka cluster (that supports time-based consumption) are there any other blockers on your end?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-13 Thread Smalyshev
Smalyshev added a comment. yes, definitely

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-13 Thread Nuria
Nuria added a comment. @Smalyshev Please, 45 minutes with me and @Ottomata would do?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-13 Thread Smalyshev
Smalyshev added a comment. @Nuria yes mostly, though I do have some questions, maybe we should set up a short meeting to discuss them?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-13 Thread Nuria
Nuria added a comment. @Smalyshev Ok, we aim to have the cluster handling all prod traffic by end of next quarter, until then it will be mirroring data, which I think should be sufficient for you to get started on the wdqs consumer? Correct me if I am wrong.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-12 Thread Smalyshev
Smalyshev added a comment. @Nuria yes, consuming the data works.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-08 Thread Ottomata
Ottomata added a comment. So, FYI, the timestamps as they are now are the timestamp that the kafka jumbo-eqiad cluster received the messages. These are replicated from the main-eqiad cluster, and might have a short (seconds usually, minutes max) delay. Eventually (work not planned yet) we will

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-08 Thread Nuria
Nuria added a comment. Nice, Can @Smalyshev check whether consuming from these topics as set would work for his purposes?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-08 Thread Ottomata
Ottomata added a comment. Woot, that did it ^. We need topics to default to LogAppendTime.
[@stat1005:/home/otto] $ ./kafkacat -Q -b kafka-jumbo1001.eqiad.wmnet:9092 -t eqiad.mediawiki.revision-create:0:151275919
eqiad.mediawiki.revision-create [0] offset 3658631

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-08 Thread gerritbot
gerritbot added a comment. Change 396439 merged by Ottomata: [operations/puppet@production] Set default topic timestamp.type to LogAppendTime https://gerrit.wikimedia.org/r/396439

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-08 Thread gerritbot
gerritbot added a comment. Change 396439 had a related patch set uploaded (by Ottomata; owner: Ottomata): [operations/puppet@production] Set default topic timestamp.type to LogAppendTime https://gerrit.wikimedia.org/r/396439

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-08 Thread Ottomata
Ottomata added a comment. Seems like the mirroring is done by 0.9 MirrorMaker and timestamp handling was added only in 0.10 MirrorMaker. Hm, ya but I had thought that if a timestamp was not set by the producer, it would be set to server receive time. Maybe I was wrong!

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-07 Thread Nuria
Nuria added a comment. I got the same doing: /home/otto/kafkacat -Q -b kafka-jumbo1003.eqiad.wmnet -t eqiad.mediawiki.revision-create:0:1512687299 -Xdebug=all

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-07 Thread Pchelolo
Pchelolo added a comment. Hm, actually if I just try to consume from that topic (any topic actually) with -F "%T", which should give me message timestamps, it gives -1 as well. I suppose that the problem is that we're actually producing these messages into Kafka 0.9 and perhaps not specifying the

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-07 Thread Smalyshev
Smalyshev added a comment. /home/otto/kafkacat runs fine but -Q seems to return this for everything: eqiad.mediawiki.revision-create [0] offset -1 Maybe I'm doing something wrong?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-06 Thread Ottomata
Ottomata added a comment. You can easily 'quickbuild' kafkacat with a statically linked librdkafka. I've just done this on a stretch labs host, and copied the kafkacat binary to stat1005 at /home/otto/kafkacat. Try it out!

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-05 Thread Smalyshev
Smalyshev added a comment. @Pchelolo thanks for the pointer, this is very helpful! Indeed, kafkacat for example supports it since a year ago. However, looks like we have this version of Kafka: Copyright (c) 2014-2015, Magnus Edenhill Version KAFKACAT_VERSION (JSON) (librdkafka 0.9.3) which

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-05 Thread Pchelolo
Pchelolo added a comment. In T161731#3814596, @Smalyshev wrote: @Ottomata thanks, I can connect to the hosts above, but still not sure how to control the starting point. I'll try to look around for clients that can do this. Java client has offsetsForTimes implemented and supports seek to an
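
A minimal sketch of the time-to-offset lookup discussed in this comment, written as a toy Python model rather than real Kafka client code. It assumes the lookup returns the earliest offset whose timestamp is at or after the requested time, and that timestamps never decrease along the partition (as with broker-assigned append times). The function name, the in-memory log, and the sample offsets and timestamps are all hypothetical.

```python
from bisect import bisect_left


def offset_for_time(log, target_ms):
    """Return the earliest offset whose timestamp is >= target_ms, or -1 if none.

    `log` is a list of (offset, timestamp_ms) pairs in offset order.
    Assumption: timestamps are non-decreasing along the log.
    """
    timestamps = [ts for _, ts in log]
    i = bisect_left(timestamps, target_ms)
    return log[i][0] if i < len(log) else -1


# Hypothetical partition contents: (offset, timestamp in ms).
log = [(100, 1000), (101, 2000), (102, 3000)]
print(offset_for_time(log, 1500))  # 101: first record at or after t=1500
print(offset_for_time(log, 3001))  # -1: nothing that recent

# Records stored without a timestamp are modelled here with the placeholder -1.
no_timestamps = [(0, -1), (1, -1)]
print(offset_for_time(no_timestamps, 1500))  # -1
```

In this model, records whose timestamp is the placeholder -1 can never satisfy a positive target, so the lookup returns -1. That may be related to the `offset -1` results reported elsewhere in this thread, but the connection is an assumption, not something the thread confirms.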

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-05 Thread Smalyshev
Smalyshev added a comment. @Ottomata thanks, I can connect to the hosts above, but still not sure how to control the starting point. I'll try to look around for clients that can do this.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-04 Thread Ottomata
Ottomata added a comment. Sure, I suppose! You can connect to it with a Kafka client now. The Kafka brokers are kafka-jumbo100[1-6].eqiad.wmnet:9092 I think you are most interested in the eqiad.mediawiki.revision-create topic. I haven't tried yet at all, but these topics should have a broker

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-04 Thread Nuria
Nuria added a comment. @Ottomata Could @Smalyshev do a test on consuming from the new cluster though, with the understanding it is not yet productionized, to make sure it fits the use cases?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-04 Thread Ottomata
Ottomata added a comment. Unfortunately not yet! We are very close...the cluster is up and running, but porting clients has been blocked on getting proper keys and certificates for SSL support for a long time now. SSL is finally moving now, so we should be able to start porting clients over

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-12-02 Thread Smalyshev
Smalyshev added a comment. @Ottomata, @Nuria what's the status on seekable Kafka streaming - do we have necessary infrastructure now?

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-06-01 Thread Smalyshev
Smalyshev added a comment. As a result of the discussion, we've arrived at the following conclusion: once we have a Kafka version installed that allows starting by timestamp, we can create a prototype that takes recent changes from either Kafka or EventStreams. We need to evaluate if unfiltered

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-06-01 Thread Nuria
Nuria added a comment. From meeting: @Smalyshev can consume from either kafka or event stream once we add the ability to consume from a given point in time; this is what is meant by "seekable" (on new kafka cluster, next quarter, Q1). Keeping data for longer than 7 days is not an issue for

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-05-29 Thread Smalyshev
Smalyshev added a comment. @Nuria yes, still very much needed and unsolved. Please feel welcome to set up a meet.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-05-29 Thread Nuria
Nuria added a comment. ping @Smalyshev is this still a need? Maybe we should set up a short 30 minute sync up

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-04-06 Thread Smalyshev
Smalyshev added a comment. FYI, neither base Kafka Consumer clients nor EventStreams does this. Yes, I know :) It's one of the decisions I still haven't figured out - how much I can/should do on the backend so I don't have to do it on the client, vs sending the client the raw firehose output and

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-04-06 Thread Ottomata
Ottomata added a comment. I'd rather have some intermediary that cleans up, deduplicates, etc. the changes. FYI, neither base Kafka Consumer clients nor EventStreams does this.

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-04-06 Thread Smalyshev
Smalyshev added a comment. It will be more, a lot more. What language are you working in? The end consumer will be Java, but I don't want to consume the raw Kafka stream from Java, I'd rather have some intermediary that cleans up, deduplicates, etc. the changes. Load balanced parallel consumers,
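
One possible shape for the deduplicating intermediary mentioned in this comment, as a hedged Python sketch. It collapses a batch of change events to the newest event per page before handing them to the end consumer. The field names `page_id` and `rev_id`, the function name, and the sample batch are assumptions made for illustration, not the schema or design used in this task.

```python
def latest_per_page(events):
    """Keep only the newest event for each page.

    Assumptions: each event is a dict with hypothetical `page_id` and
    `rev_id` keys, and a larger `rev_id` means a newer change.
    Pages are returned in order of first appearance in the batch.
    """
    latest = {}
    for event in events:
        key = event["page_id"]
        if key not in latest or event["rev_id"] > latest[key]["rev_id"]:
            latest[key] = event
    return list(latest.values())


# Hypothetical batch: page 1 changes twice and one event is delivered twice.
batch = [
    {"page_id": 1, "rev_id": 10},
    {"page_id": 2, "rev_id": 11},
    {"page_id": 1, "rev_id": 12},
    {"page_id": 1, "rev_id": 12},
]
print(latest_per_page(batch))
# [{'page_id': 1, 'rev_id': 12}, {'page_id': 2, 'rev_id': 11}]
```

Batching like this trades a little latency for fewer downstream updates; whether that trade is acceptable depends on requirements the thread does not spell out.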

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-04-06 Thread Ottomata
Ottomata added a comment. If so, you may want to consider consuming from Kafka rather than EventStreams. I am considering this too, but I assume it's more code for me to write (maybe wrongly, I didn't look at it closely). It will be more, a lot more. What language are you working in? But what

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-04-06 Thread Smalyshev
Smalyshev added a comment. let's jump in a hangout sometime to discuss this more. Would be glad to. I'll try to set up something next week. If there is a desire, we can expose these in EventStreams. Do you have desire? :) Yes, see T145712 - recentchanges ignores pageprops updates, and it would

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-04-06 Thread Ottomata
Ottomata added a comment. @Smalyshev let's jump in a hangout sometime to discuss this more. Just a few quick points: Does not have data back more than 7 days We could probably bump this up to 14 days for specific topics like recentchange. Scalable - there's no hard limit on the number of

[Wikidata-bugs] [Maniphest] [Commented On] T161731: Create reliable change stream for specific wiki

2017-03-29 Thread Smalyshev
Smalyshev added a comment. I think EventStreams is closest to the goal too, but I want to have a complete description of the pony for the record so that we know what we need and what is missing. If and once it's implemented (T152731) covers part of it but not all - still need seeking and longer