Re: [Wikimedia-search] Scaleable Event Systems recap

2015-08-03 Thread Tomasz Finc
Very excited to see this moving forward

On Mon, Aug 3, 2015 at 3:12 PM, Oliver Keyes wrote:
> Heyo, Discovery team!
>
> (Analytics CCd)
>
> This is just a quick writeup of the Scaleable Event Systems meeting
> that Erik, Dan, Stas and I went to (although just from my
> perspective).
>
> For people not in the initial thread, this is a proposal to replace
> the internal architecture of EventLogging and similar services with
> Apache Kafka brokers
> (http://www.confluent.io/blog/stream-data-platform-1/). What that
> means in practice is that the current 1-2k events/second limit on
> EventLogging will disappear and we can stop worrying about sampling
> and accidentally bringing down the system. We can be a lot less
> cautious about our schemas and a lot less cautious about our sampling
> rate!
>
> It also offers up a lot of opportunities around streaming data and
> making it available in a layered fashion. I don't think we want to
> explore that right now, but it's nice to have as an option once we
> better understand our search data and how we can safely distribute
> it.
>
> I'd like to thank the Analytics team, particularly Andrew, for putting
> this together; it was a super-helpful discussion to be in and this
> sort of product is precisely what I, at least, have been hoping for
> out of the AnEng brain trust. Full speed ahead!
>
> --
> Oliver Keyes
> Count Logula
> Wikimedia Foundation
>
> ___
> Wikimedia-search mailing list
> Wikimedia-search@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikimedia-search



[Wikimedia-search] Scaleable Event Systems recap

2015-08-03 Thread Oliver Keyes
Heyo, Discovery team!

(Analytics CCd)

This is just a quick writeup of the Scaleable Event Systems meeting
that Erik, Dan, Stas and I went to (although just from my
perspective).

For people not in the initial thread, this is a proposal to replace
the internal architecture of EventLogging and similar services with
Apache Kafka brokers
(http://www.confluent.io/blog/stream-data-platform-1/). What that
means in practice is that the current 1-2k events/second limit on
EventLogging will disappear and we can stop worrying about sampling
and accidentally bringing down the system. We can be a lot less
cautious about our schemas and a lot less cautious about our sampling
rate!
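
To make the sampling point concrete, here is a rough sketch of the arithmetic the current cap forces on us. It is not anything presented at the meeting: the function name, the 1,500 events/second capacity (a midpoint of the 1-2k range above) and the 30,000 events/second traffic figure are all assumptions picked for illustration.

```python
def max_sampling_rate(raw_events_per_sec: float, capacity_per_sec: float) -> float:
    """Return the largest fraction of events that can be logged
    without exceeding the pipeline's capacity (hypothetical helper)."""
    if raw_events_per_sec <= 0:
        # No traffic means nothing to throttle.
        return 1.0
    # Never return more than 1.0: we cannot log more events than occur.
    return min(1.0, capacity_per_sec / raw_events_per_sec)


# Assumed numbers, for illustration only.
rate = max_sampling_rate(raw_events_per_sec=30_000, capacity_per_sec=1_500)
print(f"sampling rate: {rate:.2%} (roughly 1 in {round(1 / rate)} events)")
```

With a much higher capacity the function simply returns 1.0, which is the "stop worrying about sampling" case described above.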

It also offers up a lot of opportunities around streaming data and
making it available in a layered fashion. I don't think we want to
explore that right now, but it's nice to have as an option once we
better understand our search data and how we can safely distribute
it.

I'd like to thank the Analytics team, particularly Andrew, for putting
this together; it was a super-helpful discussion to be in and this
sort of product is precisely what I, at least, have been hoping for
out of the AnEng brain trust. Full speed ahead!

-- 
Oliver Keyes
Count Logula
Wikimedia Foundation