Re: email backend for fedmsg

2020-03-28 Thread clime
On Wed, 25 Mar 2020 at 23:11, Peter Silva  wrote:

> Most Sarracenia stuff is tied to AMQP, but next-gen messages, called
> v03 (version 3), use a JSON payload
> for all the information, and that makes it somewhat protocol independent.
> There is also a 500 line MQTT demo
> that implements a file replication network, using the same JSON messages,
> and primed from an AMQP upstream.
>
> https://github.com/MetPX/wmo_mesh
>
> the peer code there is just a demonstration prototype, but it processes
> the messages the same way as real Sarracenia.
>
> That code has been run against mosquitto and EMQT, and I think another
> broker, I forget... It worked without issues on all of them. MQTT interop
> is flawless afaict.   note: we were using v3.  Have not played with v5.
>
> Sarracenia essentially defines a JSON payload for advertising that a file
> exists. That is a fairly popular problem, but if your problem isn't that,
> then you should define a different payload.  It could be used for file
> replication, or orchestration/workload co-ordination, or other things in
> the IFTTT style... but in the end, this is just one application of a
> message bus, it doesn't need to encompass all applications, but is a good
> way to get a useful thing implemented with it, so people see that it is
> useful.   I think applications need to define their messages, and trying to
> be too general makes them harder to understand and apply.
>

Right, I think every application participating in communication through the
bus could also provide a message schema (either JSON or YAML schema) on
demand. I.e. when a message is sent, it would always include a reference
to a particular schema, and if the recipient of the message doesn't have
this schema stored locally, it would send a message to the sender asking
for it and the sender would send it back.
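
A minimal sketch of that exchange, under stated assumptions: the Agent
class, the "build.finished/1" reference and the "required"-keys check are
all hypothetical stand-ins (a real deployment would use a proper JSON or
YAML schema validator and a bus transport instead of direct method calls).
It only shows the caching idea: ask the sender for a schema once, then
reuse the local copy.

```python
import json


class Agent:
    """Hypothetical bus participant that caches message schemas by reference."""

    def __init__(self, name, own_schemas=None):
        self.name = name
        self.own_schemas = dict(own_schemas or {})   # schemas this agent publishes
        self.schema_cache = dict(self.own_schemas)   # schemas stored locally
        self.schema_requests = 0                     # how often we had to ask a sender

    def make_message(self, schema_ref, body):
        # Every message carries a reference to the schema describing its body.
        return json.dumps({"sender": self.name, "schema": schema_ref, "body": body})

    def provide_schema(self, schema_ref):
        # Stands in for the "please send me your schema" request/response pair.
        return self.own_schemas[schema_ref]

    def receive(self, raw, peers):
        msg = json.loads(raw)
        ref = msg["schema"]
        if ref not in self.schema_cache:
            # Schema not stored locally: ask the sender for it, then cache it.
            self.schema_requests += 1
            self.schema_cache[ref] = peers[msg["sender"]].provide_schema(ref)
        # Simplified validation: only checks that required keys are present.
        missing = [key for key in self.schema_cache[ref]["required"] if key not in msg["body"]]
        if missing:
            raise ValueError(f"message lacks required fields: {missing}")
        return msg["body"]
```

The second message with the same schema reference is validated from the
local cache, so the sender is only asked once per schema.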

I am an upstream maintainer of fedmsg now and this is an option that I see
to make fedmsg a viable solution for a Linux distribution message bus. If
somebody from the Debian community would like to cooperate on this, it would
be great; I think we could create an awesome thing together. Needless to say,
I also have a day job and quite a long TODO queue, but I could get to
working on this if I can get somebody else interested.

If not, a confirmation from somebody in the Debian community that this is
interesting and that they would think about using it if something like this
existed would also help.

Best regards!
clime


>
>
> On Wed, Mar 25, 2020 at 5:57 PM clime  wrote:
>
>> 
>>> I work in telecom for meteorology, and we ended up with a general method
>>> for file copying (catchphrase: rsync on steroids*.) ( *every catchphrase is
>>> a distortion, no dis to rsync, but in certain cases we do work much faster,
>>> it just communicates the idea.) Sarracenia (
>>> https://github.com/MetPX/Sarracenia) is a GPL2 app (Python and C
>>> implementations) that use mozilla public license rabbitmq broker, as well
>>> as openssh and/or any web server to do fastish file synching, and/or
>>> processing/orchestration. The app is just json messages with file metadata
>>> sent through the broker. Then you daisy chain brokers through clients.  No
>>> centralization (every entity installs their own broker), No federated
>>> identity required (authentication is to each broker, but they can pass
>>> files/messages to each other.)
>>> A firstish thing to do with it would be to sync the debian mirrors in
>>> real-time rather than periodically.  Each mirror has a broker, they get
>>> advertisements (AMQP messages containing JSON file metadata) download the
>>> corresponding file, and re-advertise (publish on the local broker with the
>>> local file URL) for downstream clients. You can then make a mesh of
>>> mirrors, where, if each mirror is subscribed to at least two others, then
>>> it can withstand the failure of any node.  If you add more connections, you
>>> increase redundancy.
>>> Once you have that sort of anchor tenant for an AMQP message bus, people
>>> might want to use it to provide other forms of automation, but way quicker
>>> and in some ways much simpler than SMTP.  but yeah... SMTP is a lot more
>>> well-known/common. RabbitMQ is the industry dominant open solution for AMQP
>>> brokers. sounds like marketing bs, but if you look around it is what the
>>> vast majority are using, and there are thousands upon thousands of
>>> deployments. It's a much more viable starting point, for stability, and a
>>> lot less assembly required to get something going. Sarracenia makes it a
>>> bit easier again, but messages are kind of alien and different, so it takes
>>> a while to get used to them.
>>> 
>>
>>
>> Peter, I like the solution and for the mirrors it sounds great but I have
>> a few nitpicks:
>>
>> - the file syncing part makes perfect sense for the debian mirrors
>> but in the general case you might only want to send a message and skip the file
>> syncing part
>> - I am currently, personally more 

Re: email backend for fedmsg

2020-03-25 Thread Peter Silva
Most Sarracenia stuff is tied to AMQP, but next-gen messages, called
v03 (version 3), use a JSON payload
for all the information, and that makes it somewhat protocol independent.
There is also a 500 line MQTT demo
that implements a file replication network, using the same JSON messages,
and primed from an AMQP upstream.

https://github.com/MetPX/wmo_mesh

the peer code there is just a demonstration prototype, but it processes the
messages the same way as real Sarracenia.

That code has been run against mosquitto and EMQT, and I think another
broker, I forget... It worked without issues on all of them. MQTT interop
is flawless afaict.   note: we were using v3.  Have not played with v5.

Sarracenia essentially defines a JSON payload for advertising that a file
exists. That is a fairly popular problem, but if your problem isn't that,
then you should define a different payload.  It could be used for file
replication, or orchestration/workload co-ordination, or other things in
the IFTTT style... but in the end, this is just one application of a
message bus. It doesn't need to encompass all applications, but it is a good
way to get a useful thing implemented with it, so people see that it is
useful.   I think applications need to define their messages, and trying to
be too general makes them harder to understand and apply.
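
To make "a JSON payload for advertising that a file exists" concrete, here
is a rough sketch. The field names (base_url, path, size, sha512) are
made up for illustration and are NOT the actual Sarracenia v03 format; the
point is only that the payload is plain JSON, so it can travel over AMQP,
MQTT or anything else.

```python
import hashlib
import json


def advertise(base_url, path, data):
    """Build a hypothetical 'this file exists' advertisement as a JSON string."""
    return json.dumps({
        "base_url": base_url,                        # where the file can be fetched
        "path": path,                                # path relative to base_url
        "size": len(data),                           # size in bytes
        "sha512": hashlib.sha512(data).hexdigest(),  # checksum of the content
    })


def matches(advert_json, data):
    """Check downloaded bytes against the size and checksum in the advertisement."""
    advert = json.loads(advert_json)
    return (
        advert["size"] == len(data)
        and advert["sha512"] == hashlib.sha512(data).hexdigest()
    )
```

A subscriber would download base_url + path, call matches() on the bytes
it got, and then re-advertise with its own base_url for downstream peers.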


On Wed, Mar 25, 2020 at 5:57 PM clime  wrote:

> 
>> I work in telecom for meteorology, and we ended up with a general method
>> for file copying (catchphrase: rsync on steroids*.) ( *every catchphrase is
>> a distortion, no dis to rsync, but in certain cases we do work much faster,
>> it just communicates the idea.) Sarracenia (
>> https://github.com/MetPX/Sarracenia) is a GPL2 app (Python and C
>> implementations) that use mozilla public license rabbitmq broker, as well
>> as openssh and/or any web server to do fastish file synching, and/or
>> processing/orchestration. The app is just json messages with file metadata
>> sent through the broker. Then you daisy chain brokers through clients.  No
>> centralization (every entity installs their own broker), No federated
>> identity required (authentication is to each broker, but they can pass
>> files/messages to each other.)
>> A firstish thing to do with it would be to sync the debian mirrors in
>> real-time rather than periodically.  Each mirror has a broker, they get
>> advertisements (AMQP messages containing JSON file metadata) download the
>> corresponding file, and re-advertise (publish on the local broker with the
>> local file URL) for downstream clients. You can then make a mesh of
>> mirrors, where, if each mirror is subscribed to at least two others, then
>> it can withstand the failure of any node.  If you add more connections, you
>> increase redundancy.
>> Once you have that sort of anchor tenant for an AMQP message bus, people
>> might want to use it to provide other forms of automation, but way quicker
>> and in some ways much simpler than SMTP.  but yeah... SMTP is a lot more
>> well-known/common. RabbitMQ is the industry dominant open solution for AMQP
>> brokers. sounds like marketing bs, but if you look around it is what the
>> vast majority are using, and there are thousands upon thousands of
>> deployments. It's a much more viable starting point, for stability, and a
>> lot less assembly required to get something going. Sarracenia makes it a
>> bit easier again, but messages are kind of alien and different, so it takes
>> a while to get used to them.
>> 
>
>
> Peter, I like the solution and for the mirrors it sounds great but I have
> a few nitpicks:
>
> - the file syncing part makes perfect sense for the debian mirrors
> but in the general case you might only want to send a message and skip the file
> syncing part
> - I am currently, personally more intrigued by even more standard
> technologies than RabbitMQ and I believe that a good solution might lie
> there
>
> What I particularly like about Sarracenia is that it is decentralized
> because each host has its own broker - that I think is cool and I would
> like to potentially do something similar...
>
> clime
>
>
>
> On Wed, 25 Mar 2020 at 01:07, clime  wrote:
>
>> On Wed, 25 Mar 2020 at 01:00, clime  wrote:
>> >
>> > On Tue, 24 Mar 2020 at 22:45, Nicolas Dandrimont 
>> wrote:
>> > >
>> > > On Tue, Mar 24, 2020, at 21:51, clime wrote:
>> > > > On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont 
>> wrote:
>> > > > >
>> > > > > Hi!
>> > > > >
>> > > > > On Sun, Mar 22, 2020, at 13:06, clime wrote:
>> > > > > > Hello!
>> > > > > >
>> > > > > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html
>> -
>> > > > > > fedmsg usage in Debian.
>> > > > > >
>> > > > > > There is a note: "it seems that people actually like parsing
>> emails"
>> > > > >
>> > > > > This was just a way to say that fedmsg never got much of a user
>> base in the services that run on Debian infra, and that even the new
>> services introduced at the time kept parsing emails.
>> > > >
>> 

Re: email backend for fedmsg

2020-03-25 Thread clime
I was looking into the email approach more and maybe I found a few
improvements:

- each communicating agent has an Exim instance
- there are a few DNS servers that replicate their configuration between
each other (to provide redundancy)
- these servers store the agents' subscriptions for each topic, probably
as TXT records
- they have a REST API to manage those subscriptions from the agents (so that
a service can register itself at its startup)
- when a message should be sent to another agent, a DNS lookup is made first
to obtain recipients based on the topic of the message and the subscriptions
stored in DNS for that topic
- when the recipients are fetched, emails are sent to them in peer-to-peer
fashion
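
The lookup-then-send steps above could be sketched roughly as follows. This
is only an illustration under assumptions: the TXT record layout
("sub=address"), the zone name, the X-Bus-Topic header and the example.org
addresses are all invented, and a dictionary stands in for the DNS servers
so that the sketch runs without a resolver. Handing the messages to the
local Exim (e.g. via smtplib) is left out.

```python
from email.message import EmailMessage

# Stand-in for DNS: record name -> list of TXT strings. In the proposal these
# would be real TXT records managed through the REST API.
TXT_RECORDS = {
    "build.finished.topics.example.org": [
        "sub=tracker@agent1.example.org",
        "sub=irc-bot@agent2.example.org",
    ],
}


def lookup_subscribers(topic, zone="topics.example.org", records=TXT_RECORDS):
    """Return the subscriber addresses stored for a topic (empty if none)."""
    name = f"{topic}.{zone}"
    prefix = "sub="
    return [txt[len(prefix):] for txt in records.get(name, []) if txt.startswith(prefix)]


def build_emails(topic, payload, sender):
    """Build one email per subscriber, to be delivered peer-to-peer."""
    messages = []
    for recipient in lookup_subscribers(topic):
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        msg["Subject"] = topic
        msg["X-Bus-Topic"] = topic   # hypothetical header carrying the topic
        msg.set_content(payload)
        messages.append(msg)
    return messages
```

A topic with no subscribers simply yields no emails, so publishers do not
need to know in advance whether anybody is listening.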

Now, I am not sure if it would be possible to configure Exim for this
role...

On Wed, 25 Mar 2020 at 22:56, clime  wrote:

> 
>> I work in telecom for meteorology, and we ended up with a general method
>> for file copying (catchphrase: rsync on steroids*.) ( *every catchphrase is
>> a distortion, no dis to rsync, but in certain cases we do work much faster,
>> it just communicates the idea.) Sarracenia (
>> https://github.com/MetPX/Sarracenia) is a GPL2 app (Python and C
>> implementations) that use mozilla public license rabbitmq broker, as well
>> as openssh and/or any web server to do fastish file synching, and/or
>> processing/orchestration. The app is just json messages with file metadata
>> sent through the broker. Then you daisy chain brokers through clients.  No
>> centralization (every entity installs their own broker), No federated
>> identity required (authentication is to each broker, but they can pass
>> files/messages to each other.)
>> A firstish thing to do with it would be to sync the debian mirrors in
>> real-time rather than periodically.  Each mirror has a broker, they get
>> advertisements (AMQP messages containing JSON file metadata) download the
>> corresponding file, and re-advertise (publish on the local broker with the
>> local file URL) for downstream clients. You can then make a mesh of
>> mirrors, where, if each mirror is subscribed to at least two others, then
>> it can withstand the failure of any node.  If you add more connections, you
>> increase redundancy.
>> Once you have that sort of anchor tenant for an AMQP message bus, people
>> might want to use it to provide other forms of automation, but way quicker
>> and in some ways much simpler than SMTP.  but yeah... SMTP is a lot more
>> well-known/common. RabbitMQ is the industry dominant open solution for AMQP
>> brokers. sounds like marketing bs, but if you look around it is what the
>> vast majority are using, and there are thousands upon thousands of
>> deployments. It's a much more viable starting point, for stability, and a
>> lot less assembly required to get something going. Sarracenia makes it a
>> bit easier again, but messages are kind of alien and different, so it takes
>> a while to get used to them.
>> 
>
>
> Peter, I like the solution and for the mirrors it sounds great but I have
> a few nitpicks:
>
> - the file syncing part makes perfect sense for the debian mirrors
> but in the general case you might only want to send a message and skip the file
> syncing part
> - I am currently, personally more intrigued by even more standard
> technologies than RabbitMQ and I believe that a good solution might lie
> there
>
> What I particularly like about Sarracenia is that it is decentralized
> because each host has its own broker - that I think is cool and I would
> like to potentially do something similar...
>
> clime
>
>
>
> On Wed, 25 Mar 2020 at 01:07, clime  wrote:
>
>> On Wed, 25 Mar 2020 at 01:00, clime  wrote:
>> >
>> > On Tue, 24 Mar 2020 at 22:45, Nicolas Dandrimont 
>> wrote:
>> > >
>> > > On Tue, Mar 24, 2020, at 21:51, clime wrote:
>> > > > On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont 
>> wrote:
>> > > > >
>> > > > > Hi!
>> > > > >
>> > > > > On Sun, Mar 22, 2020, at 13:06, clime wrote:
>> > > > > > Hello!
>> > > > > >
>> > > > > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html
>> -
>> > > > > > fedmsg usage in Debian.
>> > > > > >
>> > > > > > There is a note: "it seems that people actually like parsing
>> emails"
>> > > > >
>> > > > > This was just a way to say that fedmsg never got much of a user
>> base in the services that run on Debian infra, and that even the new
>> services introduced at the time kept parsing emails.
>> > > >
>> > > > Hello Nicolas!
>> > > >
>> > > > Do you remember some such service and how it used email parsing
>> specifically?
>> > >
>> > > I believe that tracker.debian.org was introduced around that time.
>> > >
>> > > At the point it was created, tracker.d.o was mostly consuming emails
>> from packages.debian.org to update its data. These days tracker.d.o has
>> replaced packages.d.o as "email router", in that it receives all the mails
>> from services (e.g. the BTS, the archive maintenance software, buildds,
>> salsa webhooks, ...) and forwards them to 

Re: email backend for fedmsg

2020-03-25 Thread clime
>
> 
> I work in telecom for meteorology, and we ended up with a general method
> for file copying (catchphrase: rsync on steroids*.) ( *every catchphrase is
> a distortion, no dis to rsync, but in certain cases we do work much faster,
> it just communicates the idea.) Sarracenia (
> https://github.com/MetPX/Sarracenia) is a GPL2 app (Python and C
> implementations) that use mozilla public license rabbitmq broker, as well
> as openssh and/or any web server to do fastish file synching, and/or
> processing/orchestration. The app is just json messages with file metadata
> sent through the broker. Then you daisy chain brokers through clients.  No
> centralization (every entity installs their own broker), No federated
> identity required (authentication is to each broker, but they can pass
> files/messages to each other.)
> A firstish thing to do with it would be to sync the debian mirrors in
> real-time rather than periodically.  Each mirror has a broker, they get
> advertisements (AMQP messages containing JSON file metadata) download the
> corresponding file, and re-advertise (publish on the local broker with the
> local file URL) for downstream clients. You can then make a mesh of
> mirrors, where, if each mirror is subscribed to at least two others, then
> it can withstand the failure of any node.  If you add more connections, you
> increase redundancy.
> Once you have that sort of anchor tenant for an AMQP message bus, people
> might want to use it to provide other forms of automation, but way quicker
> and in some ways much simpler than SMTP.  but yeah... SMTP is a lot more
> well-known/common. RabbitMQ is the industry dominant open solution for AMQP
> brokers. sounds like marketing bs, but if you look around it is what the
> vast majority are using, and there are thousands upon thousands of
> deployments. It's a much more viable starting point, for stability, and a
> lot less assembly required to get something going. Sarracenia makes it a
> bit easier again, but messages are kind of alien and different, so it takes
> a while to get used to them.
> 


Peter, I like the solution and for the mirrors it sounds great but I have a
few nitpicks:

- the file syncing part makes perfect sense for the debian mirrors, but
in the general case you might only want to send a message and skip the file
syncing part
- I am currently, personally, more intrigued by even more standard
technologies than RabbitMQ and I believe that a good solution might lie
there

What I particularly like about Sarracenia is that it is decentralized
because each host has its own broker - that I think is cool and I would
like to potentially do something similar...

clime



On Wed, 25 Mar 2020 at 01:07, clime  wrote:

> On Wed, 25 Mar 2020 at 01:00, clime  wrote:
> >
> > On Tue, 24 Mar 2020 at 22:45, Nicolas Dandrimont 
> wrote:
> > >
> > > On Tue, Mar 24, 2020, at 21:51, clime wrote:
> > > > On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont 
> wrote:
> > > > >
> > > > > Hi!
> > > > >
> > > > > On Sun, Mar 22, 2020, at 13:06, clime wrote:
> > > > > > Hello!
> > > > > >
> > > > > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html
> -
> > > > > > fedmsg usage in Debian.
> > > > > >
> > > > > > There is a note: "it seems that people actually like parsing
> emails"
> > > > >
> > > > > This was just a way to say that fedmsg never got much of a user
> base in the services that run on Debian infra, and that even the new
> services introduced at the time kept parsing emails.
> > > >
> > > > Hello Nicolas!
> > > >
> > > > Do you remember some such service and how it used email parsing
> specifically?
> > >
> > > I believe that tracker.debian.org was introduced around that time.
> > >
> > > At the point it was created, tracker.d.o was mostly consuming emails
> from packages.debian.org to update its data. These days tracker.d.o has
> replaced packages.d.o as "email router", in that it receives all the mails
> from services (e.g. the BTS, the archive maintenance software, buildds,
> salsa webhooks, ...) and forwards them to the public.
> > >
> > > > I am still a bit unclear how email parsing is used in Debian
> > > > infrastructure, don't get me wrong, I find it elegant
> > >
> > > Ha. I find that it's a big mess.
> > >
> > > Here's the set of headers of a message I received today from
> tracker.d.o, which are supposed to make parsing these emails better:
> > >
> > > X-PTS-Approved: yes
> > > X-Distro-Tracker-Package: facter
> > > X-Distro-Tracker-Keyword: derivatives
> > > X-Remote-Delivered-To: dispa...@tracker.debian.org
> > > X-Loop: dispa...@tracker.debian.org
> > > X-Distro-Tracker-Keyword: derivatives
> > > X-Distro-Tracker-Package: facter
> > > List-Id: 
> > > X-Debian: tracker.debian.org
> > > X-Debian-Package: facter
> > > X-PTS-Package: facter
> > > X-PTS-Keyword: derivatives
> > > Precedence: list
> > > List-Unsubscribe:  ?body=unsubscribe%20facter>
> > >
> > > I'll leave you to judge whether 

Re: email backend for fedmsg

2020-03-24 Thread clime
On Wed, 25 Mar 2020 at 01:00, clime  wrote:
>
> On Tue, 24 Mar 2020 at 22:45, Nicolas Dandrimont  wrote:
> >
> > On Tue, Mar 24, 2020, at 21:51, clime wrote:
> > > On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont  wrote:
> > > >
> > > > Hi!
> > > >
> > > > On Sun, Mar 22, 2020, at 13:06, clime wrote:
> > > > > Hello!
> > > > >
> > > > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> > > > > fedmsg usage in Debian.
> > > > >
> > > > > There is a note: "it seems that people actually like parsing emails"
> > > >
> > > > This was just a way to say that fedmsg never got much of a user base in 
> > > > the services that run on Debian infra, and that even the new services 
> > > > introduced at the time kept parsing emails.
> > >
> > > Hello Nicolas!
> > >
> > > Do you remember some such service and how it used email parsing 
> > > specifically?
> >
> > I believe that tracker.debian.org was introduced around that time.
> >
> > At the point it was created, tracker.d.o was mostly consuming emails from 
> > packages.debian.org to update its data. These days tracker.d.o has replaced 
> > packages.d.o as "email router", in that it receives all the mails from 
> > services (e.g. the BTS, the archive maintenance software, buildds, salsa 
> > webhooks, ...) and forwards them to the public.
> >
> > > I am still a bit unclear how email parsing is used in Debian
> > > infrastructure, don't get me wrong, I find it elegant
> >
> > Ha. I find that it's a big mess.
> >
> > Here's the set of headers of a message I received today from tracker.d.o, 
> > which are supposed to make parsing these emails better:
> >
> > X-PTS-Approved: yes
> > X-Distro-Tracker-Package: facter
> > X-Distro-Tracker-Keyword: derivatives
> > X-Remote-Delivered-To: dispa...@tracker.debian.org
> > X-Loop: dispa...@tracker.debian.org
> > X-Distro-Tracker-Keyword: derivatives
> > X-Distro-Tracker-Package: facter
> > List-Id: 
> > X-Debian: tracker.debian.org
> > X-Debian-Package: facter
> > X-PTS-Package: facter
> > X-PTS-Keyword: derivatives
> > Precedence: list
> > List-Unsubscribe: 
> > 
> >
> > I'll leave you to judge whether this makes sense or not.
> >
> > (and it turns out that the actual useful payload was just plaintext with no 
> > real chance of automated parsing)
> >
> > > but from what I have found (e.g. reportbug), in the beginning there is an
> > > email being sent by some human which will then trigger some automatic
> > > action (e.g. putting the bug into db). So it's like you could do all
> > > your work simply by sending emails (some of them machine-parsable).
> > >
> > > So do you have the opposite? I do some clicking action somewhere and
> > > it will send an email to a certain mailing list to inform human
> > > beings? Or let's not just clicking but e.g. `git push` (something that
> > > you can still do from command line).
> > >
> > > Do you have: I do some clicking action somewhere and it will send an
> > > email to a certain mailing list where the email is afterward parsed by
> > > another service which will do an action (e.g. launch a build) based on
> > > it?
> >
> > Both of these are somewhat true.
> >
> > Some examples of email-based behaviors:
> >  - Our bug tracking system is fully controlled by email.
> >  - Closing a bug in reaction to an upload is done by an email from the 
> > archive maintenance system (dak) to the bug tracking system.
> >  - Salsa has a webhook service that react to UI clicks (e.g. "clicking the 
> > merge button") by sending an email to the BTS (e.g. to tag bugs as 
> > pending), or to tracker.d.o (for new commit notifications).
> >  - Some of our IRC bots are triggered by procmail rules.
> >  - At some point mentors.debian.net depended on a NNTP gateway to the 
> > debian-devel-changes mailing list to trigger removal of superseded packages 
> > (...)
> >  - etc. etc.
> >
> > I'm still not sure where your trail of questions is going? fedmsg in Debian 
> > has been dead for years at this point, and there still doesn't seem to be 
> > much interest to implement anything beyond email parsing in some of our 
> > core systems.
>
> Cool, so basically what I am thinking about is to create a free
> software from what you are describing. I.e. create reusable tooling
> out of the Debian messaging system. Something that a new linux
> distribution can easily start using to connect their services.
>
> I didn't know Debian infra works like this but I find it very
> elegant/efficient and I would like the solution you have to be
> reusable by others.
>
> So basically the tooling should contain:
> - unified email message format
> - library that is able to translate a message to a language data
> structure (e.g. dictionary in python)
> - email receiver that would be listening for emails coming from the
> bus and emitting events based on that (this could be part of the
> library so you would be able to attach a callback for an incoming
> 

Re: email backend for fedmsg

2020-03-24 Thread clime
On Tue, 24 Mar 2020 at 22:45, Nicolas Dandrimont  wrote:
>
> On Tue, Mar 24, 2020, at 21:51, clime wrote:
> > On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont  wrote:
> > >
> > > Hi!
> > >
> > > On Sun, Mar 22, 2020, at 13:06, clime wrote:
> > > > Hello!
> > > >
> > > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> > > > fedmsg usage in Debian.
> > > >
> > > > There is a note: "it seems that people actually like parsing emails"
> > >
> > > This was just a way to say that fedmsg never got much of a user base in 
> > > the services that run on Debian infra, and that even the new services 
> > > introduced at the time kept parsing emails.
> >
> > Hello Nicolas!
> >
> > Do you remember some such service and how it used email parsing 
> > specifically?
>
> I believe that tracker.debian.org was introduced around that time.
>
> At the point it was created, tracker.d.o was mostly consuming emails from 
> packages.debian.org to update its data. These days tracker.d.o has replaced 
> packages.d.o as "email router", in that it receives all the mails from 
> services (e.g. the BTS, the archive maintenance software, buildds, salsa 
> webhooks, ...) and forwards them to the public.
>
> > I am still a bit unclear how email parsing is used in Debian
> > infrastructure, don't get me wrong, I find it elegant
>
> Ha. I find that it's a big mess.
>
> Here's the set of headers of a message I received today from tracker.d.o, 
> which are supposed to make parsing these emails better:
>
> X-PTS-Approved: yes
> X-Distro-Tracker-Package: facter
> X-Distro-Tracker-Keyword: derivatives
> X-Remote-Delivered-To: dispa...@tracker.debian.org
> X-Loop: dispa...@tracker.debian.org
> X-Distro-Tracker-Keyword: derivatives
> X-Distro-Tracker-Package: facter
> List-Id: 
> X-Debian: tracker.debian.org
> X-Debian-Package: facter
> X-PTS-Package: facter
> X-PTS-Keyword: derivatives
> Precedence: list
> List-Unsubscribe: 
> 
>
> I'll leave you to judge whether this makes sense or not.
>
> (and it turns out that the actual useful payload was just plaintext with no 
> real chance of automated parsing)
>
> > but from what I have found (e.g. reportbug), in the beginning there is an
> > email being sent by some human which will then trigger some automatic
> > action (e.g. putting the bug into db). So it's like you could do all
> > your work simply by sending emails (some of them machine-parsable).
> >
> > So do you have the opposite? I do some clicking action somewhere and
> > it will send an email to a certain mailing list to inform human
> > beings? Or let's not just clicking but e.g. `git push` (something that
> > you can still do from command line).
> >
> > Do you have: I do some clicking action somewhere and it will send an
> > email to a certain mailing list where the email is afterward parsed by
> > another service which will do an action (e.g. launch a build) based on
> > it?
>
> Both of these are somewhat true.
>
> Some examples of email-based behaviors:
>  - Our bug tracking system is fully controlled by email.
>  - Closing a bug in reaction to an upload is done by an email from the 
> archive maintenance system (dak) to the bug tracking system.
>  - Salsa has a webhook service that react to UI clicks (e.g. "clicking the 
> merge button") by sending an email to the BTS (e.g. to tag bugs as pending), 
> or to tracker.d.o (for new commit notifications).
>  - Some of our IRC bots are triggered by procmail rules.
>  - At some point mentors.debian.net depended on a NNTP gateway to the 
> debian-devel-changes mailing list to trigger removal of superseded packages 
> (...)
>  - etc. etc.
>
> I'm still not sure where your trail of questions is going? fedmsg in Debian 
> has been dead for years at this point, and there still doesn't seem to be 
> much interest to implement anything beyond email parsing in some of our core 
> systems.

Cool, so basically what I am thinking about is to create free
software from what you are describing. I.e. create reusable tooling
out of the Debian messaging system. Something that a new Linux
distribution can easily start using to connect their services.

I didn't know Debian infra works like this but I find it very
elegant/efficient and I would like the solution you have to be
reusable by others.

So basically the tooling should contain:
- unified email message format
- library that is able to translate a message to a language data
structure (e.g. dictionary in python)
- email receiver that would be listening for emails coming from the
bus and emitting events based on that (this could be part of the
library so you would be able to attach a callback for an incoming
message or just do blocking waits)
- email publisher - something that can send a new message into the
bus, i.e. to a preconfigured mail server (a "broker" or "hub")
- mail server that would have an http API to manage topic
subscriptions  (i.e. add/delete me from a given 

Re: email backend for fedmsg

2020-03-24 Thread Nicolas Dandrimont
On Tue, Mar 24, 2020, at 21:51, clime wrote:
> On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont  wrote:
> >
> > Hi!
> >
> > On Sun, Mar 22, 2020, at 13:06, clime wrote:
> > > Hello!
> > >
> > > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> > > fedmsg usage in Debian.
> > >
> > > There is a note: "it seems that people actually like parsing emails"
> >
> > This was just a way to say that fedmsg never got much of a user base in the 
> > services that run on Debian infra, and that even the new services 
> > introduced at the time kept parsing emails.
> 
> Hello Nicolas!
> 
> Do you remember some such service and how it used email parsing specifically?

I believe that tracker.debian.org was introduced around that time.

At the point it was created, tracker.d.o was mostly consuming emails from 
packages.debian.org to update its data. These days tracker.d.o has replaced 
packages.d.o as "email router", in that it receives all the mails from services 
(e.g. the BTS, the archive maintenance software, buildds, salsa webhooks, ...) 
and forwards them to the public.

> I am still a bit unclear how email parsing is used in Debian
> infrastructure, don't get me wrong, I find it elegant

Ha. I find that it's a big mess.

Here's the set of headers of a message I received today from tracker.d.o, which 
are supposed to make parsing these emails better:

X-PTS-Approved: yes
X-Distro-Tracker-Package: facter
X-Distro-Tracker-Keyword: derivatives
X-Remote-Delivered-To: dispa...@tracker.debian.org
X-Loop: dispa...@tracker.debian.org
X-Distro-Tracker-Keyword: derivatives
X-Distro-Tracker-Package: facter
List-Id: 
X-Debian: tracker.debian.org
X-Debian-Package: facter
X-PTS-Package: facter
X-PTS-Keyword: derivatives
Precedence: list
List-Unsubscribe: 

I'll leave you to judge whether this makes sense or not.

(and it turns out that the actual useful payload was just plaintext with no 
real chance of automated parsing)
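
For a sense of what consuming these headers looks like in practice, here is a small sketch using Python's standard `email` module. The header block is a subset of the one quoted above; the function name `package_and_keyword` and the placeholder body line are hypothetical, and the choice to merge the three naming schemes is just one possible approach.

```python
from email import message_from_string

# Subset of the headers quoted above; the body line is a placeholder.
RAW = """\
X-PTS-Approved: yes
X-Distro-Tracker-Package: facter
X-Distro-Tracker-Keyword: derivatives
X-Distro-Tracker-Keyword: derivatives
X-Distro-Tracker-Package: facter
X-Debian: tracker.debian.org
X-Debian-Package: facter
X-PTS-Package: facter
X-PTS-Keyword: derivatives
Precedence: list

(plaintext payload)
"""


def package_and_keyword(raw):
    """Collect package and keyword values across the overlapping headers."""
    msg = message_from_string(raw)
    packages, keywords = set(), set()
    for name in ("X-Distro-Tracker-Package", "X-Debian-Package", "X-PTS-Package"):
        # get_all returns every occurrence, since headers may repeat.
        packages.update(msg.get_all(name, []))
    for name in ("X-Distro-Tracker-Keyword", "X-PTS-Keyword"):
        keywords.update(msg.get_all(name, []))
    return sorted(packages), sorted(keywords)


print(package_and_keyword(RAW))
```

A consumer has to know all three header families and tolerate repeats just to recover one package name and one keyword, and the body still has to be handled separately.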

> but from what I have found (e.g. reportbug), in the beginning there is an
> email being sent by some human which will then trigger some automatic
> action (e.g. putting the bug into db). So it's like you could do all
> your work simply by sending emails (some of them machine-parsable).
> 
> So do you have the opposite? I do some clicking action somewhere and
> it will send an email to a certain mailing list to inform human
> beings? Or let's not just clicking but e.g. `git push` (something that
> you can still do from command line).
> 
> Do you have: I do some clicking action somewhere and it will send an
> email to a certain mailing list where the email is afterward parsed by
> another service which will do an action (e.g. launch a build) based on
> it?

Both of these are somewhat true.

Some examples of email-based behaviors:
 - Our bug tracking system is fully controlled by email.
 - Closing a bug in reaction to an upload is done by an email from the archive 
maintenance system (dak) to the bug tracking system.
 - Salsa has a webhook service that reacts to UI clicks (e.g. "clicking the 
merge button") by sending an email to the BTS (e.g. to tag bugs as pending), or 
to tracker.d.o (for new commit notifications).
 - Some of our IRC bots are triggered by procmail rules.
 - At some point mentors.debian.net depended on an NNTP gateway to the 
debian-devel-changes mailing list to trigger removal of superseded packages 
(...)
 - etc. etc.

I'm still not sure where your trail of questions is going? fedmsg in Debian has 
been dead for years at this point, and there still doesn't seem to be much 
interest to implement anything beyond email parsing in some of our core 
systems. 

Bye,
-- 
Nicolas Dandrimont



Re: email backend for fedmsg

2020-03-24 Thread Paul Gevers
Hi Clime,

On 24-03-2020 21:51, clime wrote:
> So do you have the opposite? I do some clicking action somewhere and
> it will send an email to a certain mailing list to inform human
> beings? Or let's not just clicking but e.g. `git push` (something that
> you can still do from command line).
> 
> Do you have: I do some clicking action somewhere and it will send an
> email to a certain mailing list where the email is afterward parsed by
> another service which will do an action (e.g. launch a build) based on
> it?

A git push to most Salsa-based repos with "Closes: #" in the commit
message triggers a message to the BTS, which updates the bug metadata
to mark the bug as pending and informs the submitter of that bug that
the bug is about to be fixed.

Paul





Re: email backend for fedmsg

2020-03-24 Thread clime
On Tue, 24 Mar 2020 at 20:40, Nicolas Dandrimont  wrote:
>
> Hi!
>
> On Sun, Mar 22, 2020, at 13:06, clime wrote:
> > Hello!
> >
> > Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> > fedmsg usage in Debian.
> >
> > There is a note: "it seems that people actually like parsing emails"
>
> This was just a way to say that fedmsg never got much of a user base in the 
> services that run on Debian infra, and that even the new services introduced 
> at the time kept parsing emails.

Hello Nicolas!

Do you remember some such service and how it used email parsing specifically?

I am still a bit unclear how email parsing is used in Debian
infrastructure, don't get me wrong, I find it elegant but from what I
have found (e.g. reportbug), in the beginning there is an
email being sent by some human which will then trigger some automatic
action (e.g. putting the bug into db). So it's like you could do all
your work simply by sending emails (some of them machine-parsable).

So do you have the opposite? I do some clicking action somewhere and
it will send an email to a certain mailing list to inform human
beings? Or let's not just clicking but e.g. `git push` (something that
you can still do from command line).

Do you have: I do some clicking action somewhere and it will send an
email to a certain mailing list where the email is afterward parsed by
another service which will do an action (e.g. launch a build) based on
it?

Thanks
clime

>
> > [...]
> >
> > So fedmsg would become a tiny wrapper over email that would just
> > serialize and parse json data to and from email messages and check
> > signatures.
>
> The only native fedmsg producer in Debian was mentors.debian.net. Other 
> events were generated by various email parsers connected to mailing lists 
> (debian-devel-announce, debian-bugs-announce).
>
> > I am asking because I like the idea of distribution-independent
> > infrastructure message bus that this project had.
>
> Yes, it was a nice idea.
>
> > [...]
>
> Cheers,
> --
> Nicolas Dandrimont



Re: email backend for fedmsg

2020-03-24 Thread Nicolas Dandrimont
Hi!

On Sun, Mar 22, 2020, at 13:06, clime wrote:
> Hello!
> 
> Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> fedmsg usage in Debian.
> 
> There is a note: "it seems that people actually like parsing emails"

This was just a way to say that fedmsg never got much of a user base in the 
services that run on Debian infra, and that even the new services introduced at 
the time kept parsing emails.

> [...]
>
> So fedmsg would become a tiny wrapper over email that would just
> serialize and parse json data to and from email messages and check
> signatures.

The only native fedmsg producer in Debian was mentors.debian.net. Other events 
were generated by various email parsers connected to mailing lists 
(debian-devel-announce, debian-bugs-announce).

> I am asking because I like the idea of distribution-independent
> infrastructure message bus that this project had.

Yes, it was a nice idea.

> [...]

Cheers,
-- 
Nicolas Dandrimont



Re: email backend for fedmsg

2020-03-24 Thread Peter Silva
MQTT is the best thing going for interop purposes.

On Tue, Mar 24, 2020 at 1:20 PM Jeremy Stanley  wrote:

> On 2020-03-24 13:09:35 -0400 (-0400), Peter Silva wrote:
> [...]
> > We could talk about the merits of various protocols (I see fedmsg
> > uses ZeroMQ) but that is a deep rabbit hole... to me, fedmsg looks
> > like it is making a ZeroMQ version of a broker (which is a bit
> > ironic given the original point of that protocol) trying to build
> > a broker ecosystem is hard. Using an existing one is much easier.
> > so to me it makes sense that fedmsg is not really working out.
> [...]
>
> In the OpenDev collaboratory we added an event stream for our
> services some years ago using the MQTT protocol (a long-established
> ISO/OASIS standard). I gather there was some work done to make
> fedmsg support MQTT as a result of that, so it might be an
> alternative to relying on ZeroMQ at least.
> --
> Jeremy Stanley
>


Re: email backend for fedmsg

2020-03-24 Thread Jeremy Stanley
On 2020-03-24 13:09:35 -0400 (-0400), Peter Silva wrote:
[...]
> We could talk about the merits of various protocols (I see fedmsg
> uses ZeroMQ) but that is a deep rabbit hole... to me, fedmsg looks
> like it is making a ZeroMQ version of a broker (which is a bit
> ironic given the original point of that protocol) trying to build
> a broker ecosystem is hard. Using an existing one is much easier.
> so to me it makes sense that fedmsg is not really working out.
[...]

In the OpenDev collaboratory we added an event stream for our
services some years ago using the MQTT protocol (a long-established
ISO/OASIS standard). I gather there was some work done to make
fedmsg support MQTT as a result of that, so it might be an
alternative to relying on ZeroMQ at least.
-- 
Jeremy Stanley




Re: email backend for fedmsg

2020-03-24 Thread Peter Silva
hi, totally different take on this...

We could talk about the merits of various protocols (I see fedmsg uses
ZeroMQ), but that is a deep rabbit hole... To me, fedmsg looks like it is
making a ZeroMQ version of a broker (which is a bit ironic given the
original point of that protocol). Trying to build a broker ecosystem is
hard; using an existing one is much easier. So to me it makes sense that
fedmsg is not really working out.

However,



I work in telecom for meteorology, and we ended up with a general method
for file copying (catchphrase: rsync on steroids*). (*Every catchphrase is
a distortion; no dis to rsync, but in certain cases we do work much faster,
and it communicates the idea.) Sarracenia
(https://github.com/MetPX/Sarracenia) is a GPL2 app (Python and C
implementations) that uses the Mozilla Public License RabbitMQ broker, as
well as openssh and/or any web server, to do fastish file syncing and/or
processing/orchestration. The app is just JSON messages with file metadata
sent through the broker. Then you daisy-chain brokers through clients. No
centralization (every entity installs their own broker), no federated
identity required (authentication is to each broker, but they can pass
files/messages to each other).

A firstish thing to do with it would be to sync the Debian mirrors in
real time rather than periodically. Each mirror has a broker; the mirrors
get advertisements (AMQP messages containing JSON file metadata), download
the corresponding file, and re-advertise (publish on the local broker with
the local file URL) for downstream clients. You can then make a mesh of
mirrors, where, if each mirror is subscribed to at least two others, it
can withstand the failure of any node. If you add more connections, you
increase redundancy.
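
The advertise / download / re-advertise loop described above can be sketched without any broker at all. What follows is only an in-memory illustration under stated assumptions: the `Mirror` class, the `checksum`, `path` and `url` field names, and the `.example` host names are all hypothetical and are not Sarracenia's actual message format or API.

```python
import json


class Mirror:
    """In-memory stand-in for one mirror plus its local broker (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.subscribers = []  # downstream mirrors subscribed to this one
        self.seen = set()      # checksums of files already fetched
        self.alive = True

    def subscribe_to(self, upstream):
        upstream.subscribers.append(self)

    def publish(self, advert_json):
        # Deliver the advertisement to every downstream subscriber.
        for sub in self.subscribers:
            sub.receive(advert_json)

    def receive(self, advert_json):
        if not self.alive:
            return
        advert = json.loads(advert_json)
        if advert["checksum"] in self.seen:
            return  # same file already arrived via another path
        self.seen.add(advert["checksum"])
        # A real client would download advert["url"] here.
        local = dict(advert, url=f"https://{self.name}.example/{advert['path']}")
        self.publish(json.dumps(local))  # re-advertise with the local URL


origin, a, b, c = Mirror("origin"), Mirror("a"), Mirror("b"), Mirror("c")
a.subscribe_to(origin)
b.subscribe_to(origin)
c.subscribe_to(a)
c.subscribe_to(b)

a.alive = False  # simulate one failed mirror
origin.publish(json.dumps({
    "path": "pool/main/f/facter.deb",      # hypothetical file
    "checksum": "abc123",                   # hypothetical digest
    "url": "https://origin.example/pool/main/f/facter.deb",
}))
print("abc123" in c.seen)
```

In this toy run, mirror c still learns about the file through b even though a is down, and the seen-set keeps a file from being fetched twice when both upstreams are healthy.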

Once you have that sort of anchor tenant for an AMQP message bus, people
might want to use it to provide other forms of automation, way quicker
and in some ways much simpler than SMTP. But yeah... SMTP is a lot more
well-known/common. RabbitMQ is the industry-dominant open solution for AMQP
brokers. That sounds like marketing BS, but if you look around it is what
the vast majority are using, and there are thousands upon thousands of
deployments. It's a much more viable starting point for stability, with a
lot less assembly required to get something going. Sarracenia makes it a
bit easier again, but messages are kind of alien and different, so it takes
a while to get used to them.




On Sun, Mar 22, 2020 at 8:24 AM clime  wrote:

> Hello!
>
> Ad. https://lists.debian.org/debian-devel/2016/07/msg00377.html -
> fedmsg usage in Debian.
>
> There is a note: "it seems that people actually like parsing emails"
>
> What about adding an email backend to fedmsg then? Wouldn't it be an
> interesting idea? It could basically rely on postfix for sending
> messages, hence providing decentralization as well as high
> reliability. I think that the number of events that happen in a
> distribution (like package update, package build) is never so huge that
> email infrastructure wouldn't handle it, and the machine mailing
> infrastructure could also optionally be separated from the human one if
> needed.
>
> So fedmsg would become a tiny wrapper over email that would just
> serialize and parse json data to and from email messages and check
> signatures.
>
> I am asking because I like the idea of distribution-independent
> infrastructure message bus that this project had.
>
> Btw. instead of json, yaml could be used so it is nicer to human eyes.
>
> clime
>
>


Re: email backend for fedmsg

2020-03-23 Thread clime
The email backend might be quite a heavy-weight idea ... although I
think it would do the job, and _very_ reliably, if properly set up. I was
thinking about something similar to Google Pub/Sub.
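
To make the "tiny wrapper over email" idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the helper names `to_email` and `from_email`, the envelope layout, the example topic and addresses, and above all the HMAC shared key, which merely stands in for whatever signature scheme a real backend would use.

```python
import hashlib
import hmac
import json
from email import message_from_bytes, policy
from email.message import EmailMessage

SECRET = b"demo-key"  # hypothetical shared key, a stand-in for real signing


def _sign(text):
    return hmac.new(SECRET, text.encode("ascii"), hashlib.sha256).hexdigest()


def to_email(topic, body, sender, recipient):
    """Serialize a bus message into raw email bytes (sketch)."""
    signed = json.dumps({"topic": topic, "msg": body}, sort_keys=True)
    envelope = {"topic": topic, "msg": body, "signature": _sign(signed)}
    mail = EmailMessage()
    mail["From"] = sender
    mail["To"] = recipient
    mail["Subject"] = topic
    mail.set_content(json.dumps(envelope, sort_keys=True))
    return mail.as_bytes()


def from_email(raw):
    """Parse raw email bytes back into (topic, body), checking the signature."""
    mail = message_from_bytes(raw, policy=policy.default)
    envelope = json.loads(mail.get_content())
    signature = envelope.pop("signature")
    signed = json.dumps(envelope, sort_keys=True)
    if not hmac.compare_digest(signature, _sign(signed)):
        raise ValueError("signature mismatch")
    return envelope["topic"], envelope["msg"]


raw = to_email(
    "org.debian.dev.example.upload",          # hypothetical topic
    {"package": "facter", "version": "1.0"},  # hypothetical payload
    "bus@example.org",
    "subscribers@example.org",
)
print(from_email(raw))
```

Delivery itself (postfix, list subscriptions) is left out on purpose; the sketch only covers the serialize, parse and verify steps.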

Another way to add reliability to the current fedmsg would be
to add optional sqlite persistence to each application publishing
fedmsg messages. Basically, the messages would be stored in a
circular buffer of configurable size so that when some service drops
off, we could still keep messages for it for some time until it
recovers. This is a very rough idea, and there is still the whole problem
of how subscribers should get information about providers and so on...
some analysis is here: http://fedmsg.com/federated-message-bus/
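
A rough sketch of such a buffer, assuming Python's standard `sqlite3` module. The `MessageBuffer` class, the `outbox` table and the "replay everything after the last id I saw" interface are hypothetical choices made for this illustration, not existing fedmsg code.

```python
import json
import sqlite3


class MessageBuffer:
    """Keeps only the newest `capacity` published messages (sketch)."""

    def __init__(self, path=":memory:", capacity=1000):
        self.capacity = capacity
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "topic TEXT NOT NULL, body TEXT NOT NULL)"
        )

    def append(self, topic, body):
        # Insert the new message and drop anything older than the window,
        # both in one transaction.
        with self.db:
            cur = self.db.execute(
                "INSERT INTO outbox (topic, body) VALUES (?, ?)",
                (topic, json.dumps(body)),
            )
            self.db.execute(
                "DELETE FROM outbox WHERE id <= ?",
                (cur.lastrowid - self.capacity,),
            )
        return cur.lastrowid

    def since(self, last_seen_id):
        # Messages a recovering subscriber has not seen yet, oldest first.
        rows = self.db.execute(
            "SELECT id, topic, body FROM outbox WHERE id > ? ORDER BY id",
            (last_seen_id,),
        )
        return [(i, t, json.loads(b)) for i, t, b in rows]


buf = MessageBuffer(capacity=3)
for n in range(1, 6):
    buf.append("example.topic", {"n": n})
print([row[0] for row in buf.since(0)])
```

A subscriber that was offline would call `since()` with the last id it processed; anything that has already fallen out of the window is lost, which is the trade-off a fixed-size buffer makes.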

I would like to continue the fedmsg project that was mainly started by
Ralph Bean some years ago because I like the idea of a federated
distribution-independent message bus that was triggered by
Debian/Fedora cooperation. There might be some technical challenges on
the way but solvable in the end and the solution might be interesting.

On the other hand, I am only one guy with limited time so if anyone
wants to cooperate on this, it would be most welcome.

clime