Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-15 Thread Job Snijders via routing-wg
Dear Ties, group,

Thank you for the outline.

On Wed, Apr 14, 2021 at 02:33:37PM +0200, Ties de Kock wrote:
> The RPKI application does not support writing the complete repository to disk
> for each state (as needed for spooling the repository as proposed in scripts).
> Synchronously writing every state of the repository to disk is not feasible,
> given our update frequency and repository size. Functionality for
> asynchronously writing the repository to disk needs to be developed. We have
> two paths to develop this:
> - The first is a new daemon that writes to disk from the database state at a
>   set interval.
> - The second one is using RRDP as a source of truth and writing the
>   repository to disk.
> Furthermore, we would need to migrate the storage from NFS to have faster
> writes.
> 
> Both approaches need an extended period for validation and we are not able to
> deploy these within a few weeks. The latter approach (using RRDP) has less
> risk and is the option we are aiming for at the moment. We plan to release
> the new publication infrastructure in Q2/Q3 2021 and hope to migrate earlier.

The "RRDP as source of truth" approach indeed seems the more appealing
(and simpler!) option. I would encourage the NCC to follow that path.

In the meantime, can
https://www.ripe.net/support/service-announcements/service-announcements/current
be updated to reflect that there are known race conditions and problems
with the RIPE NCC RSYNC service?

Are there any other tweaks the NCC can think of that reduce the
operational pain? Maybe increasing the publication interval?

Kind regards,

Job



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-14 Thread Ties de Kock
Dear colleagues,

First, let me give you an overview of our rsync infrastructure and the situation
encountered by a client. Afterwards, I will describe the context of our
application and repository, and how that limits our design space.

RPKI objects are created on machines in an isolated network. The active machine
writes new objects to an NFS share (with replicated storage in two data
centres). The rsync machines (outside the isolated network) serve these files.
These are behind a load balancer.

Sets of objects to be updated (for example a manifest, CRL, and certificate) are
written to a staging directory by the application. After all the files are
created, they are moved into the main repository directory. There is a small
period between these moves. In a recent incident that was reported to us, this
was ~30ms, with files written in an order where the repository was correct on
disk at all times. This part of the code has been in place since 2012.
While the files are written to the filesystem they are also sent to a (draft
version of RFC 8181) publication server. The files are sent atomically in one
message. The publication is synchronous: when a ROA is created, it is
immediately published to rsync and the publication server.
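A minimal sketch of the staging-then-move pattern described above. The directory layout and file names are hypothetical, and staging and repository are assumed to sit on the same filesystem so that each mv is a rename. The objects a manifest refers to are moved first and the manifest last, so a manifest on disk never names a file that is absent. This is an illustration only, not the RIPE NCC's code.

```shell
#!/bin/sh
# Illustrative only: directory layout and file names are hypothetical.
set -eu

work=$(mktemp -d)                 # scratch area, left in place for inspection
staging="$work/staging"
repo="$work/repo"
mkdir -p "$staging" "$repo"

# The application first writes the complete set into the staging directory.
echo "certificate"              > "$staging/new.cer"
echo "crl"                      > "$staging/ca.crl"
echo "manifest: new.cer ca.crl" > "$staging/ca.mft"

# Move referenced objects first and the manifest last. Each mv within one
# filesystem is a rename, so every single file appears atomically, but the
# set as a whole does not: there is a short gap between the moves.
for f in new.cer ca.crl ca.mft; do
    mv "$staging/$f" "$repo/$f"
done

ls "$repo"
```

The ordering keeps the on-disk repository valid at every instant, yet a reader that takes longer than the gap between moves can still observe a mixture of two states.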

The affected client in the reported incident read the file list *before* the new
certificate was present, but checked the content of (and copied) the updated
manifest, which referred to a certificate that was not present when the file
list was created. In the rest of this document, we will call this situation a
non-repeatable read; part of the repository reflects one state while another
part reflects a different state.
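The sequence can be reproduced with a small simulation. This is a sketch with invented file names in which plain ls and cp stand in for the two phases of an rsync transfer (file list first, content afterwards); it does not invoke rsync itself.

```shell
#!/bin/sh
# Simulation of a non-repeatable read; all file names are invented.
set -eu

work=$(mktemp -d)
repo="$work/repo"
copy="$work/copy"
mkdir -p "$repo" "$copy"
echo "manifest v1: a.roa"       > "$repo/ca.mft"
echo "roa"                      > "$repo/a.roa"

# 1. The client builds its file list (a.roa and ca.mft).
file_list=$(ls "$repo")

# 2. Meanwhile the publisher adds a certificate and updates the manifest.
echo "certificate"              > "$repo/b.cer"
echo "manifest v2: a.roa b.cer" > "$repo/ca.mft"

# 3. The client transfers only the files on its (now stale) list, but it
#    reads their current content.
for f in $file_list; do
    cp "$repo/$f" "$copy/$f"
done

# The copy now holds manifest v2 without the b.cer it refers to.
cat "$copy/ca.mft"
[ -e "$copy/b.cer" ] || echo "b.cer is missing from the client's copy"
```

The repository on disk was valid before and after step 2; only the client's copy mixes the two states.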

On April 12, we published 41,600 sets of objects. This resulted in 41,600
distinct repository states on disk. The RIPE NCC repository contains ~65,500
files in ~34,500 directories, with a total size of 157MB.

The repository is consistent (on disk) when the application is not publishing
objects. The repository is also consistent (for a slow client) when no files are
added or changed after their rsync client starts retrieving the file list.

Copying the repository without coordination from the application (i.e. to spool
it) has the same risk of a non-repeatable read as rsync clients have. However,
in this case, it would affect many clients for an extended period - and mask
instead of solve the underlying issue. Other approaches (such as snapshotting)
also have limitations that make them untenable.

The RPKI application does not support writing the complete repository to disk
for each state (as needed for spooling the repository as proposed in scripts).
Synchronously writing every state of the repository to disk is not feasible,
given our update frequency and repository size. Functionality for
asynchronously writing the repository to disk needs to be developed. We have two
paths to develop this:
- The first is a new daemon that writes to disk from the database state at a
  set interval.
- The second one is using RRDP as a source of truth and writing the repository
  to disk.
Furthermore, we would need to migrate the storage from NFS to have faster
writes.
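A back-of-the-envelope calculation with the figures quoted above (41,600 states on April 12, 157MB repository) gives a feel for the write load. It assumes the worst case, that every state is written out in full with no sharing of unchanged files between states; that assumption is not from the message.

```shell
#!/bin/sh
# Rough figures derived from the numbers quoted in this message.
# Assumption: each state is a full, unshared copy of the repository.
set -eu

states_per_day=41600      # sets of objects published on April 12
repo_mb=157               # total repository size in MB
seconds_per_day=86400

# Average time between two repository states, in milliseconds.
interval_ms=$(( seconds_per_day * 1000 / states_per_day ))

# Data written per day if every state were written out in full.
mb_per_day=$(( states_per_day * repo_mb ))
mb_per_second=$(( mb_per_day / seconds_per_day ))

echo "one new state every ${interval_ms} ms on average"
echo "${mb_per_day} MB per day, about ${mb_per_second} MB/s sustained"
```

With integer arithmetic this comes to a new state roughly every 2 seconds and about 6.5TB written per day, or some 75MB/s sustained, before any replication overhead.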

Both approaches need an extended period for validation and we are not able to
deploy these within a few weeks. The latter approach (using RRDP) has less risk
and is the option we are aiming for at the moment. We plan to release the new
publication infrastructure in Q2/Q3 2021 and hope to migrate earlier.

I’m happy to answer any further questions you may have.

Kind regards,

Ties de Kock
RIPE NCC


> On 12 Apr 2021, at 15:12, Nick Hilliard  wrote:
> 
> Erik Bais wrote on 12/04/2021 11:41:
>> This looks to be a 3 line bash script fix on a cronjob …  So why isn’t this 
>> just tested on a testbed and updated before the end of the week ?
> 
> cache coherency and transaction guarantees are problems which are known to be 
> difficult to solve.  Long term, the RIPE NCC probably needs to aim for the 
> degree of transaction read consistency that might be provided by an ACID 
> database or something equivalent, and that will take time and probably a 
> migration away from a filesystem-based data store.
> 
> So the question is what to do in the interim?  The bash script that Job 
> posted may help to reduce some of the race conditions that are being 
> observed, but it's unlikely to guarantee transaction consistency in a deep 
> sense.  Does the RIPE NCC have an opinion on whether the approach used in 
> this script would help with some of the problems that are being seen and if 
> so, would there be any significant negative effects associated with 
> implementing it as an intermediate patch-up?
> 
> Nick
> 




Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-13 Thread George Michaelson
I'll see if I can do that from the log stream. It may take some time.

cheers

-G

On Tue, Apr 13, 2021 at 10:39 PM Nick Hilliard  wrote:
>
> George Michaelson wrote on 13/04/2021 05:40:
> > As of Feb 2021
> > •1,009 total ASNs reading the APNIC RPKI data every day
> > –902 distinct ASNs collecting using RRDP protocol (https)
> > –927 distinct ASNs collecting via rsync
>
> mmm, interesting.  full preso here:
>
> > https://conference.apnic.net/51/assets/files/APSr481/routing-security-sig-rpki-status-report.pdf
>
> Would it be possible to drill down into these figures a bit more?  I.e.
> is it possible to work out how many are pulling the TAL via rsync, but
> then using rrdp to synchronise their local instances?  Or equivalently,
> how many people are using rsync for everything?  Either figure will give ~
> the other.
>
> Pulling the TAL via rsync and then using rrdp for everything else is not
> a scenario that needs to be taken into account for this rsync
> consistency issue.
>
> Nick



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-13 Thread Ties de Kock
Hi Nick,

> On 13 Apr 2021, at 15:33, Nick Hilliard  wrote:
> 
>> Would it be possible to drill down into these figures a bit more?  I.e. is 
>> it possible to work out how many are pulling the TAL via rsync, but then 
>> using rrdp to synchronise their local instances?  Or
> 
> that came out badly garbled.  I meant how many clients were pulling the trust 
> anchor certs via rsync due to having older TALs installed on the local RP 
> cache, and then downloading the manifest/roas data via rrdp afterwards 
> because the TA contains both rsync and https locator information, and the RP 
> software was able to select rrdp instead of rsync because that was presented 
> as an option.

For RIPE I can give more details on what we see.

Because of the structure of our repository, we can split out clients connecting
over rsync to retrieve the trust anchor from those connecting to the main
repository.

We do see a change on the 2nd of April so I'm providing data both for the week
before and after this date. The cause for this change is unknown.

In the week leading up to the 2nd of April, on average per day we see:
 * 192 unique IPs (from 182 /24's/64's) creating 8636 connections to /repository
 * 911 unique IPs (from 721 /24's/64's) creating 81855 connections to /ta
In the week starting on the 2nd of April, on average per day we see:
 * 598 unique IPs (from 582 /24's/64's) creating 17594 connections to /repository
 * 1301 unique IPs (from 1114 /24's/64's) creating 89675 connections to /ta

Traffic also increased from ~34 to ~73GB an hour (for rsync).

We see ~1086 unique IPs accessing the TA certificate over HTTPS per day.

Kind regards,
Ties




Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-13 Thread Nick Hilliard
> Would it be possible to drill down into these figures a bit more?  I.e.
> is it possible to work out how many are pulling the TAL via rsync, but
> then using rrdp to synchronise their local instances?  Or


that came out badly garbled.  I meant how many clients were pulling the 
trust anchor certs via rsync due to having older TALs installed on the 
local RP cache, and then downloading the manifest/roas data via rrdp 
afterwards because the TA contains both rsync and https locator 
information, and the RP software was able to select rrdp instead of 
rsync because that was presented as an option.


Nick




Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-13 Thread Nick Hilliard

George Michaelson wrote on 13/04/2021 05:40:

> As of Feb 2021
> •1,009 total ASNs reading the APNIC RPKI data every day
> –902 distinct ASNs collecting using RRDP protocol (https)
> –927 distinct ASNs collecting via rsync


mmm, interesting.  full preso here:


https://conference.apnic.net/51/assets/files/APSr481/routing-security-sig-rpki-status-report.pdf


Would it be possible to drill down into these figures a bit more?  I.e. 
is it possible to work out how many are pulling the TAL via rsync, but 
then using rrdp to synchronise their local instances?  Or equivalently, 
how many people are using rsync for everything?  Either figure will give ~
the other.


Pulling the TAL via rsync and then using rrdp for everything else is not 
a scenario that needs to be taken into account for this rsync 
consistency issue.


Nick



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-13 Thread Ben Maddison via routing-wg
Hi Nathalie,

On 04/12, Nathalie Trenaman wrote:
> Dear colleagues,
> 
> 
>
> We are planning on changing our publication infrastructure and using the same 
> "revisions" RRDP uses for the content of the rsync repository. Rsync is an 
> officially supported distribution protocol for RPKI repository data, and it 
> is one of our highest priorities that the data published is atomic and 
> consistent. We plan to release the new publication infrastructure in Q2/Q3 
> 2021. Part of this work will mitigate these non-repeatable-reads for clients 
> using rsync.
> 
> We will update you on our progress during RIPE 82, taking place online from 
> 17-21 May 2021.
> 
The above description seems to suggest that repository access via rsync
is an optional extra that the RIPE NCC provides as a value add.

However, as of course you know, rsync is the only mandatory to implement
access method *today* and we are not yet on an agreed path towards an
RRDP-only future.

It seems to me that this issue deserves the same urgency as "the
publication server periodically reboots when used correctly".

If that requires a workaround (of the form suggested by others) now, and
then a redesign in 6 months, fine. But it is a dis-service to the
Internet community as a whole to skip the "workaround now" step.

Cheers,

Ben




Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread George Michaelson
It has been pointed out to me I must have meant chdir() when I said
chroot(). Sorry.

rsync --daemon does a chdir() to the target of a symlink when it runs. So,
changing the symlink which an earlier instance has chdir()'d "into"
doesn't alter the directory of that forked daemon.
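The effect can be shown without rsync at all. In this sketch a shell that has changed into a directory through a symlink plays the part of the forked daemon; the directory names are invented, and "mv -T" is a GNU coreutils option.

```shell
#!/bin/sh
# Demonstration with invented directory names; "mv -T" needs GNU coreutils.
set -eu

work=$(mktemp -d)
mkdir "$work/rev-1" "$work/rev-2"
echo "old" > "$work/rev-1/state"
echo "new" > "$work/rev-2/state"
ln -s rev-1 "$work/current"

# Stand-in for a forked daemon: chdir() through the symlink.
cd "$work/current"

# Swing the symlink to the new revision while we are "inside" the old one.
ln -s rev-2 "$work/current.tmp"
mv -T "$work/current.tmp" "$work/current"

in_flight=$(cat state)                  # relative to our working directory
fresh=$(cat "$work/current/state")      # resolved through the symlink now

echo "in-flight process sees: $in_flight"
echo "new connection sees:    $fresh"
```

The working directory is bound to the directory that the symlink pointed at when chdir() was called, so the in-flight process keeps reading rev-1 while new path lookups land in rev-2.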

-G



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread George Michaelson
Not to detract from the paper Randy posted, in any way.

For APRICOT 2021 I reported to the APNIC routing security sig as follows:

As of Feb 2021
•1,009 total ASNs reading the APNIC RPKI data every day
–902 distinct ASNs collecting using RRDP protocol (https)
–927 distinct ASNs collecting via rsync

--

So for us, they are mostly co-incident sets of ASNs. For APNIC's
publication point, the scale of the issue is "almost everybody does
both"

What's the size of the problem?

The size of the problem is the likelihood of updating rsync state
whilst it's being fetched. There is a simple model to avoid this which
has been noted: modify discrete directories, swing a symlink so the
change from "old" to "new" state is as atomic as possible. rsync on
UNIX is chroot() and the existing bound clients will drain from the
old state. Then run some cleanup process.
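A sketch of that model: build each state in its own directory, switch a symlink with a single rename, and clean up afterwards. The publish and cleanup helpers, the rev-N naming and the paths are all hypothetical, and "mv -T" requires GNU coreutils.

```shell
#!/bin/sh
# Sketch of "build aside, swing a symlink, clean up later".
# Helper names and paths are hypothetical; "mv -T" needs GNU coreutils.
set -eu

base=$(mktemp -d)          # stand-in for the directory the rsync daemon serves

publish() {                # usage: publish <revision-name> <source-dir>
    rev="$base/rev-$1"
    mkdir "$rev"
    cp -R "$2"/. "$rev"/                        # fill the new revision fully
    ln -s "rev-$1" "$base/current.tmp"
    mv -T "$base/current.tmp" "$base/current"   # one rename: the switch
}

cleanup() {                # remove every revision except the one being served
    live=$(readlink "$base/current")
    for d in "$base"/rev-*; do
        [ "$(basename "$d")" = "$live" ] || rm -rf "$d"
    done
}

src=$(mktemp -d)
echo "state 1" > "$src/ca.mft"; publish 1 "$src"
echo "state 2" > "$src/ca.mft"; publish 2 "$src"

cat "$base/current/ca.mft"
cleanup                    # in practice only after old clients have drained
ls "$base"
```

A plain "ln -sfn" would remove and recreate the link in two steps; creating a temporary link and renaming it over the old one avoids a moment where the path does not exist. How long to wait before cleanup depends on how long the slowest client takes, which this sketch does not address.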

But, if you are caught behind a process which writes to the actively
served rsync directory, the "size" of the subdirectory structure is an
indication of the risk of a Manifest being written to, whilst being
fetched. Yes, in absolute terms it could happen to a 1 ROA manifest,
but it is more likely to happen in any manifest of size The "cost" of
a non-atomic update is higher, and I believe the risk is higher. The
risk is computing the checksums and writing the Manifest and signing
it, while some asynchronous update is happening, and whilst people are
fetching the state.

RIRs host thousands of children, so we have at least one manifest each
which is significantly larger over those certificated products and
more likely to trigger a problem.

ggm@host6 repository % !fi
find . -type d | xargs -n 1 -I {} echo "ls {} | wc -l" | sh | sort | uniq -c
   1    2
   1    3
   1    7
   1    8
   1    9
   1   52
   1  147
   1 3352
ggm@host6 repository %

Our hosted solution has this structure. Most children by far have less
than 10 objects.

ggm@host6 member_repository % find . -type d | xargs -n 1 -I {} echo "ls {} | wc -l" | sh | sort | uniq -c
2997    1
1697    2
2099    3
 560    4
 229    5
  96    6
  44    7
  22    8
  17    9
  11   10
   6   11
   5   12
   5   13
   6   14
   2   15
   4   16
   2   17
   2   18
   1   20
   1   23
   1   25
   1   27
   1   28
   1   29
   1   34
   1   38
   1   40
   3   42
   1   46
   1   60
   1   97
   1  848
ggm@host6 member_repository %



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread Randy Bush
> I'm curious about the scale of the issue here.

if you make an approximation that all RPs touch all PPs, you may find
this useful

John Kristoff, Randy Bush, Chris Kanich, George Michaelson, Amreesh
Phokeer, Thomas Schmidt, Matthias Wählisch. On Measuring RPKI Relying
Parties, ACM IMC 2020
https://archive.psg.com/201029.imc-rp.pdf

randy

---
ra...@psg.com
`gpg --locate-external-keys --auto-key-locate wkd ra...@psg.com`
signatures are back, thanks to dmarc header butchery




Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread Nick Hilliard

Job Snijders wrote on 12/04/2021 16:10:

> There is a wealth of knowledge available in this working group on how
> POSIX-like systems work, how ISP operations work, and how the RPKI works. I
> hope RIPE NCC can leverage that.


I'm curious about the scale of the issue here.

Would someone from the RIPE NCC be able to disclose how many rsync
clients they're seeing? And what percentage of total rpki (i.e. rsync + 
rrdp) that is?  I.e. how much attention needs to be given to resolving 
this issue?


Nick



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread Job Snijders via routing-wg
On Mon, Apr 12, 2021 at 02:12:10PM +0100, Nick Hilliard wrote:
> Erik Bais wrote on 12/04/2021 11:41:
> > This looks to be a 3 line bash script fix on a cronjob …  So why
> > isn’t this just tested on a testbed and updated before the end of
> > the week ?
> 
> cache coherency and transaction guarantees are problems which are
> known to be difficult to solve.  Long term, the RIPE NCC probably
> needs to aim for the degree of transaction read consistency that might
> be provided by an ACID database or something equivalent, and that will
> take time and probably a migration away from a filesystem-based data
> store.
> 
> So the question is what to do in the interim?  The bash script that
> Job posted may help to reduce some of the race conditions that are
> being observed, but it's unlikely to guarantee transaction consistency
> in a deep sense. Does the RIPE NCC have an opinion on whether the
> approach used in this script would help with some of the problems that
> are being seen and if so, would there be any significant negative
> effects associated with implementing it as an intermediate patch-up?

Perhaps the script [0] can be of use, or perhaps not. The script assumes
a POSIXish-compliant environment. It is not clear to me what software
process runs where and how RIPE NCC runs their publication service.

The core problem seems to me that while RSYNC clients are connected the
RIPE NCC RPKI server appears to 'pull the rug' from underneath them.
This practice reduces the reliability of the RIPE NCC RPKI service.

I can only guess how the RIPE NCC RPKI publication service exactly is
configured, but I imagine there is a 'Signer Server' which writes to
disk the few thousand individual RPKI objects, and separately there is a
RSYNC server (rpki.ripe.net) which serves the files to RSYNC clients.
Transferring sets of inter-related files around is a 'batch' operation;
the pipeline should be set up accordingly.

As such, calling 'rsync' from crontab to populate the rpki.ripe.net
rsync server would likely lead to inconsistent results.

There are (at least) two objectives to keep in mind:

1/ While the Signer software is writing new files out to disk, the
'signer to publisher' replication process should not run, because the
signer isn't finished yet.

2/ While a given RSYNC client is fetching from 'rpki.ripe.net', the
'signer to publisher' replication process should not alter the contents
of the filesystem hierarchy the RSYNC client is fetching from.
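Objective 1/ amounts to mutual exclusion between the signer and the replication job. One possible sketch uses a lock directory, relying on mkdir either creating the directory or failing. The lock path and the replicate helper are hypothetical, and nothing here is known about how the RIPE NCC signer actually signals completion.

```shell
#!/bin/sh
# Hypothetical coordination between a signer and a replication job using a
# lock directory: mkdir is atomic, so exactly one caller can create it.
set -eu

work=$(mktemp -d)
lock="$work/signer.lock"

replicate() {
    if mkdir "$lock" 2>/dev/null; then
        echo "replicating a consistent set"
        rmdir "$lock"
        return 0
    fi
    echo "signer busy, skipping this run"
    return 1
}

mkdir "$lock"                       # signer starts writing a batch
replicate && first=ran || first=skipped
rmdir "$lock"                       # signer has finished the batch
replicate && second=ran || second=skipped

echo "$first $second"
```

Objective 2/ is not covered by a lock like this, since rsync clients cannot be asked to take one; it needs the served tree itself to stay untouched, for example via a symlink switch.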

To satisfy the above two conditions, I suspect a number of solutions
are available:

A) take ownership and control and only launch subsequent pipeline steps
when the Signer is done signing the latest requests. After a consistent
set of files has been written to disk, only then copy, stage, and switch
to the new directory contents using a symlink swap (allowing already
connected RSYNC clients to complete their fetch).

B) Use a load balancer to direct new RSYNC clients to a RSYNC server
containing the latest (consistent) set of files.

C) Make the RSYNC service pull from the latest (allegedly consistent)
RRDP snapshot.xml file, then move newly connected clients to the new
content using either the symlink [0] trick or by orchestrating
draining/onramping via a load balancer like haproxy.

There is a wealth of knowledge available in this working group on how
POSIX-like systems work, how ISP operations work, and how the RPKI works. I
hope RIPE NCC can leverage that.

Kind regards,

Job

[0]: http://sobornost.net/~job/rpki-rsync-move.sh.txt



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread Nick Hilliard

Erik Bais wrote on 12/04/2021 11:41:
> This looks to be a 3 line bash script fix on a cronjob …  So why isn’t
> this just tested on a testbed and updated before the end of the week ?


cache coherency and transaction guarantees are problems which are known 
to be difficult to solve.  Long term, the RIPE NCC probably needs to aim 
for the degree of transaction read consistency that might be provided by 
an ACID database or something equivalent, and that will take time and 
probably a migration away from a filesystem-based data store.


So the question is what to do in the interim?  The bash script that Job 
posted may help to reduce some of the race conditions that are being 
observed, but it's unlikely to guarantee transaction consistency in a 
deep sense.  Does the RIPE NCC have an opinion on whether the approach 
used in this script would help with some of the problems that are being 
seen and if so, would there be any significant negative effects 
associated with implementing it as an intermediate patch-up?


Nick



Re: [routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread Erik Bais
Hi Nathalie,

Thank you for addressing this RIPE NCC infra issue.

It looks like the RIPE NCC RPKI infra for rsync is updating the ROAs in the
same directory while the RPKI clients that use rsync are still fetching the
files.

This is common knowledge on using rsync  .. but with the use of MD5 or crypto
checks on the files, that becomes an issue.

It is best practice to dump the files in a specific (timestamp)directory ..
symlink the download link to the timestamp directory ..  and keep things 
read-only once the stuff is written on disk.. so there are no improper updates 
that would cause crypto or MD5 hash issues.
Once there is new RPKI data, create a new timestamp dir, move the symlink to 
the new location and be done with it.

As a RPKI-client user that is happy with the security within the software, that 
starts to barf over improper RPKI data .. as one should hope it would ..  I 
would like to ask the NCC to update their rsync method quicker than ‘perhaps in 
6 months … ‘

This looks to be a 3 line bash script fix on a cronjob …  So why isn’t this 
just tested on a testbed and updated before the end of the week ?

Regards,
Erik Bais

From: routing-wg  on behalf of Nathalie Trenaman 

Date: Monday 12 April 2021 at 12:04
To: "routing-wg@ripe.net" 
Subject: [routing-wg] Issue affecting rsync RPKI repository fetching

Dear colleagues,

We have been made aware of an issue that may affect some users who use RPKI 
relying party (RP) software that uses rsync. Please note that by default, only 
rpki-client reads from rsync; the rest of the RPs prefer the RPKI Repository 
Delta Protocol (RRDP).

The issue appears to create some inconsistency between the RPKI repository and 
rsync clients. In more detail, an RRDP client reads a complete state for a 
specific “serial” from the repository. In contrast, an rsync client syncs the 
state in multiple steps. First, a list of files is copied, followed by updates 
for files that have been copied. In an affected scenario, a certificate is 
added and one of the other files (the manifest) is modified after the file list 
has been sent. By reading the new manifest, but not copying the new file (it is 
not on the rsync file list), the repository copied by the rsync client contains 
an invalid manifest (a file is missing) and the RP software rejects it.

We are planning on changing our publication infrastructure and using the same 
"revisions" RRDP uses for the content of the rsync repository. Rsync is an 
officially supported distribution protocol for RPKI repository data, and it is 
one of our highest priorities that the data published is atomic and consistent. 
We plan to release the new publication infrastructure in Q2/Q3 2021. Part of 
this work will mitigate these non-repeatable-reads for clients using rsync.

We will update you on our progress during RIPE 82, taking place online from 
17-21 May 2021.

Kind regards,

Nathalie Trenaman
RIPE NCC




[routing-wg] Issue affecting rsync RPKI repository fetching

2021-04-12 Thread Nathalie Trenaman
Dear colleagues,

We have been made aware of an issue that may affect some users who use RPKI 
relying party (RP) software that uses rsync. Please note that by default, only 
rpki-client reads from rsync; the rest of the RPs prefer the RPKI Repository 
Delta Protocol (RRDP). 

The issue appears to create some inconsistency between the RPKI repository and 
rsync clients. In more detail, an RRDP client reads a complete state for a 
specific “serial” from the repository. In contrast, an rsync client syncs the 
state in multiple steps. First, a list of files is copied, followed by updates 
for files that have been copied. In an affected scenario, a certificate is 
added and one of the other files (the manifest) is modified after the file list 
has been sent. By reading the new manifest, but not copying the new file (it is 
not on the rsync file list), the repository copied by the rsync client contains 
an invalid manifest (a file is missing) and the RP software rejects it.

We are planning on changing our publication infrastructure and using the same 
"revisions" RRDP uses for the content of the rsync repository. Rsync is an 
officially supported distribution protocol for RPKI repository data, and it is 
one of our highest priorities that the data published is atomic and consistent. 
We plan to release the new publication infrastructure in Q2/Q3 2021. Part of 
this work will mitigate these non-repeatable-reads for clients using rsync.

We will update you on our progress during RIPE 82, taking place online from 
17-21 May 2021.

Kind regards,

Nathalie Trenaman
RIPE NCC