Re: [PDB Tech] new rate limiting mechanism is too strict

2022-05-17 Thread Chris Caputo
On Tue, 17 May 2022, Arnold Nipper wrote:
> On 17.05.2022 18:54, Chris Caputo wrote:
> > Highlights for all client developers:
> 
> I would add: instead of querying PDB for each ASN one by one, use the
> asn__in=$list_of_ASN_separated_by_commata feature.

Ooh 200 IQ maneuver there. Genius!
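
For client developers, the asn__in suggestion above could be sketched roughly
as follows. This is a minimal illustration, not an official client: the base
URL, the helper names (chunked, build_net_urls) and the batch size of 100 are
assumptions made for the example; only the asn__in filter itself comes from
the message above.

```python
from urllib.parse import urlencode

# Assumed base URL for the public API; adjust if your deployment differs.
API_BASE = "https://www.peeringdb.com/api"


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_net_urls(asns, batch_size=100):
    """Build one /net URL per batch of ASNs instead of one URL per ASN.

    The batch size is a hypothetical value chosen to keep URLs short.
    """
    urls = []
    for batch in chunked(sorted(set(asns)), batch_size):
        # safe="," keeps the commas readable instead of percent-encoding them.
        query = urlencode({"asn__in": ",".join(str(a) for a in batch)}, safe=",")
        urls.append(f"{API_BASE}/net?{query}")
    return urls
```

With this shape, a list of a few hundred ASNs turns into a handful of
requests rather than a few hundred.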

Thanks,
Chris
___
Pdb-tech mailing list
Pdb-tech@lists.peeringdb.com
https://lists.peeringdb.com/cgi-bin/mailman/listinfo/pdb-tech


Re: [PDB Tech] new rate limiting mechanism is too strict

2022-05-17 Thread Chris Caputo
All,

I am behind the throttling rollout in the last 24 hours, and have worked 
with Theo to loosen things up for now. I've also reached out to pierky re 
changes requested for arouteserver and will endeavor to delay resumption 
of the same throttling until after arouteserver users have had reasonable 
time to upgrade.

Highlights for all client developers:

 - Implement support for PeeringDB API keys:

 https://docs.peeringdb.com/howto/api_keys/

   The idea is that users authenticating with API keys will be throttled
   less.

 - Add a delay in between queries that is randomly between 2 and 2.5 
   seconds, to reduce thundering herd. This delay will mean a client 
   queries PeeringDB at most 30 hits per minute, which will be unthrottled 
   if they are authenticated using API keys and not making identical 
   requests.

 - Highly preferred over separate queries: if you don't need non-public
   contact info from PeeringDB, implement client-side caching with
   peeringdb-py (http://peeringdb.github.io/peeringdb-py/). Doing so
   lets you query the heck out of a local sqlite (or whatever) database
   instead of the API. The start time of a peeringdb-py run should be
   randomized per the docs
   (http://peeringdb.github.io/peeringdb-py/cli/). At the SeattleIX we use
   peeringdb-py and here is what the once per hour update looks like for
   all of PeeringDB:

[17/May/2022:15:40:09 +] "GET /api/org?since=1652794724=0 HTTP/1.1" 
200 392 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.423
[17/May/2022:15:40:10 +] "GET /api/fac?since=1652773361=0 HTTP/1.1" 
200 24 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.409
[17/May/2022:15:40:10 +] "GET /api/net?since=1652796557=0 HTTP/1.1" 
200 1695 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.426
[17/May/2022:15:40:11 +] "GET /api/ix?since=1652397370=0 HTTP/1.1" 
200 24 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.397
[17/May/2022:15:40:11 +] "GET /api/ixfac?since=1652763759=0 HTTP/1.1" 
200 24 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.414
[17/May/2022:15:40:12 +] "GET /api/ixlan?since=1652781160=0 HTTP/1.1" 
200 24 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.399
[17/May/2022:15:40:12 +] "GET /api/ixpfx?since=1652429334=0 HTTP/1.1" 
200 24 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.408
[17/May/2022:15:40:13 +] "GET /api/netfac?since=1652790428=0 
HTTP/1.1" 200 318 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.553
[17/May/2022:15:40:14 +] "GET /api/netixlan?since=1652796556=0 
HTTP/1.1" 200 399 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.590
[17/May/2022:15:40:14 +] "GET /api/poc?since=1652785835=0 HTTP/1.1" 
200 24 "-" "PeeringDB/1.2.1.1 django_peeringdb/2.13.0" 0.640

   It is fast because, as I understand it, django serializes PeeringDB 
   changes, and the timestamp (since last update) results in only the 
   changes being delivered.
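
For clients that still query the API directly, the first two highlights
(API keys plus a random 2 to 2.5 second delay between queries) could look
something like the sketch below. The function names and the injectable
sleep/opener parameters are hypothetical, chosen only for illustration. The
"Api-Key" Authorization header format is what I understand the API keys
howto to describe; please verify it against the docs before relying on it.

```python
import random
import time
import urllib.request


def polite_delay(low=2.0, high=2.5):
    """Pick a random pause in seconds, per the 2 to 2.5 second guidance."""
    return random.uniform(low, high)


def build_request(url, api_key=None):
    """Create a request, adding the API key header when a key is given."""
    req = urllib.request.Request(url)
    if api_key:
        # Assumed header format; check the API keys howto.
        req.add_header("Authorization", f"Api-Key {api_key}")
    return req


def fetch_all(urls, api_key=None, sleep=time.sleep,
              opener=urllib.request.urlopen):
    """Fetch each URL in turn, pausing randomly between requests."""
    bodies = []
    for index, url in enumerate(urls):
        if index:
            # No pause before the first request, only between requests.
            sleep(polite_delay())
        with opener(build_request(url, api_key)) as resp:
            bodies.append(resp.read())
    return bodies
```

The sleep and opener arguments exist only so the pacing can be exercised
without touching the network; a real client would leave them at their
defaults.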

Finally: My apology to those disrupted by this. Please feel free to reach 
out to me with any questions or concerns.

Thanks,
Chris


[PDB Tech] new rate limiting mechanism is too strict

2022-05-17 Thread Theo de Raadt
At YYCIX, we run arouteserver, which polls peeringdb.  On May 15 (I think)
we started seeing failures to download peeringdb records, which has resulted
in our routeserver configuration not being updated as usual.

I've become aware that the API has a new rate filtering mechanism, for
non-APIKEY accesses.

Well, arouteserver doesn't do APIKEY, and I don't see how the author of
arouteserver would have received any notice that this was suddenly
mandatory, nor all the arouteserver users.

I think peeringdb should have looked more carefully into the no-APIKEY
accesses to determine what downstream effects might occur from this change.

Testing further manually, I observe some really crazy behaviour.

Using a regular browser, I can keep reloading the same record over and over
without hitting any limit.

Using a command-line client (curl or OpenBSD ftp(1)), I hit some extremely
strict limits very quickly.

On the 11th lookup, I received this, and it started counting down:

{"message": "Request was throttled. Expected available in 59 minutes.
Authenticate for less restrictions. For more information:
https://docs.peeringdb.com/howto/api_keys/", "meta": {"error": "Too Many
Requests"}}

All requests are blocked for an hour.

I paused, and tried a new lookup every 5 minutes.  Half an hour later it
allowed me to retrieve 1 record, but then continued the countdown:

{"message": "Request was throttled. Expected available in 17 minutes.
Authenticate for less restrictions. For more information:
https://docs.peeringdb.com/howto/api_keys/", "meta": {"error": "Too Many
Requests"}}
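
A client that wants to back off instead of hammering the API could pull the
wait time out of a response like the ones above. This is only a sketch: the
function name is hypothetical, and the regular expression is based solely on
the two messages quoted here (the "second" and "hour" units are guesses; only
"minutes" has been observed).

```python
import json
import re

# Pattern inferred from the throttle messages quoted above; units other
# than "minutes" are an assumption.
_WAIT_RE = re.compile(r"Expected available in (\d+) (second|minute|hour)s?")
_UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def throttle_wait_seconds(body):
    """Return the advertised wait in seconds, or None if none is found."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None  # Not JSON, so not a throttle message we understand.
    if not isinstance(payload, dict):
        return None
    match = _WAIT_RE.search(str(payload.get("message", "")))
    if not match:
        return None
    return int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
```

A caller could sleep for the returned number of seconds (plus some jitter)
before retrying, and fall back to a fixed delay when None comes back.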

If the new limit is 10 records an hour, I mean you could really just shut
down the service entirely, there's really no difference.

I understand using APIKEY is the new way, but the old way has effectively been
disabled without notice.

But also, why does this work in the browser?  I can reload hundreds of
times.  Is the official browser User-agent being considered exempt from
the rule?  If so I don't understand what the protection plan is.. maybe
we should all change our commandline tools to declare they are Chrome?



