The wget approach is a nice solution.

But if you want to use curl, you can:

curl -b "human=1" -L -s https://beta.lmfdb.org/data/riemann-zeta-zeros/ |
head -20
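
To save one data file rather than list the directory, the same cookie can be combined with -o. Below is a rough sketch, not a tested recipe. The helper names (zeros_url, looks_like_html, fetch_zeros) are made up for illustration. The zeros_<n>.dat naming is taken from the file names mentioned in this thread. The HTML check is only a heuristic and assumes the gate page starts with an ordinary <html> or <!doctype html> tag.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of LMFDB or curl.
BASE_URL="https://beta.lmfdb.org/data/riemann-zeta-zeros"

# Build the URL for zeros_<n>.dat (naming taken from the thread).
zeros_url() {
    printf '%s/zeros_%s.dat\n' "$BASE_URL" "$1"
}

# Succeed if the file looks like an HTML page, i.e. we probably received
# the gate page instead of data. Heuristic: assumes an <html> or doctype tag
# appears in the first 512 bytes.
looks_like_html() {
    head -c 512 "$1" | LC_ALL=C grep -qi -e '<!doctype html' -e '<html'
}

# Download one file with the human=1 cookie, following redirects (-L),
# failing on HTTP errors (-f), and rejecting HTML results.
fetch_zeros() {
    out="zeros_$1.dat"
    curl -b "human=1" -L -s -f -o "$out" "$(zeros_url "$1")" || return 1
    if looks_like_html "$out"; then
        echo "$out: got an HTML page instead of data" >&2
        rm -f "$out"
        return 1
    fi
}

# Example: fetch_zeros 14
```

For several files, something like `for n in 14 15 16; do fetch_zeros "$n" || break; sleep 1; done` should work, with the list of numbers replaced by whatever the directory listing actually contains. The sleep is just a courtesy to the server.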

Here's a gist with more detail:

https://gist.github.com/rwcitek/304aed3f2188f0dbf86dacc0085b0959#file-riemann-zeta-zeros-ipynb

Regards,
- Robert

On Sat, Nov 1, 2025 at 8:36 PM Robert Detjens <[email protected]> wrote:

> The redirect page seems to be a browser check that sets a `human=1` cookie,
> which then allows access to the underlying page. The robots.txt on that
> subdomain also blocks bot traffic on the whole site, so wget's robots.txt
> handling has to be disabled too (`-e robots=off`).
>
> This wget command works here:
> wget --header 'Cookie: human=1' -e robots=off \
>   https://beta.lmfdb.org/data/riemann-zeta-zeros/zeros_14.dat
> or all:
> wget --header 'Cookie: human=1' -e robots=off --recursive --no-parent \
>   --reject 'index.*' https://beta.lmfdb.org/data/riemann-zeta-zeros/
>
>
> On Sat, Nov 1, 2025 at 6:48 PM American Citizen <[email protected]> wrote:
>
> > Hello:
> >
> > I can manually download the numeric data from the LMFDB website page
> > https://beta.lmfdb.org/data/riemann-zeta-zeros/ by clicking on one of
> > the zeros_nnnn.dat files and the browser asks if I want to save the file.
> >
> > However, when I attempt to automate this procedure using curl, it fails.
> > (And yes, the reason I am attempting this is that 14,580 dat files exist
> > on this webpage, way beyond the normal human ability to click through
> > them all; I took 2 hours to do 53 files due to the slow download speed.)
> >
> > Here's the curl session logged with the -v option
> >
> > owner@localhost:~> curl -A "Mozilla/5.0 (X11; Linux x86_64; rv:144.0)
> > Gecko/20100101 Firefox/144.0" -L -O
> > "beta.lmfdb.org/data/riemann-zeta-zeros/zeros_14.dat" -v
> >    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
> >                                   Dload  Upload   Total   Spent    Left  Speed
> >    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
> > *   Trying 18.1.37.31:443...
> > * Connected to beta.lmfdb.org (18.1.37.31) port 443 (#0)
> > * ALPN: offers h2,http/1.1
> > } [5 bytes data]
> > * TLSv1.3 (OUT), TLS handshake, Client hello (1):
> > } [512 bytes data]
> > * TLSv1.3 (IN), TLS handshake, Server hello (2):
> > { [122 bytes data]
> > * TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
> > { [25 bytes data]
> > * TLSv1.3 (IN), TLS handshake, Certificate (11):
> > { [2683 bytes data]
> > * TLSv1.3 (IN), TLS handshake, CERT verify (15):
> > { [264 bytes data]
> > * TLSv1.3 (IN), TLS handshake, Finished (20):
> > { [52 bytes data]
> > * TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
> > } [1 bytes data]
> > * TLSv1.3 (OUT), TLS handshake, Finished (20):
> > } [52 bytes data]
> > * SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
> > * ALPN: server accepted http/1.1
> > * Server certificate:
> > *  subject: CN=alpha.lmfdb.org
> > *  start date: Sep  1 16:25:06 2025 GMT
> > *  expire date: Nov 30 16:25:05 2025 GMT
> > *  subjectAltName: host "beta.lmfdb.org" matched cert's "beta.lmfdb.org"
> > *  issuer: C=US; O=Let's Encrypt; CN=R13
> > *  SSL certificate verify ok.
> > * using HTTP/1.1
> > } [5 bytes data]
> >  > GET /data/riemann-zeta-zeros/zeros_14.dat HTTP/1.1
> >  > Host: beta.lmfdb.org
> >  > User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:144.0) Gecko/20100101 Firefox/144.0
> >  > Accept: */*
> >  >
> > { [5 bytes data]
> > * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
> > { [57 bytes data]
> > * TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
> > { [57 bytes data]
> > * old SSL session ID is stale, removing
> > { [5 bytes data]
> > < HTTP/1.1 302 Found
> > < Date: Sun, 02 Nov 2025 01:32:38 GMT
> > < Server: Apache/2.4.52 (Ubuntu)
> > < Location: beta.lmfdb.org/gate.html?gateorig=/data/riemann-zeta-zeros/zeros_14.dat
> > < Cache-Control: max-age=0
> > < Expires: Sun, 02 Nov 2025 01:32:38 GMT
> > < Content-Length: 344
> > < Content-Type: text/html; charset=iso-8859-1
> > <
> > * Ignoring the response-body
> > { [344 bytes data]
> > 100   344  100   344    0     0    807      0 --:--:-- --:--:-- --:--:--   809
> > * Connection #0 to host beta.lmfdb.org left intact
> > * Issue another request to this URL:
> > 'beta.lmfdb.org/gate.html?gateorig=/data/riemann-zeta-zeros/zeros_14.dat'
> > * Found bundle for host: 0x564bce5df2d0 [serially]
> > * Can not multiplex, even if we wanted to
> > * Re-using existing connection #0 with host beta.lmfdb.org
> > } [5 bytes data]
> >  > GET /gate.html?gateorig=/data/riemann-zeta-zeros/zeros_14.dat HTTP/1.1
> >  > Host: beta.lmfdb.org
> >  > User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:144.0) Gecko/20100101 Firefox/144.0
> >  > Accept: */*
> >  >
> > { [5 bytes data]
> > < HTTP/1.1 200 OK
> > < Date: Sun, 02 Nov 2025 01:32:38 GMT
> > < Server: Apache/2.4.52 (Ubuntu)
> > < Last-Modified: Thu, 14 Aug 2025 10:49:47 GMT
> > < ETag: "28a-63c51082bdba9"
> > < Accept-Ranges: bytes
> > < Content-Length: 650
> > < Cache-Control: no-cache, no-store, must-revalidate
> > < Expires: 0
> > < Vary: Accept-Encoding,User-Agent
> > < Pragma: no-cache
> > < Content-Type: text/html
> > <
> > { [650 bytes data]
> > 100   650  100   650    0     0   1224      0 --:--:-- --:--:-- --:--:--  1224
> >
> > NO!!! the zeros_14.dat file is NOT 650 bytes in size! It is at least 57K
> > bytes in size and a binary file too.
> >
> > Apparently I am being redirected, but then land on an html page when
> > using the curl command.
> >
> > LMFDB claims that anyone can download their data; they're not trying to
> > keep this a big secret.
> >
> > How can I fix the curl command and its options, so I can successfully
> > download a selected zeros_nnnn.dat file?
> >
> > Randall
> >
> >
>
