Hi All,

I think there is a memory error in the libcurl connection code that typically happens when libcurl reads big chunks of data. This potentially affects all code that uses url() with the libcurl download method, which is the default in most builds. In practice it tends to happen more often with HTTP/2 and when the connection is wrapped in a gzcon(). macOS Catalina has a libcurl build with HTTP/2 support, so many users who upgraded macOS are starting to see this.
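To make the suspected failure mode concrete, here is a minimal, self-contained sketch of a growable receive buffer with a separate read cursor. It is not the R source: the RecvBuf struct and the recvbuf_* functions are hypothetical names made up for illustration, and the growth policy is simplified. It only shows why a cursor into a buffer has to be re-pointed after realloc(), which is what the patch at the end of this message does for ctxt->current.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a connection's receive buffer. */
typedef struct {
    char  *buf;      /* start of the allocation */
    char  *current;  /* next unread byte; must always point into buf */
    size_t filled;   /* number of unread bytes starting at current */
    size_t bufsize;  /* size of the allocation */
} RecvBuf;

/* Returns 0 on success, -1 on allocation failure. */
int recvbuf_init(RecvBuf *rb, size_t size)
{
    rb->buf = malloc(size);
    if (!rb->buf) return -1;
    rb->current = rb->buf;
    rb->filled = 0;
    rb->bufsize = size;
    return 0;
}

/* Append `add` bytes of incoming data; returns 0 on success, -1 on failure. */
int recvbuf_append(RecvBuf *rb, const void *data, size_t add)
{
    /* Move any unread data to the front; the regions can overlap. */
    if (rb->filled) memmove(rb->buf, rb->current, rb->filled);
    rb->current = rb->buf;

    if (rb->filled + add > rb->bufsize) {
        size_t newsize = rb->bufsize ? rb->bufsize : 1;
        while (newsize < rb->filled + add) newsize *= 2;
        char *newbuf = realloc(rb->buf, newsize);
        if (!newbuf) return -1;
        rb->buf = newbuf;
        rb->bufsize = newsize;
        /* realloc() may have moved the block. Without this line,
           `current` can still point into the old, freed allocation. */
        rb->current = rb->buf;
    }

    memcpy(rb->buf + rb->filled, data, add);
    rb->filled += add;
    return 0;
}

/* Copy up to n unread bytes into out and advance the cursor. */
size_t recvbuf_read(RecvBuf *rb, void *out, size_t n)
{
    if (n > rb->filled) n = rb->filled;
    memcpy(out, rb->current, n);
    rb->current += n;
    rb->filled -= n;
    return n;
}

void recvbuf_free(RecvBuf *rb)
{
    free(rb->buf);
    rb->buf = rb->current = NULL;
    rb->filled = rb->bufsize = 0;
}
```

If the re-pointing line inside the growth branch is dropped, small appends that fit in the existing buffer keep working, and only an append large enough to trigger realloc() leaves the cursor dangling. That would match a bug that shows up mainly with big chunks and only some of the time, since realloc() does not always move the block.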
The workaround is to avoid using url(), if you can. If you need an HTTP stream, you can use curl::curl(), which is a drop-in replacement.

The easiest way to reproduce this is with a libcurl build that has HTTP/2 support and a server that supports HTTP/2 as well, e.g. the cloud mirror:

------------------------------------------------
~ # R --slave -e 'options(internet.info = 0); foo <- readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds")))'
* Trying 13.33.54.118:443...
* TCP_NODELAY set
* Connected to cran.rstudio.com (13.33.54.118) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=cran.rstudio.com
* start date: Jul 24 00:00:00 2019 GMT
* expire date: Aug 24 12:00:00 2020 GMT
* subjectAltName: host "cran.rstudio.com" matched cert's "cran.rstudio.com"
* issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
* SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x56303c2910e0)
> GET /src/contrib/Meta/archive.rds HTTP/2
Host: cran.rstudio.com
User-Agent: R (3.4.4 x86_64-pc-linux-gnu x86_64 linux-gnu)
Accept: */*

* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
< HTTP/2 200
< content-length: 2483432
< date: Wed, 22 Jan 2020 21:22:04 GMT
< server: Apache/2.4.39 (Unix)
< last-modified: Wed, 22 Jan 2020 17:10:22 GMT
< etag: "25e4e8-59cbd998a0360"
< accept-ranges: bytes
< cache-control: max-age=1800
< expires: Wed, 22 Jan 2020 21:52:04 GMT
< x-cache: Hit from cloudfront
< via: 1.1 6cbe48f9f9ff0c768f29d83804f75d4c.cloudfront.net (CloudFront)
< x-amz-cf-pop: MAN50-C1
< x-amz-cf-id: WwCQVQz9g8ZP6Az4m4n__h7aUW6vwlg0-AkiCv_DnVfGe10bzaFtfg==
< age: 960
<
* 85 data bytes written
Error in readRDS(gzcon(url("https://cran.rstudio.com/src/contrib/Meta/archive.rds"))) :
  reference index out of range
* stopped the pause stream!
* Connection #0 to host cran.rstudio.com left intact
Execution halted
------------------------------------------------

Sometimes you get a crash, sometimes a corrupt stream, etc. Sometimes it actually works.

It seems that the fix is simply this:

------------------------------------
--- src/modules/internet/libcurl.c~
+++ src/modules/internet/libcurl.c
@@ -762,6 +762,7 @@
 	    void *newbuf = realloc(ctxt->buf, newbufsize);
 	    if (!newbuf) error("Failure in re-allocation in rcvData");
 	    ctxt->buf = newbuf; ctxt->bufsize = newbufsize;
+	    ctxt->current = ctxt->buf;
 	}

 	memcpy(ctxt->buf + ctxt->filled, ptr, add);
------------------------------------

Best,
Gabor

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel