Hello guys! I’m using wget to mirror https://releases.hashicorp.com, but I don’t want a full mirror. I only want to mirror certain “subfolders” of the site (e.g. terraform, consul). I do this with the following command:
wget -N -r -l inf --no-parent https://releases.hashicorp.com/consul/

The problem is that at first I get the following result:

******
$ wget -N -r -l inf --no-parent https://releases.hashicorp.com/consul/
--2022-05-16 16:28:18-- https://releases.hashicorp.com/consul/
Resolving releases.hashicorp.com (releases.hashicorp.com)... 151.101.193.183, 151.101.129.183, 151.101.65.183, ...
Connecting to releases.hashicorp.com (releases.hashicorp.com)|151.101.193.183|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Connection: keep-alive
  Content-Type: text/html
  ETag: TvHhjlva/+c=
  X-Api-Version: 0.1.2
  X-Request-Id: 8a74122b-c155-88ff-511e-8d0d93155b2e
  X-Amz-Cf-Pop: AMS50-C1
  X-Amz-Cf-Id: Pdzhym0uq3XXjsZ_PxS8xvkntM0IsSCQtakE2EvgwC0v0tYMPJwCzQ==
  Age: 61398
  Access-Control-Allow-Origin: *
  Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
  X-XSS-Protection: 1; mode=block
  X-Content-Type-Options: nosniff
  X-Frame-Options: sameorigin
  Accept-Ranges: bytes
  Date: Mon, 16 May 2022 16:28:18 GMT
  Vary: Origin, Accept-Encoding
  transfer-encoding: chunked
Length: unspecified [text/html]
Saving to: ‘releases.hashicorp.com/consul/index.html’
releases.hashicorp.com/consul/index.html [ <=> ] 19.51K --.-KB/s in 0s
Last-modified header missing -- time-stamps turned off.
2022-05-16 16:28:18 (45.4 MB/s) - ‘releases.hashicorp.com/consul/index.html’ saved [19979]
******

We can see that the content at https://releases.hashicorp.com/consul/ gets saved locally as releases.hashicorp.com/consul/index.html, which is fine and exactly what I want. But when it comes to the first href in releases.hashicorp.com/consul/index.html, I get the following:

******
--2022-05-16 16:30:21-- https://releases.hashicorp.com/consul/1.12.0
Reusing existing connection to releases.hashicorp.com:443.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Connection: keep-alive
  Content-Type: text/html
  X-Api-Version: 0.1.2
  X-Request-Id: ca8c47f5-2e54-b09a-adde-6e8cf5e92d45
  ETag: 8p+ndCqEoYc=
  X-Amz-Cf-Pop: AMS50-C1
  X-Amz-Cf-Id: qA5XZEv2hZReEYoZD29GRsD_M6u76VLv6g-usgKJAzTCQm_SyWVFRA==
  Age: 27384
  Access-Control-Allow-Origin: *
  Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
  X-XSS-Protection: 1; mode=block
  X-Content-Type-Options: nosniff
  X-Frame-Options: sameorigin
  Accept-Ranges: bytes
  Date: Mon, 16 May 2022 16:30:21 GMT
  Vary: Origin, Accept-Encoding
  transfer-encoding: chunked
Length: unspecified [text/html]
releases.hashicorp.com/consul/1.12.0: Is a directory
Cannot write to ‘releases.hashicorp.com/consul/1.12.0’ (Success).
******

We can see that wget tries to save the content at https://releases.hashicorp.com/consul/1.12.0 to releases.hashicorp.com/consul/1.12.0 rather than to releases.hashicorp.com/consul/1.12.0/index.html, as I would prefer. Since releases.hashicorp.com/consul/1.12.0 already exists locally as a directory, the write fails.

The mind-blowing part is that the same invocation worked fine for me just a couple of weeks ago: it produced an index.html not only at the root but at the leaves as well. Something has definitely changed on the server, but how can I address the issue? As it works currently, I have no way to maintain my mirror properly, because without these index.html files I simply can’t offer the mirror to my users.
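
For what it’s worth, here is a small sketch of how I understand wget’s default naming in recursive mode, which seems to match the two logs above. The function name local_path is just my own helper for illustration, not something wget provides, and it deliberately ignores query strings and options such as --default-page or -nH:

```shell
#!/bin/sh
# Hypothetical helper (not part of wget): approximates the local path that
# `wget -r` appears to choose for a URL with default settings.
# Assumption: no query string, no -nH / --cut-dirs / --default-page options.
local_path() {
    url=$1
    path=${url#*://}    # drop the scheme, keep host + path
    case $path in
        # Trailing slash: treated as a directory, page saved as index.html
        */) printf '%s\n' "${path}index.html" ;;
        # No trailing slash: last component is used as a plain file name
        *)  printf '%s\n' "$path" ;;
    esac
}

local_path https://releases.hashicorp.com/consul/
local_path https://releases.hashicorp.com/consul/1.12.0
local_path https://releases.hashicorp.com/consul/1.12.0/
```

If this is right, only the second URL (no trailing slash) maps to a path that is already a directory in my mirror, which would explain the “Is a directory” error. The third form is what I would need the links to look like.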