On Fri, Dec 27, 2019 at 10:49 PM Guilherme Janczak <janczak.guilhe...@gmail.com> wrote:
>
> On Thu, 26 Dec 2019 16:13:33 +0000
> "goleo ." <goleo...@gmail.com> wrote:
>
> > I was wondering how much space distfiles on "ftp" take, so because
> > I couldn't see that in my web browser clearly, I downloaded the page
> > https://ftp.openbsd.org/pub/OpenBSD/distfiles/ as distfiles.txt
>
> With wget, you can download the HTML of a web page, and also recurse
> into links within it.
>
> $ wget -r -l 0 -A '*.html' --no-parent -O everything.html https://ftp.openbsd.org/pub/OpenBSD/distfiles/
>
> This command recurses into an infinite number of links without going up
> in the hierarchy and into the parent directory, downloads only other
> .html files (from which more links can be acquired), and appends
> everything to an "everything.html" file.
>
> After a few minutes running and just ~1.7MiB of HTML downloaded, it
> tried to recurse into a lot of non-existing directories, so I cut it
> short there. The figure may not be perfect.
>
> $ grep -E '[0-9]$' everything.html | sed 's|.* \([0-9]*\)$|\1|' | awk '{sum+=$1} END{print sum / 1024 / 1024}'
> 65629
>
>
> The sum of all filesizes, which are listed in kebibytes, divided by
> 1024^2, to turn it into gibibytes, returns 65629 gibibytes or about
> 65 tebibytes.
> This number seems a little absurd, I'm not sure if I made a mistake.
> It does not seem completely implausible either however, the tree
> does have files dating all the way back to 1990.
> https://ftp.openbsd.org/pub/OpenBSD/distfiles/ja-fonts/

File sizes in the listing are in plain bytes, not kibibytes, so dividing the sum by 1024^2 gives mebibytes: your result is 65629 MiB, or about 64 GiB. Still nice, I didn't know it was so easy to fetch the contents of subdirectories :)
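
To make the units explicit, here is a rough sketch of the same summation with the conversions spelled out. The function name sum_sizes and the sample entries are made up for illustration; it only assumes that each listing line ends in a size given in bytes, as on the distfiles index.

```sh
# Sketch only: sum_sizes is a hypothetical helper, not part of any tool.
# It assumes each input line ends in a file size expressed in bytes.
sum_sizes() {
    awk '$NF ~ /^[0-9]+$/ { sum += $NF }   # add up the trailing number of each line
         END {
             mib = sum / 1024 / 1024       # bytes -> MiB
             printf "%.0f bytes = %.1f MiB = %.2f GiB\n", sum, mib, mib / 1024
         }'
}

# Two made-up entries of 1 GiB and 512 MiB:
printf 'a.tgz 1073741824\nb.tgz 536870912\n' | sum_sizes
```

Feeding it the real listing would look something like `grep -E '[0-9]$' everything.html | sum_sizes`, with the same caveat as above that the download was cut short.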