> As it happens, I do plan an improvement which I think would address both
> issues: to start using Puppeteer to load each link URL, check for 404s, take
> a screenshot, extract the page title, and to record the URL after any
> redirections.
>
> If we extracted and saved the document title at the same time, it might be
> interesting to explore using that as the displayed title in the bookmarks
> aggregator, with the URL in a subtitle. This might be more intuitive to
> readers as URLs alone don't convey that much information of interest to end
> users.

Exactly, I should have added extracting the favicon too.

The holdup is that at the moment the entire system runs from cold each time, fetching all the link wikis and building the static site. To make the Puppeteer stuff practical, we’d have to be able to run just the newly added sites through Puppeteer, and retain previously recorded screenshots etc. in the repo.

The scraper is pretty dumb and simple right now. One thing we might want to look at is more ambitious existing libraries that we can use.

Best wishes

Jeremy

> Cheers,
> Saq

--
You received this message because you are subscribed to the Google Groups "TiddlyWiki" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/tiddlywiki/CCDFFCB2-FCB3-475A-BF9A-BE4F0B5EDE92%40gmail.com.
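
To make the incremental idea above a bit more concrete, here is a rough sketch of how a build might skip links that already have recorded results in the repo. Everything in it is an assumption for illustration: the `LinkRecord` fields, the `Manifest` shape, and the function names are hypothetical and not part of the current scraper. The capture step is passed in as a function so the sketch does not depend on any particular browser library; in practice it would presumably wrap the Puppeteer page load.

```typescript
// Hypothetical record kept per link; the field names are illustrative only.
interface LinkRecord {
  url: string;        // URL as listed in the link wiki
  finalUrl: string;   // URL after any redirections
  status: number;     // HTTP status, e.g. 404 for a dead link
  title: string;      // extracted document title
  screenshot: string; // repo-relative path of the saved screenshot
}

// Assumed manifest committed to the repo, keyed by the original URL.
type Manifest = Record<string, LinkRecord>;

// The capture step is injected; a real one would drive a headless browser.
type Capture = (url: string) => Promise<LinkRecord>;

// Return only the URLs with no record yet, de-duplicated, in input order.
function selectNewLinks(manifest: Manifest, urls: string[]): string[] {
  const seen = new Set<string>();
  const fresh: string[] = [];
  for (const url of urls) {
    if (seen.has(url)) continue;
    seen.add(url);
    if (!Object.prototype.hasOwnProperty.call(manifest, url)) {
      fresh.push(url);
    }
  }
  return fresh;
}

// Capture only the new links and merge the results into a copy of the
// manifest, so previously recorded entries are retained untouched.
async function updateManifest(
  manifest: Manifest,
  urls: string[],
  capture: Capture
): Promise<Manifest> {
  const next: Manifest = { ...manifest };
  for (const url of selectNewLinks(manifest, urls)) {
    next[url] = await capture(url);
  }
  return next;
}
```

With something like this, a cold run would still fetch all the link wikis, but only URLs missing from the manifest would go through the expensive capture step. Rechecking old links for new 404s would need a separate policy, which this sketch does not attempt.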

