On Sunday, 7 February 2016 at 21:59:00 UTC, Andrei Alexandrescu wrote:
Dpaste currently does not expire pastes by default. I was thinking it would be nice if it saved them in the Wayback Machine such that they are archived redundantly.

I'm not sure what's the way to do it - probably linking the newly-generated paste URLs from a page that the Wayback Machine already knows of.

I just saved this by hand: http://dpaste.dzfl.pl/2012caf872ec (when the WM does not see a link that is searched for, it offers the option to archive it), obtaining https://web.archive.org/web/20160207215546/http://dpaste.dzfl.pl/2012caf872ec.


Thoughts?

You want it in Wayback? Sounds like you need some WARC [0]. Since anyone can upload to IA (using a nice S3-like API, even [1]), this should be pretty uncomplicated. If you can get a list of all the paste URLs, you can use wget [2] to build the WARC fairly trivially [3]. Then I'd suggest getting a dlang account and making an item [4] out of it. Just make sure it's set to mediatype:web and it should get ingested by Wayback.
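To make the wget step concrete, here is a minimal sketch in Python of building a WARC from a list of paste URLs. It only assembles and runs a wget command line; the function names, the urls.txt file name, and the "dpaste" WARC base name are hypothetical choices for illustration, and it assumes a wget build with WARC support (1.14 or later).

```python
import subprocess
from pathlib import Path


def build_wget_warc_command(url_list_path, warc_basename):
    """Return the wget argv that fetches every listed URL into one WARC."""
    return [
        "wget",
        f"--input-file={url_list_path}",  # one URL per line, see [2]
        f"--warc-file={warc_basename}",   # wget appends .warc.gz itself
        "--warc-cdx",                     # also write a CDX index file
        "--delete-after",                 # keep only the WARC, not the pages
        "--no-verbose",
    ]


def archive_pastes(urls, workdir, warc_basename="dpaste"):
    """Write the URL list into workdir, run wget there, return the WARC path."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    url_list = workdir / "urls.txt"  # hypothetical file name
    url_list.write_text("\n".join(urls) + "\n", encoding="utf-8")
    cmd = build_wget_warc_command(url_list.name, warc_basename)
    # check=True raises if wget exits non-zero, which also happens when
    # any single URL fails to download.
    subprocess.run(cmd, cwd=workdir, check=True)
    return workdir / f"{warc_basename}.warc.gz"
```

Calling archive_pastes(["http://dpaste.dzfl.pl/2012caf872ec"], "out") would leave out/dpaste.warc.gz behind, assuming wget is installed and the paste is reachable.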

After that? Generate a WARC when a paste is made and use the dlang S3 keys to add it to the previous item (or maybe just do it daily or weekly so as to not stress the derive queue too much). I'm pretty sure that's all that's needed.
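The upload step could look roughly like the sketch below, which follows the header conventions described in [1] as I understand them: a PUT to s3.us.archive.org with a "LOW accesskey:secret" authorization header, plus x-archive-meta-* headers for item metadata. The function name and the item identifier used in the usage note are hypothetical, and the request is only constructed here, not sent.

```python
import urllib.request


def build_ia_upload_request(item, filename, payload, access_key, secret_key,
                            create_item=False):
    """Build (but do not send) a PUT request adding one file to an IA item."""
    headers = {
        "authorization": f"LOW {access_key}:{secret_key}",
    }
    if create_item:
        # Only needed the first time: create the item and mark it as web
        # content so that it is a candidate for Wayback ingestion.
        headers["x-amz-auto-make-bucket"] = "1"
        headers["x-archive-meta-mediatype"] = "web"
    url = f"https://s3.us.archive.org/{item}/{filename}"
    return urllib.request.Request(url, data=payload, headers=headers,
                                  method="PUT")
```

For example, build_ia_upload_request("dlang-dpaste-warcs", "dpaste.warc.gz", warc_bytes, key, secret) gives a request that can be passed to urllib.request.urlopen. Batching uploads daily or weekly, as suggested above, just means calling this less often with a larger WARC.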

-Wyatt

[0] http://fileformats.archiveteam.org/wiki/WARC
[1] https://archive.org/help/abouts3.txt
[2] wget option: -i, --input-file=FILE download URLs found in local or external FILE.
[3] http://www.archiveteam.org/index.php?title=Wget#Creating_WARC_with_wget
[4] https://blog.archive.org/2011/03/31/how-archive-org-items-are-structured/
