[mailto:CODE4LIB@listserv.nd.edu] On Behalf Of
Wilhelmina Randtke
Sent: Wednesday, January 15, 2014 10:29 AM
To: CODE4LIB@listserv.nd.edu
Subject: Re: [CODE4LIB] archiving web pages
Agreed, don't focus too much on preserving the presentation for an online
newspaper. The text and images
Here is another:
http://wax.lib.harvard.edu/collections/home.do
- Randy
--
Date: Tue, 14 Jan 2014 10:43:18 -0700
From: Robert Sanderson azarot...@gmail.com
Subject: Re: archiving web pages
Here are several to consider:
*
Agreed, don't focus too much on preserving the presentation for an online
newspaper. The text and images are important, but the layout isn't so
important.
-Wilhelmina Randtke
On Tue, Jan 14, 2014 at 10:59 AM, Kyle Banerjee kyle.baner...@gmail.com wrote:
IMO, there are many web archiving
If it's doable, I think preserving the whole enchilada is desirable. For
instance, at my last library, there was a regular assignment where students
needed the print version of old periodicals because they were tasked with
analysing the ads and layouts. Someone might be interested in web layouts
There's always the option of capturing a WARC of the newspaper as the
preservation master for dark storage, and generating PDFs for access via
your CMS. If you're in ContentDM already, then a PDF would be much easier
to use (both on the back and frontends).
The provenance metadata of WARC is too
On Wed, Jan 15, 2014 at 8:52 AM, Andrew Darby darby.li...@gmail.com wrote:
If it's doable, I think preserving the whole enchilada is desirable. For
instance, at my last library, there was a regular assignment where students
needed the print version of old periodicals because they were tasked
+1 to Alex's suggestion to use WARC for the preservation master and
generate PDFs for access.
While I agree with Kyle that it's ultimately the content that's
important and that hypothetical researcher needs are inexhaustible, I do
think there's an advantage to preserving web content in a
IMO, there are many web archiving situations where it is more appropriate
to just focus on the content rather than the manifestation of the content.
Just as you wouldn't expect a 1995 article from the NYT to be displayed as
the website was in 1995 or an article in an online database to actually
Hi Kathryn,
Right now the WARC format is considered the best preservation format for
websites/social media, in terms of digital archives. It is our best guess
right now. It will likely be with us for a long time, because it has
been adopted by most of the major players.
The way I have seen
For what it's worth, the latest wayback code is:
https://github.com/iipc/openwayback
And being developed by the IIPC consortium, rather than just the Internet
Archive alone.
It has many additional features, contributed by other members.
It should be used in preference to the sourceforge
On 1/14/2014 11:48 AM, Kathryn Frederick (Library) wrote:
Hi,
I'm trying to develop a strategy for preserving issues of our school's online
newspaper. Creating a WARC file of the content seems straightforward, but how
will that content fare long-term? Also, how is the WARC served to an end-user?
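For readers unfamiliar with the format: a WARC file is a sequence of records, each a block of named header lines followed by the captured payload. The sketch below builds a single "resource" record by hand, using only the Python standard library, to show what ends up on disk. The function name build_warc_record and the example URL are hypothetical, and a real harvest would normally come from a crawler rather than hand-written records.

```python
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri: str, payload: bytes,
                      content_type: str = "text/html") -> bytes:
    """Return one WARC/1.0 'resource' record as bytes (illustrative only)."""
    # WARC-Date is a UTC timestamp in ISO 8601 form.
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"WARC-Date: {now}",
        f"WARC-Target-URI: {target_uri}",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",
    ]
    # Header lines end with CRLF; a blank line separates headers from payload,
    # and two CRLFs close the record.
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"


# Hypothetical page from a school newspaper site.
record = build_warc_record("http://example.edu/news/", b"<html>hi</html>")
```

Because every record carries its own capture date and target URI, replay tools such as Wayback can index a WARC and serve the captured pages back by URL and date.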
Rob is right on! I included the wrong link, thanks for catching that...
Cheers
Lisa
On Tue, Jan 14, 2014 at 11:04 AM, Robert Sanderson azarot...@gmail.com wrote:
For what it's worth, the latest wayback code is:
https://github.com/iipc/openwayback
And being developed by the IIPC
On Tue, Jan 14, 2014 at 12:08 PM, Francis Kayiwa fkay...@colgate.edu wrote:
If Skidmore has an IR I'd look into adding them into your IR and render
from there (in addition to WARC'ing them)
Francis, I'm confused when you say in addition to WARC'ing them. Wouldn't
you be putting the WARC
Lisa,
Is your local web archive available online? I'd like to see a production
example of a non-Internet Archive instance of Wayback/Open Wayback.
Thanks,
Nathan
On Tue, Jan 14, 2014 at 12:17 PM, L Snider lsni...@gmail.com wrote:
Rob is right on! I included the wrong link, thanks for catching
Hi Nathan,
Nope, unfortunately not... It was done as a test, and at that time we used
the IA-only version.
Cheers
Lisa
On Tue, Jan 14, 2014 at 11:31 AM, Nathan Tallman ntall...@gmail.com wrote:
Lisa,
Is your local web archive available online? I'd like to see a production
example of
Here are several to consider:
* http://www.webarchive.org.uk/wayback/archive/*/http://www.aboutmayfair.co.uk/
* http://webarchive.loc.gov/lcwa0015/*/http://lawprofessors.typepad.com/adminlaw/
* http://www.padi.cat:8080/wayback/*/http://www.ajberga.cat/
* http://vefsafn.is/index.php?page=english
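Three of the links above share one URL shape: the Wayback base path, then a timestamp slot, then the original URL, where "*" in the timestamp slot asks for the list of all captures. A minimal sketch of building such a link follows. The helper name wayback_query_url is hypothetical, and the assumption that a digit timestamp (YYYYMMDDhhmmss) can replace the "*" reflects common Wayback deployments, not anything confirmed in this thread.

```python
def wayback_query_url(base: str, target: str, timestamp: str = "*") -> str:
    """Join a Wayback base path, a timestamp (or '*'), and the archived URL."""
    # Strip any trailing slash so the result has exactly one separator.
    return f"{base.rstrip('/')}/{timestamp}/{target}"


# Rebuilds one of the links listed above.
url = wayback_query_url("http://www.padi.cat:8080/wayback/",
                        "http://www.ajberga.cat/")
```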
Hi-
We actually have implemented the original question above with some shell
scripts[1] for harvesting, and creating SIPs. The SIPs are then ingested
into our Islandora instance with the Web ARChive Solution Pack[2] as
AIPs. DIPs are also available via our local Wayback instance[3], and on
Kathryn,
When you write strategy do you mean a technology solution or a preservation
strategy, one component of which is the technology implementation of said
strategy? If it's a preservation strategy for your school's online (web)
content - so archival records - see what the University of
On 1/14/2014 12:26 PM, Nathan Tallman wrote:
On Tue, Jan 14, 2014 at 12:08 PM, Francis Kayiwa fkay...@colgate.edu wrote:
If Skidmore has an IR I'd look into adding them into your IR and render
from there (in addition to WARC'ing them)
Francis, I'm confused when you say in addition to
Thanks for the thoughtful responses. We've been actively digitizing our print
paper (which ceased publication in 2011) and I was thinking of this as an
extension of that effort. Right now, I think capturing a monthly WARC file of
the site is definitely a good idea no matter what. But beyond
As an archivist, I don't see any problem using a PDF. Technically it should
be a PDF/A, but realistically it is usually a PDF.
I have done projects where I used PDFs for the archiving of full websites.
It can be quite handy, depending on needs of course. Sometimes it works
with the look and