Hi folks,

I began working on an Islandora Solution Pack for web archives a while back, and the more I work on it and think about it, the more I find myself stuck on a foundational question: what is the object?
The way I initially constructed it, as a proof of concept, was simply ingesting and disseminating WARC files. But as I learn more about web archiving, there is more I'd like to do dissemination-wise, with associated datastreams (screenshots, PDFs) and full-text searching of WARCs.

So, here is my issue. Is an object a given crawl of a site, for example the web crawl of http://yfile.news.yorku.ca on March 4, 2013? Or is an object a given website (the yfile example), with each crawl stored as a version of a datastream?

To me it all seems like a matter of how a given collection is arranged and described, and both solutions are technically correct. But is one way better than the other? If you'll indulge me, I'd love to hear your input.

cheers!

--
-nruest

_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users