Re: [Reprozip-users] Web Archiving

2018-05-11 Thread Rasa Bočytė
Hi,

I was just wondering if you would be able to help me with testing out
ReproZip.

I have my case study website set up on my laptop. It does not require any
special server-side software; I can run it with the simple PHP built-in
server or LAMPP. At the moment I am doing this only for testing purposes, so
I could use any server software that works with ReproZip.

A couple of questions:
- apart from tracing the server, do I need to trace any other processes to
ensure that I can reproduce the experiment?
- how do I ensure that my package includes all the website data? I tried
running the PHP built-in server (reprozip trace php -S localhost:8000 -t
path/to/the/website/files) and the finished package only captured data from
the web pages that I visited while running the experiment (see the sketch
below).
- can I somehow include the browser needed to run the website in the
package?
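
A minimal sketch of one way to handle the second question, assuming the site
lives under path/to/the/website/files; the crawl step, the
additional_patterns edit and the package name are assumptions about the
ReproZip workflow, not something confirmed in this thread:

  # start the trace as before; while it runs, request every page instead of
  # a handful, e.g. from a second terminal:
  reprozip trace php -S localhost:8000 -t path/to/the/website/files
  #   (second terminal)  wget --mirror --page-requisites http://localhost:8000/

  # anything the server never read can still be forced into the package by
  # listing the docroot under additional_patterns in the generated
  # .reprozip-trace/config.yml before packing, e.g.
  #   additional_patterns:
  #     - /full/path/to/the/website/files/**

  # then pack as usual
  reprozip pack website.rpz

The underlying point is that ReproZip only packs files the traced process
actually touched, so either every page has to be requested during the trace
or the remaining files have to be added to the configuration by hand.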

Thank you for your help!

Regards,
Rasa

On 24 April 2018 at 15:37, Rasa Bočytė <rboc...@beeldengeluid.nl> wrote:

> Hi both,
>
> I presented ReproZip to other researchers in my institution and everyone
> seems quite excited to see if it would work for us! I still need to discuss
> it with a couple of other colleagues but I think we will try to test it.
>
> One of the things that I am trying to figure out is how to include
> client-side software, i.e. the web browser, in the equation. Would you have
> to create a separate container for that? Ideally, we would like to package
> everything (source files, server-side dependencies and client-side
> dependencies) in one place, but I don't know if that is feasible.
>
> Regards,
> Rasa
>
> On 18 April 2018 at 18:27, Vicky Steeves <vicky.stee...@nyu.edu> wrote:
>
>> Hi Rasa,
>>
>> Apologies, we were traveling and just got back to the office. We are very
>> glad to be of help!
>>
>> We let users who pack experiments edit the yml file before the
>> final packing step, and for those secondary users who unpack, we let them
>> download and view the yml file. We certainly *could* automatically
>> extract categories of information for the user. It bears more thinking
>> about, especially since there are a few ways that unpacking users interface
>> with ReproUnzip.
>>
>> Best,
>> Vicky
>>
>> Vicky Steeves
>> Research Data Management and Reproducibility Librarian
>> Phone: 1-212-992-6269
>> ORCID: orcid.org/0000-0003-4298-168X
>> vickysteeves.com | @VickySteeves <https://twitter.com/VickySteeves>
>> NYU Libraries Data Services | NYU Center for Data Science
>>
>> On Tue, Apr 10, 2018 at 4:46 AM, Rasa Bočytė <rboc...@beeldengeluid.nl>
>> wrote:
>>
>>> Hi Remi,
>>>
>>> In terms of migration, originally my institute planned to acquire files
>>> from the creators and then figure out what to do with them, most likely
>>> migrating individual files to updated versions when needed. I don't think
>>> that is a helpful approach, since you need to start at the server and
>>> capture the environment and software that manipulate those files to create
>>> a website, especially if you want to be able to reproduce it.
>>>
>>> I am definitely leaning towards the idea that virtualisation of a web
>>> server would be the best approach for us. I will try to test out the
>>> examples that you have on your website and see if I can run some tests with
>>> my own case studies (of course, it depends on whether the creators will
>>> allow us to do it).
>>>
>>> I promise I won't bother you too much, but my last question is about the
>>> metadata captured in the yml file. It is machine and human readable, but
>>> the question is what you do with it and how you present it once you have
>>> it, so that it becomes a valuable resource for those using the preserved
>>> object. Have you thought about automatically extracting some categories of
>>> information from that file in a user-friendly format, or do you think it is
>>> enough as it is?
>>>
>>> Just wanted to say a massive thank you for your feedback. It has been
>>> incredibly helpful!
>>>
>>> Rasa
>>>
>>> On 6 April 2018 at 19:53, Rémi Rampin <remi.ram...@nyu.edu> wrote:
>>>
>>>> Rasa,
>>>>
>>>> 2018-04-04 08:03 EDT, Rasa Bočytė <rboc...@beeldengeluid.nl>:
>>>>
>>>>> In our case, we are getting all the source files directly from content
>>>>> creators and we are looking for a way to record and store all the
>>>>> technical, administrative and descriptive metadata, and visualise
>>>>> dependencies on

Re: [Reprozip-users] Web Archiving

2018-04-24 Thread Rasa Bočytė
Hi both,

I presented ReproZip to other researchers in my institution and everyone
seems quite excited to see if it would work for us! I still need to discuss
it with a couple of other colleagues but I think we will try to test it.

One of the things that I am trying to figure out is how to include
client-side software, i.e. the web browser, in the equation. Would you have
to create a separate container for that? Ideally, we would like to package
everything (source files, server-side dependencies and client-side
dependencies) in one place, but I don't know if that is feasible.
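
A rough sketch of how the server package and a client browser could be kept
separate but still used together, assuming the Docker unpacker and a traced
server that listens on port 8000; the package name and the port-forwarding
option are assumptions about reprounzip's interface, not something stated in
this thread:

  # unpack the server-side package into a Docker-backed directory
  reprounzip docker setup website.rpz website-docker/
  # run the traced server; if the unpacker supports forwarding the port
  # (an assumption -- check reprounzip's options), expose it to the host:
  reprounzip docker run website-docker/ --expose-port 8000
  # then point a browser at http://localhost:8000/ -- either one on the host
  # or one preserved separately, e.g. as its own container or VM image

The browser itself is hard to capture with ReproZip because it is not part
of the traced server process, so packaging only the server side in the .rpz
and preserving the client environment as a separate component seems like the
more feasible split.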

Regards,
Rasa

On 18 April 2018 at 18:27, Vicky Steeves <vicky.stee...@nyu.edu> wrote:

> Hi Rasa,
>
> Apologies, we were traveling and just got back to the office. We are very
> glad to be of help!
>
> We let users who pack experiments edit the yml file before the final
> packing step, and for those secondary users who unpack, we let them
> download and view the yml file. We certainly *could* automatically
> extract categories of information for the user. It bears more thinking
> about, especially since there are a few ways that unpacking users interface
> with ReproUnzip.
>
> Best,
> Vicky
>
> Vicky Steeves
> Research Data Management and Reproducibility Librarian
> Phone: 1-212-992-6269
> ORCID: orcid.org/0000-0003-4298-168X
> vickysteeves.com | @VickySteeves <https://twitter.com/VickySteeves>
> NYU Libraries Data Services | NYU Center for Data Science
>
> On Tue, Apr 10, 2018 at 4:46 AM, Rasa Bočytė <rboc...@beeldengeluid.nl>
> wrote:
>
>> Hi Remi,
>>
>> In terms of migration, originally my institute planned to acquire files
>> from the creators and then figure out what to do with them, most likely
>> migrating individual files to updated versions when needed. I don't think
>> that is a helpful approach, since you need to start at the server and
>> capture the environment and software that manipulate those files to create
>> a website, especially if you want to be able to reproduce it.
>>
>> I am definitely leaning towards the idea that virtualisation of a web
>> server would be the best approach for us. I will try to test out the
>> examples that you have on your website and see if I can run some tests with
>> my own case studies (of course, it depends on whether the creators will
>> allow us to do it).
>>
>> I promise I won't bother you too much, but my last question is about the
>> metadata captured in the yml file. It is machine and human readable, but
>> the question is what you do with it and how you present it once you have
>> it, so that it becomes a valuable resource for those using the preserved
>> object. Have you thought about automatically extracting some categories of
>> information from that file in a user-friendly format, or do you think it is
>> enough as it is?
>>
>> Just wanted to say a massive thank you for your feedback. It has been
>> incredibly helpful!
>>
>> Rasa
>>
>> On 6 April 2018 at 19:53, Rémi Rampin <remi.ram...@nyu.edu> wrote:
>>
>>> Rasa,
>>>
>>> 2018-04-04 08:03 EDT, Rasa Bočytė <rboc...@beeldengeluid.nl>:
>>>
>>>> In our case, we are getting all the source files directly from content
>>>> creators and we are looking for a way to record and store all the
>>>> technical, administrative and descriptive metadata, and visualise
>>>> dependencies on software/hardware/file formats, etc. (similar to what
>>>> Binder does).
>>>>
>>>
>>> I didn't think Binder did that (this binder?
>>> <https://github.com/jupyterhub/binderhub>). It is certainly a good
>>> resource for reproducing environments already described as a Docker image
>>> or Conda YAML, but I am not aware of ways to use it to track or visualize
>>> dependencies or any metadata.
>>>
>>>> We have been mostly considering migration as it is a more scalable
>>>> approach and less technically demanding. Do you find that virtualisation is
>>>> a better strategy for website preservation? At least from the archival
>>>> community, we have heard some reservations about using Docker since it is
>>>> not considered a stable platform.
>>>>
>>>
>>> When you talk of migration, do you mean to new hardware? What would you
>>> be migrating to? Or do you mean upgrading underlying software/frameworks?
>>> The way I see it, virtualization (sometimes referred to as "preserving
>>> the mess") is definitely less technically demanding than migration. Could
>>> you share a bit more about what y
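
Regarding the question above about making the yml metadata useful to people
consuming the preserved object: the file is ordinary YAML, so categories of
information can be pulled out with any YAML-aware tool, and reprounzip also
exposes some of it on the command line. A small sketch, assuming a package
named website.rpz (the exact output of these subcommands is not shown in
this thread, so treat the details as an assumption):

  # summary of the package: runs, command lines, and which unpackers can use it
  reprounzip info website.rpz
  # list the files recorded as inputs/outputs of the traced run
  reprounzip showfiles website.rpz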

[Reprozip-users] Web Archiving

2018-03-30 Thread Rasa Bočytė
Dear ReproZip Team,

Let me introduce myself. I am a researcher at the Netherlands Institute for
Sound and Vision and I am currently investigating server-side web
preservation strategies as a way to preserve dynamic websites.

We have been conducting a review of best practices for server-side website
preservation and a couple of people mentioned that ReproZip might be
helpful for that. I was wondering if you think it could be used for this
purpose and whether you know of any examples of web servers or similar
databases being preserved this way.

What I am most interested in is whether ReproZip could be used to:
- document the original environment from which the files were acquired (web
server, hardware, software)
- record extra technical details and instructions that could be added
manually
- maintain dependencies between files and folders
- capture metadata (see the illustrative excerpt below)
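
To make those four points concrete, here is a heavily abbreviated,
illustrative excerpt of the kind of information ReproZip records in its
config.yml during a trace; the field names are recalled from the ReproZip
configuration format and the package and distribution values are
placeholders, so treat this as an approximation rather than an authoritative
schema:

  runs:
    - argv: [php, -S, localhost:8000]   # the traced command line
      environ: {LANG: en_US.UTF-8}      # captured environment variables
      distribution: [ubuntu, '16.04']   # operating system of the original machine
  packages:                             # system packages the run depended on
    - name: php7.0-cli
      version: 7.0.x
      files: [...]
  other_files: [...]                    # files used but not owned by any package
  additional_patterns: []               # extra paths can be added here by hand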

I would be very interested to hear about this and I would really appreciate
your help!

Kind regards,
---

Rasa Bocyte
Web Archiving Intern

Netherlands Institute for Sound and Vision
Media Parkboulevard 1, 1217 WE Hilversum | Postbus 1060, 1200 BB Hilversum |
beeldengeluid.nl