Re: [mirrorbrain] Import and export metalinks?

2018-03-12 Thread Peter Pöml
Hi Derek,

No, no, yes, and no (to Q1 through Q4, in order).

MirrorBrain expects the files to be present locally, at least as empty files
without content, with paths and filenames matching those on some mirror. You
could of course set up redirection to arbitrary individual mirrors or servers
(including metalinks), combining them under a single namespace. However, hashes
can be created only for locally present files, and I don’t know whether the
setup you have in mind would be worthwhile without hashes. You can use the
included null-rsync tool to create file-tree copies of mirrors with empty files
to experiment with (or to redirect from, though without hashes). On the other
hand, the hashes could be entered into the database by some other means; that
would be possible.
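
To illustrate the empty-file idea, here is a minimal sketch in Python. It assumes you already have a plain list of relative paths taken from a mirror (how you obtain that list is up to you). It is not the null-rsync tool itself, and `make_placeholder_tree` is a made-up name used only for this example.

```python
import os

def make_placeholder_tree(root, relative_paths):
    """Create a zero-byte placeholder file under root for each relative path."""
    root = os.path.abspath(root)
    created = []
    for rel in relative_paths:
        target = os.path.abspath(os.path.join(root, rel))
        # Refuse paths (such as "../x" or absolute paths) that leave the root.
        if os.path.commonpath([root, target]) != root:
            raise ValueError(f"path escapes root: {rel}")
        os.makedirs(os.path.dirname(target), exist_ok=True)
        # Append mode creates the file if missing and never truncates an
        # existing one, so real files already present are left alone.
        with open(target, "a"):
            pass
        created.append(target)
    return created
```

For example, `make_placeholder_tree("/srv/placeholder", ["pub/a/file1.iso"])` would create `/srv/placeholder/pub/a/file1.iso` as an empty file (the directory name here is hypothetical). As noted above, no hashes can be computed from such placeholders.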

Peter

> On 13.03.2018 at 00:35, Derek Hofmann wrote:
> 
> Hi, I'm thinking of setting up a MirrorBrain server that would be a
> permanent way to locate files as their original locations vanish from
> the web. Metalinks (RFC5854) seem ideal for this because they support
> multiple URLs.
> 
> I want to do this as cheaply as possible for multiple terabytes of
> data, so I don't want to store the mirrored files locally except maybe
> some that aren't yet available on the Internet or have few mirrors of
> their own.
> 
> Q1. Is there a way to import .metalink files into the database? This
> would be a quick way to populate the database with hashes even without
> storing the files locally. For other mirror admins who want to store
> the files locally, this would save a step of hashing all the files
> which could take hours depending on the number of terabytes.
> 
> Q2. Is there a way to export the database as one large .metalink file
> or maybe one large .metalink file per directory? It would be similar
> to "mb file ls" but in XML format. I could make this file available
> via http/ftp/rsync for someone who wants my metadata but doesn't use
> MirrorBrain. (Metalink files being formally described in an RFC would
> be more portable than database dumps.) Of course I would compress it
> first. Or my cron job would commit the file into a github repository.
> 
> Q3. Must the directory structure of a mirror match the other mirrors,
> or can the files be located anywhere and MirrorBrain or an external
> tool uses the filename/size/hash to resolve their actual locations?
> 
> Q4. Must the filenames match across mirrors, or can MirrorBrain or an
> external tool use other means (such as file size+hash) to match them
> up?
> 
> Thanks,
> Derek


___
mirrorbrain mailing list
Archive: http://mirrorbrain.org/archive/mirrorbrain/

Note: To remove yourself from this mailing list, send a mail with the content
unsubscribe
to the address mirrorbrain-requ...@mirrorbrain.org


[mirrorbrain] Import and export metalinks?

2018-03-12 Thread Derek Hofmann
Hi, I'm thinking of setting up a MirrorBrain server that would be a
permanent way to locate files as their original locations vanish from
the web. Metalinks (RFC5854) seem ideal for this because they support
multiple URLs.

I want to do this as cheaply as possible for multiple terabytes of
data, so I don't want to store the mirrored files locally except maybe
some that aren't yet available on the Internet or have few mirrors of
their own.

Q1. Is there a way to import .metalink files into the database? This
would be a quick way to populate the database with hashes even without
storing the files locally. For other mirror admins who want to store
the files locally, this would save a step of hashing all the files
which could take hours depending on the number of terabytes.
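
As a rough illustration of what reading hashes out of a .metalink file could look like, here is a minimal sketch in Python using only the standard library. It assumes RFC 5854 documents (namespace `urn:ietf:params:xml:ns:metalink`) and only extracts the metadata; getting the result into a MirrorBrain database is not shown. `parse_metalink` is a hypothetical helper, not part of MirrorBrain.

```python
import xml.etree.ElementTree as ET

# Namespace defined by RFC 5854 for Metalink documents.
NS = {"ml": "urn:ietf:params:xml:ns:metalink"}

def parse_metalink(xml_text):
    """Return one dict per <file> element: name, size, hashes, urls."""
    root = ET.fromstring(xml_text)
    entries = []
    for f in root.findall("ml:file", NS):
        size_el = f.find("ml:size", NS)
        has_size = size_el is not None and size_el.text
        entries.append({
            "name": f.get("name"),
            "size": int(size_el.text) if has_size else None,
            # Map hash type (for example "sha-256") to its hex digest.
            "hashes": {h.get("type"): (h.text or "").strip()
                       for h in f.findall("ml:hash", NS)},
            "urls": [(u.text or "").strip() for u in f.findall("ml:url", NS)],
        })
    return entries
```

Each returned entry carries the filename, size, hashes and URLs listed for one file, which is the information a separate import step would need.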

Q2. Is there a way to export the database as one large .metalink file
or maybe one large .metalink file per directory? It would be similar
to "mb file ls" but in XML format. I could make this file available
via http/ftp/rsync for someone who wants my metadata but doesn't use
MirrorBrain. (Metalink files being formally described in an RFC would
be more portable than database dumps.) Of course I would compress it
first. Or my cron job would commit the file into a github repository.
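
To make the export idea concrete, here is a minimal sketch in Python of building one RFC 5854 document from a list of file records. The record layout (dicts with `name`, `size`, `hashes`, `urls`) and the function name `build_metalink` are assumptions made for this example; how the records would be read out of a MirrorBrain database is not shown.

```python
import xml.etree.ElementTree as ET

# Namespace defined by RFC 5854 for Metalink documents.
METALINK_NS = "urn:ietf:params:xml:ns:metalink"

def build_metalink(entries):
    """Serialize file records into a single Metalink XML string."""
    # Emit the Metalink namespace as the default one (no prefix).
    ET.register_namespace("", METALINK_NS)
    root = ET.Element(f"{{{METALINK_NS}}}metalink")
    for e in entries:
        f = ET.SubElement(root, f"{{{METALINK_NS}}}file", {"name": e["name"]})
        if e.get("size") is not None:
            ET.SubElement(f, f"{{{METALINK_NS}}}size").text = str(e["size"])
        # Sort hash types so the output is stable between runs.
        for htype, digest in sorted(e.get("hashes", {}).items()):
            h = ET.SubElement(f, f"{{{METALINK_NS}}}hash", {"type": htype})
            h.text = digest
        for url in e.get("urls", []):
            ET.SubElement(f, f"{{{METALINK_NS}}}url").text = url
    return ET.tostring(root, encoding="unicode")
```

Calling this once per directory would give one .metalink file per directory; the stable ordering of hashes is meant to keep diffs small if the output is committed to a repository.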

Q3. Must the directory structure of a mirror match the other mirrors,
or can the files be located anywhere and MirrorBrain or an external
tool uses the filename/size/hash to resolve their actual locations?

Q4. Must the filenames match across mirrors, or can MirrorBrain or an
external tool use other means (such as file size+hash) to match them
up?
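
For the "external tool" variant of Q3 and Q4, here is a minimal sketch in Python of matching files by content rather than by name. It assumes a per-mirror listing of (path, size, SHA-256 hex digest) tuples is already available; the input layout and the name `group_by_content` are hypothetical, and nothing here is MirrorBrain functionality.

```python
from collections import defaultdict

def group_by_content(listings):
    """Group file URLs across mirrors by (size, digest).

    listings maps a mirror base URL to a list of
    (relative_path, size_in_bytes, sha256_hex) tuples.
    """
    groups = defaultdict(list)
    for base_url, files in listings.items():
        for path, size, digest in files:
            url = base_url.rstrip("/") + "/" + path.lstrip("/")
            # Files with equal size and digest are treated as the same
            # content, whatever their path or filename on each mirror.
            groups[(size, digest.lower())].append(url)
    return dict(groups)
```

Each resulting group is a list of URLs for the same content, which is the shape of data a metalink with multiple URLs per file needs.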

Thanks,
Derek
