Thanks, it's starting to make a bit more sense now.
I notice your implementation supports multiple range requests. Does
openwayback send multi-range requests?
On Tuesday, September 13, 2016 at 12:20:36 PM UTC+12, Alex Osborne wrote:
> Hi Ben,
> There's an example in RemoteCollection.xml.
> Note that you can configure the resourceStore independently of the
> resourceIndex. So if you want to use a local CDX resourceIndex with a
> remote resourceStore just put the appropriate stanzas from both example
> CDXCollection.xml and RemoteCollection.xml in the one WaybackCollection.
> Note also that the server for the resource store should support HTTP 1.1
> range requests. This is so that Wayback can retrieve just the record it's
> interested in and not the whole WARC file. Most regular web servers like
> Apache and nginx will do this out of the box but if you implement your own
> servlet it's something you'll need to take care of. A common scenario is a
> servlet proxying to multiple backend servers that have the actual files. In
> that case just make sure to also proxy the request and response headers and
> status code. If your servlet is to serve the files directly off disk or via,
> say, calls to a preservation system API, you might need to take care of the
> range headers yourself.
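To make the "take care of the range headers yourself" case concrete, here is a minimal sketch of the single-range logic a file-serving endpoint needs. It is written in Python only to show the steps; the same logic would have to be ported to a servlet. The names `parse_single_range`, `read_range` and `content_range` are hypothetical and not taken from OpenWayback or from the implementation mentioned in this thread. Multi-range requests are deliberately not handled, and the sketch does not distinguish a malformed header (normally answered with the full file) from an unsatisfiable one (normally answered with 416).

```python
import re

# Matches one byte range: "bytes=100-199", "bytes=100-" or "bytes=-50".
_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_single_range(header, file_size):
    """Return inclusive (start, end) offsets, or None if the header is
    absent, malformed, multi-range or cannot be satisfied."""
    if not header:
        return None
    match = _RANGE_RE.match(header.strip())
    if not match:
        return None  # also rejects multi-range values like "bytes=0-9,20-29"
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: the last N bytes of the file.
        length = int(last)
        if length == 0:
            return None
        start = max(file_size - length, 0)
        end = file_size - 1
    else:
        start = int(first)
        # An open-ended or oversized range is clamped to the end of the file.
        end = min(int(last), file_size - 1) if last else file_size - 1
    if start > end or start >= file_size:
        return None
    return start, end


def read_range(path, start, end):
    """Read bytes start..end (inclusive) from the file at path."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(end - start + 1)


def content_range(start, end, file_size):
    """Value for the Content-Range header of a 206 Partial Content response."""
    return "bytes %d-%d/%d" % (start, end, file_size)
```

When `parse_single_range` returns offsets, the response would be a 206 with the `Content-Range` value above and a body from `read_range`. That lets the client pull one record out of a large WARC file without transferring the rest.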
> Here's the relevant RFC for range requests:
> My implementation, which currently looks up the path in a database and
> serves from disk is here:
> On Monday, September 12, 2016 at 9:15:56 AM UTC+10, Ben O'Brien wrote:
>> Hi Lauren,
>> Thanks for your reply.
>> Not exactly, I want to handle that 'path-index' functionality separately
>> from OW.
>> I was hoping I could write a servlet to act as the remote resource store
>> to OW, which will look up the warc location on the fly. I see your point
>> about serving the warcs via a webserver and using the path-index file with
>> URLs. But it seemed nicer (in my head) if I could just serve the warc
>> location via an external service, removing the path-index flat file step.
>> The context is that we are trying to use OW as a viewer from our
>> preservation system, which has a growing web archive. For a growing
>> collection the remote resource store seemed more of a fit than using a
>> path-index file.
>> On Friday, September 9, 2016 at 8:24:32 AM UTC+12, Lauren Ko wrote:
>>> Hi Ben,
>>> If you are using a FlatFileResourceFileLocationDB as described here,
>>> in your path-index.txt file you would put the URL to where the ARC/WARC
>>> files are being served instead of just a local path. Then you can serve the
>>> WARC files via whatever web server, such as Apache, from wherever you want.
>>> Is that what you are wanting to do?
>>> Lauren Ko
>>> UNT Libraries
>>> On Mon, Sep 5, 2016 at 7:22 PM, Ben O'Brien <obrien...@gmail.com> wrote:
>>>> Hello all,
>>>> I've found myself wanting to set up and test a remote resource store in
>>>> openwayback recently. Initially I was excited to see a link on the
>>>> Advanced-configuration wiki page 'Configuring a remote
>>>> ResourceStore'....only to find it was a placeholder :(
>>>> So in the interest of generating some content for that page - does
>>>> anybody have an example of configuring a remote ResourceStore?
>>>> You received this message because you are subscribed to the Google
>>>> Groups "openwayback-dev" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to openwayback-d...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.