Hi Alex,

Thanks, it's starting to make a bit more sense now.

I notice your implementation supports multiple range requests. Does 
openwayback send multi-range requests?


Cheers,
Ben

On Tuesday, September 13, 2016 at 12:20:36 PM UTC+12, Alex Osborne wrote:
>
> Hi Ben,
>
> There's an example in RemoteCollection.xml.
>
>
> https://github.com/iipc/openwayback/blob/master/wayback-webapp/src/main/webapp/WEB-INF/RemoteCollection.xml#L33
>
> Note that you can configure the resourceStore independently of the 
> resourceIndex. So if you want to use a local CDX resourceIndex with a 
> remote resourceStore just put the appropriate stanzas from both example 
> CDXCollection.xml and RemoteCollection.xml in the one WaybackCollection.
>
> Note also that the server for the resource store should support HTTP 1.1 
> range requests. This is so that Wayback can retrieve just the record it's 
> interested in and not the whole WARC file. Most regular web servers like 
> Apache and nginx will do this out of the box but if you implement your own 
> servlet it's something you'll need to take care of. A common scenario is a 
> servlet proxying to multiple backend servers that have the actual files. In 
> that case just make sure to also proxy the request and response headers and 
> status code. If your servlet is to serve the files directly off disk or via 
> say calls to a preservation system API you might need to take care of the 
> range headers yourself.
>
> Here's the relevant RFC for range requests:
>
> https://tools.ietf.org/html/rfc7233
>
> My implementation, which currently looks up the path in a database and 
> serves from disk, is here:
>
>
> https://github.com/nla/bamboo/blob/32d7f2e/ui/src/bamboo/crawl/WarcsController.java#L132
>
> Cheers,
>
> Alex
>
>
>
> On Monday, September 12, 2016 at 9:15:56 AM UTC+10, Ben O'Brien wrote:
>>
>> Hi Lauren,
>>
>> Thanks for your reply.
>>
>> Not exactly, I want to handle that 'path-index' functionality separately 
>> from OW. 
>> I was hoping I could write a servlet to act as the remote resource store 
>> to OW, which will look up the warc location on the fly. I see your point 
>> about serving the warcs via a webserver and using the path-index file with 
>> URLs. But it seemed nicer (in my head) if I could just serve the warc 
>> location via an external service, removing the path-index flat file step 
>> altogether.
>>
>> The context is that we are trying to use OW as a viewer from our 
>> preservation system, which has a growing web archive. For a growing 
>> collection the remote resource store seemed more of a fit than using a 
>> path-index file.
>>
>>
>> Cheers,
>> Ben
>>
>>
>>
>> On Friday, September 9, 2016 at 8:24:32 AM UTC+12, Lauren Ko wrote:
>>>
>>> Hi Ben,
>>> If you are using a FlatFileResourceFileLocationDB as described here:
>>> https://github.com/iipc/openwayback/wiki/How-to-configure#telling-openwayback-where-to-find-your-arc-and-warc-files
>>> then in your path-index.txt file you would put the URL to where the ARC/WARC 
>>> files are being served instead of just a local path. Then you can serve the 
>>> WARC files via whatever web server, such as Apache, from wherever you want. 
>>> Is that what you are wanting to do?
>>>
>>> Lauren Ko
>>> UNT Libraries
>>>
>>> On Mon, Sep 5, 2016 at 7:22 PM, Ben O'Brien <obrien...@gmail.com> wrote:
>>>
>>>> Hello all,
>>>>
>>>>
>>>> I've found myself wanting to set up and test a remote resource store in 
>>>> openwayback recently. Initially I was excited to see a link on the 
>>>> Advanced-configuration wiki page 'Configuring a remote 
>>>> ResourceStore'... only to find it was a placeholder :(
>>>>
>>>> So in the interest of generating some content for that page - does 
>>>> anybody have an example of configuring a remote ResourceStore?
>>>>
>>>>
>>>> Cheers,
>>>> Ben
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "openwayback-dev" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to openwayback-d...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>
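
For anyone writing their own servlet as the resource store, here is a rough 
sketch of the single-range handling Alex describes above. It is only an 
illustration under assumptions, not code from OpenWayback or bamboo: the 
class and method names (RangeSketch, ByteRange, parseSingleRange, readRange) 
are made up for this example, it handles one range per request only, and it 
leaves out the servlet plumbing so that it compiles with the JDK alone.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

// Hypothetical helper: names are illustrative, not part of any real library.
final class RangeSketch {

    /** An inclusive byte range already clamped to the file length. */
    static final class ByteRange {
        final long start;
        final long end;

        ByteRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long length() {
            return end - start + 1;
        }

        /** Value for the Content-Range header of a 206 response. */
        String contentRange(long fileLength) {
            return "bytes " + start + "-" + end + "/" + fileLength;
        }
    }

    /**
     * Parses a single-range header such as "bytes=100-199", "bytes=100-" or
     * "bytes=-50". Returns null if the header is absent, malformed,
     * multi-range, or cannot be satisfied for the given file length.
     */
    static ByteRange parseSingleRange(String header, long fileLength) {
        if (header == null || !header.startsWith("bytes=") || header.contains(",")) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0) {
            return null;
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            long start;
            long end;
            if (first.isEmpty()) {
                // Suffix form: the last N bytes of the file.
                if (last.isEmpty()) {
                    return null;
                }
                long n = Long.parseLong(last);
                if (n <= 0) {
                    return null;
                }
                start = Math.max(0, fileLength - n);
                end = fileLength - 1;
            } else {
                start = Long.parseLong(first);
                // An open-ended or overlong range stops at the last byte.
                end = last.isEmpty()
                        ? fileLength - 1
                        : Math.min(Long.parseLong(last), fileLength - 1);
            }
            if (start < 0 || start > end) {
                return null;
            }
            return new ByteRange(start, end);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /**
     * Reads the requested bytes into memory. Fine for a sketch; a real
     * servlet would more likely stream them to the response.
     */
    static byte[] readRange(Path file, ByteRange range) throws IOException {
        byte[] buf = new byte[(int) range.length()];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(range.start);
            raf.readFully(buf);
        }
        return buf;
    }
}
```

In a servlet, a non-null result would translate to status 206 with a 
Content-Range header from contentRange() and a Content-Length of length(). 
For a null result the servlet could either serve the whole file with 200 or, 
for an unsatisfiable range, answer 416 with "Content-Range: bytes */<length>", 
as RFC 7233 describes.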
