On 6/21/13 11:27 AM, Dave Brondsema wrote:
> I found an nginx module for custom authentication that we could play around
> with and see if it works:
> http://mdounin.ru/hg/ngx_http_auth_request_module/file/a29d74804ff1/README
> I'm sure Apache has similar modules too.

Our ops team wasn't keen on recompiling nginx to be able to add a 3rd-party
module. They suggested using non-http methods to provide the file. At
SourceForge, we have ssh/scp/sftp access for projects, so that would be a
good delivery mechanism for the backup zip. Other allura instances that use
this might have to figure out what works well for them, but we could make it
flexible & configurable: the zip file could get created in any directory
(specified by a path pattern in the .ini file) and email notification upon
completion could provide access instructions (text configurable via the .ini
file).

> If all we do is send an email, and don't show the status on the admin page
> anywhere, a very long running backup could cause the admin to think it got
> stuck or died, and thus request another backup. I suppose we should have
> anti-dogpiling logic to avoid that.
>
> On 6/21/13 10:43 AM, Cory Johns wrote:
>> Have we even tested serving large files through the app stack? I strongly
>> suspect they'd hit the long-request timeout. I know I've hit it before
>> when testing uploading large-ish attachments.
>>
>> And on the subject of attachments, the API end-points already (or will with
>> the next push) include attachment metadata, including the URL to download
>> them from. I definitely think that's good enough for now, as the admin can
>> parse the URLs out and download them, if needed. If that proves to be too
>> onerous for doing project exports, then we can address it at that time.
>>
>> Going back to serving up the exports, is there any way we could serve them
>> outside of the app stack but still with authentication? Such as a
>> standalone, light-weight service that just serves files with authentication
>> (could be useful for the screenshots and icons for private projects), or
>> via authenticated SFTP? This is verging on an infrastructure question at
>> this point, but I definitely agree that we should have some auth in front
>> of it but it's not going to be easy.
>>
>>
>> On Tue, Jun 18, 2013 at 10:26 AM, Dave Brondsema <[email protected]> wrote:
>>
>>> For us at SourceForge, we have a need to build a feature that lets project
>>> admins download a backup/export of all their project data. Since this is a
>>> pretty big feature, I wanted to propose here how we might do it and get
>>> feedback & ideas before we proceed.
>>>
>>> Add a bulk_export() method to Application which would be responsible for
>>> generating json for all the artifacts in the tool. The format should match
>>> the API format for artifacts so that we're consistent. Thus any tool that
>>> implements bulk_export() would typically loop through all the artifacts
>>> for this instance (matching app_config_id) and convert to json the same
>>> way the API json is generated (e.g. call the __json__ method or
>>> RestController method; some refactoring might be needed). Multiple types
>>> of artifacts/objects could be listed out in groups, e.g. Tracker app could
>>> have a list of tickets, list of saved search bins, list of milestones, and
>>> the tracker config data. Discussion threads would need to be included too,
>>> ideally inline with the artifact they go with. No permission checks would
>>> be done since this export would only be available to admins (makes it
>>> faster & simpler).
>>>
>>> Provide a page on the Admin sidebar to generate a bulk export. Project
>>> admins could choose individual tool instances, or all tools in the project
>>> (that support it). That form would kick off a background task which goes
>>> through the selected tools and runs their bulk_export() methods. Save each
>>> tool's data as mount_point.json and zip them all together.
>>>
>>> It'd be easiest to store & deliver the zip files similarly to the code
>>> snapshots (static files not served through allura), but that won't be
>>> secure. We'll need to either serve it through allura with authentication,
>>> or maybe name the zip file with a random name that can't be guessed (and
>>> then serve it directly through apache or nginx). Other ideas?
>>>
>>> When the task is complete, notify the user. What way is best? Send an
>>> email? Probably would be good to show a listing of available completed
>>> extracts on the extract page, so if any older ones are still sitting
>>> around they can be retrieved (would be up to server admins to have a cron
>>> to delete old files)
>>>
>>> We could make this something that can be triggered automatically via the
>>> API and check status through the API, but that seems like a good thing to
>>> add on later.
>>>
>>> Should we include attachments? These would be important in some cases but
>>> not in others. It could also increase the export size immensely in some
>>> cases. Maybe leave out for now, and add in later when needed, possibly as
>>> an option.
>>>
>>> Further thoughts on implementation details:
>>>
>>> So that a giant json string doesn't have to be held in memory for each
>>> tool, the export task should open a file handle for mount_point.json and
>>> call bulk_export() with that open file handle, and each App can append to
>>> its file incrementally.
>>>
>>> If mongo performance is slow, some refactoring may be needed to avoid lots
>>> of individual mongo calls and be more batch oriented. We can see how it
>>> goes.
>>>
>>> Could parallelize bulk_export() later, to do multiple tools at once.
>>>
>>>
>>> Sound reasonable? Any suggestions or other ideas?
>>>
>>>
>>> --
>>> Dave Brondsema : [email protected]
>>> http://www.brondsema.net : personal
>>> http://www.splike.com : programming
>>> <><
>>>
>>
>
>
>

--
Dave Brondsema : [email protected]
http://www.brondsema.net : personal
http://www.splike.com : programming
<><
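
For illustration only, here is a rough sketch of the file-handle idea from the quoted proposal: the export task opens mount_point.json, passes the open handle to bulk_export(), and the app writes one artifact at a time. TrackerApp and export_tools are hypothetical stand-ins, not existing Allura code, and the sketch assumes Python 3 and plain dicts in place of real artifacts.

```python
import json
import os
import tempfile
import zipfile


class TrackerApp:
    """Hypothetical stand-in for an Application that supports bulk export."""

    def __init__(self, mount_point, tickets):
        self.mount_point = mount_point
        self.tickets = tickets  # plain dicts here; real artifacts would differ

    def bulk_export(self, f):
        # Write one artifact at a time so the whole JSON document
        # never has to be built as a single string in memory.
        f.write('{"tickets": [')
        for i, ticket in enumerate(self.tickets):
            if i:
                f.write(', ')
            json.dump(ticket, f)
        f.write(']}')


def export_tools(apps, zip_path):
    """Run bulk_export() per app and zip the resulting mount_point.json files."""
    with tempfile.TemporaryDirectory() as tmpdir:
        with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zf:
            for app in apps:
                name = app.mount_point + '.json'
                path = os.path.join(tmpdir, name)
                with open(path, 'w') as f:
                    app.bulk_export(f)
                zf.write(path, arcname=name)
    return zip_path
```

Calling export_tools([TrackerApp('tickets', [...])], '/some/dir/export.zip') would produce a zip holding tickets.json. A real implementation would iterate a mongo cursor rather than an in-memory list.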

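
To make the .ini suggestion above more concrete, here is a rough sketch of how a path pattern and the notification text might be read from config and filled in. The option names (bulk_export_path, bulk_export_instructions), the placeholder names, and the example host are all made up for illustration; nothing here is an existing Allura setting.

```python
import configparser

# Hypothetical .ini fragment; option names and placeholders are assumptions.
INI_TEXT = """
[app:main]
bulk_export_path = /var/backups/{nbhd}/{project}
bulk_export_instructions = Your export is ready. Fetch it with: sftp {user}@files.example.com:{path}
"""


def load_export_settings(ini_text):
    # interpolation=None so '%' in values is not treated specially
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser['app:main']


def export_location(settings, nbhd, project, filename):
    """Fill in the directory pattern and append the zip file name."""
    directory = settings['bulk_export_path'].format(nbhd=nbhd, project=project)
    return directory.rstrip('/') + '/' + filename


def notification_text(settings, user, path):
    """Build the access instructions that would go into the completion email."""
    return settings['bulk_export_instructions'].format(user=user, path=path)
```

Each site could then point the path at whatever its delivery mechanism exposes (an sftp-visible directory for us) and word the instructions to match.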