I started a thread in infra about the best way to handle large generated documentation for our website. One example: our website links to the (current) javadocs.
The website docs/d/ directory is where our generated documentation lives; this is (according to current requirements of Apache infra) checked into SVN, and any changes are automatically "published". A typical size for uimaj is ~70 MB, about 3500 files.

We currently have uimaj-2.4.0, 2.3.1, and 2.3.0 (not in /d/ - that was built before we started using /d/) on our website. Any objections to my removing the 2.3.1 / 2.3.0 versions? They are preserved forever (in SVN, and in the archives of the distributions - they're inside the binary distributions of UIMA). Doing this will reduce the load in checking out and otherwise manipulating the site SVN.

=============

Brett Porter responded with some suggestions on handling this in the future. One suggestion is to have infra set up a CMS version of our Anakia site (without having us convert to CMS). This would let us use a CMS feature that allows us to delete certain folder trees (e.g. uimaj-2.4.0 javadocs) from our SVN after they've been successfully published, and have the website **not** delete that folder tree. (see http://www.apache.org/dev/cmsref.html#extpaths)

Infra did something "special" for the incubator website, and we could ask them for a similar solution.

=============

Finally, putting large generated stuff into SVN by generating it from scratch and then committing it uses up SVN space at a much faster rate than starting from a copy of the previous version. The bottom of the attached notes suggests how to avoid this.
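The workflow at the bottom of the attached notes (svn copy the previous javadocs, check out the copy, rebuild over it, check in) could be scripted roughly as follows. This is only a sketch: the `javadocs-version-<version>` naming comes from the notes, while the function names, the base URL, the working directory and the build command are placeholders I made up for illustration, not anything infra or our build provides.

```python
import subprocess
from typing import List


def javadoc_release_plan(base_url: str, previous: str, next_version: str,
                         workdir: str, build_cmd: List[str]) -> List[List[str]]:
    """Return the commands for the copy / checkout / rebuild / commit workflow.

    All arguments are hypothetical placeholders; nothing is executed here.
    """
    prev_url = f"{base_url}/javadocs-version-{previous}"
    next_url = f"{base_url}/javadocs-version-{next_version}"
    return [
        # 1) cheap server-side copy of the previous javadocs tree
        ["svn", "copy", prev_url, next_url,
         "-m", f"Copy {previous} javadocs as the base for {next_version}"],
        # 2) check out the copy into the spot where the build generates output
        ["svn", "checkout", next_url, workdir],
        # 3) the project build; it should omit timestamps so unchanged pages
        #    stay byte-identical (javadoc's -notimestamp option is one way,
        #    but whether our build uses it is an assumption)
        build_cmd,
        # 4) pick up newly generated files, then check in the result
        ["svn", "add", "--force", workdir],
        ["svn", "commit", workdir,
         "-m", f"Regenerated javadocs for {next_version}"],
    ]


def run_plan(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_plan(plan)` with the default `dry_run=True` only prints the commands. Files that the new build no longer generates would still need an explicit `svn delete`, which this sketch leaves out.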
-Marshall

P.S., Jira has been down for about 45 minutes: see http://monitoring.apache.org

-------- Original Message --------
Subject: Re: CMS & Javadoc : a few questions
Date: Fri, 11 Jan 2013 10:39:22 +1100
From: Brett Porter <[email protected]>
To: Marshall Schor <[email protected]>

On 11/01/2013, at 4:29 AM, Marshall Schor <[email protected]> wrote:

>
> On 1/10/2013 12:04 AM, Brett Porter wrote:
>> On 10/01/2013, at 3:22 PM, Marshall Schor <[email protected]> wrote:
>>
>>> On 1/9/2013 6:25 PM, Brett Porter wrote:
>>>> On 10/01/2013, at 8:08 AM, Marshall Schor <[email protected]> wrote:
>>>>
>>>>> Is there a corresponding way to handle large generated documentation (e.g.
>>>>> JavaDocs) for sites not on CMS, but rather using svnpubsub?
>>>>>
>>>>> If not, is there a recommended approach, short of converting our website
>>>>> to CMS?
>>>> You can check it straight into the production tree.
>>> Hmmm ... I thought only CMS websites have the "production tree". Our site is
>>> not a CMS site - we've never converted it. But it is svnpubsub enabled.
>> You mean this is what your site is served from?
>> http://svn.apache.org/viewvc/uima/site/trunk/uima-website/docs/
>>
>> If so, that's the tree I was referring to - but bear in mind that big
>> checkins to the ASF tree need to be minimised or avoided.
>
> Yes, that's it.
>
> So, there's 2 parts to this problem I'd like help with.
>
> "bear in mind that big checkins ... need to be minimized or avoided". I agree,
> and your suggestions re: Javadocs might be the way to go. This would seem to
> indicate a workflow for a new release / new corresponding Javadocs, something
> like this:
>
> 1) do svn copy (cheap) of javadocs-version-<previous> to
>    javadocs-version-<next>.
> 2) check out the javadocs-version-<next> into the spot where the build will do
>    the generation.
> 3) run the build - it blindly regenerates the javadocs, but hopefully, many of
>    the files are identical
> 4) check in the result.
>
> Does that sound right?

Yep, that's what we did (omitting the timestamp so that it doesn't change thousands of files every time).

- Brett

--
Brett Porter
[email protected]
http://brettporter.wordpress.com/
http://au.linkedin.com/in/brettporter
http://twitter.com/brettporter
