Awesome stuff. I'll look at your PR today. Only reason I suspected images as a problem is that the docs directory on the website (http://svn.apache.org/repos/asf/cordova/site/public/) is currently 773MB. Not great, but I guess not too terrible :P
On Mon, Jan 5, 2015 at 7:24 AM, Andrey Kurdumov <kant2...@googlemail.com> wrote:

> I took measurements and found that most of the time is spent inside the
> 'Adding Title', 'Building TOC' and 'Merging files' steps.
> Each of those steps takes 3-4 seconds. The other steps take less than
> half a second.
>
> Average generation time per language is 24 seconds. The upper part of
> the distribution is the non-English translations.
> Most English docs take 13-15 seconds to generate.
> Other European languages: 20-24 seconds.
> Japanese, Korean and Chinese: 25-31 seconds.
>
> The pull request for the docs generator with the timing switch is
> https://github.com/apache/cordova-docs/pull/252
>
> @Andrew From what I see, the static content (~9M) does not give us much
> of a problem as long as we do not upload everything again after
> regeneration. If we upload only new docs, it will be ~100M of overhead.
> I will definitely try to reduce duplication, but right now it does not
> give me too much pain, so I will improve that specific place. Maybe I'm
> not aware of other side effects and processes where this duplication is
> increased?
>
> Thanks
> Andrey.
>
> 2014-12-31 11:48 GMT+06:00 Andrey Kurdumov <kant2...@googlemail.com>:
>
> > My preference is that we live with autolinking for now, but invest
> > in a documentation validation process. If we change to HTML links we
> > still have to create validation scripts, but they will do the
> > following: double-check that all links are still valid and that no
> > renaming occurs during development. Also, autolinking seems more
> > convenient when we don't have a dedicated person responsible for
> > documentation.
> >
> > About de-duping images and generation speed: I believe that most of
> > the time during generation is spent reading and writing to the disk
> > for all files. Images are copied just one time.
> > I will put timing in the generation process, so we can measure that.
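For illustration, per-step timing of the kind discussed above could be collected roughly as follows. This is only a sketch in Python, not the code from the pull request (the generator itself is written in JS). The `timed` helper, the `timings` dictionary and the `generate` function are hypothetical, and the step bodies are empty placeholders; only the step names come from the message.

```python
import time
from contextlib import contextmanager

# Hypothetical collector: step name -> accumulated wall-clock seconds.
timings = {}


@contextmanager
def timed(step):
    """Add the time spent inside the `with` block to timings[step]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[step] = timings.get(step, 0.0) + elapsed


def generate(files):
    """Stand-in for one generation run; real work would replace `pass`."""
    for _ in files:
        with timed("Adding Title"):
            pass
        with timed("Building TOC"):
            pass
        with timed("Merging files"):
            pass


generate(["index.md", "guide/platforms/index.md"])

# Report the slowest steps first.
for step, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{step}: {seconds:.4f}s")
```

Accumulating per step across all files, rather than per file, matches the way the numbers are reported in the message (seconds per step for a whole language).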
> >
> > Most places where generation writes to disk cannot be avoided with
> > the current design, but there are two places which could be avoided
> > by using some server-side technology: VersionMenu and SideBar on the
> > website. These parts of the site are exactly the same for the same
> > language and version, so if we generate a single file and use it
> > across all files, that should benefit generation speed. Apache + SSI
> > seems a good candidate for that.
> > Again, let's measure and find which place takes most of the time.
> >
> > 2014-12-31 8:31 GMT+06:00 Andrew Grieve <agri...@chromium.org>:
> >
> >> Your links do indeed tell a sad tale :(.
> >>
> >> Doing a one-over to turn all auto-links into manual links seems like
> >> a good idea, at least for links that link to other pages.
> >>
> >> Going on a tangent here - but another way to improve the docs would
> >> be to de-dupe images across versions and languages. I mention it
> >> because I suspect that it would vastly speed up generation and
> >> upload times. Feel free to ignore this though, and work on what is
> >> most interesting to you :).
> >>
> >> Thanks for your work here! Really great stuff :)
> >>
> >> On Wed, Dec 24, 2014 at 1:41 PM, Andrey Kurdumov <kant2...@googlemail.com> wrote:
> >>
> >> > Now that I have finished moving documentation generation to JS, I
> >> > have plans for how to move forward with improving translation.
> >> >
> >> > The first thing to finish is the cleanup tasks left after the
> >> > migration. These were mentioned by Andrew:
> >> > - update the README.md to describe how to generate using the new
> >> >   generator
> >> > - delete the ruby files!
> >> > - Change path for generation from public/test/ to public/
> >> >
> >> > Next is to improve the quality of translation. Right now
> >> > autolinking in translated languages is almost broken. You can
> >> > compare [1] and [2] or [3] to understand what I mean.
> >> > That is, when you change the header of a page, you have to go
> >> > through all pages where this term is used and change the reference
> >> > to that page everywhere. That's almost impossible to do without
> >> > tooling support. So I decided to create tool(s) which will help
> >> > keep consistency.
> >> >
> >> > Here I will use Russian as the second language, which I will
> >> > translate.
> >> >
> >> > Tool 1. Translation comparator
> >> > I compare how many links I have in the English version with the
> >> > links I have in the Russian translation and generate a report of
> >> > the differences.
> >> > This tool could give an overview of translation quality for a
> >> > language.
> >> >
> >> > Example usage:
> >> > ./bin/translationreport ru edge
> >> >
> >> > Example output:
> >> > Comparing translation for en and ru for version edge
> >> > Comparing index.md .... OK
> >> > Comparing guide/platforms/index.md .... Found differences
> >> >
> >> > Tool 2. Find autolinks
> >> > This tool will report which parts of a Markdown file will be used
> >> > by the autolinking feature, so you know what text to use in the
> >> > other parts of the translation.
> >> > Example usage:
> >> > ./bin/findautolinks en/edge/config_ref/images.md
> >> >
> >> > Example output:
> >> > Found terms:
> >> > Icons and Splash Screens
> >> > Example configuration
> >> > Supported platforms
> >> > Splashscreen Plugin
> >> >
> >> > Tool 3. Find linked pages
> >> > Given a Markdown file name, this tool will report which other
> >> > Markdown files reference that file.
> >> > Example usage:
> >> > ./bin/findrefautolinks en/edge/guide/cli/index.md
> >> >
> >> > Example output:
> >> > Searching for autolinks
> >> > Found:
> >> > guide/platforms/index.md
> >> > config_ref/images.md
> >> > .......
> >> >
> >> > An alternative is to move away from autolinking completely and
> >> > provide another tool:
> >> >
> >> > Tool 4.
> >> > Replace the places in existing MD files where autolinking happens
> >> > with direct links.
> >> > Example usage:
> >> > ./bin/replaceautolinks en edge
> >> >
> >> > Example output:
> >> > Performed replace
> >> >
> >> > These are just my ideas on how translation could be improved. I
> >> > want to bring this up for discussion, to get a broader view on the
> >> > topic before I start doing something.
> >> > Also, I have a question about Crowdin: I want to thoroughly check
> >> > how my translation in Crowdin breaks the autolinking. Is it
> >> > possible to have read-only API keys which would allow me to
> >> > translate in Crowdin and generate documentation locally, to check
> >> > that changes in wording don't break autolinking?
> >> >
> >> > [1] http://cordova.apache.org/docs/en/edge/index.html
> >> > [2] http://cordova.apache.org/docs/ru/edge/index.html
> >> > [3] http://cordova.apache.org/docs/es/edge/index.html
> >> >
> >> > Thanks,
> >> > Andrey
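
The translation comparator proposed as Tool 1 in the quoted message could work along these lines. This is a minimal sketch in Python, not the proposed `translationreport` script. It assumes the comparison is made on generated HTML and that counting `<a href=...>` tags per page is a good enough signal; the `count_links` and `compare` functions and the sample pages are hypothetical.

```python
import re

# Matches the start of an anchor tag that carries an href attribute.
LINK_RE = re.compile(r"<a\s[^>]*href=", re.IGNORECASE)


def count_links(html):
    """Count anchor tags with an href in a generated page."""
    return len(LINK_RE.findall(html))


def compare(reference, translation):
    """Return {path: (reference_count, translation_count)} where counts differ.

    Both arguments map a page path to its generated HTML. A page missing
    from the translation is treated as having no links.
    """
    diffs = {}
    for path, ref_html in reference.items():
        ref_n = count_links(ref_html)
        tr_n = count_links(translation.get(path, ""))
        if ref_n != tr_n:
            diffs[path] = (ref_n, tr_n)
    return diffs


# Hypothetical sample data standing in for generated en and ru pages.
en = {
    "index.md": '<p><a href="guide/cli/index.html">CLI</a></p>',
    "guide/platforms/index.md": '<a href="a.html">A</a> <a href="b.html">B</a>',
}
ru = {
    "index.md": '<p><a href="guide/cli/index.html">CLI</a></p>',
    "guide/platforms/index.md": '<a href="a.html">A</a> B',
}

diffs = compare(en, ru)
for path in en:
    status = "Found differences" if path in diffs else "OK"
    print(f"Comparing {path} .... {status}")
```

A differing link count only flags a page for review. It does not say which term failed to autolink, which is what Tools 2 and 3 in the message are meant to help with.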