+1 DocProject
For the compilation step we could use an explicit NAnt/MSBuild task.
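To make that concrete, here is a minimal (untested) sketch of what such an explicit NAnt target might look like; the target name and the path to the binaries are invented, and it assumes docu.exe is on the PATH with NHibernate.xml sitting next to the assembly, as in Steve's test below:

  <!-- hypothetical: an explicit target, deliberately NOT part of the default
       build dependency chain, so a normal build never pays the doc-generation cost -->
  <target name="generate-apidoc"
          description="Generates the API reference docs (slow; run on demand)">
    <exec program="docu.exe">
      <arg file="build/NHibernate.dll" />
    </exec>
  </target>

For the DocProject route the exec would simply call msbuild on the help project instead (as Steve describes below); the point is only that the target stays out of the default build.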

P.S. Don't apologize for the length of the mail.

2009/3/29 Stephen Bohlen <[email protected]>

> All:
>
> As I committed to in a prior thread (
> http://groups.google.com/group/nhibernate-development/browse_thread/thread/729819b625001217
>  )
> I have now completed a preliminary review of the two possible approaches to
> auto-generating API docs for NHibernate from the XML code comments as part
> of our build process.
>
> Recall that one suggestion (offered by Will) was to investigate James
> Gregory's new alpha build of 'Docu' ( http://docu.jagregory.com/ ), the
> light-weight code-comment-compiler he wrote for generating help content for
> the Fluent NHibernate project and offered as OSS to anyone wanting to use it
> for any other project.  I countered that we should also consider the more
> robust DocProject ( http://www.codeplex.com/DocProject ) app that shields
> the developer from having to interact with the incredible complexity that is
> the SandCastle + MSHelp compiler infrastructure from MS.  What follows are
> the results of my doing just that.
>
> Test platform:
> Dell D830 laptop, Intel Core 2 Duo, 32-bit WinXP/SP3, 4GB RAM
> Visual Studio Pro 2008 SP1
> NHibernate 1.2.1 GA release (binaries and XML comment files, no source
> needed)
>
> Test platform notes: I used the 1.2.1 GA release of NH just because it's
> what I happened to grab off my hard drive first; I have no reason to
> believe that the results of any of my tests would be materially affected by
> running them on any subsequent build/release/version of NH, so I don't think
> this has any impact on the tests or their results.
>
> ***Docu Testing and Observations***
>
> Docu works by simply firing off a command-line and passing it the path to
> your binary (nhibernate.dll in this case).  It then constructs pure HTML
> output that can be loaded/viewed in a browser without needing to be hosted
> on a webserver (though the content could of course be posted to a web server
> for others to view as desired).
>
> From the get-go, I had a number of issues (unhandled null-reference
> exceptions) thrown by the Docu EXE itself when operating on the
> NHibernate.dll and its XML code comments.  I eventually grabbed the latest
> code from Docu's hosted location on GitHub and built it myself in VS.  The
> latest code had the same unhandled exceptions but at least with the code in
> hand I could troubleshoot the issue(s) myself :D
>
> My 'fixes' to Docu probably aren't worth committing back to that
> project since most of them basically checked for null instances of variables
> at critical points and returned from the methods if nulls were passed to them
> (probably NOT the desired behavior, but certainly enough for me to get Docu
> to successfully produce output from the nhibernate.dll assembly without
> throwing exceptions).  I have no idea if these exceptions are due to
> any strange (unexpected?) syntax we are using in the code comments within
> the NHibernate codebase or are simply the result of Docu being in
> early-alpha and not properly handling otherwise legitimate code comment
> syntaxes, but nonetheless we need to be aware that as it exists RIGHT NOW
> TODAY, Docu and the NHibernate project's XML code comments are fundamentally
> incompatible with each other without there being changes to at least one or
> the other :(
>
> Once I tweaked the Docu source code to successfully run against
> the NHibernate.dll without throwing null-reference exceptions, I was able to
> produce the API reference docs that I have posted on my server for
> download by anyone interested at the following URL:
> http://unhandled-exceptions.com/downloads/NHibernate_121_Docu_Test.zip.
> The good news is that this documentation is light-weight (pure HTML) and the
> ZIP file is barely 2MB in size for the entire help collection.  To view the
> results, unzip it somewhere and just click on the index.htm to load the
> 'site' in your browser of choice.
>
> The generation of this documentation by Docu is *NOT* speedy; Docu took
> approximately 15+ minutes to generate the output, and most of that time was
> spent with my dual-core processor locked @ 50% utilization with near-zero
> disk activity, suggesting that Docu is processor-bound in its performance and
> expects and uses just a single core to do its work (suggesting that throwing
> more hardware at it isn't likely to help much unless/until Docu becomes
> multi-threaded).  Significant disk activity only occurred briefly at the end
> of the 15 minutes when the final output was rendered to the files included
> in the aforementioned ZIP file, suggesting that the compilation isn't disk
> I/O-bound at all.
>
> This suggests that even though running Docu is as simple as passing a
> single-argument command-line to it, it's largely infeasible to invoke it as
> part of *every* build sequence while someone was working on the NH codebase
> since the post-build documentation compilation step would take a
> prohibitively long time.  It might be reasonable to set up Docu to run
> remotely on some dedicated CI build-server (e.g., the codebetter teamcity
> installation, etc.) so that it happened post-checkin, but due to the long
> runtime for the doc-compilation process, it's nearly certain that this
> process would always have to run out-of-band as a developer worked on the
> project and made their check-ins.
>
> Since all that's needed is a single command-line invocation, integrating
> Docu into the CI server's build process would be trivial and Docu's lack of
> dependency on any other infrastructure (e.g., SandCastle, help compilers,
> etc.) makes it trivial for anyone to run the thing themselves were they to
> check out the source to their own PC (although as mentioned, they would have
> to wait the 15+ minutes for the process to run to completion were they to
> invoke Docu against the project).
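>
> As a sketch only (the property and target names below are invented), one way
> to keep this out of the per-checkin feedback loop would be to guard the step
> behind a property that only a scheduled/nightly CI configuration defines,
> e.g. by passing -D:generate.docs=true to NAnt:
>
>   <!-- hypothetical: runs only when the CI configuration defines generate.docs -->
>   <target name="ci-apidoc" if="${property::exists('generate.docs')}">
>     <!-- failonerror=false so a doc-generation hiccup doesn't break the main build -->
>     <exec program="docu.exe" failonerror="false">
>       <arg file="build/NHibernate.dll" />
>     </exec>
>   </target>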
>
>
>
> ***DocProject Testing and Observations***
>
> DocProject is a significantly more complex and significantly more
> feature-rich XML code-compilation solution than Docu.  This is both a
> positive (better, more useful API reference compilation) and a negative
> (significant complexity and dependencies on other tools, etc.).
>
> DocProject works by automating the MS SandCastle infrastructure and,
> optionally, the MS Help compiler v 1.x and/or v 2.x to produce its output.
> As such, these dependencies have to be present (and properly installed) in
> order for DocProject to function properly.  The good news is that once this
> is accomplished, DocProject is capable of producing compiled help as a
> single .CHM file, a .HxS visual-studio-integrated help file that can be
> installed right into the VS help subsystem and accessible via F1 from within
> Visual Studio, and a complete ASP.NET web site that can be deployed to a
> server for wider access to the content.  As DocProject installs and is
> controlled as a new 'project type' in Visual Studio, you select, as part of
> the New Project wizard, which of these output targets you
> are interested in compiling to.
>
> I had little trouble getting DocProject up and running on my system; after
> installation of the SandCastle infrastructure and the requisite MS help
> compilers, the DocProject installer capably interrogates the registry to
> discover the paths to these items and wires itself up to them just fine.
> Once installed, I did not need to interact with the underlying components at
> all and could control/configure the behavior of the compiled help output solely
> from within Visual Studio by editing the DocProject settings for the custom VS
> project type.  This makes for a familiar UI (build property pages, etc.) for
> configuring the output of the system.
>
> Performance of the DocProject system in generating the help output was no
> better (or worse!) than that of Docu, taking about the same 15+ minutes to
> produce its output.  This suggests that performance/speed isn't a factor in
> determining which of these directions to pursue.  There doesn't seem to be a
> significant change in the compilation time based on what output targets you
> select (e.g., CHM, ASP.NET web site, etc.) so I am strongly guessing that
> the vast bulk of the 15+ minutes is spent in processing the comments rather
> than spitting them out to actual help artifacts.  Since this is about the
> same 15+ minutes that Docu took, I'm going to conclude that there is little
> that could be done to reduce this processing time significantly.
>
> The results of my running the nhibernate.dll and its related comments
> through the DocProject process are posted for download by anyone interested
> at the following URL:
> http://unhandled-exceptions.com/downloads/NHibernate_121_DocProject_Test.zip. 
>  Because the DocProject output is a complete
> ASP.NET web site including graphics, icons, etc. instead of just standard
> HTML and because this download also contains the complete CHM file, this
> download is over 90 MB in size.  Since it's an ASP.NET web site, to view
> this content you will need to unzip it somewhere and then point an IIS
> virtual directory to it in order to view/consume it.  Once you do this, the
> website also contains a link (in the upper right) which leads to the
> compiled 15 MB .chm file if you are interested in seeing that content as
> well (but it looks almost 100% identical to the ASP.NET content, so not
> much need for that).
>
> DocProject is (ultimately) invoked from MSBuild, and so it too could be
> wired up as a post-build event or a CI task that runs out-of-band when code
> is checked into the repository (just as with Docu), but since it's MSBuild
> this task-integration would probably be more complex than the simpler
> command-line invocation that Docu provides.  Also, since DocProject is
> dependent on Sandcastle and the MS help
> compilers to do its work, these dependencies would need to be
> installed/configured on whatever CI platform invoked the API Reference
> compilation step of course.
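>
> As a rough, untested sketch (the project file name is hypothetical, and it
> assumes msbuild.exe is on the PATH of the build box), the NAnt side of that
> wiring might be nothing more than:
>
>   <!-- hypothetical: compile the DocProject help project via MSBuild;
>        Sandcastle and the MS Help compilers must already be installed -->
>   <exec program="msbuild.exe">
>     <arg file="docs/NHibernate.ApiReference.csproj" />
>     <arg value="/p:Configuration=Release" />
>   </exec>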
>
>
>
>
> ***SUMMARY OF COMPARISON***
>
> Docu
> ---------
> Pros:
>
>    - simple to configure/invoke
>    - no external dependencies on other tools
>    - light-weight output (small output size, ~2 MB)
>    - final output can be viewed in browser w/out a web server (e.g., just
>    HTML files)
>    - web output can be posted to a non-IIS/ASP.NET web server for public
>    access
>
> Cons:
>
>    - early alpha tool
>    - presently throws exceptions and crashes when pointed at the NH
>    project :(
>    - no search capability in the output (beyond CTRL+F on a page-by-page
>    basis); intended usage pattern seems to be BROWSE, not SEARCH
>    - no single-file output target (e.g., CHM)
>    - no integration of output with Visual Studio Help system
>    - takes 15+ minutes to run
>
>
> DocProject
> ----------------
> Pros:
>
>    - output looks/feels like the rest of Microsoft (MSDN) help and offers
>    familiar navigation of content
>    - offers single-file output target (CHM)
>    - output is searchable in its entirety at once (vs. page-at-a-time)
>    - index automatically built and integrated into output
>    - Visual Studio integrated help can be an output target
>    - configuration is performed in a familiar environment (Visual Studio)
>
> Cons:
>
>    - external dependency on MS tools (sandcastle, help compilers, etc.)
>    - significantly larger website output (90+ MB)
>    - web content needs IIS/ASP.NET to host it for public access
>    - more complex process of integrating it into build scripts
>    - takes 15+ minutes to run
>
>
> ***RECOMMENDATION***
>
> IMO the DocProject approach is the more robust of the two options, offering
> a more familiar presentation of content to the end-user and richer
> experience in interacting with the content (e.g., integrated search, indexed
> keywords, etc.).  If we are going to bother to do this, I think it would be
> most valuable to do it in a way that the resulting content is the most
> approachable and the most usable by as many people as possible and IMO
> that's the output provided by the DocProject approach.  It offers the
> web-based content that should be posted to the internet as well as the CHM
> file for those wanting offline reference to the content.  For the
> adventuresome, there is even the VS-integrated content making the NHibernate
> API reference a full-fledged participant in the VS help system (supporting
> valuable learning scenarios such as placing your cursor on an NH
> class/method and being able to jump to help on it via a simple F1 keystroke
> from inside Visual Studio -- followed, sadly, by the interminable 10-minute
> wait for the VS help system to spool up and load, of course!).
>
> The biggest challenge to the DocProject approach IMO is the dependency on
> SandCastle, the MS help compilers, etc.  If the DocProject help generator
> VS project were added directly to the NHibernate trunk solution, then anyone
> interested in building NH would need to either unload the DocProject VS
> project from the solution or else get all of those dependencies installed
> just to build/compile NH.  That's too high a burden to place on anyone who
> just wants to check out the core NH project and build/compile it for
> themselves, IMO.
>
> One of the important things to understand about *either* of these help
> compilation tools is that neither of them actually require access to *any*
> of the NH source code directly -- instead they simply require access to the
> compiled binaries and the XML code-comment files extracted from the source
> code by the C# compiler at build-time.  This actually means that I think the
> best way to accomplish the creation of a rich API reference for NH is to
> create a separate parallel solution (NH_API_Reference?) that is *not* part
> of the main NH solution but contains (relative) path-pointers to the
> location of the compiled NH binaries from the actual NH solution itself.
> This way, the 'API Reference Project' can be completely separate and
> distinct from the actual NH source trunk.
>
> This would support the following scenarios:
>
> 1) if you want to build just NH, you get that trunk and build it; the sln,
> nant scripts, etc. make no reference to the DocProject stuff at all and
> nobody is affected (nobody needs SandCastle, MS Help compilers, etc. to
> build the NH trunk just as is the case today)
>
> 2) if you want to build the API ref docs, you check out BOTH the NH trunk
> and the API_REF trunk, build the NH trunk, and then build the API_REF trunk
> that points to the bin output folder from the NH source trunk to get the
> binaries and the XML it needs to process; this scenario would (of course)
> require you to have installed SandCastle, the MS Help compilers, etc. in
> order to perform the compilation of the API reference docs but only such
> people would be affected
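>
> To make scenario 2 a bit more concrete, the separate API_REF build script
> could do little more than point at the NH trunk's output folder via a
> relative path and hand it to the DocProject compilation; everything below
> (folder layout, property and project names) is hypothetical:
>
>   <project name="NHibernate.ApiReference" default="apidoc">
>     <!-- hypothetical layout: API_REF trunk checked out alongside the NH trunk -->
>     <property name="nh.bin.dir" value="../nhibernate/build/bin" />
>     <target name="apidoc">
>       <!-- requires Sandcastle + the MS Help compilers; msbuild.exe on the PATH -->
>       <exec program="msbuild.exe">
>         <arg file="NHibernate.ApiReference.csproj" />
>         <!-- hypothetical MSBuild property telling the doc project where to
>              find NHibernate.dll and NHibernate.xml -->
>         <arg value="/p:NHibernateBinDir=${nh.bin.dir}" />
>       </exec>
>     </target>
>   </project>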
>
> It seems to me that this would support the needs of everyone in a way that
> would have the least negative impact on the 'real' NH source trunk and yet
> still permit us to construct the most robust API reference content for any
> NH adopter.
>
> Sorry to all for the (ridiculous) length of this thing, but as this is
> hardly the kind of decision I think I should (could!) make on my own, I
> wanted to try to summarize as much of my findings as I could so that
> everyone can understand the factors that will play into our decision and
> help form the basis for any discussion anyone wants to have about how best
> to proceed.
>
> Thoughts (as always) welcome; I'm sure I'm overlooking several pros and
> cons for either solution so am hoping a discussion here about this will
> surface some of my oversights.
>
> --
> Steve Bohlen
> [email protected]
> http://blog.unhandled-exceptions.com
> http://twitter.com/sbohlen
>



-- 
Fabio Maulo
