On Mon, Dec 17, 2012 at 2:58 AM, Nick Wellnhofer <[email protected]> wrote:
> Hey, how did you find that comment on Github? It was mainly meant as a note to
> myself ;)

I got a notification, which I'll forward to you offlist.  Somehow I must have
started following <https://github.com/apache/lucy>.  (Maybe when I forked that
repo?)

>> (; Extending Charmonizer to handle compilation might actually be the most
>> elegant approach -- we've already abstracted the compiler and the shell to
>> an extent -- but heads may explode if I propose writing build scripts in C
>> so I'll save that for later. ;)

> I've also been thinking in that direction when working on the C bindings. It
> sounds like a crazy idea, but that could be said about Charmonizer as a
> whole.  OTOH, Charmonizer does its job very well, and using C is the only
> approach that really works cross-platform.

I'm pretty happy with how Charmonizer has evolved.  "Use C to configure C"
has worked out well in practice.

> FWIW, here is the preliminary Makefile that I currently use in my work on
> the C bindings:
>
>     http://s.apache.org/fBI
>
> If you ignore the huge list of source files, it's actually quite simple.
> So it doesn't sound like too much of a stretch to generate it using C.

Heh, generating Makefiles wasn't the approach I was thinking of. :)  I meant
writing actual build scripts in C.

Here's a snippet from one of our Perl build scripts:

    for my $c_file (@$c_files) {
        my $o_file   = $c_file;
        my $ccs_file = $c_file;
        $o_file   =~ s/\.c$/$Config{_o}/ or die "no match";
        $ccs_file =~ s/\.c$/.ccs/        or die "no match";
        push @objects, $o_file;
        next if $self->up_to_date( $c_file, $o_file );
        $self->add_to_cleanup($o_file);
        $self->add_to_cleanup($ccs_file);
        $cbuilder->compile(
            source               => $c_file,
            extra_compiler_flags => $cc_flags,
            include_dirs         => $self->include_dirs,
            object_file          => $o_file,
        );
    }

Here's that code ported to C:

    /* Set extra compiler flags and include dirs. */
    chaz_CC_add_extra_cflags(cflags);
    for (i = 0; i < num_include_dirs; i++) {
        chaz_CC_add_include_dir(include_dirs[i]);
    }

    /* Compile all C source files. */
    for (i = 0; i < num_c_files; i++) {
        const char *c_file   = c_files[i];
        const char *o_file   = chaz_Util_swap_ext(c_file, chaz_CC_obj_ext());
        objects[i] = o_file;
        if (!chaz_Util_up_to_date(c_file, o_file)) {
            char *ccs_file = chaz_Util_swap_ext(c_file, ".ccs");
            S_add_to_cleanup(o_file);
            S_add_to_cleanup(ccs_file);
            free(ccs_file);
            chaz_CC_compile_obj(c_file, o_file);
        }
    }

We'd need to write a few helper subroutines to make that code sample actually
work, but the hard part -- abstracting the task of compilation -- is already
done.  (To give us a little more flexibility when writing those helper subs,
the build script could be separate from the configure script and could
pound-include charmony.h.)
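
To give a feel for the size of those helpers, here's a minimal sketch of two
of them.  The names swap_ext and up_to_date are placeholders made up for
illustration, not existing Charmonizer functions, and the timestamp check
assumes a stat() that behaves the way it does on POSIX systems.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: return a newly allocated copy of `path` with its
 * extension replaced by `new_ext` (which includes the dot).  The caller
 * frees the result.  Returns NULL if `path` has no extension or if
 * allocation fails.  Only '/' is treated as a directory separator here. */
char*
swap_ext(const char *path, const char *new_ext) {
    const char *dot;
    const char *slash;
    size_t      stem_len;
    char       *result;

    assert(path != NULL && new_ext != NULL);
    dot   = strrchr(path, '.');
    slash = strrchr(path, '/');

    /* A dot inside a directory name does not count as an extension. */
    if (dot == NULL || (slash != NULL && dot < slash)) {
        return NULL;
    }

    stem_len = (size_t)(dot - path);
    result   = (char*)malloc(stem_len + strlen(new_ext) + 1);
    if (result == NULL) {
        return NULL;
    }
    memcpy(result, path, stem_len);
    strcpy(result + stem_len, new_ext);
    return result;
}

/* Hypothetical helper: true if both files exist and `target` is at least
 * as new as `source`. */
int
up_to_date(const char *source, const char *target) {
    struct stat src_stat;
    struct stat tgt_stat;
    if (stat(source, &src_stat) != 0) { return 0; }
    if (stat(target, &tgt_stat) != 0) { return 0; }
    return tgt_stat.st_mtime >= src_stat.st_mtime;
}
```

A missing object file counts as out of date, so the first build compiles
everything.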

> * Path names can sometimes be problematic. Windows generally accepts the
> forward slash as directory separator, but there are always some cases where
> it doesn't work or needs a work-around.

That's my experience as well.

> * Windows is probably the first platform that will break because of overlong
>   command lines (8K limit).

At some point we'll probably need to start generating scripts like
ExtUtils::CBuilder does to address that problem.
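
A rough sketch of the check that would decide when to fall back to a script
file.  The function names are made up for illustration, the limit is passed
in rather than hard-coded, and the count ignores any quoting the shell would
require, so treat it as an approximation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the command line produced by joining
 * `argc` arguments with single spaces, not counting the terminating NUL. */
size_t
joined_length(const char **argv, size_t argc) {
    size_t total = 0;
    size_t i;
    assert(argv != NULL || argc == 0);
    for (i = 0; i < argc; i++) {
        total += strlen(argv[i]);
        if (i + 1 < argc) {
            total += 1;   /* separating space */
        }
    }
    return total;
}

/* True if the joined command is longer than `limit` characters and so
 * should be written to a file instead of passed on the command line. */
int
exceeds_cmd_limit(const char **argv, size_t argc, size_t limit) {
    return joined_length(argv, argc) > limit;
}
```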

> * External commands can be hard to emulate on Windows. Even something simple
> like "rm" can be much more complicated.

Indeed.  See the "clean" targets in the various Charmonizer Makefiles. :P

> I made an initial attempt to put the bulk of the Makefile in a
> platform-independent file which then is included by platform-dependent
> Makefiles. This approach seems to work, but some of the corner cases need
> rather inelegant solutions.

I seem to recall trying that approach for Charmonizer's Makefiles at some
point, though I don't remember whether it was an original idea or a mod on
some of Joe Schaefer's work.

The approach that seemed most appealing in the end was to model Makefiles
using OO; that's what's in devel/bin/gen_charmonizer_makefiles.pl now.

    Charmonizer::Build::Makefile          <-- base class
    Charmonizer::Build::Makefile::Posix
    Charmonizer::Build::Makefile::MSVC
    Charmonizer::Build::Makefile::MinGW

If we want to generate Makefiles using Charmonizer, I suggest we do something
similar, faking up inheritance with structs:

    struct chaz_PosixMakefile {
        struct chaz_Makefile base;
        /* ... */
    };
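
Fleshed out a little, that pattern might look like the sketch below.  Only
the two struct names come from the snippet above; the members, the init
functions, chaz_MSVCMakefile, and the command strings are all invented for
illustration.

```c
#include <assert.h>
#include <string.h>

/* Base "class".  Subclasses embed it as their first member, so a pointer
 * to a subclass can be treated as a pointer to the base. */
struct chaz_Makefile {
    const char *obj_ext;                                  /* ".o" or ".obj" */
    const char *(*clean_cmd)(struct chaz_Makefile *self); /* "virtual" method */
};

struct chaz_PosixMakefile {
    struct chaz_Makefile base;
    /* POSIX-specific members would go here. */
};

struct chaz_MSVCMakefile {
    struct chaz_Makefile base;
    /* MSVC-specific members would go here. */
};

static const char*
S_posix_clean_cmd(struct chaz_Makefile *self) {
    (void)self;
    return "rm -f";
}

static const char*
S_msvc_clean_cmd(struct chaz_Makefile *self) {
    (void)self;
    return "del /q";
}

void
chaz_PosixMakefile_init(struct chaz_PosixMakefile *self) {
    self->base.obj_ext   = ".o";
    self->base.clean_cmd = S_posix_clean_cmd;
}

void
chaz_MSVCMakefile_init(struct chaz_MSVCMakefile *self) {
    self->base.obj_ext   = ".obj";
    self->base.clean_cmd = S_msvc_clean_cmd;
}

/* Code written against the base type works for any subclass. */
const char*
chaz_Makefile_clean_cmd(struct chaz_Makefile *self) {
    assert(self != NULL && self->clean_cmd != NULL);
    return self->clean_cmd(self);
}
```

The generator would only ever call chaz_Makefile_clean_cmd() and friends,
leaving the platform differences confined to the per-platform functions.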

I'd hope we could stay away from parsing a template file (a la "Makefile.in"),
because having to support parsing would complicate things a lot.  Instead, I'd
suggest embedding the entire content of the Makefile within the generator app
-- again like gen_charmonizer_makefiles.pl -- so that we get to piggyback on
the C compiler's parser.
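
As a minimal sketch of the embedded approach: the Makefile text lives in the
generator as ordinary format strings.  The function name, its parameters,
and the foo/bar object names are hypothetical; in practice the values would
come from Charmonizer's probes.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical generator: write a tiny Makefile to `out`.  There is no
 * template file to parse because the content is embedded in the C source.
 * Returns 0 on success, -1 if any write fails. */
int
write_makefile(FILE *out, const char *cc, const char *obj_ext,
               const char *clean_cmd) {
    int ok = 1;
    assert(out != NULL);
    ok = ok && fprintf(out, "CC = %s\n", cc) >= 0;
    ok = ok && fprintf(out, "OBJS = foo%s bar%s\n\n", obj_ext, obj_ext) >= 0;
    ok = ok && fprintf(out, "all: $(OBJS)\n\n") >= 0;
    /* Recipe lines in a Makefile must start with a tab. */
    ok = ok && fprintf(out, "clean:\n\t%s $(OBJS)\n", clean_cmd) >= 0;
    return ok ? 0 : -1;
}
```

Calling write_makefile(stdout, "cc", ".o", "rm -f") prints the POSIX
flavor; an MSVC flavor would pass different strings.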

The downside of the OO approach is that the layout of the content does not
really look like the final Makefile.

> So I'd like to look into a way to generate Makefiles programmatically using
> parameters provided by Charmonizer.

I'd argue against generating Makefiles.

Makefile syntax is obtuse on its own, but the real problem is that generating
Makefiles essentially means compiling down to shell code.  We do pretty well
compiling down to C with the Clownfish compiler, but C compilers exhibit a lot
less variability than shells and the external programs that they reference.
Our task is made harder by the fact that we're contemplating targeting at
least two shell environments -- POSIX-compliant sh and cmd.exe -- which means
rearranging arguments, dealing with different quoting and splitting rules,
invoking completely different commands, etc.

I agree with Martin Fowler's take on the superiority of Rake's design over
that of Make:

    http://martinfowler.com/articles/rake.html#DomainSpecificLanguageForBuilds

    All my three build languages share another characteristic - they are all
    examples of a Domain Specific Language (DSL). However they are different
    kinds of DSL. In the terminology I've used before:

        * make is an external DSL using a custom syntax
        * ant (and nant) is an external DSL using an XML based syntax
        * rake is an internal DSL using Ruby.

    The fact that rake is an internal DSL for a general purpose language is a
    very important difference between it and the other two. It essentially
    allows me to use the full power of ruby any time I need it, at the cost of
    having to do a few odd looking things to ensure the rake scripts are valid
    ruby. Since ruby is a unobtrusive language, there's not much in the way of
    syntactic oddities. Furthermore since ruby is a full blown language, I
    don't need to drop out of the DSL to do interesting things - which has
    been a regular frustration using make and ant. Indeed I've come to view
    that a build language is really ideally suited to an internal DSL because
    you do need that full language power just often enough to make it
    worthwhile - and you don't get many non-programmers writing build scripts.

Similar arguments hold for Module::Build over ExtUtils::MakeMaker:

    http://perldoc.perl.org/5.16.1/Module/Build.html#MOTIVATIONS

Unfortunately, Rake has not yet achieved 100% market penetration and Ruby is
too big to bundle with Lucy. :)  However, bundling the source code for a C
library which provides some functions to support common build tasks gives
us some of the same advantages: portability problems are scoped to
individual subroutines, and we get to rely on the uniformity of C syntax
rather than shell and its quirks.  As computers have gotten bigger and faster,
compiling a bundled build tool from source incurs proportionally less
overhead.  It works for Lemon; it can work for Charmonizer, too.
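
Here's a small sketch of what "scoped to individual subroutines" could mean
for a "clean" task.  The function name is invented; the one library call it
relies on, remove(), is part of the standard C library, so nothing has to
shell out to "rm" or "del".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a "clean" target: delete each listed file
 * with remove() from <stdio.h>.  Returns the number of files actually
 * removed; files that cannot be removed (for instance because they do
 * not exist) are skipped. */
int
clean_files(const char **paths, int num_paths) {
    int removed = 0;
    int i;
    assert(paths != NULL || num_paths == 0);
    for (i = 0; i < num_paths; i++) {
        if (remove(paths[i]) == 0) {
            removed++;
        }
    }
    return removed;
}
```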

Nevertheless, I suspect that generating Makefiles is still a workable solution
for this portion of Lucy at least, because, as you note, what we're doing here
isn't all that complicated.  I suspect that with generated Makefiles the slope
gets steeper the more complex the task, and thus that the approach is
self-limiting (see ExtUtils::MakeMaker), but we probably won't hit the wall.

Marvin Humphrey
