On Fri, Aug 23, 2013 at 3:33 PM, Nick Wellnhofer <[email protected]> wrote:
> I just saw that you started to work on an immutable string class. It's a
> good idea to get this done before the first Clownfish release. Some
> implementation details have already been discussed on lucy-dev and I have an
> unpublished, local branch where I continued to flesh out the design of
> string iterators.

Before getting started, I had a look at the published branch
"string-iterator-wip1":

https://git-wip-us.apache.org/repos/asf?p=lucy.git;a=shortlog;h=refs/heads/string-iterator-wip1

It seems as though source code churn from changes like "_IMP" has made the
branch impractical to update, so the commits will have to be recreated one by
one.  Nevertheless, the concepts remain just as applicable since the main
code base has changed only superficially.

> I see that you made CharBuf inherit from String as a temporary measure. It
> seems that you have a plan on how to make a gradual transition to immutable
> strings which would be great. Can you share some details about the separate
> steps so I could help with some of the work?

What I had in mind was to transition classes which make minimal use of
CharBuf's features as early as possible -- e.g. those which store fixed field
names as CharBufs.

However, thinking things through a little more, I'm not sure that migrating
classes from CharBuf to String piecemeal is going to work out so conveniently.
We will probably run into problems because the automatic conversion code in
XSBind.c won't play nice with type mismatches between CharBuf and String.

In that case, the cfish-string-wip1 branch will become a proof-of-concept,
used to firm up the design of String, but not leading to a final outcome via a
straight line sequence of commits.

> From what I understand, the process should roughly look like this:

There's a task which isn't on your list which is to my mind perhaps the most
important: vet the emerging Clownfish String design against existing
implementations from several other popular programming languages.

I'm concerned about several substandard features of Clownfish, which wormed
their way into the codebase through expedience, accident, or failed
experiment, becoming part of the public API.  We're better off excising that
unhealthy tissue sooner rather than later, and comparing Clownfish's design
against other designs will hopefully allow us to diagnose any problems.

Removing Dump/Load, Serialize/Deserialize, and Make has been a good start.
There's more to do -- in the context of String, I'm particularly dissatisfied
with the type specifiers in the format used by newf() -- but thank goodness,
now that the Clownfish runtime is separate from Lucy and comprises only a
handful of classes, performing a thorough review of the API design doesn't
feel unreasonable.

Once upon a time, I imagined that we'd perform such a review after finishing
Python and Ruby bindings but before exposing a C API.  I'd hoped that the
community would accumulate collective wisdom through the process of
implementing bindings for multiple dynamic languages and that the experience
would inform our choices while polishing the design of the runtime.  But it
turns out that the C bindings were finished first, so here we are. :)

I'll kick things off with a review of String's Starts_With() and Ends_With().
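
As a baseline for that comparison, here is a rough sketch of what prefix and
suffix tests look like over an immutable UTF-8 byte span.  The SketchString
type and the SketchStr_* functions are hypothetical names made up for
illustration, not the actual Clownfish API, and the sketch assumes both
operands hold valid UTF-8, so that a bytewise comparison is sufficient.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an immutable string: a UTF-8 byte span that is
 * never modified after construction.  Not the real Clownfish struct. */
typedef struct {
    const char *ptr;   /* UTF-8 bytes; the length is tracked separately */
    size_t      size;  /* length in bytes, not code points */
} SketchString;

/* Wrap a NUL-terminated C string without copying it. */
static SketchString
SketchStr_wrap(const char *utf8) {
    SketchString s = { utf8, strlen(utf8) };
    return s;
}

/* True if `self` begins with the bytes of `prefix`. */
static bool
SketchStr_starts_with(SketchString self, SketchString prefix) {
    return prefix.size <= self.size
           && memcmp(self.ptr, prefix.ptr, prefix.size) == 0;
}

/* True if `self` ends with the bytes of `suffix`. */
static bool
SketchStr_ends_with(SketchString self, SketchString suffix) {
    return suffix.size <= self.size
           && memcmp(self.ptr + (self.size - suffix.size),
                     suffix.ptr, suffix.size) == 0;
}
```

Under this sketch an empty prefix or suffix always matches, which is one of
the behaviors worth checking against other languages.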

>     * Move a couple of methods from CharBuf to String.
>     * Implement string iterators.
>     * Make CFC convert between Perl and Clownfish strings.
>     * Implement a "zombie" string class?
>     * Step-by-step conversion of CharBufs to Strings.
>     * Make CharBuf a separate class.

It would be great to collaborate on creating the best possible immutable
String class for Clownfish.  I suspected you might want to take part, which
was why I deliberately started off the cfish-string-wip1 branch with only the
most basic skeleton. :)

When we're done (enough) with String, if it turns out that piecemeal
integration is too awkward, there's an alternate path:

1.  Start a new branch.
2.  Duplicate CharBuf in a new class, CharBuffer.
3.  Rename CharBuf to String in one huge but superficial commit.
4.  Switch over sites which actually need mutability to CharBuffer.  There
    aren't that many.
5.  Replace CharBuf-masquerading-as-String with the actual immutable String
    class completed earlier.
6.  Rename CharBuffer to CharBuf.
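
To illustrate the split those steps aim for, here is a minimal sketch of a
mutable buffer that hands out immutable snapshots.  ImmString, MutBuffer, and
the MutBuf_* functions are hypothetical names for illustration only; the
real classes would be Clownfish objects with refcounting, which this sketch
leaves out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical immutable string: contents never change after creation. */
typedef struct {
    const char *ptr;
    size_t      size;
} ImmString;

/* Hypothetical mutable buffer, playing the role of CharBuffer. */
typedef struct {
    char  *ptr;
    size_t size;
    size_t cap;
} MutBuffer;

static MutBuffer
MutBuf_new(void) {
    MutBuffer b = { NULL, 0, 0 };
    return b;
}

/* Append a NUL-terminated UTF-8 string, growing the buffer as needed. */
static void
MutBuf_cat(MutBuffer *self, const char *utf8) {
    size_t extra = strlen(utf8);
    if (extra == 0) { return; }
    if (self->size + extra > self->cap) {
        size_t cap   = (self->size + extra) * 2;
        char  *grown = realloc(self->ptr, cap);
        if (!grown) { abort(); }
        self->ptr = grown;
        self->cap = cap;
    }
    memcpy(self->ptr + self->size, utf8, extra);
    self->size += extra;
}

/* Copy the current contents into an independent immutable snapshot.
 * The caller owns the copy; later appends do not affect it. */
static ImmString
MutBuf_to_string(const MutBuffer *self) {
    char *copy = malloc(self->size + 1);
    if (!copy) { abort(); }
    if (self->size) { memcpy(copy, self->ptr, self->size); }
    copy[self->size] = '\0';
    ImmString s = { copy, self->size };
    return s;
}
```

Only call sites that append or otherwise modify contents would need the
buffer type; everything else can hold the immutable snapshot.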

Marvin Humphrey
