Isaac Dupree wrote:
"It encodes any string into a string that is acceptable as a C name."

This isn't true, I noticed. It's both under- and over-specified.

- input starting with a digit 0..9 still starts with a digit
- the empty string becomes the empty string

Neither is acceptable as a C name (though both are fine as suffixes of C names). Currently this doesn't cause any GHC bug, as far as I know (at least assuming package names can't start with a digit). But it's a comment bug. It's not entirely obvious what to do:

- changing how existing things are z-encoded will probably break some things, even if they decode the same way without extensions (e.g. encoding digits with the z..U form).

- distributivity (I don't know if this is important):
    zEncodeString a ++ zEncodeString b == zEncodeString (a ++ b)

Since it's a comment bug, I suggest fixing the comment :-)
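To make the two failing cases concrete, here is a minimal sketch of a z-encoder. It is an assumption-laden simplification, not GHC's actual zEncodeString: the names zEncodeSketch, encodeCh and isCIdentifier are hypothetical, only a handful of the symbol escapes are included, and tuple handling is left out entirely.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit, ord)
import Numeric (showHex)

-- Simplified z-encoder (hypothetical; covers only a few escapes).
zEncodeSketch :: String -> String
zEncodeSketch = concatMap encodeCh

encodeCh :: Char -> String
encodeCh 'z' = "zz"
encodeCh 'Z' = "ZZ"
encodeCh '.' = "zi"
encodeCh '$' = "zd"
encodeCh '_' = "zu"
encodeCh c
  -- Letters and digits pass through unchanged, including a leading digit.
  | isAsciiLower c || isAsciiUpper c || isDigit c = [c]
  -- Everything else gets the z..U form: 'z', hex code point, 'U'.
  | otherwise = 'z' : padHex (showHex (ord c) "") ++ "U"
  where
    -- Keep the hex digits starting with a decimal digit.
    padHex hex@(d : _) | isDigit d = hex
    padHex hex = '0' : hex

-- A C identifier is non-empty and starts with a letter or underscore.
isCIdentifier :: String -> Bool
isCIdentifier [] = False
isCIdentifier (c : cs) = isStart c && all isRest cs
  where
    isStart x = isAsciiLower x || isAsciiUpper x || x == '_'
    isRest x = isStart x || isDigit x
```

With this sketch, zEncodeSketch "Control.Concurrent" gives "ControlziConcurrent", which passes isCIdentifier, while zEncodeSketch "3d" gives "3d" and zEncodeSketch "" gives "", and neither of those passes.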

It could reasonably fail for the empty string, I think, unless you can find a non-empty string that will not clash with anything else.

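But one cannot have both distributivity and an encoding of the empty string as a C name: taking a = b = "" in the law above gives e ++ e == e for the encoding e of the empty string, so e must itself be empty. (We could nominally restrict the range to non-empty strings for that purpose, though.) Distributivity also rules out giving only a *leading* digit the z..U treatment, since whether a digit is leading depends on what is prepended to it.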
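The leading-digit conflict can be checked with a small sketch. The encoder below is hypothetical and deliberately tiny: it escapes a digit only in first position and passes everything else through unchanged, which is not what the real z-encoding does for symbols. The names escapeZU, encLeadingDigit and distributesOver are made up for illustration.

```haskell
import Data.Char (isDigit, ord)
import Numeric (showHex)

-- Escape one character in the z..U form: 'z', hex code point, 'U'.
escapeZU :: Char -> String
escapeZU c = 'z' : padHex (showHex (ord c) "") ++ "U"
  where
    padHex hex@(d : _) | isDigit d = hex
    padHex hex = '0' : hex

-- Hypothetical variant: escape a digit only when it is the first character.
encLeadingDigit :: String -> String
encLeadingDigit (c : cs) | isDigit c = escapeZU c ++ cs
encLeadingDigit s = s

-- The distributivity law, for one particular pair of inputs.
distributesOver :: (String -> String) -> String -> String -> Bool
distributesOver enc a b = enc a ++ enc b == enc (a ++ b)
```

Here encLeadingDigit "a" ++ encLeadingDigit "1" is "az31U", but encLeadingDigit "a1" is "a1", so distributesOver encLeadingDigit "a" "1" is False.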

- an important property of the z-encoding that should be mentioned in the comment's specification is that it produces a C name *without underscores*. This is important because the underscore is then used as a separator.

Right.  Worth adding to the comments.
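As a small illustration of why the no-underscore property matters, the sketch below splits a symbol on '_' to recover its components. The function name splitOnUnderscore is hypothetical, and this is only a reading of the separator convention described here, not code from GHC.

```haskell
-- Split a symbol name at every underscore. This only recovers the
-- components correctly because z-encoded parts contain no '_' themselves.
splitOnUnderscore :: String -> [String]
splitOnUnderscore s = case break (== '_') s of
  (part, [])       -> [part]
  (part, _ : rest) -> part : splitOnUnderscore rest
```

Applied to "base_ControlziConcurrent_zdfforkOSzuentryzua159_closure", this yields ["base", "ControlziConcurrent", "zdfforkOSzuentryzua159", "closure"].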

I think it's a good idea to prefix anything z-encoded with some standard prefix such as "ghc_" or "zEncoded_" perhaps -- to make sure it doesn't conflict with any other name when linking.

Increasing the size of every symbol has a detrimental effect on library sizes and link times, so it's to be avoided if possible.

Then it would be easy to change the specification to something completely true and sensible. But it looks like GHC's C names currently often have the form
(package)_(module)_(something)
or even
(module)_(something)
e.g. I found
'base_ControlziConcurrent_zdfforkOSzuentryzua159_closure' in ghc-6.8.2/libraries/base/dist/build/Control/Concurrent_stub.c
and
'SystemziConsoleziReadline_d2k5' in ghc-6.8.2/libraries/readline/dist/build/System/Console/Readline_stub.c

Hmm, the latter looks like it should include the package name.

Cheers,
        Simon

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
