On Fri, 14 Nov 2008 14:19:54 -0500, Scott McKellar <[EMAIL PROTECTED]> wrote:

After taking a bit of a sabbatical to work on the election, I returned to
the Evergreen project a few days ago.  I was rummaging through utils.c
just to get back into the groove, and lit upon uescape().

Welcome back :)

<snip>


Both replacement candidates also eliminate the original full_escape
parameter, which is a boolean.  Internally we branch on it in order to
decide whether to encode control characters or leave them unchanged.

In all cases where we call uescape(), we pass a value of 1 for full_escape,
meaning that we encode control characters.  It's a useless parameter
because in practice it makes no difference.

I don't like boolean parameters because I always have trouble remembering
whether true means *this* and false means *that*, or the other way around.
If we ever need to provide the "false" behavior we should provide a
separate function for it, with a name that indicates how it differs.

I can certainly get on board with that.

<snip>


-------------

There are some issues to resolve concerning the treatment of errors.

If uescape() or full_uescape() encounters invalid Unicode, it returns
NULL.  The calling code can detect the error by checking for NULL, but
in fact it doesn't. Without noticing or reporting the error, it just adds nothing to the growing_buffer that it's building, even if some of the unescaped string was valid.

If that's a sensible way to respond to invalid Unicode, then I can make
buffer_append_uescape() behave the same way.

If we want to be able to detect invalid Unicode, then
buffer_append_uescape() will have to change.  Instead of returning a
pointer to the growing_buffer, it could return a status code.  In that
case it would have to insist on getting a non-NULL pointer to the
growing_buffer.  That may be a wiser policy anyway.


I like the approach of returning a status (and requiring a non-NULL growing_buffer pointer). In particular, with JSON parsing, it would be nice to know invalid data was encountered so it can be reported in a meaningful way.



-------------

My recommendation is to go with some form of buffer_append_uescape().
It incorporates all of the optimizations in full_uescape(), plus it
eliminates two mallocs and two frees, plus it doesn't have to do a
strlen() to guess at a buffer size.

You have my vote on buffer_append_uescape() with full_uescape() optimizations. That's clearly geared toward how we use the code.


We could also keep the original uescape() around, just in case we ever
need it again.  Otherwise we can reuse the same name for a function with
a different signature.

I think we should leave the original uescape() in, just to ease the process of coupling different versions Evergreen and OpenSRF.

-b


--
Bill Erickson
| VP, Software Development & Integration
| Equinox Software, Inc. / The Evergreen Experts
| phone: 877-OPEN-ILS (673-6457)
| email: [EMAIL PROTECTED]
| web: http://esilibrary.com

Reply via email to