Re: [PATCH] Minor fixes to rx.c

Nicholas Clark Thu, 10 Jan 2002 09:01:04 -0800

On Thu, Jan 10, 2002 at 03:06:38PM +0000, Alex Gough wrote:
> Also, I'm a bit concerned that our null termination games:
> 
>     s->bufstart = mem_sys_allocate(buflen+1);
>     ...
>     memset((char *)s->bufstart+s->bufused,0,1);
> 
> Are going to lead to an eternity of OBO errors.  Also if our encoding or
> output does not require termination, this is a waste of time.  Also
> I don't think all the string functions update the termination when
> they add characters beyond bufused but below buflen, which means that
> any output function that needs null termination will have to check
> for this itself anyway.  All in all, I think it would make more sense to
> keep the code that cares about termination away from the general string
> code as it's being a bit obscuring at present.


I would strongly recommend that perl6 mandates that buffers are not nul
terminated. Anything that needs a nul should arrange for one to be appended.
[eg by ensuring that the buffer is writable, extending it by one byte if
needs be, and writing that nul, or by copying out the contents.]
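
To make the "arrange for one to be appended" idea concrete, here is a minimal
sketch in C. The buf_t struct and the buf_set/buf_cstr helpers are hypothetical
names invented for illustration; only the bufstart/buflen/bufused field names
echo the quoted snippet, and this is not the actual Parrot string layout. It
uses plain malloc/realloc where the real code would presumably use its own
allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical length-counted buffer (assumption, not Parrot's real struct). */
typedef struct {
    char  *bufstart;  /* start of storage, NOT nul terminated */
    size_t buflen;    /* bytes allocated */
    size_t bufused;   /* bytes in use */
} buf_t;

/* Store len bytes with one memcpy and no trailing nul. Returns 0 on success. */
static int buf_set(buf_t *b, const char *src, size_t len)
{
    size_t alloc = len ? len : 1;   /* avoid a zero-sized allocation */
    char *p = malloc(alloc);
    if (p == NULL)
        return -1;
    memcpy(p, src, len);
    b->bufstart = p;
    b->buflen = alloc;
    b->bufused = len;
    return 0;
}

/* Only callers that need a C string pay for the nul: extend the buffer
   by one byte if there is no spare room, then write the nul just past
   the used bytes. bufused itself is left unchanged. */
static const char *buf_cstr(buf_t *b)
{
    assert(b->bufused <= b->buflen);
    if (b->bufused == b->buflen) {
        char *p = realloc(b->bufstart, b->buflen + 1);
        if (p == NULL)
            return NULL;
        b->bufstart = p;
        b->buflen += 1;
    }
    b->bufstart[b->bufused] = '\0';
    return b->bufstart;
}
```

The general string code never touches the terminator; only buf_cstr does, so
the off-by-one risk is confined to one function.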

If buffers carry no terminating nul bytes, and none need to be added, then,
as you say, it

0: Avoids lots of off-by-one errors.
1: Saves the time spent adding a nul byte. [Setting a buffer takes 1 call to
   memcpy, rather than
   (a memcpy of the given length && an explicit write of the nul) or
   (add one to the length && call memcpy) if you know the source has a nul byte]
2: It's more robust. If your code knows that it needs a nul byte and adds it
   itself, you don't need to ponder whether you can trust everyone else (eg
   the string functions) not to shaft you by buggily forgetting nul bytes.

and also

3: substrings can point into the buffers of other things.

The last one lets you mmap a file (or however VMS does it faster still) and
then make your <> scalars point into the mmap buffer, flagged copy on write.
And if your grep-as-perl doesn't actually modify the buffer there's no
copying. This assumes that the housekeeping of copy-on-write is less than
the time spent copying.
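
A rough sketch of point 3 in C, under stated assumptions: view_t and its helper
functions are hypothetical names, and an ordinary char array stands in for the
mmap'd file so the example stays portable. A substring is just a pointer and a
length into someone else's storage, which only works because nothing expects a
nul after the last byte. The copy happens lazily, on the first write.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical substring view (assumption, for illustration only). */
typedef struct {
    char  *start;   /* first byte of the substring */
    size_t len;     /* length in bytes, no nul terminator */
    int    shared;  /* nonzero: start points into storage owned elsewhere */
} view_t;

/* Make a view of len bytes at offset off in base, flagged copy on write. */
static view_t view_of(char *base, size_t off, size_t len)
{
    view_t v = { base + off, len, 1 };
    return v;
}

/* Take a private copy if the view is still shared. Returns 0 on success. */
static int view_make_writable(view_t *v)
{
    char *p;
    if (!v->shared)
        return 0;
    p = malloc(v->len ? v->len : 1);
    if (p == NULL)
        return -1;
    memcpy(p, v->start, v->len);
    v->start = p;
    v->shared = 0;
    return 0;
}

/* Writes go through here, so the underlying buffer is never modified. */
static int view_set_byte(view_t *v, size_t i, char c)
{
    assert(i < v->len);
    if (view_make_writable(v) != 0)
        return -1;
    v->start[i] = c;
    return 0;
}
```

Read-only use (the grep case) never calls view_make_writable, so its only
housekeeping cost is carrying the shared flag, which is set once and never
checked.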

Nicholas Clark
-- 
ENOJOB http://www.ccl4.org/~nick/CV.html
