On Thu, Aug 25, 2005 at 02:47:37PM +1000, Donald Bruce Stewart wrote:
> Hey,
>
> I've cabalised the FastPackedString module,
> darcs get http://www.cse.unsw.edu.au/~dons/code/fps
Great!
> While writing some unit tests for FastPackedString, I found two issues.
> The first problem is initPS. It seems to have an off-by-one error:
>
> Data.FastPackedString> initPS (packString "haskell")
> > "haskell"
> Data.FastPackedString> initPS (packString "has")
> > "has"
> Data.FastPackedString> initPS (packString "hs")
> > "hs"
> Data.FastPackedString> initPS (packString "h")
> ok ""
> Data.FastPackedString> initPS (packString "")
> ok "*** Exception: FastPackedString.initPS: init []
>
> The relevant line is:
> hunk ./FastPackedString.hs 409
> - | otherwise = substrPS ps 0 (len - 1)
> + | otherwise = substrPS ps 0 (len - 2)
Indeed, this does look like a bug. I'd say a more thorough solution might
be to eliminate the (confusing) substrPS function. It is only used twice
and one of those uses is buggy! And it's not exported from
FastPackedString. I think initPS and tailPS will be much more clearly
written as
initPS (PS p s l) | l <= 0 = error "FastPackedString.initPS: init []"
| l == 1 = nilPS
| otherwise = PS p s (l-1)
tailPS (PS p s l) | l <= 0 = error "FastPackedString.tailPS: tail []"
| l == 1 = nilPS
| otherwise = PS p (s+1) (l-1)
and this is possibly even more efficient than the substrPS implementation,
since it'll do fewer subtractions and additions, although ghc may be able
to optimize away the addition of zero.
> I see that initPS is used twice in Stringalike, so I'm unsure if this
> behaviour has been worked around somehow. Anyway, a patch is attached,
> and has been merged into fps.
I don't see how the uses of initPS can be correct. It only comes up in
either binary file patches or carriage-return removal, both of which are
somewhat rarely used features, so I suspect we've got real bugs and just
haven't noticed.
> ------------------------------------------------------------------------
>
> The second is that linesPS/unlinesPS aren't equivalent to lines/unlines. The
> PS versions behave as follows:
> Data.FastPackedString> let s = "a\nb\nc\n"
>
> Data.FastPackedString> linesPS . packString $ s
> > ["a","b","c",""]
> Data.FastPackedString> lines s
> ["a","b","c"]
>
> Data.FastPackedString> unlinesPS . linesPS . packString $ s
> "a\nb\nc\n"
> Data.FastPackedString> unlines . lines $ s
> "a\nb\nc\n"
>
> Since unlinesPS correctly inverts linesPS, this is probably ok to leave
> as is, though I've attached a patch for this just in case, and I'm not
> sure if darcs somehow depends on the current behaviour. This patch is in
> fps as well:
As you suspect, this is a feature, not a bug! :) (unlines . lines) is not
equivalent to id, but (unlinesPS . linesPS) *is*, which allows us to handle
files that aren't terminated with a newline.
For non-darcs usage, it might be nice to mimic the behavior of lines and
unlines, but in my opinion the linesPS behavior is much more useful,
although somewhat less intuitive.
Thanks for the bug reports!
--
David Roundy
http://www.darcs.net
_______________________________________________
darcs-devel mailing list
[email protected]
http://www.abridgegame.org/cgi-bin/mailman/listinfo/darcs-devel