I wrote:
>     (regexp-split " +" "  foo  bar  baz  " #:limit 3 #:trim 'both)
>       => ("foo" "bar" "baz")
>     (regexp-split " +" "  foo  bar  baz  " #:limit 2 #:trim 'both)
>       => ("foo" "bar")

Sorry, that last example is wrong of course, but both of these examples
raise an interesting question about how #:limit and #:trim should
interact.  To my mind, the top example above is correct: its last
element should be "baz", not "baz  ".

I guess I'd prefer to think of #:trim as trimming *before* splitting,
instead of trimming empty elements *after* splitting, so:

     (regexp-split " +" "  foo  bar  baz  " #:limit 3 #:trim 'both)
       => ("foo" "bar" "baz")
     (regexp-split " +" "  foo  bar  baz  " #:limit 2 #:trim 'both)
       => ("foo" "bar  baz")

Note also that if you trim empty elements *after* splitting, then
there's a bad interaction with #:limit if you trim the left side.
Consider:

     (regexp-split " +" "  foo  bar  baz  " #:limit 3 #:trim 'both)

If we first split, taking into account the limit, we get:

     ("" "foo" "bar  baz  ")

and then we trim empty elements from both ends to get the final result:

       => ("foo" "bar  baz  ")

which seems wrong: I asked for #:limit 3 but got only two elements.
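
A rough sketch of the two orderings in Guile may make the difference
concrete.  The names 'split-limited', 'trim-then-split' and
'split-then-trim' are made up for illustration and are not the proposed
'regexp-split'.  The sketch assumes that #:limit means "at most N
elements", that the pattern never matches the empty string, and that
trimming whitespace with 'string-trim-both' is an acceptable stand-in
for trimming the separator.

```scheme
(use-modules (ice-9 regex)    ; match:start, match:end
             (srfi srfi-1))   ; drop-while

;; Hypothetical helper: split STR on PATTERN into at most LIMIT pieces.
;; Assumes PATTERN never matches the empty string.
(define (split-limited pattern str limit)
  (let ((rx (make-regexp pattern)))
    (let loop ((start 0) (pieces '()))
      ;; Stop searching once one more piece would reach the limit.
      (let ((m (and (< (1+ (length pieces)) limit)
                    (regexp-exec rx str start))))
        (if m
            (loop (match:end m)
                  (cons (substring str start (match:start m)) pieces))
            (reverse (cons (substring str start) pieces)))))))

;; Ordering 1: trim the string first, then split.
(define (trim-then-split pattern str limit)
  (split-limited pattern (string-trim-both str) limit))

;; Ordering 2: split first, then drop empty elements from both ends.
(define (split-then-trim pattern str limit)
  (let* ((pieces (split-limited pattern str limit))
         (no-leading (drop-while string-null? pieces)))
    (reverse (drop-while string-null? (reverse no-leading)))))
```

With the string above and a limit of 3, 'trim-then-split' returns three
elements, while 'split-then-trim' returns only two, because the leading
empty element uses up one slot of the limit before it is dropped.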

Honestly, this question makes me wonder if the proposed 'regexp-split'
is too complicated.  If you want to trim whitespace, how about using
'string-trim-right' or 'string-trim-both' before splitting?  It seems
more likely to do what I would expect.
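
For reference, this is what the existing trimming procedures do with
the example string.  The binding 'str' is only there for illustration,
and the sketch shows the trimming step alone, not the split that would
follow it.

```scheme
(define str "  foo  bar  baz  ")

;; With no extra arguments, both procedures trim whitespace.
(string-trim-right str)   ; => "  foo  bar  baz"
(string-trim-both str)    ; => "foo  bar  baz"
```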

What do you think?

    Regards,
      Mark
