Yesterday, Marijn wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi,
>
> this just appeared on guile-devel, but it seems to have exposed a bug
> in racket.
>
> On 29-12-11 10:32, Nala Ginrut wrote:
> > [...]

This doesn't look like an issue that is related to guile, just that he
chose python as the goal...  The first other random example I tried
was `split-string' in Emacs, which did the same thing as Racket.

> Welcome to Racket v5.2.0.7.
> > (regexp-split "([^0-9])" "123+456*/")
> '("123" "456" "" "")
>
> should it be considered a bug in racket that it doesn't support
> capturing groups in regexp-split?

No.

> Without the capturing group the results are identical: [...]

Which is expected.

> >>> import re
> >>> re.split("[^0-9]", "123+456*/")
> ['123', '456', '', '']
>
> > (regexp-split "[^0-9]" "123+456*/")
> '("123" "456" "" "")

It was tricky to dig out what you wanted here...  Python does
something which is IMO very weird:

  >>> re.split("([^0-9])", "123+456*/")
  ['123', '+', '456', '*', '', '/', '']

It's even more confusing with multiple patterns:

  >>> re.split("([^0-9]([0-9]))", "123+456*/")
  ['123', '+4', '4', '56*/']

There are probably uses for that -- at least for the simple version
with a single group around the whole regexp, but that's some hybrid of
`regexp-split' and `regexp-match*': it returns something that
interleaves them, which can be useful, but I'd rather see it with a
different name.

We've talked semi-recently about adding an option to `regexp-match*'
so it can return the lists of matches for each pattern, perhaps add
another option for returning the unmatched sequences between them, and
give the whole thing a new name?  (Something that indicates it being
the multitool version of all of these.)

-- 
((lambda (x) (x x)) (lambda (x) (x x)))
Eli Barzilay: http://barzilay.org/
Maze is Life!
_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev
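
To make the "hybrid" concrete, here is a rough Python sketch of a function that returns the unmatched pieces and the matches as two separate lists, which can then be interleaved on demand. This is only an illustration under stated assumptions: `split_with_matches` and `interleave` are made-up names, not part of Python's `re` module or of Racket, and the sketch deals only with whole matches, not with per-group results.

```python
import re

def split_with_matches(pattern, text):
    """Hypothetical helper: return (pieces, matches).

    pieces  -- the unmatched substrings, i.e. what a plain split returns
    matches -- the whole matched substrings found between those pieces
    """
    pieces, matches = [], []
    pos = 0
    for m in re.finditer(pattern, text):
        pieces.append(text[pos:m.start()])  # text before this match
        matches.append(m.group(0))          # the match itself
        pos = m.end()
    pieces.append(text[pos:])               # trailing unmatched text
    return pieces, matches

def interleave(pieces, matches):
    """Hypothetical helper: weave matches back in between the pieces."""
    out = [pieces[0]]
    for match, piece in zip(matches, pieces[1:]):
        out.extend([match, piece])
    return out

pieces, matches = split_with_matches("[^0-9]", "123+456*/")
print(pieces)                       # the plain split result
print(matches)                      # the separators that were matched
print(interleave(pieces, matches))  # the interleaved form
```

For the example string, the first list is the same as the plain split result quoted above, and the interleaved list is the same as what Python returns when the whole regexp is wrapped in a single group. Keeping the two lists separate is one way to get the "multitool" behaviour without overloading the meaning of split.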