> Accepting input from a human is fraught with dangers and edge cases.

> Here's a non-regex solution
Thanks all for playing! And as usual I forgot a critical detail:

I'm writing a matcher for a Morelia /viridis/ Scenario step, so the matcher must be a single regexp.

  http://c2.com/cgi/wiki?MoreliaViridis

I'm avoiding the current situation, where Morelia pulls out (.*), and the step handler "manually" splits that up with:

  flags = re.split(r', (?:and )?', flags)

That means I already had a brute-force version. A regexp version is always better because, especially in Morelia, it validates input. (.*) is less specific than (\w+).

So if the step says:

  Alice has crypto keys apple, barley, and flax

Then the step handler could say (if this worked):

  def step_user_has_crypto_keys_(self, user, *keys):
      r'(\w+) has crypto keys (?:(\w+), )+and (\w+)'
      # assert that user with those keys here

That does not work because "a capturing group only remembers the last match". This would appear to be an irritating 'feature' in Regexp. The pattern does match the whole list 'apple, barley, and flax', but the stored groups behave as if each () were a single slot, so (\w+)+ would not store "more than one group".

Unless there's a Regexp workaround to mean "arbitrary number of slots for each ()", I /might/ go with this:

  got = re.findall(r'(?:(\w+), )?(?:(\w+), )?(?:(\w+), )?(?:(\w+), )?'
                   r'(?:(\w+), and )?(\w+)$', 'whatever a, bbb, and c')
  print(got)  # [('a', '', '', '', 'bbb', 'c')]

The trick is to simply paste in a high number of (?:(\w+), )? segments, assuming that nobody should plug in too many. Behavior Driven Development scenarios should be readable and not run-on. (Morelia has a table feature for when you actually need lots of arguments.)

Next question: Does re.search() return a match object that I can get ('a', '', '', '', 'bbb', 'c') out of? The calls to groups() and such always return this crazy ('a', 2, 'bbb', 'c') thing that would disturb my user-programmers.

--
  Phlip
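
The two regexp behaviours discussed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names STEP, PADDED, padded_groups and keys_only are hypothetical and not part of Morelia, and the cap of six keys is arbitrary. It shows a repeated capturing group keeping only its last iteration, and a match object from search() whose groups() is called with a default of '' so that slots which did not participate come back as empty strings rather than None.

```python
import re

# Hypothetical step text, taken from the example in the post.
STEP = 'Alice has crypto keys apple, barley, and flax'

# A repeated capturing group keeps only its final iteration,
# so 'apple' is matched but not stored.
repeated = re.search(r'(\w+) has crypto keys (?:(\w+), )+and (\w+)', STEP)
print(repeated.groups())

# Padded workaround: one optional slot per possible key.
# Assumption: nobody lists more than six keys in one step.
PADDED = re.compile(
    r'(?:(\w+), )?(?:(\w+), )?(?:(\w+), )?(?:(\w+), )?'
    r'(?:(\w+), and )?(\w+)$'
)


def padded_groups(text):
    """Return every slot, with '' for slots that did not participate."""
    match = PADDED.search(text)
    # The argument to groups() is the default for unmatched groups.
    return match.groups('') if match else None


def keys_only(text):
    """Drop the empty slots so callers see just the captured words."""
    slots = padded_groups(text)
    return [slot for slot in slots if slot] if slots is not None else None


print(padded_groups('whatever a, bbb, and c'))
print(keys_only('whatever a, bbb, and c'))
```

Filtering out the empty slots, as keys_only does, is one way to hand a step handler a clean list while the single regexp still validates the input.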