Hi all, I have a regex that matches dates in various formats. I've tested the regex in a reliable testbed, and it seems to match what I want (dates in formats like "1 Jan 2010" and "January 1, 2010" and also "January 2008"). It's just that using re.findall with it is giving me weird output. I'm using Python 2.6.5 here, and I've put in line breaks for clarity's sake:
>>> import re
>>> date_regex = re.compile(
...     r"([0-3]?[0-9])?((\s*)|(\t*))"
...     r"((Jan\.?u?a?r?y?)|(Feb\.?r?u?a?r?y?)|(Mar\.?c?h?)|(Apr\.?i?l?)"
...     r"|(May)|(Jun[e.]?)|(Jul[y.]?)|(Aug\.?u?s?t?)"
...     r"|(Sep[t.]?\.?e?m?b?e?r?)|(Oct\.?o?b?e?r?)"
...     r"|(Nov\.?e?m?b?e?r?)|(Dec\.?e?m?b?e?r?))"
...     r"((\s*)|(\t*))(2?0?[0-3]?[0-9]\,?)?"
...     r"((\s*)|(\t*))(2?0?[01][0-9])")
>>> test_output = re.findall(date_regex, 'January 2008')
>>> print test_output
[('', '', '', '', 'January', 'January', '', '', '', '', '', '', '', '', '', '', '', ' ', ' ', '', '20', '', '', '', '08')]
>>> test_output = re.findall(date_regex, 'January 1, 2008')
>>> print test_output
[('', '', '', '', 'January', 'January', '', '', '', '', '', '', '', '', '', '', '', ' ', ' ', '', '1,', ' ', ' ', '', '2008')]
>>> test_output = re.findall(date_regex, "The date was January 1, 2008. But it was not January 2, 2008.")
>>> print test_output
[('', ' ', ' ', '', 'January', 'January', '', '', '', '', '', '', '', '', '', '', '', ' ', ' ', '', '1,', ' ', ' ', '', '2008'), ('', ' ', ' ', '', 'January', 'January', '', '', '', '', '', '', '', '', '', '', '', ' ', ' ', '', '2,', ' ', ' ', '', '2008')]

A friend says: "I think that the problem is that every time that you have a parenthesis you get an output. Maybe there is a way to suppress this."

My friend's explanation accounts for the empty strings, but maybe not for the two Januaries. Either way, what I want is for re.findall, or some other re method that I haven't properly explored, to return the matched strings and nothing else. I've read the documentation and googled various permutations, and I can't figure it out.

Any help much appreciated.

Thanks,
Mike
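
For anyone reading along, here is a minimal sketch of the behaviour the friend's comment points at. It uses a much-simplified, hypothetical pattern (two month names, a day, a four-digit year) rather than the full regex above, so treat it as an illustration under those assumptions and not as a drop-in fix. When a pattern contains capturing groups, re.findall returns one tuple of groups per match; a group nested inside another group yields the same text twice, which would explain the two Januaries.

```python
import re

text = "The date was January 1, 2008. But it was not January 2, 2008."

# Hypothetical, simplified pattern with capturing groups, nested the same
# way as the month alternation in the original regex.
capturing = re.compile(r"((January)|(February)) (\d{1,2}), (\d{4})")

# The same pattern with (?:...) non-capturing groups and no other groups.
non_capturing = re.compile(r"(?:January|February) \d{1,2}, \d{4}")

# With groups: one tuple per match, one entry per group, and an empty
# string for any group that did not take part in the match.
groups_per_match = capturing.findall(text)

# Without capturing groups: findall returns the whole matched strings.
whole_matches = non_capturing.findall(text)

# Alternative that keeps the groups: ask each match object for group(0),
# which is always the entire match.
via_finditer = [m.group(0) for m in capturing.finditer(text)]

print(groups_per_match)
print(whole_matches)
print(via_finditer)
```

Either approach should carry over to the full pattern: rewrite each "(" that is only there for grouping as "(?:", or leave the pattern alone and collect m.group(0) from re.finditer.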