Dennis Lee Bieber wrote:
> On Sat, 20 Jan 2007 13:49:52 -0700, Steven Bethard
> <[EMAIL PROTECTED]> declaimed the following in comp.lang.python:
>
>> Within a larger pyparsing grammar, I have something that looks like::
>>
>>     wsj/00/wsj_0003.mrg
>>
>> When parsing this, I'd like to keep around both the full string, and the
>> AAA_NNNN substring of it, so I'd like something like::
>>
>>     >>> foo.parseString('wsj/00/wsj_0003.mrg')
>>     (['wsj/00/wsj_0003.mrg', 'wsj_0003'], {})
>>
> If working file name/paths, why not use the functions in os.path?

Two reasons. First, as I mentioned, this is within a larger pyparsing
grammar so it's not as easy to switch back and forth between the two.
Second, I do want to do some data validation (e.g. the name of the file
needs to be in a particular format) so I either need to post-process the
os.path approach or just do it in pyparsing.

>> But that then allows whitespace between the pieces of the path, which
>> there shouldn't be::
>>
> If you didn't have whitespace coming in, there shouldn't be any
> going out. If you do, you likely have malformed data and probably should
> detect it earlier...

Well that's the intention of using pyparsing here. With a proper
grammar, pyparsing can detect the malformed data for me and throw an error.

STeVe
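
For anyone reading along, here is a minimal sketch of the kind of grammar
discussed above. It is not the grammar from the original post. It assumes
the path format is letters, "/", two digits, "/", letters, "_", four
digits, ".mrg". The names mrg_path, keep_full_and_stem and parse_path are
hypothetical. Combine requires its pieces to be adjacent by default, so
whitespace inside the path is rejected, and a parse action returns both the
full string and the AAA_NNNN part. It uses pyparsing's camelCase method
names, as in the example quoted above.

```python
from pyparsing import Combine, ParseException, Word, alphas, nums


def keep_full_and_stem(tokens):
    # Combine has already joined the matched pieces into one string.
    full = tokens[0]
    # Drop the directories and the ".mrg" suffix to get the AAA_NNNN part.
    stem = full.rsplit("/", 1)[-1][:-len(".mrg")]
    return [full, stem]


# Assumed format: letters "/" two digits "/" letters "_" four digits ".mrg".
# Combine (adjacent by default) refuses whitespace between the pieces.
mrg_path = Combine(
    Word(alphas) + "/" + Word(nums, exact=2) + "/"
    + Word(alphas) + "_" + Word(nums, exact=4) + ".mrg"
)
mrg_path.setParseAction(keep_full_and_stem)


def parse_path(text):
    """Return [full_path, stem], or None if the text is malformed."""
    try:
        return mrg_path.parseString(text).asList()
    except ParseException:
        return None


print(parse_path('wsj/00/wsj_0003.mrg'))   # ['wsj/00/wsj_0003.mrg', 'wsj_0003']
print(parse_path('wsj/00/ wsj_0003.mrg'))  # None (whitespace inside the path)
```

Within a larger grammar, mrg_path would be used as one element among
others rather than called through a helper like parse_path, and the
ParseException would be left to propagate so the malformed input is
reported.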