> I have a file that is a long list of records (roughly) in the format
>
> [EMAIL PROTECTED]
>
> So, for example:
>
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> ....
>
> What I would like to do is run a regular expression against this and
> wind up with:

I'd recommend scratching out the requirement to use regular expressions.
*grin*  I'm actually not certain they're appropriate for this problem; it
seems that knowing about data structures like lists and dictionaries will
be more crucial here.

> Actually, should I be able to do something like that? If I execute it
> in my debugger, my string gets really funky... like the re is losing
> track of what the groups are... and I end up with a single really long
> string rather than what I expect..

I do not see an obvious regular expression that does what you want.  I'm
not saying that no such regex exists (I'd have to think about it a bit),
but that simpler approaches will probably work out better.

Would you mind if we simplify the problem a bit?  Rather than working
directly on files, what if you were working on tuples where the id and
the data portion were already split up for you?

That is, would life be simpler for you if you had a list like:

    [('id1', 'data1'),
     ('id1', 'data2'),
     ('id1', 'data3'),
     ('id1', 'data4'),
     ('id2', 'data1'),
     ...]

and given input like this, you were to try to compute something like a
dictionary from ids to a list of the data?

    { 'id1' : ['data1', 'data2', 'data3', 'data4'],
      'id2' : ['data1'],
      ...}

Would this be something you'd know how to do?

Best of wishes to you!
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
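
For anyone following along, here is one way the simplified problem above might be sketched. This is only an illustration, not necessarily the approach the original poster ended up with: the names `group_by_id` and `records` are made up for this example, and it assumes the input has already been split into (id, data) tuples as described.

```python
def group_by_id(pairs):
    """Build a dictionary mapping each id to the list of its data items.

    `pairs` is assumed to be an iterable of (id, data) tuples.  Data items
    are kept in the order in which they appear in the input.
    """
    grouped = {}
    for record_id, data in pairs:
        # setdefault returns the list already stored under record_id,
        # or stores and returns a new empty list the first time we see it.
        grouped.setdefault(record_id, []).append(data)
    return grouped


# Hypothetical sample input, mirroring the list shown above.
records = [('id1', 'data1'), ('id1', 'data2'), ('id1', 'data3'),
           ('id1', 'data4'), ('id2', 'data1')]

result = group_by_id(records)
```

The same loop could also be written with `collections.defaultdict(list)`; `setdefault` is used here only because it needs no import. Getting from the file to the list of tuples is a separate step and is left out, since the exact record format is not shown.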