Re: [Tutor] Help with re.sub()

Danny Yoo Thu, 16 Mar 2006 21:05:38 -0800

> I have a file that is a long list of records (roughly) in the format
>
> [EMAIL PROTECTED]
>
> So, for example:
>
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> [EMAIL PROTECTED]
> ....
>
> What I would like to do is run a regular expression against this and
> wind up with:


I'd recommend scratching out the requirement to use regular expressions.
*grin*

I'm actually not certain they're appropriate for this problem; it seems
that knowing about data structures like lists and dictionaries will be
more crucial here.


> Actually, should I be able to do something like that?  If I execute it
> in my debugger, my string gets really funky... like the re is losing
> track of what the groups are... and I end up with a single really long
> string rather than what I expect..

I do not see an obvious regular expression that does what you want.
I'm not saying that no such regex exists (I'd have to think about it a
bit), but that simpler approaches will probably work out better.



Would you mind if we simplify the problem a bit?  Rather than working
directly on files, what if you were working on tuples where the id and the
data portion were already split up for you?

That is, would life be simpler for you if you had a list like:

[('id1', 'data1'),
 ('id1', 'data2'),
 ('id1', 'data3'),
 ('id1', 'data4'),
 ('id2', 'data1'),
 ...]

and, given input like this, you were asked to compute a dictionary that
maps each id to a list of its data?

{ 'id1' : ['data1', 'data2', 'data3', 'data4'],
  'id2' : ['data1'],
  ...}

Would this be something you'd know how to do?
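
In case it helps as a starting point, here is a minimal sketch of that
grouping step.  It assumes the records have already been split into
(id, data) tuples like the list above; the name group_by_id and the sample
list are made up for illustration and are not part of your program.

```python
def group_by_id(pairs):
    """Map each id to the list of data values seen for it, in input order."""
    grouped = {}
    for record_id, data in pairs:
        # setdefault returns the list already stored for this id,
        # or stores and returns a new empty list the first time we see it.
        grouped.setdefault(record_id, []).append(data)
    return grouped


# Hypothetical sample input, mirroring the list of tuples shown above.
pairs = [
    ('id1', 'data1'),
    ('id1', 'data2'),
    ('id1', 'data3'),
    ('id1', 'data4'),
    ('id2', 'data1'),
]

result = group_by_id(pairs)
print(result)
```

Once the dictionary exists, producing one output line per id is just a loop
over its items, with no regular expression needed.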


Best of wishes to you!

_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
