Hi Shai,

The file is indeed large, about 6 MB. I added print statements before and after each line of the script and found that the only line that took more than 1 second was:

    match = re.search(pattern, txt, re.S)

It took about 5 minutes!
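For reference, the measurement described above could be sketched roughly as follows. This is only an illustration, not the script from this thread: the `timed` helper, the sample `txt`, and the `pattern` are hypothetical stand-ins, since the real pattern and file are not shown here.

```python
import re
import time


def timed(label, func, *args, **kwargs):
    """Call func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.3f} s")
    return result, elapsed


# Hypothetical stand-ins for the real file contents and pattern.
txt = "header\n" + "filler line\n" * 1000 + "BEGIN one two three END\n"
pattern = r"BEGIN(.*?)END"

# Time only the search step, the one reported as slow above.
match, search_seconds = timed("re.search", re.search, pattern, txt, re.S)

# words is only assigned a non-empty list when the search succeeds.
words = match.group(1).split() if match else []
print(words)
```

Wrapping each step in a call like `timed(...)` gives the same information as print lines before and after each statement, with the elapsed time computed directly.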
Best Regards,
Yitzhak

-----Original Message-----
From: Shai Berger [mailto:[email protected]]
Sent: Tuesday, May 25, 2010 12:47 PM
To: Yitzhak Wiener; [email protected]
Subject: Re: [Python-il] [python-il]location in file

Hi Yitzhak,

First of all, please keep the discussion public.

On Tuesday 25 May 2010 12:04:47 you wrote:
>
> Thanks a lot. It works, but it takes a vvvvvveeeeeeeeerrrrrryyyyyyyy
> llllllllllllllllooooooooooonnnnnnnnnnnnnnggggggggggggggggg time to
> complete, something like 10 minutes. Why? Is there a way to make it
> significantly faster?
>

That is quite odd; regular expression search of this kind is usually quite fast, even when the files are large (how large is your file?). Can you try the different parts separately to find out where the problem is?

(I'm leaving the old messages in as they were not sent to the list)

Shai.

> -----Original Message-----
> From: Shai Berger [mailto:[email protected]]
> Sent: Monday, May 24, 2010 11:14 PM
> To: Yitzhak Wiener
> Subject: Re: [Python-il] [python-il]location in file
>
> On Monday 24 May 2010, you wrote:
> > Shai, thanks.
> >
> > What type is 'words'? I wanted to print it but " name 'words' is not
> > defined "!
>
> Serves me right for posting untested code...
>
> words is a list of strings, but it is only assigned if the search is
> successful; and that only happens when using re.S instead of re.M (you
> had re.S in your original, line-separating code, where it did no good;
> it only matters when you search multiline texts. I had confused re.M
> for re.S).
>
> Sorry,
> Shai.
>
> ______________________________________________________________________
> DSP Group, Inc. automatically scans all emails and attachments using
> MessageLabs Email Security System.
> _____________________________________________________________________

_______________________________________________
Python-il mailing list
[email protected]
http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
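
The re.S versus re.M point in the quoted message can be illustrated with a small sketch. The sample `text` and `pattern` below are hypothetical and are not the ones used in this thread. They only show that re.S (DOTALL) lets "." match newlines, while re.M (MULTILINE) only changes where "^" and "$" match.

```python
import re

# Hypothetical multiline text; the markers sit on different lines.
text = "start\nalpha beta\nend"
pattern = r"start(.*)end"

# Without flags, "." does not match "\n", so the search fails.
no_flag = re.search(pattern, text)

# re.M only affects "^" and "$"; "." still stops at "\n", so this fails too.
with_m = re.search(pattern, text, re.M)

# re.S lets "." match "\n", so the pattern can span several lines.
with_s = re.search(pattern, text, re.S)

# Guard the assignment so 'words' is always defined, even on failure.
words = with_s.group(1).split() if with_s else []

print(no_flag, with_m, words)
```

Assigning `words` in both the success and failure cases also avoids the "name 'words' is not defined" error mentioned in the quoted message.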
