Hi Shai,

The file is indeed large, ~6MB.
I added print lines before and after each statement and found that the
only one that took more than 1 second was
" match = re.search(pattern, txt, re.S) "; it took ~5 minutes!

 
Best Regards,
Yitzhak

-----Original Message-----
From: Shai Berger [mailto:[email protected]] 
Sent: Tuesday, May 25, 2010 12:47 PM
To: Yitzhak Wiener; [email protected]
Subject: Re: [Python-il] [python-il]location in file

Hi Yitzhak,

First of all, please keep the discussion public.


On Tuesday 25 May 2010 12:04:47 you wrote:
> 
> Thanks a lot. It works, but it takes a vvvvvveeeeeeeeerrrrrryyyyyyyy
> llllllllllllllllooooooooooonnnnnnnnnnnnnnggggggggggggggggg time to
> complete, something like 10 minutes. Why? Is there a way to make it
> significantly faster?
> 

That is quite odd; regular expression search of this kind is usually
quite fast, even when the files are large (how large is your file?).

Can you try the different parts separately to find out where the
problem is?
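
One way to time the parts separately is sketched below. This is only an illustration: the `timed` helper, the sample text and the pattern are all made up here, since the actual script and pattern are not shown in this thread.

```python
import re
import time

def timed(label, func, *args):
    """Run func(*args), print how long it took, return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    print("%s: %.3f s" % (label, elapsed))
    return result, elapsed

# Stand-in for the file contents; the real script would read the file here.
txt = "header\n" + "filler line\n" * 1000 + "begin alpha beta end\n"

# Hypothetical pattern, used only for illustration.
pattern = r"begin (.*?) end"

# Time each step on its own so the slow one stands out.
match, search_seconds = timed("re.search", re.search, pattern, txt, re.S)
words, split_seconds = timed("split", match.group(1).split)
```

If one step dominates the total, that is the one to look at more closely.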

(I'm leaving the old messages in as they were not sent to the list)

Shai.

> -----Original Message-----
> From: Shai Berger [mailto:[email protected]]
> Sent: Monday, May 24, 2010 11:14 PM
> To: Yitzhak Wiener
> Subject: Re: [Python-il] [python-il]location in file
> 
> On Monday 24 May 2010, you wrote:
> > Shai, thanks.
> >
> > What type is 'words'? I wanted to print it but " name 'words' is not
> > defined "!
> 
> Serves me right for posting untested code...
> 
> words is a list of strings, but it is only assigned if the search is
> successful; and that only happens when using re.S instead of re.M (you
> had re.S in your original, line-separating code, where it did no good;
> it only matters when you search multiline texts. I had confused re.M
> for re.S).
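
A small sketch of the re.S / re.M difference described above, assuming a made-up text and pattern (neither is the one from the original code):

```python
import re

# Made-up multiline text and pattern, for illustration only.
txt = "start\nfirst second\nstop\n"
pattern = r"start(.*)stop"

# Without flags, "." does not match a newline, so the search fails.
no_flag = re.search(pattern, txt)

# re.M only changes where "^" and "$" match; "." still stops at newlines.
with_m = re.search(pattern, txt, re.M)

# re.S lets "." match newlines too, so the pattern can span several lines.
with_s = re.search(pattern, txt, re.S)

# Assign words unconditionally so the name exists even when nothing matched.
words = with_s.group(1).split() if with_s else []
print(no_flag, with_m, words)
```

Giving `words` a value on both branches also avoids the "name 'words' is not defined" error when the search fails.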
> 
> Sorry,
>       Shai.
> 
> ______________________________________________________________________
> DSP Group, Inc. automatically scans all emails and attachments using
> MessageLabs Email Security System.
> _____________________________________________________________________
> 

_______________________________________________
Python-il mailing list
[email protected]
http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
