Re: [Tutor] Example for read and readlines() (Asad)
Asad,

Like many projects, there may be many ways to do things, BUT some rules do apply.

You can only read an open file ONCE unless you seek back to the beginning or reopen it.

    string = f3.read()
    string1 = f3.readlines()

The first line reads the entire file into a single buffer. The second line won't work as intended, because the first has already consumed the entire file.

Much of the rest is not organized well enough for me to understand what you want to do. I find it important for people to try some simple things, like examining the values step by step. Had you typed

    print (string)
    print (string1)

on a small sample file, you might have fixed that before continuing. Then, at each step along the way, you could examine the result and verify that it made sense up to that point.

Try writing the outline of the logic of your program first, in English or your native language, as an algorithm. Then see what tools are needed. Look at a sample of the log you are evaluating and see what it takes to locate the lines you want and then to break out the parts you want to keep for further use.

What I see looks like this:

If you find one instance of the string "ERR1", then you want to find ALL (nonoverlapping) regions consisting of an upper-case letter followed by two lower-case letters and a space and either a space or digits 1 to 3 and digits 0-9 and a space and ...

Fairly complex pattern. But you are searching the contents of the ENTIRE file for this, and since you seem to have wanted to replace all newlines with spaces and your pattern includes spaces, this would match something that wrapped around from line to line. Is this what you wanted?

You then switch gears to using the readlines version, and I decided to get back to my regularly scheduled life. As noted, that is probably an empty list or worse.

Good luck.

-----Original Message-----
From: Tutor On Behalf Of Asad
Sent: Sunday, November 11, 2018 8:54 PM
To: tutor@python.org
Subject: Re: [Tutor] Example for read and readlines() (Asad)

Hi All,

Thanks for the reply.

I am building a framework for the two error conditions, therefore I need to read and readlines because in one only regex is required and in other regex+ n-1 line is required to process:

    #Here we are opening the file and substituting space " " for each \n encountered
    f3 = open (r"D:\QI\log.log", 'r')
    string = f3.read()
    string1 = f3.readlines()
    regex = re.compile ( "\n" )
    st = regex.sub ( " ", string )

    if re.search('ERR1',st):
        y=re.findall("[A-Z][a-z][a-z] [ 123][0-9] [012][0-9]:[0-5][0-9]:[0-5][0-9] [0-9][0-9][0-9][0-9]",st)
        print y

    patchnumber = re.compile(r'(\d+)\/(\d+)')
    ==> does not work; it only works if I use #string = f3.read()

    for j in range(len(string1)):
        if re.search ( r'ERR2', string1[j] ):
            print "Error line \n", string1[j - 1]
            mo = patchnumber.search (string1[j-1])
            a = mo.group()
            print a
            print os.getcwd()
            break

Please advise how to proceed.

Thanks,

___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
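The "read once unless you seek back" rule from the reply above can be checked on a tiny sample. This is only a minimal sketch in Python 3 syntax: io.StringIO stands in for the log file (an assumption made so the example runs without D:\QI\log.log), and the three sample lines are made up.

```python
import io

# io.StringIO stands in for an open log file so the sketch runs anywhere.
# For these calls a text file object returned by open() behaves the same way.
f3 = io.StringIO("first line\nERR1 second line\nthird line\n")

whole = f3.read()            # reads everything and leaves the position at the end
leftover = f3.readlines()    # nothing is left to read, so this is an empty list

print(repr(whole))
print(leftover)

f3.seek(0)                   # rewind to the beginning
lines = f3.readlines()       # the same data, now split into lines
print(lines)
```

Printing the intermediate values like this is the step-by-step check the reply recommends; the empty list from the second call is the symptom to look for.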
Re: [Tutor] Example for read and readlines() (Asad)
On 12Nov2018 07:24, Asad wrote:
> Thanks for the reply.
> I am building a framework for the two error conditions, therefore I need to
> read and readlines because in one only regex is required and in other
> regex+ n-1 line is required to process:
>
> #Here we are opening the file and substituting space " " for each \n encountered
> f3 = open (r"D:\QI\log.log", 'r')
> string = f3.read()
> string1 = f3.readlines()

My first remark is that both these lines read _and_ _consume_ the file content. So "string" gets the entire file content, and "string1" gets an empty list of lines, because the file is already at the end, where there is no more data.

It is also better to use this idiom to read and then close a file:

    with open(r"D:\QI\log.log", 'r') as f3:
        string = f3.read()

This reliably closes f3 once the "with" suite completes, even if there's some kind of exception.

You need 2 copies of the file data. You can do this 2 ways. The first way is to read the file twice:

    with open(r"D:\QI\log.log", 'r') as f3:
        string = f3.read()
    with open(r"D:\QI\log.log", 'r') as f3:
        string1 = f3.readlines()

The efficient way is to read the file once, then make string from string1, or string1 from string. For example:

    with open(r"D:\QI\log.log", 'r') as f3:
        string1 = f3.readlines()
    string = ''.join(string1)

> regex = re.compile ( "\n" )
> st = regex.sub ( " ", string )

Using a regular expression to replace a fixed string such as "\n" is overkill. Consider:

    st = string.replace("\n", " ")

Python strings have a bunch of handy methods for common simple things. Have a read of the docs for further detail.

> if re.search('ERR1',st):
>     y=re.findall("[A-Z][a-z][a-z] [ 123][0-9] [012][0-9]:[0-5][0-9]:[0-5][0-9] [0-9][0-9][0-9][0-9]",st)
>     print y

On the other hand, a regexp is a good tool for something like the above.

> patchnumber = re.compile(r'(\d+)\/(\d+)')
> ==> does not work; it only works if I use #string = f3.read()

This may be because "string" is a single string (the whole file text as one string). "string1" is a _list_ of individual strings, one for each line. Personally, I would call this "strings" or "lines" or some other plural word; your code will be easier to read, and easier to debug. Conversely, a misleading name makes debugging harder because you expect the variable to contain what its name suggests, and if it doesn't this will impede you in finding problems, because you will be thinking the wrong thing about what your program is doing.

> for j in range(len(string1)):
>     if re.search ( r'ERR2', string1[j] ):
>         print "Error line \n", string1[j - 1]
>         mo = patchnumber.search (string1[j-1])
>         a = mo.group()
>         print a
>         print os.getcwd()
>         break
>
> Please advise how to proceed.

mo.group() returns the whole match. The above seems to look for the string 'ERR2' in a line, and look for a patch number in the previous line. Is that what it is supposed to do?

If the above isn't working, it would help to see the failing output and a description of what good output is meant to look like.

Finally, please consider turning off "digest mode" in your list subscription. It will make things easier for everyone.

Cheers,
Cameron Simpson
Re: [Tutor] Example for read and readlines() (Asad)
Hi All,

Thanks for the reply.

I am building a framework for the two error conditions, therefore I need to read and readlines because in one only regex is required and in other regex+ n-1 line is required to process:

    #Here we are opening the file and substituting space " " for each \n encountered
    f3 = open (r"D:\QI\log.log", 'r')
    string = f3.read()
    string1 = f3.readlines()
    regex = re.compile ( "\n" )
    st = regex.sub ( " ", string )

    if re.search('ERR1',st):
        y=re.findall("[A-Z][a-z][a-z] [ 123][0-9] [012][0-9]:[0-5][0-9]:[0-5][0-9] [0-9][0-9][0-9][0-9]",st)
        print y

    patchnumber = re.compile(r'(\d+)\/(\d+)')
    ==> does not work; it only works if I use #string = f3.read()

    for j in range(len(string1)):
        if re.search ( r'ERR2', string1[j] ):
            print "Error line \n", string1[j - 1]
            mo = patchnumber.search (string1[j-1])
            a = mo.group()
            print a
            print os.getcwd()
            break

Please advise how to proceed.

Thanks,
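The two checks in Asad's message (the ERR1 timestamp search over the whole text and the ERR2 lookup of a patch number on the previous line) can share a single read of the file. The following is a rough sketch in Python 3 syntax, not a drop-in fix: the function name scan(), the sample lines, and the "27/104" patch number are all invented for illustration, and the two patterns are taken from the message as written.

```python
import re

# Patterns copied from the message; "/" needs no backslash in a Python regex.
PATCH_RE = re.compile(r"(\d+)/(\d+)")
STAMP_RE = re.compile(
    "[A-Z][a-z][a-z] [ 123][0-9] [012][0-9]:[0-5][0-9]:[0-5][0-9] "
    "[0-9][0-9][0-9][0-9]"
)


def scan(lines):
    """Return (timestamps, patch) from a list of lines read once."""
    # Build the single-string view from the list instead of reading twice.
    text = "".join(lines).replace("\n", " ")
    stamps = STAMP_RE.findall(text) if "ERR1" in text else []

    patch = None
    for j, line in enumerate(lines):
        if "ERR2" in line:
            # Guard j > 0: lines[-1] would silently be the LAST line.
            if j > 0:
                mo = PATCH_RE.search(lines[j - 1])
                # search() returns None when nothing matches, so check first.
                if mo is not None:
                    patch = mo.group()
            break
    return stamps, patch


# Hypothetical sample; in real use: with open(path) as f: lines = f.readlines()
SAMPLE = [
    "Applying patch 27/104\n",
    "ERR2 patch failed\n",
    "ERR1 at Nov 11 20:54:03 2018\n",
]
stamps, patch = scan(SAMPLE)
print(stamps, patch)
```

The two guards are the main design choice here: without them an ERR2 on the first line looks at the last line of the file, and a previous line with no patch number raises AttributeError on mo.group().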
Re: [Tutor] Example for read and readlines()
On 11/11/2018 10:04, Asad wrote:

> 1) and I want to extract the start time , error number and end
> time from this logfile so in this case what should I use I guess option 1 :
>
> with open(filename, 'r') as f:
>     for line in f:
>         process(line)

Yes, that would be the best choice in that scenario.

> 2) Another case is a text formatted logfile and I want to print (n-4)
> lines n is the line where error condition was encountered .

In that case you could use readlines() if it is a small file. Or you could save the last 5 lines and print those each time you find an error line. You should probably write a function to save the line, since it needs to move the previous lines up one.

    buffer = ['','','','','']

    def saveLine(line, buff):
        # shift the previous lines up one slot, newest line goes last
        buff[0] = buff[1]
        buff[1] = buff[2]
        buff[2] = buff[3]
        buff[3] = buff[4]
        buff[4] = line

    for line in file:
        saveLine(line,buffer)
        if error_condition:    # placeholder for your own test
            printBuffer()      # placeholder: print the saved lines

readlines is simpler but stores the entire file in memory. The buffer saves memory but requires some extra processing to save/print. There are some modules for handling cyclic stores etc. but in a simple case like this they are probably overkill.

> 3) Do we need to ensure that each line in the logfile ends with \n .
>
> \n is not visible so can we verify in someway to proof EOL \n is placed
> in the file .

Text file lines are defined by the existence of the \n, so both readlines() and a loop over the file will read multiple lines as one if a \n is missing.

You could use read() and a regex to check for some text marker and insert the newlines. This would be best if the whole file had them missing.

If it is just an occasional line, then you can iterate over the file as usual and check each line for a missing \n and insert (or split) as needed. You might want to write the modified lines back to a new file.
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
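For comparison with the hand-written five-slot buffer in the message above, here is a minimal sketch of the same idea using collections.deque from the standard library, which is one of the "cyclic store" options. It is an alternative, not what the message proposes; the function name context_before_errors, the "ERR" marker, and the sample lines are assumptions made for illustration.

```python
from collections import deque


def context_before_errors(lines, marker="ERR", keep=5):
    """Yield the last `keep` lines (error line included) for each error line."""
    # A deque with maxlen drops the oldest entry automatically on append,
    # which replaces the manual shifting done by a hand-rolled buffer.
    recent = deque(maxlen=keep)
    for line in lines:
        recent.append(line)
        if marker in line:
            yield list(recent)


# Hypothetical sample: seven lines with an error on the sixth.
sample = ["l%d\n" % i for i in range(1, 8)]
sample[5] = "ERR at l6\n"

hits = list(context_before_errors(sample))
for block in hits:
    print("".join(block), end="")
```

Because the function only iterates, an open file object can be passed in place of the list, so the whole file never has to be held in memory.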
Re: [Tutor] Example for read and readlines()
On 11/11/2018 09:40, Steven D'Aprano wrote:

>> f3 = open ( r"/a/b/c/d/test/test_2814__2018_10_05_12_12_45/logA.log", 'r' )
>
> Don't use raw strings r"..." for pathnames.

Umm, why not?

--
Alan G
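The reasoning behind the advice is not given in this part of the thread, so the sketch below does not try to settle the question. It only shows what the r prefix changes and what it does not, using made-up paths (C:\temp\new.log and /a/b/c/logA.log are illustrative, not taken from anyone's system).

```python
# With backslashes, the r prefix matters: \t and \n are escape sequences.
raw = r"C:\temp\new.log"        # backslashes kept literally
escaped = "C:\\temp\\new.log"   # the same text written with doubled backslashes
plain = "C:\temp\new.log"       # \t becomes a TAB and \n becomes a newline

print(raw == escaped)           # same string, two spellings
print(repr(plain))              # shows the embedded control characters

# With forward slashes there is nothing to escape, so r"..." changes nothing.
posix_raw = r"/a/b/c/logA.log"
posix_plain = "/a/b/c/logA.log"
print(posix_raw == posix_plain)
```

One related fact about the syntax: a raw string literal cannot end in a single backslash, so r"C:\temp\" is a syntax error.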
Re: [Tutor] Example for read and readlines()
On 11/11/2018 06:49, Asad wrote:

> Hi All ,
>
> If I am loading a logfile what should I use from the option 1,2,3
>
> f3 = open ( r"/a/b/c/d/test/test_2814__2018_10_05_12_12_45/logA.log", 'r' )
>
> 1) should only iterate over f3

This is best for processing line by line, which is the most common way to handle files. It saves memory and allows you to exit early, without reading the entire file, if you are only looking for, say, a single entry.

    for line in file:
        if terminal_Condition: break
        # process line here

> 2) st = f3.read()

The best solution if you want to process individual characters or small character groups. Also best if you want to process the entire file at once, for example using a regular expression which might span lines.

> 3) st1 = f3.readlines()

Mainly historical and superseded by iterating over the file. But sometimes useful if you need to do multiple passes over the lines, since it only reads the file once. Very heavy memory footprint for big files.

--
Alan G
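The difference between option 1 (iterate and stop early) and option 2 (read everything so that a pattern can span lines) shows up on a very small input. This is a sketch under assumptions: the log text, the "status:" pattern, and the helper name first_match are invented for the example, and io.StringIO stands in for a real file.

```python
import io
import re

# Made-up log in which the interesting value sits on the line AFTER its label.
LOG = "start job 7\nstatus:\n  FAILED\nend job 7\n"

# \s+ can cross the line break, but only if the regex sees both lines at once.
SPAN_RE = re.compile(r"status:\s+(\w+)")


def first_match(fileobj, needle):
    """Option 1: iterate line by line and stop at the first hit."""
    for number, line in enumerate(fileobj, start=1):
        if needle in line:
            return number, line.rstrip("\n")
    return None


early = first_match(io.StringIO(LOG), "FAILED")

# Option 2: the whole text in one string, so the pattern can span lines.
whole = SPAN_RE.search(io.StringIO(LOG).read())

# The same pattern applied to one line at a time never matches.
per_line = [SPAN_RE.search(line) for line in LOG.splitlines(True)]

print(early)
print(whole.group(1))
print(per_line)
```

The last result is the reason to reach for read(): each line on its own contains only half of what the pattern needs.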
Re: [Tutor] Example for read and readlines()
On Sun, Nov 11, 2018 at 12:19:36PM +0530, Asad wrote:

> Hi All ,
>
> If I am loading a logfile what should I use from the option 1,2,3

Depends what you want to do. I assume that the log file is formatted into lines of text, so you probably want to iterate over each line.

    with open(filename, 'r') as f:
        for line in f:
            process(line)

is the best idiom to use for line-by-line iteration. It only reads each line as needed, not all at once, so it can handle huge files even if the file is bigger than the memory you have.

> f3 = open ( r"/a/b/c/d/test/test_2814__2018_10_05_12_12_45/logA.log", 'r' )

Don't use raw strings r"..." for pathnames.

> 1) should only iterate over f3
>
> 2) st = f3.read()

Use this if you want to iterate over the file character by character, after reading the entire file into memory at once.

> 3) st1 = f3.readlines()

Use this if you want to read all the lines into memory at once.

--
Steve
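The line-by-line idiom above can be wrapped in a generator so that only matching lines are kept, however large the file is. A minimal sketch, assuming a temporary file with three invented lines in place of a real log; matching_lines is a hypothetical helper name and plays the role of process(line).

```python
import os
import tempfile


def matching_lines(path, needle):
    """Yield (line number, text) for each line of the file containing needle."""
    with open(path, "r") as f:
        # The file object yields one line at a time; nothing else is stored.
        for number, line in enumerate(f, start=1):
            if needle in line:
                yield number, line.rstrip("\n")


# Write a small throwaway file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("boot ok\nERR1 disk full\nshutdown\n")
    path = tmp.name

try:
    found = list(matching_lines(path, "ERR1"))
finally:
    os.remove(path)   # clean up the temporary file

print(found)
```

The "with" block inside the generator closes the file when the loop finishes, so the caller does not have to manage the file handle.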