Hi Alan,

I made a mistake: I incorrectly assumed that the difference between the 54 lines of output and the 27 lines of output was the result of removing duplicate email addresses, i.e., [email protected] [email protected], [email protected], [email protected]
Apparently, this is not the case and I was wrong :( The solution to the problem is in the desired line output: [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] There were 27 lines in the file with From as the first word Not in the output of a subset. Latest output: set(['[email protected]', '[email protected]', ' [email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', ' [email protected]', '[email protected]', ' [email protected]']) ← Mismatch There were 54 lines in the file with From as the first word Latest revised code: fname = raw_input("Enter file name: ") if len(fname) < 1 : fname = "mbox-short.txt" fh = open(fname) count = 0 addresses = set() for line in fh: if line.startswith('From'): line2 = line.strip() line3 = line2.split() line4 = line3[1] addresses.add(line4) count = count + 1 print addresses print "There were", count, "lines in the file with From as the first word" Regards, Hal On Sat, Aug 1, 2015 at 5:44 PM, Alan Gauld <[email protected]> wrote: > On 02/08/15 00:07, Ltc Hotspot wrote: > >> Question1: The output result is an address or line? >> > > Its your assignment,. you tell me. > But from your previous mails I'm assuming you want addresses? > > Question2: Why are there 54 lines as compared to 27 line in the desired >> output? >> > > Because the set removes duplicates? So presumably there were 27 > duplicates? (Which is a suspicious coincidence!) 
>
>> fname = raw_input("Enter file name: ")
>> if len(fname) < 1 : fname = "mbox-short.txt"
>> fh = open(fname)
>> count = 0
>> addresses = set()
>> for line in fh:
>>     if line.startswith('From'):
>>         line2 = line.strip()
>>         line3 = line2.split()
>>         line4 = line3[1]
>>         addresses.add(line4)
>>         count = count + 1
>> print addresses
>> print "There were", count, "lines in the file with From as the first word"
>>
>
> That looks right in that it does what I think you want it to do.
>
>> The output result:
>> set(['[email protected]', '[email protected]',
>> '[email protected]', '[email protected]', '[email protected]',
>> '[email protected]', '[email protected]', '[email protected]',
>> '[email protected]', '[email protected]',
>> '[email protected]']) ← Mismatch
>>
>
> That is the set of unique addresses, correct?
>
>> There were 54 lines in the file with From as the first word
>>
>
> And that seems to be the number of lines in the original file
> starting with From. Can you check manually if that is correct?
>
>> The desired output result:
>> [email protected]
>> [email protected]
>> [email protected]
>> [email protected]
>> [email protected]
>> [email protected]
>>
> ...
>
> Now I'm confused again. This has duplicates but you said you
> did not want duplicates? Which is it?
>
> ...
>
>> [email protected]
>> [email protected]
>> There were 27 lines in the file with From as the first word
>>
>
> And this is reporting the number of lines in the output
> rather than the file (I think). Which do you want?
>
> It's easy enough to change the code to give the output
> you demonstrate, but that's not what you originally asked
> for. So just make up your mind exactly what it is you want
> out and we can make it work for you.
>
> --
> Alan G
> Author of the Learn to Program web site
> http://www.alan-g.me.uk/
> http://www.amazon.com/author/alan_gauld
> Follow my photo-blog on Flickr at:
> http://www.flickr.com/photos/alangauldphotos
>
>
> _______________________________________________
> Tutor maillist - [email protected]
> To unsubscribe or change subscription options:
> https://mail.python.org/mailman/listinfo/tutor
>
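
A possible explanation for the 54 versus 27 discrepancy discussed above, offered as a sketch rather than a confirmed diagnosis: in the usual mbox layout each message has both an envelope line beginning with "From " (with a trailing space) and a header line beginning with "From:". A test of startswith('From') matches both, which would double the count. The desired output also keeps duplicates, so a list fits better than a set. The sketch below assumes that layout; the function name from_addresses and the sample lines are hypothetical, and it uses Python 3 print() rather than the Python 2 print statement used in the thread.

```python
def from_addresses(lines):
    """Collect the sender address from each line that starts with 'From '."""
    senders = []
    for line in lines:
        # 'From ' (note the trailing space) matches only the envelope line;
        # a bare 'From' would also match the 'From:' header line.
        if line.startswith('From '):
            words = line.split()
            if len(words) > 1:  # guard against a bare 'From ' line
                senders.append(words[1])
    return senders


# Hypothetical sample standing in for mbox-short.txt (an assumption, not real data).
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "Return-Path: <postmaster@example.com>\n",
    "From: alice@example.com\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
    "From: bob@example.com\n",
]

senders = from_addresses(sample)
for address in senders:
    print(address)
print("There were", len(senders), "lines in the file with From as the first word")
```

To run this against a real file, pass the open file handle instead of the sample list, for example from_addresses(open("mbox-short.txt")).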
