Pete O'Connell wrote:

> Hi I am trying to parse a text file and create a list of all the lines
> that don't include: "vn", "vt" or are empty. I want to make this as
> fast as possible because I will be parsing many files each containing
> thousands of lines. I thought I would give list comprehensions a try.
> The last 3 lines of the code below have three list comprehensions that
> I would like to combine into 1 but I am not sure how to do that.
> Any tips would be greatly appreciated
>
> pete
>
> #start############################################################
> fileName = '/usr/home/poconnell/Desktop/objCube.obj'
> theFileOpened = open(fileName,'r')
> theTextAsList = theFileOpened.readlines()

If you have a file with 1,000,000 lines, you now have a list of 1,000,000
strings, of which perhaps 1,000 match your criteria. You are squandering
memory. Rule of thumb: never use readlines(); iterate over the file directly.

> theTextAsListStripped = []
> for aLine in theTextAsList:
>
>     theTextAsListStripped.append(aLine.strip("\n"))
>
> theTextAsListNoVn = [x for x in theTextAsListStripped if "vn" not in x]
> theTextAsListNoVnOrVt = [x for x in theTextAsListNoVn if "vt" not in x]
> theTextAsListNoVnOrVtOrEmptyLine = [x for x in theTextAsListNoVn if x !=
> ""]

I think that should be

    theTextAsListNoVnOrVtOrEmptyLine = [x for x in theTextAsListNoVnOrVt if x != ""]

You can combine the three conditions with "and" in one list-comp:

    with open(filename) as lines:
        wanted = [line.strip("\n") for line in lines
                  if "vn" not in line and "vt" not in line and line != "\n"]

You can even have multiple if clauses in one list-comp (but that is rarely
used):

    with open(filename) as lines:
        wanted = [line.strip("\n") for line in lines
                  if "vn" not in line
                  if "vt" not in line
                  if line != "\n"]

While your problem is simple enough to combine all filters into one
list-comp, some problems are not. You can then prevent the intermediate lists
from materializing by using generator expressions. The result minimizes
memory consumption, too, and should be (almost) as fast. For example:

    with open(filename) as lines:
        # use gen-exps to remove empty and whitespace-only lines
        stripped = (line.strip() for line in lines)
        nonempty = (line for line in stripped if line)
        wanted = [line for line in nonempty
                  if "vt" not in line and "vn" not in line]
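
Since you mention parsing many files, the same filtering could also be wrapped in a generator function, so each file is read lazily and closed when done. This is only a sketch: the name wanted_lines and the sample data are made up for illustration, and like the gen-exp version it strips all surrounding whitespace rather than just the newline.

```python
import os
import tempfile


def wanted_lines(path):
    """Yield stripped lines that are non-empty and contain neither "vn" nor "vt".

    Hypothetical helper, not part of the original code.
    """
    with open(path) as lines:
        # Iterate over the file directly; no readlines(), no intermediate lists.
        for line in lines:
            line = line.strip()
            if line and "vn" not in line and "vt" not in line:
                yield line


# Small self-check on a throwaway file with made-up OBJ-like content.
sample = "v 1.0 2.0 3.0\nvn 0.0 1.0 0.0\n\nvt 0.5 0.5\n   \nf 1 2 3\n"
with tempfile.NamedTemporaryFile("w", suffix=".obj", delete=False) as tmp:
    tmp.write(sample)

try:
    result = list(wanted_lines(tmp.name))
finally:
    os.remove(tmp.name)

print(result)  # ['v 1.0 2.0 3.0', 'f 1 2 3']
```

Calling list(wanted_lines(path)) gives you the same kind of list as before, while a plain for loop over wanted_lines(path) never holds more than one line in memory at a time.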