Re: Memory error due to the huge/huge input file size
On Nov 10, 4:47 pm, [EMAIL PROTECTED] wrote:
> Hello Everyone, I need to read a .csv file which has a size of 2.26 GB. And my Computer has 2 GB RAM. [...]
> I understand that the issue is Memory error and it is caused because of the line reader2_list.extend(reader2). Is there any other alternative method in reading the .csv file line by line?
> sincerely, Suprabhath

Thanks a lot, James Mills. It worked.

--
http://mail.python.org/mailman/listinfo/python-list
Memory error due to the huge/huge input file size
Hello Everyone,

I need to read a .csv file that is 2.26 GB in size, and I wrote a Python script to read it. My computer has 2 GB of RAM. Please see the code below:

    """
    This program has been developed to retrieve all the promoter sequences
    for the specified list of genes in the given cluster.
    So, this program will act as a substitute to the whole EZRetrieve system.

    Input arguments:
    1) Cluster.txt or DowRatClust161718bwithDummy.txt
    2) TransProCrossReferenceAndSequences.csv - This is the file that has
       all the promoter sequences
    3) -2000
    4) 500
    """

    import time
    import csv
    import sys
    import linecache
    import re
    from sets import Set
    import gc

    print time.localtime()

    fileInputHandler = open(sys.argv[1],"r")
    line = fileInputHandler.readline()

    refSeqIDsinTransPro = []
    promoterSequencesinTransPro = []
    reader2 = csv.reader(open(sys.argv[2],"rb"))
    reader2_list = []
    reader2_list.extend(reader2)

    for data2 in reader2_list:
        refSeqIDsinTransPro.append(data2[3])
    for data2 in reader2_list:
        promoterSequencesinTransPro.append(data2[4])

    while line:
        l = line.rstrip('\n')
        for j in range(1,len(refSeqIDsinTransPro)):
            found = re.search(l,refSeqIDsinTransPro[j])
            if found:
                promoterSequencesinTransPro[j]
                print l
        line = fileInputHandler.readline()

    fileInputHandler.close()

The error that I got is as follows:

    Traceback (most recent call last):
      File "RefSeqsToPromoterSequences.py", line 31, in <module>
        reader2_list.extend(reader2)
    MemoryError

I understand that this is a memory error and that it is caused by the line reader2_list.extend(reader2), which loads the whole file into a list. Is there an alternative way to read the .csv file line by line?

Sincerely,
Suprabhath
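
The question asks how to read the .csv file line by line. The fix that the reply refers to is not quoted in this thread, so the following is only a minimal sketch of one common approach, not necessarily what was suggested. It relies on csv.reader being an iterator, so rows can be processed one at a time instead of being copied into reader2_list. The function name iter_matching_promoters and its parameters are hypothetical. The sketch is written for Python 3, while the original script is Python 2, and it assumes the cluster file is small enough to keep in memory.

```python
import csv
import re


def iter_matching_promoters(cluster_path, csv_path, id_col=3, seq_col=4):
    """Yield (pattern, sequence) pairs while streaming the CSV row by row.

    Hypothetical helper: the column indexes 3 and 4 follow the original
    script (data2[3] and data2[4]).
    """
    # Assumption: the cluster file is small, so its lines fit in memory.
    # Blank lines are skipped because an empty pattern would match every row.
    with open(cluster_path, "r") as cluster_file:
        patterns = [line.strip() for line in cluster_file if line.strip()]
    compiled = [(p, re.compile(p)) for p in patterns]

    # Looping over csv.reader directly reads one row at a time, so the
    # 2.26 GB file is never held in a list.
    with open(csv_path, "r", newline="") as csv_file:
        reader = csv.reader(csv_file)
        next(reader, None)  # skip the first row; the original loop starts at index 1
        for row in reader:
            if len(row) <= max(id_col, seq_col):
                continue  # ignore short or malformed rows
            for pattern, regex in compiled:
                if regex.search(row[id_col]):
                    yield pattern, row[seq_col]
```

A caller could loop over iter_matching_promoters(sys.argv[1], sys.argv[2]) and print each pair. Note that the results come out in CSV row order rather than grouped by cluster line as in the original nested loops, because the large file is scanned only once.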