It is supposed to read only on demand. Are you sure it isn't? As long as you don't keep references to the individual sequences, they should be destroyed by the garbage collector. If there is a real memory leak, something must be keeping references to them, but this is not the intended behaviour. This would be a serious bug. A while back there was a problem with change listeners not getting disposed of. I thought this was resolved, but possibly it was not.
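To make "read on demand" concrete, here is a minimal sketch in plain Java. It is not BioJava code and does not use SeqIOTools; the names FastaRecord and LazyFastaReader are hypothetical, made up for illustration. It only shows the general idea: each call to next() parses one record from the reader, so a record becomes eligible for garbage collection as soon as the caller drops its reference to it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a parsed sequence; NOT a BioJava class. */
final class FastaRecord {
    final String name;
    final String residues;

    FastaRecord(String name, String residues) {
        this.name = name;
        this.residues = residues;
    }
}

/**
 * Hypothetical on-demand FASTA iterator. Between calls it holds only the
 * header line of the next record, never the records already returned.
 */
final class LazyFastaReader implements Iterator<FastaRecord> {
    private final BufferedReader in;
    private String pendingHeader; // header of the next record, null at end of input

    LazyFastaReader(BufferedReader in) {
        this.in = in;
        this.pendingHeader = readUntilHeader();
    }

    /** Skips any leading lines until the first '>' header, or returns null at EOF. */
    private String readUntilHeader() {
        try {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    return line;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return pendingHeader != null;
    }

    @Override
    public FastaRecord next() {
        if (pendingHeader == null) {
            throw new NoSuchElementException();
        }
        String name = pendingHeader.substring(1).trim();
        StringBuilder residues = new StringBuilder();
        pendingHeader = null;
        try {
            String line;
            // Read residue lines until the next header or end of input.
            while ((line = in.readLine()) != null) {
                if (line.startsWith(">")) {
                    pendingHeader = line;
                    break;
                }
                residues.append(line.trim());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new FastaRecord(name, residues.toString());
    }

    public static void main(String[] args) {
        BufferedReader br = new BufferedReader(
                new StringReader(">seq1\nMKV\nLLA\n>seq2\nGG\n"));
        LazyFastaReader reader = new LazyFastaReader(br);
        int count = 0;
        while (reader.hasNext()) {
            // The record from the previous iteration is unreachable from here on.
            FastaRecord rec = reader.next();
            count++;
            System.out.println(rec.name + " " + rec.residues.length());
        }
        System.out.println("number of sequences " + count);
    }
}
```

With an iterator of this shape, memory use should stay roughly flat however large the file is, provided neither the iterator nor the client code holds on to records that have already been returned. If memory grows anyway, something is keeping references.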
Would need an example to track this down.

- Mark

"Richard HOLLAND" <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
07/04/2005 01:33 PM

To: <biojava-l@biojava.org>
cc: Gem Yang <[EMAIL PROTECTED]>, (bcc: Mark Schreiber/GP/Novartis)
Subject: RE: [Biojava-l] memory leak while reading nr.fasta

This is one big problem, and I've come across it before. SeqIOTools.fileToBiojava reads the whole file in at once and stores everything in memory as Sequence objects in a virtual sequence database. For a file the size of nr, this is simply impossible on most machines, and causes out-of-memory exceptions.

What is required for files this size is a SeqIOTools parser that reads sequence objects _on demand_ as requested by the iterator, rather than reading the whole lot at once. This way it can drop sequence objects once they have been passed over by the iterator, freeing up memory for subsequent ones (assuming the client app keeps no references to them either).

Whether this fits in with BioJava's "everything is a sequence database" philosophy I don't know, as essentially it breaks it by defining a file to be a sequential-access sequence database, rather than a random-access one.

Can someone clarify whether a lazy-loading parser/database implementation already exists for situations like this, or does one need to be written?

cheers,
Richard

Richard Holland
Bioinformatics Specialist
GIS extension 8199

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Gem Yang
> Sent: Friday, July 01, 2005 2:30 AM
> To: biojava-l@biojava.org
> Subject: [Biojava-l] memory leak while reading nr.fasta
>
>
> Hi,
>
> I am new to Biojava.
> I have the following program, which is copied from ReadFaster2 in the
> cookbook.
>
> public static void main(String[] args) {
>     try {
>         // args[0] is nr.fasta
>         BufferedReader br = new BufferedReader(new FileReader(args[0]));
>
>         String format = "FASTA";
>         String alphabet = "PROTEIN";
>
>         SequenceIterator iter =
>             (SequenceIterator)SeqIOTools.fileToBiojava(format, alphabet, br);
>
>         int count = 0;
>         long start = System.currentTimeMillis();
>         while (iter.hasNext())
>         {
>             Sequence s = iter.nextSequence();
>             String name = s.getName();
>
>             //System.out.println(name);
>             s.getAnnotation();
>             //System.out.println(s.seqString());
>             count++;
>             System.out.println(count);
>
>         }
>         long end = System.currentTimeMillis();
>         System.out.println("number of sequence " + count);
>         System.out.println("time used" + (end-start)/1000 + "seconds");
>         System.out.println((end-start)/1000/60 + "minutes");
>     }
>     catch (FileNotFoundException ex) {
>         //can't find file specified by args[0]
>         ex.printStackTrace();
>     } catch (BioException ex) {
>         //error parsing requested format
>         ex.printStackTrace();
>     }
> }
>
> When running this code, I got an out-of-memory error after about
> half an hour, with 1.5 GB of memory allocated. My workstation is a
> Windows XP machine with 2 GB of memory. My biojava version is 1.3.
> My JRE is the one that came with Websphere application developer.
>
> Thanks.
> Gem
> _______________________________________________
> Biojava-l mailing list - Biojava-l@biojava.org
> http://biojava.org/mailman/listinfo/biojava-l