Hi Mark!

The blast etc. is parallelized. The contigs are split into groups of 1000, and I have also modified my program so that it now works with all those separate files. Nevertheless, I also have a program that works on the concatenated blast output.

The parser with my customized handler always looks for the results of a certain contig, then compares these results to something else, does some other stuff in between to calculate some statistics, and then creates a new parser to get the results for the next contig. So a System.exit() is not an option, since it would stop my whole program (in which I am using the parser). I also don't want to start working with threads here.

I was just hoping that there would be a way to tell the handler that, when a certain condition is met, it should give the parser a signal to stop parsing (and maybe even to reset itself to the first line). But I guess there's no way to do it in the customized handler...
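One common idiom for the handler-driven stop described above is to throw an exception from inside the handler callback and catch it at the point where the parse was started. The sketch below is only a minimal illustration, assuming the parser delivers its events through a standard SAX ContentHandler (whether that holds for the customized handler in this setup is an assumption). It uses the JDK's own XML parser on a toy document; the class names, the element names `query` and `hit`, and the `id` attribute are hypothetical stand-ins, not real BioJava output.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class EarlyStopDemo {

    // Toy input standing in for a large concatenated result file (hypothetical format).
    static final String SAMPLE =
        "<blast>"
        + "<query id=\"contig1\"><hit id=\"geneA\"/><hit id=\"geneB\"/></query>"
        + "<query id=\"contig2\"><hit id=\"geneC\"/></query>"
        + "</blast>";

    /** Thrown by the handler to signal "I have what I need, stop parsing". */
    static class StopParsingException extends SAXException {
        StopParsingException() {
            super("target found, stopping early");
        }
    }

    /** Remembers the first hit of one wanted query, then aborts the parse. */
    static class FirstHitHandler extends DefaultHandler {
        private final String wantedQuery;
        private boolean inWantedQuery = false;
        String firstHit = null;   // stays null if the query is never found
        int elementsSeen = 0;     // counts start-element events actually delivered

        FirstHitHandler(String wantedQuery) {
            this.wantedQuery = wantedQuery;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            elementsSeen++;
            if (qName.equals("query")) {
                inWantedQuery = wantedQuery.equals(atts.getValue("id"));
            } else if (qName.equals("hit") && inWantedQuery) {
                firstHit = atts.getValue("id");
                // Condition met: unwind out of the parser instead of reading on.
                throw new StopParsingException();
            }
        }
    }

    /** Parses until the wanted query's first hit is seen, or to the end of input. */
    static FirstHitHandler parseUntilHit(String xml, String wantedQuery) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        FirstHitHandler handler = new FirstHitHandler(wantedQuery);
        reader.setContentHandler(handler);
        try {
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (StopParsingException expected) {
            // Normal early exit requested by the handler; nothing to report.
        }
        return handler;
    }

    public static void main(String[] args) throws Exception {
        FirstHitHandler h = parseUntilHit(SAMPLE, "contig1");
        System.out.println(h.firstHit + " found after " + h.elementsSeen + " elements");
    }
}
```

Because the exception unwinds only as far as the `catch` block, the surrounding program keeps running, unlike with System.exit(). Opening a fresh reader on the input for the next contig starts again from the first line, so no explicit reset should be needed.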
Thanks,
Marcel

[email protected] wrote:
>
> Hi -
>
> There are many ways to stop the parsing but it really depends on how you
> have set the program up. Notably there is no way for the Blast parsing
> system of BioJava to shut itself down but control probably shouldn't
> happen at that level.
>
> A crude but effective procedure is to write out the results when you
> find the hit of interest and then simply call System.exit()
>
> Another approach would be to spawn Tasks to parse each record and then
> have them signal to the main thread when they are complete to shut them
> down. If you are using Java 1.5 or earlier then you would need to do
> this with Threads. If you have a later version you can use the
> concurrent packages which are much nicer to deal with.
>
> One thing I don't understand is why you don't blast each contig
> separately, in that case the results would only contain your hit of
> interest. That means 90K separate blasts but there are versions of
> blast that run on clusters and the database (3 million genes) is not
> huge so it should be an embarrassingly parallel problem?
>
> - Mark
>
> [email protected] wrote on 03/10/2009 03:00:36 AM:
>
>> Hi Mark!
>>
>> Mark Schreiber wrote:
>> > You could just customize BlastEcho to pass on the events of interest,
>> > ignore those that are not interesting.
>> That's what I am doing right now. But I don't know how to tell my
>> customized BlastEcho to stop when a certain condition is met during a
>> particular event call. What's the command for stopping there?
>>
>> > It could also exit if a certain
>> > event occurs.
>> How?
>>
>> > Remember it cost almost nothing to read the file so you
>> > save time by only sending interesting events for parsing.
>> Hmm, I am not sure if it's really almost nothing, when I have about 90,000
>> contigs that were blasted against a database with about maybe 3,000,000
>> genes. The blast output that I am parsing is about 13 GB and in every
>> cycle I am looking for the results of one particular contig of these
>> 90,000 contigs. So I definitely experienced that the time adds up a lot
>> when it's running over the whole file in each of these 90,000 cycles,
>> although the contig I am looking for was already at the beginning of
>> the file.
>>
>>
>> Cheers,
>> Marcel
_______________________________________________
Biojava-l mailing list - [email protected]
http://lists.open-bio.org/mailman/listinfo/biojava-l
