Regarding the format-guessing function: it was deprecated because it cannot be guaranteed to work. However, deprecation might be a bit extreme, especially if many people use it. I propose that we undeprecate it and simply document a warning that it may not work. Any objections?
- Mark

Kalle Näslund <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
11/29/2005 09:34 PM

To: Ola Spjuth <[EMAIL PROTECTED]>
cc: biojava-l@biojava.org, (bcc: Mark Schreiber/GP/Novartis)
Subject: Re: [Biojava-l] Multiple questions

Ola Spjuth wrote:

>Hi,
>
>I am investigating the usefulness of BioJava as a backend for sequence
>management in Bioclipse (www.bioclipse.net). As a total newbie to
>Biojava, I have read the tutorial, BIA examples, glanced at the API,
>read my first FASTA-sequence and have come up with a few questions:
>
>1) Is it possible to search the Biojava-l archives without having to
>manually browse by month?
>
>2) Is there a wrapper for SequenceIO.fileToBiojava(..) that
>automatically detects file formats or is it necessary to distinguish
>sequence formats externally, i.e. with different file-extensions? If so,
>does anyone know of a complete list of file-extensions that could be
>mapped to a format?
>

There is a deprecated piece of code available that quite a lot of people actually still use in their code. Even though auto-guessing the file format might not be the greatest idea, it is the desirable thing to do in many cases. If I just look at people in my lab, they want to open the file; they don't want to keep track of what file format that particular sequence was in, and so on.

So, even if file format guessing is bad, people are going to write it, and IMHO it's better to have one centralised, good, known-to-work file guesser than several different implementations that differ in each person's own application. My suggestion is to start by using the deprecated version that's in biojava. If it gets removed, you can easily copy that small part of the code into your own application, or ship it as a small external jar file.

>3) How robust are the I/O-classes for different formats? The
>test-library provided is rather short in my opinion and my first test
>broke since there was a space in the wrong position...
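
As an illustration of the kind of format guessing discussed under question 2, here is a minimal sketch, assuming that the first non-blank line is enough to tell a few common formats apart. It is not the deprecated BioJava code: the class FormatGuesser, the Format enum and both method names are made up for this example, and like any guesser it can be fooled by unusual files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical helper, not part of BioJava. */
class FormatGuesser {

    enum Format { FASTA, GENBANK, EMBL_LIKE, UNKNOWN }

    /** Classifies a single line that is expected to be the first non-blank line of a file. */
    static Format classify(String firstLine) {
        if (firstLine.startsWith(">")) {
            return Format.FASTA;      // FASTA records start with a '>' header line
        }
        if (firstLine.startsWith("LOCUS")) {
            return Format.GENBANK;    // GenBank flat files open with a LOCUS line
        }
        if (firstLine.startsWith("ID   ")) {
            return Format.EMBL_LIKE;  // EMBL and Swiss-Prot entries both open with an ID line
        }
        return Format.UNKNOWN;        // nothing recognised: do not pretend to know
    }

    /** Skips leading blank lines, then classifies the first line with content. */
    static Format guess(Reader in) throws IOException {
        BufferedReader br = new BufferedReader(in);
        String line;
        while ((line = br.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue;
            }
            return classify(line);
        }
        return Format.UNKNOWN;        // empty input
    }

    public static void main(String[] args) throws IOException {
        System.out.println(guess(new StringReader("\n>seq1 example\nACGT\n")));
        System.out.println(guess(new StringReader("some free text\n")));
    }
}
```

Something like this would sit in front of whatever reader you already use for each format. An UNKNOWN result is the cue to ask the user instead of guessing further, which is the warning the documentation would need to carry.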
>
>4) What are the capabilities for multiple sequence alignment in Biojava?
>Is it limited to parse results into Biojava objects (as in BIA) or does
>it contain any stable MSA-implementations? Due to BioJavas size it is
>not easy to get an overview of the current capabilities and the standard
>of different parts.
>

There is some support for multiple alignments in biojava. The Alignment interface and its implementations happily handle multiple alignments, and you can choose how to interpret one: either as a SymbolList over a cross-product alphabet, or as individual sequences accessible by some label.

There is a basic framework for handling multiple alignment formats in the biojava org.biojava.bio.seq.io package. It currently implements only two formats, FASTA and MSF. Most programs seem to be able to write multiple alignment output in either FASTA or MSF format, so you should be able to get the results into biojava.

>5) As a novice, has anyone implemented BLAST or CLUSTALW in Java? Any
>public web-services running for this?
>

I have been told by greater deities that implementing BLAST in Java is hard, because the BLAST algorithm makes heavy use of low-level data structures, pointers and similar things that are very hard to implement and control in Java. The resulting implementation would most likely run pretty darn slow, and not do what you want. Depending on what you want to do with BLAST, the biojava SSAHA implementation might be something you can use instead (it works pretty OK on quite conserved sequences, but it's not really suited for more divergent sequences).

When it comes to web services I only know of a few things. I have not used any of these to any large extent, so I can't comment on how well they work for large sequences, big jobs and so on.

http://www.ebi.ac.uk/Tools/webservices/services.html
http://xml.ddbj.nig.ac.jp/wsdl/index.jsp

Sadly they all use their own data encoding and service invocation setup, so it's pretty darn annoying to use them.
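
To make the two ways of reading an alignment mentioned under question 4 a bit more concrete, here is a toy sketch. It assumes nothing about the real BioJava Alignment interface: ToyAlignment and its methods are hypothetical, rows are plain gapped strings, and column indices are 0-based purely for simplicity. A column here plays the role of one symbol from the cross-product alphabet, and a row is one sequence fetched by its label.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical toy class, not BioJava's Alignment interface. */
class ToyAlignment {
    // Label -> gapped sequence, kept in insertion order so columns are stable.
    private final Map<String, String> rows = new LinkedHashMap<String, String>();
    private int length = -1;

    void addRow(String label, String gapped) {
        if (length >= 0 && gapped.length() != length) {
            throw new IllegalArgumentException("all rows must have the same length");
        }
        length = gapped.length();
        rows.put(label, gapped);
    }

    /** View 1: an individual gapped sequence, looked up by label. */
    String row(String label) {
        return rows.get(label);
    }

    /** View 2: one column, holding one character from every row (0-based index). */
    List<Character> column(int index) {
        List<Character> col = new ArrayList<Character>();
        for (String seq : rows.values()) {
            col.add(seq.charAt(index));
        }
        return col;
    }

    /** Number of columns, or 0 if no row has been added yet. */
    int length() {
        return Math.max(length, 0);
    }

    public static void main(String[] args) {
        ToyAlignment aln = new ToyAlignment();
        aln.addRow("seqA", "ACG-T");
        aln.addRow("seqB", "A-GTT");
        System.out.println(aln.row("seqB"));   // the sequence stored under the label seqB
        System.out.println(aln.column(1));     // the second column across both rows
    }
}
```

The same data supports both views. Which one is convenient depends on whether you are walking along the alignment column by column or pulling out a single sequence.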
>6) Is there some example-code on how to use DAS (as a client)?
>
>7) How can I submit an RFE?
>
>Sorry for so many questions in one post; I have a lot of catching up to
>do and was hoping for some guidance. Some answers have probably already
>been answered in earlier posts but I have not been able to search the
>archives.
>
>Cheers,
>
>  .../Ola
>
>_______________________________________________
>Biojava-l mailing list - Biojava-l@biojava.org
>http://biojava.org/mailman/listinfo/biojava-l