Hi Everyone,

I went through the URLs sent by Dr Chapman. Interesting work that you are doing here. :)
I was wondering if anyone could consider me for the items below. I would also like to be part of the research work being carried out using BioJava (especially in sequence alignment and miRNA signature analysis, especially for cancers).

1) A set of tools for converting flat data (e.g. sequence strings, taxonomy strings) into BioJava-like objects (e.g. SymbolLists, NCBITaxon). These BioJava-like objects could then be used for more advanced applications. A set of tools for manipulating the BioJava-like objects.

2) Module?: biojava-ws-blast
   Module?: biojava-ws-biolit
   Proposed Module: biojava-j2ee
   Lead: Mark Schreiber - This would probably take the form of SessionBeans and WebServices that can be deployed to Glassfish/JBoss etc. to provide biological services for people who want to make client-server or SOA apps.

3) I also liked what Mr. Gang Wu is working on (I read the discussions). I was wondering if I could do something of that sort...

May I request the leads to tell me how I could chip in?

Regards,
Jitesh Dundas

On 4/16/10, Mark Chapman <[email protected]> wrote:
> A great place to start finding ideas is the wiki.
> Both http://biojava.org/wiki/BioJava:Modules
> and http://biojava.org/wiki/BioJava3_Proposal
> list the next steps planned/desired for BioJava.
>
> What research area did you have in mind?
>
> Have fun,
> Mark
>
>
> On 4/16/2010 8:57 AM, jitesh dundas wrote:
>> Dear Sir,
>>
>> I am very interested in contributing to this project.
>>
>> I am looking for a good problem, more on the research side. I can also
>> help in coding (I also work as a software
>> engineer: j2ee/eclipse/jboss/tomcat).
>>
>> Anything that I could work on...
>>
>> Regards,
>> Jitesh Dundas
>>
>> On 4/8/10, Andreas Dräger<[email protected]> wrote:
>>> Hi all,
>>>
>>> This e-mail is just for your information about somebody new, who'd like
>>> to contribute to our project.
>>>
>>> Cheers
>>> Andreas
>>>
>>>
>>> Subject: Re: Fwd: Proposing a project on "Biojava alignment lead"
>>> From: Andreas Dräger<[email protected]>
>>> Date: Wed, 07 Apr 2010 09:27:13 +0200
>>> To: Cai Shaojiang<[email protected]>
>>>
>>> Hi Cai Shaojiang,
>>>
>>> Thank you for your e-mail! I don't know what happened to the e-mail list.
>>> Sometimes it takes a while due to the spam filters, I guess.
>>>
>>> > I am a PhD student from the National University of Singapore. My major
>>> > research area is local alignment algorithms and data structures for SNP
>>> > identification. And I have used Java and Eclipse for years for software
>>> > development. I am very interested in your GSoC programme. I find that
>>> > there is a module called "biojava-alignment lead" whose mentor is you. I
>>> > want to propose a new project on this module. I have several questions
>>> > about this module.
>>>
>>> Yes, that's me. So great to get your support.
>>>
>>> > 1. It seems that pairwise alignment is to find similarity between two
>>> > short sequences. Existing pairwise alignment is based on dynamic
>>> > programming, is it the Smith-Waterman algorithm?
>>>
>>> So, currently, BioJava contains three different alignment approaches.
>>> There are two deterministic algorithms, i.e., Smith-Waterman for local
>>> alignment and Needleman-Wunsch for global alignment. Third, there is the
>>> possibility to apply Hidden Markov Models for alignment. An example of
>>> the latter approach should be in the cookbook.
>>>
>>> > 2. What is the exact task of "refactoring of underlying data
>>> > structures"?
>>>
>>> Yes, this is something I already did last week, but it could still be
>>> improved. The problem was that the alignment algorithms actually
>>> produced a kind of string that looks similar to the output of BLAST.
>>> This string contained the score, the computation time, the length of the
>>> alignment etc.
>>> The problem was that people wanted to perform
>>> higher-level computation on the score value or evaluate some other
>>> information. Now, the alignment will produce a data structure that
>>> contains all the information and can, in addition to that, also produce
>>> such a BLAST-like output. There is, however, still the following
>>> problem: The data structure requires both sequences in the pairwise
>>> alignment to have an identical length. In the case of local alignment this
>>> is especially stupid (actually), because gaps are inserted to fill the
>>> sequences. And then the data structure tries to keep the old sequence
>>> coordinates, leading to the effect that the numbers "query start",
>>> "query end", "subject start", and "subject end" are required to shift
>>> the sequences against each other when displaying the output. So, you
>>> cannot easily print the sequences below each other; you first have to
>>> shift them. Please check out the latest version of this package via
>>> anonymous svn and have a look ;-)
>>>
>>> > 3. My existing research area is aiming to deal with aligning short
>>> > reads (10s~100s bp) against extremely long sequences (e.g., human
>>> > genome). As far as I know, there are no such alignment tools
>>> > implemented in Java. Would you consider this direction?
>>>
>>> See, this would be very nice to include. But this requires that we no
>>> longer fill the short sequence with many, many gap symbols (just a waste
>>> of memory), but improve the data structure. There is already an
>>> UnequalLengthAlignment (just a data structure, no algorithm) and I think
>>> we could use this as a starting point. Then your algorithm should only
>>> produce such a data structure and this would be fine.
>>>
>>> > 4. It seems that the existing tools are just lacking some
>>> > refactoring and representation interfaces. Any more underlying tasks?
>>>
>>> Hm. Yes: With the release of BioJava 3, data structures have changed
>>> again.
>>> So maybe there's also some adaptation to the new structure
>>> required.
>>>
>>> > I have been keeping an eye on GSoC since last month, but I am sorry to
>>> > find that I sent the initial email to the mailing list before I
>>> > subscribed to it...
>>>
>>> Ok. Sounds good. Thanks for your interest. So I suggest: Download the
>>> latest trunk, have a look, play around and if you can improve something
>>> we'll put it into the trunk and write your name into the authors' tag.
>>>
>>> Cheers
>>> Andreas
>>>
>>> --
>>> Dipl.-Bioinform. Andreas Dräger
>>> Eberhard Karls University Tübingen
>>> Center for Bioinformatics (ZBIT)
>>> Sand 1
>>> 72076 Tübingen
>>> Germany
>>>
>>> Phone: +49-7071-29-70436
>>> Fax: +49-7071-29-5091
>>> _______________________________________________
>>> Biojava-l mailing list - [email protected]
>>> http://lists.open-bio.org/mailman/listinfo/biojava-l
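
To illustrate the answer to question 2 in the quoted message, here is a minimal Java sketch of a Smith-Waterman local alignment that returns a small result object (score, 1-based start/end coordinates, gapped strings) instead of only a BLAST-like string. It assumes plain Strings and a simple linear gap penalty; the names LocalAlignmentSketch and Result are hypothetical and are not BioJava's API or its actual implementation.

```java
/** Sketch only: Smith-Waterman with a linear gap penalty, returning structured data. */
final class LocalAlignmentSketch {

    /** Hypothetical result holder (not a BioJava class). Coordinates are 1-based, inclusive. */
    static final class Result {
        final int score;
        final int queryStart, queryEnd;
        final int subjectStart, subjectEnd;
        final String alignedQuery, alignedSubject;

        Result(int score, int queryStart, int queryEnd, int subjectStart, int subjectEnd,
               String alignedQuery, String alignedSubject) {
            this.score = score;
            this.queryStart = queryStart;
            this.queryEnd = queryEnd;
            this.subjectStart = subjectStart;
            this.subjectEnd = subjectEnd;
            this.alignedQuery = alignedQuery;
            this.alignedSubject = alignedSubject;
        }

        /** BLAST-like text view, derived from the fields for display only. */
        @Override
        public String toString() {
            return "Score = " + score + "\n"
                 + "Query  " + queryStart + "  " + alignedQuery + "  " + queryEnd + "\n"
                 + "Sbjct  " + subjectStart + "  " + alignedSubject + "  " + subjectEnd;
        }
    }

    /** match should be positive; mismatch and gap should be negative. */
    static Result align(String query, String subject, int match, int mismatch, int gap) {
        int n = query.length(), m = subject.length();
        int[][] h = new int[n + 1][m + 1];   // first row and column stay 0
        int best = 0, bestI = 0, bestJ = 0;

        // Fill the scoring matrix; local alignment never lets a cell drop below 0.
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int sub = query.charAt(i - 1) == subject.charAt(j - 1) ? match : mismatch;
                int v = Math.max(0, Math.max(h[i - 1][j - 1] + sub,
                                    Math.max(h[i - 1][j] + gap, h[i][j - 1] + gap)));
                h[i][j] = v;
                if (v > best) { best = v; bestI = i; bestJ = j; }
            }
        }

        // Trace back from the best cell until a zero cell is reached.
        StringBuilder q = new StringBuilder(), s = new StringBuilder();
        int i = bestI, j = bestJ;
        while (i > 0 && j > 0 && h[i][j] > 0) {
            int sub = query.charAt(i - 1) == subject.charAt(j - 1) ? match : mismatch;
            if (h[i][j] == h[i - 1][j - 1] + sub) {
                q.append(query.charAt(i - 1)); s.append(subject.charAt(j - 1)); i--; j--;
            } else if (h[i][j] == h[i - 1][j] + gap) {
                q.append(query.charAt(i - 1)); s.append('-'); i--;   // gap in subject
            } else {
                q.append('-'); s.append(subject.charAt(j - 1)); j--; // gap in query
            }
        }
        // If nothing aligned (score 0), start is one greater than end.
        return new Result(best, i + 1, bestI, j + 1, bestJ,
                          q.reverse().toString(), s.reverse().toString());
    }

    public static void main(String[] args) {
        Result r = align("ACGT", "TTACGTTT", 2, -1, -2);
        System.out.println(r);        // text view for humans
        System.out.println(r.score);  // numeric value for further computation
    }
}
```

Because only the aligned region and its coordinates are stored, the shorter sequence does not have to be padded with gap symbols to the length of the longer one, and the score stays available as a number for higher-level computation.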
