Hi Nick,

I agree with Andreas (thank you for coming in!), just a few additions below:

> - 1 year and a half experience with Java; it became my first choice in coding; currently I do all my tasks and homework in Java, also developing a bot for aichallenge [1] in Java as a university project. And a little personal project I'm working at, a memory test game, also written in Java.
> - 5 years of C/C++
> - web: HTML, PHP, CSS, MySQL - made a module for my school's website

Great, sound Java knowledge is something that would help you a lot on this project.

> Some thoughts and questions about the project
>
> - I took a look at your sources and saw you already have parsers for a lot of files like: FASTA, FASTQ, PDB, mmcif etc. What are the priorities for the new parsers, which is needed most ?

You are right, there are many parsers in BioJava, too many actually. We need only one parser per file format. Currently this is not the case: there are 2 or 3 FASTA parsers, for example. They are all subtly different, so the task would be to unify them so that one parser can be used in all cases.

> - Should we choose only one parser to work on for this project, or the expectations are to implement more than one ?

It depends on the parser and on your own abilities. However, if you can only make one FASTA parser in 3 months, then your application is unlikely to be competitive.

> Questions about the "Coding exercise"
>
> - About the "ambiguous characters", lets say we have ambiguous DNA. For these two sequences: "ACTATATCGG" and "ATGKMCGW" we should have in one FASTA output file the sequence "ACTATATCGG" and in another one "ATGKMCGW" ?

Correct.

> - What do you mean by large, “be capable of reading large files”, because afterwards under “Submission” it says “the test data file named data.fasta up to 10Kb in size” ? Should I understand that 10Kb is the limit for a “large file” ?

For this exercise, assume that a large file is one that does not fit into the computer's RAM.
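
As an illustration of the two points above (splitting records by ambiguity, and not holding the whole file in memory), here is a minimal sketch in Java. It is not BioJava code: the class name FastaSplitter and its methods are hypothetical, it treats any character other than A, C, G or T as ambiguous, and it assumes that each single record fits in memory even when the whole file does not.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

/** Hypothetical helper for illustration only; not part of BioJava. */
class FastaSplitter {

    /** Returns true if every character is one of the four unambiguous DNA bases. */
    static boolean isUnambiguousDna(CharSequence seq) {
        for (int i = 0; i < seq.length(); i++) {
            switch (Character.toUpperCase(seq.charAt(i))) {
                case 'A': case 'C': case 'G': case 'T':
                    break;
                default:
                    return false; // e.g. K, M, W, N and other ambiguity codes
            }
        }
        return true;
    }

    /**
     * Streams FASTA records from 'in', buffering one record at a time.
     * Unambiguous records go to 'plain', all others to 'ambiguous'.
     * Returns {number of unambiguous records, number of ambiguous records}.
     */
    static int[] split(Reader in, Writer plain, Writer ambiguous) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int[] counts = new int[2];
        String header = null;
        StringBuilder seq = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.startsWith(">")) {
                // A new header closes the previous record, if there was one.
                writeRecord(header, seq, plain, ambiguous, counts);
                header = line;
                seq.setLength(0);
            } else if (header != null) {
                // Sequence lines of the current record; lines before the
                // first header are skipped.
                seq.append(line.trim());
            }
        }
        writeRecord(header, seq, plain, ambiguous, counts); // last record
        plain.flush();
        ambiguous.flush();
        return counts;
    }

    private static void writeRecord(String header, StringBuilder seq, Writer plain,
                                    Writer ambiguous, int[] counts) throws IOException {
        if (header == null) {
            return; // nothing has been read yet
        }
        boolean clean = isUnambiguousDna(seq);
        Writer out = clean ? plain : ambiguous;
        out.write(header);
        out.write('\n');
        out.write(seq.toString());
        out.write('\n');
        counts[clean ? 0 : 1]++;
    }
}
```

Because only one record is buffered at a time, memory use is bounded by the longest sequence rather than by the file size. Passing in readers and writers obtained from java.nio.file.Files.newBufferedReader and Files.newBufferedWriter would turn this into a file-to-file tool.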
With a Java programme, you can substitute the amount of memory available to the JVM for the computer's RAM. So let's say that your parser should be able to work with a 512 MB file with the JVM setting -Xmx256M. And yes, you do not have to email this file to me.

I hope that helps. Good luck with your application.

Regards,
Peter
_______________________________________________
Biojava-l mailing list
http://lists.open-bio.org/mailman/listinfo/biojava-l
