Hadoop Users,

I just wanted to announce that my Hadoop application 'CloudBurst' is available as open source at: http://cloudburst-bio.sourceforge.net

In a nutshell, it is an application for mapping millions of short DNA sequences to a reference genome, for example to map out differences between one individual's genome and the reference genome. As you might imagine, this is a very data-intensive problem, but Hadoop enables the application to scale up linearly to large clusters.

A full description of the program is available in the journal Bioinformatics: http://bioinformatics.oxfordjournals.org/cgi/content/abstract/btp236

I also wanted to take this opportunity to thank everyone on this mailing list. The discussions posted here were essential for navigating the ins and outs of Hadoop during the development of CloudBurst.

Thanks, everyone!

Michael Schatz
http://www.cbcb.umd.edu/~mschatz
