Throwing my opinion into the ring on this, I've got to agree with Russ here. I would think that SCF is a more sensible format for this kind of procedure, but there is the added bonus that the SCF parser does not encode delta-delta values, which the SCF specification is completely dependent on.

SCF does have the advantage that nothing "really" assumes anything about the files, so you can fiddle about with the chromatogram, and so long as the things you create in the output Chromatogram are normalised with respect to the cuts, everything should be hunky dory.
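As a rough illustration of one reading of "normalised with respect to the cuts": if a window of samples is cut out of a trace, the base-call positions that fall inside the window need to be re-expressed relative to the new start. The sketch below uses plain arrays and the made-up names TraceCut, cutChannel and rebasePeaks; none of these are BioJava types or methods, and it assumes base-call positions are stored as sample indices.

```java
import java.util.Arrays;

// Hypothetical helper, not part of BioJava.
final class TraceCut {

    /** Copies the samples in [from, to) out of a single trace channel. */
    static int[] cutChannel(int[] channel, int from, int to) {
        return Arrays.copyOfRange(channel, from, to);
    }

    /**
     * Keeps only the base-call positions that fall inside [from, to)
     * and shifts them so that the cut window starts at sample 0.
     */
    static int[] rebasePeaks(int[] peakOffsets, int from, int to) {
        return Arrays.stream(peakOffsets)
                .filter(p -> p >= from && p < to)
                .map(p -> p - from)
                .toArray();
    }
}
```

The called bases (and any quality values) belonging to the dropped positions would have to be removed in the same way, so that all the per-base arrays stay the same length.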

If you're doing this for space concerns, can I suggest passing the SCF files through a compression filter? You get the best results with the BZIP2 compression algorithm (the format was developed for bzip compression), but GZIP works really well and is the choice of compression format here at the Sanger Centre.
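If you want to do the GZIP step from Java rather than on the command line, something along these lines should work. It only uses java.util.zip from the standard library (which has GZIP but no BZIP2 support); the class name ScfGzip and its method names are made up for this sketch, and it treats the SCF data as opaque bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: treats the SCF file as opaque bytes.
final class ScfGzip {

    /** Returns the GZIP-compressed form of the given bytes. */
    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read buf only afterwards.
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    /** Reverses gzip(): returns the original bytes. */
    static byte[] gunzip(byte[] packed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            return gz.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compresses one file on disk, e.g. a trace file to the same name plus ".gz". */
    static void gzipFile(Path in, Path out) throws IOException {
        Files.write(out, gzip(Files.readAllBytes(in)));
    }
}
```

Reading the compressed file back is then a matter of wrapping the input stream in a GZIPInputStream before handing it to whatever parser you use.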

Hope that helps,

Andy Yates
~~~~~~~~~~~~~~~
Senior Computer Biologist,
Cancer Genome Project.

Wellcome Trust Sanger Institute,
Hinxton, Cambridge

Russ Kepler wrote:
On Wednesday 01 February 2006 11:41 pm, Heather Kent wrote:
I would like to write a small application that would concatenate abi or scf chromatograms and write out a new chromatogram file. Has anyone done something similar to this or seen any code that would be helpful for me? I am new at programming and have been looking through the Biojava API.

I'm familiar with the ABI trace code and what you want to do would not be difficult, but the result may not work the way that you want it to. A basecaller will likely be fooled in the transition between the traces and miscall or call no peaks for some time unless you match the local frequencies of each trace around the transition, and tagging the start of one run to the end of the other is a pretty good way to not do that.

If you're not going to run things through a basecaller, all you really need to do is catenate the trace and basecall arrays and the sequences. These are all exposed through the getter methods. If the data is coming from a newish AB instrument you may want to add code to handle the Q values from the KB caller and catenate those arrays as well.
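A minimal sketch of that catenation, using plain arrays in place of whatever the reader's getters return. The Trace holder and the TraceConcat class are made up for illustration and are not BioJava types. The sketch also assumes the basecall array holds sample positions into the trace, in which case the second file's positions have to be shifted by the length of the first trace.

```java
import java.util.Arrays;

// Hypothetical helper, not part of BioJava.
final class TraceConcat {

    /** Plain-data stand-in for one chromatogram. */
    static final class Trace {
        final int[][] channels;   // one sample array per channel, e.g. A, C, G, T
        final int[] peakOffsets;  // sample index of each called base (assumed)
        final String bases;       // called sequence, one character per peak

        Trace(int[][] channels, int[] peakOffsets, String bases) {
            this.channels = channels;
            this.peakOffsets = peakOffsets;
            this.bases = bases;
        }

        int samples() {
            return channels[0].length;
        }
    }

    /** Returns a new array holding a followed by b. */
    static int[] join(int[] a, int[] b) {
        int[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    /** Appends second to first, shifting second's peak positions. */
    static Trace concat(Trace first, Trace second) {
        int shift = first.samples();
        int[][] channels = new int[first.channels.length][];
        for (int c = 0; c < channels.length; c++) {
            channels[c] = join(first.channels[c], second.channels[c]);
        }
        int[] shifted = new int[second.peakOffsets.length];
        for (int i = 0; i < shifted.length; i++) {
            shifted[i] = second.peakOffsets[i] + shift;
        }
        return new Trace(channels, join(first.peakOffsets, shifted),
                first.bases + second.bases);
    }
}
```

Per-base quality arrays, if present, could be joined with the same join() helper, since they need no shifting.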

Writing the new file would be a new capability, but the existing reader should show you the way to do it.
_______________________________________________
Biojava-l mailing list  -  Biojava-l@biojava.org
http://biojava.org/mailman/listinfo/biojava-l