[EMAIL PROTECTED] wrote:
> Hi,
> Can anybody give a hint on how to use BioJava to extract a specific region (say, -800
> to +200 relative to the transcription start site) of a gene's genomic sequence from NCBI?
>
> I wrote a Java program to do this myself, but I am not sure whether my parsing and
> retrieval schemes are efficient and accurate.
>
> Thanks!
>
Morning,

If you have a GenBank file that covers this region (both the TSS and the -800 to +200 window relative to it), then you can use SeqIOTools.readGenbank to read the file, the filter() method on the sequence in combination with an instance of FeatureFilter (by location, by type, or whatever you need to pull out that TSS), and then new SubSequence(seq, tssLoc.getMin() - 800, tssLoc.getMax() + 200) to cut out that bit of sequence. You may need to check the strandedness of the TSS and flip the subsequence accordingly.

Matthew
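
The coordinate arithmetic and the strand flip mentioned above can be sketched in plain Java, independent of BioJava. This is only an illustration under stated assumptions: the class PromoterWindow and its methods are hypothetical names, the sequence is held as a plain String, coordinates are 1-based and inclusive, and the window follows the min - 800 / max + 200 convention from the reply. For a TSS on the reverse strand the offsets are swapped and the result is reverse-complemented. In a BioJava program, SubSequence and the feature's location would take the place of the String handling.

```java
// Hypothetical helper, not part of BioJava: shows the window arithmetic only.
final class PromoterWindow {

    /** Reverse-complements a DNA string (A<->T, C<->G); any other character becomes N. */
    static String reverseComplement(String dna) {
        StringBuilder sb = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            switch (Character.toUpperCase(dna.charAt(i))) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'C': sb.append('G'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('N');
            }
        }
        return sb.toString();
    }

    /**
     * Extracts the region from `upstream` bases before the TSS to `downstream`
     * bases after it, reported in the orientation of the gene.
     *
     * @param genome  the genomic sequence, forward strand
     * @param tss     1-based position of the transcription start site
     * @param forward true if the gene lies on the forward strand
     */
    static String extract(String genome, int tss, boolean forward,
                          int upstream, int downstream) {
        // On the reverse strand, "upstream" lies at higher forward-strand coordinates.
        int min = forward ? tss - upstream : tss - downstream;
        int max = forward ? tss + downstream : tss + upstream;

        // Clamp to the sequence ends so a TSS near an edge does not throw.
        min = Math.max(1, min);
        max = Math.min(genome.length(), max);

        // Convert 1-based inclusive coordinates to String.substring's 0-based half-open range.
        String window = genome.substring(min - 1, max);
        return forward ? window : reverseComplement(window);
    }
}
```

For the case in the question, the call would be PromoterWindow.extract(genome, tss, isForward, 800, 200). The clamping step is a design choice of this sketch; a stricter version could instead report an error when the file does not cover the full window.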
