----------------------------------------
> Date: Sat, 3 Oct 2009 12:54:51 +0100
> From: [email protected]
> To: [email protected]
> Subject: Re: [BiO BB] gff to sequence
>
> You can do this easily in Perl... Here is some 'pseudo code' to
> (roughly) do it...
>
>
> ## Get a hash of sequences, keys = IDs, values = sequence strings;
> my %sequences;
> ...
>
> # open the GFF file ...
>
> while(my $gff = ){
> my @gffcols = split(/\t/, $gff);
>
> print substr($sequence{$gffcols[0]}, $gffcols[3], $gffcols[4] -
> $gffcols[3]), "\n";
> ...
> }
>
>
> Or something roughly similar to the above ;-)
>
> Dan.
>
>
> 2009/10/3 Kie Kyon Huang :
>> Hi,
>>
>> Is there a way to quickly extract out the coordinates from a gff file
>> and the corresponding sequence from a fasta file?
>>

I guess it depends on what you mean by quick. If you mean quick to write,
you could use awk, but then it depends on what else you want to do with
the results.

I ended up writing a C++ FASTA utility program, since Perl can be slow at
times, but I then had to grab a couple of regex libraries so that I could
grep names etc.

_______________________________________________
BBB mailing list
[email protected]
http://www.bioinformatics.org/mailman/listinfo/bbb
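
For anyone following the thread, here is a minimal sketch of the approach discussed above: load the FASTA sequences into a lookup table keyed by ID, then walk the GFF file and cut out each feature. It is written in Python rather than the Perl, awk, or C++ mentioned in the thread, and the function names `read_fasta` and `extract_features` are hypothetical, not from any library. It assumes tab-separated GFF with the sequence ID in column 1 and start/end in columns 4 and 5, and that the whole FASTA file fits in memory. GFF coordinates are 1-based and inclusive, so the slice starts at `start - 1` and ends at `end`. Strand is ignored, so minus-strand features are not reverse-complemented.

```python
import io


def read_fasta(handle):
    """Return a dict mapping sequence ID -> sequence string."""
    chunks = {}
    seq_id = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token after '>' as the ID.
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            chunks[seq_id] = []
        elif seq_id is not None:
            chunks[seq_id].append(line)
    return {key: "".join(value) for key, value in chunks.items()}


def extract_features(gff_handle, sequences):
    """Yield (seqid, start, end, subsequence) for each usable GFF line."""
    for line in gff_handle:
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Skip malformed lines and features on unknown sequences.
        if len(cols) < 5 or cols[0] not in sequences:
            continue
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        # GFF is 1-based inclusive; Python slices are 0-based half-open.
        yield seqid, start, end, sequences[seqid][start - 1:end]


if __name__ == "__main__":
    # Tiny in-memory example; real use would pass open() file handles.
    fasta = io.StringIO(">chr1 demo\nACGTACGTAC\nGGTTAACC\n")
    gff = io.StringIO("chr1\tdemo\tgene\t3\t8\t.\t+\t.\tID=g1\n")
    seqs = read_fasta(fasta)
    for seqid, start, end, subseq in extract_features(gff, seqs):
        print(f"{seqid}:{start}-{end}\t{subseq}")
```

Keeping the sequences in a dict makes each lookup cheap, which matters more than the choice of language when the GFF file has many features. For genomes too large to hold in memory, an indexed FASTA reader would be the usual alternative.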
