> This seems of such general use that it begs a small utility which will
> take a (possibly indexed) fasta file, a gff and output the sequences you
> want. What would people want from such a programme?
> Is GTF (http://mblab.wustl.edu/GTF2.html) more useful or GFF?
> Would different elements from the same group (gene/transcript) be joined
> together in order?

I wrote a small system like this based on ASCII hit files. This means most of
your temp files can be processed with standard tools, and usually they don't
limit the speed, although with cygwin going through windoze this can add up.

> Would one want filtering on the "features" column so one could retrieve all
> splice sites or codon exons?
> What would be the output? Another fasta file? How would each "group" of
> Sequences (e.g. transcript) be labelled? By a user supplied regular expression?
>
>> I guess it depends what you mean by quick- quick to write you could use awk
>> but then it depends what additional things you want to do with results.
>> I ended up writing a C++ fasta utility program since PERL can slow down sometimes
>> but I ended up grabbing a couple of regex libraries to let me
>> grep names etc.
> I hoped you used boost:regex which will be in the next c++ standard

If you had to read my posts on their mail list you would change your attitude
and wish I had never heard of it :) Actually, as pointed out there, it isn't
clear how fast it is compared to greta (for all my complaints about msft, that
works well, but maddock is at boost in any case). Finally I wrote my own limited
compiler, but there seem to be boost expression compilers that may be useful
too. For editing fasta files, I doubt you care this much about regex speed,
however.

Note: hotmail is now unusable for TEXT, I am moving to [email protected]
or also use [email protected]. Thanks.
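
For what it's worth, here is a minimal sketch of the kind of utility being asked about above: read a fasta file and a gff, filter on the "features" column, join the elements of each group in coordinate order, and return one sequence per group. It is written in Python only for brevity and is not the C++ program mentioned above. The function names (read_fasta, extract_features) are made up for illustration. It assumes 9 tab-separated gff columns, 1-based inclusive coordinates, the whole 9th column used verbatim as the group label, and that all features in a group share a strand.

```python
from collections import defaultdict

# Complement table for reverse-strand features (unknown bases stay as N).
_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def read_fasta(lines):
    """Parse fasta lines into {name: sequence}; name is the first word after '>'."""
    chunks, name = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def revcomp(seq):
    """Reverse-complement a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_features(fasta_lines, gff_lines, feature="exon"):
    """Return {group: sequence} for one feature type, joined in coordinate order.

    Assumption (not from the thread): the 9th gff column is used as-is
    as the group label; real files may need attribute parsing instead.
    """
    genome = read_fasta(fasta_lines)
    groups = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature:
            continue  # filter on the "features" column
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        strand, group = cols[6], cols[8]
        groups[group].append((start, end, strand, seqid))

    result = {}
    for group, feats in groups.items():
        feats.sort()  # order by start coordinate
        # gff coordinates are 1-based and inclusive, hence start - 1
        joined = "".join(genome[seqid][start - 1:end]
                         for start, end, _strand, seqid in feats)
        if feats[0][2] == "-":
            joined = revcomp(joined)
        result[group] = joined
    return result


def to_fasta(records):
    """Format {label: sequence} as fasta text, one record per group."""
    return "".join(">%s\n%s\n" % (label, seq) for label, seq in records.items())
```

Calling to_fasta(extract_features(open("genome.fa"), open("annot.gff"), feature="exon")) would give another fasta file labelled by group (the file names are placeholders). Labelling by a user-supplied regular expression, or working from an indexed fasta instead of loading it whole, would be natural extensions but are not attempted here.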
Mike Marchywka
