It depends on what you mean by "quickly". It's quite difficult to 
efficiently insert bytes into a file anywhere except at the end, because 
everything after the insertion point has to be rewritten... and of course 
that kind of efficiency is nice to have when dealing with multi-gigabyte 
files. I doubt any (non-commercial?) proteomics tool goes to the lengths 
of doing protein accession id manipulation optimally:
http://www.codeproject.com/KB/files/enhancedfs.aspx
http://stackoverflow.com/questions/724998/efficient-in-line-search-and-replace-for-large-file

DecoyFASTA's -no_reverse mode writes a new file with the original 
sequences and the ids adjusted; that's probably the best you're going to 
get. Since it writes a full copy, make sure you've got at least 2 gigs of 
disk free. :)
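
If you'd rather script it yourself, the same "stream to a new file" idea 
can be sketched in a few lines of Python. This is only a rough 
illustration, not what DecoyFASTA actually does internally: the function 
name prefix_fasta_ids and the "DECOY_" prefix are made up for the example, 
and it assumes every header line starts with ">" in the first column.

```python
def prefix_fasta_ids(src_path, dst_path, prefix="DECOY_"):
    """Copy a FASTA file, inserting `prefix` right after '>' on each header.

    Hypothetical helper for illustration only. The input is read one line
    at a time, so memory use stays small even for a multi-gigabyte file.
    Returns the number of header lines that were rewritten.
    """
    prefix_bytes = prefix.encode("ascii")
    renamed = 0
    # Binary mode leaves sequence lines and line endings byte-for-byte intact.
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:
            if line.startswith(b">"):
                # Header line: '>' + prefix + the original id and description.
                dst.write(b">" + prefix_bytes + line[1:])
                renamed += 1
            else:
                # Sequence line: copy unchanged.
                dst.write(line)
    return renamed
```

It makes one sequential pass over the input, so it should run at roughly 
disk speed, but like DecoyFASTA it needs room for a second copy of the 
database.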

-Matt


Brian Pratt wrote:
> Have a look at the decoyfasta tool that ships with TPP.
>
> On Fri, Sep 18, 2009 at 10:21 AM, rhodea wrote:
>
>
>     Dear friends,
>
>     I have a large protein database (2G) in fasta format. I want to use it
>     as decoy database during protein identification and append it to a
>     target database. So I need to modify the ID name in this fasta file.
>     How should I do it quickly in batch? Is there any software that can
>     satisfy this aim?
>
>     Sincerely,
>
