Actually, it was a little more complicated. For the record, here is the protocol:

Combine fasta databases

    cat db_1.fasta db_2.fasta > uni50plantsplus.fasta

Fix encoding errors

    dos2unix -n uni50plantsplusdos.fasta uni50plantsplus.fasta
    dos2unix: Binary symbol 0x00 found at line 13484276
    dos2unix: Skipping binary file uni50plantsplusdos.fasta

Don’t try to open a large FASTA database in a standard editor! Instead use sed, the *stream editor*, to print only the lines around the reported position:

    sed -n '13484266,13484286p' uni50plantsplusdos.fasta
    HGRQGVLQALRQGQSNPDFAYAAVVDIQGKAANESSAPGIIIPNQPIPNDPLAWLGERKI
    DSTSDGRNFLEFHAPVFDEGDIRGYVRLGYFQPEAGLQYEDLPFFAMFTLPVFLLTPLFY
    FLLRREIRPLRQMNENLEDLIEGGVEKRVELHPSGELGDFIQSFNKLIDSAQNRIQTLES
    EQSGMLTSGKLLSYRHARIESILKALPDAILVIDEGGCVNYANDKTAGLLGKTQESIIGK
    KPQEWCKDPGLITYLSGYGASGGQVGYISDSIRIAPQHDPEKLLEVKAYPLFSPKDGSHL
    LGNMVVIRDCTEEQLANQNRGEFIAQVSHELKTPLNVLAMYSEALLGEDGNSESFRIEGL
    NIIHDEVDRLSTLINNMLAISRFELGGIQMNRQRVRIGELLEDAYNNITQSGRDRDLEYE
    IDLPREMNALNVDKELLRIAVNNLLTNAIKYNKSNGTVTLTAQEFDDAIEISVSDTGVGI
    SPDDQQKIFDKFYRADDDKVREQTGHGLGSSLVQQIVHFHHGKLSVESERKKGSTFTIRL
    EKDMATRLQAGAV
    >UniRef50_A0A349N9R6 Uncharacterized prssiosira oceanica TaxID=159749 RepID=K0T835_THAOC
    MMNFFPARPARSRDRPAEVELYRTIASRATDRPTDRPTESLNHVAAVSTEDIIANRAAPP
    PGPSPPPGGGAVRIRDIQFPAMSAACKTAAESLVNNSNNSLTVSWDMNEEKWQYGLLAAM
    EGEVRCEPLWMPLRFPELRLQDRLIEFSITAIISLVASREARAPHSKGVHHTTRRGMKLN
    EGWEGGRTEVYPAYWCSAMATSPVATRVSPDAAGKENDAIAKSTHRSGNAVQQPHSLAIN
    LLRRAKSGQLSMFSYSNSEQSPAIDQECFDIALRLILDEDNELHIGSDDSLPSYCKNALP
    EARGSITRVQFSGYQFTSFPEVTFGRTYFNLAHLDIRQNSSLTCIDSIISQLPQLTSLNL
    TNCPNLSEFDLASAYSYIHLLIRWFAFAGTVAPLGRNRKGRRNLRLQNLWIRGCNLSTMR
    SEEWGRVFDNLAESTGPLEMLTLSGNRLACLHENVVKCKSLVHLFIEDNGQMTTSSPLVL
    PENLGDLSQLTSLSLCGNNLRRLPRTIGRLDDQCGLHLQRNSDLAHPPPRYLQSIQTIRD
    FYHEERMKLVRGMILFVPHFNRARIRANGRLYEPGGSGYFECKTRFEEVASERGPINSFV

This looks normal in the terminal, so track down the problem. Write the same lines to a small test file and run dos2unix on it:

    sed -n '13484266,13484286p' uni50plantsplusdos.fasta > test.fasta
    dos2unix -n test.fasta testux.fasta
    dos2unix: Binary symbol 0x00 found at line 11
    dos2unix: Skipping binary file test.fasta

Opening test.fasta in a standard editor (Notepad++) shows null characters in line 11:

    >UniRef50_A0A349N9R6 Uncharacterized prNULNULNUL…ssiosira oceanica TaxID=159749 RepID=K0T835_THAOC

Fix that with sed by cutting the header off right after "Uncharacterized", which drops the null characters together with the rest of the description. First try it on test.fasta, which contains the problem:

    sed -i 's/UniRef50_A0A349N9R6 Uncharacterized.*/UniRef50_A0A349N9R6 Uncharacterized/' test.fasta
    dos2unix -n test.fasta testux.fasta
    dos2unix: converting file test.fasta to file testux.fasta in Unix format...

In the normal editor, the header now reads:

    >UniRef50_A0A349N9R6 Uncharacterized

Looks good! Now fix the real fasta file:

    sed -i 's/UniRef50_A0A349N9R6 Uncharacterized.*/UniRef50_A0A349N9R6 Uncharacterized/' uni50plantsplusdos.fasta

Try again:

    dos2unix -n uni50plantsplusdos.fasta uni50plantsplus.fasta

The same error, but in a different line… We could repeat the above procedure n times, but let’s rather try something more aggressive and remove all null characters:

    sed -i 's/\x0//g' uni50plantsplusdos.fasta

Try again:

    dos2unix -n uni50plantsplusdos.fasta uni50plantsplus.fasta
    dos2unix: converting file uni50plantsplusdos.fasta to file uni50plantsplus.fasta in Unix format...

No error message.

Conclusion

    sed -i 's/\x0//g' uni50plantsplusdos.fasta

On Wednesday, 21 November 2018 09:10:34 UTC+1, Robert wrote:
>
> Thanks David, you are right, this was a WinLin problem!
>
> On Tuesday, 20 November 2018 17:04:35 UTC+1, David Shteynberg wrote:
>>
>> Could this be windows line endings? Have you tried to run dos2unix
>> command on your fasta file?
>>
>> Thanks,
>> -David
>>
>> On Tue, Nov 20, 2018 at 7:41 AM Robert Winkler <[email protected]>
>> wrote:
>>
>>> This is what I try to find out at the moment. head and tail look fine.
>>> The DB consists of 3 individual DBs. 2 of them look fine. The UniRef50
>>> could be the problem but I cannot check it manually. Information about the
>>> "strange" output would help.
>>>
>>> *Robert Winkler*
>>>
>>> On Nov 20 2018, at 4:31 pm, Eric Deutsch <[email protected]>
>>> wrote:
>>>
>>> Hi Robert, I think this warning comes when the FASTA database has
>>> unexpected characters in it.
>>> Is there anything unusual about the FASTA
>>> database you’re using? Unusual spaces or something?
>>>
>>> Regards,
>>>
>>> Eric
>>>
>>> *From:* [email protected] <[email protected]>
>>> *On Behalf Of *Robert
>>> *Sent:* Tuesday, November 20, 2018 5:50 AM
>>> *To:* spctools-discuss <[email protected]>
>>> *Subject:* [spctools-discuss] TPP 5.2 ProteinProphet WARNING: Trying to
>>> compute mass of non-residue:
>>>
>>> Hi, I am testing the docker version of TPP 5.2 on Windows 10,
>>> works fine so far, but in the ProteinProphet step appears a warning (I
>>> stopped the script after several minutes of the warning message running).
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> Any idea how to track down the error (I suppose a strange symbol??)?
>>> Best, Robert
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "spctools-discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/spctools-discuss.
>>> For more options, visit https://groups.google.com/d/optout.
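
P.S. A possible shortcut for the null-character hunt in the protocol at the top of this message, as a rough sketch and not a tested recipe for every system: count the null bytes first to see whether a file is affected at all, then write a cleaned copy with tr instead of editing the database in place. The helper names count_nul and strip_nul and the demo record are made up for illustration; tr, wc, printf and mktemp are the only tools assumed.

```shell
#!/bin/sh
# Sketch only: count_nul and strip_nul are hypothetical helper names.

# Print the number of NUL (0x00) bytes in the file given as $1.
count_nul() {
    tr -cd '\000' < "$1" | wc -c | tr -d ' '
}

# Copy file $1 to file $2 with all NUL bytes removed; $1 is left untouched.
strip_nul() {
    tr -d '\000' < "$1" > "$2"
}

# Small self-contained demo: a made-up two-line FASTA record whose
# header contains three NUL bytes, similar to the broken UniRef50 entry.
demo_in=$(mktemp)
demo_out=$(mktemp)
printf '>demo_header pr\000\000\000otein\nMKVL\n' > "$demo_in"

echo "NUL bytes before: $(count_nul "$demo_in")"
strip_nul "$demo_in" "$demo_out"
echo "NUL bytes after: $(count_nul "$demo_out")"
cat "$demo_out"
```

On the real database this would correspond to calling count_nul on uni50plantsplusdos.fasta, then strip_nul with an output name of your choice, and finally running dos2unix on the cleaned copy as in the protocol.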
