Actually, it was a little more complicated. For the record:

Combine fasta databases

cat db_1.fasta db_2.fasta > uni50plantsplusdos.fasta
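A quick sanity check after concatenating can catch a missing trailing newline in one of the inputs, which would glue a header onto the last sequence line of the previous database. The following is only a sketch on two tiny made-up stand-in files; the helper name count_records is hypothetical, and it assumes every record starts with '>' in the first column.

```shell
#!/bin/sh
# Sketch: check that a concatenated FASTA file has as many records as its parts.
# The tiny input files below are made-up stand-ins for db_1.fasta and db_2.fasta.

# Count FASTA records, i.e. lines that start with '>'.
count_records() {
    grep -c '^>' "$1"
}

workdir=$(mktemp -d)
printf '>seq1\nMKV\n>seq2\nLLA\n' > "$workdir/db_1.fasta"
printf '>seq3\nGGS\n' > "$workdir/db_2.fasta"

cat "$workdir/db_1.fasta" "$workdir/db_2.fasta" > "$workdir/combined.fasta"

n1=$(count_records "$workdir/db_1.fasta")
n2=$(count_records "$workdir/db_2.fasta")
total=$(count_records "$workdir/combined.fasta")

if [ "$total" -eq $((n1 + n2)) ]; then
    echo "OK: $total records"
else
    echo "Mismatch: $n1 + $n2 != $total" >&2
fi
rm -rf "$workdir"
```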
Fix encoding errors

dos2unix -n uni50plantsplusdos.fasta uni50plantsplus.fasta
dos2unix: Binary symbol 0x00 found at line 13484276
dos2unix: Skipping binary file uni50plantsplusdos.fasta
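Before hunting for the reported line, it can help to know how many null bytes the file contains in total. A minimal sketch, assuming a POSIX tr and wc; the helper name count_nul_bytes and the demo file are made up for illustration:

```shell
#!/bin/sh
# Sketch: count NUL (0x00) bytes in a file without opening it in an editor.

# tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
count_nul_bytes() {
    LC_ALL=C tr -cd '\000' < "$1" | wc -c
}

workdir=$(mktemp -d)
# Made-up test record with three NUL bytes inside the header line.
printf '>seqA Uncharacterized pr\000\000\000otein\nMKVLA\n' > "$workdir/demo.fasta"

nul_count=$(count_nul_bytes "$workdir/demo.fasta")
# $((...)) strips the padding that some wc implementations add.
echo "NUL bytes found: $((nul_count))"
rm -rf "$workdir"
```

Because tr streams the input, this should also be usable on a multi-gigabyte database.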

Don’t try to open a large FASTA database in a standard editor!

Instead use sed, the *stream editor*:

sed -n '13484266,13484286p' uni50plantsplusdos.fasta
HGRQGVLQALRQGQSNPDFAYAAVVDIQGKAANESSAPGIIIPNQPIPNDPLAWLGERKI
DSTSDGRNFLEFHAPVFDEGDIRGYVRLGYFQPEAGLQYEDLPFFAMFTLPVFLLTPLFY
FLLRREIRPLRQMNENLEDLIEGGVEKRVELHPSGELGDFIQSFNKLIDSAQNRIQTLES 
EQSGMLTSGKLLSYRHARIESILKALPDAILVIDEGGCVNYANDKTAGLLGKTQESIIGK 
KPQEWCKDPGLITYLSGYGASGGQVGYISDSIRIAPQHDPEKLLEVKAYPLFSPKDGSHL 
LGNMVVIRDCTEEQLANQNRGEFIAQVSHELKTPLNVLAMYSEALLGEDGNSESFRIEGL 
NIIHDEVDRLSTLINNMLAISRFELGGIQMNRQRVRIGELLEDAYNNITQSGRDRDLEYE 
IDLPREMNALNVDKELLRIAVNNLLTNAIKYNKSNGTVTLTAQEFDDAIEISVSDTGVGI 
SPDDQQKIFDKFYRADDDKVREQTGHGLGSSLVQQIVHFHHGKLSVESERKKGSTFTIRL
EKDMATRLQAGAV
>UniRef50_A0A349N9R6 Uncharacterized prssiosira oceanica TaxID=159749 RepID=K0T835_THAOC
MMNFFPARPARSRDRPAEVELYRTIASRATDRPTDRPTESLNHVAAVSTEDIIANRAAPP 
PGPSPPPGGGAVRIRDIQFPAMSAACKTAAESLVNNSNNSLTVSWDMNEEKWQYGLLAAM 
EGEVRCEPLWMPLRFPELRLQDRLIEFSITAIISLVASREARAPHSKGVHHTTRRGMKLN 
EGWEGGRTEVYPAYWCSAMATSPVATRVSPDAAGKENDAIAKSTHRSGNAVQQPHSLAIN 
LLRRAKSGQLSMFSYSNSEQSPAIDQECFDIALRLILDEDNELHIGSDDSLPSYCKNALP 
EARGSITRVQFSGYQFTSFPEVTFGRTYFNLAHLDIRQNSSLTCIDSIISQLPQLTSLNL 
TNCPNLSEFDLASAYSYIHLLIRWFAFAGTVAPLGRNRKGRRNLRLQNLWIRGCNLSTMR 
SEEWGRVFDNLAESTGPLEMLTLSGNRLACLHENVVKCKSLVHLFIEDNGQMTTSSPLVL 
PENLGDLSQLTSLSLCGNNLRRLPRTIGRLDDQCGLHLQRNSDLAHPPPRYLQSIQTIRD 
FYHEERMKLVRGMILFVPHFNRARIRANGRLYEPGGSGYFECKTRFEEVASERGPINSFV
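The window above is the reported line 13484276 plus and minus 10 lines. The same idea can be wrapped in a small helper. This is only a sketch; show_context is a hypothetical name, and the added 'q' command is my addition so that sed stops reading once the window has been printed instead of scanning the rest of a large file.

```shell
#!/bin/sh
# Sketch: print a window of lines around a reported line number.
# show_context is a hypothetical helper, not part of sed or dos2unix.

show_context() {
    file=$1
    line=$2
    radius=${3:-10}
    start=$((line - radius))
    [ "$start" -lt 1 ] && start=1
    end=$((line + radius))
    # -n suppresses the default output; 'p' prints the range; 'q' on the
    # last line of the range stops sed from reading the rest of the file.
    sed -n "${start},${end}p;${end}q" "$file"
}

workdir=$(mktemp -d)
seq 1 100 > "$workdir/lines.txt"     # stand-in for a large FASTA file
window=$(show_context "$workdir/lines.txt" 50 2)
echo "$window"
rm -rf "$workdir"
```

With the real file, the call would be show_context uni50plantsplusdos.fasta 13484276 10.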

That looks normal. To track down the problem, extract the same lines into a small test file:

sed -n '13484266,13484286p' uni50plantsplusdos.fasta > test.fasta
dos2unix -n test.fasta testux.fasta
dos2unix: Binary symbol 0x00 found at line 11
dos2unix: Skipping binary file test.fasta

Opening test.fasta in a standard editor (Notepad++) shows:
line 11: >UniRef50_A0A349N9R6 Uncharacterized prNULNULNUL…ssiosira oceanica TaxID=159749 RepID=K0T835_THAOC
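The null bytes can also be made visible on the command line, without a graphical editor. A minimal sketch using od, which prints each NUL as \0; the demo file is made up, and its bad header is on line 1 rather than line 11:

```shell
#!/bin/sh
# Sketch: make NUL bytes on one line visible with od instead of an editor.

workdir=$(mktemp -d)
# Made-up two-line file; the header carries three NUL bytes.
printf '>seqA pr\000\000\000otein\nMKVLA\n' > "$workdir/test.fasta"

# Print only line 1 and dump it byte by byte; od -c shows each NUL as \0.
dump=$(sed -n '1p' "$workdir/test.fasta" | od -c)
echo "$dump"
rm -rf "$workdir"
```

For the test.fasta from this thread, the equivalent would be sed -n '11p' test.fasta | od -c.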

Fix that with sed:
First try it on test.fasta, which contains the problem:
sed -i 's/UniRef50_A0A349N9R6 Uncharacterized.*/UniRef50_A0A349N9R6 Uncharacterized/' test.fasta
dos2unix -n test.fasta testux.fasta
dos2unix: converting file test.fasta to file testux.fasta in Unix format...


In the normal editor:
>UniRef50_A0A349N9R6 Uncharacterized
Looks good! Now, fix the real FASTA file:
sed -i 's/UniRef50_A0A349N9R6 Uncharacterized.*/UniRef50_A0A349N9R6 Uncharacterized/' uni50plantsplusdos.fasta

Try again:
dos2unix -n uni50plantsplusdos.fasta uni50plantsplus.fasta

The same error, but on a different line…
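At this point it may be worth listing every affected line at once rather than fixing them one by one. The sketch below is one possible way, not what was actually run here: it maps each NUL to the byte 0x01 and lets grep search for that marker. It assumes 0x01 does not otherwise occur in the file; lines_with_nul is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the numbers of all lines that contain NUL bytes.
# Assumption: the byte 0x01 does not otherwise occur in the file, so it can
# serve as a stand-in marker that grep can search for as ordinary text.

lines_with_nul() {
    marker=$(printf '\001')
    LC_ALL=C tr '\000' '\001' < "$1" | LC_ALL=C grep -n "$marker" | cut -d: -f1
}

workdir=$(mktemp -d)
# Made-up file with NUL bytes on lines 1 and 3.
printf '>seqA pr\000otein\nMKVLA\n>seqB ki\000\000nase\nGGSLL\n' > "$workdir/demo.fasta"

bad_lines=$(lines_with_nul "$workdir/demo.fasta")
echo "$bad_lines"
rm -rf "$workdir"
```

If the list is long, a global replacement is clearly the more practical route.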

We could repeat the above procedure n times, but it is better to try something 
more aggressive and remove all null characters at once:
sed -i 's/\x0//g' uni50plantsplusdos.fasta

Try again:
dos2unix -n uni50plantsplusdos.fasta uni50plantsplus.fasta
dos2unix: converting file uni50plantsplusdos.fasta to file uni50plantsplus.fasta in Unix format...
No error message.

Conclusion

sed -i 's/\x0//g' uni50plantsplusdos.fasta
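A possible variant that leaves the original file untouched is to strip the null bytes into a new file with tr and then verify the result. This is a sketch on a made-up demo file, not the command used above. Note that removing the bytes does not restore whatever header text they may have overwritten.

```shell
#!/bin/sh
# Sketch: strip NUL bytes into a new file instead of editing in place,
# then verify that none are left. File names are made up for the demo.

workdir=$(mktemp -d)
# Demo input with Windows line endings and three NUL bytes in the header.
printf '>seqA pr\000\000\000otein\r\nMKVLA\r\n' > "$workdir/in.fasta"

# tr -d '\000' deletes every NUL byte and leaves all other bytes untouched.
LC_ALL=C tr -d '\000' < "$workdir/in.fasta" > "$workdir/clean.fasta"

# Count the NUL bytes that remain; 0 means dos2unix should no longer
# report a 0x00 binary symbol for this file.
remaining=$(LC_ALL=C tr -cd '\000' < "$workdir/clean.fasta" | wc -c)
echo "NUL bytes remaining: $((remaining))"

# Show the cleaned header (carriage return removed only for display).
first_line=$(head -n 1 "$workdir/clean.fasta" | tr -d '\r')
echo "$first_line"
rm -rf "$workdir"
```

The cleaned file still has its Windows line endings, so dos2unix is run on it afterwards as before.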



On Wednesday, 21 November 2018 09:10:34 UTC+1, Robert wrote:
>
> Thanks David, you are right, this was a WinLin problem!
>
> On Tuesday, 20 November 2018 17:04:35 UTC+1, David Shteynberg wrote:
>>
>> Could this be windows line endings?  Have you tried to run dos2unix 
>> command on your fasta file?
>>
>> Thanks,
>> -David
>>
>> On Tue, Nov 20, 2018 at 7:41 AM Robert Winkler <[email protected]> 
>> wrote:
>>
>>> This is what I try to find out at the moment. head and tail look fine. 
>>> The DB consists of 3 individual DBs. 2 of them look fine. The UniRef50 
>>> could be the problem but I cannot check it manually. Information about the 
>>> "strange" output would help.
>>>
>>>
>>> *Robert Winkler*
>>>
>>> On Nov 20 2018, at 4:31 pm, Eric Deutsch <[email protected]> 
>>> wrote:
>>>
>>>
>>> Hi Robert, I think this warning comes when the FASTA database has 
>>> unexpected characters in it. Is there anything unusual about the FASTA 
>>> database you’re using? Unusual spaces or something?
>>>
>>>  
>>>
>>> Regards,
>>>
>>> Eric
>>>
>>>  
>>>
>>>  
>>>
>>> *From:* [email protected] *On Behalf Of* Robert
>>> *Sent:* Tuesday, November 20, 2018 5:50 AM
>>> *To:* spctools-discuss <[email protected]>
>>> *Subject:* [spctools-discuss] TPP 5.2 ProteinProphet WARNING: Trying to 
>>> compute mass of non-residue:
>>>  
>>>
>>> Hi, I am testing the docker version of TPP 5.2 on Windows 10.
>>> It works fine so far, but a warning appears in the ProteinProphet step (I 
>>> stopped the script after the warning message had been repeating for several minutes).
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> WARNING: Trying to compute mass of non-residue:
>>> Any idea how to track down the error (I suppose a strange symbol??)?
>>> Best, Robert
>>> --
>>> You received this message because you are subscribed to the Google 
>>> Groups "spctools-discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/spctools-discuss.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
