Okay - when I ran the working set of spectra with the database that failed, 
the run seems to have failed; when I ran the set of spectra that failed with a 
database that worked, it ran to completion. I think we can probably narrow 
the problem down to something in the database. 
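
The reasoning behind this swap test can be written down as a small decision function. This is only an illustrative sketch: the function name and the outcome labels are made up for this example and are not part of the TPP or Philosopher.

```python
def localize_fault(good_spectra_bad_db_ok: bool, bad_spectra_good_db_ok: bool) -> str:
    """Interpret a two-way swap test.

    good_spectra_bad_db_ok: did known-good spectra finish with the suspect database?
    bad_spectra_good_db_ok: did the suspect spectra finish with a known-good database?
    """
    if not good_spectra_bad_db_ok and bad_spectra_good_db_ok:
        # The failure follows the database, whichever spectra are used.
        return "database"
    if good_spectra_bad_db_ok and not bad_spectra_good_db_ok:
        # The failure follows the spectra, whichever database is used.
        return "spectra"
    if good_spectra_bad_db_ok and bad_spectra_good_db_ok:
        # Each half works with a known-good partner; only the original pairing fails.
        return "interaction"
    # Neither swapped run finished: both halves are suspect, or the cause lies elsewhere.
    return "both-or-other"


# Outcome reported in this message: working spectra + failing database failed,
# failing spectra + working database ran to completion.
print(localize_fault(good_spectra_bad_db_ok=False, bad_spectra_good_db_ok=True))  # database
```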

On Friday, October 23, 2020 at 1:56:18 AM UTC-4 Emily Kawaler wrote:

> While those tests are still running, I pulled out all 185 of the proteins 
> that are in the 10OV pepXMLs but not in 01-09OV, figuring that maybe one of 
> those is causing the error. I've uploaded that to the same folder 
> everything else is in (it's called 10OV_uniq.fasta) - I don't see anything 
> that jumps out immediately. (There are no individual characters unique to 
> either the headers or the sequences in 10OV, so I don't think there's an 
> individual character messing things up.)
>
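
The character check described in the message above (looking for characters that occur only in the headers or only in the sequences of the suspect FASTA) could be done roughly as follows. This is a sketch, not the script that was actually used; the function names are hypothetical, and the file names in the usage note are placeholders apart from 10OV_uniq.fasta.

```python
from typing import Iterable, Set, Tuple


def fasta_charsets(lines: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (characters seen in header lines, characters seen in sequence lines)."""
    header_chars: Set[str] = set()
    seq_chars: Set[str] = set()
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        if line.startswith(">"):
            header_chars.update(line[1:])  # skip the leading '>'
        else:
            seq_chars.update(line)
    return header_chars, seq_chars


def chars_only_in(suspect: Iterable[str], reference: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Characters present in the suspect FASTA but absent from the reference FASTA."""
    s_head, s_seq = fasta_charsets(suspect)
    r_head, r_seq = fasta_charsets(reference)
    return s_head - r_head, s_seq - r_seq


# Tiny in-memory example; real use would pass open file handles instead.
suspect = [">sp|P1 test*", "MKU"]
reference = [">sp|P2 test", "MKA"]
print(chars_only_in(suspect, reference))
```

With real files this would be called as `chars_only_in(open("10OV_uniq.fasta"), open("reference.fasta"))`, where `reference.fasta` stands in for a database that is known to work. Two empty sets would match the observation above that no single character is unique to the 10OV entries.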
> On Thursday, October 22, 2020 at 3:49:18 PM UTC-4 David Shteynberg wrote:
>
>> I just re-extracted that file and I don't see the issue anymore.  Perhaps 
>> this was a decompression issue.
>>
>> Thanks for checking.
>>
>> -David
>>
>> On Thu, Oct 22, 2020 at 12:19 PM Emily Kawaler <[email protected]> wrote:
>>
>>> Hello,
>>> Thanks so much for taking a look! I think the selenocysteines ("U") are 
>>> likely not the problem, since I've got those in all of my databases, 
>>> including the ones that run correctly. I'm looking at 
>>> 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML and I don't see 
>>> anything odd in line 171821 ("</modification_info>"), so I think our line 
>>> numberings might not match up - what does your problematic line contain?
>>>
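
To confirm that selenocysteine-containing entries are present in the working databases as well as the failing one, the entries can be listed per database. A minimal sketch, assuming plain FASTA input; `entries_with_residue` is a name invented for this example.

```python
from typing import Iterable, List


def entries_with_residue(lines: Iterable[str], residue: str = "U") -> List[str]:
    """Return the headers of FASTA entries whose sequence contains `residue`."""
    hits: List[str] = []
    header = None
    found = False
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # Close out the previous entry before starting a new one.
            if header is not None and found:
                hits.append(header)
            header, found = line[1:], False
        elif header is not None and residue in line.upper():
            found = True
    # Do not forget the final entry in the file.
    if header is not None and found:
        hits.append(header)
    return hits


example = [">a", "MKU", "AAA", ">b", "MKA", ">c", "uAA"]
print(entries_with_residue(example))  # ['a', 'c']
```

Running this on each database and comparing the counts shows whether 'U' residues are specific to the database that gets stuck.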
>>> When I try to run it on my end, it always sticks somewhere in the 
>>> 10CPTAC_OV files. Right now I'm running a working set of spectra with a 
>>> database that didn't work and vice versa, so hopefully that'll help me pin 
>>> down whether it's a problem with my spectra or my database - will let you 
>>> know how that turns out!
>>>
>>> Emily
>>>
>>> On Thursday, October 22, 2020 at 3:09:29 PM UTC-4 David Shteynberg wrote:
>>>
>>>> Hi Emily,
>>>>
>>>> I analyzed the search results that you sent and I am seeing some 
>>>> strange things in at least one of the files you gave me.  This may be 
>>>> causing some of the problems you saw.
>>>> In file 03CPTAC_OVprospective_W_PNNL_20161212_B1S3_f13.pepXML on line 
>>>> 171821 there are some strange characters (possibly binary) that are 
>>>> tripping up the TPP.  I think these might be caused by a bug in an 
>>>> analysis tool upstream of the TPP.  Not sure if there are other mistakes 
>>>> of this sort.  Also I found some 'U' amino acids in the database, which 
>>>> the TPP complains have a mass of 0.
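
A quick way to check a pepXML file for stray binary content like this, independently of the TPP, is to scan it in binary mode and report the line numbers that contain control bytes or do not decode cleanly. This is a sketch under two assumptions: the file is meant to be UTF-8, and `find_suspicious_lines` is a hypothetical helper, not a TPP tool. XML 1.0 allows only tab, line feed and carriage return among the control characters below 0x20.

```python
import io
from typing import BinaryIO, Iterator, Tuple

# Control bytes that XML 1.0 permits: tab, line feed, carriage return.
ALLOWED_CONTROL = {0x09, 0x0A, 0x0D}


def find_suspicious_lines(stream: BinaryIO) -> Iterator[Tuple[int, bytes]]:
    """Yield (1-based line number, first 80 bytes) for lines that look corrupted."""
    for lineno, raw in enumerate(stream, start=1):
        has_control = any(b < 0x20 and b not in ALLOWED_CONTROL for b in raw)
        try:
            raw.decode("utf-8")  # assumption: the file is UTF-8 encoded
            bad_encoding = False
        except UnicodeDecodeError:
            bad_encoding = True
        if has_control or bad_encoding:
            yield lineno, raw[:80]


# In-memory demonstration; lines 3 and 4 carry a NUL byte and an invalid UTF-8 byte.
sample = io.BytesIO(b"<a>\n</modification_info>\n<b>\x00\x01</b>\n<c>\xff</c>\n")
for lineno, preview in find_suspicious_lines(sample):
    print(lineno, preview)
```

For a real file, pass `open(path, "rb")` as the stream. Running the scan on both the compressed-then-extracted copy and the original would also show whether the corruption comes from decompression.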
>>>>
>>>> I hope this helps you somewhat.  Let me know what you find on your end.
>>>>
>>>> Cheers,
>>>> -David
>>>>
>>>> On Tue, Oct 20, 2020 at 1:42 PM Emily Kawaler <[email protected]> 
>>>> wrote:
>>>>
>>>>> Sure! The spectra are from the CPTAC2 ovarian prospective dataset, 
>>>>> though I removed all scans that matched to a standard reference database 
>>>>> (I don't think the scan removal is the issue, since I'm also having this 
>>>>> problem on a different dataset without removing any scans; I also checked 
>>>>> with xmllint and it looks like the mzML and pepXML files are valid). I've 
>>>>> been running it with the philosopher pipeline, so the pepXML files were 
>>>>> generated with MSFragger as part of that pipeline. The database is a 
>>>>> customized variant database with contaminants and decoys added by 
>>>>> philosopher's database tool. Are there any other specifics you'd like? I 
>>>>> can upload my full philosopher.yml file if that would be helpful.
>>>>>
>>>>> On Tuesday, October 20, 2020 at 1:30:44 AM UTC-4 David Shteynberg 
>>>>> wrote:
>>>>>
>>>>>> Hi Emily,
>>>>>>
>>>>>> I got the data and now I am trying to understand how you are running 
>>>>>> the analysis.  Can you please describe those steps?
>>>>>>
>>>>>> Thank you,
>>>>>> -David
>>>>>>
>>>>>> On Sat, Oct 17, 2020 at 12:54 PM Emily Kawaler <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>> I've uploaded the pepXML files, the parameters I used, and the 
>>>>>>> database here. 
>>>>>>> <https://drive.google.com/drive/folders/1gJoi9fqsmIYg_0tl_2Ur-n04MJyuotyc?usp=sharing>
>>>>>>> Please let me know if I should be uploading anything else! Thank you!
>>>>>>>
>>>>>>> On Saturday, October 17, 2020 at 12:04:21 AM UTC-4 Emily Kawaler 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you! I'm working on getting it transferred to Drive, so it 
>>>>>>>> might take a little while, but I'll be in touch!
>>>>>>>>
>>>>>>>> On Tuesday, October 13, 2020 at 3:08:44 PM UTC-4 David Shteynberg 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hello Emily,
>>>>>>>>>
>>>>>>>>> If you are able to share the dataset including the pepXML file and 
>>>>>>>>> the database I can try to replicate the issue here and try to 
>>>>>>>>> troubleshoot the sticking point.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -David
>>>>>>>>>
>>>>>>>>> On Tue, Oct 13, 2020 at 11:15 AM Emily Kawaler <[email protected]> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hello, and thank you for your response! It doesn't look like the 
>>>>>>>>>> process is using too much memory (I've allocated 300 GB and it's 
>>>>>>>>>> maxing out around 10), and I've kicked up the minprob parameter - 
>>>>>>>>>> it's still getting stuck, unfortunately. 
>>>>>>>>>> Emily
>>>>>>>>>>
>>>>>>>>>> On Friday, October 9, 2020 at 2:24:37 PM UTC-4 Luis wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello Emily,
>>>>>>>>>>>
>>>>>>>>>>> This is not a problem that we have seen much of.  Do you know 
>>>>>>>>>>> which version of ProteinProphet / TPP you are using?
>>>>>>>>>>>
>>>>>>>>>>> One potential issue is the large number of proteins (and 
>>>>>>>>>>> peptides) that it is trying to process -- can you either monitor 
>>>>>>>>>>> the memory usage of the machine when you run this dataset, and/or 
>>>>>>>>>>> try on one with more memory?
>>>>>>>>>>>
>>>>>>>>>>> Hope this helps,
>>>>>>>>>>> --Luis
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Oct 6, 2020 at 6:32 PM Emily Kawaler <[email protected]> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello! I've been running ProteinProphet as part of the 
>>>>>>>>>>>> Philosopher pipeline for a while now with no problems. However, 
>>>>>>>>>>>> one of my datasets seems to be getting stuck in the middle of this 
>>>>>>>>>>>> function. It doesn't throw an error or anything - just stops 
>>>>>>>>>>>> advancing (the last line of the output is "Computing degenerate 
>>>>>>>>>>>> peptides for 69919 proteins: 0%...10%...20%...30%...40%...50%"). 
>>>>>>>>>>>> Has anyone run into this problem before?
>>>>>>>>>>>>
>>>>>>>>>>>> -- 
>>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>>> Google Groups "spctools-discuss" group.
>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from 
>>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>>>> https://groups.google.com/d/msgid/spctools-discuss/be33a8fb-a6ec-41b6-a988-981161f194fcn%40googlegroups.com
>>>>>>>>>>>>  
>>>>>>>>>>>> <https://groups.google.com/d/msgid/spctools-discuss/be33a8fb-a6ec-41b6-a988-981161f194fcn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>>> .
>>>>>>>>>>>>

