I'm also working on a similar problem: analyzing combined spectral files from multiple runs. I think your user may have had the same thought we did, namely that combining mgf files will change the statistics of peptide identification.
One solution we have considered is to analyze each mgf file individually first, then pick out only the good spectra and write them to a combined mgf file of greatly reduced size. The assumption, which also matches what we observe in practice, is that most of the acquired spectra cannot be matched to a good peptide identification even if you include them in the mgf file.

cheers,
Tiannan Guo

On Sat, Sep 4, 2010 at 4:02 AM, Matthew Chambers <[email protected]> wrote:

> Forwarded to spctools-discuss since Simon isn't on spctools-dev.
>
> -------- Original Message --------
> Subject: RE: [spctools-dev] Re: Request for help in running X!tandem with large input mgf file of size over 3.5GB .....
> Date: Thu, 2 Sep 2010 12:40:22 +1000
> From: Simon Michnowicz <[email protected]>
> To: Matt Chambers <[email protected]>
>
> Matt
>
> Actually, my earlier question was prompted by another line of thought!
>
> The data was generated by a Thermo LTQ - linear ion trap. This was run in ESI mode with an upfront 90 min reversed-phase gradient for each of the 26 fractions.
>
> Our user is adamant that he needs to merge the files into one big file, but would not comment on the reasons why.
>
> After increasing the size of our swap file, and putting try{}catch() in the main to catch memory errors, X!Tandem just sits there not printing anything for several days. (This led our user to think the program had crashed when he tried it on his PC.)
>
> I looked at the stack and think it is processing the enormous amount of data, but am not familiar enough with the code to comment further...
>
> Does anybody know how the job running time of X!Tandem scales with the size of the input file?
>
> Regards
>
> Simon Michnowicz
> Duty Programmer
> Australian Proteomics Computation Facility
> Ludwig Institute For Cancer Research
> Royal Melbourne Hospital,
> Victoria
> Tel: (+61 3) 9341 3155
> Fax: (+61 3) 9341 3104
>
> #0 mscore_tandem::dot (this=0x1, _v=0x2aaaabb429e0) at mscore_tandem.cpp:330
> #1 0x0000000000436b5f in mscore::score (this=0xf4eeefc0, _i=<value optimized out>) at mscore.cpp:1711
> #2 0x0000000000419d17 in mprocess::create_score (this=0xeaa2c6b0, _s=..., _v=349, _w=364, _m=0, _p=true) at mprocess.cpp:439
> #3 0x0000000000411008 in mprocess::score_single (this=0xeaa2c6b0, _s=...) at mprocess.cpp:2884
> #4 0x0000000000411c7c in mprocess::score (this=0xeaa2c6b0, _s=...) at mprocess.cpp:2696
> #5 0x0000000000411f80 in mprocess::score_each_sequence (this=0xeaa2c6b0) at mprocess.cpp:3050
> #6 0x0000000000418b8e in mprocess::process (this=0xeaa2c6b0) at mprocess.cpp:1616
> #7 0x000000000045c439 in ProcessThread (_p=0x1) at tandem.cpp:635
> #8 0x0000003cbc606307 in start_thread () from /lib64/libpthread.so.0
> #9 0x0000003cbbed1ded in clone () from /lib64/libc.so.6
>
> *From:* Matt Chambers [mailto:[email protected]]
> *Sent:* Thursday, 2 September 2010 11:01 AM
> *To:* Jagan Kommineni
> *Cc:* Robert Moritz ISB; [email protected]; Simon Michnowicz
> *Subject:* Re: [spctools-dev] Re: Request for help in running X!tandem with large input mgf file of size over 3.5GB .....
>
> Ah, that makes sense given that Simon asked me about merging mgfs the other day. :)
>
> Out of curiosity, why merge the runs in this way? I assume each input mgf is a separate acquisition.
>
> On Sep 1, 2010 7:10 PM, "Jagan Kommineni" <[email protected]<mailto:[email protected]>> wrote:
> > Dear Matt,
> >
> > It seems that the data is centroid (no profile data) and the 3.6GB file is generated from 26 individual files with sizes ranging from 29MB to 178MB. Here are the split file sizes.
> >
> > The total spectra count is 651,543.
> >
> > ---------------------------
> > 29M LMcQuade-1.mgf
> > 90M LMcQuade-2.mgf
> > 143M LMcQuade-3.mgf
> > 163M LMcQuade-4.mgf
> > 167M LMcQuade-5.mgf
> > 156M LMcQuade-6.mgf
> > 178M LMcQuade-7.mgf
> > 168M LMcQuade-8.mgf
> > 130M LMcQuade-9.mgf
> > 100M LMcQuade-10.mgf
> > 93M LMcQuade-11.mgf
> > 121M LMcQuade-12.mgf
> > 152M LMcQuade-13.mgf
> > 152M LMcQuade-14.mgf
> > 171M LMcQuade-15.mgf
> > 169M LMcQuade-16.mgf
> > 176M LMcQuade-17.mgf
> > 164M LMcQuade-18.mgf
> > 165M LMcQuade-19.mgf
> > 97M LMcQuade-20.mgf
> > 77M LMcQuade-21.mgf
> > 152M LMcQuade-22.mgf
> > 160M LMcQuade-23.mgf
> > 161M LMcQuade-24.mgf
> > 144M LMcQuade-25.mgf
> > 48M LMcQuade-26.mgf
> > -------------------------
> >
> > with regards,
> >
> > Dr. Jagan Kommineni
> > Ludwig Institute for Cancer Research
> > Parkville
>
> --
> You received this message because you are subscribed to the Google Groups "spctools-discuss" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to [email protected].
> For more options, visit this group at http://groups.google.com/group/spctools-discuss?hl=en.
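
The approach suggested at the top of this message (search each mgf file on its own, then merge only the spectra that gave good identifications) could be sketched roughly as follows. This is only an illustration in Python, not anything taken from X!Tandem or the TPP. The function names `iter_mgf_spectra`, `spectrum_title` and `write_filtered` are made up for the example, and it assumes that every spectrum carries a unique `TITLE=` line and that the set of "good" titles has already been pulled out of the per-file search results by some other means.

```python
import io


def iter_mgf_spectra(lines):
    """Yield each BEGIN IONS ... END IONS block as a list of lines."""
    block = None
    for line in lines:
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            block = [line]
        elif block is not None:
            block.append(line)
            if stripped == "END IONS":
                yield block
                block = None


def spectrum_title(block):
    """Return the value of the TITLE= line of a block, or None if absent."""
    for line in block:
        stripped = line.strip()
        if stripped.startswith("TITLE="):
            return stripped[len("TITLE="):]
    return None


def write_filtered(sources, good_titles, out):
    """Copy only spectra whose title is in good_titles; return how many were kept.

    sources     -- iterable of line iterables (e.g. open mgf file objects)
    good_titles -- set of titles judged "good" (assumed to come from the
                   individual search results; how to pick them is up to you)
    out         -- writable text file object for the reduced combined mgf
    """
    kept = 0
    for lines in sources:
        for block in iter_mgf_spectra(lines):
            if spectrum_title(block) in good_titles:
                for line in block:
                    out.write(line.rstrip("\r\n") + "\n")
                out.write("\n")  # blank line between spectra
                kept += 1
    return kept


if __name__ == "__main__":
    # Tiny in-memory stand-in for one mgf file (made-up values).
    demo = io.StringIO(
        "BEGIN IONS\nTITLE=scan1\nPEPMASS=500.1\n100.0 1.0\nEND IONS\n"
        "BEGIN IONS\nTITLE=scan2\nPEPMASS=600.2\n200.0 2.0\nEND IONS\n"
    )
    combined = io.StringIO()
    print(write_filtered([demo], {"scan2"}, combined), "spectrum kept")
```

With real data you would pass the 26 opened mgf files as `sources` and an output file opened for writing as `out`. Because each input is read one spectrum at a time, memory use stays small regardless of how large the individual files are.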
