Hi Andreas,

I've added tests for two packages: metastudent and libgo-perl. libgo-perl is used by metastudent and was the one producing the error (a patch in libgo-perl makes the metastudent run work). BTW, I noticed that autopkgtest-pkg-perl skips the module syntax check when a specific file is not present and debian/control contains a "Suggests:" line. I'll check librg-utils-perl in case it is useful there.
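To illustrate the kind of module syntax check mentioned above, here is a minimal sketch in Python. It assumes that "syntax check" simply means asking perl to load a module and exit. The helper names `perl_load_command` and `module_loads` are hypothetical, and this is not a copy of what autopkgtest-pkg-perl actually runs.

```python
import shutil
import subprocess


def perl_load_command(module):
    """Build a command that asks perl to load one module and then exit."""
    # -w enables warnings, -M<module> loads the module, -e 1 runs a no-op program.
    return ["perl", "-w", f"-M{module}", "-e", "1"]


def module_loads(module):
    """Return (ok, stderr); ok is True when perl loads the module without error."""
    if shutil.which("perl") is None:
        raise RuntimeError("perl is not installed")
    result = subprocess.run(
        perl_load_command(module), capture_output=True, text=True
    )
    return result.returncode == 0, result.stderr
```

For example, `module_loads("GO::IO::Dotty")` (the module named in the error output quoted below) would be expected to return `False` plus the compilation error on a system with the unpatched libgo-perl, assuming the module's own dependencies are installed.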
Now the predictprotein run fails when reprof is called. That's why I added tests for the binary package `reprof` (in addition to the autopkgtest-pkg-perl tests). I added them to debian/tests/installation-test, which calls reprof. For now it fails with the following message: "Constructor failed at /usr/share/perl5/RG/Reprof.pm line 225." I looked at that file, and the problem seems to be in the .model and .features files accompanying reprof (which are installed to the usr/share/reprof folder). That's why this test fails now and why this reprof update is not ready for upload.

2016-07-13 22:21 GMT+03:00 Andreas Tille <[email protected]>:
> Hi Tanya,
>
> On Wed, Jul 13, 2016 at 07:24:12PM +0300, merlettaia wrote:
> >
> > I found a problem in which this package is involved also.
> > Last weekend I started to work on predictprotein. The hardest problem was
> > to make it work.
> > https://wiki.debian.org/DebianMed/PredictProtein - at some point I found
> > this instruction, spent some time downloading database, and when I
> > downloaded and installed it, then run predictprotein, I've got multiple
> > error messages (output_with_errors.txt). It turned out that when one of the
> > perl scripts in librg-utils-perl calls blastpgp on that database,
> > blastpgp -F F -a 1 -j 3 -b 3000 -e 1 -h 1e-3 -d /data/src/rostlab-data/data/big/big_80 -i query.fasta -o query.blastPsiOutTmp -C query.chk -Q query.blastPsiMat
> >
> > - blastpgp ends up with "Killed" message, and produces incorrect output
> > file (query.blastPsiOutTmp is incomplete). Script in librg-utils-perl is
> > correct, call in predictprotein is correct. Blastpgp fails with error.
> >
> > I thought that incorrect database format could be the reason for it.
> > Because version of ncbi-blast+ (blastpgp belongs to this package) package
> > uses latest version of that database, and database from RostLab's website
> > probably isn't latest.
> > I downloaded from NCBI FTP (ftp://ftp.ncbi.nlm.nih.gov/blast/db/) one of
> > the databases, and tried to run predictprotein with that data. It worked!
> > But now I've got error while metastudent run (output in some_output.txt) -
> > I'm working to fix it now.
>
> Thanks for your very thorough investigation. I have put Laszlo in CC -
> may be he has some contact information or can help himself even if he
> is not active in Debian Med any more.
>
> > And there are two things I don't understand:
> >
> > Is there any package which contains copy of current version of blastp
> > database? Or small part of it. It seems that autopkgtest testsuite should
> > use smaller portion of blastp database.
>
> As far as I know there is no such package. IMHO it might be a good idea
> to ship something like a stripped down database since it could be used
> as test data input for several other packages. What do other think?
>
> > For now it seems unclear how to test predictprotein with autopkgtest, since
> > for correct run it requires also local copy of (possibly) huge database
> > (~30GB in copy from RostLab's website), probably ncbi-blast+/ncbi-tools6
> > should download and install it?
>
> For manual user tests this might be OK, but autopkgtest should be
> offline.
>
> > Predictprotein has special parameters for
> > different databases, and path to blast installation can be provided by
> > hand, that makes possible to call it with smaller database in testsuite
> > run.
>
> Sounds convincing.
>
> > But that will work only if blastpgp from ncbi-blast+ works correctly
> > with the same version of database. That means that better way to
> > install+test database usage from ncbi-blast+ tests, and use default
> > database installed with ncbi-blast+ (if it will be installed).
> >
> > Could you also check that database from here:
> > https://wiki.debian.org/DebianMed/PredictProtein - really doesn't work?
> > I have unstable internet connection and not sure if that file was not
> > corrupted.
>
> Any volunteer for this? My internet is currently also not the best.
>
> Kind regards
>
> Andreas.
> >
> > cache merging is off at /usr/bin/predictprotein line 230.
> > work_dir=/data/src/temp at /usr/bin/predictprotein line 336.
> > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/ PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17 PROFROOT=/usr/share/profphd/prof/ BIGBLASTDB=/data/src/rostlab-data/data/aa/pdbaa BIG80BLASTDB=/data/src/rostlab-data/data/aa/pdbaa PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all norsp at /usr/bin/predictprotein line 383.
> > make: Entering directory '/data/src/temp'
> > metastudent -i query.fasta -o query.metastudent --silent --debug
> > mkdir -p /tmp/metastudentulQjHj/methodC;cd /usr/lib/python2.7/dist-packages/metastudentPkg/lib/groupC;./CafaWrapper3.pl /tmp/metastudentulQjHj/query.fasta_eval1.0_iters3_srcgoasp.mfo.blast /tmp/metastudentulQjHj/methodC/output.MFO.txt 0 /tmp/metastudentulQjHj/methodC
> > !!!Error!!! mkdir -p /tmp/metastudentulQjHj/methodC;cd /usr/lib/python2.7/dist-packages/metastudentPkg/lib/groupC;./CafaWrapper3.pl /tmp/metastudentulQjHj/query.fasta_eval1.0_iters3_srcgoasp.mfo.blast /tmp/metastudentulQjHj/methodC/output.MFO.txt 0 /tmp/metastudentulQjHj/methodC
> > 65280
> > Can't use a hash as a reference at /usr/share/perl5/GO/IO/Dotty.pm line 104.
> > Compilation failed in require at ./treehandler.pl line 10.
> > BEGIN failed--compilation aborted at ./treehandler.pl line 10.
> > ./treehandler.pl -mfo transitiveClosure2014.txt -bpo transitiveClosure2014.txt -cco transitiveClosure2014.txt -method 3 -pred /tmp/metastudentulQjHj/methodC/blast.out -scoring 0 failed: 255 at ./CafaWrapper3.pl line 16.
> > Error occurred: IOError
> > Traceback (most recent call last):
> >   File "/usr/bin/metastudent", line 721, in <module>
> >     runIt(tempfile, inputFastaFilePath, outputFilePath, outputBlast, blastKickstartDatabasePaths, ontologies, blastOnly, keepTemp, allPreds, debug, noNames, withImages)
> >   File "/usr/bin/metastudent", line 187, in runIt
> >     predLinesDict["C"] = runMethodC(blastKickstartDatabasePath, fastaFilePathLocal, tmpDirPath, configMap["GROUP_C_SCORING_%s" % (ontology) ], ontology, configMap, debug)
> >   File "/usr/lib/python2.7/dist-packages/metastudentPkg/runMethods.py", line 206, in runMethodC
> >     with open(outputFilePath) as f:
> > IOError: [Errno 2] No such file or directory: '/tmp/metastudentulQjHj/methodC/output.MFO.txt'
> > /usr/share/predictprotein/MakefilePP.mk:403: recipe for target 'query.metastudent.BPO.txt' failed
> > make: *** [query.metastudent.BPO.txt] Error 1
> > make: Leaving directory '/data/src/temp'
> > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/ PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17 PROFROOT=/usr/share/profphd/prof/ BIGBLASTDB=/data/src/rostlab-data/data/aa/pdbaa BIG80BLASTDB=/data/src/rostlab-data/data/aa/pdbaa PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all norsp failed: 512 at /usr/bin/predictprotein line 392.
>
> > cache merging is off at /usr/bin/predictprotein line 230.
> > work_dir=/data/src/temp at /usr/bin/predictprotein line 336.
> > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/ PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17 PROFROOT=/usr/share/profphd/prof/ BIGBLASTDB=/data/src/rostlab-data/data/big/big BIG80BLASTDB=/data/src/rostlab-data/data/big/big_80 PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all norsp at /usr/bin/predictprotein line 383.
> > make: Entering directory '/data/src/temp'
> > make: Warning: File 'query.in' has modification time 3.2 s in the future
> > /usr/share/librg-utils-perl//copf.pl query.in formatIn=fasta formatOut=fasta fileOut=query.fasta exeConvertSeq=convert_seq
> > /usr/share/librg-utils-perl//copf.pl query.in formatIn=fasta formatOut=gcg fileOut=query.seqGCG exeConvertSeq=convert_seq
> > ncbi-seg query.fasta -x > query.segNorm
> > /usr/share/librg-utils-perl//copf.pl query.segNorm formatOut=gcg fileOut=query.segNormGCG
> > # blast call may throw warnings on STDERR - silence it when we are not in debug mode; blastpgp and blastall create a normally 0-sized 'error.log' - remove it
> > trap "rm -f error.log" EXIT; \
> > if ! ( blastpgp -F F -a 1 -j 3 -b 3000 -e 1 -h 1e-3 -d /data/src/rostlab-data/data/big/big_80 -i query.fasta -o query.blastPsiOutTmp -C query.chk -Q query.blastPsiMat ); then \
> > EXIT=$?; cat error.log >&2; exit $EXIT; \
> > fi
> > Killed
> > cat: error.log: No such file or directory
> > # blast call may throw warnings on STDERR - silence it when we are not in debug mode
> > trap "rm -f error.log" EXIT; \
> > if ! ( blastpgp -F F -a 1 -b 1000 -e 1 -d /data/src/rostlab-data/data/big/big -i query.fasta -o query.blastPsiAli.nz -R query.chk ); then \
> > EXIT=$?; cat error.log >&2; exit $EXIT; \
> > fi
> > [blastpgp] WARNING: -t larger than 1 not supported when restarting from a checkpoint; setting -t to 1
> >
> > [blastpgp] WARNING: posReadCheckpoint: Attempting to recover data from previous checkpoint
> >
> > [blastpgp] WARNING: posReadPosFreqsStandard: Could not open checkpoint file
> >
> > [blastpgp] WARNING: posReadCheckpoint: Data recovery failed
> >
> > [blastpgp] FATAL ERROR: blast: Error recovering from checkpoint
> > cat: error.log: No such file or directory
> > gzip -c -6 < 'query.blastPsiAli.nz' > 'query.blastPsiAli.gz'
> > # lkajan: we have to switch off filtering (default for blastpgp) or sequences like ASDSADADASDASDASDSADASA fail with
> > # 'WARNING: query: Could not calculate ungapped Karlin-Altschul parameters due to an invalid query sequence or its translation. Please verify the query sequence(s) and/or filtering options'
> > # Does switching off filtering hurt us? Loctree uses the results of this for extracting keywords from swissprot, so I am not worried.
> > # This blast call also often writes 'Selenocysteine (U) at position 59 replaced by X' - we are not really interested. Silence this in non-debug mode.
> > trap "rm -f error.log" EXIT; \
> > if ! ( blastall -F F -a 1 -p blastp -d /data/src/rostlab-data/data/swissprot/uniprot_sprot -b 1000 -e 100 -m 8 -i query.fasta -o query.blastpSwissM8 ); then \
> > EXIT=$?; cat error.log >&2; exit $EXIT; \
> > fi
> > /usr/share/librg-utils-perl//blastpgp_to_saf.pl fileInBlast=query.blastPsiOutTmp fileInQuery=query.fasta fileOutRdb=query.blastPsi80Rdb fileOutSaf=query.safBlastPsi80 red=100 maxAli=3000 tile=0
> > opened query.fasta at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 126.
> > blastfile: query.blastPsiOutTmp at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 127.
> > nohits: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 128.
> > iter: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 129.
> > blast+: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 130.
> > Died at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 76.
> > *** ERROR blastpgp_to_saf.pl : *** ERROR blastp_to_saf: blast file format not recognized
> > /usr/share/predictprotein/MakefilePP.mk:465: recipe for target 'query.safBlastPsi80' failed
> > make: *** [query.safBlastPsi80] Error 255
> > rm query.blastPsi80Rdb query.blastPsiAli.nz
> > make: Leaving directory '/data/src/temp'
> > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/ PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17 PROFROOT=/usr/share/profphd/prof/ BIGBLASTDB=/data/src/rostlab-data/data/big/big BIG80BLASTDB=/data/src/rostlab-data/data/big/big_80 PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all norsp failed: 512 at /usr/bin/predictprotein line 392.
> >
> --
> http://fam-tille.de
>
>

--
Best wishes,
Tanya.

