Hi Andreas,
It would be fine to drop this line, then.
I also found a problem that involves this package.
Last weekend I started to work on predictprotein. The hardest part was
getting it to run at all.
At some point I found the instructions at
https://wiki.debian.org/DebianMed/PredictProtein and spent some time
downloading the database. After I had downloaded and installed it and then
ran predictprotein, I got multiple error messages
(output_with_errors.txt). It turned out that when one of the Perl scripts
in librg-utils-perl calls blastpgp on that database,
blastpgp -F F -a 1 -j 3 -b 3000 -e 1 -h 1e-3 -d
/data/src/rostlab-data/data/big/big_80 -i query.fasta -o
query.blastPsiOutTmp -C query.chk -Q query.blastPsiMat
blastpgp ends with a "Killed" message and produces an incomplete output
file (query.blastPsiOutTmp). The script in librg-utils-perl is correct, and
the call in predictprotein is correct; it is blastpgp itself that fails.
I thought an incorrect database format could be the reason, because the
packaged version of ncbi-blast+ (the package blastpgp belongs to) uses the
latest version of that database format, and the database from RostLab's
website probably isn't the latest.
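
A "Killed" line from the shell means the process received SIGKILL; it does
not say why. With a database this large, one possible cause is the kernel's
out-of-memory killer, but that is only an assumption and the log does not
confirm it. A minimal sketch for checking how a command ended (the helper
names describe_exit and run_and_describe are made up for illustration):

```python
import signal
import subprocess


def describe_exit(returncode):
    """Explain a subprocess return code in words."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as a negative number.
        name = signal.Signals(-returncode).name
        return f"terminated by signal {name}"
    if returncode == 0:
        return "exited normally"
    return f"exited with status {returncode}"


def run_and_describe(argv):
    """Run argv (a list of strings) and describe how the process ended."""
    completed = subprocess.run(argv, check=False)
    return describe_exit(completed.returncode)
```

Calling run_and_describe with the blastpgp arguments shown above would report
"terminated by signal SIGKILL" if the process was killed from outside, which
points away from a plain format error inside blastpgp.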
I downloaded one of the databases from the NCBI FTP server
(ftp://ftp.ncbi.nlm.nih.gov/blast/db/) and tried to run predictprotein with
that data. It worked!
But now I get an error while metastudent runs (output in some_output.txt);
I'm working on a fix now.
There are two things I don't understand:
Is there any package that contains a copy of the current version of the
blastp database, or a small part of it? It seems that the autopkgtest test
suite should use a smaller portion of the blastp database.
For now it is unclear how to test predictprotein with autopkgtest, since
a correct run also requires a local copy of a (possibly) huge database
(~30GB for the copy from RostLab's website). Should ncbi-blast+/ncbi-tools6
perhaps download and install it? Predictprotein has special parameters for
the different databases, and the path to the blast installation can be
provided by hand, which makes it possible to call it with a smaller database
in a test suite run. But that will only work if blastpgp from ncbi-blast+
works correctly with the same version of the database. So the better way may
be to install and test database usage in the ncbi-blast+ tests, and to use
the default database installed with ncbi-blast+ (if one gets installed).
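
To illustrate the "smaller portion" idea, here is a rough sketch that copies
only the first few records of a FASTA file. It assumes the sequences are
available as plain FASTA, which may not be the case for the pre-formatted
databases; the function names and the paths passed to them are hypothetical.
The subset would still have to be turned into a BLAST database afterwards
(for example with formatdb from the legacy NCBI tools).

```python
def first_fasta_records(lines, limit):
    """Yield the lines belonging to the first `limit` FASTA records."""
    seen = 0
    for line in lines:
        if line.startswith(">"):
            # Every header line starts a new record.
            seen += 1
            if seen > limit:
                break
        if seen > 0:
            # Skip anything that precedes the first header.
            yield line


def write_subset(src_path, dst_path, limit):
    """Copy the first `limit` records from src_path into dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(first_fasta_records(src, limit))
```

A test database built this way would be small enough to ship with the test
suite, at the price of giving different predictions than the full database.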
Could you also check whether the database from
https://wiki.debian.org/DebianMed/PredictProtein really doesn't work? I
have an unstable internet connection and am not sure that the file was not
corrupted.
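
If a checksum for that download is available from somewhere (I don't know
whether one is published, so this is an assumption), a corrupted file could
be ruled out with something like the sketch below. The function names are
made up for illustration.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use flat even for a ~30GB file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Without a published checksum, comparing the digests of two independent
downloads would at least show whether my copy differs from yours.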
2016-07-13 11:08 GMT+03:00 Andreas Tille <[email protected]>:
> Hi Tanya,
>
> while looking over older commits I wondered what
> autopkgtest-pkg-perl in librg-utils-perl is actually doing.
> The build log does not show anything but
>
> Nothing to be done for 'check'.
>
> lines. While I think that your change to fix the error "Can't use
> 'defined(@array)'" is well worth an upload, I wonder whether the test is
> doing nothing and the line should rather be dropped from debian/control.
>
> What do you think?
>
> Kind regards
>
> Andreas.
>
> --
> http://fam-tille.de
>
>
--
Best wishes,
Tanya.
cache merging is off at /usr/bin/predictprotein line 230.
work_dir=/data/src/temp at /usr/bin/predictprotein line 336.
make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1
BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
PROFROOT=/usr/share/profphd/prof/
BIGBLASTDB=/data/src/rostlab-data/data/aa/pdbaa
BIG80BLASTDB=/data/src/rostlab-data/data/aa/pdbaa
PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all
norsp at /usr/bin/predictprotein line 383.
make: Entering directory '/data/src/temp'
metastudent -i query.fasta -o query.metastudent --silent --debug
mkdir -p /tmp/metastudentulQjHj/methodC;cd
/usr/lib/python2.7/dist-packages/metastudentPkg/lib/groupC;./CafaWrapper3.pl
/tmp/metastudentulQjHj/query.fasta_eval1.0_iters3_srcgoasp.mfo.blast
/tmp/metastudentulQjHj/methodC/output.MFO.txt 0 /tmp/metastudentulQjHj/methodC
!!!Error!!! mkdir -p /tmp/metastudentulQjHj/methodC;cd
/usr/lib/python2.7/dist-packages/metastudentPkg/lib/groupC;./CafaWrapper3.pl
/tmp/metastudentulQjHj/query.fasta_eval1.0_iters3_srcgoasp.mfo.blast
/tmp/metastudentulQjHj/methodC/output.MFO.txt 0 /tmp/metastudentulQjHj/methodC
65280
Can't use a hash as a reference at /usr/share/perl5/GO/IO/Dotty.pm line 104.
Compilation failed in require at ./treehandler.pl line 10.
BEGIN failed--compilation aborted at ./treehandler.pl line 10.
./treehandler.pl -mfo transitiveClosure2014.txt -bpo transitiveClosure2014.txt
-cco transitiveClosure2014.txt -method 3 -pred
/tmp/metastudentulQjHj/methodC/blast.out -scoring 0 failed: 255 at
./CafaWrapper3.pl line 16.
Error occurred: IOError
Traceback (most recent call last):
File "/usr/bin/metastudent", line 721, in <module>
runIt(tempfile, inputFastaFilePath, outputFilePath, outputBlast,
blastKickstartDatabasePaths, ontologies, blastOnly, keepTemp, allPreds, debug,
noNames, withImages)
File "/usr/bin/metastudent", line 187, in runIt
predLinesDict["C"] = runMethodC(blastKickstartDatabasePath,
fastaFilePathLocal, tmpDirPath, configMap["GROUP_C_SCORING_%s" % (ontology) ],
ontology, configMap, debug)
File "/usr/lib/python2.7/dist-packages/metastudentPkg/runMethods.py", line
206, in runMethodC
with open(outputFilePath) as f:
IOError: [Errno 2] No such file or directory:
'/tmp/metastudentulQjHj/methodC/output.MFO.txt'
/usr/share/predictprotein/MakefilePP.mk:403: recipe for target
'query.metastudent.BPO.txt' failed
make: *** [query.metastudent.BPO.txt] Error 1
make: Leaving directory '/data/src/temp'
make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1
BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
PROFROOT=/usr/share/profphd/prof/
BIGBLASTDB=/data/src/rostlab-data/data/aa/pdbaa
BIG80BLASTDB=/data/src/rostlab-data/data/aa/pdbaa
PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all
norsp failed: 512 at /usr/bin/predictprotein line 392.
cache merging is off at /usr/bin/predictprotein line 230.
work_dir=/data/src/temp at /usr/bin/predictprotein line 336.
make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1
BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
PROFROOT=/usr/share/profphd/prof/
BIGBLASTDB=/data/src/rostlab-data/data/big/big
BIG80BLASTDB=/data/src/rostlab-data/data/big/big_80
PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all
norsp at /usr/bin/predictprotein line 383.
make: Entering directory '/data/src/temp'
make: Warning: File 'query.in' has modification time 3.2 s in the future
/usr/share/librg-utils-perl//copf.pl query.in formatIn=fasta formatOut=fasta
fileOut=query.fasta exeConvertSeq=convert_seq
/usr/share/librg-utils-perl//copf.pl query.in formatIn=fasta formatOut=gcg
fileOut=query.seqGCG exeConvertSeq=convert_seq
ncbi-seg query.fasta -x > query.segNorm
/usr/share/librg-utils-perl//copf.pl query.segNorm formatOut=gcg
fileOut=query.segNormGCG
# blast call may throw warnings on STDERR - silence it when we are not in debug
mode; blastpgp and blastall create a normally 0-sized 'error.log' - remove it
trap "rm -f error.log" EXIT; \
if ! ( blastpgp -F F -a 1 -j 3 -b 3000 -e 1 -h 1e-3 -d
/data/src/rostlab-data/data/big/big_80 -i query.fasta -o query.blastPsiOutTmp
-C query.chk -Q query.blastPsiMat ); then \
EXIT=$?; cat error.log >&2; exit $EXIT; \
fi
Killed
cat: error.log: No such file or directory
# blast call may throw warnings on STDERR - silence it when we are not in debug
mode
trap "rm -f error.log" EXIT; \
if ! ( blastpgp -F F -a 1 -b 1000 -e 1 -d /data/src/rostlab-data/data/big/big
-i query.fasta -o query.blastPsiAli.nz -R query.chk ); then \
EXIT=$?; cat error.log >&2; exit $EXIT; \
fi
[blastpgp] WARNING: -t larger than 1 not supported when restarting from a
checkpoint; setting -t to 1
[blastpgp] WARNING: posReadCheckpoint: Attempting to recover data from previous
checkpoint
[blastpgp] WARNING: posReadPosFreqsStandard: Could not open checkpoint file
[blastpgp] WARNING: posReadCheckpoint: Data recovery failed
[blastpgp] FATAL ERROR: blast: Error recovering from checkpoint
cat: error.log: No such file or directory
gzip -c -6 < 'query.blastPsiAli.nz' > 'query.blastPsiAli.gz'
# lkajan: we have to switch off filtering (default for blastpgp) or sequences
like ASDSADADASDASDASDSADASA fail with
# 'WARNING: query: Could not calculate ungapped Karlin-Altschul parameters due
to an invalid query sequence or its translation. Please verify the query
sequence(s) and/or filtering options'
# Does switching off filtering hurt us? Loctree uses the results of this for
extracting keywords from swissprot, so I am not worried.
# This blast call also often writes 'Selenocysteine (U) at position 59 replaced
by X' - we are not really interested. Silence this in non-debug mode.
trap "rm -f error.log" EXIT; \
if ! ( blastall -F F -a 1 -p blastp -d
/data/src/rostlab-data/data/swissprot/uniprot_sprot -b 1000 -e 100 -m 8 -i
query.fasta -o query.blastpSwissM8 ); then \
EXIT=$?; cat error.log >&2; exit $EXIT; \
fi
/usr/share/librg-utils-perl//blastpgp_to_saf.pl
fileInBlast=query.blastPsiOutTmp fileInQuery=query.fasta
fileOutRdb=query.blastPsi80Rdb fileOutSaf=query.safBlastPsi80 red=100
maxAli=3000 tile=0
opened query.fasta at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 126.
blastfile: query.blastPsiOutTmp at
/usr/share/librg-utils-perl//blastpgp_to_saf.pl line 127.
nohits: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 128.
iter: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 129.
blast+: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 130.
Died at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 76.
*** ERROR blastpgp_to_saf.pl : *** ERROR blastp_to_saf: blast file format not
recognized
/usr/share/predictprotein/MakefilePP.mk:465: recipe for target
'query.safBlastPsi80' failed
make: *** [query.safBlastPsi80] Error 255
rm query.blastPsi80Rdb query.blastPsiAli.nz
make: Leaving directory '/data/src/temp'
make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query -j 1
BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
PROFROOT=/usr/share/profphd/prof/
BIGBLASTDB=/data/src/rostlab-data/data/big/big
BIG80BLASTDB=/data/src/rostlab-data/data/big/big_80
PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk all
norsp failed: 512 at /usr/bin/predictprotein line 392.