Control: found -1 38.90+dfsg-1
Control: tag -1 confirmed
Hi all,
Andreas Tille, on 2021-09-30:
> On Thu, Sep 30, 2021 at 01:22:23PM -0400, Robert wrote:
> > The bbmap package does not ship the needed resource files which causes some
> > of the included tools not to work, e.g. bbduk when trying to process some
> > fastq data, crashes with output like [1].
>
> Thanks a lot for the report. It's extremely helpful since several of our
> maintainers are not using this software and we really need to rely on
> user input.
Thank you Robert! Your report is very useful indeed!
[…]
> > $ bbduk.sh in1=fwd.fastq in2=rev.fastq ktrim=r k=21 mink=8 hdist=2 ftm=5
> > tpe tbo threads=48 out=out.fastq
> > java -ea -Xmx76702m -Xms76702m -cp /usr/share/java/bbmap.jar jgi.BBDuk
> > in1=fwd.fastq in2=rev.fastq ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo
> > threads=48 out=out.fastq
> > Executing jgi.BBDuk [in1=fwd.fastq, in2=rev.fastq, ktrim=r, k=21, mink=8,
> > hdist=2, ftm=5, tpe, tbo, threads=48, out=out.fastq]
> > Version 38.90
> >
> > Set threads to 48
> > maskMiddle was disabled because useShortKmers=true
> > Warning! Cannot find primes.txt.gz
> > /tmp/bbduk_test/file:/usr/share/java/bbmap.jar!/primes.txt.gz
> > at jgi.BBDuk.main(BBDuk.java:78)
>
> If we could turn this into a test, I could upload with the test included.
Andreas, I pulled some data files from python-biopython-doc,
and I think I managed to reproduce the problem on my end:
$ bbduk.sh \
in1=/usr/share/doc/python-biopython-doc/Tests/Quality/example.fastq \
in2=/usr/share/doc/python-biopython-doc/Tests/Quality/solexa_example.fastq \
ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 \
out=out.fastq
java -ea -Xmx7195m -Xms7195m -cp /usr/share/java/bbmap.jar jgi.BBDuk
in1=/usr/share/doc/python-biopython-doc/Tests/Quality/example.fastq
in2=/usr/share/doc/python-biopython-doc/Tests/Quality/solexa_example.fastq
ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 out=out.fastq
Executing jgi.BBDuk
[in1=/usr/share/doc/python-biopython-doc/Tests/Quality/example.fastq,
in2=/usr/share/doc/python-biopython-doc/Tests/Quality/solexa_example.fastq,
ktrim=r, k=21, mink=8, hdist=2, ftm=5, tpe, tbo, threads=48, out=out.fastq]
Version 38.93
Set threads to 48
maskMiddle was disabled because useShortKmers=true
Warning! Cannot find primes.txt.gz
/home/emollier/tmp/bbduk_test/file:/usr/share/java/bbmap.jar!/primes.txt.gz
java.lang.Exception
at dna.Data.findPath(Data.java:1247)
at dna.Data.findPath(Data.java:1194)
at shared.Primes.fetchPrimes(Primes.java:167)
at shared.Primes.(Primes.java:177)
at kmer.ScheduleMaker.(ScheduleMaker.java:155)
at jgi.BBDuk.(BBDuk.java:964)
at jgi.BBDuk.main(BBDuk.java:78)
Exception in thread "main" java.lang.ExceptionInInitializerError
at kmer.ScheduleMaker.(ScheduleMaker.java:155)
at jgi.BBDuk.(BBDuk.java:964)
at jgi.BBDuk.main(BBDuk.java:78)
Caused by: java.lang.NullPointerException
at fileIO.ByteFile.(ByteFile.java:43)
at fileIO.ByteFile1.(ByteFile1.java:98)
at fileIO.ByteFile1.(ByteFile1.java:94)
at shared.Primes.fetchPrimes(Primes.java:169)
at shared.Primes.(Primes.java:177)
... 3 more
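As a rough sketch of how a test could detect the crash above, assuming
the two messages seen in this trace are stable enough to serve as failure
markers (the helper name has_resource_failure and the log file name are
hypothetical, not part of bbmap):

```shell
#!/bin/sh
# Sketch of a regression check for the missing-resource crash.
# has_resource_failure is a hypothetical helper, not part of bbmap.

# Return 0 (true) if the captured bbduk output in file "$1" contains
# one of the failure markers seen in the trace above, non-zero otherwise.
has_resource_failure() {
    grep -qF -e 'Cannot find primes.txt.gz' \
             -e 'java.lang.ExceptionInInitializerError' "$1"
}

# Usage (not run here; FWD and REV are placeholders for two input files):
#   bbduk.sh in1="$FWD" in2="$REV" out=out.fastq > bbduk.log 2>&1
#   if has_resource_failure bbduk.log; then
#       echo "FAIL: resource files missing"
#       exit 1
#   fi
```

Matching on the log rather than only on the exit status is a deliberate
choice here: it ties the test to this specific failure, at the cost of
depending on the exact wording of the messages.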
I tested the patch written by Robert and applied by Andreas, and
processing now gets much further. For the autopkgtest, note that
I had to pick a dataset with the same dimensions in both files;
otherwise the processing fails, presumably because of intrinsic
data inconsistencies:
$ bbduk.sh \
in1=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_sanger.fastq \
in2=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_solexa.fastq \
ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 \
out=out.fastq
java -ea -Xmx7140m -Xms7140m -cp /usr/share/java/bbmap.jar jgi.BBDuk
in1=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_sanger.fastq
in2=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_solexa.fastq
ktrim=r k=21 mink=8 hdist=2 ftm=5 tpe tbo threads=48 out=out.fastq
Executing jgi.BBDuk
[in1=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_sanger.fastq,
in2=/usr/share/doc/python-biopython-doc/Tests/Quality/wrapping_as_solexa.fastq,
ktrim=r, k=21, mink=8, hdist=2, ftm=5, tpe, tbo, threads=48, out=out.fastq]
Version 38.93
Set threads to 48
maskMiddle was disabled because useShortKmers=true
0.018 seconds.
Initial:
Memory: max=7486m, total=7486m, free=7467m, used=19m
** WARNING! A KMER OPERATION