Hi Étienne,

I'm just including the Uploaders of ncbi-blast+ into this conversation
to make sure the information reaches the experts.  We should probably
reassign the bug but I'll leave this to those who know better.

Thanks a lot for your analysis.

      Andreas.

On Tue, Jun 09, 2020 at 02:25:58PM +0200, Étienne Mollier wrote:
> Hi all,
> 
> Andreas Tille, on 2020-06-08 16:01:33 +0200:
> > any volunteer to follow this hint from upstream?
> 
> Having a look at this issue, here is what I can tell so far.
> 
> > > Perhaps makeblastdb itself failed (and our wrapper didn't notice)? Those
> > > are the first files looked for after calling makeblastdb, to see if it
> > > could make a BLAST database.  Are there any GenBank/NC_005816.fna.n* or
> > > GenBank/NC_005816.faa.p* files present?
> > > 
> > > If it helps, the commands our script was trying to run were:
> > > 
> > > $ makeblastdb -dbtype nucl -in GenBank/NC_005816.fna \
> > > -parse_seqids -hash_index -max_file_sz 20MB  -taxid 10
> > > 
> > > and:
> > > 
> > > $ makeblastdb -dbtype prot -in GenBank/NC_005816.faa \
> > > -parse_seqids -hash_index -max_file_sz 20MB -taxid 10
> 
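
On the "our wrapper didn't notice" point, here is a minimal sketch in Python of how a caller could run the two commands above and fail loudly instead of going on to look for the database files. This is not the actual Biopython test code: the helper names build_makeblastdb_cmd and run_makeblastdb are made up for illustration, only the command-line flags are taken from the quoted commands, and it assumes makeblastdb exits with a non-zero status on failure, which I have not verified.

```python
import subprocess


def build_makeblastdb_cmd(dbtype, fasta_path):
    """Return the argument list for the makeblastdb call quoted above.

    Hypothetical helper; the flags are copied from the quoted commands.
    """
    return [
        "makeblastdb",
        "-dbtype", dbtype,
        "-in", fasta_path,
        "-parse_seqids",
        "-hash_index",
        "-max_file_sz", "20MB",
        "-taxid", "10",
    ]


def run_makeblastdb(dbtype, fasta_path):
    """Run makeblastdb and raise if it exits with a non-zero status.

    Assumption: a failed database build is reported through the exit status.
    """
    cmd = build_makeblastdb_cmd(dbtype, fasta_path)
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # The message may be on either stream, so report whichever is set.
        message = (result.stderr or result.stdout).strip()
        raise RuntimeError(
            "makeblastdb failed (exit %d): %s" % (result.returncode, message)
        )
    return result.stdout
```

Calling run_makeblastdb("nucl", "GenBank/NC_005816.fna") and then run_makeblastdb("prot", "GenBank/NC_005816.faa") would correspond to the two commands above.
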
> On my i686 machine, both of these commands end up in error,
> failing to allocate memory:
> 
>       $ makeblastdb -dbtype nucl -in GenBank/NC_005816.fna -parse_seqids \
>           -hash_index -max_file_sz 20MB  -taxid 10
>       
>       
>       Building a new DB, current time: 06/09/2020 08:28:08
>       New DB name:   /tmp/python-biopthon/Tests/GenBank/NC_005816.fna
>       New DB title:  GenBank/NC_005816.fna
>       Sequence type: Nucleotide
>       Deleted existing Nucleotide BLAST database named /tmp/python-biopthon/Tests/GenBank/NC_005816.fna
>       Keep MBits: T
>       Maximum file size: 20000000B
>       Adding sequences from FASTA; added 1 sequences in 0.284663 seconds.
>       
>       No volumes were created.
>       
>       BLAST Database creation error: mdb_env_open: Cannot allocate memory
> 
> Looking up the strace to see what happens exactly from a kernel
> point of view, the program attempts to map 3647256576 bytes of
> memory in which the stub of database will be built:
> 
>       lstat64("/tmp/python-biopthon/Tests/GenBank/NC_005816.fna.ndb", 0xbfc2e5ac) = -1 ENOENT (No such file or directory)
>       openat(AT_FDCWD, "/tmp/python-biopthon/Tests/GenBank/NC_005816.fna.ndb", O_RDWR|O_CREAT, 0664) = 4
>       fstatfs(4, {f_type=XFS_SB_MAGIC, f_bsize=4096, f_blocks=73645943, f_bfree=64080178, f_bavail=64080178, f_files=147363840, f_ffree=147171712, f_fsid={val=[65027, 0]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOATIME}) = 0
>       pread64(4, "", 92, 0)                   = 0
>       pwrite64(4, "\0\0\0\0\0\0\10\0\0\0\0\0\336\300\357\276\1\0\0\0\0\0\0\0\0\270d\331\0\20\0\0"..., 8192, 0) = 8192
>       mmap2(NULL, 3647256576, PROT_READ, MAP_SHARED, 4, 0) = -1 ENOMEM (Cannot allocate memory)
>                   ~~~~~~~~~~
> 
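
To put that number in perspective, a quick back-of-the-envelope check in Python. The 3 GiB figure assumes the common 3G/1G user/kernel split of a 32-bit Linux kernel; other splits exist, so treat that constant as an assumption rather than a measurement of this machine.

```python
GIB = 2 ** 30

MMAP_REQUEST = 3647256576      # length passed to mmap2() in the strace above
ADDRESS_SPACE_32BIT = 2 ** 32  # everything a 32-bit pointer can address
USER_SPACE_ASSUMED = 3 * GIB   # assumption: common 3G/1G user/kernel split


def fits(request, available):
    """True if a single mapping of `request` bytes could fit in `available`."""
    return request <= available


print("request: %.2f GiB" % (MMAP_REQUEST / GIB))
print("fits in 4 GiB of 32-bit addresses:",
      fits(MMAP_REQUEST, ADDRESS_SPACE_32BIT))
print("fits in assumed 3 GiB of user space:",
      fits(MMAP_REQUEST, USER_SPACE_ASSUMED))
```

The request comes out at about 3.40 GiB: below the 4 GiB a 32-bit pointer can express, but above the assumed 3 GiB of user space. In practice the room is smaller still, since the program, its libraries and its stack share the same address space.
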
> To rule out a few issues that could have caused more or less
> artificial memory starvation situations, I tried to bring the
> following changes to my configuration:
> 
>   - append an additional 4 GiB of swap through a file;
> 
>   - move to a PAE-aware kernel, which my original configuration
>     did not use because it had no need for memory past the
>     3 GiB limit anyway:
>       $ uname -sr
>       Linux 4.19.0-9-686-pae
>       $ grep PAE /boot/config-`uname -r`
>       CONFIG_X86_PAE=y
> 
>   - check RLIMIT_AS and RLIMIT_DATA to make sure they were not blocking:
>       $ prlimit   # filtered
>       AS         address space limit unlimited unlimited bytes
>       DATA       max data size       unlimited unlimited bytes
> 
>   - raise vm.max_map_count to roughly fifteen times the default
>     (65530), just in case:
>       $ cat /proc/sys/vm/max_map_count
>       1000000
> 
>   - enable memory overcommit and allow an unreasonably high
>     overcommit ratio:
>       $ grep . /proc/sys/vm/overcommit_*
>       /proc/sys/vm/overcommit_kbytes:0
>       /proc/sys/vm/overcommit_memory:1
>       /proc/sys/vm/overcommit_ratio:200
>     This should not matter, since a read-only shared file mapping
>     like this one does not need to be committed anyway; it was
>     only to rule that point out too.
> 
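
For anyone wanting to repeat these checks on another machine, a small sketch in Python that collects the same settings in one go. It only reads values and changes nothing. The helper names are made up for illustration, and it assumes a Linux host where the /proc/sys/vm files listed above exist; a file that is missing is reported as None.

```python
import resource


def describe_limit(value):
    """Render a resource limit the way prlimit does for the unlimited case."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return (soft, hard) limits for the address space and data segment."""
    limits = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (describe_limit(soft), describe_limit(hard))
    return limits


def read_vm_sysctl(name):
    """Read /proc/sys/vm/<name>, returning None if it is not available."""
    try:
        with open("/proc/sys/vm/" + name) as handle:
            return handle.read().strip()
    except OSError:
        return None


for limit_name, (soft, hard) in read_limits().items():
    print(limit_name, soft, hard)
for sysctl in ("max_map_count", "overcommit_memory", "overcommit_ratio"):
    print(sysctl, read_vm_sysctl(sysctl))
```
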
> For comparison, on 64-bit systems the size of the mmap is
> precisely 300 GB, and the command works very well regardless of
> how much physical memory is available on the host.
> 
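
A way to test the address-space theory without BLAST installed, sketched in Python. It is not what makeblastdb does internally: the strace shows a mapping far larger than the 8192 bytes written to the file, whereas Python refuses to map past the end of a file, so this sketch first grows a temporary file sparsely. The function name is made up for illustration, and it assumes the temporary directory sits on a filesystem that supports sparse files.

```python
import mmap
import os
import tempfile


def try_readonly_mapping(length):
    """Try a read-only shared mapping of `length` bytes over a sparse file.

    Returns None on success, or the exception that made the attempt fail.
    """
    with tempfile.TemporaryFile() as handle:
        try:
            # Grow the file without writing data; Python's mmap will not
            # map beyond the current end of the file.
            os.ftruncate(handle.fileno(), length)
            mapping = mmap.mmap(handle.fileno(), length,
                                flags=mmap.MAP_SHARED, prot=mmap.PROT_READ)
        except (OSError, OverflowError, ValueError) as exc:
            return exc
        mapping.close()
        return None


if __name__ == "__main__":
    # Same length as the failing mmap2() call in the strace above.
    print(try_readonly_mapping(3647256576))
```

If the theory holds, a 64-bit interpreter should print None and a 32-bit one should print an error, though on 32 bits the failure may come from Python's own size checks before mmap is even called.
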
> My current impression is that makeblastdb is unable to work
> properly on most 32-bit machines: the mapping it requests
> (3647256576 bytes, about 3.4 GiB) already exceeds the 3 GiB
> limit mentioned above, so it looks unlikely to fit in the
> address space of a 32-bit process, however much RAM or swap is
> available.
> 
> Have a nice day,
> -- 
> Étienne Mollier <etienne.moll...@mailoo.org>
> Fingerprint:  5ab1 4edf 63bb ccff 8b54  2fa9 59da 56fe fff3 882d
> Help find cures against the Covid-19 !  Give CPU cycles:
>   * Rosetta@home: https://boinc.bakerlab.org/rosetta/
>   * Folding@home: https://foldingathome.org/



-- 
http://fam-tille.de
