Hi Étienne,

I'm just including the Uploaders of ncbi-blast+ in this conversation to
make sure the information reaches the experts. We should probably
reassign the bug, but I'll leave this to those who know better.

Thanks a lot for your analysis

      Andreas.

On Tue, Jun 09, 2020 at 02:25:58PM +0200, Étienne Mollier wrote:
> Hi all,
>
> Andreas Tille, on 2020-06-08 16:01:33 +0200:
> > any volunteer to follow this hint of upstream?
>
> Having a look at this issue, here is what I can tell so far.
>
> > > Perhaps makeblastdb itself failed (and our wrapper didn't notice)? Those
> > > are the first files looked for after calling makeblastdb, to see if it
> > > could make a BLAST database. Are there any GenBank/NC_005816.fna.n* or
> > > GenBank/NC_005816.faa.p* files present?
> > >
> > > If it helps, the commands our script was trying to run were:
> > >
> > > $ makeblastdb -dbtype nucl -in GenBank/NC_005816.fna \
> > >     -parse_seqids -hash_index -max_file_sz 20MB -taxid 10
> > >
> > > and:
> > >
> > > $ makeblastdb -dbtype prot -in GenBank/NC_005816.faa \
> > >     -parse_seqids -hash_index -max_file_sz 20MB -taxid 10
>
> On my i686 machine, both of these commands end up in error,
> failing to allocate memory:
>
>     $ makeblastdb -dbtype nucl -in GenBank/NC_005816.fna -parse_seqids \
>         -hash_index -max_file_sz 20MB -taxid 10
>
>
>     Building a new DB, current time: 06/09/2020 08:28:08
>     New DB name:   /tmp/python-biopthon/Tests/GenBank/NC_005816.fna
>     New DB title:  GenBank/NC_005816.fna
>     Sequence type: Nucleotide
>     Deleted existing Nucleotide BLAST database named
>     /tmp/python-biopthon/Tests/GenBank/NC_005816.fna
>     Keep MBits: T
>     Maximum file size: 20000000B
>     Adding sequences from FASTA; added 1 sequences in 0.284663 seconds.
>
>     No volumes were created.
>
>     BLAST Database creation error: mdb_env_open: Cannot allocate memory
>
> Looking at the strace to see what happens exactly from a kernel
> point of view, the program attempts to map 3647256576 bytes of
> memory in which the stub of the database will be built:
>
>     lstat64("/tmp/python-biopthon/Tests/GenBank/NC_005816.fna.ndb",
>         0xbfc2e5ac) = -1 ENOENT (No such file or directory)
>     openat(AT_FDCWD,
>         "/tmp/python-biopthon/Tests/GenBank/NC_005816.fna.ndb",
>         O_RDWR|O_CREAT, 0664) = 4
>     fstatfs(4, {f_type=XFS_SB_MAGIC, f_bsize=4096, f_blocks=73645943,
>         f_bfree=64080178, f_bavail=64080178, f_files=147363840,
>         f_ffree=147171712, f_fsid={val=[65027, 0]}, f_namelen=255,
>         f_frsize=4096, f_flags=ST_VALID|ST_NOATIME}) = 0
>     pread64(4, "", 92, 0) = 0
>     pwrite64(4,
>         "\0\0\0\0\0\0\10\0\0\0\0\0\336\300\357\276\1\0\0\0\0\0\0\0\0\270d\331\0\20\0\0"...,
>         8192, 0) = 8192
>     mmap2(NULL, 3647256576, PROT_READ, MAP_SHARED, 4, 0) = -1 ENOMEM (Cannot allocate memory)
>                 ~~~~~~~~~~
>
> To rule out a few issues that could have caused more or less
> artificial memory starvation, I tried the following changes to
> my configuration:
>
>   - append an additional 4 GiB of swap through a file;
>
>   - move to a PAE aware kernel since my original configuration
>     had no use for virtual memory extension past the 3 GiB limit
>     anyway:
>         $ uname -sr
>         Linux 4.19.0-9-686-pae
>         $ grep PAE /boot/config-`uname -r`
>         CONFIG_X86_PAE=y
>
>   - check RLIMIT_AS and RLIMIT_DATA to make sure they were not
>     blocking:
>         $ prlimit  # filtered
>         AS      address space limit    unlimited unlimited bytes
>         DATA    max data size          unlimited unlimited bytes
>
>   - increase the vm.max_map_count by two orders of magnitude
>     compared to the default (65536), just in case:
>         $ cat /proc/sys/vm/max_map_count
>         1000000
>
>   - enable memory overcommit and allow unreasonable levels of
>     commit ratios:
>         $ grep . /proc/sys/vm/overcommit_*
>         /proc/sys/vm/overcommit_kbytes:0
>         /proc/sys/vm/overcommit_memory:1
>         /proc/sys/vm/overcommit_ratio:200
>     but that shouldn't matter since, with this kind of mmap, the
>     memory does not need to be committed anyway; this was just
>     to rule out that point too.
>
> For comparison, on 64-bit systems, the size of the mmap is
> precisely 300 GB, and the command works very well whatever
> amount of physical memory is available on the host.
>
> My current impression is that makeblastdb is unable to work
> properly on most 32-bit machines, because the amount of memory
> the process needs to address can all too easily exceed 32-bit
> architectural limits.
>
> Have a nice day,
> --
> Étienne Mollier <etienne.moll...@mailoo.org>
> Fingerprint:  5ab1 4edf 63bb ccff 8b54  2fa9 59da 56fe fff3 882d
> Help find cures against the Covid-19 !  Give CPU cycles:
>   * Rosetta@home: https://boinc.bakerlab.org/rosetta/
>   * Folding@home: https://foldingathome.org/

--
http://fam-tille.de
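
P.S. A note on the arithmetic behind the failing mmap2() call quoted
above. What follows is only a sketch in Python, under two assumptions
that nobody in this thread has confirmed: that "precisely 300 GB" means
300 * 10^9 bytes, and that the i686 kernel uses the common 3 GiB user /
1 GiB kernel address split. The helper names are made up for
illustration. Under those assumptions, the length seen in the strace is
exactly the 64-bit map size truncated to 32 bits, and it could not fit
in the user address space whatever the swap, PAE or overcommit settings
are.

```python
GIB = 2 ** 30

# Assumption: "precisely 300 GB" on 64-bit hosts means 300 * 10**9 bytes.
REQUESTED_MAP_SIZE = 300 * 10 ** 9

# Assumption: common 3 GiB user / 1 GiB kernel split on i686 Linux.
USER_SPACE_32BIT = 3 * GIB


def truncate_to_word(value, bits):
    """Return value as stored in an unsigned integer that is `bits` wide."""
    return value % (1 << bits)


def fits(map_size, user_space):
    """True if one mapping of map_size bytes could fit in user_space bytes."""
    return map_size <= user_space


size_32 = truncate_to_word(REQUESTED_MAP_SIZE, 32)

print(size_32)                          # 3647256576, the length in the strace
print(round(size_32 / GIB, 1))          # about 3.4 (GiB)
print(fits(size_32, USER_SPACE_32BIT))  # False
```

If this reading is right, the request is larger than the whole user
address space of a 32-bit process, so none of the tunables tried above
could have helped; only a smaller map size on 32-bit builds would.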