Re: cmsfs-fuse: mmap failed: Cannot allocate memory

David Boyes Thu, 15 Dec 2011 13:06:03 -0800

> -----Original Message-----
> From: Linux on 390 Port [mailto:[email protected]] On Behalf Of
> Rob van der Heij
> As Jan points out the FST is fragmented.


Agreed. However, each piece contains pointers to the next piece you need, and 
you need that information anyway, so following the breadcrumbs is not an 
operational loss: it happens in only two scenarios, at first access and after 
an update to the FST in R/W mode.

> The purpose of mmap() is that you
> map all blocks in virtual mem 

The purpose of mmap() is to map *A* specified object (disk, shared memory 
block, etc) of a specified size starting at a specified offset from the start 
of the object to *A* memory segment of equal size in the process' address 
space. It does not have to map ALL blocks of a disk to access some of the data 
on it. 

POSIX (IEEE Std 1003.1) definition of mmap():

The mmap() function shall establish a mapping between a process' address space 
and a file, shared memory object, or typed memory object. The format of the 
call is as follows:

    pa=mmap(addr, len, prot, flags, fildes, off);

The mmap() function shall establish a mapping between the address space of the 
process at an address pa for len bytes to the memory object represented by the 
file descriptor fildes at offset off for len bytes. The value of pa is an 
implementation-defined function of the parameter addr and the values of flags, 
further described below. A successful mmap() call shall return pa as its 
result. The address range starting at pa and continuing for len bytes shall be 
legitimate for the possible (not necessarily current) address space of the 
process. The range of bytes starting at off and continuing for len bytes shall 
be legitimate for the possible (not necessarily current) offsets in the file, 
shared memory object, or typed memory object represented by fildes.
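As a minimal sketch of mapping only a window of an object rather than the 
whole thing (the struct and function names here are hypothetical, not taken 
from cmsfs-fuse): on Linux the offset handed to mmap() must be a multiple of 
the page size, so the helper rounds the offset down and remembers the slack.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper type: one mapped window onto part of a file or device. */
struct window {
    void  *base;    /* page-aligned address returned by mmap() */
    size_t maplen;  /* number of bytes actually mapped */
    char  *data;    /* points at the byte at the requested offset */
};

/* Map len bytes starting at byte offset off of fd.
 * Returns 0 on success, -1 on failure (errno is set by mmap). */
static int window_map(int fd, off_t off, size_t len, int prot, struct window *w)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || off < 0 || len == 0)
        return -1;

    off_t  aligned = off - (off % page);     /* round down to a page boundary */
    size_t slack   = (size_t)(off - aligned);

    void *p = mmap(NULL, len + slack, prot, MAP_SHARED, fd, aligned);
    if (p == MAP_FAILED)
        return -1;

    w->base   = p;
    w->maplen = len + slack;
    w->data   = (char *)p + slack;
    return 0;
}

/* Release a window obtained from window_map(). */
static int window_unmap(struct window *w)
{
    return munmap(w->base, w->maplen);
}
```

Only len + slack bytes of address space are consumed per window, however 
large the underlying disk is.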

> and then simply access the blocks in memory.
> Linux does the I/O under the covers.

I follow the concept, and see the advantages of using mmap to do the I/O under 
the covers. At this point, we're optimizing to minimize the amount of data we 
need, and thus the impact on other stuff that uses memory in the same virtual 
machine (and WSS of same).

> Since your blocks can be anywhere on
> disk, you map the entire thing. 

Here's where we diverge. 

There are two issues here: 

1) accessing the minidisk and representing its contents to Linux at a point in 
time
2) accessing the content of the minidisk

Mmap()ing the whole disk is a convenient solution to both problems, HOWEVER:

To access a minidisk and represent it to Linux, you do NOT need every block on 
the disk to be represented in a structure; you need the label data and the FST 
data (which, btw, you need to read first ANYWAY to mmap the whole disk, as you 
need to know the logical number of blocks to set up the mmap!). 

To use the files on the minidisk, you need the blocks CONTAINED in the file, 
not the entire disk. You get that from the FST and you mmap() those blocks.
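A rough sketch of the bookkeeping this implies, under stated assumptions: the 
block list for a file has already been extracted from the FST, block numbers 
are 1-based, and the names below are invented for illustration. Consecutive 
blocks are coalesced into runs so that each run could be covered by one 
mmap() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: a run of consecutive disk blocks belonging to one file. */
struct run {
    uint32_t first;  /* first block number of the run */
    uint32_t count;  /* number of consecutive blocks */
};

/* Coalesce a file's block numbers (in file order) into runs.
 * Returns the number of runs written, or -1 if out[] is too small. */
static ptrdiff_t coalesce_blocks(const uint32_t *blocks, size_t n,
                                 struct run *out, size_t max_runs)
{
    size_t nruns = 0;

    for (size_t i = 0; i < n; i++) {
        if (nruns > 0 &&
            out[nruns - 1].first + out[nruns - 1].count == blocks[i]) {
            out[nruns - 1].count++;          /* extends the current run */
        } else {
            if (nruns == max_runs)
                return -1;
            out[nruns].first = blocks[i];    /* starts a new run */
            out[nruns].count = 1;
            nruns++;
        }
    }
    return (ptrdiff_t)nruns;
}

/* Byte offset of a block on the disk, ASSUMING 1-based block numbers. */
static uint64_t block_offset(uint32_t block, uint32_t blksize)
{
    return (uint64_t)(block - 1) * blksize;
}
```

Each run then translates to an offset of block_offset(first, blksize) and a 
length of count * blksize, which is all mmap() needs for that piece of the 
file.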

Quote (again from IEEE Std 1003.1): Use of mmap() may reduce the amount of 
memory available to other memory allocation functions. 

This is what triggered the discussion. In no case do you ever need the entire 
set of blocks on the disk at the same time, unless they are contained in a 
single large file, which our use case (big disk with lots of small to 
medium-size files) makes unlikely, if not explicitly impossible by definition 
of the problem. 

> To map just record 3-5 is no gain if you need
> to point still at the rest of the blocks.

See above. It is a gain at access time: you don't need ALL the blocks, only the 
ones that identify the volume, give a view of the volume contents, and show 
where the interesting content is, or at least starts. For R/O you need to 
build an in-memory copy exactly once (on first access to the minidisk; then you 
can use it until the next access). For R/W you need a live copy in 
memory of the entire FST, which you need to build and maintain, regardless of 
activity or access method. The in-memory copy does not have to be discontiguous 
-- in fact, you *want* it to be contiguous so you can use simple indexing of a 
structure pattern over the FST entries for performance. 
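To illustrate the contiguous copy, here is a sketch that gathers fragmented 
FST pieces into one array so that entry i is simply tab[i]. The entry layout 
is a placeholder (a fixed 64-byte record with an 8-byte name and 8-byte type 
is assumed for the example), and the piece descriptor is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder entry layout, ASSUMED for illustration: 64 bytes per entry. */
struct fst_entry {
    char          name[8];
    char          type[8];
    unsigned char rest[48];  /* remaining fields, not decoded here */
};
_Static_assert(sizeof(struct fst_entry) == 64, "entry must be 64 bytes");

/* Hypothetical: one fragment of the on-disk FST, already read or mapped. */
struct fst_piece {
    const void *data;      /* start of the entries in this fragment */
    size_t      nentries;  /* number of entries it holds */
};

/* Copy all fragments, in order, into one contiguous array.
 * Returns a malloc()ed table (caller frees) or NULL on failure. */
static struct fst_entry *fst_gather(const struct fst_piece *pieces,
                                    size_t npieces, size_t *total)
{
    size_t n = 0;
    for (size_t i = 0; i < npieces; i++)
        n += pieces[i].nentries;

    struct fst_entry *tab = malloc(n ? n * sizeof *tab : 1);
    if (tab == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < npieces; i++) {
        if (pieces[i].nentries == 0)
            continue;
        memcpy(&tab[pos], pieces[i].data, pieces[i].nentries * sizeof *tab);
        pos += pieces[i].nentries;
    }

    *total = n;
    return tab;
}
```

After the gather step, the on-disk fragmentation no longer matters to 
lookups; the breadcrumbs are followed once, at build time.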

You don't need the other data AT ALL until you actually access a file in some 
way, and then you need only the blocks that comprise the file you want. 

> DIAG250 is a block driver, just like Linux can do. Extra work is to allocate
> memory to hold blocks while you work on them, make sure to flush the
> updated pages, etc.

I suggested looking at DIAG 250 for ideas on how to approach the problem. I 
explicitly said that I do not want a duplicate of DIAG 250. 

Yes, it's going to be a little more work on the housekeeping tasks if you want 
R/W access, but you'd have to do substantially the same housekeeping with the 
full-disk map approach. It's mostly buffer management issues, though; the 
actual update to the data on disk can still be done with memcpy and you still 
get the benefit of mmap goodness; you just have to think about it a bit more. 
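A sketch of what such an update might look like with a per-window mapping 
(the function name is made up, and real code would presumably keep windows 
around rather than map and unmap per update): map the pages covering the 
target bytes, memcpy the new data in, and flush with msync().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: overwrite len bytes at byte offset off of fd through
 * a temporary shared mapping. The range must lie within the existing object.
 * Returns 0 on success, -1 on failure. */
static int update_bytes(int fd, off_t off, const void *src, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || off < 0 || len == 0)
        return -1;

    off_t  aligned = off - (off % page);     /* mmap offset must be page aligned */
    size_t slack   = (size_t)(off - aligned);
    size_t maplen  = len + slack;

    char *base = mmap(NULL, maplen, PROT_READ | PROT_WRITE, MAP_SHARED,
                      fd, aligned);
    if (base == MAP_FAILED)
        return -1;

    memcpy(base + slack, src, len);          /* the update itself */

    int rc = msync(base, maplen, MS_SYNC);   /* flush the dirty pages */
    if (munmap(base, maplen) != 0)
        rc = -1;
    return rc;
}
```

The housekeeping cost is the alignment arithmetic and the explicit flush; 
the data path is still a plain memcpy.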

> The approach to try map and take the long route as alternative is nice best of
> both, but double code. 

Rough consensus and running code.  That's probably the compromise setting -- I 
could live with it.  I could live with messing with the ulimit if I was handing 
it one ginormous file -- that's a real outlier. It shouldn't be the default 
case. 

Bottom line:  I think we are in violent agreement that we now know what we 
don't want. 8-) Otherwise, we're just doing a bit of whiteboard discussion on 
other ways to do it. 

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
----------------------------------------------------------------------
For more information on Linux on System z, visit
http://wiki.linuxvm.org/
