On Sat, 13 Nov 2004 17:43:37 -0500, Uri Guttman <[EMAIL PROTECTED]> wrote:
> >>>>> "BT" == Ben Tilly <[EMAIL PROTECTED]> writes:
> 
>   BT> How was I confusing issues?  What I meant is that calling mmap does
>   BT> not use significant amounts of RAM.  (The OS needs some to track
>   BT> that the mapping exists, but that should be it.)  Once you actually use
>   BT> the data that you mmapped in, file contents will be swapped in, and
>   BT> RAM will be taken, but not until then.
> 
> mmap uses virtual ram which means physical ram and swap space. so mmap can
> suck up as much physical ram as you have if you allocate it.

We are both right, and we are both wrong.  The reality is that
any such behaviour on the part of mmap is OS and
implementation dependent.  And an intelligent OS will make
much of it configurable.  

Let me refer to my local Linux manpage.  For this purpose I'd
specify MAP_SHARED.  In that case edits made to memory are
made to the file, and there is no need to reserve RAM or swap.
(Not that Linux would reserve RAM anyway; Linux allows
overcommitting.)  Something close to the behaviour you describe
may happen if you use MAP_PRIVATE and don't specify
MAP_NORESERVE.  If you specify MAP_NORESERVE you won't use
swap.  (You might get a SIGSEGV on a write when RAM is not
available.)

Which reinforces the point that specific details about the
side-effects of mmap (or any system call) are implementation
dependent and should never be assumed.

>   BT> As for a 3 GB limit, now that you mention it, I heard something
>   BT> about that.  But I didn't pay attention since I don't need it right now.
>   BT> I've also heard about Intel's large addressing extensions (keep 2GB
>   BT> in normal address space, page around the top 2 GB, you get 64 GB
>   BT> of addressible memory).  I'm curious about how (or if) the two can
>   BT> cooperate.
> 
> eww, that sounds like manual management of extended memory. like the
> days of overlays or even pdp-11 extended memory (which i used to access
> 22 bit address (yes, 22 bit PHYSICAL space) from a 16 bit cpu.). not
> something you want to deal with unless you have to.

You got it.  Worse yet, the way I see it, within the next 5 years
our industry must decide whether the bad old days are going to
return, and I don't know which way it will jump.

The problem is that consumer computers will soon use over 4 GB of
RAM.  The obvious clean solution is to switch to 64 bit CPUs. But as
soon as you go to 64-bit pointers, a lot of programs and data
structures grow (worst case they double) so there is a big jump in
memory needs.  (Rather than being a bit over your needs, you are a
lot over.)  And when data bloats, moving that data around the
computer slows down as well.  Programmer pain is invisible.  That
size and performance hit is not.

A couple of years ago Intel quietly added an addressing extension to
allow for up to 64 GB of RAM, and then pursued a non-consumer
64-bit strategy (which flopped).  The way I read that is that Intel
expects the consumer industry to do what it did for the decade after
the 16-to-32-bit conversion should have happened: stick with smaller
pointers and swallow the addressing difficulty.

AMD's strategy was more public.  They came out with a 64-bit CPU
that solved a bunch of problems with the x86 architecture, meaning
that if you switch to 64-bit mode there is a good chance that your
code will speed up.  Their CPU has been successful enough that
Intel has been compelled to issue a copycat CPU.

But which way the industry will jump is still unclear to me.  To me
the first good test is what happens when high-end games start
having memory requirements that are too big for straightforward
32-bit access.  (A limit which conventionally is placed at 2 GB, but
can be stretched to somewhere between 2 GB and 4 GB.)  Will they
manually manage some big chunks of data, or will they require an
AMD Athlon 64-compatible computer?

> as for the original
> problem, i keep saying that mmap will give little help if the input
> matrix won't fit into physical ram. once you start swapping (manually or
> virtually) all bets are off on speed of any transpostion algorithm. you
> have to start counting disk accesses and once you do, who care how it
> was done (mmap or whatever)?

mmap with MAP_SHARED may reduce your RAM requirements and
improve your access speed.  I agree that the difference is pretty
marginal.

As for your "counting disk accesses", I've already pointed out that
disk accesses are not created equal.  If you really care about the
performance of the application, you need to benchmark, not make
simplistic estimates.  Because unless you know a lot of detail about
how the disk drive works, you can't easily predict what the actual
performance will be.

[...]
>   BT> However the over-committed allocation comment confuses me.
>   BT> Why would a single mmap result in over committing memory?
> 
> you can allocate all the virtual space allowed when you don't need
> it. then mmap gains you little. mmap is best used as a window into a
> large file. note that page faults when using mmap will block a process
> until that page is swapped in. so some systems use small processes to
> touch shared (mmaped) virtual ram so that block and when they wake up
> the main process can use the real ram.

See my comments above about the detailed behaviour of mmap
under Linux.  Also note that I was assuming that for this application
you'd use MAP_SHARED, in which case no memory needs to be
allocated to the process at all.

As for page faults, blocking on a page fault is no different from
blocking on a read.  And nothing stops you from doing this with the
regular filesystem.  But if you're going to go to this much work,
you're going to take every obvious optimization first, and
therefore you'll want to mmap.

Personally I think it would be great to have a system call to say,
"Page this in, because I'll want to be there soon."  The
multi-process solution is the next best thing.

>   BT> I was thinking that you'd use something like Sys::Mmap's mmap
>   BT> call directly so that there is a Perl variable that Perl thinks is a
>   BT> regular variable but which at a C level has its data at an mmapped
>   BT> location.  Fragile, I know (because Perl doesn't know that it cannot
>   BT> reallocate the variable), but as long as you are careful to not cause
>   BT> it to be reallocated or copied, there should be no limitations on
>   BT> what you can do.
> 
> but those are bad limitations. i wouldn't ever design in such a beast. i
> would use other methods of IPC than mmap in perl.

As I said, it is fragile.  But having seen the hoops that something
like File::Stream has to jump through, I can see that there might be
cases where I would use it as a strategy of last resort.  (This
problem is not one where I'd use it, though.)

>   BT> In Perl I'd expect it to be possible but fragile.  If Parrot could make
>   BT> it possible and not fragile, that would be great.
> 
> parrot will have mmap in bytecode which will be nice and fast and
> shared. i dunno about mmaped data as that is always an issue and with
> garbage collection it can be tricky.

I see, so you compile Perl to bytecode, and then load it through mmap.
Sure, it might be faster to compile Perl than to read the file. But if the
bytecode has been used recently...

Cheers,
Ben
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm