Re: [Boston.pm] transposing rows and columns in a CSV file

2004-11-13 Thread Ben Tilly
On Fri, 12 Nov 2004 23:04:46 -0500, Aaron Sherman [EMAIL PROTECTED] wrote:
> On Fri, 2004-11-12 at 13:22 -0800, Ben Tilly wrote:
[...]
> > Um, mmap does not (well should not - Windows may vary) use any
> > RAM
>
> You are confusing two issues. Using RAM is not the same as allocating
> process address space. Allocating process address space is, of course,
> required for mmap (same way you allocate address space when you load a
> shared library, which is also mmap-based under Unix and Unix-like
> systems). All systems have to limit address space at some point. Linux
> does this at 3GB up to 2.6.x where it becomes more configurable and can
> be as large as 3.5GB, I think.

How was I confusing issues?  What I meant is that calling mmap does
not use significant amounts of RAM.  (The OS needs some to track
that the mapping exists, but that should be it.)  Once you actually use
the data that you mmapped in, file contents will be swapped in, and
RAM will be taken, but not until then.
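
A minimal sketch of that point, written in Python rather than Perl only
because its standard mmap module makes it short. The file size and the
offsets are arbitrary assumptions, and map_and_peek is a hypothetical
helper name. The mapping covers the whole file, yet only the pages that
are actually indexed need to be brought into RAM.

```python
import mmap
import os
import tempfile

def map_and_peek(size=64 * 1024 * 1024, offsets=(0, 4096 * 1000)):
    """Map a file of `size` bytes and read one byte at each offset."""
    fd, path = tempfile.mkstemp()
    try:
        # Extend the file without writing data (sparse on most filesystems).
        os.ftruncate(fd, size)
        with mmap.mmap(fd, size, access=mmap.ACCESS_READ) as mm:
            # The mapping spans the whole file, but only the pages
            # indexed here have to be faulted in.
            return len(mm), [mm[off] for off in offsets]
    finally:
        os.close(fd)
        os.remove(path)

mapped_len, peeked = map_and_peek()
```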

As for a 3 GB limit, now that you mention it, I heard something
about that.  But I didn't pay attention since I don't need it right now.
I've also heard about Intel's large addressing extensions (keep 2GB
in normal address space, page around the top 2 GB, you get 64 GB
of addressable memory).  I'm curious about how (or if) the two can
cooperate.

> To be clear, though, if you had 10MB of RAM, you could still mmap a 3GB
> file, assuming you allowed for over-committed allocation in the kernel
> (assuming Linux... filthy habit, I know).

Exactly what I was referring to.

However the over-committed allocation comment confuses me.
Why would a single mmap result in over committing memory?

> > mmap should not cause any more or less disk accesses than
> > reading from the file in the same pattern should have.  It just lets
> > you do things like use Perl's RE engine directly on the file
> > contents.
>
> Actually, no it doesn't as far as I know (unless the copy-on-write code
> got MUCH better recently).

Where does a write happen?  I was thinking in terms of using the
RE engine (with pos) as a tokenizer.

I was thinking that you'd use something like Sys::Mmap's mmap
call directly so that there is a Perl variable that Perl thinks is a
regular variable but which at a C level has its data at an mmapped
location.  Fragile, I know (because Perl doesn't know that it cannot
reallocate the variable), but as long as you are careful to not cause
it to be reallocated or copied, there should be no limitations on
what you can do.
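
The tokenizer idea can be sketched as follows. This is Python, not the
Sys::Mmap approach described above: Python's re module accepts an mmap
object directly, so it avoids the fragility in question and only
illustrates matching at successive offsets. The TOKEN pattern, the
function names and the sample data are all assumptions, and quoted CSV
fields are deliberately not handled.

```python
import mmap
import re
import tempfile

# Assumed token shape: a run of non-separator bytes, then "," or a
# newline (or end of data). Quoted CSV fields are not handled.
TOKEN = re.compile(rb"([^,\n]*)(,|\n|\Z)")

def tokenize(buf):
    """Yield (field, ends_row) pairs by matching at successive offsets."""
    pos = 0
    while pos < len(buf):
        m = TOKEN.match(buf, pos)       # anchored at pos, like Perl's \G
        yield m.group(1), m.group(2) != b","
        pos = m.end()

def rows_from(buf):
    """Group tokens into rows."""
    row = []
    for field, ends_row in tokenize(buf):
        row.append(field)
        if ends_row:
            yield row
            row = []

with tempfile.TemporaryFile() as f:
    f.write(b"a,b,c\n1,2,3\n")
    f.flush()
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        rows = list(rows_from(mm))      # regex runs on the mapping itself
```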

> Like I said, you probably won't get the win out of mmap in Perl that you
> would expect. In Parrot you would, but that's another story.

In Perl I'd expect it to be possible but fragile.  If Parrot could make
it possible and not fragile, that would be great.

Cheers,
Ben
___
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm


Re: [Boston.pm] transposing rows and columns in a CSV file

2004-11-13 Thread Uri Guttman
>>>>> "BT" == Ben Tilly [EMAIL PROTECTED] writes:

  BT> How was I confusing issues?  What I meant is that calling mmap does
  BT> not use significant amounts of RAM.  (The OS needs some to track
  BT> that the mapping exists, but that should be it.)  Once you actually use
  BT> the data that you mmapped in, file contents will be swapped in, and
  BT> RAM will be taken, but not until then.

mmap uses virtual ram which means physical ram and swap space. so mmap can
suck up as much physical ram as you have if you allocate it.

  BT> As for a 3 GB limit, now that you mention it, I heard something
  BT> about that.  But I didn't pay attention since I don't need it right now.
  BT> I've also heard about Intel's large addressing extensions (keep 2GB
  BT> in normal address space, page around the top 2 GB, you get 64 GB
  BT> of addressable memory).  I'm curious about how (or if) the two can
  BT> cooperate.

eww, that sounds like manual management of extended memory. like the
days of overlays or even pdp-11 extended memory (which i used to access
22 bit addresses (yes, 22 bit PHYSICAL space) from a 16 bit cpu). not
something you want to deal with unless you have to. as for the original
problem, i keep saying that mmap will give little help if the input
matrix won't fit into physical ram. once you start swapping (manually or
virtually) all bets are off on speed of any transposition algorithm. you
have to start counting disk accesses and once you do, who cares how it
was done (mmap or whatever)?
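
a rough way to do that counting is a toy model like the one below
(python, with assumed numbers throughout). it treats the matrix as
fixed-size cells stored row by row, keeps a fixed number of pages
resident under an LRU policy, and counts how many page loads a full
traversal needs. real kernels use read-ahead and a different eviction
policy, so this is only a sketch of the reasoning, not a prediction.

```python
from collections import OrderedDict

def count_page_faults(rows, cols, cell_bytes, page_bytes,
                      resident_pages, by_column):
    """Count page loads for one full pass over a row-major matrix (LRU model)."""
    cache = OrderedDict()               # page number -> True, oldest first
    faults = 0
    outer, inner = (cols, rows) if by_column else (rows, cols)
    for i in range(outer):
        for j in range(inner):
            r, c = (j, i) if by_column else (i, j)
            page = (r * cols + c) * cell_bytes // page_bytes
            if page in cache:
                cache.move_to_end(page)         # mark as recently used
            else:
                faults += 1                     # page must be loaded
                cache[page] = True
                if len(cache) > resident_pages:
                    cache.popitem(last=False)   # evict least recently used
    return faults

# assumed sizes: 256 x 256 cells of 16 bytes, so each row fills one 4096-byte page
row_order = count_page_faults(256, 256, 16, 4096, 64, by_column=False)
col_order = count_page_faults(256, 256, 16, 4096, 64, by_column=True)
```

with these numbers the row-order pass loads each of the 256 pages once,
while the column-order pass (what a naive transpose does) misses on
every one of its 65536 accesses, because only 64 pages stay resident.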

  >> To be clear, though, if you had 10MB of RAM, you could still mmap a 3GB
  >> file, assuming you allowed for over-committed allocation in the kernel
  >> (assuming Linux... filthy habit, I know).

  BT> Exactly what I was referring to.

  BT> However the over-committed allocation comment confuses me.
  BT> Why would a single mmap result in over committing memory?


you can allocate all the virtual space allowed when you don't need
it. then mmap gains you little. mmap is best used as a window into a
large file. note that page faults when using mmap will block a process
until that page is swapped in. so some systems use small helper processes
to touch shared (mmapped) virtual ram; those processes block on the fault,
and by the time they wake up the main process can use the real ram.

  BT> I was thinking that you'd use something like Sys::Mmap's mmap
  BT> call directly so that there is a Perl variable that Perl thinks is a
  BT> regular variable but which at a C level has its data at an mmapped
  BT> location.  Fragile, I know (because Perl doesn't know that it cannot
  BT> reallocate the variable), but as long as you are careful to not cause
  BT> it to be reallocated or copied, there should be no limitations on
  BT> what you can do.

but those are bad limitations. i wouldn't ever design in such a beast. i
would use other methods of IPC than mmap in perl.

  BT> In Perl I'd expect it to be possible but fragile.  If Parrot could make
  BT> it possible and not fragile, that would be great.

parrot will have mmap in bytecode which will be nice and fast and
shared. i dunno about mmaped data as that is always an issue and with
garbage collection it can be tricky.

uri

-- 
Uri Guttman  --  [EMAIL PROTECTED]   http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs    http://jobs.perl.org


Re: [Boston.pm] emergency social meeting

2004-11-13 Thread William Ricker
Dear Brian,

 I'm staying in Milford, but I can come into town.

If staying in Milford, I think I might want to escape once in a while too. :-)
(I've still got some friends at that client but I'm not sure if any are in the
Stonehenge classes.)

What sort of ambience, food, or brew would you prefer?

Best fish between Gloucester and Plymouth is still The No-Name on Fish Pier in 
Boston.  They now have a liquor license.

We've got a great selection of brewpubs in the city and suburbs. I like the 
ones on city subway for serious drinking, so I don't have to DUI, but that 
doesn't help you. 

If there are nice pubs in Milford I haven't heard of them - did Merlyn give
you a Defensive Gastronomic Briefing? Milford's Turtle Tavern sounds vaguely
interesting but the WWW has little info.

For the deep burbs ... On I-495, a good brew can be had at hostlers serving 
Sherwood Forest's Archer Ale. There are quite a few listed at  
http://www.sherwoodbrewers.com/local.php#restaurants 

One with local history is in Sudbury on Rt 20, Longfellow's Wayside Inn,
http://www.wayside.org/ , http://www.originalinns.com/wayside_index.html . The
country's oldest operating inn, which inspired Longfellow's Tales, has Archer
as the house draft.
Food there was a little pricey when I was younger, but sounds reasonable now.

Complete Traditional Dinners
   starting at only $16.95
   Available for lunch or dinner 
   Monday through Friday
   For Reservations call (978) 443-1776

Similar historical ambience and prices (or higher) can be found downtown at 
Durgin Park and The Union Oyster House.

Cheers,

Bill

---
William Ricker [EMAIL PROTECTED]
