On 2008.04.28 09:27:27 -0700, David Roundy <[EMAIL PROTECTED]> scribbled 1.7K 
characters:
> On Mon, Apr 28, 2008 at 8:39 AM, Gwern Branwen <[EMAIL PROTECTED]> wrote:
> >  > I'm not sure what you mean by co_slurpy being strict.  It looks to me
> >  > like it's got adequate unsafeInterleaveIO to make it lazy.
> >  > --
> >  > David Roundy
> >
> >  Well, it does have plenty of unsafeInterleaveIO, that is true, but the 
> > issue here is readFilePS: readFilePS is completely strict, it reads the 
> > entire file into memory (per docs and implementation). So, actually running 
> > readFilePS may get delayed to the last second, but once readFilePS gets 
> > inspected, it'll immediately do its best to suck in all 9 gigs or whatever.
> >
> >  This is why replacing readFilePS in co_slurp_helper with mmapFilePS is 
> > such a time saver - it is lazy and pretends to read in all 9 gigs 
> > immediately, but since with -s, we ultimately only read the first 4096 
> > characters, only a little bit will ever actually get page-faulted into 
> > memory.
> >
> >  (The problem with mmapFilePS is that as lispy mentions, on my 64-bit 
> > system, mmapFilePS can no longer handle >3 gig files while readFilePS 
> > scaled up to at least 9gigs, albeit slowly.)
>
> The other problem is that mmapFilePS will cause darcs to fail entirely
> on large repositories (with more than 1k files) due to sucking up all
> the system's file handles.  I think this is a more common use case in
> darcs than 9g files.  Of course, we could refuse to mmap small files
> (we already do this for very small files), and that could alleviate
> the problem considerably.
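
David's suggestion above, of only mmapping files over some size, can be sketched roughly as follows. This is an illustration in Python, not darcs code: the name read_contents, the MMAP_THRESHOLD value, and the use of Python's standard mmap module in place of mmapFilePS are all assumptions made for the example.

```python
import mmap
import os

MMAP_THRESHOLD = 64 * 1024  # hypothetical cutoff; a real value would need measuring


def read_contents(path, threshold=MMAP_THRESHOLD):
    """Return (contents, used_mmap) for the file at path.

    Files smaller than `threshold` (and empty files, which cannot be
    mapped) are read eagerly, so nothing stays open after the call.
    Larger files are mapped read-only; the caller must close the
    returned mmap object to release it.
    """
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        if size == 0 or size < threshold:
            return f.read(), False
        # Python's mmap keeps its own duplicate of the descriptor, so the
        # mapping stays valid after `f` is closed by the with-block.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ), True
```

With a policy like this, a repository full of small files would hold few long-lived mappings, and only the occasional large file would pay the mmap cost.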

(Just a side note; with Lispy's type sig changes, I can now handle >3 gig files 
just fine, albeit more slowly than with readFilePS.)
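
To make the strict-versus-mmap distinction above concrete, here is a minimal sketch. It is written in Python rather than Haskell, with a plain read() standing in for readFilePS and Python's standard mmap module standing in for mmapFilePS. The function names and the PREFIX_LEN constant are illustrative assumptions, not darcs code.

```python
import mmap
import os
import tempfile

PREFIX_LEN = 4096  # assumed prefix size, from the "first 4096 characters" above


def head_strict(path, n=PREFIX_LEN):
    """Read the whole file into memory, then keep the first n bytes."""
    with open(path, "rb") as f:
        data = f.read()  # allocates a buffer as large as the file
    return data[:n]


def head_mmap(path, n=PREFIX_LEN):
    """Map the file and copy out only the first n bytes."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return b""  # mmap rejects empty files
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[:n]  # only the pages actually touched get faulted in


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * (1 << 20))  # 1 MiB stand-in for a large file
    try:
        assert head_strict(tmp.name) == head_mmap(tmp.name)
    finally:
        os.unlink(tmp.name)
```

Both functions return the same bytes. The difference is that head_strict's memory use grows with the file size, while head_mmap only needs the pages covering the prefix.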

Hm. I'm not sure about the file handles, though. Perhaps you mean it'll fail
on 32-bit systems? It works for me on a repository with about 1200 files:

[EMAIL PROTECTED]:2849~/foo>echo "make sure we're using lispy's mmap version" && duh bigtempfile [ 6:04PM]
make sure we're using lispy's mmap version
3.9G bigtempfile
3.9G total
[EMAIL PROTECTED]:2850~/foo>cd ~/bin/ghc && darcs query manifest | wc [ 6:05PM]
aclocal.m4    compat/       configure.ac  distrib/           ghc.spec.in  install-sh      LICENSE   quickcheck/  validate
ANNOUNCE      compiler/     _darcs/       docs/              gmp/         InstallShield/  Makefile  README       WindowsInstaller/
bindisttest/  config.guess  darcs-all     driver/            HACKING      libffi/         mk/       rts/
boot          config.sub    darcs.prof    extra-gcc-opts.in  includes/    libraries/      push-all  utils/
   1191    1234   33726
[EMAIL PROTECTED]:2851~/bin/ghc>echo "ok, so there's 1200 files here. Let's see whether whatsnew -s fails due to filehandles" && darcs whatsnew -s
ok, so there's 1200 files here. Let's see whether whatsnew -s fails due to filehandles
No changes!
[EMAIL PROTECTED]:2847~/bin/ghc>echo "maybe the problem was masked by the lack of changes?" && rm HACKING ANNOUNCE LICENSE README [ 6:07PM]
maybe the problem was masked by the lack of changes?
[EMAIL PROTECTED]:2848~/bin/ghc>whatsnew -s [ 6:07PM]
R ./ANNOUNCE
R ./HACKING
R ./LICENSE
R ./README

> Another problem is that using mmap on files in the working directory
> can lead to segfaults, since the user is allowed to edit files in the
> working directory while darcs runs--or at least I don't want to
> segfault if the user does this.
>
> David

Hm, that does sound bad. Is there no way to handle this (set read-only, catch 
exceptions, etc)? I'll admit I've never tried to edit files while using Darcs, 
but that's just me.

--
gwern
Kilo remailers BOSS Medco mass CIDA Fetish bullion USCODE spies

_______________________________________________
darcs-users mailing list
[email protected]
http://lists.osuosl.org/mailman/listinfo/darcs-users