On Thursday 09 September 2010 01:28:04, Daniel Fischer wrote:
> Maybe the following observation helps:
>
> ghc-6.13.20100831 reads lazy ByteStrings in chunks of 8192 bytes.
>
> If I understand correctly, that means (since defaultChunkSize = 32760)
> - bytestring allocates a 32K buffer to be filled and asks ghc for 32760
> bytes in that buffer
> - ghc asks the OS for 8192 bytes (and usually gets them)
> - upon receiving fewer bytes than requested, bytestring copies them to a
> new smaller buffer
> - since the number of bytes received is a multiple of ghc's allocation
> block size (which I believe is 4K), there's no space for the bookkeeping
> overhead, hence the new buffer takes up 12K instead of 8, resulting in
> 44K allocation for 8K bytes
>
> That factor of 5.5 corresponds pretty well with the allocation figures
> above,
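
As a sanity check on the quoted estimate, here is a rough sketch of the arithmetic. It is only a model: the 4K block size and the rule "a copy that is an exact multiple of the block size needs one extra block" are taken from the quoted message and have not been verified against the RTS, and the names blockSize, copyFootprint and allocPerRead are made up for illustration.

```haskell
-- Crude model of the allocation per read described in the quoted message.
-- Assumptions (from the thread, not verified against the RTS):
--   * allocation happens in 4K blocks;
--   * a copy whose size is an exact multiple of the block size has no room
--     for bookkeeping and therefore occupies one extra block.

blockSize :: Int
blockSize = 4096

-- Space taken by the smaller copy: the payload rounded up to whole blocks,
-- plus one extra block when the payload fills its blocks exactly.
copyFootprint :: Int -> Int
copyFootprint n
  | n `mod` blockSize == 0 = n + blockSize
  | otherwise              = ((n + blockSize - 1) `div` blockSize) * blockSize

-- Total allocation for one read: the initial chunk buffer plus the copy.
allocPerRead :: Int -> Int -> Int
allocPerRead chunkBuf received = chunkBuf + copyFootprint received

main :: IO ()
main = do
  let total = allocPerRead (32 * 1024) 8192
  putStrLn $ "bytes allocated per 8192-byte read: " ++ show total
  putStrLn $ "factor: " ++ show (fromIntegral total / 8192 :: Double)
```

Under these assumptions the model gives 45056 bytes (44K) per 8192-byte read, i.e. the factor of 5.5 from the quoted message.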
That seems to be correct, but it is probably not the whole story. I've played
with defaultChunkSize. With it set to (64K - overhead), ghc still reads in
8192-byte chunks, and the allocation figures are nearly double those for
(32K - overhead). With it set to (8K - overhead), ghc reads in 8184-byte
chunks and the allocation figures go down to approximately 1.4 times those of
6.12.3.

Can a factor of 1.4 be explained by the smaller chunk size, or is something
else going on?

> and the extra copying explains the approximate doubling of I/O time.

Apparently not. With the small chunk size, which should avoid the copying,
the I/O didn't get faster.

>
> Trying to find out why ghc asks the OS for only 8192 bytes instead of
> 32760 hasn't brought enlightenment yet.

No progress on that front.
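
For anyone who wants to check the chunk sizes on their own setup, one way is to look at the chunks of the lazy ByteString after reading. This is only a sketch and not necessarily how the numbers above were obtained; chunkHistogram is a made-up helper name, and the file path is taken from the command line.

```haskell
import qualified Data.ByteString as S
import qualified Data.ByteString.Lazy as L
import qualified Data.Map.Strict as M
import System.Environment (getArgs)

-- Count how many chunks of each size a lazy ByteString consists of.
chunkHistogram :: L.ByteString -> M.Map Int Int
chunkHistogram =
  M.fromListWith (+) . map (\c -> (S.length c, 1)) . L.toChunks

main :: IO ()
main = do
  [path] <- getArgs                 -- expects exactly one argument: the file
  contents <- L.readFile path
  -- Print (chunk size, number of chunks) pairs, smallest size first.
  mapM_ print (M.toList (chunkHistogram contents))
```

This shows the sizes of the chunks as they end up in the lazy ByteString, which need not be the same thing as the size of each read request made to the OS; for the latter, a system call tracer is the more direct tool.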