On Wed, Jul 15, 2009 at 03:27:05PM -0700, Brock Pytlik wrote:
> Brock Pytlik wrote:
>> [email protected] wrote:
>>> On Wed, Jul 15, 2009 at 03:08:41PM -0700, Brock Pytlik wrote:
>>>
>>>> Webrev:
>>>> http://cr.opensolaris.org/~bpytlik/ips-10050-v1/
>>>>
>>>> Bug:
>>>> reading manifests during search should pick an appropriate buffer size
>>>> http://defect.opensolaris.org/bz/show_bug.cgi?id=10050
>>>>
>>>
>>> Stupid question: What was the rationale for the choice of 512 bytes?
>>>
>>> Here's the relevant pydoc for open():
>>>
>>> open = class file(object)
>>>  |  file(name[, mode[, buffering]]) -> file object
>>>  |
>>>  |  Open a file.  The mode can be 'r', 'w' or 'a' for reading (default),
>>>  |  writing or appending.  The file will be created if it doesn't exist
>>>  |  when opened for writing or appending; it will be truncated when
>>>  |  opened for writing.  Add a 'b' to the mode for binary files.
>>>  |  Add a '+' to the mode to allow simultaneous reading and writing.
>>>  |  If the buffering argument is given, 0 means unbuffered, 1 means line
>>>  |  buffered, and larger numbers specify the buffer size.
>>>
>>> I may have misunderstood, but if you're just reading a line then you
>>> probably want to set buffering to line buffering (1). Also, you don't seem
>>> to be supplying any mode arguments to open(), "rb" is usually customary
>>> for reading files. Do you perform any other operations other than
>>> reading a line out of the file here?
>>>
>>> -j
>>>
>> I missed that line buffering was an option to file. Let me try that
>> and I'll get back to you with the results in a moment. I'll add the
>> "rb" as well.
> So, I'm not sure what "line buffering" is supposed to do, but here's what
> it does in practice. It sets the buffering size to 1016 in (nearly?) all
> cases, which means that the I/O becomes about 5-6M/s instead of 3. I
> appreciate you catching that option, but it seems like, despite what the
> docs might suggest, line buffering isn't what we want. To some extent,
> that makes sense, right? In order to know how big to make the buffer for
> each line, I'd have to know how long each line is, which suggests I've
> already looked at the line. But if I've already looked at the line, why
> not just return the line in the first place? ;) So I'm not sure what
> problem line buffering is supposed to solve, but it appears that it's
> not ours :)

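For anyone following along, here is a minimal sketch of the kind of read being discussed: open the file in "rb" mode with an explicit buffer size and pull out one line. It is not the code from the webrev. The helper name read_line_at, the offset parameter, and the demo file contents are all hypothetical, and it assumes the search code seeks to a known byte offset before reading. It is written for current Python, where buffering=1 (line buffering) is only usable in text mode, so it only exercises the "larger numbers specify the buffer size" case from the pydoc above.

```python
import os
import tempfile


def read_line_at(path, offset=0, bufsize=512):
    """Return the line that starts at byte `offset` in `path`.

    Hypothetical helper for illustration only. `bufsize` is handed to
    open() as the buffering argument, i.e. the requested buffer size
    in bytes.
    """
    # "rb" reads raw bytes, so `offset` is a plain byte offset.
    with open(path, "rb", buffering=bufsize) as f:
        f.seek(offset)
        return f.readline()


# Demo on a throwaway file; the contents are made up.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"first line\nsecond line\n")

print(read_line_at(demo_path, offset=11))  # b'second line\n'
os.remove(demo_path)
```

Note that the buffer size only changes how much is fetched from the file per underlying read. readline() still returns the whole line even when the line is longer than the buffer, so a small buffer affects throughput, not correctness.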
Ok, thanks for looking at that option. You'll include the file mode with
the final fix, correct?

I'm still curious how you chose 512, though. I've come across ancient
code in Solaris that picked constants that seemed like a good idea in
1984, but were horribly outdated after 2000. Are we ever likely to need
to read more than 512 bytes at a time in this situation?

-j
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss
