On Fri, 11 Mar 2005 06:59:33 +0100, Heiko Wundram <[EMAIL PROTECTED]> wrote:
> On Tuesday 08 March 2005 15:55, Simon Brunning wrote:
> > Ah, but that's the clever bit; it *doesn't* store the whole list -
> > only the selected lines.
>
> But that means that it'll only read several lines from the file, never do a
> shuffle of the whole file content...

Err, thing is, it *does* pick a random selection from the whole file,
without holding the whole file in memory. (It does hold all the
selected items in memory - I don't see any way to avoid that.) Why not
try it and see?

> When you'd want to shuffle the file
> content, you'd have to set lines=1 and throw away repeating lines in
> subsequent runs, or you'd have to set lines higher, and deal with the
> resulting lines too in some way (throw away repeating ones... :-).

Eliminating duplicates is left as an exercise for the reader. ;-)

> Doesn't
> matter how, you'd have to store which lines you've already read
> (selected_lines). And in any case you'd need a line cache of 10^9 entries for
> this amount of data...

Nope, you don't.

--
Cheers,
Simon B,
[EMAIL PROTECTED],
http://www.brunningonline.net/simon/blog/
--
http://mail.python.org/mailman/listinfo/python-list
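
For illustration only: one well-known way to pick a random selection from a
whole file in a single pass, keeping only the selected lines in memory, is
reservoir sampling. The sketch below is an assumption about the general
technique, not necessarily the code being discussed in this thread; the names
random_lines and k are hypothetical.

```python
import io
import random


def random_lines(lines, k, rng=random):
    """Return up to k lines chosen uniformly at random from an iterable.

    Hypothetical helper: only the k selected lines are held in memory,
    never the whole input.
    """
    selected = []
    for index, line in enumerate(lines):
        if index < k:
            # Fill the reservoir with the first k lines.
            selected.append(line)
        else:
            # Pick a slot in 0..index (inclusive); the new line replaces
            # an existing one with probability k / (index + 1).
            slot = rng.randint(0, index)
            if slot < k:
                selected[slot] = line
    return selected


if __name__ == "__main__":
    # Stand-in for a real file object; any iterable of lines works.
    sample_file = io.StringIO("".join(f"line {n}\n" for n in range(1000)))
    print(random_lines(sample_file, 5))
```

Each input line is placed in the selection at most once, so a single run
never returns the same line position twice. If the input has fewer than k
lines, all of them are returned.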