On 7/26/07, Steven Parkes <[EMAIL PROTECTED]> wrote:
> First I create a single large file that has one doc per line from
> Wikipedia content, using this alg
>
> Anybody disagree that the 1-line-per-doc format is better (at least for
> Wikipedia)? If so, I'll get rid of the intermediate one-file-per-doc
> step.

+1. Opening a lot of files is overhead, and it's something we aren't trying to test.

-Yonik
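
For illustration only, here is a minimal Python sketch of the one-doc-per-line idea, not the benchmark's actual code. The function names and the whitespace handling are assumptions made for the example. The point is that the reader opens a single file once, instead of one file per document.

```python
import os
import tempfile


def write_line_docs(docs, path):
    """Write each document on its own line.

    Assumption for this sketch: runs of whitespace inside a document
    (including newlines) are collapsed to single spaces so that one
    document always occupies exactly one line.
    """
    with open(path, "w", encoding="utf-8") as out:
        for doc in docs:
            out.write(" ".join(doc.split()) + "\n")


def read_line_docs(path):
    """Yield documents from a one-doc-per-line file using a single open()."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield line.rstrip("\n")


def demo():
    """Round-trip a few hypothetical documents through a temporary file."""
    path = os.path.join(tempfile.mkdtemp(), "docs.txt")
    write_line_docs(["first doc\nwith a newline", "second doc"], path)
    return list(read_line_docs(path))
```

With one file per document, the same loop would pay for an open and close on every document, which is the overhead the benchmark is not meant to measure.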