On Wednesday, 17 February 2021 at 04:10:24 UTC, tsbockman wrote:
On files small enough to fit in RAM, it is similar in speed to the other solutions posted, but less memory hungry. Memory consumption in this case is around (sourceFile.length + 32 * lineCount * 3 / 2) bytes. Run time is similar to other posted solutions: about 3 seconds per GiB on my desktop.

Oops, I think the memory consumption should be (sourceFile.length + 32 * (lineCount + largestBucket.lineCount / 2)) bytes. (In the limit where everything ends up in one bucket, it's the same, but that shouldn't normally happen unless the entire file has only one unique line in it.)
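
To see how the two estimates differ, here is a rough sketch in Python, used purely for illustration. The function and parameter names are hypothetical, and the 32 bytes per line is simply the figure quoted above, not something measured here.

```python
# Hypothetical helpers: names are illustrative and not taken from the posted program.
BYTES_PER_LINE_ENTRY = 32  # per-line overhead quoted in the post (assumed as given)


def original_estimate(file_bytes: int, line_count: int) -> int:
    """First formula: sourceFile.length + 32 * lineCount * 3 / 2."""
    return file_bytes + BYTES_PER_LINE_ENTRY * line_count * 3 // 2


def corrected_estimate(file_bytes: int, line_count: int,
                       largest_bucket_lines: int) -> int:
    """Corrected formula:
    sourceFile.length + 32 * (lineCount + largestBucket.lineCount / 2).
    Expanded to stay in integer arithmetic: 32 * n + 16 * b.
    """
    return (file_bytes
            + BYTES_PER_LINE_ENTRY * line_count
            + (BYTES_PER_LINE_ENTRY // 2) * largest_bucket_lines)


if __name__ == "__main__":
    # Made-up inputs: a 1 GiB file, 10 million lines, largest bucket of 100,000 lines.
    size, lines, bucket = 2**30, 10_000_000, 100_000
    print("original :", original_estimate(size, lines))
    print("corrected:", corrected_estimate(size, lines, bucket))
    # Degenerate case: everything lands in one bucket, so both formulas agree.
    print("one bucket:", corrected_estimate(size, lines, lines))
```

The corrected figure is never larger than the original one, and the two coincide only when the largest bucket holds every line.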
