On Friday, 7 October 2022 at 06:34:50 UTC, Siarhei Siamashka wrote:
Also are we allowed to artificially construct needle and haystack to blow up this test rather than only benchmarking it on typical real data?

Such as generating the input data via running:

    python -c "print(('a' * 49 + 'b') * 20000)" > test.lst

And then using "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" (the character 'a' replicated 50 times) as the needle to search for. Much longer needles work even better. In Linux the command line size is limited by 128K, so there's a huge room for improvement.

Reply via email to