Narfi Stefansson wrote:
On 3/20/06, Stepan Kasal <[EMAIL PROTECTED]> wrote:
On Sun, Mar 19, 2006 at 11:28:48PM -0500, Narfi Stefansson wrote:
I was expecting it to take order n time to match against n patterns, where n
is the number of lines in the pattern file.
What I am experiencing: The execution time of grep -f behaves like n^a, with
a somewhere in the range [2.5, 3.0].
yes, that is the expected behaviour: the goal of grep is to scan files which
are much bigger than the total size of all the given patterns.
So it is OK that the time required to build the ``compiled structure'' from
the pattern list is huge. [...]
This explaines the implementation, but it doesn't answer the question
of whether this is desirable or not. Look, the net effect is that
grep -f is nearly unusable with 1000 lines in the pattern file,
Obviously this is not desirable. I think that O(N^2) behaviour is to be
frowned upon, and anything worse is embarassing, even in a part of the program
like this that is not really intended to take large numbers of patterns. There
is no reason (I'm fairly confident) why it should need to be this bad, and so
we ought to make it better.
However, it is not likely to be fixed soon unless you can fix it or get
somebody else to do so.
Please file a bug report about this.
- Julian