James G. Sack (jim) wrote:

Carl Lowenstein wrote:

If you get really serious about this sort of stuff, you ought to look at

_The Art of Computer Programming_ Vol 3 Sorting and Searching, Donald
E Knuth, 1997.
Warning -- Sorting occupies 381 pages and Searching 192 pages.

Side comment:  I went to college at the same time and place as Don
Knuth, knew some of the people and places that he discusses in various
reminiscences, but never actually met him.

These questions are neat because they always make one go think a little
deeper than the boilerplate recipe level.

And along those lines, RS ..
.. you do realize, don't you, that your test pattern doesn't find your
desired matches?

That is, it gives false positives, on (say)

ada ada ada ada

Indeed. The search document has 8 lines that contain 4 (or more) occurrences of just one of the words, and 17 lines that contain only 3 occurrences of that word. Only 6 lines contain 4 (or more) occurrences of it as a whole word (not as part of another word).

But the piped greps ensure that I'm not distracted by such false positives.
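For anyone following along, here is a minimal sketch of the difference. The word list (ada, bob, cat, dog), the sample lines, and the single-pattern regex are all hypothetical stand-ins, since the actual test pattern is not quoted in this message. It also assumes a grep that supports the `\<` and `\>` word-boundary extensions (GNU grep does).

```shell
# Hypothetical sample data: only the second line has all four words.
sample=$(printf '%s\n' 'ada ada ada ada' 'dog cat bob ada' 'ada bob cat')

# A single pattern of the form "any of the words, four times" also
# matches a line that repeats one word four times (a false positive).
alt='(ada|bob|cat|dog)'
single=$(printf '%s\n' "$sample" | grep -E -c "$alt.*$alt.*$alt.*$alt")

# Chained greps require every word to be present, in any order.
chained=$(printf '%s\n' "$sample" |
  grep '\<ada\>' | grep '\<bob\>' | grep '\<cat\>' | grep -c '\<dog\>')

echo "single pattern: $single, chained greps: $chained"
```

With this sample the single pattern counts 2 lines while the chain counts 1, because the chain drops the "ada ada ada ada" line at the second grep.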

That doesn't seem to be what you wanted, eh? It's because of this that I
started yakking about permutations, and then also about chain-greps
giving false positives.

No. But the chain-greps work quite well to zero in on what I want. If 4 greps in the chain had not been enough, I could have added a 5th, or an nth. But I tend to avoid looking for "\<a\>" (6k), "\<the\>" (23k), and "\<and\>" (21k).
"\<a\>" and "\<the\>" (4k)
"\<a\>" and "\<and\>" (4k)
"\<the\>" and "\<and\>" (16k)
"\<a\>" and "\<the\>" and "\<and\>" (3k)
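Counts like these can presumably be produced with `grep -c` at the end of a chain. The following is only a sketch under that assumption: the sample text is made up, and the exact commands used for the figures above are not shown in this thread.

```shell
# Hypothetical stand-in for the search document.
doc=$(printf '%s\n' 'a cat and the dog' 'the end' \
  'a band and a sandwich' 'another theme')

# Lines containing "the" as a whole word; "another" and "theme"
# do not count because of the \< and \> word boundaries.
the_count=$(printf '%s\n' "$doc" | grep -c '\<the\>')

# Lines containing both "a" and "and" as whole words, in any order.
a_and_count=$(printf '%s\n' "$doc" | grep '\<a\>' | grep -c '\<and\>')

echo "the: $the_count, a+and: $a_and_count"
```

Only the last grep in the chain gets `-c`, since the earlier ones have to pass the matching lines along rather than a count.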

Regards,
..jim

Thanks jim.



--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
