James G. Sack (jim) wrote:

James G. Sack (jim) wrote:
Ralph Shumaker wrote:
3. If a quick & dirty solution might be acceptable, just run a chain of 4 searches.
..
Wait a minute. It occurs to me that you may know something I don't. Are you saying that there is a way to do a four-part compound search in vim?
Nah, sorry. I was just alluding to the same thing suggested by jhriv and
kc, namely something like

grep ada sample.txt | grep bub | grep crc | grep dod

I had a post-thought:

The chain approach (multiple passes, as in the grep line above) has a
potential problem if your search strings contain common substrings that
match when abutted.

For instance, if I change your object strings to
 ada bub crc dad

then the grep approach will give a false positive on
adad bub crc
because "ada" and "dad" both match inside the overlapping run "adad".

Of course, you may have intended to call that a hit (if so, your problem
specification could have been improved), in which case the
list-of-permutations search approach will fail to match.
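
A minimal sketch of that difference, assuming a throwaway two-line sample file (the file and the variable names are made up for illustration). Only one ordering from the permutation list is shown; the other orderings behave the same way on the overlapping line.

```shell
# Hypothetical sample: line 1 has "ada" and "dad" only as the overlapping
# run "adad"; line 2 has all four strings as separate words.
sample=$(mktemp)
printf '%s\n' 'adad bub crc' 'ada bub crc dad' > "$sample"

# Chain of four greps: order-independent, and overlapping matches count.
chain_hits=$(grep ada "$sample" | grep bub | grep crc | grep -c dad)

# One ordering from the permutation list: each string must match in its
# own, non-overlapping stretch of the line.
perm_hits=$(grep -c 'ada.*bub.*crc.*dad' "$sample")

echo "chain: $chain_hits  permutation: $perm_hits"
rm -f "$sample"
```

The chain reports both lines, while the single-pattern search reports only the line where the strings do not overlap.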

Well, most of the actual object strings are longer than 3 letters, and no 2 of the 4 are likely to be found as part of the same word. If I remember correctly, the ones I was actually searching for were:
(1) the first 3 letters of a name (capitalized), because I couldn't remember which of two similar names I wanted,
(2) a common 4-letter word, unlikely to be part of another word in this document,
(3) a 5-letter place name (capitalized), unlikely to be part of another word in this document, and
(4) a very common 5-letter word.

As it turned out, (1) was found in 349 lines, (2) in 132, (3) in 179, and (4) in 550.
(1) and (2) appeared together in only 1 line,
(1) and (3) in 3,
(1) and (4) in 9,
(2) and (3) in 1,
(2) and (4) in 9, and
(3) and (4) in 4.

I could have found the line I was searching for merely by combining a search for (2) with either (1) or (3), but I did not know this before I tried. If the 4-element compound search had yielded nothing, I would have tried some combinations of fewer elements. (4) was fairly popular with each of (1) and (3), but (1), (3), and (4) are all found together in only one line, and that line also contains (2).
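
Counts like the ones above could be gathered with a small helper along these lines. This is only a sketch: count_all is a made-up name, and the four words and the sample file are stand-ins, not the actual search strings or document.

```shell
# Hypothetical stand-in document; cleaned up when the shell exits.
doc=$(mktemp)
trap 'rm -f "$doc"' EXIT
printf '%s\n' \
    'Ral took the road home to Paris about noon' \
    'Ral wrote about the trip' \
    'The road to Paris' \
    'home at last' > "$doc"

# Print the number of lines in FILE that contain every given string.
# Usage: count_all FILE STRING...
count_all() {
    file=$1; shift
    hits=$(cat "$file")
    for term in "$@"; do
        # Narrow the surviving lines one string at a time (the grep chain).
        hits=$(printf '%s\n' "$hits" | grep -- "$term" || true)
    done
    # Count non-empty lines, so an empty result is reported as 0.
    printf '%s\n' "$hits" | grep -c . || true
}

count_all "$doc" Ral                    # lines containing one string
count_all "$doc" Ral about              # lines containing a pair
count_all "$doc" Ral home Paris about   # lines containing all four
```

Starting with the full set of strings and dropping one at a time is the same fallback described above: if the compound search finds nothing, try combinations of fewer elements.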

Fun!

Indeed!

Regards,
..jim

Thanks jim.

(Now, I just have to find out how to use regex in grep.)
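
As a starting point, grep patterns are already regular expressions (basic syntax by default), and -E selects the extended syntax. A small sketch, assuming a made-up notes file and a made-up pair of similar names:

```shell
# Hypothetical sample file with two similar names.
notes=$(mktemp)
printf '%s\n' 'Marion drove to Selma' 'Marian stayed home' \
    'marionette show' > "$notes"

# Basic syntax: ^ anchors to the start of the line, and [oa] matches
# either letter, so this covers both spellings of the name.
either_name=$(grep -c '^Mari[oa]n ' "$notes")

# Extended syntax (-E) adds alternation with |.
alt_name=$(grep -E -c 'Marion|Marian' "$notes")

echo "bracket: $either_name  alternation: $alt_name"
rm -f "$notes"
```

Both searches are case-sensitive, so the lowercase "marionette" line is not counted by either one.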



--
[email protected]
http://www.kernel-panic.org/cgi-bin/mailman/listinfo/kplug-list
