Re: find'ing files containing certain words (all of them) ...

2013-09-23 Thread Chris Bannister
On Sat, Sep 21, 2013 at 12:00:54PM -0400, Albretch Mueller wrote:
> I have come to believe this is one of those problems that is not to
> be optimally solved with a script, but a programming language

What's the difference? OK, you give a script to the cast and a program
to the audience, but other than that it gets difficult to discern them.

-- 
If you're not careful, the newspapers will have you hating the people
who are being oppressed, and loving the people who are doing the 
oppressing. --- Malcolm X


-- 
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org 
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130923073229.GH23041@tal



Re: find'ing files containing certain words (all of them) ...

2013-09-23 Thread Marko Randjelovic
On Sat, 21 Sep 2013 12:00:54 -0400
Albretch Mueller lbrt...@gmail.com wrote:

> I have come to believe this is one of those problems that is not to
> be optimally solved with a script, but a programming language
>
> lbrtchx
 
 

Probably AWK could be a good compromise :)

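words.awk:
BEGIN {
    # p holds the words to look for, separated by "|"
    split(p, ws, "|");
    n = 1;
    while (n in ws)
        ff[n++] = 0;    # ff[i] becomes 1 once word i has been seen
    # n is now one more than the number of words
}

{
    # mark every word that matches somewhere on the current line
    for (i = 1; i < n; i++)
        if ((ff[i] == 0) && (match($0, ws[i])))
            ff[i] = 1;
}

END {
    # exit status 0 if every word was found, 1 otherwise
    f = 0;
    i = 1;
    while ((f == 0) && (i < n)) {
        if (ff[i++] == 0)
            f = 1;
    }
    exit(f);
}

command line:
$ awk -v p='w1|...|wn' -f words.awk file.txt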
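
Because words.awk reports its result only through the exit status, something still has to call it once per file and print the names that pass. A minimal sketch of such a wrapper follows. The demo directory, the file names both.txt and one.txt, and the condensed copy of words.awk written by the here-document are assumptions made for illustration, not part of the original post.

```shell
#!/bin/sh
# Sketch: list the files for which words.awk exits with status 0.
# Everything below runs in a throwaway directory (an assumption for the demo).
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

# Condensed stand-in for words.awk: exit 0 only if every word in p was seen.
cat > words.awk <<'EOF'
BEGIN { n = split(p, ws, "|"); for (i = 1; i <= n; i++) ff[i] = 0 }
{ for (i = 1; i <= n; i++) if (ff[i] == 0 && match($0, ws[i])) ff[i] = 1 }
END {
    for (i = 1; i <= n; i++)
        if (ff[i] == 0)
            exit 1
    exit 0
}
EOF

# Two hypothetical test files: one holds both words (on different lines),
# the other holds only one of them.
printf 'two\nwords\n' > both.txt
printf 'two\n' > one.txt

# Call awk once per file and keep the names whose exit status is 0.
matches=$(for f in *.txt; do
              if awk -v p='two|words' -f words.awk "$f"; then
                  printf '%s\n' "$f"
              fi
          done)
printf '%s\n' "$matches"
```

The words are treated as regular expressions by match(), so characters such as "." keep their special meaning unless escaped.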

Regards :)

-- 
http://mr.flossdaily.org


-- 
Archive: http://lists.debian.org/20130923134723.7b828...@eunet.rs



Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread Rob Owens
On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
> the short bash script below you can use to find text files
> containing one word, but my attempts at trying to make it find more
> than one word within the same file haven't been successful
 
I think you are looking for the 'grep' command.

grep word path/* will find all files in path which contain word

grep word path/* | grep word2 will do the same, but then narrow down the
search to files that also contain word2

man grep for options.  But you might be interested in '-r' to
recursively search a path.

-Rob




Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread ken

On 09/21/2013 07:56 AM Rob Owens wrote:

> On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
>> the short bash script below you can use to find text files
>> containing one word, but my attempts at trying to make it find more
>> than one word within the same file haven't been successful
>
> I think you are looking for the 'grep' command.
>
> grep word path/* will find all files in path which contain word
>
> grep word path/* | grep word2 will do the same, but then narrow down the
> search to files that also contain word2
>
> man grep for options.  But you might be interested in '-r' to
> recursively search a path.
>
> -Rob


Rob,

This is incorrect, though it's a common misconception.  It will only
work if both of the words sought are on the same line within the
sought files.  This isn't what the OP is asking for.


Instead:

$ echo two words > grep-AND-test1
$ echo two > grep-AND-test2
$ echo words >> grep-AND-test2
$ grep -l words $(grep -l two *)
grep-AND-test1
grep-AND-test2
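
The command substitution above splits file names on whitespace, and when the inner grep finds nothing the outer grep is left without file arguments and reads standard input instead. A sketch that hands NUL-terminated names to xargs sidesteps both problems, assuming GNU grep (-Z) and GNU xargs (-0, -r). The file name 'grep AND test3' is made up for the demonstration.

```shell
#!/bin/sh
# Sketch: same two-step AND search, but safe for odd file names.
# Assumes GNU grep (-Z) and GNU xargs (-0, -r).
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

echo 'two words' > grep-AND-test1
echo 'two' > grep-AND-test2
echo 'words' >> grep-AND-test2
echo 'two' > 'grep AND test3'    # hypothetical file: spaces in name, one word only

# First grep lists files containing "two", NUL-terminated;
# xargs passes them to the second grep, which keeps those containing "words".
# -r stops xargs from running grep at all when the first list is empty.
found=$(grep -lZ -e two -- * | xargs -0 -r grep -l -e words --)
printf '%s\n' "$found"
```

Further words can be added by inserting more `xargs -0 -r grep -lZ -e WORD --` stages before the final one.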



--
Archive: http://lists.debian.org/523da33a.3040...@mousecar.com



Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread Albretch Mueller
 I have come to believe this is one of those problems that is not to
be optimally solved with a script, but a programming language

 lbrtchx


-- 
Archive: 
http://lists.debian.org/CAFakBwhDcRc_iEAqx=2oybga17_tm_3cj_q4n9g6r_rxlti...@mail.gmail.com



Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread Alexander Kapshuk

On 09/21/2013 04:46 PM, ken wrote:

> On 09/21/2013 07:56 AM Rob Owens wrote:
>> On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
>>> the short bash script below you can use to find text files
>>> containing one word, but my attempts at trying to make it find more
>>> than one word within the same file haven't been successful
>>
>> I think you are looking for the 'grep' command.
>>
>> grep word path/* will find all files in path which contain word
>>
>> grep word path/* | grep word2 will do the same, but then narrow down the
>> search to files that also contain word2
>>
>> man grep for options.  But you might be interested in '-r' to
>> recursively search a path.
>>
>> -Rob
>
> Rob,
>
> This is incorrect, though it's a common misconception.  It will only
> work if both of the words sought are on the same line within the
> sought files.  This isn't what the OP is asking for.
>
> Instead:
>
> $ echo two words > grep-AND-test1
> $ echo two > grep-AND-test2
> $ echo words >> grep-AND-test2
> $ grep -l words $(grep -l two *)
> grep-AND-test1
> grep-AND-test2



I could be wrong, but doesn't egrep, which supports extended regular 
expressions, fit the bill?


=; echo two words > grep-AND-test1
=; echo two > grep-AND-test2
=; echo words >> grep-AND-test2

=; egrep 'two|words' grep-AND-test*
grep-AND-test1:two words
grep-AND-test2:two
grep-AND-test2:words


--
Archive: http://lists.debian.org/523de194.2040...@gmail.com



Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread recoverym4n
 Hi.

On Sat, 21 Sep 2013 21:12:36 +0300
Alexander Kapshuk alexander.kaps...@gmail.com wrote:

> I could be wrong, but doesn't egrep, which supports extended regular
> expressions, fit the bill?
>
> =; echo two words > grep-AND-test1
> =; echo two > grep-AND-test2
> =; echo words >> grep-AND-test2
>
> =; egrep 'two|words' grep-AND-test*
> grep-AND-test1:two words
> grep-AND-test2:two
> grep-AND-test2:words

You're wrong indeed, as '|' means 'or', not 'and'.

$ echo two > grep-AND-test3
$ egrep 'two|words' grep-AND-test3
two

Please note that 'two.*words' won't be the solution (as 'words' can be
placed before 'two'), and even '(two.*words|words.*two)' won't be the
solution either ('two' and 'words' can be on different lines of the file).
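
The difference between matching per line and matching per file can be checked with a small experiment. The sketch below is only an illustration under assumed names (a temporary directory and the file grep-AND-test2 recreated inside it): the alternation finds nothing because no single line holds both words, while one grep per word, chained with &&, succeeds.

```shell
#!/bin/sh
# Sketch: line-based regex versus one grep per word.
demo=$(mktemp -d) || exit 1
cd "$demo" || exit 1

# Both words are present, but on different lines.
printf 'two\nwords\n' > grep-AND-test2

# grep matches line by line, so this alternation never sees both words at once.
per_line=$(grep -lE 'two.*words|words.*two' grep-AND-test2 || true)

# One grep per word: each only has to find its word somewhere in the file.
per_word=$(if grep -q two grep-AND-test2 && grep -q words grep-AND-test2; then
               echo grep-AND-test2
           fi)

printf 'per line: [%s]\nper word: [%s]\n' "$per_line" "$per_word"
```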

Reco


-- 
Archive: 
http://lists.debian.org/20130921230559.007126c4783b01f08f766...@gmail.com



Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread Linux-Fan
... OP wrote:
> You can find all files containing either import or
> BufferedReader, but not both words in the same file. Also, how can
> you use the same or a similar script to search for sequences of
> characters containing spaces and other special characters? Say,
> something like:
>
> _WRD="import javax.swing|new BufferedReader"

You could write a simple script doing the matching, aka /tmp/match.sh
containing these lines:

#!/bin/sh
# join all lines of the file into one, then look for both phrases in this order
cat "$1" | tr '\n' ' ' | grep -qE 'import javax\.swing.*new BufferedReader' && echo "$1"

and a suitable find invocation

$ find DIRECTORY -name '*.java' -exec /tmp/match.sh {} \;

Unfortunately, this will invoke the script once for every .java file below
DIRECTORY, which is why I would not use it on large amounts of data.
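
One way to cut down the number of processes is to let find batch the file names with `-exec ... {} +` and to chain two grep -l calls, one per phrase. The following is a sketch under assumptions, not a drop-in replacement: it relies on GNU grep (-Z) and GNU xargs (-0, -r), it matches the phrases as fixed strings (-F), and, unlike the script above, it does not care which phrase comes first. The directory layout and file contents are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: find .java files containing both phrases, with few grep invocations.
# Assumes GNU grep (-Z) and GNU xargs (-0, -r).
demo=$(mktemp -d) || exit 1
mkdir -p "$demo/src/sub"

# Hypothetical test tree: only A.java is a .java file holding both phrases.
printf 'import javax.swing.JFrame;\nnew BufferedReader(in);\n' > "$demo/src/A.java"
printf 'new BufferedReader(in);\n' > "$demo/src/sub/B.java"
printf 'import javax.swing.JFrame;\nnew BufferedReader(in);\n' > "$demo/src/sub/notes.txt"

# find hands many files to each grep ("{} +"); the first grep emits
# NUL-terminated names, the second keeps those that also hold phrase two.
hits=$(find "$demo/src" -type f -name '*.java' \
           -exec grep -lZF -e 'import javax.swing' {} + |
       xargs -0 -r grep -lF -e 'new BufferedReader' --)
printf '%s\n' "$hits"
```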

To search for special characters, e.g. the sequences of spaces
mentioned, you can use extended regular expressions.

HTH
Linux-Fan.

-- 
http://masysma.ohost.de/





Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread Rob Owens
On Sat, Sep 21, 2013 at 09:46:34AM -0400, ken wrote:
> On 09/21/2013 07:56 AM Rob Owens wrote:
>> On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
>>> the short bash script below you can use to find text files
>>> containing one word, but my attempts at trying to make it find more
>>> than one word within the same file haven't been successful
>>
>> I think you are looking for the 'grep' command.
>>
>> grep word path/* will find all files in path which contain word
>>
>> grep word path/* | grep word2 will do the same, but then narrow down the
>> search to files that also contain word2
>>
>> man grep for options.  But you might be interested in '-r' to
>> recursively search a path.
>>
>> -Rob
>
> Rob,
>
> This is incorrect, though it's a common misconception.  It will only
> work if both of the words sought are on the same line within the
> sought files.  This isn't what the OP is asking for.
 
Ah, I see I misread his problem...  Sorry for the false information.

-Rob

> Instead:
>
> $ echo two words > grep-AND-test1
> $ echo two > grep-AND-test2
> $ echo words >> grep-AND-test2
> $ grep -l words $(grep -l two *)
> grep-AND-test1
> grep-AND-test2
 
 
 
 




Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread David
On 21 September 2013 19:22, Albretch Mueller lbrt...@gmail.com wrote:

> the short bash script below you can use to find text files
> containing one word, but my attempts at trying to make it find more
> than one word within the same file haven't been successful

Your question is not at all specific to Debian, so really it is off-topic
here. Ok maybe you are using Debian, but the question is not *about*
the Debian distribution you happen to be using. Your question is
about bash scripts and grep, which run in many other places than
Debian. The big benefit for you in understanding this point is that
you will reach a more suitable audience and be more likely to get
the help you want if you find a forum about bash scripting, or grep,
and ask there.

The script you provided has an ugly style that I find hard to read, so I
spent 5 seconds looking at it and then decided it was not going to be
fun, and stopped. I mention this not to criticise you, but to help you
understand our conversation.

Also I find your question unclear. It took too much effort for me to
figure out exactly what you are asking, so I gave up and had to guess.
I guessed like this:

> You can find all files containing either import or
> BufferedReader, but not both words in the same file.

I gather from this sentence that you want a script that can search text
files to find only files that contain all words in a set of words, and that
those words can occur in any order in the file.

> I want to do just one search per file

I think 'grep' cannot do this in only one invocation, because I
think it has no way to specify all words without giving them an order.
So I wrote the below bash script that might help you. It prints only
filenames that contain all words in wordlist.

It generates 3 example files tbm.txt, tb.txt, t.txt and searches for the
only one that contains all 3 words: three blind mice.

#!/bin/bash

# require bash version 4 or later (mapfile was added in bash 4)
if (( BASH_VERSINFO[0] < 4 )) ; then
    printf "This script requires Bash version 4 or later\n"
    exit 1
fi

# require nullglob set
shopt -s nullglob

# create some demo files
echo three blind mice > tbm.txt
echo three blind > tb.txt
echo three > t.txt

# files to search
files=( *.txt )

# words to search for
wordlist=( three blind mice )

# search files for each word
for word in "${wordlist[@]}" ; do
    if [ -n "${files[*]}" ] ; then
        # keep only files that contain current word
        mapfile -t files < <(grep -l "${word}" "${files[@]}")
    fi
done

# print remaining files
printf -- "%s\n" "${files[@]}"
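
For shells without mapfile or arrays, the same test can be turned around: instead of narrowing a list of files word by word, check each file against every word and stop at the first word that is missing. The sketch below is a different technique from the script above, written for plain POSIX sh. The function name has_all_words and the temporary demo directory are hypothetical, and the words are matched as fixed strings (-F) rather than as regular expressions.

```shell
#!/bin/sh
# Sketch: per-file check that every given word occurs somewhere in the file.
# has_all_words FILE WORD... (hypothetical helper; its variables are global,
# since POSIX sh has no "local")
has_all_words() {
    file=$1
    shift
    for word in "$@"; do
        # -q: no output, status only; -F: fixed string; fail on first miss
        grep -qF -e "$word" -- "$file" || return 1
    done
    return 0
}

# Demo files, created in a throwaway directory (an assumption for the demo).
demo=$(mktemp -d) || exit 1
echo 'three blind mice' > "$demo/tbm.txt"
echo 'three blind' > "$demo/tb.txt"
echo 'three' > "$demo/t.txt"

# Print only the files that contain all three words.
result=$(for f in "$demo"/*.txt; do
             if has_all_words "$f" three blind mice; then
                 printf '%s\n' "$f"
             fi
         done)
printf '%s\n' "$result"
```

This starts up to one grep per word for each file, so it trades speed for portability; the array-based script above needs far fewer grep invocations on large file sets.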


-- 
Archive: 
http://lists.debian.org/CAMPXz=ry1cw16ugyagkk6qpttl1170mhtqc4+3nobeajtpi...@mail.gmail.com



Re: find'ing files containing certain words (all of them) ...

2013-09-21 Thread David
On 22 September 2013 12:55, David bouncingc...@gmail.com wrote:

> if [ -n "${files[*]}" ] ; then

oops, that line above (#7 from the end) works ok but it will run
faster if changed to this more modern bash syntax:

if [[ -n ${files[*]} ]] ; then

(I was writing makefiles yesterday, I got stuck in the habit of using
the older syntax :)


-- 
Archive: 
http://lists.debian.org/CAMPXz=qo2kk7xfaaz9jlz3ttq4w18xb_dg0qa_+oq4dm08s...@mail.gmail.com