Re: find'ing files containing certain words (all of them) ...
On Sat, Sep 21, 2013 at 12:00:54PM -0400, Albretch Mueller wrote:
> I have come to believe this is one of those problems that is not to be
> optimally solved with a script, but a programming language

What's the difference? OK, you give a script to the cast and a program
to the audience, but other than that it gets difficult to discern them.

--
If you're not careful, the newspapers will have you hating the people
who are being oppressed, and loving the people who are doing the
oppressing. --- Malcolm X

--
To UNSUBSCRIBE, email to debian-user-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130923073229.GH23041@tal
Re: find'ing files containing certain words (all of them) ...
On Sat, 21 Sep 2013 12:00:54 -0400
Albretch Mueller lbrt...@gmail.com wrote:

> I have come to believe this is one of those problems that is not to be
> optimally solved with a script, but a programming language
>
> lbrtchx

Probably AWK could be a good compromise :)

words.awk:

    # p is a |-separated list of patterns; ff[i] becomes 1 once pattern i is seen
    BEGIN { split(p,ws,"|"); n=1; while (n in ws) ff[n++]=0; }
    { for (i=1;i<n;i++) if ((ff[i]==0) && (match($0,ws[i]))) ff[i]=1; }
    # exit status 0 if every pattern was seen somewhere in the file, 1 otherwise
    END { f=0; i=1; while ((f==0)&&(i<n)) { if (ff[i++]==0) f=1; } exit(f); }

command line:

    $ awk -v p="w1|...|wn" -f words.awk file.txt

Regards :)

--
http://mr.flossdaily.org

Archive: http://lists.debian.org/20130923134723.7b828...@eunet.rs
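Since the awk program above reports its result only through the exit
status, it has to be run once per file to get a list of names. A minimal
sketch of such a loop follows. The function name has_all_words and the
demo files are made up for illustration, and the awk program is a
condensed inline variant of words.awk rather than the posted file, so
that the snippet is self-contained.

```shell
#!/bin/sh
# Hypothetical wrapper: succeeds (exit status 0) only if FILE contains
# every pattern in the |-separated list. Patterns are awk regular
# expressions, as in words.awk.
# Usage: has_all_words 'w1|w2' FILE
has_all_words() {
    awk -v p="$1" '
        BEGIN { n = split(p, ws, "|") }
        # Mark each pattern the first time some line matches it.
        { for (i = 1; i <= n; i++) if (!(i in seen) && $0 ~ ws[i]) seen[i] = 1 }
        # Fail as soon as one pattern was never seen.
        END { for (i = 1; i <= n; i++) if (!(i in seen)) exit 1; exit 0 }
    ' "$2"
}

# Demo files (made up): both words on one line, on two lines, one word only.
demo=$(mktemp -d)
printf 'two words\n'  > "$demo/grep-AND-test1"
printf 'two\nwords\n' > "$demo/grep-AND-test2"
printf 'two\n'        > "$demo/grep-AND-test3"

for f in "$demo"/grep-AND-test*; do
    if has_all_words 'two|words' "$f"; then
        basename "$f"
    fi
done
```

Run as written, the loop prints grep-AND-test1 and grep-AND-test2. One
awk process per file is the price of this approach, so it may be slow on
large trees.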
Re: find'ing files containing certain words (all of them) ...
On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
> the short bash script below you can use to find text files containing
> one word, but my attempts at trying to make it find more than one word
> within the same file haven't been successful

I think you are looking for the 'grep' command.

    grep word path/*

will find all files in path which contain word

    grep word path/* | grep word2

will do the same, but then narrow down the search to files that also
contain word2

man grep for options. But you might be interested in '-r' to
recursively search a path.

-Rob
Re: find'ing files containing certain words (all of them) ...
On 09/21/2013 07:56 AM Rob Owens wrote:
> On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
>> the short bash script below you can use to find text files containing
>> one word, but my attempts at trying to make it find more than one word
>> within the same file haven't been successful
>
> I think you are looking for the 'grep' command.
>
>     grep word path/*
>
> will find all files in path which contain word
>
>     grep word path/* | grep word2
>
> will do the same, but then narrow down the search to files that also
> contain word2
>
> man grep for options. But you might be interested in '-r' to
> recursively search a path.
>
> -Rob

Rob,

This is incorrect, though it's a common misconception. It will only
work if both of the words sought are on the same line within the
sought files. This isn't what the OP is asking for.

Instead:

    $ echo two words > grep-AND-test1
    $ echo two > grep-AND-test2
    $ echo words >> grep-AND-test2
    $ grep -l words $(grep -l two *)
    grep-AND-test1
    grep-AND-test2

Archive: http://lists.debian.org/523da33a.3040...@mousecar.com
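The command substitution above splits on whitespace, so it breaks on
file names containing spaces, and if the inner grep finds nothing the
outer grep is left reading standard input. A sketch of a more defensive
variant follows, assuming a grep that supports --null and an xargs that
supports -0 and -r (the GNU versions do). The function name
files_with_both and the demo files are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list files below DIR that contain both words,
# on any lines. File names travel between the two greps NUL-terminated,
# so spaces in names are harmless.
# Usage: files_with_both WORD1 WORD2 DIR
files_with_both() {
    grep -rl --null -e "$1" "$3" | xargs -0 -r grep -l -e "$2"
}

# Demo files (made up), including a name with spaces.
demo=$(mktemp -d)
printf 'two words\n'  > "$demo/grep-AND-test1"
printf 'two\nwords\n' > "$demo/grep-AND-test2"
printf 'two\n'        > "$demo/grep-AND-test3"
printf 'two\nwords\n' > "$demo/name with spaces"

files_with_both two words "$demo" | sort
```

This lists grep-AND-test1, grep-AND-test2 and "name with spaces", each
with its directory prefix, and skips grep-AND-test3. The -r passed to
xargs keeps the second grep from running at all when the first one
matches nothing.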
Re: find'ing files containing certain words (all of them) ...
I have come to believe this is one of those problems that is not to be
optimally solved with a script, but a programming language

lbrtchx

Archive: http://lists.debian.org/CAFakBwhDcRc_iEAqx=2oybga17_tm_3cj_q4n9g6r_rxlti...@mail.gmail.com
Re: find'ing files containing certain words (all of them) ...
On 09/21/2013 04:46 PM, ken wrote:
> On 09/21/2013 07:56 AM Rob Owens wrote:
>> On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
>>> the short bash script below you can use to find text files
>>> containing one word, but my attempts at trying to make it find more
>>> than one word within the same file haven't been successful
>>
>> I think you are looking for the 'grep' command.
>>
>>     grep word path/*
>>
>> will find all files in path which contain word
>>
>>     grep word path/* | grep word2
>>
>> will do the same, but then narrow down the search to files that also
>> contain word2
>>
>> man grep for options. But you might be interested in '-r' to
>> recursively search a path.
>>
>> -Rob
>
> Rob,
>
> This is incorrect, though it's a common misconception. It will only
> work if both of the words sought are on the same line within the
> sought files. This isn't what the OP is asking for.
>
> Instead:
>
>     $ echo two words > grep-AND-test1
>     $ echo two > grep-AND-test2
>     $ echo words >> grep-AND-test2
>     $ grep -l words $(grep -l two *)
>     grep-AND-test1
>     grep-AND-test2

I could be wrong, but doesn't egrep, which supports extended regular
expressions, fit the bill?

    =; echo two words > grep-AND-test1
    =; echo two > grep-AND-test2
    =; echo words >> grep-AND-test2
    =; egrep 'two|words' grep-AND-test*
    grep-AND-test1:two words
    grep-AND-test2:two
    grep-AND-test2:words

Archive: http://lists.debian.org/523de194.2040...@gmail.com
Re: find'ing files containing certain words (all of them) ...
Hi.

On Sat, 21 Sep 2013 21:12:36 +0300
Alexander Kapshuk alexander.kaps...@gmail.com wrote:

> I could be wrong, but doesn't egrep, which supports extended regular
> expressions, fit the bill?
>
>     =; echo two words > grep-AND-test1
>     =; echo two > grep-AND-test2
>     =; echo words >> grep-AND-test2
>     =; egrep 'two|words' grep-AND-test*
>     grep-AND-test1:two words
>     grep-AND-test2:two
>     grep-AND-test2:words

You're wrong indeed, as '|' means 'or', not 'and'.

    $ echo two > grep-AND-test3
    $ egrep 'two|words' grep-AND-test3
    two

Please note that 'two.*words' won't be the solution (as 'words' can be
placed before 'two'), and even '(two.*words|words.*two)' won't be the
solution either ('two' and 'words' can be on different lines of the
file).

Reco

Archive: http://lists.debian.org/20130921230559.007126c4783b01f08f766...@gmail.com
Re: find'ing files containing certain words (all of them) ...
... OP wrote:
> You can find all files containing either import or BufferedReader,
> but not both words in the same file.
>
> Also, how can you use the same or a similar script to search for
> sequences of characters containing spaces and other special
> characters?
>
> Say, something like:
>
>     _WRD="import javax.swing|new BufferedReader"

You could write a simple script doing the matching, say /tmp/match.sh,
containing these lines:

    #!/bin/sh
    # join all lines into one, then print the file name if both phrases
    # occur in this order
    cat "$1" | tr '\n' ' ' | grep -qE 'import javax\.swing.*new BufferedReader' && echo "$1"

and a suitable find invocation

    $ find DIRECTORY -name '*.java' -exec /tmp/match.sh {} \;

Unfortunately, this will invoke the script once for every .java file
below DIRECTORY, which is why I would not use it on large amounts of
data. To search for special characters, e.g. the sequences of spaces
mentioned, you can use extended regular expressions.

HTH
Linux-Fan.

--
http://masysma.ohost.de/
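The per-file script invocation mentioned above can be avoided by letting
find hand many files to grep at once. A minimal sketch follows, assuming
a grep that supports --null and an xargs that supports -0 and -r (the
GNU versions do). The function name match_both and the demo tree are
made up for illustration. Unlike the regular expression above, this
variant does not care which phrase comes first.

```shell
#!/bin/sh
# Hypothetical helper: list .java files below DIR that contain both
# phrases, on any lines and in any order. -F makes grep treat the
# phrases as fixed strings, so "." and spaces need no escaping.
# Usage: match_both DIR
match_both() {
    find "$1" -type f -name '*.java' \
        -exec grep -lF --null -e 'import javax.swing' {} + |
        xargs -0 -r grep -lF -e 'new BufferedReader'
}

# Demo tree with made-up files.
src=$(mktemp -d)
mkdir "$src/sub"
printf 'import javax.swing.JFrame;\nclass A { Object r = new BufferedReader(null); }\n' > "$src/A.java"
printf 'import javax.swing.JFrame;\nclass B {}\n' > "$src/sub/B.java"
printf '// new BufferedReader comes first here\nimport javax.swing.JFrame;\n' > "$src/sub/C.java"

match_both "$src" | sort
```

On the demo tree this lists A.java and sub/C.java and skips sub/B.java.
The "{} +" form of -exec batches file names, so grep starts only a few
times even for large trees.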
Re: find'ing files containing certain words (all of them) ...
On Sat, Sep 21, 2013 at 09:46:34AM -0400, ken wrote:
> On 09/21/2013 07:56 AM Rob Owens wrote:
>> On Sat, Sep 21, 2013 at 05:22:09AM -0400, Albretch Mueller wrote:
>>> the short bash script below you can use to find text files
>>> containing one word, but my attempts at trying to make it find more
>>> than one word within the same file haven't been successful
>>
>> I think you are looking for the 'grep' command.
>>
>>     grep word path/*
>>
>> will find all files in path which contain word
>>
>>     grep word path/* | grep word2
>>
>> will do the same, but then narrow down the search to files that also
>> contain word2
>>
>> man grep for options. But you might be interested in '-r' to
>> recursively search a path.
>>
>> -Rob
>
> Rob,
>
> This is incorrect, though it's a common misconception. It will only
> work if both of the words sought are on the same line within the
> sought files. This isn't what the OP is asking for.

Ah, I see I misread his problem... Sorry for the false information.

-Rob

> Instead:
>
>     $ echo two words > grep-AND-test1
>     $ echo two > grep-AND-test2
>     $ echo words >> grep-AND-test2
>     $ grep -l words $(grep -l two *)
>     grep-AND-test1
>     grep-AND-test2
Re: find'ing files containing certain words (all of them) ...
On 21 September 2013 19:22, Albretch Mueller lbrt...@gmail.com wrote:
> the short bash script below you can use to find text files containing
> one word, but my attempts at trying to make it find more than one word
> within the same file haven't been successful

Your question is not at all specific to Debian, so really it is
offtopic here. Ok maybe you are using Debian, but the question is not
*about* the Debian distribution you happen to be using. Your question
is about bash scripts and grep, which run in many other places than
Debian. The big benefit for you in understanding this point is that you
will reach a more suitable audience and be more likely to get the help
you want if you find a forum about bash scripting, or grep, and ask
there.

The script you provided has an ugly style that I find hard to read, so
I spent 5 seconds looking at it and then decided it was not going to be
fun, and stopped. I mention this not to criticise you, but to help you
understand our conversation.

Also I find your question unclear. It took too much effort for me to
figure out exactly what you are asking, so I gave up and had to guess.
I guessed like this:

> You can find all files containing either import or BufferedReader,
> but not both words in the same file.

I gather from this sentence that you want a script that can search text
files to find only files that contain all words in a set of words, and
that those words can occur in any order in the file.

> I want to do just one search per file

I think 'grep' cannot do this in only one invocation, because I think
it has no way to specify all words without giving them an order.

So I wrote the below bash script that might help you. It prints only
filenames that contain all words in wordlist. It generates 3 example
files tbm.txt, tb.txt, t.txt and searches for the only one that
contains all 3 words: three blind mice.

    #!/bin/bash

    # require bash version 4 or later (for mapfile)
    if (( BASH_VERSINFO[0] < 4 )) ; then
        printf "This script requires Bash version 4 or later\n"
        exit
    fi

    # require nullglob set
    shopt -s nullglob

    # create some demo files
    echo three blind mice > tbm.txt
    echo three blind > tb.txt
    echo three > t.txt

    # files to search
    files=( *.txt )

    # words to search for
    wordlist=( three blind mice )

    # search files for each word
    for word in "${wordlist[@]}" ; do
        if [ -n "${files[*]}" ] ; then
            # keep only files that contain current word
            mapfile -t files < <(grep -l "${word}" "${files[@]}")
        fi
    done
    # print remaining files
    printf -- "%s\n" "${files[@]}"

Archive: http://lists.debian.org/CAMPXz=ry1cw16ugyagkk6qpttl1170mhtqc4+3nobeajtpi...@mail.gmail.com
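The same narrowing loop can be wrapped in a function that takes the
directory and the words as arguments instead of hard-coding them. What
follows is a sketch under assumptions, not the script as posted: the
function name files_with_all_words, the restriction to *.txt files and
the demo directory are made up for illustration, and it needs bash 4 or
later for mapfile.

```shell
#!/bin/bash
# Hypothetical helper: print the *.txt files directly inside DIR that
# contain every WORD, in any order and on any lines.
# Usage: files_with_all_words DIR WORD...
shopt -s nullglob    # an unmatched glob expands to nothing

files_with_all_words() {
    local dir=$1 word
    shift
    local -a files=( "$dir"/*.txt )
    for word in "$@"; do
        # Stop once no candidates remain; grep without file arguments
        # would otherwise read standard input.
        if (( ${#files[@]} == 0 )); then
            return 0
        fi
        # Keep only the files that contain the current word.
        mapfile -t files < <(grep -l -e "$word" -- "${files[@]}")
    done
    if (( ${#files[@]} > 0 )); then
        printf '%s\n' "${files[@]}"
    fi
    return 0
}

# Demo files (same contents as in the script above, in a temp directory).
demo=$(mktemp -d)
echo three blind mice > "$demo/tbm.txt"
echo three blind      > "$demo/tb.txt"
echo three            > "$demo/t.txt"

files_with_all_words "$demo" three blind mice
```

On the demo directory only tbm.txt survives all three passes. Like the
script above, this reads names line by line through mapfile, so file
names containing newlines are not handled.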
Re: find'ing files containing certain words (all of them) ...
On 22 September 2013 12:55, David bouncingc...@gmail.com wrote:
>         if [ -n "${files[*]}" ] ; then

oops, that line above (#7 from the end) works ok but it will run
faster if changed to this more modern bash syntax:

        if [[ -n ${files[*]} ]] ; then

(I was writing makefiles yesterday, I got stuck in the habit of using
the older syntax :)

Archive: http://lists.debian.org/CAMPXz=qo2kk7xfaaz9jlz3ttq4w18xb_dg0qa_+oq4dm08s...@mail.gmail.com