Re: Grep on dictionary words

2009-11-30 Thread Andrew Sackville-West
On Sun, Nov 29, 2009 at 11:14:58AM +0200, Dotan Cohen wrote: 2009/11/29 Andrew Sackville-West and...@farwestbilliards.com: On Sun, Nov 29, 2009 at 01:22:15AM +0200, Dotan Cohen wrote: will get the ones that start with capital alphas. if you want initial caps *only* then: grep

Re: Grep on dictionary words

2009-11-30 Thread Mike Castle
On Sat, Nov 28, 2009 at 7:13 AM, Dotan Cohen dotanco...@gmail.com wrote: I have a long binary file (about 12 MB) that I need to extract the text from via strings. Naturally, there are a lot of junk lines such as these: pDuf #k0H}g) GoV5 rLeY1 TMlq,* Is there a way to grep the output of

Re: Grep on dictionary words

2009-11-29 Thread Tzafrir Cohen
On Sun, Nov 29, 2009 at 01:22:15AM +0200, Dotan Cohen wrote: will get the ones that start with capital alphas. if you want initial caps *only* then: grep ^[A-Z][a-z]*$ would match those. Thanks. I meant that caps could only be at the beginning of a word, not in the middle.

Re: Grep on dictionary words

2009-11-29 Thread Dotan Cohen
2009/11/29 Andrew Sackville-West and...@farwestbilliards.com: On Sun, Nov 29, 2009 at 01:22:15AM +0200, Dotan Cohen wrote: will get the ones that start with capital alphas. if you want initial caps *only* then: grep ^[A-Z][a-z]*$ would match those. Thanks. I meant that caps could

Re: Grep on dictionary words

2009-11-29 Thread Emanoil Kotsev
Dotan Cohen wrote: This means that only words that start with a capital are valid. What I need is that a word can start with a capital, but capitals can be nowhere else. I got that like this: grep ^[A-Za-z][a-z]*$ However I think that there is a better way. This is a good exercise. I am bettering my regex skills as

Grep on dictionary words

2009-11-28 Thread Dotan Cohen
I have a long binary file (about 12 MB) that I need to extract the text from via strings. Naturally, there are a lot of junk lines such as these: pDuf #k0H}g) GoV5 rLeY1 TMlq,* Is there a way to grep the output of strings in order to only show lines that contain words found in the aspell
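
One way to approach the question, as a rough sketch rather than anything proposed in the thread: load a plain word list (one word per line) into awk and print only the lines that contain at least one listed word. The function name `lines_with_words`, the minimum word length of 4, and the word-list path are all assumptions made for illustration. A list could come from something like /usr/share/dict/words or, assuming aspell is installed, from `aspell dump master`.

```shell
#!/bin/sh
# Print the lines on stdin that contain at least one whole word from a word list.
# $1: path to a word list, one word per line (hypothetical input file).
lines_with_words() {
    awk -v min=4 '
        # First file: the word list. Very short entries are skipped because
        # they match junk too easily. Comparison is case-insensitive.
        NR == FNR { if (length($0) >= min) dict[tolower($0)] = 1; next }
        # Second input (stdin): split each line on runs of non-letters
        # and print the line as soon as one piece is a known word.
        {
            n = split($0, parts, /[^A-Za-z]+/)
            for (i = 1; i <= n; i++)
                if (tolower(parts[i]) in dict) { print; next }
        }
    ' "$1" -
}
```

Usage would look like `strings file.bin | lines_with_words words.txt`, where both file names are placeholders. Note that an empty word list would confuse the `NR == FNR` test, so the sketch assumes the list has at least one line.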

Re: Grep on dictionary words

2009-11-28 Thread Boyd Stephen Smith Jr.
In 880dece00911280713n6193b8das6970e8a071fc2...@mail.gmail.com, Dotan Cohen wrote: Is there a way to grep the output of strings in order to only show lines that contain words found in the aspell dictionary? Thanks in advance. I once wrote a small program against the aspell API to do something

Re: Grep on dictionary words

2009-11-28 Thread Andrew Sackville-West
On Sat, Nov 28, 2009 at 11:32:59AM -0600, Boyd Stephen Smith Jr. wrote: In 880dece00911280713n6193b8das6970e8a071fc2...@mail.gmail.com, Dotan Cohen wrote: Is there a way to grep the output of strings in order to only show lines that contain words found in the aspell dictionary? Thanks in

Re: Grep on dictionary words

2009-11-28 Thread Dotan Cohen
ISTM that because the output of strings is not a discrete list of potential words, but is instead a long list of concatenated characters, this problem is really rather daunting. The output should probably be first broken up into something resembling words by perhaps breaking on non-alphabetic
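
The splitting step described here can be sketched with `tr`, assuming "breaking on non-alphabetic" means treating every run of non-letters as a separator. The function name `split_into_words` is made up for this example.

```shell
#!/bin/sh
# Turn stdin into one candidate word per line.
split_into_words() {
    # -c: operate on the complement of the set (everything that is not a letter)
    # -s: squeeze each run of replaced characters into a single newline
    # LC_ALL=C keeps A-Za-z as plain ASCII ranges.
    LC_ALL=C tr -cs 'A-Za-z' '\n'
}
```

For example, `printf 'GoV5 rLeY1 TMlq,*\n' | split_into_words` yields the three candidates GoV, rLeY and TMlq, one per line. Input that begins with a non-letter produces a leading empty line, which a later filter would need to ignore.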

Re: Grep on dictionary words

2009-11-28 Thread Florian Kriener
On Saturday 28 November 2009 16:13:55 Dotan Cohen wrote: I have a long binary file (about 12 MB) that I need to extract the text from via strings. Naturally, there are a lot of junk lines such as these: pDuf #k0H}g) GoV5 rLeY1 TMlq,* Is there a way to grep the output of strings in

Re: Grep on dictionary words

2009-11-28 Thread Andrew Sackville-West
On Sun, Nov 29, 2009 at 12:00:33AM +0200, Dotan Cohen wrote: ISTM that because the output of strings is not a discrete list of potential words, but is instead a long list of concatenated characters, this problem is really rather daunting. The output should probably be first broken up into

Re: Grep on dictionary words

2009-11-28 Thread Dotan Cohen
will get the ones that start with capital alphas. if you want initial caps *only* then: grep ^[A-Z][a-z]*$ would match those. Thanks. I meant that caps could only be at the beginning of a word, not in the middle. Expanding your example, I figured that would be: grep ^[A-Z]?[a-z]*$ // note
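
A caution on the optional-capital pattern quoted above: in grep's default (basic) regex syntax `?` is an ordinary character, so the pattern needs `-E`, and it should be quoted so the shell does not try to expand it as a glob. A minimal sketch, with the function name `initial_cap_only` invented for illustration:

```shell
#!/bin/sh
# Keep only lines that consist of a single word whose capital letter,
# if there is one, is the first character.
initial_cap_only() {
    # -E selects extended regex syntax, where ? means "zero or one".
    # LC_ALL=C keeps [A-Z] and [a-z] as plain ASCII ranges.
    LC_ALL=C grep -E '^[A-Z]?[a-z]*$'
}
```

Fed the lines Hello, hello, heLLo and GoV5, this keeps Hello and hello. As written the pattern also matches empty lines, since both parts are optional.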

Re: Grep on dictionary words

2009-11-28 Thread Andrew Sackville-West
On Sun, Nov 29, 2009 at 01:22:15AM +0200, Dotan Cohen wrote: will get the ones that start with capital alphas. if you want initial caps *only* then: grep ^[A-Z][a-z]*$ would match those. Thanks. I meant that caps could only be at the beginning of a word, not in the middle.

Re: Grep on dictionary words

2009-11-28 Thread John Hasler
Dotan writes: Is there a way to grep the output of strings in order to only show lines that contain words found in the aspell dictionary?

Try this:

    #!/bin/bash
    strings $1 | while read line
    do
      if [ ` echo $line | sed -e 's/[^a-zA-Z ]//g' | wc -m` -lt 6 ]
      then
        continue
      fi
      echo $line | sed -e
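
The preview cuts the script off, so only its first test is visible: lines with too few letters and spaces are skipped. The sketch below covers that visible step only and does not attempt to reconstruct the rest. The function name `drop_short_lines` is an assumption, and the threshold is written as 5 because `wc -m` in the original also counts the newline that `echo` appends. Unlike the original, this version quotes its variables, so runs of spaces are not collapsed before counting.

```shell
#!/bin/bash
# Drop lines that contain fewer than 5 letters and spaces in total.
drop_short_lines() {
    while IFS= read -r line; do
        # Delete everything except ASCII letters and spaces, then measure the rest.
        kept=$(printf '%s' "$line" | LC_ALL=C tr -cd 'a-zA-Z ')
        if [ "${#kept}" -lt 5 ]; then
            continue
        fi
        printf '%s\n' "$line"
    done
}
```

It would be used as `strings file.bin | drop_short_lines`, with `file.bin` standing in for the real binary.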