Patrick Hall wrote:
> 
> --- "John W. Krahn" <[EMAIL PROTECTED]> wrote:
> > Patrick Hall wrote:
> > > I'd like to push all sequences of capitalized words onto an array.
> > >
> > > So, given this paragraph (which I just snagged off AP)
> > >
> > > Indian Prime Minister Atal Bihari Vajpayee said
> > > Wednesday that India would consider jointly monitoring
> > > the disputed Kashmir border with its longtime rival
> > > Pakistan.
> > >
> > > would return:
> > >
> > > Indian Prime Minister Atal Bihari Vajpayee
> > > Wednesday
> > > India
> > > Kashmir
> > > Pakistan
> >
> > perl -00ne'tr/\n /
>
> /s;push@x,grep/\S/,split/\b[a-z]\S+\s/}{print"$_\n"for@x'
>
> > yourfile.txt
>
> A couple things -
> Seems to return
>
> Indian
> Prime
> Minister
> Atal
> Bihari
> Vajpayee
>
> Wednesday
>
> India
>
> Kashmir
>
> Pakistan.
>
> Instead of the sequences
>
> Prime Minister Atal Bihari Vajpayee
> Wednesday
> India
> Kashmir
> Pakistan.

Sorry, that's because my e-mail software wrapped the line. It should all be on one line:

perl -00ne'tr/\n / /s;push@x,grep/\S/,split/\b[a-z]\S+\s/}{print"$_\n"for@x' yourfile.txt

You could also use s/\s+/ /g instead of tr/\n / /s.
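
If the one-liner is hard to read, here is a rough sketch of the same idea written out as a script. The sub name capitalized_runs and the embedded sample text are only there for illustration; they are not part of the one-liner.

```perl
use strict;
use warnings;

# Hypothetical helper name, used only for this sketch.
sub capitalized_runs {
    my ($para) = @_;
    $para =~ tr/\n / /s;                 # newlines and repeated spaces become one space
    return grep { /\S/ }                 # drop empty or whitespace-only pieces
           split /\b[a-z]\S+\s/, $para;  # words starting in lowercase act as separators
}

# Sample paragraph from the original question.
my $text = <<'END';
Indian Prime Minister Atal Bihari Vajpayee said
Wednesday that India would consider jointly monitoring
the disputed Kashmir border with its longtime rival
Pakistan.
END

print "$_\n" for capitalized_runs($text);
```

On the sample paragraph this prints five lines, the first being the whole "Indian Prime Minister Atal Bihari Vajpayee" run. Each piece keeps its trailing space and the last one keeps its period, so you may want to trim those afterwards.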


> Also, what does the 00 option do? It's hard to look up
> such things in the docs. (Or where is it in the docs?)

perldoc perlrun

-0 is the first option listed. Given as -00, it makes perl read the file in paragraph mode.

perldoc perlvar

This explains how to set the $/ variable to paragraph mode.
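
As a small sketch of what paragraph mode means inside a script, assuming you set $/ yourself rather than using -00: the sample data and the in-memory filehandle below are made up just to keep the example self-contained.

```perl
use strict;
use warnings;

# Made-up sample data: two paragraphs separated by blank lines.
my $data = "first paragraph,\nline two\n\n\nsecond paragraph\n";

# Read from an in-memory filehandle so no real file is needed.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @paragraphs;
{
    local $/ = '';          # empty string selects paragraph mode, like -00
    @paragraphs = <$fh>;    # each element is one whole paragraph
}
close $fh;

printf "%d paragraphs\n", scalar @paragraphs;
```

This reports two paragraphs, because runs of blank lines count as a single separator. Using local keeps the change to $/ confined to the enclosing block.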


John
-- 
use Perl;
program
fulfillment
