Re: [9fans] sed question (OT)

dave . l Fri, 30 Oct 2009 08:36:51 -0700

You can do it, definitely.

Caveat: I'm in bed with a virus and the brain's on impulse power
so these are untested and may be highly suboptimal.


Is the input guaranteed to have 2 words on each line?
What are your definitions of words and blanks?

I know from your snippet that there are no leading blanks and no empty
lines.


Assuming there are 2 words on every line, something like:
h
s/[A-Za-z0-9_-]+[^A-Za-z0-9_-]+(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/(.)\n([A-Za-z0-9_-]+[^A-Za-z0-9_-]+).(.*)/\2\1\3/

ought to roughly work after your fragment.
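For anyone wanting to try this outside Plan 9, here is a rough sketch of the two-words-per-line version glued onto the original fragment. None of the following comes from the thread: it assumes a sed that accepts -E for egrep-style regexps (Plan 9 sed uses them natively), it treats any run of characters outside [A-Za-z0-9_-] as a blank, and the function name cap2 is made up.

```shell
# cap2: hypothetical wrapper; capitalises the first two words of each line.
# Assumes every line has at least two words and no leading blanks.
cap2() {
    sed -E '
# first word: isolate the first character, upper-case it, glue the rest back
h
s/^(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
x
s/^.(.*)/\1/
x
G
s/\n//
# second word: save the line, then isolate the first character of word two
h
s/[A-Za-z0-9_-]+[^A-Za-z0-9_-]+(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
# pattern space becomes "X\n<saved line>"; splice X over the old character
G
s/(.)\n([A-Za-z0-9_-]+[^A-Za-z0-9_-]+).(.*)/\2\1\3/
'
}

printf 'hello world\nfoo bar baz\n' | cap2
```

With those assumptions the two sample lines should come out as "Hello World" and "Foo Bar baz". A line with only one word gets mangled, which is what the longer version with branches is for.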

If >= 2 words per line isn't assumed:
h
t urnofflag
: urnofflag
s/[A-Za-z0-9_-]+[^A-Za-z0-9_-]+([A-Za-z0-9_-]).*/\1/
t for2
b cosnot2wds
: for2
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/(.)\n([A-Za-z0-9_-]+[^A-Za-z0-9_-]+).(.*)/\2\1\3/
b
: cosnot2wds
g
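A sketch of how the branching version might be exercised, again assuming sed -E and the same guessed definition of a blank (any run of characters outside [A-Za-z0-9_-]). The name cap2_safe and the labels reset and two are hypothetical, and the layout is slightly simplified: the "fewer than two words" case just restores the line and ends the cycle instead of jumping to a label.

```shell
# cap2_safe: hypothetical wrapper; capitalises the first two words of each
# line and leaves lines with fewer than two words alone after the first.
cap2_safe() {
    sed -E '
# first word, as in the original fragment
h
s/^(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
x
s/^.(.*)/\1/
x
G
s/\n//
h
# the substitutions above set the flag that t tests; branch once to clear it
t reset
:reset
# succeeds only if a separator and a second word follow the first word
s/[A-Za-z0-9_-]+[^A-Za-z0-9_-]+([A-Za-z0-9_-]).*/\1/
t two
# fewer than two words: restore the saved line and end the cycle
g
b
:two
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
s/(.)\n([A-Za-z0-9_-]+[^A-Za-z0-9_-]+).(.*)/\2\1\3/
'
}

printf 'hello\nhello world\none, two three\n' | cap2_safe
```

The dummy "t reset" matters because t fires on any successful substitution since the last input line was read or the last t was taken, and the first-word fragment has already made some.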

Bizarrely, within its limitations (\n, \0, size limits), sed is, in
some sense, complete, since you can store any number of things in the
spaces (using /(.*\n)/ etc.) and branch conditionally.
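As a small illustration of that point, here is a sketch that keeps two fields in the pattern space, separated by a newline acting as a cursor, and loops with a conditional branch until the cursor reaches the end of the line. It capitalises every word rather than just the first two. The function name cap_all, the labels, and the word definition [A-Za-z0-9_-] are assumptions, and it presumes a sed that accepts -E.

```shell
# cap_all: hypothetical example; capitalises the first letter of every word.
# Pattern space layout inside the loop: <done part> \n <part still to do>
cap_all() {
    sed -E '
# empty the hold space, then use G to obtain a newline to act as the cursor
x
s/.*//
x
G
# move the cursor from the end of the line to the start
s/^(.*)(\n)$/\2\1/
:loop
# cursor at the end of the line: nothing left to do
/\n$/b done
# save both fields, isolate the character after the cursor, upper-case it
h
s/.*\n(.).*/\1/
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
# now "X\n<done>\n<to do>": put X in place and move the cursor past the
# rest of the word and any separators that follow it
G
s/^(.)\n(.*)(\n).([A-Za-z0-9_-]*[^A-Za-z0-9_-]*)/\2\1\4\3/
b loop
:done
s/\n$//
'
}

printf 'hello big world\n' | cap_all
```

Stopping after a fixed number of words would need a counter kept as yet another newline-separated field, which is where the brain-curdling starts in earnest.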

Another insane possibility, since there are only 26 variations, is to do:

        s/^a/A/
        s/^([A-Za-z0-9_-]+[^A-Za-z0-9_-]+)a/\1A/
        s/^b/B/
        s/^([A-Za-z0-9_-]+[^A-Za-z0-9_-]+)b/\1B/

You can, of course, use sed to create the above script, like so:
        echo abcdefghijklmnopqrstuvwxyz | sed ...

Filling in the ellipses is left as an exercise for the already addled
reader.
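For the impatient, one possible answer to the exercise, which is not necessarily what was intended: split the alphabet into one letter per line, then turn each letter into a pair of s commands. This is a sketch assuming sed -E. The function name gen_script is made up, and the generated second rule uses [^A-Za-z0-9_-]+ as a guessed definition of the blank between the two words.

```shell
# gen_script: hypothetical generator; prints the 52-line sed script.
gen_script() {
    echo abcdefghijklmnopqrstuvwxyz |
    sed 's/./&\
/g' |
    sed -E '
# the split above leaves one empty line at the end; drop it
/^$/d
# keep the lower-case letter, make an upper-case copy, then join them
h
y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
G
# pattern space is "A\na"; emit the two rules for this letter
s|(.)\n(.)|s/^\2/\1/\
s/^([A-Za-z0-9_-]+[^A-Za-z0-9_-]+)\2/\\1\1/|
'
}

# show the first four generated lines
gen_script | sed 4q
```

The output can be saved and used with something like "gen_script > cap2.sed; sed -E -f cap2.sed input", where cap2.sed and input are placeholder file names.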


BTW: if you're shovelling a lot of this kind of muck, it may,
paradoxically, be easier to do it on the command line and use your
shell's variables for the repeated bits of regexps, commands etc.
The only caveats are that this technique will curdle your brain even
more than sed already does, and it may, oddly, be the exception to the
rule that rc is more elegant than sh, due to caret vs. double-quotes.
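A sketch of what that might look like in sh, with the repeated bits of regexp held in variables and pasted in through double quotes. The variable names W, S, LO and UP, the function name cap_second, and the definitions of word and separator characters are all assumptions, as is a sed that accepts -E.

```shell
W='[A-Za-z0-9_-]'     # one word character (assumed definition)
S='[^A-Za-z0-9_-]'    # one separator character (assumed definition)
LO=abcdefghijklmnopqrstuvwxyz
UP=ABCDEFGHIJKLMNOPQRSTUVWXYZ

# cap_second: hypothetical helper; capitalises the second word of each
# line, assuming every line has at least two words.
cap_second() {
    # inside double quotes, \\1 reaches sed as \1 and \\n as \n
    sed -E -e h \
        -e "s/$W+$S+(.).*/\\1/" \
        -e "y/$LO/$UP/" \
        -e G \
        -e "s/(.)\\n($W+$S+).(.*)/\\2\\1\\3/"
}

printf 'Hello world\n' | cap_second
```

The doubled backslashes are the price of the double quotes, which is presumably part of the curdling.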

Apologies for grandstanding, but I used to do this sort of stuff for a
living. I wrote a piece of training courseware for sed once which had
far worse excesses than the above as examples.

RFC-822 header-reassembly anyone?

I also used to get my intellectual rocks off on stuff like this until
I finally grew up (in my late 40s).


Dave.

SEE ALSO
        teco, assembler, qed.


On 29 Oct 2009, at 15:41, Steve Simon wrote:

Sorry, not really the place for such questions but...

I always struggle with sed, awk is easy but sed makes my head hurt.
I am trying to capitalise the first two words on each line (I could
use awk as well but I have to use sed so it seems churlish to start
another process).
capitalising the first word on the line is easy enough:

                        h
                        s/^(.).*/\1/
                        y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
                        x
                        s/^.(.*)/\1/
                        x
                        G
                        s/\n//

Though there may be a much easier/more elegant way to do this,
but for the 2nd word it gets much harder.
What I really want is sam's ability to select a letter and operate on it
rather than everything being line based as sed seems to be.
Any neat solutions? (extra points awarded for use of the branch
operator :-)
-Steve
