Re: [9fans] sed question (OT)
On Oct 30, 12:58 pm, noah.ev...@gmail.com (Noah Evans) wrote:
> This kind of problem is character processing, which I would argue is
> C's domain. You can massage awk and sed to do the job for you, but at
> least for me it's conceptually simpler to just bang out the following
> C program:
>
> #include <u.h>
> #include <libc.h>
> #include <bio.h>
>
> #define isupper(r)    (L'A' <= (r) && (r) <= L'Z')
> #define islower(r)    (L'a' <= (r) && (r) <= L'z')
> #define isalpha(r)    (isupper(r) || islower(r))
> #define isspace(r)    ((r) == L' ' || (r) == L'\t' \
>             || (0x0A <= (r) && (r) <= 0x0D))
> #define toupper(r)    ((r)-'a'+'A')
>
> void
> usage(char *me)
> {
>     fprint(2, "%s: usage\n", me);
> }
>
> void
> main(int argc, char **argv)
> {
>     Biobuf in, out;
>     int c, waswhite, nwords;
>
>     ARGBEGIN{
>     default:
>         usage(argv[0]);
>     }ARGEND;
>     Binit(&in, 0, OREAD);
>     Binit(&out, 1, OWRITE);
>     waswhite = 0;
>     nwords = 0;
>     while((c = Bgetc(&in)) != Beof){
>         if(isalpha(c))
>         if(waswhite)
>         if(nwords < 2){
>             if(islower(c))
>                 c = toupper(c);
>             nwords++;
>         }
>         if(isspace(c))
>             waswhite = 1;
>         else
>             waswhite = 0;
>         if(c == '\n')
>             nwords = 0;
>         Bputc(&out, c);
>     }
>     exits(0);
> }
>
> Noah

Simple, and wrong. You need to initialize waswhite to 1, not 0.
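For anyone who wants to try the fix outside Plan 9, here is a rough sketch of the same state machine in portable C with waswhite starting at 1. The function name cap2words and the string-in, string-out interface are assumptions made for illustration; the posted program reads standard input through bio instead. It assumes single-byte ASCII text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not from the thread: copy src to dst, upper-casing
 * the first letter of the first two words on every line.
 * dst must have room for strlen(src) + 1 bytes.
 */
void
cap2words(const char *src, char *dst)
{
	int waswhite = 1;	/* the fix: start of input counts as "after whitespace" */
	int nwords = 0;		/* words already handled on the current line */
	size_t i;

	assert(src != NULL && dst != NULL);
	for(i = 0; src[i] != '\0'; i++){
		unsigned char c = (unsigned char)src[i];

		/* same test as the posted program: a letter that starts a word */
		if(isalpha(c) && waswhite && nwords < 2){
			c = (unsigned char)toupper(c);
			nwords++;
		}
		waswhite = isspace(c) != 0;
		if(c == '\n')
			nwords = 0;	/* new line: start counting words again */
		dst[i] = (char)c;
	}
	dst[i] = '\0';
}
```

Called on a buffer holding "this is a test\n", it should leave "This Is a test\n" in dst. As in the posted program, a word that already starts with a capital still counts as one of the two, and a word that starts with a digit or punctuation is not counted at all.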
Re: [9fans] sed question (OT)
The script has a small bug one might say: it capitalizes the first two
words on a line that are _not_ already capitalized. If one of the first
two words is capitalized then the third will get capitalized.

--On Thursday, October 29, 2009 15:41 + Steve Simon <st...@quintile.net> wrote:
> Sorry, not really the place for such questions but...
>
> I always struggle with sed, awk is easy but sed makes my head hurt.
>
> I am trying to capitalise the first two words on each line (I could
> use awk as well but I have to use sed so it seems churlish to start
> another process).
>
> capitalising the first word on the line is easy enough:
>
>     h
>     s/^(.).*/\1/
>     y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
>     x
>     s/^.(.*)/\1/
>     x
>     G
>     s/\n//
>
> There may be a much easier/more elegant way to do this, but for the
> 2nd word it gets much harder.
>
> What I really want is sam's ability to select a letter and operate on
> it rather than everything being line based as sed seems to be.
>
> any neat solutions? (extra points awarded for use of the branch
> operator :-)
>
> -Steve
Re: [9fans] sed question (OT)
Listing of file 'sedscr:'

    s/^/ /;
    s/$/aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ/;
    s/ \([a-z]\)\(.*\1\)\(.\)/ \3\2\3/;
    s/ \([a-z]\)\(.*\1\)\(.\)/ \3\2\3/;
    s/.\{52\}$//;
    s/ //;

$ echo This is a test | sed -f sedscr
This Is a test

$ echo someone forgot to capitalize | sed -f sedscr
Someone Forgot to capitalize

This works with '/usr/bin/sed' from a FreeBSD 6.2-RELEASE installation.

Above sed script stolen from:

http://dervish.wsisiz.edu.pl/~bse26236/batutil/help/sed/CAPITALI.HTM

With a minor change: first three words to first two words.
Re: [9fans] sed question (OT)
You can do it, definitely. Caveat: I'm in bed with a virus and the
brain's on impulse power so these are untested and may be highly
suboptimal.

Is the input guaranteed to have 2 words on each line? What are your
definitions of words and blanks? I know from your snippet that there
are no leading blanks and no empty lines.

Assuming there are 2 words on every line, something like:

    h
    s/[A-Za-z0-9_-]+(.).*/\1/
    y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
    G
    s/(.)\n([A-Za-z0-9_-]+).(.*)/\2\1\3/

ought to roughly work after your fragment.

If >= 2 words per line isn't assumed:

    h
    t urnofflag
    : urnofflag
    s/[A-Za-z0-9_-]+[^ A-Za-z0-9_-]*(.).*/\1/
    t for2
    b cosnot2wds
    : for2
    y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
    G
    s/(.)\n([A-Za-z0-9_-]+[^ A-Za-z0-9_-]*).(.*)/\2\1\3/
    b
    : cosnot2wds
    g

Bizarrely, within its limitations (\n, \0, size limits), sed is, in
some sense, complete, since you can store any number of things in the
spaces (using /(.* \n)/ etc.) and branch conditionally.

Another insane possibility, since there are only 26 variations, is to
do:

    s/^a/A/
    s/^([A-Z][A-Za-z0-9]+[^ A-Za-z0-9_-]*)a/\1A/
    s/^b/B/
    s/^([A-Z][A-Za-z0-9]+[^ A-Za-z0-9_-]*)b/\1B/

You can, of course, use sed to create the above script like so:

    echo abcdefghijklmnopqrstuvwxyz | sed ...

Filling in the ellipses is left as an exercise for the already addled
reader.

BTW: if you're shovelling a lot of this kind of muck, it may,
paradoxically, be easier to do it on the command line and use your
shell's variables for the repeated bits of regexps, commands etc. The
only caveats are that this technique will curdle your brain even more
than sed already does and it may, oddly, be the exception to the rule
that rc is more elegant than sh, due to caret vs. double-quotes.

Apologies for grandstanding, but I used to do this sort of stuff for a
living. I wrote a piece of training courseware for sed once which had
far worse excesses than the above as examples. RFC-822
header-reassembly anyone?

I also used to get my intellectual rocks off on stuff like this until I
finally grew up (in my late 40s).

Dave.

SEE ALSO
    teco, assembler, qed.
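The 26-variation script does not have to be typed by hand. The message above suggests generating it with sed itself and leaves that as an exercise; the sketch below swaps in a small C generator instead. The name genrules and the buffer interface are assumptions for illustration, and the rule text is copied as posted, so it inherits the "untested" caveat.

```c
#include <stdio.h>

/*
 * Hypothetical generator, not from the thread: write the 52 per-letter
 * substitution rules into buf (capacity n) and return the number of
 * bytes the complete script needs, not counting the terminating NUL.
 * If the return value is >= n, the script was truncated.
 */
size_t
genrules(char *buf, size_t n)
{
	const char *lower = "abcdefghijklmnopqrstuvwxyz";
	const char *upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
	size_t used = 0;
	int i, k;

	for(i = 0; i < 26; i++){
		/* once the buffer is full, keep counting without writing */
		k = snprintf(used < n ? buf + used : NULL,
			used < n ? n - used : 0,
			"s/^%c/%c/\n"
			"s/^([A-Z][A-Za-z0-9]+[^ A-Za-z0-9_-]*)%c/\\1%c/\n",
			lower[i], upper[i], lower[i], upper[i]);
		if(k < 0)
			break;	/* formatting error: stop early */
		used += (size_t)k;
	}
	return used;
}
```

A 2048-byte buffer is more than enough for the whole script. Writing the result to a file and passing it with sed -f would be the obvious next step, on a sed that accepts unescaped parentheses and + as the scripts in this thread do.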
Re: [9fans] sed question (OT)
On Fri Oct 30 11:31:24 EDT 2009, dav...@mac.com wrote:
> You can do it, definitely.

well played!

- erik
Re: [9fans] sed question (OT)
Eris Discordia wrote:
> The script has a small bug one might say: it capitalizes the first two
> words on a line that are _not_ already capitalized. If one of the first
> two words is capitalized then the third will get capitalized.

Call me a Dinosaur, but - so long as it is ASCII or EBCDIC it is
relatively trivial to implement that in hardware AND NOT have the issue
of altering any but the first two words AND NOT have issues where there
is only one word or a numeral or punctuation or hidden/control
character rather than alpha.

Hint: Among other simple stuff, needs XOR capability.

'Dinosaur' 'coz the last time I did one of the key portions of it was
converting a Data Printer CT-1064 chaintrain from HP-3000 MKIII use to
work with an S-100 Z-80. That capitalized *every* alpha character, but
took just two 74-series ICs to replace a pair of lookup-table PROMs.

One would need to add logic to detect space or newline, set/unset a few
latches - not a lot more.

Could have built it in less time than this thread has been running...

;-)

Bill
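The XOR hint carries over to software directly. In ASCII, each lower-case letter differs from its upper-case form only in bit 5 (0x20), so a range check plus one XOR does the conversion. A minimal sketch, assuming ASCII only; the name asciiupper is made up for illustration:

```c
/*
 * Hypothetical sketch of the XOR hint: upper-case one ASCII character.
 * Anything outside 'a'..'z' passes through unchanged, so numerals,
 * punctuation and control characters are never altered.
 */
int
asciiupper(int c)
{
	if(c >= 'a' && c <= 'z')
		return c ^ 0x20;	/* flip bit 5: 'a' (0x61) becomes 'A' (0x41) */
	return c;
}
```

The range check plays the part of the gating logic; the space and newline detection and the "first two words" latches described above would sit in front of it.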
Re: [9fans] sed question (OT) (OT) (OT)
> Call me a Dinosaur, but - so long as it is ASCII or EBCDIC it is
> relatively trivial to implement that in hardware AND NOT have the issue
> of altering any but the first two words AND NOT have issues where there
> is only one word or a numeral or punctuation or hidden/control
> character rather than alpha.

You should have added an extra (OT) to the subject line. I'm adding a
few more just to be fair.

> Could have built it in less time than this thread has been running...

then what have you been doing all this time?

> Bill

Tim Newsham
http://www.thenewsh.com/~newsham/
Re: [9fans] sed question (OT) (OT) (OT) (OT) (OT)(OT)(OT)(OT)(OT)(OT)(OT)(OT)(OT)(OT)
Tim Newsham wrote:
>> Call me a Dinosaur, but - so long as it is ASCII or EBCDIC it is
>> relatively trivial to implement that in hardware AND NOT have the
>> issue of altering any but the first two words AND NOT have issues
>> where there is only one word or a numeral or punctuation or
>> hidden/control character rather than alpha.
>
> You should have added an extra (OT) to the subject line. I'm adding a
> few more just to be fair.
>
>> Could have built it in less time than this thread has been running...
>
> then what have you been doing all this time?
>
>> Bill
>
> Tim Newsham
> http://www.thenewsh.com/~newsham/

Honestly?

Trying to determine what a valid USE for capitalizing exactly the first
'n' words on a line might be. Especially as it calls for ONE or TWO but
never THREE or more.

Document 'sideheads', maybe?? - but those may not be limited to 2 words.

The need is as puzzling as some of the solutions..

Bill
Re: [9fans] sed question (OT)
This kind of problem is character processing, which I would argue is
C's domain. You can massage awk and sed to do the job for you, but at
least for me it's conceptually simpler to just bang out the following
C program:

#include <u.h>
#include <libc.h>
#include <bio.h>

#define isupper(r)    (L'A' <= (r) && (r) <= L'Z')
#define islower(r)    (L'a' <= (r) && (r) <= L'z')
#define isalpha(r)    (isupper(r) || islower(r))
#define isspace(r)    ((r) == L' ' || (r) == L'\t' \
            || (0x0A <= (r) && (r) <= 0x0D))
#define toupper(r)    ((r)-'a'+'A')

void
usage(char *me)
{
    fprint(2, "%s: usage\n", me);
}

void
main(int argc, char **argv)
{
    Biobuf in, out;
    int c, waswhite, nwords;

    ARGBEGIN{
    default:
        usage(argv[0]);
    }ARGEND;
    Binit(&in, 0, OREAD);
    Binit(&out, 1, OWRITE);
    waswhite = 0;
    nwords = 0;
    while((c = Bgetc(&in)) != Beof){
        if(isalpha(c))
        if(waswhite)
        if(nwords < 2){
            if(islower(c))
                c = toupper(c);
            nwords++;
        }
        if(isspace(c))
            waswhite = 1;
        else
            waswhite = 0;
        if(c == '\n')
            nwords = 0;
        Bputc(&out, c);
    }
    exits(0);
}

Noah
Re: [9fans] sed question (OT)
To capitalize the first letter of each line wouldn't this be enough?

    s/^./\u&/

L.
Re: [9fans] sed question (OT)
I'd be sore tempted to move the needful files into an environment where
I could use multiple passes of 'rpl' (or 'back in the day' BRIEF).

BFBI .. far less capable tools, perhaps - BUT by the time you've figured
out how to even *tell* awk or sed what to do, I'm working on some other
task...

'If at first you don't succeed - cheat'

YMMV,

Bill
Re: [9fans] sed question (OT)
> To capitalize the first letter of each line wouldn't this be enough?
>
> s/^./\u&/

; echo abc def | sed 's/^.\u&/'
sed: s command garbled: s/^.\u&/

- erik
Re: [9fans] sed question (OT)
On Thu, Oct 29, 2009 at 2:08 PM, erik quanstrom <quans...@quanstro.net> wrote:
>> To capitalize the first letter of each line wouldn't this be enough?
>>
>> s/^./\u&/
>
> ; echo abc def | sed 's/^.\u&/'
> sed: s command garbled: s/^.\u&/

i guess you missed the second slash
Re: [9fans] sed question (OT)
On Thu Oct 29 12:31:23 EDT 2009, iru.mu...@gmail.com wrote:
> On Thu, Oct 29, 2009 at 2:08 PM, erik quanstrom <quans...@quanstro.net> wrote:
>>> To capitalize the first letter of each line wouldn't this be enough?
>>>
>>> s/^./\u&/
>>
>> ; echo abc def | sed 's/^.\u&/'
>> sed: s command garbled: s/^.\u&/
>
> i guess you missed the second slash

now it is less helpful:

; echo abc def | sed 's/^./\u&/'
uabc def

- erik
Re: [9fans] sed question (OT)
On Thu, Oct 29, 2009 at 2:06 PM, Lorenzo Bolla <lbo...@gmail.com> wrote:
> To capitalize the first letter of each line wouldn't this be enough?
>
> s/^./\u&/
>
> L.

% echo rwrong | sed 's/^./\u&/'
urwrong
Re: [9fans] sed question (OT)
I forgot the 9. This works for GNU sed version 4.2.1

L.

On Thu, Oct 29, 2009 at 4:33 PM, Iruata Souza <iru.mu...@gmail.com> wrote:
> On Thu, Oct 29, 2009 at 2:06 PM, Lorenzo Bolla <lbo...@gmail.com> wrote:
>> To capitalize the first letter of each line wouldn't this be enough?
>>
>> s/^./\u&/
>>
>> L.
>
> % echo rwrong | sed 's/^./\u&/'
> urwrong
Re: [9fans] sed question (OT)
> Sorry, not really the place for such questions but...

Try stackoverflow.com. They delight in problems such as these.

> I am trying to capitalise the first two words on each line

I store the original line with h, and then pull it back out repeatedly
with G to mangle it. I got far enough to translate "first second ..."
to "First s" with this:

    h
    s/^(.).*/\1/
    y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
    G
    s/^.([^ ]+ ).*/\1/
    s/^.([^ ]+)$/\1/
    G
    s/^.[^ ]+ (.).*/\1/
    #y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
    #3y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/
    s/\n//g

There are a couple of problems.

(1) It doesn't handle the case with only one word on a line, because
it's hard to tell, later on, that I pulled out the single word once
already.

(2) I'd like to put in one of the commented-out y commands, but (2a)
the first uppercases the entire pattern space, and (2b) the second
refers to line 3 of the entire file, not line 3 of the pattern space.

> -Steve

Jason Catena