Peter Alcibiades wrote:
How do you do the following?

I have a series of lines which go like this

|  [record separator, new record starts]
AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed
CCC laboris nisi ut aliquip ex ea
DDD ut aliquip ex ea commodo
| [record separator]
AAA adipisicing elit, sed   [new record starts]

| is the record separator.

In the above, it's CCC that is repeated, but it could be any prefix. Also, CCC is next to its repetition; this will always be the case.

I want to go through the file. When I find a single prefix (like AAA), that line should be written to the output file. When the next line starts with the same prefix (as in the CCC case), I want to put both occurrences on the same line. So the desired output would be

AAA consectetur adipisicing elit, sed
BBB lorem ipsum
CCC consectetur adipisicing elit, sed CCC laboris nisi ut aliquip ex ea
DDD ut aliquip ex ea commodo
EOR
AAA adipisicing elit, sed

How do I detect a repetition of that sort and do this?

Here's a simple script that does what I think you want ....

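on mouseUp
  put "x" into tSeparator # should be TAB, but for testing that's inconvenient
  put field "F1" into lInput
  put "" into lOutput
  put "" into lastPrefix
  put "" into lastSeriesOfLines
  repeat for each line L in lInput
    put L into tLine # work on a copy so the loop variable itself is left untouched
    put char 1 to 3 of tLine into tPrefix
    # drop a separator that sits directly before a repeated prefix on the same line
    replace tSeparator & tPrefix with tPrefix in char 4 to -1 of tLine
    if tPrefix = lastPrefix then
      # same prefix as the previous line: join with a space, as in the desired output
      put space & tLine after lastSeriesOfLines
    else
      # new prefix: write out what has been collected so far, then start again
      # (nothing is written while the collected text is empty, which avoids a
      # leading blank line but also means blank input lines are dropped)
      # NB this assumes record separator comes on its own on the line
      if lastSeriesOfLines is not empty then put lastSeriesOfLines & CR after lOutput
      put tPrefix into lastPrefix
      put tLine into lastSeriesOfLines
    end if
  end repeat
  put lastSeriesOfLines & CR after lOutput
  put lOutput into field "F2"
end mouseUp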
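
One thing the script leaves alone is the record separator: the desired output shows EOR where the input has "|". A possible sketch for that step, assuming the separator really is a line containing nothing but "|" (the handler name markRecordSeparators is made up for illustration and is not part of the script above):

```livecode
function markRecordSeparators pText
  # hypothetical helper: swaps any line holding only "|" for "EOR"
  put empty into tResult
  repeat for each line tLine in pText
    if the number of words of tLine is 1 and word 1 of tLine is "|" then
      put "EOR" & cr after tResult
    else
      put tLine & cr after tResult
    end if
  end repeat
  delete char -1 of tResult # drop the trailing line break added by the loop
  return tResult
end markRecordSeparators
```

It could be applied to the collected output before it is displayed, for example: put markRecordSeparators(lOutput) into field "F2". If the separator lines carry other text, the test would need adjusting.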


A similar question: if the line is

CCC  adipisicing elit, sed TAB CCC  adipisicing elit, sed

How do you detect the multiple occurrence (I can do this with regex) and then write out, in place of the above expression (this I don't see how to do), the following:

CCC  adipisicing elit, sed CCC  adipisicing elit, sed

See the "replace" line of the script above. Note - I was using input fields, so the TAB was a nuisance, hence the introduction of tSeparator .... You may need to adjust for whether spaces are significant, etc.
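
Pulled out on its own, that idea might look roughly like the sketch below. It assumes the prefix is always the first three characters of the line and that a real TAB character sits directly in front of the repeated prefix; the handler name joinRepeatedPrefix is made up for illustration.

```livecode
function joinRepeatedPrefix pLine
  # hypothetical helper: assumes a 3-character prefix at the start of the line
  put char 1 to 3 of pLine into tPrefix
  # turn TAB-plus-prefix into space-plus-prefix, so the two halves
  # end up separated by a single space instead of a TAB
  replace tab & tPrefix with space & tPrefix in pLine
  return pLine
end joinRepeatedPrefix
```

To detect the repetition before rewriting, a test such as: if tab & tPrefix is in pLine ... should do. If there are spaces on either side of the TAB, the search string has to include them, otherwise nothing will match.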

-- Alex.
_______________________________________________
use-revolution mailing list
[email protected]
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-revolution
