Thanks -- the lines are in English 400 years old, hence the eccentric spelling.
The sorting you suggest is just what produced the first list, the raw
file: it is mixed up with lines without repetitions, and those
repetitions that occur are not ordered: so that in the whole file you
might find two repetitions here and three there, with a six between
them, and odd single lines getting in the way. Over 33,000 lines, this
becomes impossible to manage by hand.

'his', 'her' etc are not related or linked: they are different terms
and do not count as repetitions.

In the first group (repeating two terms) there are three separate
groups of terminations: 'to abate', 'gan abate' and 'by might'. In the
next group (repeating three terms) I accidentally left out the second
line in each case -- again there are supposed to be three different
pairs. And in the six repeating terms group, I missed out the first
line of the group.

Sorry for the missing lines, and thanks for your comments,
Julian

On Sun, Sep 30, 2012 at 7:20 PM, Tony Mechelynck <[email protected]> wrote:
> On 30/09/12 18:14, jbl wrote:
>>
>> Hi: The first difficulty with the problem I describe below is that I
>> don't know what the key terms would be to search Google accurately. I
>> have searched for a long time already. So if anyone could even tell me
>> what it is I am looking for I'd be very grateful.
>>
>> The problem is this: I have a large file of poetry in alphabetical
>> order sorted on the last term in each line, I post an excerpt in
>> sample1 below. I want to sort it so that lines that share, say, the
>> last two terms (on the right) with the last two terms of any other
>> line are in one group, those lines that share the last three terms in
>> another and so on up to seven places -- as in sample2 below.
>>
>> The first difficulty I have is getting the search terms into an :ex
>> command -- I need to find for each line whether there are any others
>> that match it to seven terminal places, then to six and so on. I could
>> do the simple locating with something cumbersome like this:
>>
>> map ö $BB2yW: p0ig/ A$/m0
>> map ä $ByW: p0ig/ A$/m0
>>
>> and so on up to seven places. But it must be possible to generalize
>> that somehow. What would be the general form of an expression for
>> finding the last 'x' words of a line in the same position somewhere
>> else in the file?
>>
>> Apart from the crudeness of the operation, the trouble would be
>> exporting (redirecting?) the results automatically and keeping the
>> exported results in order (as in sample2 below). And also how to
>> iterate it usefully through the whole file.
>>
>> If I started at the top of the raw file, iterating something like
>> these commands, checking each line and exporting the results to a
>> single file, the resulting file would be identical to the original
>> file. I need, I think, to be able to eliminate those lines which do
>> not share any terminations with any other lines. I think starting
>> (somehow) with seven places then six and down to two, would leave me
>> the non-sharing lines by themselves in the original file(?).
>>
>> But I'm not even sure what the strategic logic should be: exactly what
>> tasks should I be trying to get the program to perform? The process
>> needs to be automated because the file is 33,000 lines long. As I say,
>> if someone could tell me what key terms, what types of operations, I
>> should be looking for on Google, it would help a great deal.
>>
>> Many thanks for any help, JBL
>> Vim 7.x Debian/Win7
>>
>> Here are the samples, one before (from the raw file) and one after (as
>> I'd like the whole thing organized).
>>
>> Raw Lines
>> 6.4.30.7 All these our ioyes and all our blisse abate
>> 2.12.15.9 And after them did driue with all her power and might
>> 3.9.14.4 And both full liefe his boasting to abate
>> 6.6.27.9 And layd at him amaine with all his will and might
>> 6.1.38.2 At once did heaue with all their powre and might
>> 6.1.12.7 But through misfortune which did me abase
>> 5.11.57.9 Did set vpon those troupes with all his powre and might
>> 6.2.26.5 For deare affection and vnfayned zeale
>> 3.2.13.6 For hardy thing it is to weene by might
>> 4.9.6.9 He her vnwares attacht and captiue held by might
>> 6.1.32.9 He spide come pricking on with al his powre and might
>> 6.6.31.9 He stayd his second strooke and did his hand abase
>> 3.8.51.6 Mote not mislike you also to abate
>> 3.8.28.7 Ne ought your burning fury mote abate
>> 1.7.35.1 No magicke arts hereof had any might
>> 5.8.46.8 She at her ran with all her force and might
>> 1.10.2.8 She cast to bring him where he chearen might
>> 3.7.35.3 That at the last his fiercenesse gan abate
>> 4.8.17.8 That her inburning wrath she gan abate
>> 1.10.47.7 That hill they scale with all their powre and might
>> 4.6.3.4 The armes he bore his speare he gan abase
>> 5.9.39.4 To all assayes; his name was called Zele
>> 2.9.7.4 To serue that Queene with all my powre and might
>> 2.1.26.7 When suddenly that warriour gan abace
>> 6.12.23.9 Where he him found despoyling all with maine and might
>> 1.5.1.8 With greatest honour he atchieuen might
>> 4.8.1.7 With sufferaunce soft which rigour can abate
>> 5.5.30.1 With that she turn'd her head as halfe abashed
>>
>> Sorted lines
>> ---Lines not repeating final term (=Unique lines):
>> FQ 2.1.26.7 When suddenly that warriour gan abace
>> FQ 5.5.30.1 With that she turn'd her head as halfe abashed
>> FQ 6.2.26.5 For deare affection and vnfayned zeale
>> FQ 5.9.39.4 To all assayes; his name was called Zele
>>
>> ---Lines repeating final term only:
>> FQ 6.1.12.7 But through misfortune which did me abase
>> FQ 6.6.31.9 He stayd his second strooke and did his hand abase
>> FQ 4.6.3.4 The armes he bore his speare he gan abase
>> FQ 3.8.28.7 Ne ought your burning fury mote abate
>> FQ 4.8.1.7 With sufferaunce soft which rigour can abate
>> FQ 6.4.30.7 All these our ioyes and all our blisse abate
>> FQ 1.7.35.1 No magicke arts hereof had any might
>> FQ 1.10.2.8 She cast to bring him where he chearen might
>> FQ 1.5.1.8 With greatest honour he atchieuen might
>> FQ 6.6.27.9 And layd at him amaine with all his will and might
>
> abase == abate == might? I guess I'm too stupid.
>
>>
>> ---Lines repeating final two terms:
>> FQ 3.9.14.4 And both full liefe his boasting to abate
>> FQ 3.8.51.6 Mote not mislike you also to abate
>> FQ 4.8.17.8 That her inburning wrath she gan abate
>> FQ 3.7.35.3 That at the last his fiercenesse gan abate
>> FQ 3.2.13.6 For hardy thing it is to weene by might
>> FQ 4.9.6.9 He her vnwares attacht and captiue held by might
>>
>> ---Lines repeating final three terms:
>> FQ 5.8.46.8 She at her ran with all her force and might
>> FQ 6.12.23.9 Where he him found despoyling all with maine and might
>> FQ 2.9.7.4 To serue that Queene with all my powre and might
>
> force == maine == powre (sic) ? You will have to explain me that
>
>
>> ........
>>
>> ---Lines repeating final six terms:
>> FQ 2.12.15.9 And after them did driue with all her power and might
>> FQ 5.11.57.9 Did set vpon those troupes with all his powre and might
>> FQ 6.1.32.9 He spide come pricking on with all his powre and might
>> FQ 1.10.47.7 That hill they scale with all their powre and might
>> FQ 6.1.38.2 At once did heaue with all their powre and might
>
> I suppose "powre" is four times a typo.
> her == his == their? Or are there three different sets of lines, one of them
> a singleton?
>
>>
>
> This sounds like a "decorate - sort - undecorate" problem:
> 1. Put each line into "sortable" order (in this case, reverse the order of
> the terms, so that the last term comes at the start of the line, then one
> space, then the last but one, then one space, etc.);
> 2. Sort
> 3. Put the lines back like they used to be (i.e., reverse the order of the
> terms again).
>
> Note that no "dumb" computer will be able to find out that "his", "her" and
> "their" are to be sorted together, unless you somehow program it into the
> logic of your steps 1 and 3.
>
>
> Best regards,
> Tony.
> --
> "I'd love to go out with you, but I'm taking punk totem pole carving."
>
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php

--
J.B. Lethbridge
(Gen. Ed. The Manchester Spenser)
English Seminar
Tuebingen University
Wilhelmstrasse 50
Tuebingen 72074
Germany
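
For anyone following the thread, here is a minimal sketch of the
"decorate - sort - undecorate" idea from the quoted message, written in
Python rather than Vim script and not taken from either message. It
assumes that every line starts with one reference number (such as
6.4.30.7) followed by the verse, and that terms must match exactly, so
'his' and 'her', or 'power' and 'powre', count as different terms. The
names tail_key, shared and group_by_shared_tail are made up for
illustration.

```python
from collections import defaultdict


def tail_key(line):
    """Decorate: the verse's words in reverse order, reference number dropped."""
    _ref, *words = line.split()
    return words[::-1]


def shared(a, b):
    """Count how many leading items two reversed keys have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def group_by_shared_tail(lines):
    """Map N -> lines whose longest tail shared with any other line is N terms.

    After sorting on the reversed words, the line that shares the longest
    tail with a given line is always one of its two neighbours, so it is
    enough to compare each line with the one before and the one after.
    """
    decorated = sorted((tail_key(line), line) for line in lines)
    groups = defaultdict(list)
    for i, (key, line) in enumerate(decorated):
        best = 0
        if i > 0:
            best = max(best, shared(key, decorated[i - 1][0]))
        if i + 1 < len(decorated):
            best = max(best, shared(key, decorated[i + 1][0]))
        # Undecorate: store the original, untouched line.
        groups[best].append(line)
    return dict(groups)


if __name__ == "__main__":
    sample = [
        "3.9.14.4 And both full liefe his boasting to abate",
        "3.8.51.6 Mote not mislike you also to abate",
        "3.8.28.7 Ne ought your burning fury mote abate",
        "2.1.26.7 When suddenly that warriour gan abace",
    ]
    for n, members in sorted(group_by_shared_tail(sample).items()):
        print(f"---Lines sharing final {n} term(s):")
        for member in members:
            print(member)
```

Group 0 then corresponds to the unique lines, group 1 to the lines
repeating the final term only, and so on. Within a group the lines stay
in reversed-word order, so lines ending in 'to abate' sit together and
apart from those ending in 'by might'. Only one sort and one pass are
needed, which should be manageable for 33,000 lines.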
