Re: Jane Austen's peculiarity
I am wondering whether this is a relevant point or not, but here goes anyway:

I am looking for occurrences of BE + Past Participle in non-passive, intransitive constructions, with examples such as: *I am arrived*, *He is become*, *She is returned* in publicly available versions of English literature 'standards' [no, not going to get distracted by what constitutes a canon here]. These constructions were displaced over the period 1750-1850 by *I have arrived*, *He has become* and *She has returned* respectively.

This is for thinking up reasons why this grammatical change may have taken place (or a syntactic rather than grammatical change), and whether English writers (and non-English writers such as Walter Scott, who wrote in English contemporaneously) were inherently conservative. There is a possibility that Dr Snezha Tsoneva-Mathewson (my wife) could construct a model within the framework of Cognitive Grammar to explain how this change may have taken place.

This should be relatively easy using LiveCode, and it is, except for one thing: searching through an *html* text loaded into a textField for the relevant constructions is *far, far slower* than doing the same thing by opening the documents in Firefox and doing a 'find' operation. Of course one cannot put the results into a 'sexy' colour-coded textField when one uses Firefox.

Now, possibly I am being a bit foolish expecting LiveCode to crunch its way through textFields (even if, as some helpful people on this list suggested, /they are loaded into variables/) looking for strings faster than Firefox (a dedicated html-thing) does.

Richmond.
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
Re: Jane Austen's peculiarity
Hi Richmond,

in LC 7 you can use "sentence" as a text unit. Since the sentence seems to be your context, it is tempting to use that.

Here is an adaptation of the is/was handler that returns the sentence number and the sentence. Easy to read, easy to maintain and reasonably fast. Just paste the code into a copy of your button "is / are".

---
on mouseUp
   lock screen
   put the milliseconds into t
   put empty into fld "COOKED2"
   put empty into fld "STARTT"
   put empty into fld "STOPT"
   put "started :" && the long time into fld "STARTT"
   put fld "TEKST" into TEKST
   put fld "WERBS" into WERBS
   repeat for each line aVerb in WERBS
      put "is" && aVerb into tSearch1
      put "are" && aVerb into tSearch2
      put 1 into tCounter
      repeat for each sentence aSentence in TEKST
         if aSentence contains tSearch1 then
            put tCounter & space & aSentence & cr after tCollect
         end if
         if aSentence contains tSearch2 then
            put tCounter & space & aSentence & cr after tCollect2
         end if
         add 1 to tCounter
      end repeat
   end repeat
   put tCollect & cr & tCollect2 into field "COOKED2"
   put "finished :" && the long time into fld "STOPT"
   -- put the milliseconds - t
   unlock screen
end mouseUp
---

Kind regards
Bernd
Re: Jane Austen's peculiarity
On 10/08/15 23:51, hh wrote:
> Richmond, this was your last post to this thread before mine.
>> My current version is here:
>> https://www.dropbox.com/sh/ja47l87gg87sn0q/AAAIj99kEQVOb8ev3jz8C5ORa?dl=0
>> File : TA.zip
>> play with it, rip it to pieces, improve it: go on, I dare you :)
>> Richmond.
> So I downloaded this stack and wrote a script that implemented three ideas, two by other LCoders, one by me. Because you graciously ignored these ideas, I was simply curious about their effects on speed and selectivity (by using trueWords). I didn't play with your stack, I didn't rip it into pieces, but somehow improved it a bit in the sense of using effectively some available features of LC 7. It was no dare, I had fun. And you had obviously fun too, what a great speech! Who dares wins, you --- and me.

I am most gratified to find that someone actually read and enjoyed one of my rants.

> Hermann
> p.s. Shouldn't the opening of your speech read "I was _achieved_"? ;-)

Well it SHOULD (perhaps) read "I have achieved", but at the point I wrote that I had not put the colourisation scripts into the relevant buttons, so the action had not been completed :)

It could not read "I was achieved" in the way Jane Austen was using that sort of structure because 'achieve' is a TRANSITIVE verb.

> Richmond wrote:
>> I am achieving what I initially set out to achieve, and with far less code than yours, so have no intention of changing anything.
>> <snip>
Re: Jane Austen's peculiarity
Of course I couldn't resist a tinker. I too am into text manipulation/searching and wondered how I would go about this.

I looked at the repeat loops and realised they would run much faster if they were inverted, as I am sure the list of verbs would be shorter than the lines of text being searched. I also wanted to use a "repeat for each" construct as this is usually orders of magnitude faster. But this meant I needed the line count, and adding a counter seemed counter productive. So I settled on using lineoffset.

Here was my go...

on mouseUp
   put empty into fld "COOKED"
   put empty into fld "STARTT"
   put empty into fld "STOPT"
   put empty into lCooked1
   put "started :" && the long time into fld "STARTT"
   put the milliseconds into st
   put fld "TEKST" into TEKST
   put fld "WERBS" into WERBS
   put 0 into acounter
   put the number of lines of TEKST into numlines
   repeat for each line KWERBS in WERBS
      put "was" && KWERBS into FRAZE
      put "were" && KWERBS into FRAZE2
      put 0 into loffesta
      put 0 into loffestb
      put 1 into lcounta
      put 1 into lcountb
      repeat while lcounta <> 0
         -- lineoffset returns a line number relative to the lines skipped
         put lineoffset(FRAZE,TEKST,loffesta) into lcounta
         if lcounta = 0 then
            exit repeat
         end if
         put lcounta + loffesta into thelinea
         put thelinea && ":" && line thelinea of TEKST & cr after lCooked1
         -- skip up to the absolute line just found, so that a verb with
         -- several hits does not keep finding the same line
         put thelinea into loffesta
      end repeat
      repeat while lcountb <> 0
         put lineoffset(FRAZE2,TEKST,loffestb) into lcountb
         if lcountb = 0 then
            exit repeat
         end if
         put lcountb + loffestb into thelineb
         put thelineb && ":" && line thelineb of TEKST & cr after lCooked1
         put thelineb into loffestb
      end repeat
   end repeat
   put the number of lines of lCooked1 && "found"
   put lCooked1 into fld "COOKED"
   put "finished :" && the long time into fld "STOPT"
   put the milliseconds into nd
   put nd - st into fld "TIMET"
end mouseUp

I haven't tried returning to the original repeat order to see if this was faster, but running the above on Richmond's sample stack for the WAS/WERE case delivered a result of three lines:

2663 : officers, who in comparison with the stranger, were become stupid,
731 : was returned in due form. Miss Bennet's pleasing manners grew on the
4116 : were returned, and to lament over his absence from the Netherfield ball.

in 89 msec on my Mac running LC7.1Dp1.

I was then going to examine colourising the found chunks when I realised that the supplied text had line breaks within each paragraph. This means none of the proposed solutions (including Richmond's own) will find the desired phrase if it falls across one of these line breaks.

For my solution using lineoffset this is a dead end WHILE these line breaks within a paragraph remain. For the other solutions a simple expedient is to increase the number of FRAZEs to four...

   put "was" && KWERBS into FRAZE
   put "was" & cr & KWERBS into FRAZE2
   put "were" && KWERBS into FRAZE3
   put "were" & cr & KWERBS into FRAZE4

This addition makes the extra FRAZEs two lines long and thus not valid arguments for a lineoffset function. Or so I thought. However, given the unpredictability of the formatting of the text, this was a much too simplistic solution: it breaks down where paragraphs are indented using spaces!

So, keeping the formatting as read in is problematic without knowing the formatting used. But if the focus is the actual text, then perhaps the fancy formatting is not important. Processing the text BEFORE searching, so as to remove embedded line breaks and space padding, allows my original code to work fine.

Inserting the following before the REPEATs does the trick (at least with the example text):

   replace return with "^*" in TEKST
   put "\s+" into lmultispace
   put replacetext(TEKST,lmultispace," ") into TEKST
   replace "^*^*" with return in TEKST
   replace "^*" with " " in TEKST
   replace return with return & return in TEKST

The only downside is that the time to execute went from 89 msec to 616 msec. Your mileage may vary.

NOTE: My method does not identify multiple instances of the FRAZE within a single line; however, once it is found in a line it would be simple to see if it occurred again.

Thanks for the diversion Richmond.

James
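The line-break problem James describes can also be approached without rewriting the text first. What follows is only a sketch and was not posted in the thread: it assumes a regular expression in which \s+ stands for any run of spaces, tabs or returns between the auxiliary and the verb, and the handler name findAcrossBreaks and its parameters are made up for the illustration. It returns one "start,end" pair of character positions per hit, which could then feed a colourising routine.

```livecode
-- Hypothetical helper (not from the thread): find pAux followed by pVerb
-- even when a line break or indentation separates the two words.
function findAcrossBreaks pText, pAux, pVerb
   local tPattern, tFrom, tS, tE, tHits
   -- (?i) = ignore case, \b = word boundary, \s+ = spaces, tabs or returns
   put "(?i)(\b" & pAux & "\s+" & pVerb & "\b)" into tPattern
   put 1 into tFrom
   -- matchChunk fills tS and tE with the start and end of the bracketed
   -- group, counted from the start of the string it was handed
   repeat while matchChunk(char tFrom to -1 of pText, tPattern, tS, tE)
      put (tFrom + tS - 1) & comma & (tFrom + tE - 1) & cr after tHits
      put tFrom + tE into tFrom -- carry on after this hit
   end repeat
   delete last char of tHits
   return tHits
end findAcrossBreaks
```

It could be called as findAcrossBreaks(fld "TEKST", "were", "returned"). Handing the tail of the text to matchChunk on every pass copies a lot of characters, so on a large corpus this is likely to be slower than the lineoffset approach; the gain is that the original formatting of the text stays untouched.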
Re: Jane Austen's peculiarity
On 11/08/15 17:31, James Hale wrote:
> <snip>
> This means none of the proposed solutions (including Richmond's own) will find the desired phrase if it falls across one of these line breaks.
> <snip>

Wow! Very valuable point: thanks.

Richmond.
Re: Jane Austen's peculiarity
Hi all,

Richmond, you could give this a try in your fine prepared stack. The following uses

= an array [one of the proposals above]
= trueWords [one of the proposals, needs LC 7]
= multichar-itemDelimiters [one of the proposals above, needs LC 7]

For each of your 6 opening words "were ","was ","is ","are ","has ","have " it outputs the frequency counts of the first words that follow them and lists the item numbers of these occurrences, with each of the 6 words used as itemdelimiter (actually word & space).

For example, in fld "COOKED were" (created by the script) we get:

were by [tab] 3 122 375 413

which means there are 3 occurrences of "were by" and these are at trueword 1 of items 122, 375 and 413 if "were " is the itemdelimiter. [Use of trueWord collects for example "by" and "by?" and "by," and "by!" in one category "by".]

*** This takes 1 sec, in sum for all 6 opening words from above! ***
*** So this is IMHO a true demo of the power of some LC 7 features ***

A click on a line of one of the 6 output fields colourizes (yellow backColour) exactly the occurrences in fld "TEKST" and cycles through them when you hit the enterKey.

What to do?
[1] Make a new button with the following script part 1.
[2] Add the last part of the script to your card script (part 2).

Have fun, it takes 5 minutes to test all this with your stack ...

Hermann

## part 1 for button
on mouseUp
   put the millisecs into strt
   put "started :" && the long time into fld "STARTT"
   put empty into fld "STOPT"
   lock screen; lock messages -- speeds up
   set cursor to watch
   put 1 into KTEKST; put 1 into KCOOK
   put fld "WERBS" into WERBS; delete last line of WERBS
   put fld "TEKST" into TEKST
   delete char 1 to offset("PRIDE AND PREJUDICE",TEKST)-1 of TEKST
   -- watch the space after each item, no space before each item
   put "were ,was ,is ,are ,has ,have " into openings
   -- start "be lazy"
   if there is no fld "STOPT2" then
      clone fld "STOPT"
      set name of last fld to "STOPT2"
      set left of fld "STOPT2" to the left of fld "STOPT"
      set top of fld "STOPT2" to 40 + the top of fld "STOPT"
   end if
   repeat with j=1 to 6
      put ("COOKED" && word 1 of item j of openings) into F
      if there is no field F then
         clone fld "Cooked"
         set name of last fld to F
         set rect of fld F to (0,0,275,150)
         set topleft of fld F to \
               (item j of "95,95,380,380,670,670", item j of "590,740,590,740,590,740")
         set tabstops of fld F to 128
      end if
   end repeat
   -- end "be lazy"
   repeat for each item W in openings
      put ("COOKED" && word 1 of W) into F
      put empty into RM; put empty into RM1
      set itemdelimiter to W; put TEKST into TEKST2
      delete item 1 of TEKST2; put 1 into X
      repeat for each item I in TEKST2
         put W & trueword 1 of I into Y -- important is trueword, compare to word
         add 1 to word 1 of RM[Y]
         add 1 to X; put space & X after RM[Y]
      end repeat
      -- write these 'keys' at top
      repeat for each line L in WERBS
         put RM[W & L] into wL
         if wL is empty then put 0 into wL
         put cr & W & L & tab & wL after RM1
      end repeat
      combine RM by cr and tab
      put W & ": diff cases" & tab & (the number of lines of RM) & \
            cr & RM1 & cr & cr & RM into fld F
      set textstyle of line 1 of fld F to "bold"
      set textstyle of line 3 to 2+(the num of lines of WERBS) of fld F to "italic"
      set hilitedLines of fld F to 1
      set itemdelimiter to comma
   end repeat
   put "finished :" && the long time into fld "STOPT"
   put (the short name of me) & ":" && (the millisecs - strt) && "ms" into fld "STOPT2"
   unlock screen; unlock messages
end mouseUp

## part 2 for card script
local toFind

on mouseUp
   if "cooked" is in the short name of the target then
      set cursor to watch; lock screen; lock messages
      put length(fld "TEKST") into L
      set textcolor of char 1 to L of fld "TEKST" to 0,0,0
      set backColor of char 1 to L of fld "TEKST" to 255,255,255
      put the value of the clickline into cL
      colorWords cL
      unlock screen; unlock messages
   end if
end mouseUp

on colorWords x
   set itemdel to tab
   put item 2 of x into wrds
   put 1 + word 1 of wrds into N
   set itemdel to ((trueword 1 of x) & space)
   repeat with j=2 to N
      set backcolor of trueword 1 of item (word j of wrds) of fld "TEKST" to 255,255,0
   end repeat
   put "find whole" && quote & (trueword 1 to 2 of x) & quote && \
         "in fld" && quote & "TEKST" & quote into toFind
   select before trueword 1 of item (word N of wrds) of fld "TEKST" -- the last hit
   set itemdel to comma
   do toFind
end colorWords

on enterInField
   do toFind
end enterInField
-- end of scripts
Re: Jane Austen's peculiarity
Richmond, this was your last post to this thread before mine.

> My current version is here:
> https://www.dropbox.com/sh/ja47l87gg87sn0q/AAAIj99kEQVOb8ev3jz8C5ORa?dl=0
> File : TA.zip
> play with it, rip it to pieces, improve it: go on, I dare you :)
> Richmond.

So I downloaded this stack and wrote a script that implemented three ideas, two by other LCoders, one by me. Because you graciously ignored these ideas, I was simply curious about their effects on speed and selectivity (by using trueWords). I didn't play with your stack, I didn't rip it into pieces, but somehow improved it a bit in the sense of using effectively some available features of LC 7.

It was no dare, I had fun. And you had obviously fun too, what a great speech! Who dares wins, you --- and me.

Hermann

p.s. Shouldn't the opening of your speech read "I was _achieved_"? ;-)

Richmond wrote:
> I am achieving what I initially set out to achieve, and with far less code than yours, so have no intention of changing anything. I, also, am a lucky sort of chap insofar as I don't really mind that much if my stack takes 3 days to work its way through a corpus . . . I can go and do some teaching, read a book, cook some food, go for a bike ride, talk to my wife, play with my cats, and so on.
>
> That has ALWAYS been my approach to programming for one simple reason: working every holiday for very many years indeed on a farm on an island I had to sort out broken balers, tractors and so on. Now proper spares had to come, on a ferry, at a vast transportation overhead, from the mainland of Scotland. We could not afford that, so we fossicked (lovely verb) for whatever would do the job in the 'graveyard' of broken tractors, cars, stuff we had picked up from the local dump, and so on. Every single time we got our accursed baler to bale the straw and the hay, we got the cotter pins we needed to connect the tractor to the plough, harrow, muck-spreader or whatever; never very elegant, but they worked. In fact my younger son was on that farm just 8 days ago and was shown some of my repair work by the farmer's son (the farmer is long dead); still functional after 25 years.
>
> I have, just, worked out a way to colourise the items I want, and, while churning through some socking great corpus would take days, I only need it to colourise the sentences the previous routine has extracted, so that won't take that long.
>
> You, if it really seems such a good idea (and is it?) are more than welcome to download my stack
> https://www.dropbox.com/sh/ja47l87gg87sn0q/AAAIj99kEQVOb8ev3jz8C5ORa?dl=0
> File: TA.zip
> and mess around with the script to your heart's content.
>
> AND, while we are talking about time-consuming exercises: having put 4 hours of work into the thing, that seems, already, a bit more than the thing deserves, as I am not interested in winning the Tour de France, simply extracting some data from a million word corpus with absolutely no deadline at all unless I choose to impose one. The results MAY get rolled into a paper my wife and I are THINKING of writing for an academic conference . . . .
>
> Almost ALL the stacks I have thrown out into the public domain in the last 6 months have come back to me with comments about how my code is clunky, inefficient, and so forth; and I would not doubt for a minute that that is probably true.
>
> HOWEVER, as far as I am concerned there is one enormous advantage about my code above thine, or anybody else's; while thy code and the code of many others is probably more efficient, more clever and gets things done more quickly, I don't understand the finer points of it, while I understand how my code works 100% because it was written by me, follows my logic, and does what I require it to do.
>
> It is always entertaining and instructive to see how people react to my code, and I often learn a lot from their reactions (not least about human psychology), including new coding tricks - but there always comes a point where the burden of having to plough through other people's code (reflecting the way their minds work) feels like too much in comparison with anything I might learn from it.
>
> ---
>
> I also suspect that very many people share my interest in getting the job done rather than producing posh code. RunRev claim, on their website, that one can learn to code quickly. With Livecode one can learn how to code RELATIVELY quickly, up to a certain point; and many people who are not programmers qua programmers should be attracted by that because they have probably got other things to do other than JUST program.
>
> I am, at least to a certain extent, one of those people, as computer programming is not the hinge on which my life rotates (and this became extremely clear just recently when I spent 3 weeks driving round Europe without access to any programming facilities at all), and that is why I may come
Re: Jane Austen's peculiarity
On 10/08/15 22:19, hh wrote:
> Hi all, Richmond, you could give this a try in your fine prepared stack:
> <snip>

I am achieving what I initially set out to achieve, and with far less code than yours, so have no intention of changing anything. I, also, am a lucky sort of chap insofar as I don't really mind that much if my stack takes 3 days to work its way through a corpus . . . I can go and do some teaching, read a book, cook some food, go for a bike ride, talk to my wife, play with my cats, and so on. That
Re: Jane Austen's peculiarity
My current version is here:

https://www.dropbox.com/sh/ja47l87gg87sn0q/AAAIj99kEQVOb8ev3jz8C5ORa?dl=0

File : TA.zip

play with it, rip it to pieces, improve it: go on, I dare you :)

Richmond.
Re: Jane Austen's peculiarity
On 09/08/15 23:03, Richard Gaskin wrote:
> Richmond wrote:
>> Just by loading the textFields into variables the whole script runs considerably faster
> If you did the same with the output it'd get even faster.

Hmm:

on mouseUp
   put empty into fld "COOKED"
   put empty into fld "STARTT"
   put empty into fld "STOPT"
   put "started :" && the long time into fld "STARTT"
   put 1 into KTEKST
   put 1 into KCOOK
   put fld "TEKST" into TEKST
   put fld "WERBS" into WERBS
   repeat until line KTEKST of TEKST contains "finalSolution666"
      put line KTEKST of TEKST into LTEKST
      put 1 into KWERBS
      repeat until line KWERBS of WERBS contains "finalSolution666"
         put "was" && line KWERBS of WERBS into FRAZE
         put "were" && line KWERBS of WERBS into FRAZE2
         if LTEKST contains FRAZE then
            put KTEKST && ":" && LTEKST into line KCOOK of COOKED ---!!!
            add 1 to KCOOK
         end if
         if LTEKST contains FRAZE2 then
            put KTEKST && ":" && LTEKST into line KCOOK of COOKED ---!!!
            add 1 to KCOOK
         end if
         add 1 to KWERBS
      end repeat
      add 1 to KTEKST
   end repeat
   put COOKED into fld "COOKED" ---!!!
   put "finished :" && the long time into fld "STOPT"
end mouseUp

The modifications, marked thus: ---!!! crashed Livecode.

Richmond.
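Richard's suggestion of keeping the output in a variable can be sketched as below. This is an illustration only, not a script from the thread, and there is no claim that it addresses the crash reported above. It assumes the field names of Richmond's stack ("TEKST", "WERBS", "COOKED"); the variable names are invented. It differs from the script above in two ways: "repeat for each" replaces the numbered line lookups, and each hit is appended with "put ... after" so the field is written only once at the end.

```livecode
on mouseUp
   local tText, tVerbs, tLine, tVerb, tLineNum, tOut
   put fld "TEKST" into tText
   put fld "WERBS" into tVerbs
   put 0 into tLineNum
   repeat for each line tLine in tText
      add 1 to tLineNum
      repeat for each line tVerb in tVerbs
         if tVerb is empty then next repeat
         if (tLine contains ("was" && tVerb)) or \
               (tLine contains ("were" && tVerb)) then
            -- append to the variable; nothing is written to a field here
            put tLineNum && ":" && tLine & cr after tOut
            exit repeat -- report each line of text once only
         end if
      end repeat
   end repeat
   delete last char of tOut -- drop the trailing return
   put tOut into fld "COOKED" -- a single field write at the end
end mouseUp
```

Because the loops stop at the end of the text and of the verb list by themselves, the "finalSolution666" sentinel lines are not needed for termination here; if they are still present they are simply treated as ordinary lines that never match.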
Re: Jane Austen's peculiarity
Call my code clunky, clumsy and slow, and that won't really fuss me: BUT what does fuss me is why this produces NO results when it analyses a text imported from an RTF file, BUT does work when the text is either manually edited or imported from a text file.

on mouseUp
   put empty into fld "COOKED"
   put 1 into KTEKST
   put 1 into KCOOK
   repeat until line KTEKST of fld "TEKST" contains "finalSolution666"
      put line KTEKST of fld "TEKST" into fld "LYNE"
      put 1 into KWERBS
      repeat until line KWERBS of fld "WERBS" contains "finalSolution666"
         put "was" && line KWERBS of fld "WERBS" into fld "FRAZE"
         if fld "LYNE" contains fld "FRAZE" then
            put fld "LYNE" into line KCOOK of fld "COOKED"
            add 1 to KCOOK
         end if
         add 1 to KWERBS
      end repeat
      add 1 to KTEKST
   end repeat
end mouseUp

This is a big Pain-in-the-bum as texts imported from Text files do NOT generally come into LC textFields with lineBreaks.

Richmond.
Re: Jane Austen's peculiarity
To come back to Richmond's opening post, one could think about using the following, avoiding complex offset constructions.

First collect word 1 of each item of a string (not too large, size adapted to your machine), where the itemdelimiter is "were " or any other word (conditional) that filters a targeted phrasing in or out. Strings as itemdelimiters are possible in LC 7 (one may also use split and combine with such delimiters) and this is pretty fast.

This could narrow the lists and cases you have to investigate further.

Hermann

> Sun Aug 9 01:44:36 CEST 2015 by Alex Tweedly.
> I think I'd agree that a conditional clause should be required (could it be any of 'if', 'unless', 'whether', ...)?
> Otherwise, you'd be finding false positives like:
> I gave two shillings to my brother and last night they _were returned_ to me.
> -- Alex.

>> Sat Aug 8 18:42:51 CEST 2015 by Richmond.
>> Jane Austen [amongst others] uses an interesting type of grammatical construction of this sort:
>> "After breakfast, the girls walked to Meryton to inquire if Mr. Wickham _were returned_, and to lament over his absence from the Netherfield ball." Pride and Prejudice.
>> I would like to analyse a million word corpus that I have been granted access to for this type of construction. However, I don't want to find examples of only 'were returned', but all examples of were + infinitive / preterite / past participle and, presumably, for that I shall have to use wildcards . . . OR ???
>> Richmond.
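Hermann's idea of using the opening word as itemdelimiter and collecting the first word of each item can be sketched as follows. This is a minimal illustration under stated assumptions, not Hermann's own script: the function name wordsAfter and its parameters are invented, and it relies on the multi-character itemDelimiter and trueWord chunks that his message says need LC 7.

```livecode
-- Hypothetical helper: count the words that directly follow pOpening.
-- Returns an array keyed by the following word, e.g. tCounts["returned"].
function wordsAfter pText, pOpening
   local tItem, tWord, tCounts
   -- the itemDelimiter set here only applies inside this handler
   set the itemDelimiter to pOpening & space
   -- everything before the first "were " is not a hit, so drop it
   delete item 1 of pText
   repeat for each item tItem in pText
      -- trueWord ignores attached punctuation such as "returned,"
      put trueWord 1 of tItem into tWord
      if tWord is not empty then add 1 to tCounts[tWord]
   end repeat
   return tCounts
end wordsAfter
```

A call such as wordsAfter(fld "TEKST", "were") would give an array whose keys are the candidate participles. As Alex points out above, a count alone cannot separate the Austen construction from ordinary passives, so the keys are only a shortlist for closer reading. The delimiter is a plain string, so it also splits inside any longer word that happens to end in the opening word.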
Re: Jane Austen's peculiarity
On 09/08/15 16:47, Peter M. Brigham wrote:
> On Aug 9, 2015, at 8:46 AM, Richmond wrote:
>> BUT what does fuss me is why this produces NO results when it analyses a text imported from an RTF file, BUT does work when the text is either manually edited or imported from a text file.
> Maybe if you do this?
>
>    set the RTFtext of the templatefield to RTFtextFromFile
>    put the text of the templatefield into fileText
>    reset the templatefield
>
> and then operate on the variable fileText. That is, use the engine to convert the RTFtext to plain text.
> -- Peter

Hmm, that's a thought. However, I'm doing just fine with HTML text instead :)

Now, if you happen to know of a list of English intransitive verbs . . . .

Richmond "Nuttier than a fruitcake" Mathewson.
Re: Jane Austen's peculiarity
On 09/08/15 19:21, Richard Gaskin wrote:
> Richmond wrote:
>> Now, if you happen to know of a list of English intransitive verbs . . . .
>
> https://en.wiktionary.org/wiki/Category:English_intransitive_verbs

Yes: I looked there: but I am a lazy toad, and the thought of typing all those words into a field makes me want to go and work as a pole dancer!

So, all you Use-List users who, for some funny reason (!!!), cannot stomach the idea of Richmond as a pole-dancer in an erotic bar near you, you know what you have to do: get typing :)

> There's also WordNet, but while it does include word sense I don't recall if it gets as specific as to the type of verb. Besides, it's even more cumbersome to parse than scraping those Wikipedia pages, so hopefully those will be helpful.

"Scraping" is, indeed, the word.

I was, perhaps rather naively, hoping that some hard-working person somewhere had already assembled a nice, tidy text file of intransitive verb forms . . .

Richmond.
Re: Jane Austen's peculiarity
On Aug 9, 2015, at 8:46 AM, Richmond wrote:
> BUT what does fuss me is why this produces NO results when it analyses a text imported from an RTF file, BUT does work when the text is either manually edited or imported from a text file.

Maybe if you do this?

   set the RTFtext of the templatefield to RTFtextFromFile
   put the text of the templatefield into fileText
   reset the templatefield

and then operate on the variable fileText. That is, use the engine to convert the RTFtext to plain text.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
Re: Jane Austen's peculiarity
Richmond wrote:
> Now, if you happen to know of a list of English intransitive verbs . . . .

https://en.wiktionary.org/wiki/Category:English_intransitive_verbs

There's also WordNet, but while it does include word sense I don't recall if it gets as specific as to the type of verb. Besides, it's even more cumbersome to parse than scraping those Wikipedia pages, so hopefully those will be helpful.

--
Richard Gaskin
Fourth World Systems
Software Design and Development for the Desktop, Mobile, and the Web
ambassa...@fourthworld.com
http://www.FourthWorld.com
Re: Jane Austen's peculiarity
On 09/08/15 16:52, Richmond wrote:
> Hmm, that's a thought. However, I'm doing just fine with HTML text instead :)

Well . . . the stack works, but as I have not loaded the whole text to be analysed into a variable, but am doing it line by line from the field, the whole thing is taking far, far too long . . .

After 3 hours the stack has just passed line 56,362 of the Gutenberg Library's Complete Jane Austen (and that is only looking for was / were plus 3 verb forms) . . . much too slow.

I wonder why I have always had to learn things the hard way?

Certainly, if I am going to analyse the million words of the Freiburg English Dialect Corpus, I am going to have to get things moving along somewhat.

Wikipedia lists 5048 intransitive verbs; IFF I can get them into a listField (!!!), things will get very slow indeed if the list is not loaded into a variable.

My production machine has 6 GB RAM, so, with any luck, with both the corpus and the verb list in variables (i.e. directly in RAM), I won't have to go away and plant potatoes for a week while it does the crunching.

Richmond.
Re: Jane Austen's peculiarity
> snip
> However, I'm doing just fine with HTML text instead :)
> snip
> Well . . . the stack works, but as I have not loaded the whole text to be analysed into a variable, but am doing it line by line, the whole thing is taking far, far too long . . .

What this does at least prove (P. Brigham et al.) is that while my code may work slowly, my logic is not faulty.

That at least means I can have a warm fuzzy about my logic as the stack grinds along at a glacial rate.

Richmond.
Re: Jane Austen's peculiarity
Just by loading the textFields into variables the whole script runs considerably faster:

on mouseUp
   put empty into fld "COOKED"
   put the long time into fld "STARTT"
   put 1 into KTEKST
   put 1 into KCOOK
   put fld "TEKST" into TEKST --
   put fld "WERBS" into WERBS --
   repeat until line KTEKST of TEKST contains "finalSolution666"
      put line KTEKST of TEKST into LTEKST
      put 1 into KWERBS
      repeat until line KWERBS of WERBS contains "finalSolution666"
         put "was" && line KWERBS of WERBS into FRAZE
         put "were" && line KWERBS of WERBS into FRAZE2
         if LTEKST contains FRAZE then
            put KTEKST & " : " & LTEKST into line KCOOK of fld "COOKED"
            add 1 to KCOOK
         end if
         if LTEKST contains FRAZE2 then
            put KTEKST & " : " & LTEKST into line KCOOK of fld "COOKED"
            add 1 to KCOOK
         end if
         add 1 to KWERBS
      end repeat
      add 1 to KTEKST
   end repeat
   put the long time into fld "STOPT"
end mouseUp

"--" indicates the important changes.

The new script (with the textFields loaded into variables) took 115 seconds with one text. The old script (no loading) took 10 minutes and 7 seconds with the same text.

About 5.3 times faster . . . Wow! I am surprised at such a speed difference!

Richmond.
Re: Jane Austen's peculiarity
On 09/08/15 23:03, Richard Gaskin wrote:
> Richmond wrote:
>> Just by loading the textFields into variables the whole script runs considerably faster
>
> If you did the same with the output it'd get even faster.

Aha.

Richmond.
Re: Jane Austen's peculiarity
Richmond wrote:
> Just by loading the textFields into variables the whole script runs considerably faster

If you did the same with the output it'd get even faster.

--
Richard Gaskin
Fourth World Systems
Software Design and Development for the Desktop, Mobile, and the Web
ambassa...@fourthworld.com
http://www.FourthWorld.com
Re: Jane Austen's peculiarity
Richmond,

Just so you know what is going on: each time a change is made to a field, a lot of management code is executed to properly render the field in case it needs to be visible. So when a field is modified within a loop, that field management code is executed over and over. When the data is moved into a variable and then manipulated, the field management code is only executed when the results are put back into the field.

In almost all cases it is much faster to copy a field into a variable, manipulate the data, then put it back in the field when you want to make it visible.

Regards,
Mike

On 8/9/15 4:22 PM, Richmond wrote:
> On 09/08/15 23:03, Richard Gaskin wrote:
>> Richmond wrote:
>>> Just by loading the textFields into variables the whole script runs considerably faster
>>
>> If you did the same with the output it'd get even faster.
>
> Aha.
>
> Richmond.
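Putting Richard's and Mike's advice together, the search could be arranged so that each field is touched exactly once: read both inputs into variables, build the result in a variable, and write it to the output field at the end. The following is a sketch only, not anyone's posted code; the function name findBePlusVerb is hypothetical, while the field names TEKST, WERBS and COOKED are taken from Richmond's script. It also swaps "line N of" lookups for "repeat for each line", since the engine has to count lines from the start of the text for every "line N of" reference.

```livecode
-- Hypothetical sketch: returns "lineNumber : line" for every line of
-- pText containing "was <verb>" or "were <verb>" for a verb in pVerbs
-- (one verb per line). Everything is built in variables.
function findBePlusVerb pText, pVerbs
   local tOut, tLineNo, tLine, tVerb
   put 0 into tLineNo
   repeat for each line tLine in pText
      add 1 to tLineNo
      repeat for each line tVerb in pVerbs
         if tVerb is empty then next repeat
         if tLine contains ("was" && tVerb) or tLine contains ("were" && tVerb) then
            put tLineNo & " : " & tLine & cr after tOut
            exit repeat -- report each text line once, not once per verb
         end if
      end repeat
   end repeat
   delete the last char of tOut -- trailing return
   return tOut
end findBePlusVerb

on mouseUp
   -- one read per input field, one write to the output field
   put findBePlusVerb(fld "TEKST", fld "WERBS") into fld "COOKED"
end mouseUp
```

Unlike Richmond's script, this version needs no end-of-text sentinel line and lists a text line only once even if several verbs match it; whether that is the wanted behaviour is a design choice.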
Re: Jane Austen's peculiarity
Hermann.

You are back. So glad...

Craig

-----Original Message-----
From: hh h...@livecode.org
To: use-livecode use-livecode@lists.runrev.com
Sent: Sun, Aug 9, 2015 2:15 pm
Subject: Re: Jane Austen's peculiarity

> To come back to Richmond's opening post, one could think about using the following, avoiding complex offset constructions. First collect word 1 of each item of a string (not too large, size adapted to your machine), where the itemdelimiter is "were" or any other word (conditional) that filters a targeted phrasing in or out. Strings as itemdelimiters are possible in LC 7 (one may also use split and combine with such delimiters) and this is pretty fast. This could narrow the lists and cases you have to investigate further.
>
> Hermann
Re: Jane Austen's peculiarity
On 10/08/15 02:54, Michael Doub wrote:
> Richmond,
>
> Just so you know what is going on: each time a change is made to a field, a lot of management code is executed to properly render the field in case it needs to be visible. So when a field is modified within a loop, that field management code is executed over and over. When the data is moved into a variable and then manipulated, the field management code is only executed when the results are put back into the field.
>
> In almost all cases it is much faster to copy a field into a variable, manipulate the data, then put it back in the field when you want to make it visible.
>
> Regards,
> Mike

Thanks: you read my mind - I was going to get up this morning and ask 'why'?

I suffer from a serious problem: I learnt all about this sort of stuff donkey's years ago (about 30), and since the computer that sits on my desk, at 'only' 9 years old, wipes the floor with anything available 30 years ago (VAX mainframe), I had really overlooked that sort of thing, to the extent of completely forgetting about it.

This is, also, the first time I have used a programming language/suite/IDE to process large amounts of data since my BA project in PASCAL (30 years ago), which, using a much shorter text than Jane Austen's complete works (an English translation of Leibniz's /Monadology/), crashed the University of Durham mainframe; and that didn't render any visual stuff on screen whatsoever.

I shall now become obsessive about loading everything into variables :)

Richmond.
Re: Jane Austen's peculiarity
On Aug 8, 2015, at 12:42 PM, Richmond wrote:
> Jane Austen [amongst others] uses an interesting type of grammatical construction of this sort:
>
> After breakfast, the girls walked to Meryton to inquire if Mr. Wickham _were returned_, and to lament over his absence from the Netherfield ball. (Pride and Prejudice)
>
> I would like to analyse a million-word corpus that I have been granted access to for this type of construction. However, I don't want to find examples of only 'were returned', but all examples of were + infinitive / preterite / past participle and, presumably, for that I shall have to use wildcards . . . OR ???

I'll leave it to those who speak Regex to suggest a wildcard solution. Here's another one (not tested) that will catch past participles ending in "ed". Not sure how this will scale with large texts:

function findWere pText
   -- returns a comma-delim list of all the word offsets matching "were *ed"
   put wordOffsets("were", pText, true) into offList
   if offList = 0 then return empty -- "were" does not occur at all
   repeat for each item w in offList
      put word w+1 of pText into testWord
      if testWord ends with "ed" then put w & comma after outList
   end repeat
   return item 1 to -1 of outList
end findWere

function wordOffsets str, pContainer, matchWhole
   -- returns a comma-delimited list of all the wordOffsets of str in pContainer
   -- if matchWhole = true then only whole words are located
   --    else will find word matches everywhere str is part of a word in pContainer
   --    note that in LC words will include adjacent punctuation,
   --    so using matchWhole = true may exclude too many words
   -- duplicates are stripped out
   --    eg wordOffsets("co","the common coconut") = 2,3 not 2,3,3
   -- note: to get the last wordOffset of a string in a container (often useful)
   --    use item -1 of wordOffsets(...)
   -- by Peter M. Brigham, pmb...@gmail.com - freeware
   -- requires offsets()

   if matchWhole = empty then put false into matchWhole
   put offsets(str,pContainer) into offList
   if offList = 0 then return 0
   repeat for each item i in offList
      put the number of words of (char 1 to i of pContainer) into wdNbr
      if matchWhole then
         if word wdNbr of pContainer <> str then next repeat
      end if
      put 1 into A[wdNbr] -- using an array avoids duplicates
   end repeat
   put the keys of A into wordList
   sort lines of wordList ascending numeric
   replace cr with comma in wordList
   return wordList
end wordOffsets

function offsets str, pContainer
   -- returns a comma-delimited list of all the offsets of str in pContainer
   -- returns 0 if not found
   -- note: offsets("xx","xxxxxx") returns 1,3,5 not 1,2,3,4,5
   --    ie, overlapping offsets are not counted
   -- note: to get the last occurrence of a string in a container (often useful)
   --    use item -1 of offsets(...)
   -- by Peter M. Brigham, pmb...@gmail.com - freeware

   if str is not in pContainer then return 0
   put 0 into startPoint
   repeat
      put offset(str,pContainer,startPoint) into thisOffset
      if thisOffset = 0 then exit repeat
      add thisOffset to startPoint
      put startPoint & comma after offsetList
      add length(str)-1 to startPoint
   end repeat
   return item 1 to -1 of offsetList -- delete trailing comma
end offsets

P.S. I love Jane Austen. One of my favorite books of all time is Pride and Prejudice. It's so beautifully constructed.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
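For the wildcard route that Peter leaves to "those who speak Regex", a first attempt might look like the sketch below. It is an illustration under assumptions, not a tested solution from the thread: the function name beParticipleHits is hypothetical, and the pattern simply takes whatever single word follows "was" or "were", so it over-collects (adjectives, adverbs and so on) and the hits still need filtering against a verb list. It uses LiveCode's matchChunk function, which reports the start and end character positions of the first parenthesised group.

```livecode
-- Hypothetical sketch: returns, one per line, every "was X" / "were X"
-- sequence in pText, where X is the next run of letters.
function beParticipleHits pText
   local tStart, tEnd, tHits
   -- (?i) ignores case; \b stops "were" matching inside "answered"
   repeat while matchChunk(pText, "(?i)\b((?:was|were)\s+[a-z]+)", tStart, tEnd)
      put char tStart to tEnd of pText & cr after tHits
      -- drop everything up to the end of this hit and search again
      delete char 1 to tEnd of pText
   end repeat
   delete the last char of tHits -- trailing return
   return tHits
end beParticipleHits
```

Because the text is shortened after every hit, this is unlikely to be the fastest approach on a million-word corpus; feeding it one sentence or paragraph at a time would probably be kinder to the engine.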
Re: Jane Austen's peculiarity
On 08/08/15 20:48, Peter M. Brigham wrote:
> I'll leave it to those who speak Regex to suggest a wildcard solution. Here's another one (not tested) that will catch past participles ending in "ed".

Looks good; however, I am really looking for ALL preterites, such as 'become', so your 'ed' trap won't catch that.

I am wondering about using a listField of all the preterites that I am looking for.

> P.S. I love Jane Austen. One of my favorite books of all time is Pride and Prejudice. It's so beautifully constructed.

Glad to hear that another programmer doesn't spend all their time in front of a computer screen!

Richmond.
Re: Jane Austen's peculiarity
On Aug 8, 2015, at 1:56 PM, Richmond wrote:
> Looks good; however, I am really looking for ALL preterites, such as 'become', so your 'ed' trap won't catch that.
>
> I am wondering about using a listField of all the preterites that I am looking for.

If you do that, then just make the repeat loop as follows:

   repeat for each item w in offList
      put word w+1 of pText into testWord
      if testWord ends with "ed" then
         put w & comma after outList
      else if testWord is among the words of fld "preteritesList" then
         put w & comma after outList
      end if
   end repeat

This will be faster if you put the preteritesList field into a variable before the repeat loop, since it's significantly faster for the engine to access the contents of a variable than the contents of a field.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
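A variant of Peter's loop along the lines he suggests, with the preterite list held in a variable rather than a field, might look like this. It is a hedged sketch and not Peter's code: the function name findWereVerbs is hypothetical, it walks the text word by word instead of using wordOffsets(), it also accepts "was", and it strips adjacent punctuation before testing (the issue his wordOffsets comments mention). The verb list is turned into an array so that each lookup is a single keyed access rather than a scan of the whole list.

```livecode
-- Hypothetical sketch: pVerbList is a space- or return-separated list
-- of irregular forms (e.g. "become found"). Returns one hit per line.
function findWereVerbs pText, pVerbList
   local tIsVerb, tVerb, tPrev, tWord, tClean, tHits
   repeat for each word tVerb in pVerbList
      put true into tIsVerb[toLower(tVerb)]
   end repeat
   repeat for each word tWord in pText
      if tPrev is "were" or tPrev is "was" then
         -- strip punctuation such as the comma in "returned,"
         put toLower(replaceText(tWord, "[^A-Za-z]", empty)) into tClean
         if tClean is not empty then
            if tClean ends with "ed" or tIsVerb[tClean] is true then
               put tPrev && tClean & cr after tHits
            end if
         end if
      end if
      put tWord into tPrev
   end repeat
   delete the last char of tHits -- trailing return
   return tHits
end findWereVerbs
```

Called as findWereVerbs(fld "TEKST", fld "preteritesList"), each field is read only once, when its contents are passed in. The "ed" test remains as crude as in the original loop, so a word such as "red" after "was" would be reported as well.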
Re: Jane Austen's peculiarity
On 08/08/15 21:18, Peter M. Brigham wrote:
> This will be faster if you put the preteritesList field into a variable before the repeat loop, since it's significantly faster for the engine to access the contents of a variable than the contents of a field.

Thanks for that one. I've just made a fool of myself using a listField of the verb forms, and the thing is glacially slow.

As soon as the stack has run its course I will implement your suggestion.

Richmond.
Re: Jane Austen's peculiarity
I seem to be going wrong: I have a fld "WERBS" containing:

found
returned
become

and my text to be analysed in a fld "TEKST":

My Dad ate cheese.
My Mum and Dad were returned home when it began to rain.
He had a house in Spain.
They were become hairdressers.
They were found.
finalSolution666

But this:

on mouseUp
   put 1 into textLine
   put fld "WERBS" into $WERBS
   put fld "TEKST" into $TEKST
   put 1 into cookedLine
   repeat until line textLine of $TEKST contains "finalSolution666"
      put 1 into verbLine
      repeat until line verbLine of $WERBS is empty
         put line verbLine of $WERBS into WERB
         put "were" && WERB into FRAZE
         if line textLine of $TEKST contains FRAZE then
            put line textLine of $TEKST into line cookedLine of fld "COOKED"
            add 1 to cookedLine
         end if
         add 1 to verbLine
      end repeat
      add 1 to textLine
   end repeat
end mouseUp

puts only "They were found" in line 1 of fld "COOKED". Something is wrong with my counters.

Richmond.
Re: Jane Austen's peculiarity
Richmond,

The key here is the “if”, which creates a conditional clause, which requires the past plural of the verb (in this case “were”). This is similar to the “wenn” clause in German (Deutsch) and the “ut” clause in Latin.

If I were able, I’d thank you in person for mentioning this.

Paul Looney

On Aug 8, 2015, at 9:42 AM, Richmond richmondmathew...@gmail.com wrote:
> Jane Austen [amongst others] uses an interesting type of grammatical construction of this sort:
>
> After breakfast, the girls walked to Meryton to inquire if Mr. Wickham _were returned_, and to lament over his absence from the Netherfield ball. (Pride and Prejudice)
Re: Jane Austen's peculiarity
On 08/08/15 22:56, Paul Looney wrote:
> Richmond,
>
> The key here is the “if”, which creates a conditional clause, which requires the past plural of the verb (in this case “were”). This is similar to the “wenn” clause in German (Deutsch) and the “ut” clause in Latin.
>
> If I were able, I’d thank you in person for mentioning this.
>
> Paul Looney

I'm not sure anent that:

He had been visiting a friend in the neighbouring county, and that friend having recently had his grounds laid out by an improver, Mr. Rushworth _was returned_ with his head full of the subject, and very eager to be improving his own place in the same way; and though not saying much to the purpose, could talk of nothing else. (Jane Austen, Mansfield Park)

Richmond.
Re: Jane Austen's peculiarity
In your last example:

   “Mr. Rushworth _was returned_”

“was returned” (singular, past tense, passive) is correct (although a simple “returned” would have been more powerful). There is no conditional, no “if”, as there is in your first example:

   “to inquire if Mr. Wickham _were returned_”

Haven’t had this much fun with the language in a long time…
Re: Jane Austen's peculiarity
On Aug 8, 2015, at 3:41 PM, Richmond wrote:
> I seem to be going wrong: I have a fld WERBS containing:
>
> found
> returned
> become
>
> and my test to be analysed in a fld TEKST:
>
> My Dad ate cheese.
> My Mum and Dad were returned home when it began to rain.
> He had a house in Spain.
> They were become hairdressers.
> They were found.
> finalSolution666
>
> But this:
>
> on mouseUp
>    put 1 into textLine
>    put fld "WERBS" into $WERBS
>    put fld "TEKST" into $TEKST
>    put 1 into cookedLine
>    repeat until line textLine of $TEKST contains "finalSolution666"
>       put 1 into verbLine
>       repeat until line verbLine of $WERBS is empty
>          put line verbLine of $WERBS into WERB
>          put "were" && WERB into FRAZE
>          if line textLine $TEKST contains FRAZE then
>             put line textLine $TEKST into line cookedLine of fld "COOKED"

Missing an "of" in the two lines above:

   put line textLine *of* $TEKST into line cookedLine of fld "COOKED"

etc. Don't know if that's the problem.

>             add 1 to cookedLine
>          end if
>          add 1 to verbLine
>       end repeat
>       add 1 to textLine
>    end repeat
> end mouseUp
>
> put only "They were found" in line 1 of fld COOKED

Your script logic seems unnecessarily complex. Since it looks as if only the last occurrence is ending up in the output field, instead of using a counter to keep track of the next line in the field, you could just

   put cr & line textLine of $TEKST after fld "COOKED"

But once again, loading a line into a field repeatedly will be much slower than putting it into a variable in the repeat loop and then putting the variable into the field just once when the repeat is done. Getting or putting something from or into a field is much slower than doing the same in a variable, so just do it once.

Also, I can see no reason to be loading your data into system variables, which is what $WERBS etc is defining. The only reason to put something into a variable beginning with $ is if you want some other system process besides LC to be able to access the data.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
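Peter's two points above (loop over variables rather than fields, and write to the output field only once) can be sketched roughly as follows. This is a minimal, untested sketch and not anyone's posted code: the field names TEKST, WERBS and COOKED come from Richmond's script, the local names tText, tVerbs and tHits are made up for illustration, and it assumes the search phrase is "were", a space, and the verb. It also drops the finalSolution666 end marker, since "repeat for each" stops at the last line on its own.

```livecode
on mouseUp
   -- read each field once; everything inside the loops works on variables
   put field "TEKST" into tText
   put field "WERBS" into tVerbs
   put empty into tHits
   repeat for each line tLine in tText
      repeat for each line tVerb in tVerbs
         if tVerb is empty then next repeat -- skip blank lines in the verb list
         if tLine contains ("were" && tVerb) then
            put tLine & cr after tHits
            exit repeat -- list each text line once, even if several verbs match
         end if
      end repeat
   end repeat
   delete the last char of tHits -- drop the trailing return
   put tHits into field "COOKED" -- single write to the output field
end mouseUp
```

Because the hits are appended to tHits with a trailing return, no line counter is needed for the output, and the field is touched only at the very end.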
Re: Jane Austen's peculiarity
On 08/08/15 23:33, Peter M. Brigham wrote:
> snip
> Also, I can see no reason to be loading your data into system variables, which is what $WERBS etc is defining. The only reason to put something into a variable beginning with $ is if you want some other system process besides LC to be able to access the data.

Um . . . $ is a mistake brought on by a dream I had about FORTRAN last night: in FORTRAN IV '$' was used for string variables.

Senior moment!

Richmond.
Re: Jane Austen's peculiarity
On 08/08/15 23:23, Paul Looney wrote:
> In your last example:
> “Mr. Rushworth _was returned_”
> “was returned” (singular, past tense, passive)

I'm not sure if that is a passive, or an older form of the past perfect (= had returned) ???

> is correct (although a simple “returned” would have been more powerful). There is no conditional, no “if”; as in your first example: to inquire if Mr. Wickham _were returned_,
> Haven’t had this much fun with the language in a long time…
Re: Jane Austen's peculiarity
On 08/08/15 23:33, Peter M. Brigham wrote:
> snip
> Missing an "of" in the two lines above:
>    put line textLine *of* $TEKST into line cookedLine of fld "COOKED"
> etc. Don't know if that's the problem.
> snip
> Your script logic seems unnecessarily complex. Since it looks as if only the last occurrence is ending up in the output field, instead of using a counter to keep track of the next line in the field, you could just
>    put cr & line textLine of $TEKST after fld "COOKED"
> But once again, loading a line into a field repeatedly will be much slower than putting it into a variable in the repeat loop and then putting the variable into the field just once when the repeat is done. Getting or putting something from or into a field is much slower than doing the same in a variable, so just do it once.
> Also, I can see no reason to be loading your data into system variables, which is what $WERBS etc is defining. The only reason to put something into a variable beginning with $ is if you want some other system process besides LC to be able to access the data.
> -- Peter

Well, as per your suggestion I did this:

on mouseUp
   put 1 into textLine
   put fld "WERBS" into WERBS
   put fld "TEKST" into TEKST
   repeat until line textLine of TEKST contains "finalSolution666"
      put textLine into fld "KOUNT"
      put 1 into verbLine
      repeat until line verbLine of WERBS is empty
         put line textLine of TEKST into fld "LYNE"
         put line verbLine of WERBS into WERB
         put "were" && WERB into FRAZE
         put FRAZE into fld "FRAZE"
         if line textLine of TEKST contains FRAZE then
            if fld "COOKED" is empty then
               put line textLine of TEKST after fld "COOKED"
               -- this is here so that line 1 of fld COOKED does not end up empty
            else
               put cr & line textLine of TEKST after fld "COOKED"
            end if
         end if
         add 1 to verbLine
      end repeat
      add 1 to textLine
   end repeat
end mouseUp

but still get only the last value.

Richmond.
Re: English usage [OT] (used to be Re: Jane Austen's peculiarity)
On Aug 8, 2015, at 4:44 PM, Richmond wrote:
> On 08/08/15 23:23, Paul Looney wrote:
>> In your last example:
>> “Mr. Rushworth _was returned_”
>> “was returned” (singular, past tense, passive)
>
> I'm not sure if that is a passive, or an older form of the past perfect (= had returned) ???

I believe that with verbs relating to motion and location, the old past tense was not "has returned" but "is returned". "He has come home" is modern English; it used to be "He is come home". The old usage survives in modern French: "Il est venu", rather than "Il a venu". I agree that it's not passive, just an old auxiliary verb.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
Re: English usage [OT] (used to be Re: Jane Austen's peculiarity)
On 08/08/15 23:59, Peter M. Brigham wrote:
> I believe that with verbs relating to motion and location, the old past tense was not "has returned" but "is returned".
> snip
> I agree that it's not passive, just an old auxiliary verb.
> -- Peter

That is extremely useful: I shall pass that onto my wife (she's the Linguist-qua-Linguist: I just have an MA from SIUC in the subject - a dabbler), as she will be very pleased with that.

Thank you.

Richmond.
Re: Jane Austen's peculiarity
On Aug 8, 2015, at 4:51 PM, Richmond wrote:
> Well, as per your suggestion I did this:
> snip
> but still get only the last value.

Well, your logic still makes my head hurt, too many counters. Here's what I'd do, using a variant of my original function since it appears that you want to list the lines the relevant phrases occur in, not just the isolated phrases.

function findWere pText
   -- returns a comma-delim list of all the line offsets matching "were *ed"
   -- or "were" followed by a word in your preterite list.
   put fld "WERBS" into pretList
   put wordOffsets("were", pText, true) into offList
   repeat for each item w in offList
      put word w+1 of pText into testWord
      if testWord ends with "ed" or testWord is among the words of pretList then
         put (the number of lines of word 1 to w of pText) & comma after outList
      end if
   end repeat
   return char 1 to -2 of outList
end findWere

then:

on mouseUp
   put fld "TEKST" into tText
   put findWere(tText) into linesList
   repeat for each item i in linesList
      put line i of tText & cr after relevantLines
   end repeat
   put char 1 to -2 of relevantLines into fld "COOKED"
end mouseUp

Untested, but you get the idea.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
Re: Jane Austen's peculiarity
On 09/08/15 00:26, Peter M. Brigham wrote:
> Well, your logic still makes my head hurt, too many counters.

My logic is based on the belief that one has to keep count of the lines in the Ur-text, the verb list, and the output field. It is also derived from a program I wrote to make concordances in PASCAL in 1985.

The idea of using a carriage-return is very useful (haven't actually thought about those since my BA, as I typed all my work on an Olivetti portable which my Mum was given by her Mum and Dad for her 21st birthday).

> Here's what I'd do, using a variant of my original function since it appears that you want to list the lines the relevant phrases occur in, not just the isolated phrases.

Isolated phrases are not much use for subsequent analysis; i.e. to see which collocations they occur in, context, and so on.

> snip
> Untested, but you get the idea.

I do. Thanks. I shall put your code in another button and see what happens.

> -- Peter

Richmond.
Re: Jane Austen's peculiarity
on mouseUp
   put fld "TEKST" into TEKST
   put fld "WERBS" into WERBS
   put findWere(TEKST) into linesList
   repeat for each item i in linesList
      put line i of TEKST & cr after relevantLines
   end repeat
   put char 1 to -2 of relevantLines into fld "COOKED"
end mouseUp

function findWere pText
   -- returns a comma-delim list of all the line offsets matching "were *ed"
   -- or "were" followed by a word in your preterite list.
   put fld "WERBS" into pretList
   put wordOffsets("were", pText, true) into offList
   repeat for each item w in offList
      put word w+1 of pText into testWord
      if testWord ends with "ed" or testWord is among the words of pretList then
         put (the number of lines of word 1 to w of pText) & comma after outList
      end if
   end repeat
   return char 1 to -2 of outList
end findWere

executing at 12:43:43 AM
Type     Function: error in function handler
Object   Brigham
Line     put wordOffsets("were", pText, true) into offList
Hint     wordOffsets

There is something here I don't understand (obviously your 'Logic' and my 'Logic' are not sitting well together).

Richmond.
Re: Jane Austen's peculiarity
Oddly enough this does NOT seem to be a problem with my counters.

If I reorder my verb list the stack still only finds text bits with 'found'.

If I retype the list (rather than importing it from RTF), the result is the first occurrence of the first verb in the list.

There seems to be a problem with the way LibreOffice encodes RTF, and then another problem inwith my script.

However, if one imports a text file and says:

   set the text of fld "TEKST" to URL ("file:" & it)

nothing works at all when an analysis is attempted.

Richmond.
Re: Jane Austen's peculiarity
Richmond wrote:
> function findWere pText
>    -- returns a comma-delim list of all the line offsets matching "were *ed"
>    -- or "were" followed by a word in your preterite list.
>    put fld "WERBS" into pretList
>    put wordOffsets("were", pText, true) into offList

Unless you're using a custom build, wouldn't that be wordOffset (singular)?

Also, if you're using v7 you might consider trueWordOffset, which accounts for quote characters and omits punctuation that characterize the historic definition of "word" in xTalks.

The Unicode libraries in v7 make many natural-language parsing tasks much simpler - there's even a new sentence chunk type.

--
 Richard Gaskin
 Fourth World Systems
 Software Design and Development for the Desktop, Mobile, and the Web
 ambassa...@fourthworld.com    http://www.FourthWorld.com
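Richard's pointer to the v7 chunk types suggests a variant that needs no helper library. The following is a minimal, untested sketch under stated assumptions: it requires LiveCode 7 or later (for the trueWord chunk), the function name wereLines and its parameters are hypothetical, and pVerbs is assumed to be a return-delimited verb list such as the contents of fld WERBS.

```livecode
function wereLines pText, pVerbs
   -- returns every line of pText in which "were" is directly followed
   -- by a word from pVerbs (one verb per line); needs LC 7+ for trueWord
   put empty into tOut
   repeat for each line tLine in pText
      put 0 into tIndex
      repeat for each trueWord tWord in tLine
         add 1 to tIndex
         if tWord is "were" then
            -- trueWord chunks leave out adjacent punctuation, so "returned,"
            -- in the text is compared as "returned"
            put trueWord (tIndex + 1) of tLine into tNext
            if tNext is not empty and tNext is among the lines of pVerbs then
               put tLine & cr after tOut
               exit repeat -- report each line once
            end if
         end if
      end repeat
   end repeat
   delete the last char of tOut -- drop the trailing return
   return tOut
end wereLines
```

It could be called as "put wereLines(fld "TEKST", fld "WERBS") into fld "COOKED"". On versions before 7 the same loop should work with "word" in place of "trueWord", at the cost of punctuation sticking to the words.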
Re: Jane Austen's peculiarity
I think I'd agree that a conditional clause should be required (could it be any of 'if', 'unless', 'whether', ...)?

Otherwise, you'd be finding false positives like:

   I gave two shillings to my brother and last night they _were returned_ to me.

-- Alex.

On 08/08/2015 20:56, Paul Looney wrote:
> Richmond,
> The key here is the “if” - which creates a conditional clause - which requires the past plural of the verb (in this case “were”). This is similar to the “wenn” clause in German (Deutsch) and the “ut” clause in Latin.
> If I were able, I’d thank you in person for mentioning this.
> Paul Looney
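Alex's filter (only count "were" when a conditional word introduces the clause) could be tried with something like the sketch below. It is an untested illustration, not a grammatical test: the function name hasConditionalWere is hypothetical, the marker list is just the three words Alex mentions, it assumes LiveCode 7 or later for the trueWord chunk, and it simply asks whether a marker occurs anywhere before the first "were" in the sentence.

```livecode
function hasConditionalWere pSentence
   -- hypothetical marker list; extend it as the analysis requires
   put "if,unless,whether" into tMarkers
   put false into tSeenMarker
   repeat for each trueWord tWord in pSentence
      if tWord is among the items of tMarkers then
         put true into tSeenMarker
      else if tWord is "were" then
         -- decide at the first "were": true only if a marker came before it
         return tSeenMarker
      end if
   end repeat
   return false -- no "were" in the sentence at all
end hasConditionalWere
```

The idea is that the Pride and Prejudice sentence ("to inquire if Mr. Wickham were returned") passes, while the shillings example is rejected because no marker precedes "were". A marker that belongs to an earlier, unrelated clause would still let a sentence through, so the hits would need checking by eye.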
Re: Jane Austen's peculiarity
On Aug 8, 2015, at 6:41 PM, Richard Gaskin wrote:
> Richmond wrote:
>> function findWere pText
>>    -- returns a comma-delim list of all the line offsets matching "were *ed"
>>    -- or "were" followed by a word in your preterite list.
>>    put fld "WERBS" into pretList
>>    put wordOffsets("were", pText, true) into offList
>
> Unless you're using a custom build, wouldn't that be wordOffset (singular)?

I included the utility functions wordOffsets() and offsets() in one of my previous posts. I probably should have repeated them. I use them a lot -- there are many contexts in which they are useful.

function wordOffsets str, pContainer, matchWhole
   -- returns a comma-delimited list of all the wordOffsets of str in pContainer
   -- if matchWhole = true then only whole words are located
   --    else will find word matches everywhere str is part of a word in pContainer
   --    note that in LC words will include adjacent punctuation,
   --    so using matchWhole = true may exclude too many words
   -- duplicates are stripped out
   --    eg wordOffsets("co","the common coconut") = 2,3 not 2,3,3
   -- note: to get the last wordOffset of a string in a container (often useful)
   --    use item -1 of wordOffsets(...)
   -- by Peter M. Brigham, pmb...@gmail.com - freeware
   -- requires offsets()

   if matchWhole = empty then put false into matchWhole
   put offsets(str,pContainer) into offList
   if offList = 0 then return 0
   repeat for each item i in offList
      put the number of words of (char 1 to i of pContainer) into wdNbr
      if matchWhole then
         if word wdNbr of pContainer <> str then next repeat
      end if
      put 1 into A[wdNbr] -- using an array avoids duplicates
   end repeat
   put the keys of A into wordList
   sort lines of wordList ascending numeric
   replace cr with comma in wordList
   return wordList
end wordOffsets

function offsets str, pContainer
   -- returns a comma-delimited list of all the offsets of str in pContainer
   -- returns 0 if not found
   -- note: offsets("xx","xxxxxx") returns 1,3,5 not 1,2,3,4,5
   --    ie, overlapping offsets are not counted
   -- note: to get the last occurrence of a string in a container (often useful)
   --    use item -1 of offsets(...)
   -- by Peter M. Brigham, pmb...@gmail.com - freeware

   if str is not in pContainer then return 0
   put 0 into startPoint
   repeat
      put offset(str,pContainer,startPoint) into thisOffset
      if thisOffset = 0 then exit repeat
      add thisOffset to startPoint
      put startPoint & comma after offsetList
      add length(str)-1 to startPoint
   end repeat
   return item 1 to -1 of offsetList -- delete trailing comma
end offsets

> Also, if you're using v7 you might consider trueWordOffset, which accounts for quote characters and omits punctuation that characterize the historic definition of "word" in xTalks.
>
> The Unicode libraries in v7 make many natural-language parsing tasks much simpler - there's even a new sentence chunk type.

Yes, with newer versions the engine now does stuff that required scripted functions in earlier LC versions. I'm still not using later versions because my work stacks don't run in them properly, so I have all these utility functions in my library.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig
Re: Jane Austen's peculiarity
On Aug 8, 2015, at 5:44 PM, Richmond wrote:
> executing at 12:43:43 AM
> Type     Function: error in function handler
> Object   Brigham
> Line     put wordOffsets("were", pText, true) into offList
> Hint     wordOffsets

Probably you didn't include the wordOffsets() handler from my post earlier today. Try it again with the utility functions available. See the post I sent 5 minutes ago.

-- Peter

Peter M. Brigham
pmb...@gmail.com
http://home.comcast.net/~pmbrig