"Jim Ault" <JimAultWins at yahoo.com> wrote:
One catch I can see is the need to "set the wholeMatches to true",
and also to consider the false hits generated by your definition of a
unique line (lower case, substring, number format).
"Mary had a little lamb" = line 6 of field 2
"Mary had a little lamb, whose fleece was white" = line 8 of field 1
line 6 of fld 2 is in line 8 of fld 1 => lineoffset would be > 0
"234" & "2345" == offset match, lineOffset not
"234" & "2,345" == offset match not, lineOffset not
"234" & "2345.00" == offset match, lineOffset not
"234" & "2345, 554, 234, 196" == lineoffset match twice
"snow" & "snow shovel" & "snowbound" & "snow-bound"
Jim Ault
Las Vegas
Good catch :-)
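The cases Jim lists come down to substring matching versus whole-line matching. A rough sketch of the difference, written in Python rather than Transcript so it can be run anywhere; the helper names `substring_hit` and `whole_line_hit` are made up for this illustration, and the lower-casing only mimics a case-insensitive comparison:

```python
def substring_hit(needle, lines):
    """1-based numbers of the lines that contain needle anywhere."""
    return [n for n, line in enumerate(lines, 1)
            if needle.lower() in line.lower()]


def whole_line_hit(needle, lines):
    """1-based numbers of the lines that equal needle exactly."""
    return [n for n, line in enumerate(lines, 1)
            if needle.lower() == line.lower()]


lines = ["2345", "2,345", "2345.00", "234", "snow shovel", "snowbound", "snow"]

print(substring_hit("234", lines))    # [1, 3, 4]
print(whole_line_hit("234", lines))   # [4]
print(substring_hit("snow", lines))   # [5, 6, 7]
print(whole_line_hit("snow", lines))  # [7]
```

Only the whole-line test avoids the false hits on "2345" and "snowbound".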
"Alex Tweedly" <alex at tweedly.net > wrote:
-snip-
> put fld "Field" & cr & "ZZZZZZZZZZ" into t1
> put fld "Field" & cr & "test line" & cr & "ZZZZZZZZZZ" into t2
>
> put the millisecs into tStart
> put 1 into i2
> put the number of lines in t2 into limit2
>
> sort t1
> sort t2
> split t2 by CR
> put t2[1] into L2
>
> repeat for each line L1 in t1
>    repeat while L2 < L1
>       add 1 to i2
>       put t2[i2] into L2
>    end repeat
>    if L2 = L1 then
>       -- put L1 & cr after tBoth
>       add 1 to i2
>       put t2[i2] into L2
>    else
>       -- put L1 & cr after t1only
>    end if
> end repeat
> if i2 < limit2 then
>    repeat with i = i2 to limit2-1
>       put t2[i] & cr after t2only
>    end repeat
> end if
> put "loop" && the millisecs - tStart & cr after msg
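For comparison, the sorted-merge idea the quoted script is aiming at can be sketched as below. This is Python, not Transcript, and `merge_compare` is a hypothetical name. It assumes both lists are sorted with the same ordering that the comparisons use (plain string order here, so "10" sorts before "9" in both places), and it also records the lines of the second list that are passed over inside the loop, not only those left over at the end.

```python
def merge_compare(list1, list2):
    """Split two lists of lines into (both, only1, only2) via a sorted merge."""
    a = sorted(list1)   # same string ordering as the comparisons below
    b = sorted(list2)
    both, only1, only2 = [], [], []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            both.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            only1.append(a[i])
            i += 1
        else:
            only2.append(b[j])   # skipped lines of list2 are kept, not dropped
            j += 1
    only1.extend(a[i:])          # whatever remains is unique to its own list
    only2.extend(b[j:])
    return both, only1, only2


both, only1, only2 = merge_compare(["b", "a", "d"], ["c", "a", "e", "test line"])
print(both)    # ['a']
print(only1)   # ['b', 'd']
print(only2)   # ['c', 'e', 'test line']
```

Because the walk advances through both sorted lists in step, no sentinel line is needed at the end of either list.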
P.S. I tried hard to break every one of Jerry's recommendations about
variable naming as described in his excellent tutorial from the
"Conference" session; if you haven't already downloaded and read that
stack, you should. It *might* just stop you from writing such ugly
code as I did above - but my old Fortran habits just keep coming back :-)
--
Alex Tweedly http://www.tweedly.net
The handler above does not give correct results, either on numeric
lists or on word or mixed lists.
Below is a function that combines and adapts techniques mentioned
previously in this thread.
### adapt the handler name and the filter modes to your own taste
function intersectSpecial pList1,pList2,pMode
   repeat for each line i in pList1
      add 1 to a[i]
   end repeat
   repeat for each line i in pList2
      add 2 to a[i]
   end repeat
   combine a with cr and tab
   ### elements only in pList1 --> 1
   ### elements only in pList2 --> 2
   ### elements in both lists --> 3
   if pMode = "bothCommon" then put "*"&tab&"3" into tFilter
   else if pMode = "uniqueA" then put "*"&tab&"1" into tFilter
   else if pMode = "uniqueB" then put "*"&tab&"2" into tFilter
   else if pMode = "bothUnique" then put "*"&tab&"1,*" &tab&"2" into tFilter
   repeat for each item tFilterString in tFilter
      put a into b
      filter b with tFilterString
      replace char 2 to -1 of tFilterString with "" in b
      put b & cr after tList
   end repeat
   return tList
end intersectSpecial

on mouseUp
   put the millisecs into zap
   put intersectSpecial(fld 1,fld 2,"bothUnique") into fld 3
   put the millisecs - zap
end mouseUp
Maybe not a real speed monster, but not bad either
(takes < 500 millisecs for 2 fields with > 25000 lines on an iMac G5
1.8 GHz)
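The tagging idea behind intersectSpecial can also be sketched outside Transcript. The Python below is only an illustration; `classify_lines` is a hypothetical name, and the group names are borrowed from the filter modes above. One deliberate difference: it sets bits instead of adding. With `add`, a line that occurs twice in pList1 and never in pList2 ends up with the value 2 and would be reported as unique to the second list, so the original appears to assume that neither list repeats a line.

```python
def classify_lines(text1, text2):
    """Tag every line with 1 (first list), 2 (second list) or 3 (both)."""
    tags = {}
    for line in text1.split("\n"):
        tags[line] = tags.get(line, 0) | 1   # bit 1: seen in the first list
    for line in text2.split("\n"):
        tags[line] = tags.get(line, 0) | 2   # bit 2: seen in the second list

    names = {1: "uniqueA", 2: "uniqueB", 3: "bothCommon"}
    groups = {"uniqueA": [], "uniqueB": [], "bothCommon": []}
    for line, tag in tags.items():
        groups[names[tag]].append(line)
    # "bothUnique" is the lines unique to A followed by the lines unique to B
    groups["bothUnique"] = groups["uniqueA"] + groups["uniqueB"]
    return groups


result = classify_lines("a\nb\nb\nc", "c\nd")
print(result["uniqueA"])     # ['a', 'b']
print(result["uniqueB"])     # ['d']
print(result["bothCommon"])  # ['c']
print(result["bothUnique"])  # ['a', 'b', 'd']
```

Each list is read once and nothing is sorted, so the work grows roughly linearly with the number of lines.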
Greetings,
Wouter
_______________________________________________
use-revolution mailing list