Hi Everyone,
Thought I would pick your brains on the topic of comparing two big lists. Both are tab-delimited. bigList has about 100,000 lines with 6 items (columns) per line; smallList has about 15,000 lines with 2 items per line. I want to identify the lines in bigList whose third item matches the second item of a line in smallList, and then pull out the intersection. I used something like this, which works fine:
  set the itemDelimiter to tab
  repeat for each line j of smallList
    -- lineOffset gives the number of the first line of bigList that contains the key (0 if none)
    put lineOffset(item 2 of j, bigList) into thisLine
    if thisLine is not 0 then put j & tab & \
        line thisLine of bigList & return after mergedList
  end repeat
  delete last character of mergedList -- Get rid of the trailing Return
Using the lineOffset function seemed the obvious choice to me, but I'm also interested in other approaches.
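One other approach that comes to mind, offered only as a rough sketch: index smallList in an array keyed by its second item, then make a single pass over bigList and test only the third item of each line. The handler name mergeByKey and the variable names are made up for illustration. It behaves a little differently from the loop above. lineOffset looks for the key anywhere in a line and stops at the first hit, whereas this version compares item 3 only, keeps every matching line of bigList, and writes the result in bigList order. It also assumes the keys in smallList are unique; a repeated key keeps only the last line seen.

```livecode
-- Hypothetical helper; the handler and variable names are illustrative only.
function mergeByKey smallList, bigList
  local keyIndex, mergedList, theKey
  set the itemDelimiter to tab
  -- Index smallList by its second item (a repeated key keeps the last line seen)
  repeat for each line s in smallList
    if item 2 of s is not empty then put s into keyIndex[item 2 of s]
  end repeat
  -- One pass over bigList, comparing only its third item
  repeat for each line b in bigList
    put item 3 of b into theKey
    if theKey is empty then next repeat
    if keyIndex[theKey] is not empty then
      put keyIndex[theKey] & tab & b & return after mergedList
    end if
  end repeat
  delete last character of mergedList -- drop the trailing return
  return mergedList
end function
```

It would be called as something like: put mergeByKey(smallList, bigList) into mergedList. Each list is walked once, so the work should grow with the combined length of the two lists rather than with a fresh search of bigList for every line of smallList.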
Regards,
Greg
Gregory Lypny
Associate Professor
John Molson School of Business
Concordia University
_________________________
"Absence of evidence is not evidence of absence."
- Anonymous
http://rubbersoul.concordia.ca
