do it outside basic using

    $ grep -F -f pattern-file csv-file > remove-file

the pattern file would have the pieces in there. what if you're
excluding something that's not unique? "smith" would exclude
"smithers", "smithy", "psmith" (one for the wodehouse fans :-) etc.

i do this with some huge syslog files, and fairly big pattern files,
and it's pretty darn quick.

ian

On Tue, 2004-01-27 at 10:33, George Gallen wrote:
> I can't set up any indexes to speed this up. Basically I'm scanning a
> CSV file for names to remove and set the flag of KICK=1 to remove it
> (creating a new CSV file at the same time).
>
> Keep in mind the ".." are people's last names, or zip codes, or part
> of their address, changed them to ".." to protect the unwanting...
>
> Right now, I do a series of CASE's ...
> Now, it's not a major problem as I'm only checking for 20 or so names,
> but as more and more people request to be removed (and we don't have
> access to the creation of the list), this could get quite slow over
> 50 or 60 thousand lines of checking.
>
> LIN is one line of the CSV file, the INDEX is checking for a last name
> & a zip code and sometimes part of the address line.
>
> Any Ideas?
>
> Remember, we can't change the source of the file, it will always be a
> CSV, being read line by line
>
> KICK=0
> BEGIN CASE
>   CASE -1
>     KICK=1
>     BEGIN CASE
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>       CASE -1
>         KICK=0
>     END CASE
> END CASE
>
> George Gallen
> Senior Programmer/Analyst
> Accounting/Data Division
> [EMAIL PROTECTED]
> ph:856.848.1000 Ext 220
>
> SLACK Incorporated - An innovative information, education and
> management company
> http://www.slackinc.com
>
> _______________________________________________
> u2-users mailing list
> [EMAIL PROTECTED]
> http://www.oliver.com/mailman/listinfo/u2-users
--
Ian McGowan <[EMAIL PROTECTED]>
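
For comparison, the removal rules in the CASE block can also be kept as data and looped over, so adding a name means adding a row rather than a new CASE line. What follows is only a rough sketch in Python, not UniVerse BASIC and not anything posted in this thread. The names RULES, should_kick, should_kick_fields and split_lines are made up for illustration, and the fragments in RULES are placeholder values standing in for the "..". It assumes each rule is a set of fragments that must all appear in the line (last name AND zip code, sometimes AND part of the address), which is how the INDEX tests are combined above.

```python
import csv

# Hypothetical rules: every fragment in a rule must be present for the
# line to be removed. These values are placeholders, not real data.
RULES = [
    ("SMITH", "08086"),
    ("JONES", "19103", "MAIN ST"),
]


def should_kick(line, rules=RULES):
    """Substring test, like INDEX(LIN, frag, 1) # 0 for each fragment.

    True if all fragments of at least one rule occur somewhere in line.
    Note that "SMITH" also matches "SMITHERS" with this test.
    """
    return any(all(frag in line for frag in rule) for rule in rules)


def should_kick_fields(line, rules=RULES):
    """Stricter variant: each fragment must equal a whole CSV field.

    This avoids the "SMITH" vs "SMITHERS" problem, but it only works for
    fragments that are complete field values (not partial addresses).
    """
    fields = set(next(csv.reader([line]), []))
    return any(all(frag in fields for frag in rule) for rule in rules)


def split_lines(lines, rules=RULES, test=should_kick):
    """Partition lines into (kept, removed) using the given test."""
    kept, removed = [], []
    for line in lines:
        (removed if test(line, rules) else kept).append(line)
    return kept, removed
```

With the substring test, a line for SMITHERS in the same zip code is removed along with SMITH; passing test=should_kick_fields to split_lines keeps it. Whether the stricter test is usable depends on the rules being whole field values, which may not hold for the partial-address checks.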
