Title: RE: looking for faster Ideas...
How about using *nix sort and comm against a like-structured CSV reference file to produce a sub-file of possible hits, then trawling that output using D3/UD to refine the list of unwanted rows (building it back into a flat file), and then using comm again to produce your cleaned output file?
 
That cuts down the amount of data you'll need to process in MV BASIC.
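
To make the last comm step concrete, here is a minimal Python sketch of roughly what `comm -23` does on two sorted files: keep the rows that appear only in the first. The function name and the sample rows are made up for illustration, and unlike the real comm it drops every copy of a matching row rather than pairing duplicates one for one.

```python
def only_in_first(first_sorted, second_sorted):
    """Yield lines found in first_sorted but not in second_sorted.

    Both inputs must be sorted the same way, as comm requires.
    """
    unwanted = iter(second_sorted)
    current = next(unwanted, None)
    for line in first_sorted:
        # Advance the unwanted list until it catches up with this line.
        while current is not None and current < line:
            current = next(unwanted, None)
        if current == line:
            continue  # row is on the unwanted list, so drop it
        yield line


# Hypothetical sample rows, not taken from the real list.
rows = sorted(["ADAMS,1 ELM RD,19103", "BAKER,9 OAK AVE,08086", "CLARK,12 MAIN ST,08086"])
unwanted_rows = sorted(["BAKER,9 OAK AVE,08086"])
cleaned = list(only_in_first(rows, unwanted_rows))
```

Because both lists are walked once in order, the work is a single pass over each file after the sort.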
 
Cheers
 
Steve
-----Original Message-----
From: George Gallen [mailto:[EMAIL PROTECTED]
Sent: 27 January 2004 20:04
To: 'U2 Users Discussion List'
Subject: RE: looking for faster Ideas...

keep in mind, it's not the renting company that is
giving us the removal information, it's the consumer,
and of course they never have the mailing piece in
their hand. Although usually, if they call, we can get
the specific info we are looking for, which can cut
the case down to one check.
 
But when the info is mailed in or emailed in or left on
a voice mail, that's when we run into not having the
best data to go with. Calling/emailing/mailing them
back usually just increases the annoyance level on
their end, since we are contacting them again.
 
George
-----Original Message-----
From: George Gallen [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 27, 2004 2:51 PM
To: 'U2 Users Discussion List'
Subject: RE: looking for faster Ideas...

sometimes there is a number, but we are rarely given
it with a removal request; usually it's just
"remove me from your $^&#^$*&$ mailing" :) some add "please".

I considered Perl as a pre-processor to remove the names,
then passing that file to my program, which does other stuff
too
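
For what it's worth, a pre-processor along those lines could look roughly like the sketch below, written in Python rather than Perl. It is only an illustration: the column positions, the REMOVALS table and the sample names are all assumptions, since the real layout of the list isn't shown here. It parses each line with the csv module (so embedded commas are handled), looks the name and zip up in a dictionary, and only checks part of the address when the entry asks for it.

```python
import csv
import io

# Hypothetical column positions; the real layout of the list is not known.
NAME_COL, ADDR_COL, ZIP_COL = 0, 1, 2

# (last name, zip) -> address fragments to require.
# An empty tuple means name + zip alone is enough to remove the row.
REMOVALS = {
    ("SMITH", "08086"): ("MAIN ST",),
    ("DOE", "19103"): (),
}


def should_kick(row):
    """Return True when the row matches a removal request."""
    if len(row) <= max(NAME_COL, ADDR_COL, ZIP_COL):
        return False  # short or malformed row: keep it untouched
    key = (row[NAME_COL].strip().upper(), row[ZIP_COL].strip())
    if key not in REMOVALS:
        return False
    fragments = REMOVALS[key]
    address = row[ADDR_COL].upper()
    return not fragments or any(frag in address for frag in fragments)


def filter_csv(src, dst):
    """Copy rows from src to dst, skipping kicked rows; return the kick count."""
    writer = csv.writer(dst, lineterminator="\n")
    kicked = 0
    for row in csv.reader(src):
        if should_kick(row):
            kicked += 1
        else:
            writer.writerow(row)
    return kicked
```

The dictionary lookup means the cost per line stays about the same however many removal requests pile up, instead of growing with each extra CASE.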

George

>-----Original Message-----
>From: Ian McGowan [mailto:[EMAIL PROTECTED]]
>Sent: Tuesday, January 27, 2004 2:22 PM
>To: U2 Users Discussion List
>Subject: RE: looking for faster Ideas...
>
>
>if speed is the issue, sounds like a job for a compiled language. or
>semi-compiled like perl or python.
>
>is there a unique number sent over by the other system?  it might be
>quicker to parse the whole thing and keep an exclude file keyed off the
>unique number.  if it weren't for embedded commas you could
>CONVERT "," TO @AM, extract the key and write the record out as-is.
>that would be quicker than 852 INDEXes :-)
>
>On Tue, 2004-01-27 at 11:05, George Gallen wrote:
>> I can't just check for names, it has to be a name with a specific zip
>> code, and if the name is fairly common, we also add in part of the
>> address to make sure no one else is weeded out that shouldn't be.
>>
>> I suppose I could keep two or three arrays and do a specific lookup in
>> each, saving the position, and if all three positions are identical
>> (assuming all three arrays have the name, address, zip in the same
>> order) then that would be a match....Thanks
>>
>> George
>>
>> >-----Original Message-----
>> >From: Jeff Schasny [mailto:[EMAIL PROTECTED]]
>> >Sent: Tuesday, January 27, 2004 1:51 PM
>> >To: U2 Users Discussion List
>> >Subject: RE: looking for faster Ideas...
>> >
>> >
>> >how about keeping a list of excluded names as a record in a file (or
>> >as a flat file in a directory with each name/item/whatever on a line)
>> >and reading it into the program as a dynamic array, then doing a
>> >locate on the string in question.  Something like this:
>> >
>> >
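>> >READ ALIST FROM AFILE, SOME.ID ELSE STOP  ;* exclusion list, one entry per attribute
>> >X = 0
>> >LOOP
>> >   X += 1
>> >   ASTRING = INLIST<X>  ;* INLIST: the strings to test, assumed already loaded
>> >UNTIL ASTRING = ''
>> >   LOCATE ASTRING IN ALIST SETTING POS THEN
>> >      NULL  ;* found in the list: do other stuff here
>> >   END ELSE
>> >      NULL  ;* not found: don't
>> >   END
>> >REPEAT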
>> >
>> >Of course if you really want speed then sort the list and use a BY
>> >clause in the locate
>> >
>> >-----Original Message-----
>> >From: George Gallen [mailto:[EMAIL PROTECTED]]
>> >Sent: Tuesday, January 27, 2004 11:33 AM
>> >To: 'Ardent List'
>> >Subject: looking for faster Ideas...
>> >
>> >
>> >I can't set up any indexes to speed this up. Basically I'm scanning a
>> >CSV file for names to remove, and setting the flag KICK=1 to remove a
>> >line (creating a new CSV file at the same time).
>> >
>> >Keep in mind the ".." are people's last names, or zip codes, or part
>> >of their address; I changed them to ".." to protect the unwanting...
>> >
>> >Right now, I do a series of CASE's ...
>> >Now, it's not a major problem as I'm only checking for 20 or so names,
>> >but as more and more people request to be removed (and we don't have
>> >access to the creation of the list), this could get quite slow over 50
>> >or 60 thousand lines of checking.
>> >
>> >LIN is one line of the CSV file; the INDEX is checking for a last name
>> >and a zip code, and sometimes part of the address line.
>> >
>> >Any Ideas?
>> >
>> >Remember, we can't change the source of the file; it will always be a
>> >CSV, being read line by line.
>> >
>> >   KICK=0
>> >   BEGIN CASE
>> >      CASE -1
>> >         KICK=1
>> >         BEGIN CASE
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE INDEX(LIN,"..",1)#0 AND INDEX(LIN,"..",1)#0
>> >            CASE -1
>> >               KICK=0
>> >         END CASE
>> >   END CASE
>> >
>> >George Gallen
>> >Senior Programmer/Analyst
>> >Accounting/Data Division
>> >[EMAIL PROTECTED]
>> >ph:856.848.1000 Ext 220
>> >
>> >SLACK Incorporated - An innovative information, education and
>> >management company
>> >http://www.slackinc.com
>> >
>> >_______________________________________________
>> >u2-users mailing list
>> >[EMAIL PROTECTED]
>> >http://www.oliver.com/mailman/listinfo/u2-users
>> >
>--
>Ian McGowan <[EMAIL PROTECTED]>
>
