RE: looking for faster Ideas...

Mike Rajkowski Wed, 28 Jan 2004 10:02:23 -0800

I might not have made myself clear. If you have 10,000 names that are to be removed, you put them into a hash file, then process through the CSV file and attempt to read each item from the hash file based on the criteria (i.e. Name).

That is a few reads per line, if ordering does not matter.

Otherwise you could potentially have to do 10,000 case statements (several times more if order matters), one for each name.
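
A minimal sketch of the lookup idea, written in Python rather than UniBasic. A Python set stands in for the hash file, and the names `normalize`, `filter_rows` and `name_col` are made up for illustration; the sample rows follow the layout quoted further down the thread.

```python
import csv
import io

def normalize(name):
    """Upcase and drop spaces so 'jon c smith' and 'JON C SMITH' share a key."""
    return "".join(name.upper().split())

def filter_rows(csv_text, removal_names, name_col=1):
    """Return the CSV rows whose name is NOT on the removal list."""
    # The set stands in for the hash file keyed by name (an assumption).
    removal_keys = {normalize(n) for n in removal_names}
    kept = []
    for row in csv.reader(io.StringIO(csv_text)):
        # One keyed lookup per line, however many names are on the list.
        if normalize(row[name_col]) not in removal_keys:
            kept.append(row)
    return kept

sample = (
    '"","jon c smith","1234 anywhere st","","","somecity","SS","12345-1254",""\n'
    '"","mary jones","9 other rd","","","othercity","SS","54321-0000",""\n'
)
kept = filter_rows(sample, ["Jon C Smith"])
print([row[1] for row in kept])
```

The cost per CSV line stays roughly constant as the removal list grows, which is the point of the hash file; a chain of case statements grows with the list.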

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of George Gallen
Sent: Tuesday, January 27, 2004 1:00 PM
To: 'U2 Users Discussion List'
Subject: RE: looking for faster Ideas...

Mike, doing what you propose would require a massive file to start with, and would require a crap load of disk reads, which would be far slower than a bunch of cases, and the project isn't worth that kind of investment anyway. But thanks.

the source line would look something like

"","jon c smith","1234 anywhere st","","","somecity","SS","12345-1254",""

I'm looking for "smith" & "12345" and sometimes "anywhere"

We may get a call from john smith (john, not jon, because they didn't spell their first name) who didn't leave their middle initial and gave us only their 5-digit zip, not the 9-digit one.

So I can't build any indexes. Searching for multiple pieces on the same line gives a fairly good match, considering the source and match data aren't EXACTLY the same.
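
The multi-piece match described here can be sketched as follows, in Python rather than UniBasic or Perl. The `suppressions` list and the function name `line_matches` are hypothetical; each entry is the set of pieces that must all appear on one source line.

```python
def line_matches(line, pieces):
    """True when every piece occurs somewhere in the line, ignoring case."""
    haystack = line.upper()
    return all(piece.upper() in haystack for piece in pieces)

source_line = '"","jon c smith","1234 anywhere st","","","somecity","SS","12345-1254",""'

# Hypothetical suppression entries: last name, 5-digit zip, optional street word.
suppressions = [
    ("smith", "12345"),
    ("jones", "54321", "other"),
]

hits = [s for s in suppressions if line_matches(source_line, s)]
print(hits)
```

Plain substring tests tolerate a missing middle initial or a 5-digit zip, but they can also over-match ("smith" is found inside "smithson"), which is why a street word is sometimes added for common names.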

And of course, I'm not going to go hog wild in doing this. Creating a temp file, parsing into dynamic arrays, loops and lookups... way too much; I'd rather just use Perl to pre-process.

-----Original Message-----
From: Mike Rajkowski [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 27, 2004 2:41 PM
To: U2 Users Discussion List
Subject: RE: looking for faster Ideas...

Create a temp file and populate it with variations of the name in question (upcased, with spaces removed), storing the address information in each record.

Then loop through your list, taking each name and building the various combinations of its words.

(John David Doe - JOHNDOE, DOEJOHN, JOHNDAVIDDOE, JOHNDOEDAVID)

Then attempt to read the item from the temp file. If an item can be read, verify the address information; otherwise check the next item.
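
A rough sketch of the variation keys, in Python rather than UniBasic. A dict stands in for the temp file, and `name_keys`, `temp_file` and `is_suppressed` are hypothetical names. It builds every ordering of two or more words, which is a superset of the four keys listed above, and checks only the zip as the address verification.

```python
from itertools import permutations

def name_keys(full_name):
    """Upcased, space-free keys for every ordering of two or more words."""
    words = full_name.upper().split()
    if len(words) < 2:
        return {"".join(words)}
    keys = set()
    for size in range(2, len(words) + 1):
        for combo in permutations(words, size):
            keys.add("".join(combo))
    return keys

# Stand-in for the temp file: key -> address information (an assumption).
temp_file = {}
for key in name_keys("John David Doe"):
    temp_file[key] = {"zip": "12345"}

def is_suppressed(name, zip_code):
    """Try each variation of the incoming name; on a hit, verify the zip."""
    for key in name_keys(name):
        record = temp_file.get(key)
        if record and zip_code.startswith(record["zip"]):
            return True
    return False

print(is_suppressed("Doe John", "12345-1254"))
print(is_suppressed("Doe John", "99999"))
```

This handles reordered or dropped words, but not spelling differences such as "jon" versus "john".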

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of George Gallen
Sent: Tuesday, January 27, 2004 12:13 PM
To: 'U2 Users Discussion List'
Subject: RE: looking for faster Ideas...

In rethinking my take on that: it would still be difficult, since the arrays would only contain "parts" of the whole fields, making the searching of the arrays very difficult.

We can't store the exact entry, since sometimes people will call and say stop sending me things and not give us the name the same way it's in the database we rent.

Basically it takes the renting company a couple of months to remove the name, but we like to filter it immediately to stop anything from going out before the renting company removes it, and it will also catch it if the renting company puts it back a couple of months later....

George

-----Original Message-----
From: George Gallen [mailto:[EMAIL PROTECTED]
Sent: Tuesday, January 27, 2004 2:06 PM
To: 'U2 Users Discussion List'
Subject: RE: looking for faster Ideas...

I can't just check for names; it has to be a name with a specific zip code,
and if the name is fairly common, we also add in part of the address to
make sure no one else is weeded out who shouldn't be.

I suppose I could keep two or three arrays, do a specific lookup in each,
saving the position, and if all three positions are identical (assuming all
three arrays have the name, address, zip in the same order) then that would
be a match....Thanks
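
The parallel-array idea might look roughly like this, sketched in Python rather than UniBasic. The three lists and the function name `find_match` are made-up sample data, kept in the same order so that position N in each list describes the same person.

```python
# Hypothetical parallel arrays: same position = same suppression entry.
names     = ["SMITH", "JONES", "SMITH"]
addresses = ["ANYWHERE", "OTHER", "ELSEWHERE"]
zips      = ["12345", "54321", "12345"]

def find_match(name, address, zip_code):
    """Return the shared position where all three values line up, else None."""
    for pos, (n, a, z) in enumerate(zip(names, addresses, zips)):
        if n == name and a == address and z == zip_code:
            return pos
    return None

print(find_match("SMITH", "ELSEWHERE", "12345"))
print(find_match("SMITH", "OTHER", "12345"))
```

The sketch walks every position instead of taking the first hit from each array, because a common name can appear more than once: a first-hit lookup for SMITH would return position 0 and never line up with the ELSEWHERE entry at position 2.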

George

>-----Original Message-----
>From: Jeff Schasny [mailto:[EMAIL PROTECTED]]
>Sent: Tuesday, January 27, 2004 1:51 PM
>To: U2 Users Discussion List
>Subject: RE: looking for faster Ideas...
>
>
>How about keeping a list of excluded names as a record in a file (or as a
>flat file in a directory, with each name/item/whatever on a line), reading
>it into the program as a dynamic array, and then doing a LOCATE on the
>string in question? Something like this:
>
>
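>* ALIST is the exclusion list, one name per field. AFILE is assumed to be
>* opened earlier, SOME.ID is a placeholder record ID, and INLIST is assumed
>* to be a dynamic array of the strings to check, loaded elsewhere.
>READ ALIST FROM AFILE,SOME.ID ELSE STOP
>X = 0
>LOOP
>   X += 1
>   ASTRING = INLIST<X>
>UNTIL ASTRING = ''
>   LOCATE ASTRING IN ALIST SETTING POS THEN
>      NULL ;* string is on the exclusion list: do other stuff here
>   END ELSE
>      NULL ;* string is not on the list: don't
>   END
>REPEAT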
>
>Of course, if you really want speed then sort the list and use a "BY" clause
>in the LOCATE.
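
To illustrate why a sorted list helps, here is a small Python sketch using binary search from the standard `bisect` module. Binary search is swapped in purely as an illustration; this is not a claim about how LOCATE with a BY clause is implemented. The list contents and the name `is_excluded` are made up.

```python
from bisect import bisect_left

# Hypothetical exclusion list, kept sorted so it can be searched by halving.
excluded = sorted(["SMITH", "DOE", "JONES", "ADAMS"])

def is_excluded(name):
    """Binary search: find where name would sit, then check it is really there."""
    pos = bisect_left(excluded, name)
    return pos < len(excluded) and excluded[pos] == name

print(is_excluded("JONES"))
print(is_excluded("BROWN"))
```

On a sorted list a miss can be declared as soon as the search passes the spot where the name would have been, instead of after scanning the whole list.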
>

_______________________________________________
u2-users mailing list
[EMAIL PROTECTED]
http://www.oliver.com/mailman/listinfo/u2-users
