Re: [Boston.pm] Postal address De-duping

2003-08-14 Thread Bill N1VUX
really, any perl programmer worth hiring should be able to do this while sleeping. Really? Postal de-duping may be harder than you think. What's the regexp to convert or match FIELDS CORNER, BOSTON to DORCHESTER? It's not just canonicalization of abbreviations and moving Apartments
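The point about FIELDS CORNER, BOSTON and DORCHESTER can be illustrated with a minimal sketch: matching a neighborhood to its delivery locality needs reference data rather than a pattern. The table `LOCALITY_ALIASES` and the function `canonical_locality` are hypothetical names invented for this example, and the single entry comes from the message above; a real table would have to come from postal data.

```python
# Hypothetical alias table: maps (neighborhood, city) to the locality name
# to compare on. The one entry is the example from the message; real data
# would have to be sourced from a postal reference file.
LOCALITY_ALIASES = {
    ("FIELDS CORNER", "BOSTON"): "DORCHESTER",
}


def canonical_locality(locality: str) -> str:
    """Return an upper-cased locality, resolved through the alias table."""
    # Split "Fields Corner, Boston" into ("FIELDS CORNER", "BOSTON")
    parts = tuple(p.strip().upper() for p in locality.split(","))
    # Unknown localities fall through unchanged (apart from case and spacing)
    return LOCALITY_ALIASES.get(parts, ", ".join(parts))


print(canonical_locality("Fields Corner, Boston"))
print(canonical_locality("Dorchester"))
```

No regular expression appears in the lookup itself: the equivalence lives entirely in the data, which is why this part of the problem does not reduce to pattern matching.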

Re: [Boston.pm] Postal address De-duping

2003-08-14 Thread John Saylor
hi ( 03.08.04 17:12 -0400 ) Joel Gwynn: we're looking for a fast, customizable de-duping solution. I was thinking there might be some perl stuff out there, really, any perl programmer worth hiring should be able to do this while sleeping. -- \js

Re: [Boston.pm] Postal address De-duping

2003-08-14 Thread Richard Morse
On Tuesday, August 5, 2003, at 09:07 AM, John Saylor wrote: hi ( 03.08.04 17:12 -0400 ) Joel Gwynn: we're looking for a fast, customizable de-duping solution. I was thinking there might be some perl stuff out there, really, any perl programmer worth hiring should be able to do this while

Re: [Boston.pm] Postal address De-duping

2003-08-14 Thread Drew Taylor
On 8/5/03 11:21 AM, Tolkin, Steve mentioned: The article in question can be found at http://www.foo.be/docs/tpj/issues/vol4_1/tpj0401-0002.html (I had a hard time finding it via tpj.com, but Google worked.) Unfortunately I think that the USPS site http://www.usps.com/cgi-bin/zip4/zip4inq needed

Re: [Boston.pm] Postal address De-duping

2003-08-14 Thread Daniel M. Lipton
You may find more useful information as a registered USPS developer: http://www.USPSPriorityMail.com/et_regcert.html If you don't want to register before you get more answers, read carefully through their web tools documents available here: http://www.uspswebtools.com/ I believe the most

Re: [Boston.pm] Postal address De-duping

2003-08-14 Thread Chris Brooks
Actually, if I understand what Joel was asking about, removing duplicates by address is a non-trivial task -- address data is notoriously dirty. What makes the job interesting is that there are a wide variety of abbreviations used in addresses -- for example: 22 Saint John Street 22 St John
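As a rough illustration of the abbreviation problem described above, the sketch below reduces each address to a comparison key before checking for duplicates. The `ABBREVIATIONS` table and the names `address_key` and `dedupe` are assumptions made up for this example, not anything proposed in the thread, and the table is far smaller than real address data would need.

```python
import re

# Hypothetical abbreviation table. Note that "saint" and "street" both map
# to "st", which is exactly the ambiguity that makes this hard in general.
ABBREVIATIONS = {
    "saint": "st",
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "apartment": "apt",
}


def address_key(address: str) -> str:
    """Return a crude comparison key for a street address."""
    # Lowercase, turn punctuation into spaces, then split on whitespace
    words = re.sub(r"[^\w\s]", " ", address.lower()).split()
    # Replace each known long form with its abbreviation
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def dedupe(addresses):
    """Keep the first address seen for each comparison key."""
    seen = {}
    for address in addresses:
        seen.setdefault(address_key(address), address)
    return list(seen.values())


print(dedupe(["22 Saint John Street", "22 St. John St", "5 Elm Road"]))
```

This only catches spelling variants of the same words. It does nothing for misspellings, reordered apartment numbers, or alternate locality names, which is where the task stops being trivial.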

Re: [Boston.pm] Postal address De-duping

2003-08-09 Thread Andrew Pimlott
On Tue, Aug 05, 2003 at 11:21:25AM -0400, Tolkin, Steve wrote: Unfortunately I think that the USPS site http://www.usps.com/cgi-bin/zip4/zip4inq needed to run this script is no more. A search there for zip4inq produced nothing. Does anyone know of a similar page, either by the USPS or

Re: [Boston.pm] Postal address De-duping

2003-08-09 Thread David Cantrell
On Tuesday, August 5, 2003, at 09:07 AM, John Saylor wrote: really, any perl programmer worth hiring should be able to do this while sleeping. No, it's quite a hard problem. All of the following UK addresses are the same and are deliverable. 2/11 CR7 8JH 11b CR7 8JH Flat 2, 11 Beulah Road, CR7

Re: [Boston.pm] Postal address De-duping

2003-08-08 Thread Steve Revilak
Unfortunately I think that the USPS site http://www.usps.com/cgi-bin/zip4/zip4inq needed to run this script is no more. A search there for zip4inq produced nothing. Does anyone know of a similar page, either by the USPS or another provider of (web) services? Just follow the Find a Zip

Re: [Boston.pm] Postal address De-duping

2003-08-06 Thread John Saylor
hi On Tuesday, August 5, 2003, at 09:07 AM, John Saylor wrote: really, any perl programmer worth hiring should be able to do this while sleeping. ( 03.08.05 19:21 +0100 ) David Cantrell: No, it's quite a hard problem. i guess it depends on the way the problem is defined by the client. as

RE: [Boston.pm] Postal address De-duping

2003-08-06 Thread Tolkin, Steve
On Monday, August 4, 2003, at 05:12 PM, Joel Gwynn wrote: Hey, all. We do lots of (snail) mailings, and we're looking for a fast, customizable de-duping solution. We're currently taking a look at doubletake from http://peoplesmith.com/, which

Re: [Boston.pm] Postal address De-duping

2003-08-04 Thread Jon Orwant
On Monday, August 4, 2003, at 05:12 PM, Joel Gwynn wrote: Hey, all. We do lots of (snail) mailings, and we're looking for a fast, customizable de-duping solution. We're currently taking a look at doubletake from http://peoplesmith.com/, which is not too expensive, but I was thinking there