Dave Craven wrote:

To Harold Fuchs, et al: I am surprised at your attitude with regard to Linux. There are too many programs available, by developers, to make any sense of what your statements are saying! What a shame, as I had thought you were more intelligent than that! I am NOT a programmer nor a developer, but I thought you would have understood what I was saying more than anyone else! I am sorely disappointed and confused at your attitude in this situation, when it is obvious you have not done the research and found programs on Linux that are, if not completely satisfactory, at least the beginnings of what is needed. I am really surprised at what you are now saying about this!

You can always find the “beginnings of what is needed”. That’s hardly the same as having a working system. Even a non-programmer should understand that difference.

If I had the spare cycles, I'd cobble together a USPS address validation
macro.  I've never written in Star Basic, so there'd be a bit of a
learning curve.  I did a little looking around and saw that the language
has a SAX parser, etc., so it looks like it should be quite doable.

Yes, it is quite doable, or it would not have been done again and again. But “If I had the spare cycles...” indicates one of the problems. If wishes were horses, then beggars would ride. That you suggest a macro indicates unawareness of the complexity of an address correction application.

Search for ["address correction" software requirements] in Google to get an idea of the number of companies that offer this kind of software and some idea of the requirements. I have not been able to find a Canadian or US post office list of the official requirements on the web. See http://www.canadapost.ca/business/offerings/address_management/pdf/addaccu-e.pdf for Canada Post guidelines. I believe you would have to write to them to get the full criteria from a developer's point of view. Same with the US.

Once you have parsed the address (which itself is tricky in some cases), you have to match it to an address that exists in the post office database. Remember that you don’t care about the addresses that match automatically (unless you are also standardizing them). It is the ones that don’t match that you must either correct to what they should be or mark as uncorrectable.
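To make the three outcomes concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the OFFICIAL set is a made-up stand-in for the post office database, and suggest is any function you supply that proposes official candidates for a bad address. A record is corrected only when exactly one candidate survives; anything else is marked uncorrectable.

```python
from enum import Enum


class Status(Enum):
    MATCHED = "matched"
    CORRECTED = "corrected"
    UNCORRECTABLE = "uncorrectable"


# Hypothetical stand-in for the post office database.
OFFICIAL = {("12 MAIN ST", "LONDON", "ON")}


def triage(address, suggest):
    """Classify one parsed address.

    `suggest` is any callable that takes the address and returns a
    list of candidate addresses (hypothetical; supplied by the caller).
    """
    if address in OFFICIAL:
        return Status.MATCHED, address
    # Keep only candidates that really exist in the official data.
    options = [c for c in suggest(address) if c in OFFICIAL]
    if len(options) == 1:
        return Status.CORRECTED, options[0]
    # Zero candidates, or several equally plausible ones.
    return Status.UNCORRECTABLE, address
```

The hard part, of course, is the suggest function, not this bookkeeping.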

You will need lists of common misspellings both of street names and city names, and ways of working with the fact that a common misspelling of one name may be the correct spelling of another name or that the same misspelling may be equally possible for a number of street names or city names.

As an example, the city of St. Catharines may appear as “Saint Catharines”, “St. Catherines”, or “Saint Catherines”, and each of these may also appear without the final “s”. “London, Ontario” might appear incorrectly as “Londres, Ontario” if the letter was written by a French speaker. The names may have been originally entered into the database by cheap offshore typists from handwritten copy, introducing a number of errors.
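A rough sketch of the city-name part, in Python, might look like the following. The city list and the table of known variants are invented samples, not real post office data, and the 0.85 similarity cutoff is an arbitrary assumption that would need tuning against real files. The function returns every plausible official name, so that the caller can see when a misspelling is ambiguous instead of silently picking one.

```python
import difflib
import re

# Hypothetical sample of official city names.
OFFICIAL_CITIES = ["St. Catharines", "London", "Clearwater"]

# Hypothetical table of known bad spellings that fuzzy matching would miss.
KNOWN_VARIANTS = {"londres": "London", "clearh2o": "Clearwater"}


def canonical_key(name):
    """Lowercase, treat 'Saint' as 'St', drop punctuation, squeeze spaces."""
    key = name.lower()
    key = re.sub(r"\bsaint\b", "st", key)
    key = re.sub(r"[^a-z0-9 ]", "", key)
    return " ".join(key.split())


def candidates(raw, cutoff=0.85):
    """Return all official city names that plausibly match `raw`."""
    key = canonical_key(raw)
    if key in KNOWN_VARIANTS:
        return [KNOWN_VARIANTS[key]]
    by_key = {canonical_key(city): city for city in OFFICIAL_CITIES}
    hits = difflib.get_close_matches(key, list(by_key), n=5, cutoff=cutoff)
    return [by_key[hit] for hit in hits]
```

With this sample data, candidates("Saint Catherine") yields only "St. Catharines". Against a full city list the same call could easily return several names, and those are the records that need a human or a smarter rule.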

I recall one very bad datafile from a large and reputable organization years ago where again and again some data-entry person had cleverly/stupidly written the city name “Clearwater” as “ClearH2O”.

I think your attitude is rather “fools rush in”. But try to see if you can do something in this area if you want to. I’ve written a number of parsers to fix up bad customer databases just to get the data in good enough order to be usable by address correction software, so I am well aware of the difficulties (and of many solutions) in respect to just the parsing part of the task.

We have a commercial parser which attempts to put data in the proper fields, and I won’t use it because it isn’t good enough. Again and again I have to write another new parser to fit the oddities of particular files. I expect you will find the SAX parser does not do at all what you think it does in this respect, especially as people writing addresses don’t always break up phrases into words properly. As an example, “RR # 1”, “RR#1”, “RR 1” and “RR1” must all be interpreted as “RR 1”.
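That one rural route case can be handled with a regular expression. The Python sketch below covers only the four spellings listed above (plus lower case); the function name is my own, and real files will contain other forms that this pattern does not catch.

```python
import re

# "RR", optional spaces, optional "#", optional spaces, then the route number.
RR_PATTERN = re.compile(r"\bRR\s*#?\s*(\d+)\b", re.IGNORECASE)


def normalize_rural_route(text):
    """Rewrite rural route designators in `text` to the form 'RR <number>'."""
    return RR_PATTERN.sub(lambda m: "RR " + m.group(1), text)


for sample in ["RR # 1", "RR#1", "RR 1", "RR1"]:
    print(repr(sample), "->", repr(normalize_rural_route(sample)))
```

Each file tends to need a handful of rules like this one, and they differ from file to file, which is why a general-purpose parser rarely does the job.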

An XML parser like SAXparser isn’t a name and address parser and I think it would be very much the wrong thing to even attempt to use it as though it were.

Jim Allan

