> -----Original Message-----
> From: news [mailto:[EMAIL PROTECTED] On Behalf Of Jim Allan
> Sent: Thursday, June 12, 2008 1:28 PM
> To: [email protected]
> Subject: [users] Re: http://www.saveXP.org/ [WAS: Wrong OS or OS Version]
>
> Dave Craven wrote:
>
> > To Harold Fuchs, et al: I am surprised at your attitude in regards
> > to Linux. There are too many programs available, by developers, to
> > make any sense of what your statements are saying! What a shame, as
> > I had thought you were more intelligent than that! I am NOT a
> > programmer nor a developer but thought you would have understood
> > what I was saying more than anyone else! I am sorely disappointed
> > and confused at your attitude in regards to this situation, when it
> > is obvious you have not done the research and found programs on
> > Linux that are, if not completely satisfactory, at least the
> > beginnings of what is needed. I am really surprised at what you are
> > now saying in regards to same!
>
> You can always find the "beginnings of what is needed". That's hardly
> the same as having a working system. Even a non-programmer should
> understand that difference.
>
> > If I had the spare cycles, I'd cobble together a USPS address
> > validation macro. I've never written in Star Basic, so there'd be a
> > bit of a learning curve. I did a little looking around and saw that
> > the language has a SAX parser, etc., so it looks like it should be
> > quite doable.
>
> Yes, it is quite doable, or it would not have been done again and
> again. But "If I had the spare cycles..." indicates one of the
> problems. If wishes were horses then beggars would ride. That you
> suggest a macro indicates unawareness of the complexity of an address
> correction application.
>
> Search for ["address correction" software requirements] in Google to
> give you an idea of the number of companies that offer this kind of
> software and some idea of the requirements.
> I have not been able to find a Canadian or US post-office list of the
> official requirements on the web. See
> http://www.canadapost.ca/business/offerings/address_management/pdf/addaccu-e.pdf
> for Canada Post guidelines. I believe you would have to write them to
> get full criteria from a developer's point of view. Same with the US.
>
> Once you have parsed the address (which itself is tricky in some
> cases), you have to match an address that exists on the post office
> data base. Remember that you don't care about the addresses that match
> automatically (unless you are also standardizing them). It is the ones
> that don't match that you must either correct to what they should be
> or mark as uncorrectable.
>
> You will need lists of common misspellings both of street names and
> city names, and ways of working with the fact that a common
> misspelling of one name may be the correct spelling of another name,
> or that the same misspelling may be equally possible for a number of
> street names or city names.
>
> As an example, the city of St. Catharines may appear as "Saint
> Catharines", "St. Catherines", or "Saint Catherines" and all of these
> without the final "s". "London, Ontario" might appear incorrectly as
> "Londres, Ontario" if the letter was written by a French speaker. The
> names may have been originally entered into the database by cheap
> offshore typists from handwritten copy, introducing a number of errors.
>
> I recall one very bad datafile from a large and reputable organization
> years ago where again and again some data-entry person had
> cleverly/stupidly written the city name "Clearwater" as "ClearH2O".
>
> I think your attitude is rather "fools rush in". But try to see if you
> can do something in this area if you want to.
> I've written a number of parsers to fix up bad customer databases just
> to get the data in good enough order to be usable by address
> correction software, so I am well aware of the difficulties (and of
> many solutions) in respect to just the parsing part of the task.
>
> We have a commercial parser which attempts to put data in the proper
> fields, and I won't use it because it isn't good enough. Again and
> again I have to write another new parser to fit the oddities of
> particular files. I expect you will find the SAX parser does not do at
> all what you think it does in this respect, especially as people
> writing addresses don't always break up phrases into words properly.
> "RR # 1", "RR#1", "RR 1" and "RR1" must all be interpreted as "RR 1"
> as an example.
>
> An XML parser like SAXparser isn't a name and address parser and I
> think it would be very much the wrong thing to even attempt to use it
> as though it were.
>
> Jim Allan
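To make the rural-route example above concrete, here is a minimal sketch in Python. The function name and the regular expression are my own illustrative assumptions, not anything taken from USPS or Canada Post rules, and it only covers the four spellings listed in the quoted message.

```python
import re

# Hypothetical pattern: "RR", optional spaces, optional "#",
# optional spaces, then the route number.
_RR_PATTERN = re.compile(r"\bRR\s*#?\s*(\d+)\b", re.IGNORECASE)


def normalize_rural_route(text: str) -> str:
    """Rewrite rural-route designators such as 'RR#1' as 'RR 1'."""
    return _RR_PATTERN.sub(lambda m: "RR " + m.group(1), text)


if __name__ == "__main__":
    for raw in ("RR # 1", "RR#1", "RR 1", "RR1"):
        print(repr(raw), "->", repr(normalize_rural_route(raw)))
```

This handles only the easy case. Real files will contain variants a single pattern does not anticipate.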
One of the reasons I think this is doable as a Star Basic macro is
that, within OOo, I'm assuming a spreadsheet or database where the data
has already been split into usable fields (name, address1, address2,
city, state, zip, etc.), the kind of thing a typical OOo user would use
for a mail merge.

The other big reason is that I'VE ALREADY DONE THIS in another
language. The USPS facilities will correct all of the "RR # 1"
variations into their standard format, along with standardizing various
abbreviations for you. None of this is the job of the proposed macro.
It may not correct "ClearH2O", but at worst it will tell you that it
isn't a valid address. A macro could be a sufficient solution for a
desktop user.

The SAX parser is strictly for breaking up the data returned from USPS
into something that can be returned to the spreadsheet, or into
error(s) that can be presented to the user.

I agree that it's not what one might use for large-volume corporate
mailings. The USPS does provide even more facilities for corporate
uses, but they aren't free.
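A minimal sketch of that SAX step, written in Python rather than Star Basic. The element names used here (`Response`, `Address`, `City`, `State`) are invented for illustration and are not taken from the USPS documentation. The real response layout would have to be checked against whatever the USPS service actually returns.

```python
import xml.sax


class AddressHandler(xml.sax.ContentHandler):
    """Collect the text of each leaf element into a flat dict."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None   # name of the element currently open
        self._buffer = []      # text chunks seen inside that element

    def startElement(self, name, attrs):
        # Opening any element starts a fresh text buffer.
        self._current = name
        self._buffer = []

    def characters(self, content):
        # SAX may deliver text in several chunks, so accumulate them.
        if self._current is not None:
            self._buffer.append(content)

    def endElement(self, name):
        # Only leaf elements still match _current when they close.
        if self._current == name:
            text = "".join(self._buffer).strip()
            if text:
                self.fields[name] = text
        self._current = None


def parse_response(xml_text: str) -> dict:
    """Return a field-name -> value mapping for a (hypothetical) reply."""
    handler = AddressHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.fields


if __name__ == "__main__":
    sample = (
        "<Response><Address>"
        "<City>CLEARWATER</City><State>FL</State>"
        "</Address></Response>"
    )
    print(parse_response(sample))
```

The resulting dict maps directly onto spreadsheet columns. An error reply could be handled the same way by looking for whichever error element the service uses.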
