On Mar 14, 2008, at 1:25 PM, Maxwell, Adam R wrote:

> I'd recommend that everyone who wants smarter whitespace and text handling
> logic read this
>
> http://www.betalogue.com/2007/07/13/mac-os-xs-preview-when-a-space-is-not-a-space/
>
> and pay particular attention to John Calhoun's comments (he wrote PDF Kit).
> This is not a simple problem.
Thanks for this, and I thought as much. It's a little worrisome that much of the ensuing conversation in that thread is about the quality of Adobe's and Apple's algorithms for detecting the many oddball edge cases that the spec allows, rather than about tightening the spec. A standard like PDF should not be a dead end for information: what goes in should remain reasonably accessible for processing in the future. One would hope that, in addition to preserving backward compatibility, the spec would evolve to eliminate the opportunity for these edge cases.

In any case, since Skim allows phrase searching, its users need to know that searches can fail due to these issues. Apparently my examples can be extended to include searches that fail because a phrase that appears to contain an internal space actually does not. If a regular expression library were used for searching, one might define a phrase pattern as the visible characters of the string separated by optional whitespace characters.

Jim Harrison
UVa

_______________________________________________
Skim-app-users mailing list
https://lists.sourceforge.net/lists/listinfo/skim-app-users
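
A minimal sketch of the phrase-pattern idea suggested above, written in Python purely for illustration. It is not how Skim or PDF Kit performs searches, and the helper name `phrase_pattern` is hypothetical. It assumes the text has already been extracted from the PDF as a plain string.

```python
import re

def phrase_pattern(phrase):
    """Build a regex matching the visible characters of `phrase`,
    with optional whitespace allowed between any two of them.
    (Hypothetical helper, for illustration only.)"""
    # Keep only the visible (non-whitespace) characters of the phrase.
    visible = [ch for ch in phrase if not ch.isspace()]
    # Escape each character literally and join with optional whitespace.
    return re.compile(r"\s*".join(re.escape(ch) for ch in visible))

pattern = phrase_pattern("white space")

# Matches whether the extracted text has the space, lacks it,
# or has spurious spaces inside a word.
found_without_space = pattern.search("smarter whitespace handling") is not None
found_with_extra_space = pattern.search("smarter wh ite  space handling") is not None
found_unrelated = pattern.search("smarter white spice handling") is not None
```

The trade-off is that such a pattern is deliberately loose: a search for "a part" would also match "apart", so a real implementation would probably want to rank or filter matches rather than accept every one.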
