----- Original Message -----
From: "WJCarpenter" <[EMAIL PROTECTED]>
To: "AbiWord Mailing List" <[EMAIL PROTECTED]>
Sent: Thursday, July 20, 2000 8:55 AM
Subject: smart quote algorithm


| PUNCT A subset of layman's "punctuation".  I include only things that
|   can normally occur after a quote mark with no intervening white
|   space.  Includes period, exclamation point, question mark,
|   semi-colon, colon, comma, parentheses, square and curly brackets.
|   There may be a few others that aren't on the kinds of keyboards I
|   use, and there are certainly Latin1 and other locale-specific
|   variants, but the point is that there are lots of random
|   non-alphanumerics which aren't included in PUNCT for this algorithm.

We can probably use the characters classified as punctuation in the Unicode
standard. In UnicodeData.txt these have a general category starting with 'P',
such as 'Po' (other punctuation), e.g.:

0021;EXCLAMATION MARK;Po;0;ON;;;;;N;;;;;
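
As a rough illustration (not anything in AbiWord itself), such a test could be
sketched in Python with the standard unicodedata module. The function names
are made up for this example. Note that a category-based test does not match
the hand-picked PUNCT list quoted above exactly: the ASCII quote characters
are themselves 'Po', while brackets are 'Ps'/'Pe', so 'Po' alone would miss
the brackets and would need some exclusions.

```python
import unicodedata

def is_po(ch: str) -> bool:
    """True if ch has the Unicode general category 'Po' (other punctuation)."""
    return unicodedata.category(ch) == "Po"

def is_any_punct(ch: str) -> bool:
    """True if ch is in any punctuation category (Pc, Pd, Ps, Pe, Pi, Pf, Po)."""
    return unicodedata.category(ch).startswith("P")

# Hypothetical check of the characters named in the quoted PUNCT description.
for ch in ".!?;:,()[]{}":
    print(repr(ch), unicodedata.category(ch), is_po(ch), is_any_punct(ch))
```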

| ALPHA  Alphabetic characters in the C isalpha() sense, but there are
|   certainly some non-ASCII letter characters which belong in this
|   bucket, too.

Almost 50,000 in Unicode ...
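
A minimal sketch of an ALPHA test that is not limited to ASCII, assuming we
treat every character whose general category starts with 'L' (letter) as
alphabetic. The name is_alpha is invented for this example, and whether the
'L' categories are the right bucket for the smart quote algorithm is an
assumption.

```python
import unicodedata

def is_alpha(ch: str) -> bool:
    """True if ch is in one of the Unicode letter categories (Lu, Ll, Lt, Lm, Lo)."""
    return unicodedata.category(ch).startswith("L")

# ASCII, Latin-1, Cyrillic and CJK letters all count; digits and '!' do not.
for ch in ["a", "\u00e9", "\u0416", "\u4e2d", "1", "!"]:
    print(repr(ch), is_alpha(ch))
```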

| The algorithm doesn't make a special case of using ASCII double quote
| as an inches indicator (there are other uses, like lat/long minutes;

For minutes and feet, U+2032 (PRIME) should really be used. (Yes, nobody will
use it unless AbiWord makes it easy to insert!)

BTW, I've found that " shouldn't be used for inches after all; U+2033 (DOUBLE
PRIME) should be used (for inches and seconds).
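
To make the two code points concrete, here is a small Python sketch of an
explicit, user-requested conversion. It assumes that an ASCII quote directly
after a digit is a measurement mark, which is exactly the guess the quoted
algorithm declines to make automatically, so this is only suitable as
something the user invokes on purpose. The function name to_primes is
hypothetical.

```python
import re

PRIME = "\u2032"         # feet, minutes
DOUBLE_PRIME = "\u2033"  # inches, seconds

def to_primes(text: str) -> str:
    """Replace ASCII quotes that directly follow a digit with prime marks."""
    text = re.sub(r'(?<=\d)"', DOUBLE_PRIME, text)
    text = re.sub(r"(?<=\d)'", PRIME, text)
    return text

print(to_primes("6' 2\""))
```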

| ditto for the ASCII quote) because it is tough to tell if some numbers
| with an ASCII double quote after them are intended to be one of those
| "other things" or is just the end of a very long quote.

Yes.

| So, the
| algorithm will be wrong sometimes in those cases.


| It is otherwise sort of conservative, preferring to not convert things
| it doesn't feel confident about.  The reason for that is that there is
| a contemplated on-the-fly conversion to smart quotes, but there is no
| contemplated on-the-fly conversion to ASCII QUOTEs.

Well, you can turn off smart quotes. In Word, you can use 'Undo' to undo the
conversion of a single quote.

| What about the occasions when this algorithm (or any alternative
| algorithm) makes a mistake and converts a QUOTE to the curly form when
| it really isn't wanted, in a particular case, by the user?

So, if your inch mark is converted to a '99', you only have to press 'Ctrl+Z'.
This should work in AbiWord too.
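
One way this could work, sketched in Python: the typed character and the
automatic conversion are recorded as two separate undo steps, so the first
undo only reverts the conversion. This is a toy model, not how Word or
AbiWord is actually implemented; the Buffer class and its deliberately naive
open/close rule are assumptions made for the example.

```python
class Buffer:
    """Toy text buffer with an undo stack of previous text states."""

    def __init__(self):
        self.text = ""
        self._undo = []

    def _apply(self, new_text):
        # Remember the current state so it can be restored by undo().
        self._undo.append(self.text)
        self.text = new_text

    def type_char(self, ch, smart=True):
        # Step 1: insert the character exactly as typed.
        self._apply(self.text + ch)
        if smart and ch == '"':
            # Step 2: convert it as a separate undo step. Naive rule:
            # opening quote at the start or after whitespace, else closing.
            prev = self.text[-2:-1]
            curly = "\u201c" if prev == "" or prev.isspace() else "\u201d"
            self._apply(self.text[:-1] + curly)

    def undo(self):
        if self._undo:
            self.text = self._undo.pop()

buf = Buffer()
for ch in '12"':
    buf.type_char(ch)
buf.undo()  # reverts only the conversion, leaving the plain ASCII quote
print(buf.text)
```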

-- 
Karl Ove Hufthammer



