-------- Original Message --------
Subject: Re: [l2h] Converting emdashs and endashs?
Date: Tue, 12 Aug 2003 19:35:11 +0200
From: Daniel Taupin <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: James Howison <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]>

Please do not confuse the shapes of quotes (single, double), which are a character
problem, with the handling of -- and ---. The latter are standard ligatures
in TeX fonts, while the former are a matter of typing taste.

Therefore, since it is a TeX/LaTeX standard, I ask for a standard conversion
(except in math mode) from -- to "endash", and a FURTHER conversion of "endash"
followed by a - to "emdash".
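
A minimal sketch of the two-step rule described above, written in Python for illustration rather than in latex2html's own Perl. The helper name convert_dashes and the treatment of math as simple $...$ spans are assumptions made for this example only; they are not taken from the latex2html source.

```python
import re

EN_DASH = "&#8211;"  # U+2013 as an HTML numeric character reference
EM_DASH = "&#8212;"  # U+2014, the reference quoted later in this thread

# Assumption: math mode is only the inline $...$ form; \[...\], \(...\)
# and math environments are ignored in this sketch.
_MATH_SPAN = re.compile(r"(\$[^$]*\$)")


def convert_dashes(text: str) -> str:
    """Convert -- and --- outside $...$ using the two-step rule."""
    parts = _MATH_SPAN.split(text)
    converted = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd indices are the captured math spans: leave them untouched.
            converted.append(part)
            continue
        # Step 1: every -- becomes an en dash.
        part = part.replace("--", EN_DASH)
        # Step 2: an en dash followed by a further - becomes an em dash.
        part = part.replace(EN_DASH + "-", EM_DASH)
        converted.append(part)
    return "".join(converted)


print(convert_dashes("pages 3--5, and then---suddenly---$a--b$"))
```

Doing the en dash step first means a run of three hyphens needs no separate pattern: it falls out of the second step.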

On the other hand, I would disagree with a change in the behaviour of double
quotes, mainly because it would be tricky for people copying and pasting from
latex2html-generated screens.

James Howison wrote:

On Monday, August 11, 2003, at 11:08 pm, Ross Moore wrote:

Hello James,

On Mon, 11 Aug 2003, James Howison wrote:

Now that I have curly quotes happening (yay!) I am wondering about the other
special characters.  I realize that this will break backwards
compatibility but that is not an issue for my needs.

I would like "---" to be converted to "&#8212;" as defined in the
unicode.pl file at 799 - but this doesn't seem to happen - instead it
is converted to "--".  This is also what happens if I change --- to
{---}.


That is definitely a lot harder; particularly since -- and --- are
rarely used correctly in LaTeX manuscripts.
So general rules may easily result in something that the author
never intended.


I use -- and --- often.

I'm still wondering, though, how to tell which conversions specified in the unicode.pl file actually happen and which do not---and how those are controlled ... I guess I'll spend some more time with the source ;)

Also I see from the source that converting single quotes is
tough---perhaps I'm naive but it would seem to me that this sequence
would work...

s/``/&#8220;/og;
s/`/&#8216;/og;    # once the `` is gone then the ` is only used for open single quote, right?


Not at all.  \`  is used as an accent, and in some language variants,
the ` is made active to remove the need to use the \ .
With this active character, overloading can occur for generating
other special characters or ligatures.
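
To make the objection concrete, here is a small Python sketch (illustrative only, not latex2html code; both function names are made up for this example). The naive version mirrors the two substitutions proposed above and corrupts the \` accent command. The guarded version skips a backtick that directly follows a backslash. It does nothing about the active-character case in some language variants, and it would also wrongly skip a quote placed right after a \\ line break.

```python
import re


def naive_open_quotes(text: str) -> str:
    """Replace `` and then every remaining ` with curly opening quotes."""
    text = text.replace("``", "&#8220;")
    return text.replace("`", "&#8216;")


def guarded_open_quotes(text: str) -> str:
    """Same idea, but leave a backtick alone when a backslash precedes it."""
    text = re.sub(r"(?<!\\)``", "&#8220;", text)
    return re.sub(r"(?<!\\)`", "&#8216;", text)


sample = r"``caf\`e'' and `x'"
print(naive_open_quotes(sample))    # the accent command is damaged
print(guarded_open_quotes(sample))  # the accent command survives
```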


Right - well, I see the difficulty now. Quite an important distinction - language compatibility being very important. The use of ` rather than \ is not something that I'm familiar with - out of interest, why is this done? Is it because the \ character is not easily accessible on the keyboard?

If these conversions are done _after_ the conversions from latex->unicode then perhaps this would work (i.e. the international characters would already be converted to their unicode expressions ...).

s/''/&#8221;/og;
s/'/&#8217;/og;    # will also replace apostrophes with close curly single - not a bad thing.


Sorry; I cannot agree.
Every Latin-based charset encoding has an apostrophe character.
A curly-quote is most definitely *not* logically an apostrophe, even
though it may look like one.


I acknowledge that this is a matter of style---but the Unicode standard discusses this and generally prefers the use of the curly single quote (U+2019, &#8217;) to the straight mark (U+0027, &#39;).

http://www.unicode.org/unicode/reports/tr8/#Apostrophe%20Semantics%20Errata
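
The Unicode code points for these two marks are hexadecimal (U+2019 and U+0027), while HTML references of the form &#NNNN; are decimal. A short Python sketch of the arithmetic, using only the standard library (the helper name numeric_reference is made up for this example):

```python
import html


def numeric_reference(ch: str) -> str:
    """Return the decimal HTML numeric character reference for one character."""
    return f"&#{ord(ch)};"


curly = numeric_reference("\u2019")     # right single quotation mark
straight = numeric_reference("\u0027")  # ASCII apostrophe
print(curly, straight)

# Writing the hex digits 2019 as if they were decimal names a different
# character: decimal 2019 is hex 07E3, not hex 2019.
mistaken = html.unescape("&#2019;")
print(f"U+{ord(mistaken):04X}")
```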

<snip>

The aim of an HTML translation should not be appearance.
It should be ensuring that meaning is preserved, and that no symbol
is rendered with the 'missing character' glyph.


I think one might reasonably disagree that appearance is not important---HTML is, intentions notwithstanding, a format used for presentation. Your point and care about the 'missing character' glyph are well taken; the warnings are very useful for this.

The 'div' request for CSS in Hakan's email also reflects the use of HTML as an appearance format.

Thanks,
James

Hope this helps,

Ross Moore


Thanks, James


On Saturday, August 9, 2003, at 02:53 am, Ross Moore wrote:


On Sat, 9 Aug 2003, James Howison wrote:

Hi all,

I'd really like to convert the latex quotation marks, `` and '', to the
recommended HTML curly quotes, &#8220; instead of `` and &#8221; instead
of '' - standard codes that render the curly quotes beautifully.


set
 $USE_CURLY_QUOTES =1;
in an initialisation file.

This is not the default, because not all browsers actually render
these characters. (At least, that was the situation 3-4 years ago when
the LaTeX2HTML coding was written.)


Hope this helps,


Ross Moore



I'm sure that this is possible through latex2html - the codes are listed around unicode.pl:722 - but either I can't find the magic incantation to have latex2html do the conversion or there is a bug preventing this from working in my version (1.70) or set-up.

I've tried:

latex2html -html_version 4.0,unicode test.tex

What is strange is that this does work for, say, \v{Z}, which converts to
the code &#381; (and that is definitely happening through unicode.pl: I
changed the translation and it worked fine).

So why doesn't the translation for `` (which is correctly listed in
unicode.pl as \`\`) and '' (which is correctly listed as \'\') work?

I've had a good hunt around for this - but I can't see why the other
codes are converted but not the quotes.

Cheers,
James

ps. minimal test.tex follows

----------

\documentclass[11pt]{article}
\begin{document}
``Why are these quotes not converted to unicode''  (they are in the
unicode.pl file)
While this symbol (also in the unicode.pl file) is? - \v{Z}
\end{document}

_______________________________________________
latex2html mailing list
[EMAIL PROTECTED]
http://tug.org/mailman/listinfo/latex2html



--
------------------------------------------------------------------------
Daniel Taupin, 91400 ORSAY - France
E-mail= mailto:[EMAIL PROTECTED]
Home/fax: (33)1.60.10.26.44. Rep.: (33)1.60.10.04.13, fax (work) (33)1.69.15.60.86











