Dear Andrew,
You already got good suggestions from Mojca and Arthur; they show the only path you can follow if you want to mark all the vowels of a Latin text with macrons and breves.

Now, the five vowels a, e, i, o, and u, lowercase and uppercase, with macrons and breves all have hexadecimal Unicode codes in the second Unicode page: the first byte is 01 for all of them and the second byte lies between 00 and 6D, from Ā = 0100 to ŭ = 016D. I did not look up ȳ, which seems to me a very rare "Latin" vowel; yes, I know that it might be used in transliteration from Greek, but it is not Latin "proper".

Therefore you need utf8x as an option for the inputenc package, and this may be OK; but the T1 option for the fontenc package does not work, because this encoding contains mostly glyphs selected for western languages, including the abreve for Romanian, but, unless I missed some glyph, it does not contain all the 20 uppercase and lowercase vowels with macrons and breves.

Therefore you have to use Xe(La)TeX or Lua(La)TeX. The variants containing the letters La let you use the familiar LaTeX markup, but they also let you use all the bells and whistles those programs offer; I have no experience with Xe(La)TeX, but I have a minimal (infinitesimal?) experience with Lua(La)TeX and I am surprised by the many things that can be done with this typesetting engine.
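A minimal sketch of what such a source file might look like, to be compiled with lualatex or xelatex rather than pdflatex. It assumes that the default font loaded by fontspec contains the marked vowels you need (otherwise another font must be selected), and it makes no promise that the Latin hyphenation patterns treat the marked vowels as you wish; both points should be checked on your system.

```latex
% Compile with lualatex or xelatex, not with pdflatex.
\documentclass{article}
\usepackage{fontspec}       % OpenType/Unicode font handling
\usepackage[latin]{babel}   % Latin hyphenation patterns and captions
\begin{document}
% The marked vowels are typed directly as Unicode characters;
% no \= or \u macros are needed in the source.
grātiās plūrimās vōbīs agō
\end{document}
```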

Another solution would be to redefine the original macros \= and \u, for the macron and the breve respectively, in such a way that they insert a \allowhyphens declaration before and after the marked vowel; this macro is defined in babel and introduces a zero-width blob of glue preceded by an infinite penalty. By so doing, a word containing a marked vowel gets split into three pieces: (a) what precedes the vowel, (b) the marked vowel, and (c) what follows the vowel. The fragments (a) and (c) get proper hyphenation, but the word is never split just before or just after the marked vowel. A possible break point is missed, but at least the rest is properly hyphenated.
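A rough sketch of this idea for pdfLaTeX follows. Instead of redefining \= and \u themselves, it wraps them in two new macros, which is a simplification of mine; the names \hyphglue, \longv and \shortv are hypothetical, and \hyphglue simply spells out the penalty-plus-glue construction described above rather than calling babel's own \allowhyphens.

```latex
\documentclass{article}
\usepackage[latin]{babel}
% Hypothetical helper: an infinite penalty followed by zero-width glue,
% so that TeX sees a word boundary but cannot break the line there.
\newcommand*\hyphglue{\leavevmode\nobreak\hskip 0pt\relax}
% Hypothetical wrappers around the standard accent macros.
\newcommand*\longv[1]{\hyphglue\=#1\hyphglue}   % vowel with macron
\newcommand*\shortv[1]{\hyphglue\u#1\hyphglue}  % vowel with breve
\begin{document}
% \i is the dotless i, so that the macron does not sit on the dot.
gr\longv{a}ti\longv{a}s pl\longv{u}rim\longv{a}s
v\longv{o}b\longv{\i}s ag\longv{o}
\end{document}
```

With this markup only the unmarked fragments of each word remain candidates for hyphenation, which is exactly the limitation discussed next.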

But, AAAAARGH!!! catastrophe!!!! if you mark all the vowels of every word, this procedure fails miserably. This was the trick I used when the old TeX 2.x was available; it used only 128-glyph fonts (the standard CM fonts) and it was almost impossible to typeset in a decent way texts containing sentences in languages such as Spanish, or Portuguese, or Catalan, and, most regrettably, French; let's not speak about Romanian.

At that time we already had foreign European students on Erasmus mobility, and they had to typeset their theses at our technical University in their home languages; I succeeded in setting up different versions of LaTeX (one for each language, since at that time TeX could handle one language at a time) using the above trick. It worked pretty well with the Iberian languages, which use many accents, but not more than one per word; it worked not so well with French, which may use several accents in a single word (for example, électricité); it worked definitely badly with Romanian, where I even found words with 5 special national characters. It was necessary to insert several \- commands by hand in order to adjust line breaking, but at least something more or less automatic was better than nothing. Fortunately enough, TeX 3 came up at the beginning of the nineties together with the DC (EC) T1 encoded fonts and all these problems vanished.

Nevertheless...

The language description files for Italian and Latin contain the definition of the double quote character " as an active character; when inserted inside a word between two "normal" letters (letters without diacritics) it inserts a discretionary break that does not forbid the hyphenation of both word fragments: it is used for introducing etymological break points where the patterns would simply produce phonetic break points. If the potential break point just precedes a special letter input to TeX with an "accent" macro (such as \= or \u, for example) then the discretionary break must be introduced with "|.
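A small hedged illustration of the two shorthands for pdfLaTeX, assuming that the Latin support of babel you load provides " and "| as just described. The word is only my illustrative choice: the shorthand marks the etymological division trans-ire, where patterns of the phonetic kind would presumably give tran-si-re.

```latex
\documentclass{article}
\usepackage[latin]{babel}
\begin{document}
% " between two plain letters: etymological break point trans-ire,
% while both fragments remain open to normal hyphenation.
trans"ire

% "| is needed when the break point precedes a letter entered
% with an accent macro (here \= over the dotless i).
trans"|\={\i}re
\end{document}
```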

Your problem with ȳ might be solved with a simple macro, if the discretionary break should immediately precede the glyph:

\def\y{"|\=y}

But, let me ask a naïve question: Why would you spell every Latin word with the "longa" or "brevis" melodic poetic notation?

Today I believe that very few modern languages have semantic differences depending on the length of the syllables while lacking rhythmic accents; in western Europe I think that the language they speak in Czechia is one of the few remaining such languages, but I might be completely wrong. No modern person could read Latin prose maintaining the longa and brevis duration of each syllable as it is marked with \= and \u.

But, you might say, in poetry this is very important; yes, I agree, but I remember reading Virgil and other poets (and also the Greek tragedies) when I was in high school (I graduated 52 years ago) and there was no longa/brevis indication on any vowel; my schoolmates and I had clearly in mind the rhythmic differences of dactyli and spondei, the rhythmic differences between hexametri and pentametri, and rarely missed a correct metric division. At the final State exam, very difficult and very selective, at the end of high school, we were supposed to read Latin and Greek poetry with the rhythm these poems required "at first sight", that is, without trying each verse two or three times in advance. The only difficulty for us, and possibly for our instructors, was to respect the Greek tones, raising or lowering the voice, or, even worse, doing a vibrato with the long vowels marked with a perispomeni/circumflex. I don't claim we were infallible, but as 18 or 19 year old teenagers we were all doing pretty well; we knew the rules and we could apply them at first sight. Then, why stress the typographical capabilities of our beloved typesetting engines for something that may appear useless to someone as ignorant as myself? (Remember that although I had classical studies in high school, I am an engineer and spent most of my working years as a researcher and a professor of electronics at the University.)

Certainly a different approach is necessary in a dictionary or in a grammar, but in that case single words are marked with their longae and breves; even more important would be to mark the longae and breves on single words when philological considerations are being made, especially for discussing the derivation of modern Romance languages from their Latin father, taking into account also the substrate and the adstrate of the populations on whom Latin was imposed by the conquerors. Knowing even a few elements of Romance philology, a person with even a basic Latin education can most easily understand the Latin derived words of many official languages and their varieties; ... English included, since its Saxon bases are diluted in a large sea of Latin based words introduced by the Normans in the XI century.

Forgive my intrusiveness, please :-)
Claudio



Andrew Gollan wrote:
grātiās plūrimās vōbīs agō

Andrew Gollan
"bis vincit qui se vincit"
Latin - Henry Clay HS
