[scribus] Hyphenation

John Culleton Fri, 28 Oct 2011 14:12:35 -0400

On Thu, 27 Oct 2011 20:19:07 +0000 (UTC)
Andreas Vox <avox at arcor.de> wrote:

> John Jason Jordan <johnxj at ...> writes:
> 
> ...
> 
> > I agree with Gregory that hyphenation is not perfect in many
> > programs. And Gregory has an excellent point about short syllable
> > breaks (e.g. "re-ceived") being sometimes ambiguous, leading the
> > reader to have to pause or re-read a line to connect the first and
> > last parts of the hyphenated word. I'm not sure how a layout
> > program can fix this, however.
> ...
> 
> > 
> > I don't know how Scribus does its hyphenation. And if it does use an
> > algorithm, switching to dictionary-based hyphenation just for
> > English may be impractical. Nevertheless, I wanted to point out that
> > hyphenation is more problematic than just deciding whether to base
> > it on the entire paragraph or one line at a time.
> 
> Scribus uses the same algorithm as TeX and OO.o. But that congenial
> method is really also a dictionary approach: to create the
> hyphenation rules, a large corpus of text is fed into the generation
> program. The program then tries to condense that information into a
> ruleset, which contains rules like "if you see this 'xyz' pattern,
> assume good break pos at 1 and bad break pos at 2, unless it also
> matches 'pxyzq', in which case the best break position is 4,
> unless...." This results in a file with hundreds of short patterns
> which indicate good and bad break positions (prioritized 1-5 iirc).
> Then the whole corpus is tested with this algorithm and the remaining
> words which aren't hyphenated correctly (usually just a dozen or so)
> are put into an exception list.
> 
> I don't know of any program that handles problems like "re-ceive"
> properly. With a paragraph layouter it should be possible to include
> extra penalties for such cases, so the layouter would automatically
> try to avoid those.
> 
> /Andreas

TeX first looks the word up in a hyphenation dictionary (its exception
list); if the word is not found there, it falls back to an algorithm
based on the work of Frank M. Liang. By default the first fragment of a
hyphenated word must be at least 2 characters long (\lefthyphenmin=2)
and the last fragment at least 3 characters (\righthyphenmin=3). There
is a parameter for discouraging hyphens (\hyphenpenalty=50); as you
increase it, hyphens become less likely. A discretionary hyphen can be
put into a word by typing \- where a hyphen may occur. There is also a
settable parameter that discourages hyphens at the end of two
consecutive lines (\doublehyphendemerits=10000).
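
For anyone who wants to see the lookup order above in miniature, here is
a rough Python sketch of a Liang-style pattern matcher. It is not TeX's
code and not what Scribus ships. The names parse_pattern, hyphenate and
TOY_PATTERNS are made up for this example, and the pattern list is a
tiny illustrative set rather than a real hyphenation file. The left_min
and right_min arguments play the role of \lefthyphenmin and
\righthyphenmin.

```python
import re

# Tiny illustrative pattern set (an assumption for this sketch, not a real
# hyphenation file). Digits score the gap they sit in: odd values allow a
# break there, even values forbid it, and the highest value wins.
TOY_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at",
                "1tio", "2io", "o2n"]


def parse_pattern(pattern):
    """Split 'hy3ph' into its letters 'hyph' and gap scores [0, 0, 3, 0, 0]."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    pos = 0
    for ch in pattern:
        if ch.isdigit():
            scores[pos] = int(ch)
        else:
            pos += 1
    return letters, scores


def hyphenate(word, patterns, exceptions, left_min=2, right_min=3):
    """Return the fragments of word; exceptions are consulted first."""
    key = word.lower()
    if key in exceptions:
        # Dictionary hit: use the stored break points, skip the patterns.
        return exceptions[key].split("-")

    table = dict(parse_pattern(p) for p in patterns)
    padded = "." + key + "."          # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)  # scores[k] is the gap before padded[k]

    # Try every substring of the padded word against the pattern table and
    # keep the maximum score seen for each gap.
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            pat = table.get(padded[start:end])
            if pat is not None:
                for i, s in enumerate(pat):
                    scores[start + i] = max(scores[start + i], s)

    # A break before key[j] is allowed if its score is odd and both
    # fragments respect the minimum lengths.
    pieces, last = [], 0
    for j in range(left_min, len(key) - right_min + 1):
        if scores[j + 1] % 2 == 1:
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces


print("-".join(hyphenate("hyphenation", TOY_PATTERNS, {})))  # hy-phen-ation
```

Raising left_min to 3 suppresses the break after "hy" and leaves
hyphen-ation, which is the same effect you get from a larger
\lefthyphenmin.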

If you don't like a very short word or word fragment ending a paragraph,
there is some TeX trickery to prevent that from happening. The accepted
custom is to make the last word or fragment of a paragraph at least as
wide as the indent of the following paragraph. TeX takes it from there.

And so on. All this complexity can be ignored by most users. The
defaults are sensible. My point is that when anyone says "TeX can't
handle this typesetting situation" they are probably wrong. You set the
rules or accept the defaults. TeX follows those rules. I have set
entire books with zero manual kerning. 

Now how does this impact Scribus? I suggest some optional
behind-the-scenes magic. If a paragraph is selected for TeX typesetting,
the text is passed to LuaTeX or XeTeX (TeX variants) along with the font
name and size, the maximum measure (width of the print line), etc.
LuaTeX sets the paragraph and returns it to Scribus, with hidden kerns
for word spacing. Is this easy? No. Is it close to optimum? Yes.

-- 
John Culleton
Free list of books for self-publishers:
http://wexfordpress.net/shortlist.html

"Create Book Covers with Scribus"
http://www.booklocker.com/books/4055.html
