Hi all,

On the off chance that my hack is of some use to another newb, here's
what I ended up doing. It's a template that inserts a zero-width space
(U+200B) after every character of a table cell's contents when a certain
set of conditions is met. The conditions could use a lot of fine-tuning;
I've written them so that they work for my particular situation.

Interestingly, this doesn't seem to interfere with normal hyphenation;
FOP appears to try hyphenation first and only then look for some other
place to break the offending word.

Hmm. Just re-read it and realized that the $field_size variable is
completely unnecessary. This is what you get when English majors start
scripting. Ah well.

  <xsl:template match="entry/para">
    <xsl:choose>
      <!-- none of my corner cases involved para elements that contained
inline elements, so I just dropped these. Not robust, I know, but I'm on
a deadline.-->
      <xsl:when test="child::*">
        <fo:block country="en" hyphenate="true" wrap-option="wrap">
          <xsl:apply-templates/>
        </fo:block>
      </xsl:when>
   <!-- A better test here would be to determine if the para contains
any words containing numbers, and then just break those up. I've got no
idea where I'd start with that.-->
      <xsl:when test="string-length(.) &gt; 40">
        <fo:block country="en" hyphenate="true" wrap-option="wrap" >
          <xsl:apply-templates/>
        </fo:block>
      </xsl:when>
      <xsl:otherwise>
<!--Ok, the para contains no children and is 40 characters or fewer.
We're gonna put a zero-width space after every single character in
the string -->
        <fo:block country="en" hyphenate="true" wrap-option="wrap" >
          <xsl:variable name="field_contents">
            <xsl:value-of select="."/>
          </xsl:variable>
          <xsl:variable name="field_size">
            <xsl:value-of select="string-length(.)"/>
          </xsl:variable>
          <xsl:call-template name="bit_stuffer">
            <xsl:with-param name="field_contents"
                            select="$field_contents"/>
            <xsl:with-param name="field_size" select="$field_size"/>
          </xsl:call-template>
        </fo:block>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
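
On the "better test" wished for in the comment above: XSLT 1.0 has no
regular expressions, but translate() can strip the digits out of a
string, and if the result differs from the original, the string
contained a digit. Here's a rough sketch along those lines. The template
name stuff_numeric_words and its text parameter are made up for
illustration; it calls the bit_stuffer template from this post, and it
assumes that words are separated by plain spaces.

```xml
<!-- Hypothetical helper, not part of the stylesheet above.
     Walks the string one space-separated word at a time and only
     hands words that contain a digit to bit_stuffer. -->
<xsl:template name="stuff_numeric_words">
  <xsl:param name="text"/>
  <xsl:variable name="norm" select="normalize-space($text)"/>
  <!-- first word: appending a space guarantees substring-before finds one -->
  <xsl:variable name="word"
                select="substring-before(concat($norm, ' '), ' ')"/>
  <!-- everything after the first space; empty when this is the last word -->
  <xsl:variable name="rest" select="substring-after($norm, ' ')"/>
  <xsl:choose>
    <!-- stripping the digits changes the word only if it had any -->
    <xsl:when test="translate($word, '0123456789', '') != $word">
      <xsl:call-template name="bit_stuffer">
        <xsl:with-param name="field_contents" select="$word"/>
        <xsl:with-param name="field_size" select="string-length($word)"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <!-- ordinary word: leave it alone so hyphenation can handle it -->
      <xsl:value-of select="$word"/>
    </xsl:otherwise>
  </xsl:choose>
  <xsl:if test="$rest != ''">
    <xsl:text> </xsl:text>
    <xsl:call-template name="stuff_numeric_words">
      <xsl:with-param name="text" select="$rest"/>
    </xsl:call-template>
  </xsl:if>
</xsl:template>
```

It would be called from inside the fo:block with
<xsl:with-param name="text" select="string(.)"/>. Note that it collapses
runs of whitespace via normalize-space(), which may or may not matter
for your tables.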

  <xsl:template name="bit_stuffer">
    <xsl:param name="field_contents"/>
    <xsl:param name="field_size" />
    <xsl:param name="count" select="1" />
    <xsl:if test="($count - 1) &lt; $field_size">
      <xsl:value-of select="substring($field_contents, $count, 1)"/>
      <xsl:text>&#x200b;</xsl:text>
      <xsl:call-template name="bit_stuffer">
        <xsl:with-param name="count" select="$count + 1"/>
        <xsl:with-param name="field_contents" select="$field_contents"/>
        <xsl:with-param name="field_size" select="$field_size"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
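
For what it's worth, here is roughly what the stuffer might look like
without $field_size and $count: peel off the first character, emit it,
and recurse on the remainder of the string. This is only a sketch; the
name zwsp_stuffer and the text parameter are hypothetical, not something
from DocBook XSL or FOP.

```xml
<!-- Hypothetical variant of bit_stuffer with a single parameter. -->
<xsl:template name="zwsp_stuffer">
  <xsl:param name="text"/>
  <xsl:if test="string-length($text) &gt; 0">
    <!-- emit the first character -->
    <xsl:value-of select="substring($text, 1, 1)"/>
    <!-- add a break opportunity only if more characters follow -->
    <xsl:if test="string-length($text) &gt; 1">
      <xsl:text>&#x200b;</xsl:text>
      <xsl:call-template name="zwsp_stuffer">
        <xsl:with-param name="text" select="substring($text, 2)"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:if>
</xsl:template>
```

Unlike the version above, this one does not leave a zero-width space
after the last character. The recursion depth equals the length of the
string, which should be harmless for cell values of 40 characters or
fewer.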

-----Original Message-----
From: Andreas Delmelle [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 18, 2008 12:49 PM
To: [email protected]
Subject: Re: Word breaking in table cells


On Mar 18, 2008, at 18:39, Jeff Hooker wrote:

Hi

> I'm dealing with an issue that appears fairly frequently on the FOP and
> DocBook XSL mailing lists, but my version of it appears to be a bit of a
> hybrid.
>
> I've got OFFO hyphenation up and running and it works just fine in
> easily 90% of my cases, but I've a special problem with table entry
> values that are basically impossible to hyphenate (e.g.
> XREF2TTMQ8_QRS20) and are contained in columns that require them to
> break at least once. They overflow the cell and overwrite the contents
> of the next cell. I'm not sure if the answer lies in hyphenation or
> line breaking.

The answer would lie in implementing wrap-option="wrap" properly in FOP.
As I recall, it was once added to 0.9x, but got broken at some point  
(but I could be wrong).

Hyphenation is not really the answer, since --strictly  
theoretically-- hyphenation is mentioned only in the context of  
"words". Hyphenating a date, a number or some product code for  
instance, may be possible, but it doesn't make much sense IMO. Using  
hyphenation would be more a workaround/hack than a decent solution.

Currently, FOP's hyphenator has problems with anything non-alphabetic  
(unless you use a customized pattern file). Once the hyphenator  
encounters anything that is not a letter, it simply gives up, and the  
portion of text will be rendered as-is (no breaks).

>
> I've got a couple ideas for how to approach this, and would appreciate
> the input of anyone who had experience with them.
>
> 1. I've noticed that values will break if they contain a slash (/).
> If I could extend that behavior to include underscores, colons, and
> brackets, that would address the vast majority of my corner cases.

That's due to the implementation of Unicode UAX #14 line breaking,
where the slash and backslash are special characters that offer
line-break opportunities.

I seem to remember vaguely that Manuel mentioned that someone could  
alter the tables and recompile FOP to customize the behavior. OTOH, I  
don't think this would be a very robust way of going about it. Better  
to rely on the Unicode standard, and see if the exceptions you need  
can be covered in another fashion.

> 2. I've noticed that the values that refuse to break invariably contain
> numbers; I suspect that this is what's throwing the hyphenation
> algorithm for a loop, but I can't find any guidance on adding numbers to
> the classes in the hyphenation configuration file and just telling the
> system to break the word wherever the heck it wants to.

See above for the explanation.
No idea if it is even possible to add numbers to the hyphenation  
pattern file, since numbers are also used in the pattern file to  
specify the preference/desirability of a hyphenation point...

> 3. Many have suggested that one should use zero-width spaces. I can't
> ask my writers to do this on an as-needed basis, since the data is
> published in many different scales and formats, but I'd be fine with
> adding a zero-width space between every single character of a table
> entry block and letting the lines break where they may. Has anyone
> written the XSL for this already, or am I going to be the first?
>

There should be solutions available. I very much doubt you're the  
first to try this...

Cheers

Andreas

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

