Edward Rayl wrote:
> Actually JPluck can do this [a single CR to space conversion].  Just write
> the appropriate XSLT script.

If someone has an XSLT script of this general sort, I'd be very grateful if
they would post it here.  A few more examples (in addition to those in the
showcase) would help me a lot.

To get things rolling, here's what I came up with to get rid of the cruft at
the Seattle Times site:

   <transform pattern=".*">
           <xsl>
           <stylesheet><![CDATA[
           <?xml version="1.0" encoding="UTF-8"?>
           <xsl:stylesheet version="1.0"
               xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
           <xsl:import href="jpluck.xsl" />
           <xsl:template match="/">
               <html>
               <head>
               </head>
               <body>
                  <xsl:apply-templates select="//font[@size=2]|//img"/>
               </body>
               </html>
            </xsl:template>
            <xsl:template match="//img">
                  <xsl:if test="contains(@src,'ABPub')">
                     <p>
                     <xsl:copy-of select="."/>
                     </p>
                  </xsl:if>
                  <xsl:if test="contains(@src,'localnews')">
                     <p>
                     <xsl:copy-of select="."/>
                     </p>
                  </xsl:if>
                  <xsl:if test="contains(@src,'Photograph_link')">
                     <p>
                     <xsl:copy-of select="."/>
                     </p>
                  </xsl:if>
            </xsl:template>
            <xsl:template match="font">
               <xsl:if test="not(ancestor::font)">
                  <xsl:if test="@size=2">
                     <p>
                     <xsl:copy-of select="."/>
                     </p>
                  </xsl:if>
               </xsl:if>
            </xsl:template>
            </xsl:stylesheet>
            ]]></stylesheet>
            </xsl>
        </transform>

It turns out that the Seattle Times puts the whole text of each article
(including the heading and such) inside a <font size="2"> ... </font> block.
That block is selected here by "//font[@size=2]", and checked again in the
font template using test="@size=2".  The main trick is that you don't want
to select any nested font tags.  That's avoided with the
"not(ancestor::font)" test, which excludes any font tags which are contained
within other font tags.
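
For what it's worth, both font tests could probably be moved into the select
expression itself, so that the font template needs no xsl:if at all.  This is
only a sketch that I have not run through JPluck.  It assumes the select in
the root template is meant to pick font elements with size 2, and that the
rest of the stylesheet stays as it is:

    <!-- in the root template: pick only the outermost size-2 font blocks -->
    <xsl:apply-templates
        select="//font[@size=2][not(ancestor::font)]|//img"/>

    <!-- the font template then just wraps whatever it is handed -->
    <xsl:template match="font">
       <p>
       <xsl:copy-of select="."/>
       </p>
    </xsl:template>

The filter has to go in the select rather than only in the match pattern.
Otherwise the nested font tags would still be selected, would no longer match
this template, and would fall through to whatever jpluck.xsl or the built-in
rules do with them.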

The stuff with images matches any img tags, then includes only those which I
specifically want -- an icon (Photograph_link), the pretty top of page
picture (ABPub), and anything in the local section (localnews).
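
The three image tests could likely be folded into one xsl:if with "or".
Again, this is an untested sketch rather than what I actually ran:

    <xsl:template match="img">
       <!-- keep the image if its src contains any of the three markers -->
       <xsl:if test="contains(@src,'ABPub')
                     or contains(@src,'localnews')
                     or contains(@src,'Photograph_link')">
          <p>
          <xsl:copy-of select="."/>
          </p>
       </xsl:if>
    </xsl:template>

One difference from the version above: an image whose src contains more than
one of the markers is copied once here, instead of once per matching test.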

Here's another request for anyone experienced with XML -- Is there a
(preferably free, on Windows or Linux) utility out there which can tell you
the ancestors of a particular selection in an XSL file?

9:)     Lindsey Dubb     [EMAIL PROTECTED]

_______________________________________________
plucker-list mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-list
