> I only read the ginormous XML once... I apply the 7 filters to each
> node read and it gets allocated to one of the 7 output buckets (how's
> that for a semantically neutral term).
> 

This is known within the XSL WG as the "coloured widgets" problem after a 
streaming use case put forward by Oliver Becker. (The problem is, given an 
input document containing widgets of different colours, produce N output 
documents, one for each colour present in the file. There are two variants of 
the problem: one where the set of colours is known statically, and one where 
it is dynamic.) The XSLT 3.0 streaming solution for the static case is:

<xsl:stream href="widgets.xml">
  <xsl:fork>
    <xsl:sequence>
      <xsl:result-document href="red.xml">
        <xsl:sequence select="*/widget[@colour='red']"/>
      </xsl:result-document>
    </xsl:sequence>
    <xsl:sequence>
      <xsl:result-document href="blue.xml">
        <xsl:sequence select="*/widget[@colour='blue']"/>
      </xsl:result-document>
    </xsl:sequence>
    <xsl:sequence>
      <xsl:result-document href="green.xml">
        <xsl:sequence select="*/widget[@colour='green']"/>
      </xsl:result-document>
    </xsl:sequence>
  </xsl:fork>
</xsl:stream>

A streaming processor is required to evaluate this in a single pass of the 
input document; the three "prongs" of the xsl:fork are effectively executed in 
parallel.

I mention this purely for academic interest, since there is no implementation 
available, unless you count the one I wrote last week.

I don't think XSLT 3.0 currently has an equivalent solution for the dynamic 
case, where the colours are not known in advance. The normal solution would use 
"group-by" but this is not streamable.
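
For comparison, a minimal sketch of that non-streaming "group-by" approach in 
XSLT 2.0 might look like the following. It assumes the same input shape as 
above (widget elements as children of the document element, each with a colour 
attribute). The wrapper element name "widgets" and the scheme of naming each 
output file after its colour are illustrative assumptions, not part of the 
original use case.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Non-streaming: the whole input tree is held in memory. -->
  <xsl:template match="/">
    <!-- One group per distinct colour value present in the input. -->
    <xsl:for-each-group select="*/widget" group-by="@colour">
      <!-- Assumption: each colour value is usable as a file name. -->
      <xsl:result-document href="{current-grouping-key()}.xml">
        <!-- Assumption: a wrapper element, so that each output
             document has a single root. -->
        <widgets>
          <xsl:copy-of select="current-group()"/>
        </widgets>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Note that a widget with no colour attribute has no grouping key and so is not 
written to any output file in this sketch.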

Michael Kay
Saxonica


