arved       02/03/17 09:21:53

  Added:       docs/design/understanding area_tree.xml book.xml fo_tree.xml
                        handling_attributes.xml images.xml
                        layout_managers.xml layout_process.xml
                        pdf_library.xml properties.xml renderers.xml
                        status.xml svg.xml understanding.xml
                        xml_parsing.xml
  Log:
  Extra design commentary
  
  Revision  Changes    Path
  1.1                  xml-fop/docs/design/understanding/area_tree.xml
  
  Index: area_tree.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Area Tree</title> 
         <subtitle>All you wanted to know about the Area Tree !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Area Tree"> 
                <p>Yet to come :))</p> 
                <note>The series of notes for developers has started but it has not 
yet gone so far ! Keep watching</note></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/book.xml
  
  Index: book.xml
  ===================================================================
  <?xml version="1.0"?>
  
  <book title="FOP Design" copyright="1999-2002 The Apache Software Foundation">
    <external href="http://xml.apache.org/fop/"  label="About FOP"/>
    <separator/>
    <external href="../index.html"      label="NEW DESIGN" />
    <page id="index"          label="Understanding"      source="understanding.xml"/>
    <separator/>
    <page id="xml_parsing"          label="XML Parsing"      
source="xml_parsing.xml"/>
    <page id="fo_tree"          label="FO Tree"      source="fo_tree.xml"/>
    <page id="properties"          label="Properties"      source="properties.xml"/>
    <page id="layout_managers"          label="Layout Managers"      source="layout_managers.xml"/>
    <page id="layout_process"          label="Layout Process"      
source="layout_process.xml"/>
    <page id="handling_attributes"          label="Handling Attributes"      
source="handling_attributes.xml"/>
    <page id="area_tree"          label="Area Tree"      source="area_tree.xml"/>
    <page id="renderers"          label="Renderers"      source="renderers.xml"/>
    <separator/>
    <page id="images"          label="Images"      source="images.xml"/>
    <page id="pdf_library"          label="PDF Library"      
source="pdf_library.xml"/>
    <page id="svg"          label="SVG"      source="svg.xml"/>
    <separator/>
    <page id="status"          label="Status"      source="status.xml"/>
  </book>
  
  
  1.1                  xml-fop/docs/design/understanding/fo_tree.xml
  
  Index: fo_tree.xml
  ===================================================================
  <?xml version="1.0"?>
  <document>
    <header> 
     <title>FO Tree</title> 
     <subtitle>All you wanted to know about FO Tree !</subtitle> 
     <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
     </authors> 
    </header> 
  <body><s1 title="FO Tree">
        <p>
          The FO Tree is a representation of the XSL:FO document. This
          represents the <strong>Objectify</strong> step from the
          spec. The <strong>Refinement</strong> step is part of reading
          and using the properties which may happen immediately or
          during the layout process.
        </p>
  
  
  
  <p>Each XML element is represented by a Java object. For pagination the
  classes are in <code>org.apache.fop.fo.pagination.*</code>, for elements in the flow
  they are in <code>org.apache.fop.fo.flow.*</code> and some others are in
  <code>org.apache.fop.fo.*</code>.</p>
  
  
  
  <p>The base class for all objects in the tree is FONode. The base class for
  all FO Objects is FObj.</p>
  
  
  
  <p>(insert diagram here)</p>
  
  
  
  <p>There is a class for each element in the FO set. An object is created for
  each element in the FO Tree. This object holds the properties for the FO
  Object.</p>
  
  
  
        <p>
          When the object is created it is set up. It is given its
          element name, the FOUserAgent (for resolving properties
          etc.), the logger and the attributes. The methods
          <code>handleAttributes()</code> and
          <code>setUserAgent()</code>, common to <code>FONode</code>,
          are used in this process. The object will then be given any
          text data or child elements.  Then the <code>end()</code>
          method is called.  A number of elements use the
          <code>end()</code> method as the signal that all their
          children have been added, so that they can do processing
          that depends on the complete set of children.
        </p>
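  <p>The life cycle described above can be sketched roughly as follows. This is
  only an illustration: <code>SketchNode</code> and its methods are hypothetical
  stand-ins and do not reproduce the real <code>FONode</code> API.</p>
  <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a node in the FO Tree; not FOP's real FONode.
class SketchNode {
    final String name;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final List<SketchNode> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder();
    boolean ended = false;

    SketchNode(String name) {
        this.name = name;
    }

    // Step 1: the node receives the attributes found on its XML element.
    void handleAttributes(Map<String, String> attrs) {
        attributes.putAll(attrs);
    }

    // Step 2: text data and child elements arrive in document order.
    void addCharacters(String data) {
        requireOpen();
        text.append(data);
    }

    void addChild(SketchNode child) {
        requireOpen();
        children.add(child);
    }

    // Step 3: all children are present, so child-dependent work can start.
    void end() {
        ended = true;
    }

    private void requireOpen() {
        if (ended) {
            throw new IllegalStateException(name + " has already ended");
        }
    }
}
```
  ]]></source>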
  
  
  
  <p>Some validity checking is done during these steps. The user can be warned of the
  error and processing can continue if possible.
  </p>
  
  
        <p>
          The FO Tree is simply a hierarchy of Java objects that
          represent the FO elements from the XML. The layout or
          structure process traverses the tree, and only within the
          flow elements.
        </p>
  
  
  
  <s2 title="Properties">
  
  
  
  <p>The XML attributes on each element are passed to the object. The objects
  that represent FO objects then convert the attributes into properties.
  </p>
  
  
  <p>Since properties can be inherited the PropertyList class handles resolving
  properties for a particular element.
  All properties are specified in an XML file. Classes are created
  automatically during the build process.
  </p>
  
  
  <p>(insert diagram here)</p>
  
  
  
  <p>In some cases the element may be moved to have a different parent, for
  example markers, or the inheritance could be different, for example
  initial property set.</p></s2>
  
  
  
  
  <s2 title="Foreign XML">
  
  
  <p>The base class for foreign XML is XMLObj. This class handles creating a
  DOM Element and the setting of attributes. For a top-level element (class
  XMLElement) it can also create a DOM Document.
  This class must be extended for the namespace of the XML elements. For
  unknown namespaces the class is UnknownXMLObj.</p>
  
  
  
  <p>(insert diagram here)</p>
  
  
  
  <p>If some special processing is needed then the top level element can extend
  the XMLObj. For example the SVGElement makes the special DOM required for
  batik and gets the size of the svg.
  </p>
  
  
  <p>Foreign XML will usually be in an fo:instream-foreign-object. The XML will
  be passed to the renderer as a DOM, which the renderer will be able to handle.
  Other XML from an unknown namespace will be ignored.
  </p>
  
  
  <p>By using element mappings it is possible to read other XML and either</p>
  <ul><li>set information on the area tree</li>
  <li>create pseudo FO Objects that create areas in the area tree</li>
  <li>create FO Objects</li></ul>
  </s2>
  
  
  
  <s2 title="Unknown Elements">
  <p>If an element is in a known namespace but the element is unknown then an
  Unknown object is created. This is mainly to provide information to the
  user.
  This could happen if the fo document contains an element from a different
  version or the element is misspelt.</p>
  </s2>
  
  
  <s2 title="Page Masters">
          <p>
            The first elements in a document are the elements for the
            page master setup. This is usually only a small number and
            will be used throughout the document to create new pages.
            These elements are kept as a factory to create the page and
            appropriate regions whenever a new page is requested by the
            layout. The objects in the FO Tree that represent these
            elements are themselves the factory. The root element keeps
            these objects as a factory for the page sequences.
          </p>
  </s2>
  
  
  <s2 title="Flow">
  <p>The elements that are in the flow of the document are a set of elements
  that is needed for the layout process. Each element is important in the
  creation of areas.</p>
  </s2>
  
  
  
  <s2 title="Other Elements">
  
  
  
          <p>
            The remaining FO Objects are things like page-sequence,
            title and color-profile. These are handled by their parent
            element; i.e. the root looks after the declarations and the
            declarations maintains a list of colour profiles.  The
            page-sequences are direct descendants of root.
          </p>
        </s2>
  
  
  
  <s2 title="Associated Tasks">
  
  
  
  <ul><li>Create diagrams</li>
  <li>Set up all properties and elements for XSL:FO</li>
  <li>Set up user agent for property resolution</li>
  <li>Verify all XML is handled appropriately</li></ul></s2></s1></body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/handling_attributes.xml
  
  Index: handling_attributes.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Handling Attributes</title> 
         <subtitle>All you wanted to know about FOP Handling Attributes !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Handling Attributes"> 
                <p>Yet to come :))</p> 
                <note>The series of notes for developers has started but it has not 
yet gone so far ! Keep watching</note></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/images.xml
  
  Index: images.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Images</title> 
         <subtitle>All you wanted to know about Images in FOP !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body>
   
  
    <s1 title="Images in FOP"> <note>This is still in progress; input on the code is
    welcome. The supported formats still need documenting and testing,
    so all those people interested in images should get involved.</note>
                <p>An image may only need to be loaded when it is rendered to the
  output or when its dimensions are needed.<br/>
  An image URL may be invalid. This can be costly to find out, so we need to
  keep a list of invalid image URLs.</p> 
  <p>We have a number of different caching schemes that are possible.</p>
  <p>All images are referred to using the URL given in the XSL:FO after
  removing the "url('')" wrapping. This does
  not include any sort of resolving, such as converting a relative URL to an
  absolute one. The external graphic in the FO Tree and the image area in the
  Area Tree only have the URL as a reference.
  The images are handled through a static interface in ImageFactory.<br/></p>
  
  
  <p>(insert image)</p>         
  
  
  <s2 title="Threading">
  
  
  
  <p>In a single threaded case with one document the image should be released
  as soon as the renderer caches it. If there are multiple documents then
  the images could be held in a weak cache in case another document needs to
  load the same image.</p>
  
  
  <p>In a multi threaded case many threads could be attempting to get the same
  image. We need to make sure an image will only be loaded once at a
  particular time. Once a particular document is finished then we can move
  all the images to a common weak cache.</p>
  </s2>
  
  <s2 title="Caches">
  <s3 title="LRU">
  <p>All images are in a common cache regardless of context. To limit the size
  of the cache the LRU image is removed to keep the amount of memory used
  low. Each image can supply the amount of data held in memory.</p>
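  <p>A minimal sketch of such a size-bounded LRU cache, assuming each image
  reports its size in bytes. The class <code>LruImageCache</code> is hypothetical
  and only tracks sizes per URL; it is not FOP's cache implementation.</p>
  <source><![CDATA[
```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache keyed by image URL; sizes are supplied by the caller.
class LruImageCache {
    private final long maxBytes;
    private long usedBytes = 0;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, Long> sizes =
            new LinkedHashMap<>(16, 0.75f, true);

    LruImageCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    // Adds (or replaces) an image, then evicts old entries if over the limit.
    void put(String url, long bytes) {
        Long old = sizes.put(url, bytes);
        usedBytes += bytes - (old == null ? 0 : old);
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            usedBytes -= eldest.getValue();
            it.remove();
        }
    }

    // Marks the image as recently used; returns false if it is not cached.
    boolean touch(String url) {
        return sizes.get(url) != null;
    }

    // Does not change the usage order.
    boolean contains(String url) {
        return sizes.containsKey(url);
    }

    long usedBytes() {
        return usedBytes;
    }
}
```
  ]]></source>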
  </s3>
  
  <s3 title="Context">
  <p>Images are cached according to the context, using the FOUserAgent as a key.
  Once the context is finished the images are added to a common weak hashmap
  so that other contexts can load these images or the data will be garbage
  collected if required.</p>
  <p>If images are to be used commonly then we cannot dispose of data in the
  FopImage when cached by the renderer. Also if different contexts have
  different base directories for resolving relative url's then the loading
  and caching must be separate. We can have a cache that shares images among
  all contexts or only loads an image for a context.</p>
  </s3>
  
  <p>The cache uses an image loader so that it can synchronize the image
  loading on an image by image basis. Finding and adding an image loader to
  the cache is also synchronized to prevent thread problems.</p>
  </s2>
  
  <s2 title="Invalid Images">
  
  
  <p>
  If an image cannot be loaded for some reason, for example because the URL is
  invalid, the image data is corrupt or the type is unknown, then only one
  attempt should be made to load it. All other attempts to get the image
  should return null so that the failure can be easily handled.<br/>
  This will prevent any extra processing or waiting.</p>
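  <p>One way to combine the per-image loader with the load-once rule is sketched
  below. The names <code>OnceOnlyImageCache</code> and <code>Loader</code>, and
  the use of a <code>Function</code> to fetch the bytes, are assumptions made for
  the illustration rather than FOP classes.</p>
  <source><![CDATA[
```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical cache: one loader per URL, and each loader runs at most once.
class OnceOnlyImageCache {

    // Synchronizing on the loader serializes loads of the same URL only.
    private static final class Loader {
        private boolean attempted = false;
        private byte[] data = null; // stays null if the load failed

        synchronized byte[] get(String url, Function<String, byte[]> fetch) {
            if (!attempted) {
                attempted = true;
                try {
                    data = fetch.apply(url);
                } catch (RuntimeException e) {
                    data = null; // remember the failure, never retry
                }
            }
            return data;
        }
    }

    private final ConcurrentMap<String, Loader> loaders = new ConcurrentHashMap<>();
    private final Function<String, byte[]> fetch;

    OnceOnlyImageCache(Function<String, byte[]> fetch) {
        this.fetch = fetch;
    }

    // Returns the image bytes, or null if this URL has already failed to load.
    byte[] getImage(String url) {
        // computeIfAbsent is atomic, so two threads cannot register two loaders.
        return loaders.computeIfAbsent(url, u -> new Loader()).get(url, fetch);
    }
}
```
  ]]></source>
  <p>Because the failure is stored in the loader, a second request for a bad
  URL returns null immediately instead of repeating the costly attempt.</p>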
  </s2>
  
  
  <s2 title="Reading">
  <p>Once a stream is opened for the image url then a set of image readers is
  used to determine what type of image it is. The reader can peek at the
  image header or if necessary load the image. The reader can also get the
  image size at this stage.
  The reader then can provide the mime type to create the image object to
  load the rest of the information.<br/></p></s2>
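  <p>As an illustration of a reader peeking at the image header, the sketch below
  checks the well-known signature bytes of GIF, PNG and JPEG and then rewinds the
  stream. <code>ImageSniffer</code> is a hypothetical helper, not one of FOP's
  image readers, and it assumes the stream supports mark/reset.</p>
  <source><![CDATA[
```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical header sniffer: peek at the first bytes, then rewind.
class ImageSniffer {

    // Returns a mime type, or null if the header is not recognised.
    static String sniffMimeType(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] head = new byte[8];
        int n = 0;
        try {
            in.mark(head.length);
            while (n < head.length) {
                int r = in.read(head, n, head.length - n);
                if (r < 0) {
                    break;
                }
                n += r;
            }
            in.reset(); // leave the stream untouched for the real image loader
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (startsWith(head, n, 0x47, 0x49, 0x46, 0x38)) {
            return "image/gif";  // ASCII "GIF8"
        }
        if (startsWith(head, n, 0x89, 0x50, 0x4E, 0x47)) {
            return "image/png";  // 0x89 followed by ASCII "PNG"
        }
        if (startsWith(head, n, 0xFF, 0xD8, 0xFF)) {
            return "image/jpeg"; // JPEG start-of-image marker
        }
        return null;
    }

    private static boolean startsWith(byte[] data, int len, int... magic) {
        if (len < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```
  ]]></source>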
  
  
  
  <s2 title="Data">
  
  
  
  <p>The data usually needed for an image is the size and either a bitmap or the
  original data. Images such as JPEG and EPS can be embedded into the
  document with the original data. SVG images are converted into a DOM which
  needs to be rendered to the PDF. Other images such as GIF, TIFF etc. are
  converted into a bitmap.
  Data is loaded by the FopImage by calling load(type), where type is the type of data
  to load.<br/></p></s2>
  
  
  <s2 title="Rendering">
  
  <p>Different renderers need to have the information in different forms.</p>
  
  
  <s3 title="PDF">
  <dl><dt>original data</dt>  <dd>JPG, EPS</dd>
  <dt>bitmap</dt>  <dd>gif, tiff, bmp, png</dd>
  <dt>other</dt>  <dd>SVG</dd></dl>
  </s3>
  
  <s3 title="PS">
  <dl><dt>bitmap</dt>  <dd>JPG, gif, tiff, bmp, png</dd>
  <dt>other</dt> <dd>SVG</dd></dl>
  </s3>
  
  <s3 title="awt">
  <dl><dt>bitmap</dt> <dd>JPG, gif, tiff, bmp, png</dd>
  <dt>other</dt>  <dd>SVG</dd></dl></s3>
  
  
  
  <p>The renderer uses the url to retrieve the image from the ImageFactory and
  then load the required data depending on the image mime type. If the
  renderer can insert the image into the document and use that data for all
  future references of the same image then it can cache the reference in the
  renderer and the image can be released from the image cache.</p></s2>
  </s1> 
    </body></document>
   
  
  1.1                  xml-fop/docs/design/understanding/layout_managers.xml
  
  Index: layout_managers.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Layout Managers</title> 
         <subtitle>All you wanted to know about Layout Managers !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Layout Managers"> 
                <p>Yet to come :))</p> 
                <note>The series of notes for developers has started but it has not 
yet gone so far ! Keep watching</note></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/layout_process.xml
  
  Index: layout_process.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Layout Process</title> 
         <subtitle>All you wanted to know about the Layout Process !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Layout Process"> 
                <p>Yet to come :))</p> 
                <note>The series of notes for developers has started but it has not 
yet gone so far ! Keep watching</note></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/pdf_library.xml
  
  Index: pdf_library.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>PDF Library</title> 
         <subtitle>All you wanted to know about the PDF Library !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="PDF Library"> 
  
  <p>The PDF Library is an independent package of classes in FOP. These classes
  provide a simple way to construct documents and add the contents. The
  classes are found in <code>org.apache.fop.pdf.*</code>.</p>
  
  
  
  
  <s2 title="PDF Document">
  <p>This is where most of the document is created and put together.</p>
  <p>It sets up the header, trailer and resources. Each page is made and added to the
  document.
  There are a number of methods that can be used to create/add certain PDF objects to
  the document.</p>
  </s2>
  
  <s2 title="Building PDF">
  <p>The PDF Document is built by creating a page for each page in the Area Tree.</p>
  <p>This page then has all the contents added.
  The page is then added to the document and available objects can be written to the
  output stream.</p>
  <p>The contents of the page are things such as text, lines, images etc.
  The PDFRenderer inserts the text directly into a PDF stream.
  The text consists of markup to set fonts, set the text position and add text.</p>
  <p>Most of the simple PDF markup is inserted directly into a PDF stream.
  Other more complex objects or commonly used objects are added through Java classes.</p>
  <p>Some PDF objects, such as an image, consist of two parts:
  a separate object for the image data and another bit of markup to display
  the image in a certain position on the page.
  </p><p>The Java objects that represent a PDF object implement a method that returns
  the markup for inserting into a stream.
  The method is: <code>byte[] toPDF()</code>.</p>
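  <p>To make the <code>toPDF()</code> idea concrete, here is a small sketch of an
  object that serialises itself as a PDF indirect object holding an array of
  numbers. The class <code>SketchPdfNumberArray</code> is hypothetical and is not
  part of <code>org.apache.fop.pdf</code>.</p>
  <source><![CDATA[
```java
import java.nio.charset.StandardCharsets;

// Hypothetical PDF object; the class name and fields are illustrative only.
class SketchPdfNumberArray {
    private final int objectNumber;
    private final int[] values;

    SketchPdfNumberArray(int objectNumber, int... values) {
        this.objectNumber = objectNumber;
        this.values = values.clone();
    }

    // Returns the markup for this object, ready to be written to the output.
    byte[] toPDF() {
        StringBuilder sb = new StringBuilder();
        // "<number> <generation> obj" opens an indirect object.
        sb.append(objectNumber).append(" 0 obj\n[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(values[i]);
        }
        sb.append("]\nendobj\n");
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }
}
```
  ]]></source>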
  
  </s2>
  <s2 title="Features">
  
  
  
  <s3 title="Fonts">
  <p>Support for embedding fonts and using the default Acrobat fonts.
  </p></s3>
  
  <s3 title="Images">
  <p>Images can be inserted into a page. The image can either be inserted as a pixel
  map or, for a JPEG image, inserted directly.
  </p></s3>
  
  <s3 title="Stream Filters">
  <p>A number of filters are available to encode the PDF streams. These filters can
  compress the data or change its encoding, for example by converting it to hex.
  </p></s3>
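  <p>As an example of a filter that changes the encoding, the sketch below hex-encodes
  stream data in the form expected by PDF's ASCIIHexDecode filter, where a
  <code>&gt;</code> character marks the end of the data. <code>AsciiHexSketch</code> is a
  hypothetical class written for this illustration.</p>
  <source><![CDATA[
```java
import java.nio.charset.StandardCharsets;

// Hypothetical encoder producing data readable by PDF's ASCIIHexDecode filter.
class AsciiHexSketch {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static byte[] encode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2 + 1);
        for (byte b : data) {
            // Two hex digits per byte: high nibble first, then low nibble.
            sb.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        sb.append('>'); // end-of-data marker
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }
}
```
  ]]></source>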
  
  <s3 title="Links">
  <p>A PDF link can be added for an area on the page. This link can then point to an
  external destination or a position on any page in the document.
  </p></s3>
  
  <s3 title="Patterns">
  <p>The fill and stroke of graphical objects can be set with a colour, pattern or
  gradient.
  </p></s3>
  
  
  <p>There are a number of other features for handling PDF markup relevant to creating
  PDF files for FOP.</p>
  </s2>
  
  
  <s2 title="Associated Tasks">
  <p>There are a large number of additional features that can be added to pdf.</p>
  <p>Many of these can be handled with extensions or post processing.</p>
  
  </s2>
  
  
  
    </s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/properties.xml
  
  Index: properties.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Properties</title> 
         <subtitle>All you wanted to know about the Properties !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Property Handling"> 
  <p>During XML Parsing, the FO tree is constructed. For each FO object (some
  subclass of FObj), the tree builder then passes the list of all
  attributes specified on the FO element to the handleAttrs method. This
  method converts the attribute specifications into a PropertyList.</p>
  <p>The actual work is done by a PropertyListBuilder (PLB for short). The
  basic idea of the PLB is to handle each attribute in the list in turn,
  find an appropriate "Maker" for it, call the Maker to convert the
  attribute value into a Property object of the correct type, and store
  that Property in the PropertyList.</p>
  
  
  <s2 title="Finding a Maker">
  <p>
  The PLB finds a "Maker" for the property based on the attribute name and
  the element name. Most Makers are generic and handle the attribute on
  any element, but it's possible to set up an element-specific property
  Maker. The attribute name to Maker mappings are automatically created
  during the code generation phase by processing the XML property
  description files.</p>
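  <p>The lookup order can be sketched as follows, assuming that an element-specific
  Maker takes precedence over a generic one. <code>MakerRegistry</code> is
  hypothetical, and Makers are represented by plain names to keep the example
  small; the generated FOP tables are not reproduced here.</p>
  <source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry mapping attribute names (and elements) to Maker names.
class MakerRegistry {
    private final Map<String, String> generic = new HashMap<>();
    private final Map<String, Map<String, String>> perElement = new HashMap<>();

    void registerGeneric(String attribute, String maker) {
        generic.put(attribute, maker);
    }

    void registerForElement(String element, String attribute, String maker) {
        perElement.computeIfAbsent(element, e -> new HashMap<>()).put(attribute, maker);
    }

    // Element-specific Makers win; otherwise use the generic one (or null).
    String findMaker(String element, String attribute) {
        Map<String, String> specific = perElement.get(element);
        if (specific != null && specific.containsKey(attribute)) {
            return specific.get(attribute);
        }
        return generic.get(attribute);
    }
}
```
  ]]></source>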
  </s2>
  
  <s2 title="Processing the attribute list">
  <p>The PLB first looks to see if the font-size property is specified, since
  it sets up relative units which can be used in other property
  specifications. Each attribute is then handled in turn. If the attribute
  specifies part of a compound property such as space-before.optimum, the
  PLB looks to see if the attribute list also contains the "base" property
  (space-before in this case) and processes that first.</p></s2>
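  <p>The ordering rules above (font-size first, and the base of a compound property
  before its components) can be sketched like this. <code>AttributeOrderSketch</code>
  is a hypothetical helper that works on attribute names only; it does not build any
  Property objects.</p>
  <source><![CDATA[
```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper that decides the order in which attributes are handled.
class AttributeOrderSketch {

    static List<String> processingOrder(List<String> attributeNames) {
        // A LinkedHashSet keeps insertion order and ignores repeated additions.
        Set<String> ordered = new LinkedHashSet<>();
        if (attributeNames.contains("font-size")) {
            ordered.add("font-size"); // relative units depend on it
        }
        for (String name : attributeNames) {
            int dot = name.indexOf('.');
            if (dot > 0) {
                // Component of a compound property, e.g. space-before.optimum.
                String base = name.substring(0, dot);
                if (attributeNames.contains(base)) {
                    ordered.add(base); // handle the base property first
                }
            }
            ordered.add(name);
        }
        return new ArrayList<>(ordered);
    }
}
```
  ]]></source>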
  <s2 title="How the Property Maker works">
  <p>There is a family of Maker objects for each of the property datatypes,
  such as Length, Number, Enumerated, Space, etc. But since each Property
  has specific aspects such as whether it's inherited, its default value,
  its corresponding properties, etc. there is usually a specific Maker for
  each Property. All these Maker classes are created during the code
  generation phase by processing (using XSLT) the XML property description
  files to create Java classes.</p>
  
  
  <p>The Maker first checks for "keyword" values for a property. These are
  things like "thin, medium, thick" for the border-width property. The
  datatype is really a Length but it can be specified using these keywords
  whose actual value is determined by the "User Agent" rather than being
  specified in the XSL standard. For FOP, these values are currently
  defined in foproperties.xml. The keyword value is just a string, so it
  still needs to be parsed as described next.</p>
  
  
  <p>The Maker also checks to see if the property is an Enumerated type and
  then checks whether the value matches one of the specified enumeration
  values.</p>
  
  
  <p>Otherwise the Maker uses the property parser in the fo.expr package to
  evaluate the attribute value and return a Property object. The parser
  interprets the expression language and performs numeric operations and
  function call evaluations.</p>
  
  
  <p>If the returned Property value is of the correct type (specified in
  foproperties.xml, where else?), the Maker returns it. Otherwise, it may
  be able to convert the returned type into the correct type.</p>
  
  
  <p>Some kinds of property values can't be fully resolved during FO tree
  building because they depend on layout information. This is the case for
  length values specified as percentages and for the special
  proportional-column-width(x) specification for table-column widths.
  These are stored as special kinds of Length objects which are evaluated
  during layout. Expressions involving "em" units, which are relative to
  font-size, <strong>are</strong> resolved during FO tree building, however.</p></s2>
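  <p>The difference between immediate and deferred resolution can be illustrated
  with a small sketch. Here lengths are held in millipoints, "em" values are resolved
  at parse time against a known font-size, and percentages are kept as a factor until
  a base length is supplied during layout. <code>LengthSketch</code> and its nested
  <code>Length</code> interface are hypothetical and much simpler than FOP's Length
  classes.</p>
  <source><![CDATA[
```java
// Hypothetical model of fixed versus layout-dependent lengths.
class LengthSketch {

    // A length that yields millipoints once the layout base is known.
    interface Length {
        int resolve(int baseMpt);
    }

    static Length parse(String value, int fontSizeMpt) {
        if (value.endsWith("%")) {
            // Deferred: only the factor is known until layout provides a base.
            final double factor =
                    Double.parseDouble(value.substring(0, value.length() - 1)) / 100.0;
            return base -> (int) Math.round(factor * base);
        }
        final double number = Double.parseDouble(value.substring(0, value.length() - 2));
        final int fixed;
        if (value.endsWith("em")) {
            // Resolved now: font-size is already known during tree building.
            fixed = (int) Math.round(number * fontSizeMpt);
        } else if (value.endsWith("pt")) {
            fixed = (int) Math.round(number * 1000);
        } else {
            throw new IllegalArgumentException("unsupported length: " + value);
        }
        return base -> fixed; // the layout base is ignored
    }
}
```
  ]]></source>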
  
  
  <s2 title="Structure of the PropertyList">
  <p>The PropertyList extends HashMap and its basic function is to associate
  Property value objects with Property names. The Property objects are all
  subclasses of the base Property class. Each one simply contains a
  reference to one of the property datatype objects. Property provides
  accessors for all known datatypes and various subclasses override the
  accessor(s) which are reasonable for the datatype they store.</p>
  
  
  <p>The PropertyList itself provides various ways of looking up Property
  values to handle such issues as inheritance and corresponding
  properties. </p>
  
  
  <p>The main logic is:<br/>If the property is a writing-mode relative property (using
  start, end, before or after in its name), the corresponding absolute property value
  is returned if it's explicitly set on this FO. <br/>Otherwise, the
  writing-mode relative value is returned if it's explicitly set. If the
  property is inherited, the process repeats using the PropertyList of the
  FO's parent object. (This is easy because each PropertyList points to
  the PropertyList of the nearest ancestor FO.) If the property isn't
  inherited or no value is found at any level, the initial value is
  returned.</p></s2>
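  <p>The lookup logic described above can be sketched as below. The class
  <code>SketchPropertyList</code> is hypothetical: values are plain strings, and the
  caller passes in the corresponding absolute property name, the inheritance flag and
  the initial value, which in FOP would come from the property's Maker.</p>
  <source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical property list with a pointer to the nearest ancestor's list.
class SketchPropertyList {
    private final Map<String, String> explicit = new HashMap<>();
    private final SketchPropertyList parent;

    SketchPropertyList(SketchPropertyList parent) {
        this.parent = parent;
    }

    void set(String name, String value) {
        explicit.put(name, value);
    }

    // absoluteName is null when "name" is not a writing-mode relative property.
    String lookup(String name, String absoluteName, boolean inherited, String initialValue) {
        // Walk up the ancestors only if the property is inherited.
        for (SketchPropertyList pl = this; pl != null; pl = inherited ? pl.parent : null) {
            if (absoluteName != null && pl.explicit.containsKey(absoluteName)) {
                return pl.explicit.get(absoluteName); // absolute value wins
            }
            if (pl.explicit.containsKey(name)) {
                return pl.explicit.get(name);
            }
        }
        return initialValue;
    }
}
```
  ]]></source>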
  
  
  <s2 title="References">
  
  <dl><dt>docs/design/properties.xml</dt> <dd>a more detailed version of this
  (generated HTML in docs/html-docs/design/properties.html)</dd>
  
  
  <dt>src/codegen/properties.dtd</dt> <dd>heavily commented DTD for foproperties.xml,
  but may not be completely up-to-date</dd></dl></s2>
  
  
  <s2 title="To Do"> <s3 title="documentation">
                                                                   
  <ul><li>Explain PropertyManager vs. direct access</li>
  <li>Explain corresponding properties</li></ul></s3>
  
  
  <s3 title="development">
  
  <p>Lots of properties are incompletely handled, especially funny kinds of
  keyword values and shorthand values (one attribute which sets several
  properties)</p></s3></s2>
  
  </s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/renderers.xml
  
  Index: renderers.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Renderers</title> 
         <subtitle>All you wanted to know about the Renderers !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Renderers"> 
                <p>Yet to come :))</p> 
                <note>The series of notes for developers has started but it has not 
yet gone so far ! Keep watching</note></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/status.xml
  
  Index: status.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>Tutorial series Status</title> 
         <subtitle>Current Status of tutorial about FOP and Design</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="Tutorial series Status"> <p>Peter said: Do we have a volunteer to track
                  Keiron's tutorials and turn them into web page documentation?</p>
                  <p><strong>The answer is yes, we have, but the work is in progress !</strong></p>
                  <note>Keiron has recently extended
                  the documentation generation on the CVS trunk to make this process a bit
                  easier. Keiron tells Peter that Apache is readying a major overhaul of its web
                  site and xml-&gt;html generation, but that should not deter us from proceeding
                  with documentation.</note></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/svg.xml
  
  Index: svg.xml
  ===================================================================
  <?xml version="1.0" standalone="no"?>
  <!-- Overview -->
  <document> 
    <header> 
         <title>SVG</title> 
         <subtitle>All you wanted to know about SVG and FOP !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body><s1 title="SVG"> 
                <p>SVG is rendered through Batik.</p><p>The XML from the XSL:FO 
document
                  is converted into an SVG DOM with batik. This DOM is then set as the 
Document
                  on the Foreign Object area in the Area Tree.</p><p>This DOM is then 
available to
                  be rendered by the renderer.</p><p>SVG is rendered in the renderers 
via an
                  XMLHandler in the FOUserAgent. This XML handler is used to render 
the SVG. The
                  SVG is rendered by using batik. Batik converts the SVG DOM into an 
internal
                  structure that can be drawn into a Graphics2D. So for PDF we use a
                  PDFGraphics2D to draw into.</p><p>This creates the necessary PDF 
information to
                  create the SVG image in the PDF document.</p><p>Most of the work is 
done in the
                  PDFGraphics2D class. There are also a few bridges that are plugged 
into batik
                  to provide different behaviour for some SVG elements.</p><s2
                title="Text Drawing"><p>Normally batik converts text into a set of 
curved
                         shapes. </p><p>This is handled as any other shapes when 
rendering to the output. This
                         is not always desirable as the shapes have very fine curves. 
This can cause the
                         output to look a bit bad in PDF and PS (it can be drawn 
properly but is not by
                         default). These curves also require much more data than the 
original
                         text.</p><p>To handle this there is a PDFTextElementBridge 
that is set when
                         using the bridge in batik. If the text is simple enough for 
the text to be
                         drawn in the PDF as with all other text then this sets the 
TextPainter to use
                         the PDFTextPainter. This inserts the text directly into the 
PDF using the
                         drawString method on the PDFGraphics2D.</p><p>Text is 
considered simple if the
                         font is available, the font size is useable and there are no 
tspans or other
                         complications. This can make the resulting PDF significantly
                         smaller.</p></s2><s2 title="PDF Links"><p>To support links in 
PDF another batik
                         element bridge is used. The PDFAElementBridge creates a 
PDFANode which inserts
                         a link into the PDF document via the 
PDFGraphics2D.</p><p>Since links are
                         positioned on the page without any transforms then we need to 
transform the
                         coordinates of the link area so that they match the current 
position of the a
                         element area. This transform may also need to account for the 
svg being
                         positioned on the page.</p></s2><s2 title="Images"><p>Images 
are normally drawn
                         into the PDFGraphics2D. This then creates a bitmap of the 
image data that can
                         be inserted into the PDF document. </p><p>As PDF can support 
jpeg images then another
                         element bridge is used so that the jpeg can be directly 
inserted into the
                         PDF.</p></s2><s2 title="PDF Transcoder"><p>Batik provides a 
mechanism to
                         convert SVG into various formats. Through FOP we can convert an SVG document
                         into a single-page PDF document. The page contains the SVG drawn as well as
                         possible on the page. There is a PDFDocumentGraphics2D that creates a
                         standalone PDF document with a single page. Batik then draws into it in
                         the same way as with the PDFGraphics2D.</p></s2><s2
                title="Other Outputs"><p>When rendering to AWT the SVG is simply drawn onto the
                         AWT canvas using Batik.</p><p>The PS Renderer uses a technique similar to that of the
                         PDF Renderer.</p><p>The SVG Renderer simply embeds the SVG inside an svg
                         element.</p></s2><s2 title="Associated Tasks"><ul>
                                <li>PDF transparency is needed to get accurate drawing.</li>
                                <li>The drawRenderedImage methods need implementing.</li>
                                <li>Handle colour space better.</li>
                                <li>Improve link handling with PDF.</li>
                                <li>Improve image handling.</li></ul></s2></s1> 
    </body></document>
  
  
  1.1                  xml-fop/docs/design/understanding/understanding.xml
  
  Index: understanding.xml
  ===================================================================
  <?xml version="1.0"?>
  <!-- Overview -->
  <document> 
    <header> 
     <title>Understanding FOP Design</title> 
     <subtitle>Tutorial series about Design Approach to FOP</subtitle> 
     <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
     </authors> 
    </header> 
    <body>
  <s1 title="Understanding">
        <note>
            The content of this <strong>Understanding series</strong>
            was all taken from the interactive FOP development mailing
            list discussion. <br/> We strongly advise you to join this
            mailing list and ask questions about this series there. <br/>
            You can subscribe to [EMAIL PROTECTED] by sending an
            email to <link href=
            "mailto:[EMAIL PROTECTED]"
            >[EMAIL PROTECTED]</link>.  <br/> You will
            find more information about how to get involved <link href=
            "http://xml.apache.org/fop/involved.html"
            >there</link>.<br/> You can also read the <link href=
            "http://marc.theaimsgroup.com/?l=fop-dev&amp;r=1&amp;w=2"
            >archive</link> of the discussion list fop-dev to get an
            idea of the issues being discussed.
        </note>
        <s2 title="Introduction">
          <p>
            Welcome to the Understanding series. This will be
            a series of notes for developers to understand how FOP
            works. We will
            attempt to clarify the processes involved in going from XML (FO)
            to PDF or other formats. Some areas will get more
            complicated as we proceed.
          </p>
        </s2>
    
    
      <s2 title="Overview"> 
        <p>FOP takes an XML file, does its magic, and then writes a document to a
         stream.</p> 
        <p>xml -&gt; [FOP] -&gt; document</p> 
        <p>The document could be PDF, PS etc. or directed to a printer or the
         screen. The principle remains the same. The XML document must be in the XSL-FO
         format.</p> 
        <p>For convenience we provide a mechanism to handle XML+XSL as
         input.</p> 
        <p>The XML document is always handled internally as SAX. The SAX events
         are used to read the elements, attributes and text data of the FO document.
         After the data has been processed, the renderer writes out the pages in the
         appropriate format. It may write as it goes, a page at a time, or the whole
         document at once. Once finished, the document should contain all the data in
         the chosen format, ready for whatever use.</p></s2> 
      <s2 title="Stages"><p>The FO data goes through a few stages. Each piece
         of data will generally go through the process in the same way but some
         information may be used a number of times or in a different order. To reduce
         memory use, one stage will start before the previous one has completed.</p> 
        <p>SAX Handler -&gt; FO Tree -&gt; Layout Managers -&gt; Area Tree
         -&gt; Render -&gt; document</p> 
        <p>In the case of RTF, MIF etc. <br/>SAX Handler -&gt; FO Tree -&gt;
         Structure Renderer -&gt; document</p> 
        <p>The FO Tree is constructed from the XML document. It is an internal
         representation of the XML document and it is like a DOM with some differences.
         The Layout Managers use the FO Tree to do their layout work and create an Area
         Tree. The Area Tree is a representation of the final result: a set of pages
         containing the text and other graphics. The
         Area Tree is then given to a Renderer. The Renderer can read the Area Tree and
         convert the information into the render format. For example the PDF Renderer
         creates a PDF Document. For each page in the Area Tree the renderer creates a
         PDF Page and places the contents of the page into the PDF Page. Once a PDF Page
         is complete, it can be written to the output stream.</p> 
        <p>For the structure documents, the Structure Listener reads
         directly from the FO Tree and creates the document. These documents do not need
         the layout process or the Area Tree.</p></s2> 
      <s2 title="Associated Tasks"><p>Verify Structure Listener
         concept.</p></s2> 
      <s2 title="Further Topics"> 
        <ul><li>XML parsing</li> 
         <li>FO Tree</li> 
         <li>Properties</li> 
         <li>Layout Managers</li> 
         <li>Layout Process</li> 
         <li>Handling Attributes</li> 
         <li>Area Tree</li> 
         <li>Renderers</li> 
         <li>Images</li> 
         <li>PDF Library</li> 
         <li>SVG</li> 
        </ul> 
      </s2> 
      
     </s1>  </body></document>
  
  
  
  
  1.1                  xml-fop/docs/design/understanding/xml_parsing.xml
  
  Index: xml_parsing.xml
  ===================================================================
  <?xml version="1.0"?>
  <document> 
    <header> 
         <title>XML Parsing</title> 
         <subtitle>All you wanted to know about XML Parsing !</subtitle> 
         <authors> <person name="Keiron Liddle" email="[EMAIL PROTECTED]"/> 
         </authors> 
    </header> 
    <body>
    
  <s1 title="XML Parsing"><p>Since everyone knows the basics, we can get
                    into the various stages, starting with the XML handling.</p> 
                  <s2 title="XML Input"><p>FOP can take the input XML in a number of ways:
                           </p>
          <ul>
            <li>SAX Events through SAX Handler
              <ul>
                <li>
                  <code>FOTreeBuilder</code> is the SAX Handler which is
                  obtained through <code>getContentHandler</code> on
                  <code>Driver</code>.
                </li>
              </ul>
            </li>
            <li>
              DOM which is converted into SAX Events
              <ul>
                <li>
                  The conversion of a DOM tree is done via the
                  <code>render(Document)</code> method on
                  <code>Driver</code>.
                </li>
              </ul>
            </li>  
            <li>
              Data source which is parsed and converted into SAX Events
              <ul>
                <li>
                  The <code>Driver</code> can take an
                  <code>InputSource</code> as input.  This can use a
                  <code>Stream</code>, <code>String</code> etc.
                </li>
              </ul>
            </li> 
            <li>
              XML+XSLT which is transformed using an XSLT Processor and
              the result is fired as SAX Events
              <ul>
                <li>
                  <code>XSLTInputHandler</code> is used as an
                  <code>InputSource</code> in the
                  <code>render(XMLReader, InputSource)</code> method on
                  <code>Driver</code>.
                </li>
              </ul>
            </li>
          </ul>
                                  
                    <p>The SAX Events which are fired on the SAX Handler, class
                           <code>FOTreeBuilder</code>, must represent an XSL-FO document;
                           if they do not, an error is raised. Any problems with the XML
                           not being well formed are handled here.</p></s2> 
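                    <p>The input paths listed under XML Input all end in the same place: SAX events
                           delivered to one handler. The sketch below illustrates two of them using
                           only the JDK's JAXP classes. It does not use FOP's <code>Driver</code> or
                           <code>FOTreeBuilder</code>; the class <code>SaxInputSketch</code> and its
                           <code>CountingHandler</code> are hypothetical stand-ins, and the identity
                           transform used for the DOM case is an assumption about one reasonable way
                           to turn a DOM into SAX events, not a description of how FOP does it.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class SaxInputSketch {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";
    static final String SAMPLE =
        "<fo:root xmlns:fo=\"" + FO_NS + "\"><fo:layout-master-set/>"
        + "<fo:page-sequence/></fo:root>";

    /** Hypothetical stand-in for the real SAX handler: counts FO elements. */
    static class CountingHandler extends DefaultHandler {
        int foElements = 0;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (FO_NS.equals(uri)) {
                foElements++;
            }
        }
    }

    /** Data source: a parser reads the InputSource and fires SAX events. */
    static int fromInputSource(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            CountingHandler handler = new CountingHandler();
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.foElements;
        } catch (Exception e) {
            // Includes SAXParseException for XML that is not well formed.
            throw new IllegalStateException(e);
        }
    }

    /** DOM: an identity transform replays the tree as SAX events. */
    static int fromDom(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            CountingHandler handler = new CountingHandler();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new SAXResult(handler));
            return handler.foElements;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromInputSource(SAMPLE));
        System.out.println(fromDom(SAMPLE));
    }
}
```
]]></source>
                    <p>Both methods hand the same three FO elements to the handler, which is the
                           point of the design: the handler does not need to know where the events
                           came from.</p>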
                  <s2 title="Element Mappings"><p>The element mapping is a hash map of all
                           the elements in a particular namespace. This makes it easy to create a
                           different object for each element. Element mappings are static to save
                           memory.</p><p>To add an extension, a developer can put a jar on the classpath
                           that contains the file
                           <code>/META-INF/services/org.apache.fop.fo.ElementMapping</code>.
                           This file must contain a line with the fully qualified name of a class that
                           implements the <code>org.apache.fop.fo.ElementMapping</code> interface. The
                           class is then loaded automatically at startup. The internal mappings are
                           FO, SVG and Extension (PDF bookmarks).</p></s2> 
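                    <p>The services file described under Element Mappings is a plain text list of
                           class names. As a rough illustration of how such a file can be consumed, the
                           sketch below reads provider names from a reader and instantiates them by
                           reflection. <code>ServiceFileSketch</code> and its methods are hypothetical,
                           the handling of <code>#</code> comments is an assumption borrowed from the
                           convention used by Java's own <code>java.util.ServiceLoader</code>, and the
                           example loads JDK collection classes rather than real element mappings.</p>
<source><![CDATA[
```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

class ServiceFileSketch {
    /** Reads provider class names, one per line; '#' starts a comment. */
    static List<String> readProviderNames(Reader in) {
        List<String> names = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int hash = line.indexOf('#');
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (!line.isEmpty()) {
                    names.add(line);
                }
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    /** Instantiates each named class and checks that it has the expected type. */
    static <T> List<T> instantiate(List<String> names, Class<T> type) {
        List<T> providers = new ArrayList<>();
        for (String name : names) {
            try {
                Object provider = Class.forName(name).getDeclaredConstructor().newInstance();
                providers.add(type.cast(provider));
            } catch (ReflectiveOperationException | ClassCastException e) {
                throw new IllegalStateException("Cannot load provider " + name, e);
            }
        }
        return providers;
    }
}
```
]]></source>
                    <p>In a real setup the reader would be opened on each copy of the services file
                           found on the classpath, and the expected type would be the mapping
                           interface.</p>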
                  <s2 title="Tree Building"><p>The SAX Events will fire all the information
                           for the document with start element, end element, text data etc. This
                           information is used to build up a representation of the FO document. To do
                           this, there is a set of element mappings for each namespace. When a mapping
                           is found for an element and its namespace, it is used to create an object for
                           that element. If the element is not found, a dummy object is created, or a
                           generic DOM for unknown namespaces.</p> 
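                    <p>The lookup just described can be pictured as a two-level hash map: namespace
                           first, then element name, with a fallback when nothing matches. The sketch
                           below is a minimal illustration of that idea only. The names
                           <code>ElementMappingSketch</code>, <code>Block</code> and
                           <code>UnknownNode</code> are hypothetical, and the real mapping classes are
                           likely to differ in shape.</p>
<source><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ElementMappingSketch {
    static final String FO_NS = "http://www.w3.org/1999/XSL/Format";

    // namespace URI -> (local element name -> factory for the node object).
    // Static, so the tables are shared by every document that is processed.
    static final Map<String, Map<String, Supplier<Object>>> MAPPINGS = new HashMap<>();

    /** Hypothetical node type for a known element. */
    static class Block { }

    /** Hypothetical dummy object for elements without a mapping. */
    static class UnknownNode {
        final String name;

        UnknownNode(String name) {
            this.name = name;
        }
    }

    static {
        Map<String, Supplier<Object>> fo = new HashMap<>();
        fo.put("block", Block::new);
        MAPPINGS.put(FO_NS, fo);
    }

    /** Creates the object for an element, or a dummy if no mapping exists. */
    static Object create(String namespace, String localName) {
        Map<String, Supplier<Object>> table = MAPPINGS.get(namespace);
        Supplier<Object> maker = (table == null) ? null : table.get(localName);
        return (maker != null) ? maker.get() : new UnknownNode(localName);
    }
}
```
]]></source>
                    <p>A SAX handler would call <code>create</code> from its start element callback,
                           passing the namespace URI and local name it receives.</p>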
                    <p>The object is then set up and given the attributes for the element.
                           For the FO Tree the attributes are converted into properties. The FO objects
                           use a property list mapping to convert the attributes into a list of properties
                           for the element. For other XML, for example SVG, a DOM of the XML is
                           constructed. This DOM can then be passed through to the renderer. Other element
                           mappings can be used in different ways, for example to create elements that
                           create areas during the layout process or set up information for the renderer
                           etc.</p> 
          <p>
            While the tree building is mainly about creating the FO Tree,
            there are some stages that can propagate to the renderer. At
            the end of a page sequence we know that all pages in the
            page sequence can be laid out without being affected by any
            further XML. The significance of this is that the FO Tree
            for the page sequence may be able to be disposed of.  The
            end of the XML document also tells us that we can finalise
            the output document.  (The layout of individual pages is
            accomplished by the layout managers a page at a time;
            i.e. they do not need to wait for the end of the page
            sequence.  A page may not yet be complete, however, if it
            contains forward page number references, for example.)
          </p>
        </s2> 
                  <s2 title="Associated Tasks"> 
                    <ul><li>Error handling for XML that is not well formed.</li> 
                           <li>Error handling for other XML parsing errors.</li>
                           <li>Developer info for adding namespace handlers.</li></ul></s2></s1>   
    </body></document>
  
  
