Konstantin Piroumian wrote:
> Hm... Does anybody have an idea on how to paginate the content?

Ok, damn it, I don't have time to mark this up, but since it's the content
that is useful, here's a small tutorial for the Paginator.

                                - o -

Paginator Transformer
=====================

classname: org.apache.cocoon.transformation.pagination.Paginator
location: scratchpad (available in both cocoon 2.1-dev and 2.0.3-dev)

Design idea
-----------

The paginator is a 'FilterTransformer' on pagination steroids. It works by
filtering SAX events out and counting pages.

The design isn't very efficient since it has to process the entire file to
extract a single page. It works nicely with a few tens of pages, but I
would seriously suggest *against* using it for books or very big documents.

The good news is that it's cacheable, so if the document doesn't change and
the same page is requested, there is no need to reprocess the document.

Anyway, for static generation, all this doesn't really matter.

A simple example of use
-----------------------

Suppose you have an XML file like this

 <a>
  <b/>
  <b/>
  <b/>
  <b/>
  <b/>
  <b/>
  <b/>
 </a>

and you want to paginate this with 3 <b> elements per page.

In order to achieve this, you write a simple "pagesheet" (which contains
the instructions for the filter, much like a stylesheet gives instructions
to an XSLT processor) like this:

 <?xml version="1.0"?>
 <pagesheet xmlns="http://apache.org/cocoon/paginate/1.0">
  <rules>
   <count type="element" name="b" num="3"/>
  </rules>
 </pagesheet>

then you connect the two with a sitemap snippet like this:

 <map:match pattern="page(*)">
  <map:generate src="document.xml"/>
  <map:transform type="paginate" src="pagesheets/images.xml">
   <map:parameter name="page" value="{1}"/>
  </map:transform>
  <map:serialize type="xml"/>
 </map:match>

and accessing the URI page(1) yields

 <a>
  <b/>
  <b/>
  <b/>
  <page:page xmlns:page="http://apache.org/cocoon/paginate/1.0"
             current="1" total="3" current-uri="page(1)" clean-uri="page"/>
 </a>

which can be easily transformed into something more meaningful.

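To make the counting idea concrete, here is a minimal sketch in Python (not
the Paginator's Java/SAX code). The function name paginate_flat and its
arguments are made up for illustration, and it only handles the flat case
where the counted elements are direct children of the root. It shows why the
whole document has to be walked: the total is only known once every counted
element has been seen.

```python
import xml.etree.ElementTree as ET


def paginate_flat(xml_text, name, num, page):
    """Return (tags kept on `page`, total pages), counting `name` elements.

    Hypothetical helper: mimics <count type="element" name=... num=.../>
    for elements that are direct children of the root.
    """
    root = ET.fromstring(xml_text)
    kept = []
    seen = 0  # counted elements seen so far, in document order
    for child in root:
        if child.tag != name:
            continue
        seen += 1
        # The element belongs to page ((seen - 1) // num) + 1.
        if (seen - 1) // num + 1 == page:
            kept.append(child.tag)
    # The total needs a full pass: ceil(seen / num).
    total = (seen + num - 1) // num
    return kept, total


if __name__ == "__main__":
    document = "<a>" + "<b/>" * 7 + "</a>"
    for requested in (1, 2, 3):
        print(requested, paginate_flat(document, "b", 3, requested))
```

With the 7-element document above and num=3, the last page keeps a single
<b> and the total is 3.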
Note that the transformer processes all the pages to obtain the 'total'.
There is no way around this.

Adding navigation
-----------------

The problem with XSLT-based pagination is that the logic is very complex to
define in XSLT and is rarely reusable across different pagination needs.
This was the main reason for creating a custom component for this.

But since we have a full-blown pagesheet language, there are a few other
things that we can make the Paginator do, most importantly, navigation.

For example, this other pagesheet

 <?xml version="1.0"?>
 <pagesheet xmlns="http://apache.org/cocoon/paginate/1.0">
  <rules>
   <count type="element" name="b" num="3"/>
   <link type="unit" num="1"/>
  </rules>
 </pagesheet>

indicates that the transformer must understand how the page was encoded in
the given URI and provide a link to the pages +/- 1 position away, if they
are available.

So, using the same environment as before, we get

 <a>
  <b/>
  <b/>
  <b/>
  <page:page xmlns:page="http://apache.org/cocoon/paginate/1.0"
             current="1" total="3" current-uri="page(1)" clean-uri="page">
   <page:link page="2" type="next" uri="page(2)"/>
  </page:page>
 </a>

which indicates

 1) there is no page 0, so no 'prev' link is created.
 2) the link goes to page 2, the type is 'next' (useful for visualization)
    and the URI is page(2) (useful for linking without XSLT-specific logic).

NOTE: the URI is re-encoded using the same pattern; this paginator assumes
that 'round brackets' are used to identify page numbering.

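A rough sketch of the link logic described above, in Python rather than the
transformer's own Java. The names parse_page, encode_page and unit_links are
invented for this example, and the regular expression is only my reading of
the 'round brackets' convention; the real code may parse the URI differently.

```python
import re

# Assumption: the page number is the first run of digits inside round brackets.
PAGE_PATTERN = re.compile(r"\((\d+)\)")


def parse_page(uri):
    """Extract the page number from a URI such as 'page(2)'."""
    match = PAGE_PATTERN.search(uri)
    if match is None:
        raise ValueError("no page number found in %r" % uri)
    return int(match.group(1))


def encode_page(uri, page):
    """Re-encode `uri` so that it points at `page`, keeping the same pattern."""
    return PAGE_PATTERN.sub("(%d)" % page, uri, count=1)


def unit_links(uri, total, num):
    """Links to the pages within +/- `num` of the current one, when they exist."""
    current = parse_page(uri)
    links = []
    for offset in range(-num, num + 1):
        target = current + offset
        if offset == 0 or target < 1 or target > total:
            continue  # skip the current page and pages that do not exist
        links.append({
            "page": target,
            "type": "prev" if offset < 0 else "next",
            "uri": encode_page(uri, target),
        })
    return links


if __name__ == "__main__":
    for link in unit_links("page(1)", total=3, num=1):
        print(link)
```

For page(1) of 3 with num=1 this produces a single 'next' link to page(2),
because page 0 does not exist.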
Now, without changing anything, requesting page(2) would yield

 <a>
  <b/>
  <b/>
  <b/>
  <page:page xmlns:page="http://apache.org/cocoon/paginate/1.0"
             current="2" total="3" current-uri="page(2)" clean-uri="page">
   <page:link page="1" type="prev" uri="page(1)"/>
   <page:link page="3" type="next" uri="page(3)"/>
  </page:page>
 </a>

while page(3) would yield:

 <a>
  <b/>
  <page:page xmlns:page="http://apache.org/cocoon/paginate/1.0"
             current="3" total="3" current-uri="page(3)" clean-uri="page">
   <page:link page="2" type="prev" uri="page(2)"/>
  </page:page>
 </a>

NOTE: here there is only one <b> because the original document doesn't
contain enough elements to fill the page entirely. It's the remainder of
the division (7 elements, 3 per page, 1 left over).

A real-life example
-------------------

Here is a pagesheet which is a little more complex.

Paginating the results from DirectoryGenerator:

 <?xml version="1.0"?>
 <pagesheet xmlns="http://apache.org/cocoon/paginate/1.0">
  <rules>
   <count type="element" name="file"
          namespace="http://apache.org/cocoon/directory/2.0" num="16"/>
   <link type="unit" num="2"/>
   <link type="range" value="5"/>
  </rules>
 </pagesheet>

This says:

 1) paginate 16 files per page
 2) provide me with links to +/- 1 and +/- 2 pages (when available)
 3) provide me with links to +/- 5 pages (when available)

So, suppose we have a directory with 300 files and we request page 10. The
generated page will be

 <dir:directory>
  <dir:file ...>
  [other 15 dir:file]
  <page:page xmlns:page="http://apache.org/cocoon/paginate/1.0"
             current="10" total="19" current-uri="dir(10)" clean-uri="dir">
   <page:range-link page="5" type="prev" uri="dir(5)"/>
   <page:link page="8" type="prev" uri="dir(8)"/>
   <page:link page="9" type="prev" uri="dir(9)"/>
   <page:link page="11" type="next" uri="dir(11)"/>
   <page:link page="12" type="next" uri="dir(12)"/>
   <page:range-link page="15" type="next" uri="dir(15)"/>
  </page:page>
 </dir:directory>

Asymmetric pagination
---------------------

We also have the ability to indicate different rules for each page, so:

 <pagesheet xmlns="http://apache.org/cocoon/paginate/1.0">
  <rules page="1">
   <count type="element" name="b" num="5"/>
   <link type="unit" num="1"/>
  </rules>
  <rules>
   <count type="element" name="b" num="10"/>
   <link type="unit" num="2"/>
  </rules>
 </pagesheet>

Here page 1 holds 5 <b> elements with +/- 1 links, while every other page
holds 10 <b> elements with +/- 2 links.

Count types
-----------

The paginator works by counting stuff. It's up to you to define what you
want to use for counting, and you do so with the attributes of the <count>
element in the pagesheet.

This element has 2 required attributes:

 num=""  -> a number indicating how many times the thing to count must be
            present in this page.
 type="" -> the type of counting that the paginator must perform.

Two types are defined, but only one is currently implemented:

 type="element" -> makes the paginator count the startElement() SAX events
 type="chars"   -> (not currently implemented!) makes the paginator count
                   the chars included in the page.

In case type="element" is used, two other attributes become useful:

 name=""      -> the name of the element (without namespace prefix!)
 namespace="" -> the URI of the namespace (if not specified, the default
                 NS is used)

                                - o -

Ok, from now on some RT on the future of this transformer:

Using the paginator for docs
----------------------------

I originally wrote the paginator to paginate a directory listing and it
works great for paginating by counting elements.

For docs, it could be possible to paginate by counting sections or
subsections, but this doesn't necessarily yield visually balanced pages
(which is the reason for web pagination).

This is why I planned a way to count by chars, even if I didn't go as far
as implementing it: while paginating by counting elements is ok (sounds
trivial, but it's not! think of nesting!), paginating by counting chars is
a real pain, due to the algorithms that must perform 'chunking'.

I mean, assume you have a document like this:

 <p>this is some <strong>text</strong> that happens to be <em>chuncked</em></p>
                                                                ^
                                                                |

and suppose that counting the chars leads you to the chunking point
indicated by the arrow above.

Cutting the page there results in XML which is not well-formed.

Simply 're-well-forming' the XML at that point truncates words.

So, we must provide a way to 're-well-form' the XML only once the first
'block-level' element is encountered (p in this case).

But this means that the pagesheet must contain at least the list of
'block-delimiting' elements (and the current Pagesheet parser and object
model don't support this notion).

Result: pagination at the char level is not trivial and requires a little
bit of work on the transformer.

Nesting behavior
----------------

If counting by chars is a pain, even counting elements is not easy.

Assume you have this:

 <a>
  <b>
   <a>
    <b>
     <a>
      <b/>
     </a>
    </b>
   </a>
  </b>
 </a>

and you want to paginate using one <b> per page. What do the pages look
like?

Ok, I'll give you some space to think about it.

Ok, here is my solution (but I'm not sure it's the best):

page 1:

 <a>
  <b>
   <a>
    <a/>
   </a>
  </b>
 </a>

page 2:

 <a>
  <a>
   <b>
    <a/>
   </b>
  </a>
 </a>

page 3:

 <a>
  <a>
   <a>
    <b/>
   </a>
  </a>
 </a>

I'm pretty sure the current code is buggy someplace, because for deep
nesting like this it loses some SAX events and ends up making the SAX
stream non-well-formed, choking the subsequent transformers which are
sensitive to well-formedness (such as XSLT).

Note: the above might look like a mental exercise to many, but if you think
about our Document DTD 1.1, you'll find nested <section> elements, and
paginating those results in very similar problems.

But I'm not sure if the solution adopted above is meaningful for real-case
pagination. I'm open to suggestions on this.

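The three pages above follow one rule: elements that are not counted always
pass through, a counted element passes only if it belongs to the requested
page, and the children of a dropped counted element are kept and move up one
level. Here is a small Python sketch of that rule, working on an in-memory
tree instead of a SAX stream. The function paginate_nested is a made-up name,
it assumes the root itself is never a counted element, and it is not how the
Java transformer is written.

```python
import xml.etree.ElementTree as ET


def paginate_nested(root, name, per_page, page):
    """Return a new tree holding only the `name` elements of `page`.

    Hypothetical helper: non-counted elements are always kept; the children
    of a dropped counted element are re-attached to its nearest kept ancestor.
    """
    counter = 0  # counted elements seen so far, in document order

    def filter_children(node):
        nonlocal counter
        kept = []
        for child in node:
            if child.tag == name:
                counter += 1
                on_page = (counter - 1) // per_page + 1 == page
            else:
                on_page = True  # non-counted elements always pass
            descendants = filter_children(child)
            if on_page:
                copy = ET.Element(child.tag, dict(child.attrib))
                copy.text = child.text
                copy.extend(descendants)
                kept.append(copy)
            else:
                kept.extend(descendants)  # drop the tag, keep its content
        return kept

    result = ET.Element(root.tag, dict(root.attrib))
    result.extend(filter_children(root))
    return result


if __name__ == "__main__":
    source = ET.fromstring("<a><b><a><b><a><b/></a></b></a></b></a>")
    for requested in (1, 2, 3):
        tree = paginate_nested(source, "b", 1, requested)
        print(requested, ET.tostring(tree, encoding="unicode"))
```

Because it builds a whole tree, the output is always well-formed, which
sidesteps (rather than fixes) the lost-SAX-event problem mentioned above.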
Improving the concept
---------------------

One possible way to improve the concept is to count by XPath results, that
is, you might want to count by 'sections included in sections'.

Another way to improve the system is to provide boolean operators: you
might want to count 'sections AND chapters' (XPath probably helps here as
well).

Ok, anyway, hope this helps and sorry for taking so long to write this.

--
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<[EMAIL PROTECTED]>                             Friedrich Nietzsche