shannon 2002/12/03 20:16:45

Modified: src/documentation/xdocs/faq Tag: cocoon_2_0_3_branch
            faq-configure-c2.xml
          src/documentation/xdocs/howto/xmlform-wizard Tag: cocoon_2_0_3_branch
            howto-xmlform-wizard-4.xml
          src/documentation/xdocs/userdocs/concepts Tag: cocoon_2_0_3_branch
            xmlsearching.xml
          src/documentation/xdocs/userdocs/generators Tag: cocoon_2_0_3_branch
            search-generator.xml
Log:
Sync with Head.

Revision Changes Path
No revision
No revision
1.1.2.5 +17 -1 xml-cocoon2/src/documentation/xdocs/faq/faq-configure-c2.xml

Index: faq-configure-c2.xml
===================================================================
RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/faq/faq-configure-c2.xml,v
retrieving revision 1.1.2.4
retrieving revision 1.1.2.5
diff -u -r1.1.2.4 -r1.1.2.5
--- faq-configure-c2.xml 15 Nov 2002 13:41:45 -0000 1.1.2.4
+++ faq-configure-c2.xml 4 Dec 2002 04:16:44 -0000 1.1.2.5
@@ -14,11 +14,16 @@
 <source><![CDATA[
 ...
- <cocoon version="2.0" user-roles="WEB-INF/my.roles">
+ <cocoon version="2.0" user-roles="/WEB-INF/my.roles">
 ...
 ]]></source>
 <p>
+ if you are using Tomcat 4. For other versions, skip the leading
+ slash of the URI.
+ </p>
+
+ <p>
 And create a new file my.roles in WEB-INF directory with
 </p>
@@ -86,5 +91,16 @@
 </answer>
 </faq>
+<faq>
+<question>
+How can I solve 'Too many open files' errors when I try to create a search index for my site?
+</question>
+<answer>
+ <p>Either reduce the number of tags in your documant, by filtering it with xslt in your search view.</p>
+ <p>Or, increase the limit on the number of files your opperating system allows you to have open simultaneously, in the shell you launch the servlet engine in.</p>
+ <p>example (Linux, bash shell): ulimit 2048</p>
+ <p>example (MacOSX, tcsh shell): limit descriptors 2048</p>
+</answer>
+</faq>
 </faqs>

No revision
No revision
1.2.2.4 +20 -30 xml-cocoon2/src/documentation/xdocs/howto/xmlform-wizard/howto-xmlform-wizard-4.xml

Index: howto-xmlform-wizard-4.xml
===================================================================
RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/howto/xmlform-wizard/howto-xmlform-wizard-4.xml,v
retrieving revision 1.2.2.3
retrieving revision 1.2.2.4
diff -u -r1.2.2.3 -r1.2.2.4
--- howto-xmlform-wizard-4.xml 3 Jul 2002 22:50:36 -0000 1.2.2.3
+++ howto-xmlform-wizard-4.xml 4 Dec 2002 04:16:44 -0000 1.2.2.4
@@ -299,51 +299,41 @@
 * called in the beginning Form.populate()
 * before population starts.
 *
- * This is the place to handle unchecked checkboxes.
+ * This is NOT the place to handle unchecked checkboxes,
+ * if the form is stored in the session.
+ * The XMLForm framework will automatically handle unchecked
+ * check-boxes for session scope forms.
+ *
+ * Only request scoped forms need handle check boxes explicitly.
+ *
 *
 */
 public void reset( Form form )
 {
- // based on the current form view
- // make some decisions regarding checkboxes, etc.
- String formView = getFormView();
- if ( formView.equals ( VIEW_INTEREST ) )
- {
- // deal with the organicGardening checkbox
+ // nothing to do here
+
+ // unchecked check boxes are handled automatically
+ // since this is a session scoped form
+
+ /*
+ No need for any of the following:
+
 form.setValue( "/organicGardening", Boolean.FALSE );
- // deal with the cooking checkbox
 form.setValue( "/cooking", Boolean.FALSE );
- // deal with the smallholdingManagement checkbox
 form.setValue( "/smallholdingManagement", Boolean.FALSE );
- }
- else if ( formView.equals ( VIEW_GARDENING ) )
- {
- // deal with the flowers checkbox
+
 form.setValue( "/flowers", Boolean.FALSE );
- // deal with the vegetables checkbox
 form.setValue( "/vegetables", Boolean.FALSE );
- // deal with the fruitTrees checkbox
 form.setValue( "/fruitTrees", Boolean.FALSE );
- }
- else if ( formView.equals ( VIEW_COOKING ) )
- {
- // deal with the traditionalReciepes checkbox
+
 form.setValue( "/traditionalReciepes", Boolean.FALSE );
- // deal with the soups checkbox
 form.setValue( "/soups", Boolean.FALSE );
- // deal with the veganCookery checkbox
 form.setValue( "/veganCookery", Boolean.FALSE );
- }
- else if ( formView.equals ( VIEW_SMALLHOLDING ) )
- {
- // deal with the pigKeeping checkbox
+
 form.setValue( "/pigKeeping", Boolean.FALSE );
- // deal with the pygmyGoats checkbox
 form.setValue( "/pygmyGoats", Boolean.FALSE );
- // deal with the henKeeping checkbox
 form.setValue( "/henKeeping", Boolean.FALSE );
- }
-
+ */
 }
@@ -375,7 +365,7 @@
 */
 public boolean filterRequestParameter (Form form, String parameterName)
 {
- // TBD
+ // Nothing to do in this case
 return false;
 }

No revision
No revision
1.2.2.1 +92 -10 xml-cocoon2/src/documentation/xdocs/userdocs/concepts/xmlsearching.xml

Index: xmlsearching.xml
===================================================================
RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/userdocs/concepts/xmlsearching.xml,v
retrieving revision 1.2
retrieving revision
1.2.2.1
diff -u -r1.2 -r1.2.2.1
--- xmlsearching.xml 18 Feb 2002 09:25:27 -0000 1.2
+++ xmlsearching.xml 4 Dec 2002 04:16:44 -0000 1.2.2.1
@@ -9,6 +9,7 @@
 <type>Technical document</type>
 <authors>
 <person name="Bernhard Huber" email="[EMAIL PROTECTED]"/>
+ <person name="Jeremy Quinn" email="[EMAIL PROTECTED]"/>
 </authors>
 </header>
@@ -19,9 +20,9 @@
 in Apache Cocoon.
 </p>
 <p>
- Indexing describes the process of fetching XML documents from an Apache Cocoon
+ Indexing is the process of fetching XML documents from an Apache Cocoon
 instance, and building an index file.
- Searching describes the process of querying the once built index.
+ Searching is the process of querying the once built index.
 </p>
 </s1>
@@ -48,21 +49,21 @@
 Specifying the base URL determines the protocol for fetching XML resources.
 The implementation offers to specify <code>http:</code> URLs, crawling
 an Apache Cocoon instance deployed in a servlet-engine.
- Alternativly you may specify an URI, e.g.: <code>/documents/index.html</code>,
+ Alternatively you may specify an URI, e.g.: <code>/documents/index.html</code>,
 offering to crawl the local Apache Cocoon instance only, either servlet-deployed,
 or in commandline-mode.
 </p>
 </s2>
 <s2 title="Fetching URL resource">
 <p>
- This processing step fetches an URL resource from Apache Cocoon.
+ This processing step fetches the URL resource from Apache Cocoon.
 </p>
 <p>
 Apache Cocoon offers the feature of views. This feature is used to fetch the 'bare'
 content of an URL.
 </p>
 <p>
- The above described crawling component is used by the this processing step
+ The crawling component described above is used by the this processing step
 to retrieve a link of an XML document. The link name is augmented by
 a cocoon view name for fetching the XML resource.
 </p>
@@ -70,12 +71,16 @@
 The Avalon component <code>CocoonCrawler</code> defines the interface of a
 crawler.
 </p>
+ <p>
+ The Avalon component <code>SimpleCocoonCrawlerImpl</code> is the implementation.
+ It can be configured to use a specific view, or default to the 'content' view.
+ </p>
 </s2>
 <s2 title="Generating index">
 <p>
 A xml resource is fed into a indexing engine.
 Generating an index specifies which elements of an XML resources
- should get indexed, how the elements are stored in the indexed.
+ should get indexed, how the elements are stored in the index.
 Moreover the physical file location of the index is specified by this
 processing step.
 </p>
@@ -89,6 +94,7 @@
 as field name. An attribute has following field name
 <code>{element-name}@{attribute-name}</code>.
 </li>
+ <li>XML elements that match the names you configured in cocoon.xconf are added as stored fields.</li>
 </ul>
 <p>
 The Avalon component <code>LuceneCocoonIndexer</code> defines the interface
@@ -133,7 +139,7 @@
 <p>
 As both Avalon components <code>LuceneXMLIndexer</code>, and
 <code>LuceneCocoonSearcher</code> may use the same Lucene index, you must
- take care of the Lucene index structure in both compoents.
+ take care of the Lucene index structure in both components.
 </p>
 <p>
 The current implementation uses following Lucene index layout
@@ -145,11 +151,11 @@
 </li>
 <li>Each XML element generates a Lucene field having the same name as the XML element name.
 For example searching for occurences of <code>Cocoon</code> inside of an XML abstract
- elemen, use query-string <code>abstact:Cocoon</code>.
+ element, use query-string <code>abstact:Cocoon</code>.
 </li>
 <li>Each XML attribute generates a Lucene field having the name
 <code>{element-name}@{attribute-name}</code>.
- For example searching for occurences of <code>Cocoon</code> inside of an XML title attribute
+ For example searching for occurrences of <code>Cocoon</code> inside of an XML title attribute
 of s1 element, use query-string <code>s1@title:Cocoon</code>.
 </li>
 <li>
@@ -163,6 +169,10 @@
 the index. This field is used for checking if the XML resource is newer than
 the information stored in the Lucene index.
 </li>
+ <li>
+ Further Stored fields can be added, depending on your configuration.
+ Stored fields are returned in the hits found by the engine.
+ </li>
 </ul>
 </s1>
@@ -171,10 +181,38 @@
 Configuring the indexing, and searching Avalon components is specified in
 the <code>cocoon.xconf</code> file.
 </p>
+ <s2 title="example">
+ <p>This would set up the crawler to crawl all of your site, except pages in the 'search' section, also we are telling the crawler to use a non-standard cocoon-view for getting the links in documents, called <code>my-search-links</code>. </p>
+<source><![CDATA[
+<cocoon-crawler logger="core.search.crawler">
+ <exclude>.*/search/.*</exclude>
+ <link-view-query>cocoon-view=my-search-links</link-view-query>
+</cocoon-crawler>
+]]></source>
+ <p>This tells the indexer to use the non-standard 'my-search-content' view to retrieve the content for indexing. Also it tells the indexer that we would like to have any <code>title</code> or <code>subtitle</code> XML elements in the document added to the index as stored fields, so they can be retrieved and displayed to the user with any hits they get.</p>
+<source><![CDATA[
+<lucene-xml-indexer logger="core.search.lucene">
+ <store-fields>title, subtitle</store-fields>
+ <content-view-query>cocoon-view=my-search-content</content-view-query>
+</lucene-xml-indexer>
+]]></source>
+ </s2>
 <p>
 Setting up the sitemap component SearchGenerator takes place in the
 <code>sitemap.xmap</code> file.
 </p>
+ <s2 title="example">
+ <p>This would generate a document from a search, getting the query and other information from request parameters.</p>
+<source><![CDATA[
+<map:generate type="search"/>
+]]></source>
+ <p>This would generate a document from a search, getting the query from the sitemap parameter '1' and other information from request parameters.</p>
+<source><![CDATA[
+<map:generate type="search">
+ <map:parameter name="query" value="{1}"/>
+</map:generate>
+]]></source>
+ </s2>
 </s1>
 <s1 title="Implementation notes">
@@ -236,7 +274,51 @@
 needs.
 </p>
 </s1>
-
+ <s1 title="Extending the Sample">
+ <p>
+ It is easy to extend the search sample to display more information about the search hit than just the url of the resource.</p>
+ <p>In order to show, for example, the title and summary of a document, these first need to be added to the search index as 'Stored Fields'. Then when the documents are found during a search, that information is available to display, from the search engine itself.</p>
+ <p>First, decide which fields you want to store.</p>
+ <p>Decide where is the best place in your pipeline for content to be extracted for indexing, it might not always be the default view 'content'.</p>
+ <p>Next, decide if you need an XSLT transformation on your documents, to make them more suitable for indexing. This may include deciding on one of several titles in your document, what part of your document gets added to the summary etc. You might want to strip certain tags out because you don't want their content searched.
You might be able to raise hit scores on documents by re-arranging content, or keeping larger amounts of content in fewer tags.</p>
+ <p>Now you tell the search engine (in cocoon.xconf) which tags you'd like storing.</p>
+<source><![CDATA[
+<lucene-xml-indexer logger="core.search.lucene">
+ <store-fields>title, summary</store-fields>
+ <content-view-query>cocoon-view=search-content</content-view-query>
+</lucene-xml-indexer>
+]]></source>
+ <p>This example tells the indexer to store any tags called 'title' or 'summary' it finds in your documents. It also tells the indexer to get it's content from the view called 'search-content'.</p>
+<source><![CDATA[
+<map:view from-label="search" name="search">
+ <map:transform src="search-filter.xsl"/>
+ <map:serialize type="xml"/>
+</map:view>
+]]></source>
+ <p>This is how you might setup that custom view in your sitemap. You would then add a label attribute <code>label="search"</code> to the appropriate place in your pipelines. See the section on views for more information.</p>
+ <p>After you have re-indexed the site, when you do searches, the new fields will be available in the XML output by Lucene, in the form of a <code>search:field</code> tag, you will need to modify your XSLT that displays the hits to show this.</p>
+<source><![CDATA[
+<xsl:template match="search:hit">
+ <tr>
+ <td>
+ <xsl:value-of select="format-number( @search:score, '### %' )"/>
+ </td>
+ <td>
+ <xsl:value-of select="@search:rank"/>
+ </td>
+ <td>
+ <a target="_blank" href="{@search:uri}">
+ <xsl:attribute name="title">
+ <xsl:value-of select="search:field[@search:name='summary']"/>
+ </xsl:attribute>
+ <xsl:value-of select="search:field[@search:name='title']"/>
+ </a>
+ </td>
+ </tr>
+</xsl:template>
+]]></source>
+<p>This is how the search sample's xslt might be changed. All the fields you made for each document are available to you as <code>search:field</code> elements in the <code>search:hit</code> elements.
The code above assumes you only had one 'title' and one 'summary' per document.</p>
+ </s1>
 <s1 title="Summary">
 <p>
 This document gives an overview of the components for

No revision
No revision
1.1.2.3 +7 -1 xml-cocoon2/src/documentation/xdocs/userdocs/generators/search-generator.xml

Index: search-generator.xml
===================================================================
RCS file: /home/cvs/xml-cocoon2/src/documentation/xdocs/userdocs/generators/search-generator.xml,v
retrieving revision 1.1.2.2
retrieving revision 1.1.2.3
diff -u -r1.1.2.2 -r1.1.2.3
--- search-generator.xml 3 Dec 2002 16:39:25 -0000 1.1.2.2
+++ search-generator.xml 4 Dec 2002 04:16:45 -0000 1.1.2.3
@@ -30,6 +30,12 @@
 <source><![CDATA[
 <map:generate type="search"/>
 ]]></source>
+<p>or</p>
+<source><![CDATA[
+<map:generate type="search">
+ <query>your query string</query>
+</map:generate>
+]]></source>
 </s1>
 <s1 title="Configuration">
 <p>
@@ -206,7 +212,7 @@
 count-of-pages CDATA #IMPLIED
 >
-<!ELEMENT hit (field)*>
+<!ELEMENT hit (#PCDATA)>
 <!ATTLIST hit
 rank CDATA #REQUIRED
 score CDATA #IMPLIED
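
Note on the 'Too many open files' FAQ entry added to faq-configure-c2.xml: the entry concerns the per-process limit on open file descriptors. In bash that limit is normally addressed with the -n option of the ulimit builtin; with no option given, bash documents ulimit as acting on the file-size limit instead. What follows is only a minimal sketch for the shell that launches the servlet engine, assuming a bash-like shell. The function name raise_fd_limit is hypothetical, and 2048 is simply the figure used in the FAQ.

```shell
# Hypothetical helper (not part of Cocoon): set the soft open-files limit
# for this shell, clamped to the hard limit so the call does not fail outright.
raise_fd_limit() {
    want=$1
    hard=$(ulimit -H -n)            # hard ceiling for open file descriptors
    if [ "$hard" != "unlimited" ] && [ "$want" -gt "$hard" ]; then
        want=$hard                  # cannot exceed the hard limit without privileges
    fi
    ulimit -S -n "$want"            # change only the soft limit
}

raise_fd_limit 2048                 # value taken from the FAQ example
ulimit -S -n                        # print the soft limit now in effect
```

The limit is inherited by child processes, so the call has to be made in the same shell session that subsequently starts the servlet engine.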