Jose Luis de la Rosa Triviño wrote:
>
> I am testing CPSOOo and when I upload an odt or sxw file the content is
> not indexed.
>
On my side, with a CPS from trunk, both ODT and SXW files get indexed when I create an OpenOffice.org DocBook Document:

2007-06-26T14:58:51 INFO PortalTransforms PATH FROM application/vnd.oasis.opendocument.text TO application/docbook+xml : [<Transform at ooo_to_docbook>]
------
2007-06-26T14:58:51 DEBUG ooo_to_docbook cmd = cd "/tmp/tmpULekmy" && /usr/local/zope/instance/cps/Products/PortalTransforms/transforms/ooo2dbk/ooo2dbk --dbkfile unknown.docb.xml /tmp/tmpULekmy/unknown.sxw 2>"unknown.log-xsltproc"
------
2007-06-26T14:58:57 DEBUG CPSSchemas.FileUtils._convertFileToMimeType to text/html for file <File at document.docb>
------
2007-06-26T14:58:57 INFO PortalTransforms PATH FROM application/docbook+xml TO text/html : [<Transform at docbook_to_html>]
------
2007-06-26T14:58:57 DEBUG docbook_to_html cmd = cd "/tmp/tmpdBmkDr" && /usr/bin/xsltproc --novalid /usr/local/zope/instance/cps/Products/PortalTransforms/transforms/docbook/custom-xhtml.xsl /tmp/tmpdBmkDr/unknown.docb.xml >"unknown.docb.html" 2>"unknown.docb.log-xsltproc"
------
2007-06-26T14:59:01 DEBUG CPSSchemas.FileUtils._convertFileToMimeType to text/plain for file <File at document.html>
------
2007-06-26T14:59:01 INFO PortalTransforms PATH FROM text/html TO text/plain : [<Transform at lynx_dump>]
------

When the log reaches the lynx_dump transform, it means that the HTML is being transformed to text, and it is this plain text that is used for indexing.

What do you get at

http://localhost:8080/cps/portal_transforms/ooo_to_docbook/manage_main
http://localhost:8080/cps/portal_transforms/ooo_to_docbook/manage_introspection

?

> In debug mode, the information thrown to the console is:
>
> ---------------------------------------------
> 2007-06-25 12:54:18 INFO PortalTransforms PATH FROM
> application/vnd.oasis.opendocument.text TO application/docbook+xml :
> [<Transform at ooo_to_docbook>]
> There isn't any value for this parameter. There should be an error in
> your config.xml.
> ------------------------------------------------------
>

This error message comes from the ooo2dbk program. It says that a parameter in the ooo2dbk configuration file is empty when it should not be. Note that the error message itself is misleading: the configuration file is named ooo2dbk.xml, not config.xml.

> The name searched is "ooo2" and the content of the global variable
> configElts is:
>
> -----------------------------------------------------------------------
> [(u'xslt-command', {u'param-syntax': u'--stringparam %s %s', u'command':
> u'xsltproc %v -o %o %s %i', u'name': u'xsltproc'}),
> (u'xslt-command', {u'param-syntax': u'%s=%s', u'command': u'java
> com.icl.saxon.StyleSheet -o %o %i %s %v', u'name': u'saxon'}),
> (u'xslt-command', {u'param-syntax': u'%s=%s', u'command': u'java
> com.icl.saxon.StyleSheet -r
> org.apache.xml.resolver.tools.CatalogResolver -x
> org.apache.xml.resolver.tools.ResolvingXMLReader -y
> org.apache.xml.resolver.tools.ResolvingXMLReader -u -o %o %i %s %v',
> u'name': u'saxon-cat'}),
> (u'xslt-stylesheet', {u'stylesheetPath': u'ooo2dbk.xsl', u'name':
> u'o2d4windows'}),
> (u'xslt-stylesheet', {u'stylesheetPath': u'ooo2dbk.xsl', u'name':
> u'o2d4unix'}),
> (u'dtd', {u'doctype-public': u'"-//OASIS//DTD DocBook XML V4.4//EN"',
> u'name': u'docbook44', u'doctype-system':
> u'"http://www.oasis-open.org/docbook/xml/4.4/docbookx.dtd"'}),
> (u'dtd', {u'doctype-public': u'"-//OASIS//DTD DocBook XML V4.3//EN"',
> u'name': u'docbook43', u'doctype-system':
> u'"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd"'}),
> (u'dtd', {u'doctype-public': u'"-//OASIS//DTD DocBook XML V4.1.2//EN"',
> u'name': u'docbook412', u'doctype-system':
> u'"http://www.oasis-open.org/docbook/xml/4.0/docbookx.dtd"'}),
> (u'images', {u'imageNameRoot': u'img', u'imagesRelativeDirectory':
> u'images'}),
> (u'oooserver', {u'host': u'localhost', u'port': u'2002'}),
> (u'ole', {u'imgFormat': u'png', u'scriptPath': u'ole2img.py'}),
> (u'ooopython',
> {u'path': u'/usr/bin/python'})]
> ---------------------------------------------------------------------
>

The configElts content you pasted looks like it comes from an old version of the configuration file. Please check how many ooo2dbk.xml (or even config.xml) files you have on your system; I think this is the problem. ooo2dbk also looks for configuration files in /etc, and you might have an old version of the ooo2dbk configuration files there.

> I have been looking at the file ooo2dbk.xml and it seems to have a
> statement for ooo2 as it has these lines:
>
> ---------------------------------------------------------
> <xslt-stylesheet
> name="ooo2"
> stylesheetPath="ooo2dbk.odf.xsl"
> />
> --------------------------------------------------------
>

You're looking in the right direction. You should have this file in your directories:

Products/PortalTransforms/transforms/ooo2dbk/ooo2dbk.odf.xsl

> Why the OOO Document does not uses the same transform as the one used by
> the type "File"?
>

The transformation for "File" converts from OOo to HTML, then to text. The transformation for "OOo Documents" converts from OOo to DocBook, then to HTML, and finally to text. The additional DocBook step retrieves more semantics, so the HTML output is richer.

The main reason for this product is that some Nuxeo clients use DocBook as their preferred open file format for long-term storage.

Cheers,

-- 
Marc-Aurèle DARCHE
Open Source Enterprise Content Management (ECM) http://www.nuxeo.org/
NUXEO (Paris, France) http://nuxeo.com/
_______________________________________________
cps-devel mailing list
http://lists.nuxeo.com/mailman/listinfo/cps-devel
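
P.S. The check suggested in this message (how many ooo2dbk.xml or config.xml files exist, and which of them define the "ooo2" stylesheet) could be scripted roughly as follows. This is only a sketch and not part of ooo2dbk: the function names and the choice of search roots are assumptions made for illustration, and it does not reproduce the order in which ooo2dbk really looks up its configuration. The only details taken from this thread are the two file names, the xslt-stylesheet element and its name attribute.

```python
import os
import xml.etree.ElementTree as ET

# File names mentioned in this thread as possible ooo2dbk configuration files.
CONFIG_NAMES = ("ooo2dbk.xml", "config.xml")


def find_config_files(roots, names=CONFIG_NAMES):
    """Yield the path of every file under `roots` whose name is in `names`."""
    for root in roots:
        # os.walk silently skips directories that are missing or unreadable.
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in filenames:
                if filename in names:
                    yield os.path.join(dirpath, filename)


def stylesheet_names(path):
    """Return the name attributes of all <xslt-stylesheet> elements in `path`.

    Returns an empty list when the file cannot be read or parsed as XML.
    """
    try:
        tree = ET.parse(path)
    except (ET.ParseError, OSError):
        return []
    return [elt.get("name") for elt in tree.iter("xslt-stylesheet")]


def report(roots, wanted="ooo2"):
    """Return sorted (path, defines_wanted) pairs for candidate config files.

    Files without any <xslt-stylesheet> element are skipped, since a file
    named config.xml may well belong to an unrelated program.
    """
    results = []
    for path in sorted(find_config_files(roots)):
        names = stylesheet_names(path)
        if not names:
            continue
        results.append((path, wanted in names))
    return results
```

Calling report() with, for example, "/etc" and the Products directory of the Zope instance as roots (adjust both to your installation) returns one pair per candidate file. Any file reported with False defines stylesheets but not "ooo2", which would make it a likely leftover from an older ooo2dbk version.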
