Hi,

I am using the Solr 1.3 DataImportHandler. One of my table fields contains
HTML tags, and I want to strip them from the field text, so I assume I need
the RegexTransformer.

I added the transformer="RegexTransformer" attribute to my entity, along
with a new field:

<field sourceColName="content" column="content" regex="English"
replaceWith="XXXXX"/>

Everything works fine: the text is replaced without any problem. The problem
occurs with my regular expression for stripping HTML tags, for which I use
regex="<(.|\n)*?>". Of course, a literal '<' is not allowed in an XML
attribute value (and '>' is usually escaped as well), so I tried the following:
regex="&lt;(.|\n)*?&gt;" and regex="&#3C;(.|\n)*?&#3E;", but I get the
following error:

The value of attribute "regex" associated with an element type "field" must
not contain the '<' character. at
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
...

The full stack trace follows:

FATAL: Could not create importer. DataImporter config invalid
org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
    at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:206)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
    at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:93)
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
    ... 17 more
Caused by: org.xml.sax.SAXParseException: The value of attribute "regex" associated with an element type "field" must not contain the '<' character.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:166)
    ... 19 more

The server's error page repeats the same stack trace in its description:
"The server encountered an internal error (FATAL: Could not create importer.
DataImporter config invalid ...) that prevented it from fulfilling this
request."
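
To separate the XML escaping question from Solr itself, the three attribute
variants can be checked with a standalone XML parser. The following is a
minimal Python sketch under stated assumptions: the one-line field fragments
and the sample HTML are made up for illustration, the helper names are
hypothetical, and Python's re module only stands in for the Java regex engine
that Solr runs on. It shows whether each variant is well-formed XML and which
pattern the parser hands back, nothing more.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical one-field config fragment; '<' and '>' are written as the
# predefined entities &lt; and &gt; inside the attribute value.
ESCAPED = r'<field column="content" regex="&lt;(.|\n)*?&gt;" replaceWith=""/>'

# Variants that an XML parser is expected to reject: a literal '<', and a
# hexadecimal character reference missing its 'x' (the valid form is &#x3C;).
LITERAL = r'<field column="content" regex="<(.|\n)*?>" replaceWith=""/>'
BAD_REF = r'<field column="content" regex="&#3C;(.|\n)*?&#3E;" replaceWith=""/>'


def regex_attribute(xml_text):
    """Return the parsed 'regex' attribute, or None if the XML is not well-formed."""
    try:
        return ET.fromstring(xml_text).get("regex")
    except ET.ParseError:
        return None


def strip_tags(pattern, text):
    """Remove every match of pattern from text (Python's re, not Solr's transformer)."""
    return re.sub(pattern, "", text)


if __name__ == "__main__":
    pattern = regex_attribute(ESCAPED)
    print(pattern)
    print(strip_tags(pattern, "<p>Hello <b>world</b></p>"))
    print(regex_attribute(LITERAL), regex_attribute(BAD_REF))
```

If the escaped variant parses in a check like this while Solr still reports a
'<' in the attribute, it may be worth confirming that the config file on disk
really contains &lt; rather than a literal '<'.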

I appreciate your help.

Regards,
ahmd
