Hi, I am using the Solr 1.3 DataImportHandler. One of my table fields contains HTML tags, and I want to strip them from the field text. So obviously I need the RegexTransformer.
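To make the goal concrete, here is a minimal standalone Java sketch of the replacement I am after. It is not Solr code: the class name, method name, and sample string are made up for illustration, and it only assumes that the pattern I describe below is applied with an empty replacement.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Solr or the DataImportHandler.
public class StripTagsSketch {
    // Lazily match '<', then any characters (including newlines), up to the first '>'.
    private static final Pattern TAG = Pattern.compile("<(.|\\n)*?>");

    // Remove every match of the tag pattern from the input text.
    static String stripTags(String html) {
        return TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample value standing in for the "content" column.
        String sample = "<p>Some <b>English</b>\ntext</p>";
        System.out.println(stripTags(sample));
    }
}
```

In plain Java the pattern behaves as I expect, so my difficulty is only with writing it inside the XML configuration.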
I added the transformer="RegexTransformer" attribute to my entity and a new field:

<field sourceColName="content" column="content" regex="English" replaceWith="XXXXX"/>

Everything works fine; the text is replaced without any problem. The problem occurs with my regular expression for stripping HTML tags, regex="<(.|\n)*?>". Of course, a literal '<' is not allowed in an XML attribute value, so I tried escaping both '<' and '>'. I tried regex="&lt;(.|\n)*?&gt;" and regex="&#x3C;(.|\n)*?&#x3E;", but I get the following error:

The value of attribute "regex" associated with an element type "field" must not contain the '<' character.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source) ...

The full stack trace follows:

*FATAL: Could not create importer. DataImporter config invalid
org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:114)
    at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:206)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:857)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:565)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1509)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context Processing Document #
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:176)
    at org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:93)
    at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:106)
    ... 17 more
Caused by: org.xml.sax.SAXParseException: The value of attribute "regex" associated with an element type "field" must not contain the '<' character.
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
    at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:166)
    ... 19 more *

*description* *The server encountered an internal error (FATAL: Could not create importer. DataImporter config invalid org.apache.solr.common.SolrException: FATAL: Could not create importer.
DataImporter config invalid ... ) that prevented it from fulfilling this request.*

I appreciate your help.

Regards,
ahmd