Re: XSLT transform before update?
Thanks Shalin.

The particular XSLT processor used is not relevant; XSLT is a spec. Just use the standard Java APIs. If I want a particular processor, I can select it with a system property, and/or you could offer a configuration option for the standard factory class so that I can name the processor implementation of my choice.

~ David

Shalin Shekhar Mangar wrote:

Hi David,

Actually you can concatenate values, however you'll have to write a bit of code. You can write this in JavaScript (if you're using Java 6) or in Java. Basically, you need to write a Transformer to do it. Look at http://wiki.apache.org/solr/DataImportHandler#head-a6916b30b5d7605a990fb03c4ff461b3736496a9

For example, let's say you get the fields first-name and last-name in the XML, but in the schema.xml you have a field called name in which you need to concatenate the values of first-name and last-name (with a space in between).

Create a Java class:

    import java.util.Map;

    public class ConcatenateTransformer {
        // Called once per row; adds a "name" value built from the two source fields.
        public Object transformRow(Map<String, Object> row) {
            String firstName = (String) row.get("first-name");
            String lastName = (String) row.get("last-name");
            row.put("name", firstName + " " + lastName);
            return row;
        }
    }

Add this class to Solr's classpath by putting its jar in solr/WEB-INF/lib

The data-config.xml should look like this:

    <entity name="myEntity" processor="XPathEntityProcessor"
            url="http://myurl/example.xml"
            transformer="com.yourpackage.ConcatenateTransformer">
        <field column="first-name" xpath="/record/first-name" />
        <field column="last-name" xpath="/record/last-name" />
        <field column="name" />
    </entity>

This will call the ConcatenateTransformer.transformRow method for each row, and you can concatenate any field with any field (or constant). Note that the Solr document will keep only those fields which are in the schema.xml; the rest are thrown away.
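The row-level concatenation described above can be tried outside Solr with a small standalone harness. This is only a sketch: the wrapper class ConcatenateTransformerCheck, the sample values, and the null guard are assumptions added for illustration and are not part of DataImportHandler.

```java
import java.util.HashMap;
import java.util.Map;

final class ConcatenateTransformerCheck {

    /** Same row logic as the transformer above, re-declared so this file stands alone. */
    static class ConcatenateTransformer {
        public Object transformRow(Map<String, Object> row) {
            Object firstName = row.get("first-name");
            Object lastName = row.get("last-name");
            // Assumption for this sketch: leave "name" unset when either part is
            // missing, instead of storing a value such as "null Smith".
            if (firstName != null && lastName != null) {
                row.put("name", firstName + " " + lastName);
            }
            return row;
        }
    }

    public static void main(String[] args) {
        // Hypothetical row, shaped like the fields in the example above.
        Map<String, Object> row = new HashMap<>();
        row.put("first-name", "John");
        row.put("last-name", "Smith");

        new ConcatenateTransformer().transformRow(row);
        System.out.println(row.get("name"));
    }
}
```

The transformer mutates and returns the same map, so the new "name" entry is visible on the row that was passed in.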
If you don't want to write this in Java, you can use JavaScript through the built-in ScriptTransformer; for an example, look at http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9

However, I'm beginning to realize that XSLT is a common need. Let me see how best we can accommodate it in DataImportHandler. Which XSLT processor would you prefer?

On Sat, Apr 19, 2008 at 12:13 AM, David Smiley @MITRE.org [EMAIL PROTECTED] wrote:

I'm in the same situation as you, Daniel. The DataImportHandler is pretty awesome, but I'd also prefer it had the power of XSLT. The XPath support in it doesn't suffice for me, and I can't do very basic things such as concatenating one value with another, or even with a constant.

It's too bad there isn't a mode that XSLT can be put into so that it does not build the whole file in memory to do the transform. I've been looking into this and have turned up nothing. It would be neat if there were a StAX to multi-document adapter, at which point XSLT could be applied to the smaller fixed-size documents instead of the entire data stream. I haven't found anything like this, so it would need to be built. For now my documents aren't too big to XSLT in memory.

~ David

Daniel Papasian wrote:

Shalin Shekhar Mangar wrote:

Hi Daniel,

Maybe if you can give us a sample of what your XML looks like, we can suggest how to use SOLR-469 (Data Import Handler) to index it. Most of the use cases we have encountered so far are solvable using the XPathEntityProcessor in DataImportHandler without using XSLT; for details, look at http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476

I think even if it is possible to use SOLR-469 for my needs, I'd still prefer the XSLT approach, because it's going to be a bit of configuration either way, and I'd rather it be an XSLT stylesheet than solrconfig.xml.
In addition, I haven't yet decided whether I want to apply any patches to the version that we will deploy. If I go down the route of the XSLT transform patch and end up having to back it out, moving the transform to the XML source would take a negligible amount of work, whereas going from using the DataImportHandler to not using it at all would be quite a bit of work.

Because both the Solr instance and the XML source are in house, I have the ability to apply the XSLT at the source instead of at Solr. However, different teams control the XML source and Solr, so it would require a bit more office coordination to do it on the backend.

The data is a FileMaker XML export (DTD fmresultset) and it looks roughly like this:

    <fmresultset>
      <resultset>
        <field name="ID"><data>125</data></field>
        <field name="organization"><data>Ford Foundation</data></field>
        ...
        <relatedset table="Employees">
          <record>
            <field name="ID"><data>Y5-A</data></field>
            <field name="Name"><data>John Smith</data></field>
          </record>
          <record>
            <field name="ID"><data>Y5-B</data></field>
            <field name="Name"><data>Jane Doe</data></field>
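The StAX to multi-document adapter that David describes earlier in this thread could look roughly like the sketch below. It is not part of DataImportHandler or of any patch mentioned here; the class name RecordSplittingXslt, the method transformEachRecord, and the demo XML and stylesheet are all hypothetical. It streams the input with the standard javax.xml.stream API, copies one record element at a time into a small buffer, and applies the stylesheet to that buffer only. TransformerFactory.newInstance() is the standard JAXP lookup, so a specific processor can be selected with the javax.xml.transform.TransformerFactory system property.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class RecordSplittingXslt {

    // Hypothetical demo input and stylesheet (not taken from the thread).
    static final String DEMO_XML =
        "<records>"
        + "<record><first-name>John</first-name><last-name>Smith</last-name></record>"
        + "<record><first-name>Jane</first-name><last-name>Doe</last-name></record>"
        + "</records>";

    static final String DEMO_XSLT =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/record\">"
        + "<xsl:value-of select=\"concat(first-name, ' ', last-name)\"/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to each element named recordElement, one at a time. */
    static List<String> transformEachRecord(Reader xml, String recordElement, String xslt)
            throws Exception {
        // Compile the stylesheet once; the factory implementation is chosen by JAXP.
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xslt)));
        XMLEventReader events = XMLInputFactory.newInstance().createXMLEventReader(xml);
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();

        // Results are collected in a list only to keep the demo short.
        List<String> results = new ArrayList<>();
        while (events.hasNext()) {
            XMLEvent event = events.nextEvent();
            if (event.isStartElement()
                    && event.asStartElement().getName().getLocalPart().equals(recordElement)) {
                String fragment = copyElement(event, events, outputFactory);
                StringWriter out = new StringWriter();
                templates.newTransformer().transform(
                    new StreamSource(new StringReader(fragment)), new StreamResult(out));
                results.add(out.toString());
            }
        }
        return results;
    }

    /** Copies the current element and its subtree into a standalone string. */
    private static String copyElement(XMLEvent start, XMLEventReader events,
                                      XMLOutputFactory outputFactory) throws XMLStreamException {
        StringWriter buffer = new StringWriter();
        XMLEventWriter writer = outputFactory.createXMLEventWriter(buffer);
        writer.add(start);
        int depth = 1;
        while (depth > 0) {
            XMLEvent event = events.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement()) {
                depth--;
            }
            writer.add(event);
        }
        writer.flush();
        writer.close();
        return buffer.toString();
    }

    public static void main(String[] args) throws Exception {
        for (String line : transformEachRecord(new StringReader(DEMO_XML), "record", DEMO_XSLT)) {
            System.out.println(line);
        }
    }
}
```

Only one record is buffered at a time, so memory use depends on the largest record rather than on the whole stream. The sketch does not re-declare namespaces inherited from ancestor elements, so namespaced input would need extra handling, and a stylesheet that needs context from outside the record cannot be used this way.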
Re: DataField parsing error using BinaryResponseParser for solrj
It is not a problem with the BinaryResponseWriter itself. It is caused by the bug https://issues.apache.org/jira/browse/SOLR-470. We need to fix it now.

--Noble

On Mon, Apr 21, 2008 at 9:16 AM, Eason. Lee [EMAIL PROTECTED] wrote:

Error comes from solr while parsing the datefield.
It is ok with XMLResponseParser.

Apr 22, 2008 11:02:13 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.text.ParseException: Unparseable date: 1995-02-16T00:00:00Z
    at org.apache.solr.schema.DateField.toObject(DateField.java:173)
    at org.apache.solr.schema.DateField.toObject(DateField.java:83)
    at org.apache.solr.request.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:137)
    at org.apache.solr.request.BinaryResponseWriter$Resolver.writeDocList(BinaryResponseWriter.java:115)
    at org.apache.solr.request.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:84)
    at org.apache.solr.common.util.NamedListCodec.writeVal(NamedListCodec.java:128)
    at org.apache.solr.common.util.NamedListCodec.writeNamedList(NamedListCodec.java:118)
    at org.apache.solr.common.util.NamedListCodec.marshal(NamedListCodec.java:77)
    at org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:44)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:295)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.text.ParseException: Unparseable date: 1995-02-16T00:00:00Z
    at java.text.DateFormat.parse(DateFormat.java:337)
    at org.apache.solr.schema.DateField.toObject(DateField.java:170)
    ... 21 more

--
--Noble Paul
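For readers unfamiliar with the exception in the trace above, the sketch below shows one general way java.text.DateFormat.parse ends up throwing "Unparseable date": the pattern and the value disagree. It makes no claim about the actual defect, which is tracked in SOLR-470 and is not reproduced here. The class DateParseCheck, the helper parseUtc, and both patterns are assumptions chosen for illustration.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

final class DateParseCheck {

    /** Parses value with the given pattern, interpreting it as UTC. */
    static Date parseUtc(String pattern, String value) throws ParseException {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false);
        return format.parse(value);
    }

    public static void main(String[] args) throws ParseException {
        String value = "1995-02-16T00:00:00Z";

        // A pattern that matches the value parses cleanly.
        Date parsed = parseUtc("yyyy-MM-dd'T'HH:mm:ss'Z'", value);
        System.out.println("parsed, epoch millis = " + parsed.getTime());

        // A pattern that also expects milliseconds rejects the same value.
        try {
            parseUtc("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", value);
        } catch (ParseException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

SimpleDateFormat is not thread-safe, which is why the helper creates a new instance per call instead of sharing one.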
Re: DataField parsing error using BinaryResponseParser for solrj
Thanks

2008/4/21, Noble Paul നോബിള് नोब्ळ् [EMAIL PROTECTED]:

It is not a problem with the BinaryResponseWriter itself. It is caused by the bug https://issues.apache.org/jira/browse/SOLR-470. We need to fix it now.

--Noble

On Mon, Apr 21, 2008 at 9:16 AM, Eason. Lee [EMAIL PROTECTED] wrote:

Error comes from solr while parsing the datefield.
It is ok with XMLResponseParser.

Apr 22, 2008 11:02:13 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: java.text.ParseException: Unparseable date: 1995-02-16T00:00:00Z
    at org.apache.solr.schema.DateField.toObject(DateField.java:173)
    at org.apache.solr.schema.DateField.toObject(DateField.java:83)
    at org.apache.solr.request.BinaryResponseWriter$Resolver.getDoc(BinaryResponseWriter.java:137)
    at org.apache.solr.request.BinaryResponseWriter$Resolver.writeDocList(BinaryResponseWriter.java:115)
    at org.apache.solr.request.BinaryResponseWriter$Resolver.resolve(BinaryResponseWriter.java:84)
    at org.apache.solr.common.util.NamedListCodec.writeVal(NamedListCodec.java:128)
    at org.apache.solr.common.util.NamedListCodec.writeNamedList(NamedListCodec.java:118)
    at org.apache.solr.common.util.NamedListCodec.marshal(NamedListCodec.java:77)
    at org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:44)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:295)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.text.ParseException: Unparseable date: 1995-02-16T00:00:00Z
    at java.text.DateFormat.parse(DateFormat.java:337)
    at org.apache.solr.schema.DateField.toObject(DateField.java:170)
    ... 21 more

--
--Noble Paul