Thanks Shalin. Which XSLT processor is used isn't relevant; XSLT is a spec, so just use the standard Java APIs. If I want a particular processor, I can select it with a system property, and/or you could offer a configuration option that names the standard factory class implementation for the processor of my choice.
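For illustration, here is a minimal sketch of the standard-API approach described above, using only the JAXP classes in `javax.xml.transform`. The class name `StandardXslt`, the sample record, and the stylesheet are assumptions made up for this example; none of it is DataImportHandler code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper; the class and constant names are illustrative only.
final class StandardXslt {

    // Assumed sample row, modeled on the first-name/last-name example in this thread.
    static final String SAMPLE_XML =
        "<record><first-name>John</first-name><last-name>Smith</last-name></record>";

    // XSLT 1.0 stylesheet that concatenates the two fields with a space.
    static final String CONCAT_XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/record\">"
        + "<name><xsl:value-of select=\"concat(first-name, ' ', last-name)\"/></name>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies a stylesheet to an XML string without naming any processor.
    static String transform(String xml, String xslt) {
        try {
            // newInstance() consults the "javax.xml.transform.TransformerFactory"
            // system property, so the processor can be swapped without code changes.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, CONCAT_XSLT));
    }
}
```

To pick a specific processor, start the JVM with something like `-Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl` (assuming that processor's jar is on the classpath). A configuration option could instead pass the class name to `TransformerFactory.newInstance(String, ClassLoader)`, available since Java 6.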
~ David


Shalin Shekhar Mangar wrote:
>
> Hi David,
> Actually you can concatenate values, however you'll have to write a bit of
> code. You can write this in JavaScript (if you're using Java 6) or in
> Java.
>
> Basically, you need to write a Transformer to do it. Look at
> http://wiki.apache.org/solr/DataImportHandler#head-a6916b30b5d7605a990fb03c4ff461b3736496a9
>
> For example, let's say you get fields first-name and last-name in the XML.
> But in the schema.xml you have a field called "name" in which you need to
> concatenate the values of first-name and last-name (with a space in
> between). Create a Java class:
>
> import java.util.Map;
>
> public class ConcatenateTransformer {
>   // Called once for each row
>   public Object transformRow(Map<String, Object> row) {
>     // row.get returns Object, so no String assignment without a cast
>     Object firstName = row.get("first-name");
>     Object lastName = row.get("last-name");
>     row.put("name", firstName + " " + lastName);
>     return row;
>   }
> }
>
> Add this class to solr's classpath by putting its jar in solr/WEB-INF/lib
>
> The data-config.xml should look like this:
> <entity name="myEntity" processor="XPathEntityProcessor"
>         url="http://myurl/example.xml"
>         transformer="com.yourpackage.ConcatenateTransformer">
>   <field column="first-name" xpath="/record/first-name" />
>   <field column="last-name" xpath="/record/last-name" />
>   <field column="name" />
> </entity>
>
> This will call the ConcatenateTransformer.transformRow method for each row
> and you can concatenate any field with any field (or constant). Note that
> the solr document will keep only those fields which are in the schema.xml;
> the rest are thrown away.
>
> If you don't want to write this in Java, you can use JavaScript by using
> the built-in ScriptTransformer, for an example look at
> http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9
>
> However, I'm beginning to realize that XSLT is a common need, let me see
> how best we can accommodate it in DataImportHandler. Which XSLT processor
> would you prefer?
>
> On Sat, Apr 19, 2008 at 12:13 AM, David Smiley @MITRE.org
> <[EMAIL PROTECTED]> wrote:
>
>>
>> I'm in the same situation as you, Daniel. The DataImportHandler is pretty
>> awesome but I'd also prefer it had the power of XSLT. The XPath support in
>> it doesn't suffice for me. And I can't do very basic things like
>> concatenate one value with another, even a constant. It's too bad there
>> isn't a mode that XSLT can be put into so that it doesn't build the whole
>> file in memory to do the transform. I've been looking into this and have
>> turned up nothing. It would be neat if there were a StAX to multi-document
>> adapter, at which point XSLT could be applied to the smaller fixed-size
>> documents instead of the entire data stream. I haven't found anything like
>> this so it'd need to be built. For now my documents aren't too big to XSLT
>> in-memory.
>>
>> ~ David
>>
>>
>> Daniel Papasian wrote:
>> >
>> > Shalin Shekhar Mangar wrote:
>> >> Hi Daniel,
>> >>
>> >> Maybe if you can give us a sample of what your XML looks like, we can
>> >> suggest how to use SOLR-469 (Data Import Handler) to index it. Most of
>> >> the use-cases we have encountered so far are solvable using the
>> >> XPathEntityProcessor in DataImportHandler without using XSLT, for
>> >> details look at
>> >> http://wiki.apache.org/solr/DataImportHandler#head-e68aa93c9ca7b8d261cede2bf1d6110ab1725476
>> >
>> > I think even if it is possible to use SOLR-469 for my needs, I'd still
>> > prefer the XSLT approach, because it's going to be a bit of
>> > configuration either way, and I'd rather it be an XSLT stylesheet than
>> > solrconfig.xml.
>> > In addition, I haven't yet decided whether I want to
>> > apply any patches to the version that we will deploy. If I do go
>> > down the route of the XSLT transform patch and end up having to back
>> > it out, the work for me to do the transform at the XML source instead
>> > would be negligible, whereas it would be quite a bit of work to go
>> > from using the DataImportHandler to not using it at all.
>> >
>> > Because both the solr instance and the XML source are in house, I have
>> > the ability to apply the XSLT at the source instead of at solr.
>> > However, there are different teams of people that control the XML source
>> > and solr, so it would require a bit more office coordination to do it on
>> > the backend.
>> >
>> > The data is a FileMaker XML export (DTD fmresultset) and it looks
>> > roughly like this:
>> > <fmresultset>
>> > <resultset>
>> > <field name="ID"><data>125</data></field>
>> > <field name="organization"><data>Ford Foundation</data></field>
>> > ...
>> > <relatedset table="Employees">
>> > <record>
>> > <field name="ID"><data>Y5-A</data></field>
>> > <field name="Name"><data>John Smith</data></field>
>> > </record>
>> > <record>
>> > <field name="ID"><data>Y5-B</data></field>
>> > <field name="Name"><data>Jane Doe</data></field>
>> > </record>
>> > </relatedset>
>> > </fmresultset>
>> >
>> > I'm taking the product of the resultset and the relatedset, using both
>> > IDs concatenated as a unique identifier, like so:
>> >
>> > <doc>
>> > <field name="ID">125Y5-A</field>
>> > <field name="organization">Ford Foundation</field>
>> > <field name="Name">John Smith</field>
>> > </doc>
>> > <doc>
>> > <field name="ID">125Y5-B</field>
>> > <field name="organization">Ford Foundation</field>
>> > <field name="Name">Jane Doe</field>
>> > </doc>
>> >
>> > I can do the transform pretty simply with XSLT.
>> > I suppose it is
>> > possible to get the DataImportHandler to do this, but I'm not yet
>> > convinced that it's easier.
>> >
>> > Daniel
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/XSLT-transform-before-update--tp16738227p16764009.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
>

--
View this message in context: http://www.nabble.com/XSLT-transform-before-update--tp16738227p16796900.html
Sent from the Solr - User mailing list archive at Nabble.com.
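
For reference, the resultset-by-relatedset product that Daniel describes in the quoted message could be sketched roughly as below. This is not his stylesheet, which was never posted; it is a guess under stated assumptions: the sample input closes `<resultset>` after the `<relatedset>` (the quoted sample omits that close tag), the output is wrapped in an `<add>` root so that it is well-formed, and the class name `FmResultsetToSolr` is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; names and sample data are assumptions for illustration.
final class FmResultsetToSolr {

    // Sample input based on the quoted export, with </resultset> added (assumption).
    static final String SAMPLE_XML =
        "<fmresultset><resultset>"
        + "<field name='ID'><data>125</data></field>"
        + "<field name='organization'><data>Ford Foundation</data></field>"
        + "<relatedset table='Employees'>"
        + "<record><field name='ID'><data>Y5-A</data></field>"
        + "<field name='Name'><data>John Smith</data></field></record>"
        + "<record><field name='ID'><data>Y5-B</data></field>"
        + "<field name='Name'><data>Jane Doe</data></field></record>"
        + "</relatedset></resultset></fmresultset>";

    // XSLT 1.0: emit one <doc> per related record, with the two IDs concatenated.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"/fmresultset\"><add>"
        + "<xsl:for-each select=\"resultset/relatedset/record\">"
        + "<xsl:variable name=\"p\" select=\"ancestor::resultset\"/>"
        + "<doc>"
        + "<field name=\"ID\"><xsl:value-of select=\"concat("
        + "$p/field[@name='ID']/data, field[@name='ID']/data)\"/></field>"
        + "<field name=\"organization\"><xsl:value-of"
        + " select=\"$p/field[@name='organization']/data\"/></field>"
        + "<field name=\"Name\"><xsl:value-of"
        + " select=\"field[@name='Name']/data\"/></field>"
        + "</doc>"
        + "</xsl:for-each>"
        + "</add></xsl:template>"
        + "</xsl:stylesheet>";

    // Runs the stylesheet through whichever JAXP processor is configured.
    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, STYLESHEET));
    }
}
```

Note that this still loads the whole input into memory for the transform, which is the limitation David raises in the quoted message; splitting the stream into smaller documents first would need separate code.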