The column in my database is of the XML datatype. If I do not use
XMLSERIALIZE(SMRY as CLOB(1M)) as SMRY and instead select the SMRY field
directly:

select ID,SMRY from BOOK_REC

I get the error below:

Exception while processing: x document : SolrInputDocument(fields: [id=45768734]):org.apache.solr.handler.dataimport.DataImportHandlerException: Parsing failed for xml, url:null rows processed:0 Processing Document # 1
Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character 'c' (code 99) in prolog; expected '<'
 at javax.xml.stream.SerializableLocation@5780578

Thanks,
Prasi

On Mon, Mar 24, 2014 at 3:51 PM, Prasi S <prasi1...@gmail.com> wrote:

> Below is my full configuration,
>
> <dataConfig>
>   <dataSource driver="com.ibm.db2.jcc.DB2Driver"
>       url="jdbc:db2://IP:port/dbname" user="" password="" />
>
>   <dataSource name="xmldata" type="FieldReaderDataSource"/>
>
>   <document>
>
>     <entity name="x" query="SELECT ID, XMLSERIALIZE(SMRY as CLOB(1M)) as SMRY
>         FROM BOOK_REC fetch first 40 rows only"
>         transformer="ClobTransformer" >
>       <field column="MBR" name="mbr" />
>       <entity name="y" dataSource="xmldata" dataField="x.SMRY"
>           processor="XPathEntityProcessor"
>           forEach="/*:summary" rootEntity="true" >
>         <field column="card_no" xpath="/cardNo" />
>
>       </entity>
>     </entity>
>   </document>
> </dataConfig>
>
> And this is my xml data
>
> <ns:summary xmlns:ns="***">
>   <cardNo>ZAYQ5181</tripId>
>   <firstName>Sam</firstName>
>   <lastName>Mathews</lastName>
>   <date>2013-01-18T23:29:04.492</date>
> </ns:summary>
>
> Thanks,
> Prasi
>
>
> On Mon, Mar 24, 2014 at 3:23 PM, Shalin Shekhar Mangar <
> shalinman...@gmail.com> wrote:
>
>> 1. I don't see the definition of a datasource named 'xmldata' in your
>> data-config.
>> 2. You have forEach="/*:summary" but I don't think that is a syntax
>> supported by XPathRecordReader.
>>
>> If you can give a sample of the xml stored as Clob in your database,
>> then we can help you write the right xpaths.
>>
>> On Mon, Mar 24, 2014 at 12:55 PM, Prasi S <prasi1...@gmail.com> wrote:
>> > My database configuration is as below
>> >
>> > <entity name="x" query="SELECT ID, XMLSERIALIZE(SMRY as CLOB(1M)) as SMRY
>> >     FROM BOOK_REC fetch first 40 rows only"
>> >     transformer="ClobTransformer" >
>> >   <field column="MBR" name="mbr" />
>> >   <entity name="y" dataSource="xmldata" dataField="x.SMRY"
>> >       processor="XPathEntityProcessor"
>> >       forEach="/*:summary" rootEntity="true" >
>> >     <field column="card_no" xpath="/cardNo" />
>> >
>> >   </entity>
>> > </entity>
>> >
>> > and I get my response from Solr as below
>> >
>> > <doc>
>> > <str name="card_no">org.......@1c8e807</str>
>> >
>> > Am I missing anything?
>> >
>> >
>> >
>> > Thanks,
>> > Prasi
>> >
>> >
>> > On Thu, Mar 20, 2014 at 4:25 PM, Gora Mohanty <g...@mimirtech.com> wrote:
>> >
>> >> On 20 March 2014 14:53, Prasi S <prasi1...@gmail.com> wrote:
>> >> >
>> >> > Hi,
>> >> > I have a requirement to index a database table with clob content. Each row
>> >> > in my table has a column which is XML stored as a clob. I want to read the
>> >> > contents of the XML through DIH and map each of the XML tags to a separate
>> >> > Solr field.
>> >> >
>> >> > Below is my clob content.
>> >> > <root>
>> >> >   <author>A</author>
>> >> >   <date>02-Dec-2013</date>
>> >> >   .
>> >> >   .
>> >> >   .
>> >> > </root>
>> >> >
>> >> > I want to read the contents of the clob and map author to author_solr and
>> >> > date to date_solr. Is this possible with a clob transformer or a script
>> >> > transformer?
>> >>
>> >> You will need to use a FieldReaderDataSource, and a XPathEntityProcessor
>> >> along with the ClobTransformer. You do not provide details of your DIH data
>> >> configuration file, but this should look something like:
>> >>
>> >> <dataSource name="xmldata" type="FieldReaderDataSource"/>
>> >> ...
>> >> <document>
>> >>   <entity name="x" query="..." transformer="ClobTransformer">
>> >>     <entity name="y" dataSource="xmldata" dataField="x.clob_column"
>> >>         processor="XPathEntityProcessor" forEach="/root">
>> >>       <field column="author_solr" xpath="/author" />
>> >>       <field column="date_solr" xpath="/date" />
>> >>     </entity>
>> >>   </entity>
>> >> </document>
>> >>
>> >> Regards,
>> >> Gora
>> >>
>> >>
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>
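
For reference, here is one way the configuration quoted above might be revised. It is an untested sketch and not a confirmed fix. It rests on three assumptions that do not come from the thread: that ClobTransformer only converts columns marked with clob="true", that XPathEntityProcessor expects the namespace prefix to be dropped from element names (so ns:summary is addressed as /summary), and that each field xpath has to be written as a full path from the document root. Everything else is copied unchanged from the posted configuration, including the placeholder connection details.

```xml
<dataConfig>
  <dataSource driver="com.ibm.db2.jcc.DB2Driver"
      url="jdbc:db2://IP:port/dbname" user="" password="" />
  <dataSource name="xmldata" type="FieldReaderDataSource" />
  <document>
    <entity name="x"
        query="SELECT ID, XMLSERIALIZE(SMRY as CLOB(1M)) as SMRY FROM BOOK_REC fetch first 40 rows only"
        transformer="ClobTransformer">
      <!-- Assumption: the column must be flagged so ClobTransformer turns it into a String -->
      <field column="SMRY" clob="true" />
      <!-- Assumption: namespace prefix dropped, so ns:summary is matched as /summary -->
      <entity name="y" dataSource="xmldata" dataField="x.SMRY"
          processor="XPathEntityProcessor"
          forEach="/summary" rootEntity="true">
        <!-- Assumption: xpath is the full path from the root, not relative to forEach -->
        <field column="card_no" xpath="/summary/cardNo" />
      </entity>
    </entity>
  </document>
</dataConfig>
```

If the namespace assumption turns out to be wrong for the Solr version in use, the forEach and xpath values are the first things to revisit.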