Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer
Thanks a million for your time and help. It indeed works smoothly now.

I also, by the way, had to apply the patch attached to the following message:
http://www.nabble.com/Re%3A-How-to-describe-2-entities-in-dataConfig-for-the-DataImporter--p17577610.html
in order to stop the TemplateTransformer from throwing NullPointerExceptions :)

Cheers!
--
Nicolas Pastorino

On Jun 10, 2008, at 18:05, Noble Paul നോബിള് नोब्ळ् wrote:
> It is a bug, nice catch. There needs to be a null check there in the method. [...]
Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer
We are cutting a patch which incorporates all the recent bug fixes, so that you guys do not have to apply patches on top of patches.

--Noble

On Wed, Jun 11, 2008 at 3:49 PM, Nicolas Pastorino [EMAIL PROTECTED] wrote:
> Thanks a million for your time and help. It indeed works smoothly now. [...]
DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer
Hello fellow Solr users!

I am in the process of trying to index XML documents in Solr. I went for the DataImportHandler approach, which seemed to perfectly suit this need. Due to the large amount of XML documents to be indexed (~60MB), I thought it would hardly be possible to feed Solr with the concatenation of all these docs at once. Hence this small PHP script I wrote, serving over HTTP the list of these documents, in the following form (available from a local URL replicated in data-config.xml):

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
      <entries>
        <entry>
          <realm>old_search_content</realm>
          <source>http://localhost/psc/trunk/ezfiles/extension/psc/doc/xml/10098.xml</source>
        </entry>
        <entry>
          <realm>old_search_content</realm>
          <source>http://localhost/psc/trunk/ezfiles/extension/psc/doc/xml/10099.xml</source>
        </entry>
        <entry>
          <realm>old_search_content</realm>
          <source>http://localhost/psc/trunk/ezfiles/extension/psc/doc/xml/all_in_one.xml</source>
        </entry>
      </entries>
    </root>

The idea would be to have one single data-config.xml configuration file for the DataImportHandler, which would read the listing presented above, then request every single subitem and index it. Every subitem has the following structure:

    <?xml version="1.0" encoding="ISO-8859-1" ?>
    <root>
      <contenido id="10099" idioma="cat">
        <antetitulo><![CDATA[This is an introduction text]]></antetitulo>
        <titulo><![CDATA[This is a title]]></titulo>
        <resumen><![CDATA[ This is a summary ]]></resumen>
        <texto><![CDATA[This is the body of my article<br><br>]]></texto>
        <autor><![CDATA[John Doe]]></autor>
        <fecha><![CDATA[31/10/2001]]></fecha>
        <fuente><![CDATA[]]></fuente>
        <webexterna><![CDATA[]]></webexterna>
        <recursos></recursos>
        <ambitos></ambitos>
      </contenido>
    </root>

After struggling for a (long) while with different configuration scenarios, here is the data-config.xml I ended up with:

    <dataConfig>
      <dataSource type="HttpDataSource"/>
      <document>
        <entity name="oldsearchcontentlist"
                pk="m_guid"
                url="http://localhost/psc/trunk/ezfiles/list_old_content.php?limit=10&amp;urlsonly=1"
                processor="XPathEntityProcessor"
                forEach="/root/entries/entry">
          <field column="elementurl" xpath="/root/entries/entry/source/" />
          <entity name="oldsearchcontent"
                  pk="m_guid"
                  url="${oldsearchcontentlist.elementurl}"
                  processor="XPathEntityProcessor"
                  forEach="/root/contenido"
                  transformer="TemplateTransformer">
            <field column="m_guid" xpath="/root/contenido/titulo" />
          </entity>
        </entity>
      </document>
    </dataConfig>

As a note, I had to check out Solr's trunk, patch it with the following:
https://issues.apache.org/jira/browse/SOLR-469
( https://issues.apache.org/jira/secure/attachment/12380679/SOLR-469.patch ),
and recompile.

Running the following command:

    http://localhost:8983/solr/dataimport?command=full-import&verbose=on&debug=on

tells me that no Document was created at all, and does not throw any error. Here is the full output:

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">39</int>
      </lst>
      <lst name="initArgs">
        <lst name="defaults">
          <str name="config">data-config.xml</str>
          <lst name="datasource">
            <str name="type">HttpDataSource</str>
          </lst>
        </lst>
      </lst>
      <str name="command">full-import</str>
      <str name="mode">debug</str>
      <null name="documents"/>
      <lst name="verbose-output">
        <lst name="entity:oldsearchcontentlist">
          <lst name="document#1">
            <str name="query">http://localhost/psc/trunk/ezfiles/list_old_content.php?limit=10&urlsonly=1</str>
            <str name="time-taken">0:0:0.23</str>
          </lst>
        </lst>
      </lst>
      <str name="status">idle</str>
      <str name="importResponse">Configuration Re-loaded sucessfully</str>
      <lst name="statusMessages">
        <str name="Total Requests made to DataSource">1</str>
        <str name="Total Rows Fetched">0</str>
        <str name="Total Documents Skipped">0</str>
        <str name="Full Dump Started">2008-06-10 14:38:56</str>
        <str name="">Indexing completed. Added/Updated: 0 documents. Deleted 0
Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer
The configuration is fine but for one detail: the documents are to be created for the entity 'oldsearchcontent', not for the root entity. So add an attribute rootEntity="false" to the entity 'oldsearchcontentlist', as follows:

    <entity name="oldsearchcontentlist"
            url="http://localhost/psc/trunk/ezfiles/list_old_content.php?limit=10&amp;urlsonly=1"
            processor="XPathEntityProcessor"
            forEach="/root/entries/entry"
            rootEntity="false">

This means that the entity directly under this one ('oldsearchcontent') will be treated as the root, and documents will be created for it.

--Noble

On Tue, Jun 10, 2008 at 6:15 PM, Nicolas Pastorino [EMAIL PROTECTED] wrote:
> Hello fellow Solr users! I am in the process of trying to index XML documents in Solr. [...]
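To make the effect of rootEntity="false" described in this reply more concrete, here is a plain-Java analogy of the control flow: the outer listing only supplies URLs, and one document is produced per row of the inner entity. This is a sketch under stated assumptions, not DataImportHandler code. The class name `NestedEntitySketch`, the method `importTitles`, the in-memory `fakeHttp` map standing in for HttpDataSource, and the `example.invalid` URLs are all hypothetical, and it uses the JDK's `javax.xml.xpath` rather than Solr's own XPath reader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NestedEntitySketch {

    // Returns the trimmed text of every node matched by a standard XPath expression.
    static List<String> select(String expression, String xml) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent().trim());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    // Outer loop: the listing contributes only URLs (the role of 'oldsearchcontentlist').
    // Inner loop: each fetched file contributes the actual documents ('oldsearchcontent').
    static List<String> importTitles(String listingXml, Map<String, String> fakeHttp) {
        List<String> documents = new ArrayList<>();
        for (String url : select("/root/entries/entry/source", listingXml)) {
            String itemXml = fakeHttp.get(url); // hypothetical stand-in for an HTTP fetch
            if (itemXml == null) {
                continue; // nothing served at this URL in the sketch
            }
            documents.addAll(select("/root/contenido/titulo", itemXml));
        }
        return documents;
    }

    public static void main(String[] args) {
        Map<String, String> fakeHttp = new LinkedHashMap<>();
        fakeHttp.put("http://example.invalid/1.xml",
            "<root><contenido id=\"1\"><titulo>First title</titulo></contenido></root>");
        fakeHttp.put("http://example.invalid/2.xml",
            "<root><contenido id=\"2\"><titulo>Second title</titulo></contenido></root>");
        String listing = "<root><entries>"
            + "<entry><source>http://example.invalid/1.xml</source></entry>"
            + "<entry><source>http://example.invalid/2.xml</source></entry>"
            + "</entries></root>";
        System.out.println(importTitles(listing, fakeHttp)); // prints [First title, Second title]
    }
}
```

The point of the analogy is only that the listing rows never become documents themselves; they parameterise the inner fetch, which is what the nested entity's url=${oldsearchcontentlist.elementurl} expresses in the configuration.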
Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer
Thanks a lot, it works fine now, fetching subelements properly.

The only issue left is that the XPath syntax passed in the data-config.xml does not seem to work properly. As an example, processing the following entity:

    <root>
      <contenido id="10097" idioma="cat">
        <antetitulo></antetitulo>
        <titulo>This is my title</titulo>
        <resumen>This is my summary</resumen>
        <texto>This is the body of my text</texto>
      </contenido>
    </root>

and trying to fill a Solr field with the 'id' attribute of the 'contenido' tag, with the following config:

    <field column="m_guid" xpath="/root/contenido/@id" />

does not seem to work properly.

Thanks a lot for your time already!

Regards,
Nicolas Pastorino

On Jun 10, 2008, at 14:55, Noble Paul നോബിള് नोब्ळ् wrote:
> The configuration is fine but for one detail: the documents are to be created for the entity 'oldsearchcontent', not for the root entity. [...]
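As a point of reference for the attribute lookup discussed in this message, the sketch below shows what the expression /root/contenido/@id selects under standard XPath 1.0, using the JDK's `javax.xml.xpath`. It is a comparison aid under stated assumptions, not a reproduction of what XPathEntityProcessor does internally; the class name `AttributeXPathSketch` and the helper `evaluate` are hypothetical, and the sample document is the one from the message with its markup restored.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class AttributeXPathSketch {

    // Sample document from the message, with its markup restored.
    static final String SAMPLE =
        "<root><contenido id=\"10097\" idioma=\"cat\">"
        + "<antetitulo></antetitulo>"
        + "<titulo>This is my title</titulo>"
        + "</contenido></root>";

    // Evaluates a standard XPath 1.0 expression and returns its string value.
    static String evaluate(String expression, String xml) {
        try {
            return XPathFactory.newInstance().newXPath()
                .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed: " + expression, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate("/root/contenido/@id", SAMPLE));    // prints 10097
        System.out.println(evaluate("/root/contenido/titulo", SAMPLE)); // prints This is my title
    }
}
```

If the value that ends up in the Solr field differs from what this prints, the difference lies in the importer's handling of the expression rather than in the expression itself.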
Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer
It is a bug, nice catch. There needs to be a null check there in the method. Can you just try replacing the method with the following?

    private Node getMatchingChild(XMLStreamReader parser) {
      if (childNodes == null) return null;
      String localName = parser.getLocalName();
      for (Node n : childNodes) {
        if (n.name.equals(localName)) {
          if (n.attribAndValues == null) return n;
          if (checkForAttributes(parser, n.attribAndValues)) return n;
        }
      }
      return null;
    }

I tried with that code and it is working. We shall add it in the next patch.

--Noble

On Tue, Jun 10, 2008 at 9:11 PM, Nicolas Pastorino [EMAIL PROTECTED] wrote:
> I just forgot to mention the error related to the description below. I get the following when running a full-import (sorry for the noise...):
>
> SEVERE: Full Import failed
> java.lang.RuntimeException: java.lang.NullPointerException
>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:207)
>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:161)
>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:144)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:280)
>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:302)
>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:173)
>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:134)
>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:323)
>     at org.apache.solr.handler.dataimport.DataImporter.rumCmd(DataImporter.java:374)
>     at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:179)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
>     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>     at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>     at org.mortbay.jetty.Server.handle(Server.java:285)
>     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
>     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
>     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>     at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>     at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
> Caused by: java.lang.NullPointerException
>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.getMatchingChild(XPathRecordReader.java:198)
>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:171)
>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
>     at org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
>     ... 31 more
>
> Regards,
> Nicolas Pastorino
>
> On Jun 10, 2008, at 17:38, Nicolas Pastorino wrote:
>> Thanks a lot, it works fine now, fetching subelements properly. The only issue left is that the XPath syntax passed in the data-config.xml does not seem to work properly. [...]
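The null check in the getMatchingChild replacement quoted in this message can be tried outside Solr with a simplified stand-in. The sketch below is an assumption-laden illustration, not the Solr class: `MatchingChildSketch` and its nested `Node` are hypothetical, the method takes an element name and an attribute map instead of an XMLStreamReader, and the attribute comparison is a guess at what checkForAttributes might do. What it does show faithfully is the failure mode: a leaf node whose child list was never created.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class MatchingChildSketch {

    // Simplified stand-in for XPathRecordReader.Node (hypothetical, not the Solr class).
    static class Node {
        final String name;
        final Map<String, String> attribAndValues; // null means "no attribute constraint"
        List<Node> childNodes;                      // stays null until a child is added

        Node(String name, Map<String, String> attribAndValues) {
            this.name = name;
            this.attribAndValues = attribAndValues;
        }

        void addChild(Node child) {
            if (childNodes == null) {
                childNodes = new ArrayList<>();
            }
            childNodes.add(child);
        }

        Node getMatchingChild(String localName, Map<String, String> elementAttributes) {
            // The guard from the thread: without it, iterating over a null list
            // on a leaf node throws NullPointerException.
            if (childNodes == null) {
                return null;
            }
            for (Node n : childNodes) {
                if (n.name.equals(localName)) {
                    if (n.attribAndValues == null) {
                        return n;
                    }
                    // Assumed stand-in for checkForAttributes: every required pair must be present.
                    if (elementAttributes.entrySet().containsAll(n.attribAndValues.entrySet())) {
                        return n;
                    }
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Node root = new Node("root", null);
        Node contenido = new Node("contenido", null);
        root.addChild(contenido);

        Map<String, String> noAttributes = Collections.emptyMap();
        System.out.println(root.getMatchingChild("contenido", noAttributes) == contenido); // prints true
        System.out.println(contenido.getMatchingChild("titulo", noAttributes));             // prints null
    }
}
```

An alternative design would be to initialise the child list eagerly to an empty list so that no guard is needed; the thread's fix keeps the lazy field and adds the check instead.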