Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer

2008-06-11 Thread Nicolas Pastorino

Thanks a million for your time and help.
It indeed works smoothly now.

By the way, I also had to apply the patch attached to the following
message:
http://www.nabble.com/Re%3A-How-to-describe-2-entities-in-dataConfig-for-the-DataImporter--p17577610.html
to keep the TemplateTransformer from throwing NullPointerExceptions :)


Cheers !
--
Nicolas Pastorino

On Jun 10, 2008, at 18:05, Noble Paul നോബിള്‍ नोब्ळ् wrote:

It is a bug, nice catch.
There needs to be a null check in the method.
Can you try replacing the method with the following?

private Node getMatchingChild(XMLStreamReader parser) {
  // A node without children has nothing to match against
  if (childNodes == null) return null;
  String localName = parser.getLocalName();
  for (Node n : childNodes) {
    if (n.name.equals(localName)) {
      if (n.attribAndValues == null)
        return n;
      if (checkForAttributes(parser, n.attribAndValues))
        return n;
    }
  }
  return null;
}

I tried with that code and it is working. We shall add it in the
next patch.



--Noble
Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer

2008-06-11 Thread Noble Paul നോബിള്‍ नोब्ळ्
We are cutting a patch which incorporates all the recent bug fixes,
so that you do not have to apply patches over patches.

--Noble


DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer

2008-06-10 Thread Nicolas Pastorino

Hello fellow Solr users !


I am trying to index XML documents in Solr. I went for the
DataImportHandler approach, which seemed to suit this need perfectly.
Given the large amount of XML to be indexed ( ~60MB ), I thought it
would hardly be possible to feed Solr the concatenation of all these
documents at once. Hence I wrote a small PHP script that serves the
list of these documents over HTTP, in the following form ( available
from a local URL, which is also referenced in data-config.xml ) :



<?xml version="1.0" encoding="UTF-8"?>
<root>
    <entries>
        <entry>
            <realm>old_search_content</realm>
            <source>http://localhost/psc/trunk/ezfiles/extension/psc/doc/xml/10098.xml</source>
        </entry>
        <entry>
            <realm>old_search_content</realm>
            <source>http://localhost/psc/trunk/ezfiles/extension/psc/doc/xml/10099.xml</source>
        </entry>
        <entry>
            <realm>old_search_content</realm>
            <source>http://localhost/psc/trunk/ezfiles/extension/psc/doc/xml/all_in_one.xml</source>
        </entry>
    </entries>
</root>


The idea would be to have one single data-config.xml configuration  
file for the DataImportHandler, which would read the listing  
presented above, and request every single subitem and index it. Every  
subitem has the following structure :

<?xml version="1.0" encoding="ISO-8859-1" ?>
<root>
    <contenido id="10099" idioma="cat">
        <antetitulo><![CDATA[This is an introduction text]]></antetitulo>
        <titulo><![CDATA[This is a title]]></titulo>
        <resumen><![CDATA[ This a a summary ]]></resumen>
        <texto><![CDATA[This is the body of my article<br><br>]]>
        </texto>
        <autor><![CDATA[John Doe]]></autor>
        <fecha><![CDATA[31/10/2001]]></fecha>
        <fuente><![CDATA[]]></fuente>
        <webexterna><![CDATA[]]></webexterna>
        <recursos></recursos>
        <ambitos></ambitos>
    </contenido>
</root>



After struggling for a ( long ) while with different configuration
scenarios, here is the data-config.xml I ended up with :



<dataConfig>
    <dataSource type="HttpDataSource" />
    <document>
        <entity name="oldsearchcontentlist"
                pk="m_guid"
                url="http://localhost/psc/trunk/ezfiles/list_old_content.php?limit=10&amp;urlsonly=1"
                processor="XPathEntityProcessor"
                forEach="/root/entries/entry">

            <field column="elementurl" xpath="/root/entries/entry/source/" />

            <entity name="oldsearchcontent"
                    pk="m_guid"
                    url="${oldsearchcontentlist.elementurl}"
                    processor="XPathEntityProcessor"
                    forEach="/root/contenido"
                    transformer="TemplateTransformer">
                <field column="m_guid" xpath="/root/contenido/titulo" />
            </entity>
        </entity>
    </document>
</dataConfig>


As a note, I had to check out Solr's trunk, patch it with the
following : https://issues.apache.org/jira/browse/SOLR-469
( https://issues.apache.org/jira/secure/attachment/12380679/SOLR-469.patch ),
and recompile.

Running the following command :
http://localhost:8983/solr/dataimport?command=full-import&verbose=on&debug=on
tells me that no document was created at all, and does not throw any
error. Here is the full output :



<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">39</int>
</lst>
<lst name="initArgs">
<lst name="defaults">
<str name="config">data-config.xml</str>
<lst name="datasource">
<str name="type">HttpDataSource</str>
</lst>
</lst>
</lst>
<str name="command">full-import</str>
<str name="mode">debug</str>
<null name="documents"/>
<lst name="verbose-output">
<lst name="entity:oldsearchcontentlist">
<lst name="document#1">
<str name="query">
http://localhost/psc/trunk/ezfiles/list_old_content.php?limit=10&amp;urlsonly=1
</str>
<str name="time-taken">0:0:0.23</str>
</lst>
</lst>
</lst>
<str name="status">idle</str>
<str name="importResponse">Configuration Re-loaded sucessfully</str>
<lst name="statusMessages">
<str name="Total Requests made to DataSource">1</str>
<str name="Total Rows Fetched">0</str>
<str name="Total Documents Skipped">0</str>
<str name="Full Dump Started">2008-06-10 14:38:56</str>
<str name="">
Indexing completed. Added/Updated: 0 documents. Deleted 0

Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer

2008-06-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
The configuration is fine except for one detail.
The documents are to be created for the entity 'oldsearchcontent', not
for the root entity. So add the attribute rootEntity="false" to the
entity 'oldsearchcontentlist', as follows:

   <entity name="oldsearchcontentlist"
           url="http://localhost/psc/trunk/ezfiles/list_old_content.php?limit=10&amp;urlsonly=1"
           processor="XPathEntityProcessor"
           forEach="/root/entries/entry"
           rootEntity="false">

This means that the entity directly under it ('oldsearchcontent')
will be treated as the root, and documents will be created for that.
--Noble
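
To make the rootEntity="false" idea concrete, here is a minimal Java
sketch of the two-level flow: the outer listing only supplies URLs, and
one document is produced per row of the inner entity. It is not
DataImportHandler code. The class name NestedEntitySketch, the helper
names, and the in-memory map that stands in for the HTTP fetch are all
assumptions made for illustration, and it uses the JDK's standard XPath
engine rather than Solr's own reader.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration only; none of these names exist in Solr.
final class NestedEntitySketch {
  private static final XPath XPATH = XPathFactory.newInstance().newXPath();

  // Returns the trimmed text of every node matched by expr, in document order.
  static List<String> select(String xml, String expr) throws XPathExpressionException {
    NodeList nodes = (NodeList) XPATH.evaluate(
        expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
    List<String> out = new ArrayList<>();
    for (int i = 0; i < nodes.getLength(); i++) {
      out.add(nodes.item(i).getTextContent().trim());
    }
    return out;
  }

  // Outer rows (the listing) only provide URLs; documents come from the inner rows.
  // 'fetched' is an assumed stand-in for fetching each URL over HTTP.
  static List<String> buildDocuments(String listing, Map<String, String> fetched)
      throws XPathExpressionException {
    List<String> documents = new ArrayList<>();
    for (String url : select(listing, "/root/entries/entry/source")) {
      String body = fetched.get(url);
      if (body == null) continue; // nothing fetched for this URL, so no document
      documents.addAll(select(body, "/root/contenido/titulo"));
    }
    return documents;
  }

  public static void main(String[] args) throws XPathExpressionException {
    String listing = "<root><entries>"
        + "<entry><source>mem:10098</source></entry>"
        + "<entry><source>mem:10099</source></entry>"
        + "</entries></root>";
    Map<String, String> fetched = new LinkedHashMap<>();
    fetched.put("mem:10098", "<root><contenido><titulo>First title</titulo></contenido></root>");
    fetched.put("mem:10099", "<root><contenido><titulo>Second title</titulo></contenido></root>");
    System.out.println(buildDocuments(listing, fetched));
  }
}
```

The point of the sketch is only the shape of the loop: if documents were
built from the outer rows instead, each one would carry nothing but a
URL, which matches the "0 documents" symptom described in the question.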


Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer

2008-06-10 Thread Nicolas Pastorino

Thanks a lot, it works fine now, fetching subelements properly.
The only issue left is that the XPath syntax passed in
data-config.xml does not seem to work properly. As an example,
processing the following entity :


<root>
    <contenido id="10097" idioma="cat">
        <antetitulo></antetitulo>
        <titulo>
            This is my title
        </titulo>
        <resumen>
            This is my summary
        </resumen>
        <texto>
            This is the body of my text
        </texto>
    </contenido>
</root>

and trying to fill a Solr field with the 'id' attribute of the
'contenido' tag using the following config :

<field column="m_guid" xpath="/root/contenido/@id" />

does not seem to work properly.

Thanks a lot for your time already !

Regards,
Nicolas Pastorino
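
For reference, the expression itself is valid standard XPath. The
following minimal Java sketch evaluates it with the JDK's javax.xml.xpath
engine against a shortened copy of the sample above. It says nothing
about how DataImportHandler's own reader treats the expression; the class
name AttributeXPathCheck and the helper evaluate are made up for this
illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class; it uses only the JDK, not Solr.
final class AttributeXPathCheck {
  // Evaluates the expression and returns the string value of the first match.
  static String evaluate(String xml, String expression) throws XPathExpressionException {
    XPath xpath = XPathFactory.newInstance().newXPath();
    return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
  }

  public static void main(String[] args) throws XPathExpressionException {
    String xml = "<root><contenido id=\"10097\" idioma=\"cat\">"
        + "<titulo>This is my title</titulo></contenido></root>";
    System.out.println(evaluate(xml, "/root/contenido/@id"));    // prints 10097
    System.out.println(evaluate(xml, "/root/contenido/titulo")); // prints This is my title
  }
}
```

A check like this helps separate a mistake in the expression from a
limitation or bug in the importer.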




Re: DataImportHandler : How to mix XPathEntityProcessor and TemplateTransformer

2008-06-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is a bug, nice catch.
There needs to be a null check in the method.
Can you try replacing the method with the following?

private Node getMatchingChild(XMLStreamReader parser) {
  // A node without children has nothing to match against
  if (childNodes == null) return null;
  String localName = parser.getLocalName();
  for (Node n : childNodes) {
    if (n.name.equals(localName)) {
      if (n.attribAndValues == null)
        return n;
      if (checkForAttributes(parser, n.attribAndValues))
        return n;
    }
  }
  return null;
}

I tried with that code and it is working. We shall add it in the next patch
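
The guard can be tried outside Solr with a small, self-contained Java
sketch. NodeSketch below is a hypothetical stand-in for
XPathRecordReader.Node: the field types, the addChild helper, and the
body of checkForAttributes are assumptions, since the thread shows only
the patched method itself.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical stand-in for XPathRecordReader.Node; only what the method touches is modelled.
final class NodeSketch {
  final String name;
  final Map<String, String> attribAndValues; // null means "no attribute constraint" (assumed type)
  List<NodeSketch> childNodes;               // stays null for leaf nodes

  NodeSketch(String name, Map<String, String> attribAndValues) {
    this.name = name;
    this.attribAndValues = attribAndValues;
  }

  void addChild(NodeSketch child) {
    if (childNodes == null) childNodes = new ArrayList<>();
    childNodes.add(child);
  }

  // Same shape as the patched method: return early when there are no children.
  NodeSketch getMatchingChild(XMLStreamReader parser) {
    if (childNodes == null) return null;
    String localName = parser.getLocalName();
    for (NodeSketch n : childNodes) {
      if (n.name.equals(localName)) {
        if (n.attribAndValues == null) return n;
        if (checkForAttributes(parser, n.attribAndValues)) return n;
      }
    }
    return null;
  }

  // Assumed behaviour: every required attribute must be present with the expected value.
  private static boolean checkForAttributes(XMLStreamReader parser, Map<String, String> required) {
    for (Map.Entry<String, String> e : required.entrySet()) {
      if (!e.getValue().equals(parser.getAttributeValue(null, e.getKey()))) return false;
    }
    return true;
  }

  // Advances a fresh reader to the first start tag so that getLocalName() is legal.
  static XMLStreamReader atFirstElement(String xml) throws XMLStreamException {
    XMLStreamReader reader =
        XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
    while (reader.next() != XMLStreamConstants.START_ELEMENT) {
      // skip anything before the root element
    }
    return reader;
  }

  public static void main(String[] args) throws XMLStreamException {
    XMLStreamReader reader = atFirstElement("<titulo>This is a title</titulo>");

    NodeSketch leaf = new NodeSketch("contenido", null);
    System.out.println(leaf.getMatchingChild(reader)); // leaf node: null, no exception

    NodeSketch parent = new NodeSketch("contenido", null);
    parent.addChild(new NodeSketch("titulo", null));
    System.out.println(parent.getMatchingChild(reader).name);
  }
}
```

Without the first line of getMatchingChild, the for loop over a null
childNodes list is what would raise the NullPointerException.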


--Noble
On Tue, Jun 10, 2008 at 9:11 PM, Nicolas Pastorino [EMAIL PROTECTED] wrote:
 I just forgot to mention the error related to the description below. I get
 the following when running a full-import ( sorry for the noise .. ) :

 SEVERE: Full Import failed
 java.lang.RuntimeException: java.lang.NullPointerException
    at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:207)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:161)
    at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:144)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:280)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:302)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:173)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:134)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:323)
    at org.apache.solr.handler.dataimport.DataImporter.rumCmd(DataImporter.java:374)
    at org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:179)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:125)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:965)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
    at org.mortbay.jetty.Server.handle(Server.java:285)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
    at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
 Caused by: java.lang.NullPointerException
    at org.apache.solr.handler.dataimport.XPathRecordReader$Node.getMatchingChild(XPathRecordReader.java:198)
    at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:171)
    at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
    at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
    at org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
    at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
    ... 31 more

 Regards,
 Nicolas Pastorino
