RE: Problems importing HTML content contained within XML document

2009-08-20 Thread venn hardy
Thanks Paul, I upgraded to Solr 1.4 and used the flatten attribute as you suggested. It works well. From: noble.p...@corp.aol.com Date: Wed, 19 Aug 2009 15:05:48 +0530 Subject: Re: Problems importing HTML content contained within XML document To: solr-user@lucene.apache.org try

Re: Problems importing HTML content contained within XML document

2009-08-19 Thread Martijn v Groningen
Hi Venn, I think what is happening when the BODY element is processed by the XPath expression (/document/category/BODY) is that it does not retrieve the text content from the P elements inside the BODY element. The expression will only retrieve text content that is directly a child of the BODY
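
To illustrate the distinction described above (text that is a direct child of BODY versus text nested inside the P elements), here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It is not Solr code and does not reproduce DataImportHandler internals; the sample document and its contents are hypothetical, shaped only to match the /document/category/BODY path mentioned in the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element path comes from the thread.
SAMPLE = """<document>
  <category>
    <BODY>Intro text<P>First paragraph.</P><P>Second paragraph.</P></BODY>
  </category>
</document>"""

root = ET.fromstring(SAMPLE)
body = root.find("./category/BODY")

# Text that is directly a child of BODY (everything before the first nested tag).
direct_text = body.text or ""

# Text of BODY and all of its descendants, concatenated in document order.
# This approximates what a "flattened" read of the element would return.
flattened_text = "".join(body.itertext())

print(repr(direct_text))
print(repr(flattened_text))
```

The first value contains only the leading text of BODY, while the second also includes the text inside the nested P elements, which is the content reported as missing from the index.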

Re: Problems importing HTML content contained within XML document

2009-08-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
try this <field column="textContent" xpath="/document/category/BODY" faltten="true"/> this should slurp all the tags under body On Wed, Aug 19, 2009 at 1:44 PM, venn hardy <venn.ha...@hotmail.com> wrote: Hello, I have just started trying out Solr to index some XML documents that I receive. I am using

Re: Problems importing HTML content contained within XML document

2009-08-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
sorry <field column="textContent" xpath="/document/category/BODY" flatten="true"/> 2009/8/19 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>: try this <field column="textContent" xpath="/document/category/BODY" faltten="true"/> this should slurp all the tags under body On Wed, Aug 19, 2009 at 1:44 PM,