Try this:
<field column="textContent" xpath="/document/category/BODY" flatten="true"/>

This should slurp all the text from the tags under BODY into the one field.
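
For reference, here is a rough sketch of the entity from your config with that attribute added. Everything except flatten="true" is copied from your mail. One assumption to check: as far as I know, flatten is only supported by the XPathEntityProcessor from Solr 1.4 onwards, so it may have no effect on 1.3.

```xml
<dataConfig>
  <dataSource type="HttpDataSource" />
  <document>
    <entity name="archive" pk="id"
            url="http://localhost:9080/data/20090817070752.xml"
            processor="XPathEntityProcessor" forEach="/document/category"
            transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
      <field column="id" xpath="/document/category/reference" />
      <!-- flatten="true": collect the text under every tag inside BODY (e.g. each P) into this one field -->
      <field column="textContent" xpath="/document/category/BODY" flatten="true" />
      <field column="author" xpath="/document/category/author" />
    </entity>
  </document>
</dataConfig>
```

Without flatten, the xpath only picks up text sitting directly inside BODY, and in your sample all of the text is inside the nested P tags, which would explain why the field comes back empty.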

On Wed, Aug 19, 2009 at 1:44 PM, venn hardy<venn.ha...@hotmail.com> wrote:
>
> Hello,
>
> I have just started trying out SOLR to index some XML documents that I receive. I am
> using SOLR 1.3 and its HttpDataSource in conjunction with the XPathEntityProcessor.
>
>
>
> I am finding the data import really useful so far, but I am having a few problems when
> I try to import HTML contained within one of the XML tags, <BODY>. The data import just seems
> to silently ignore the textContent field, but it imports everything else.
>
>
>
> When I do a query through the SOLR admin interface, only the id and author fields are displayed.
>
> Any ideas what I am doing wrong?
>
>
>
> Thanks
>
>
>
> This is what my dataConfig looks like:
> <dataConfig>
>  <dataSource type="HttpDataSource" />
>  <document>
>   <entity name="archive" pk="id"
>           url="http://localhost:9080/data/20090817070752.xml"
>           processor="XPathEntityProcessor" forEach="/document/category"
>           transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
>    <field column="id" xpath="/document/category/reference" />
>    <field column="textContent" xpath="/document/category/BODY" />
>    <field column="author" xpath="/document/category/author" />
>   </entity>
>  </document>
> </dataConfig>
>
>
>
> This is how I have specified my schema
> <fields>
>   <field name="id" type="string" indexed="true" stored="true" required="true" />
>   <field name="author" type="string" indexed="true" stored="true"/>
>   <field name="textContent" type="text" indexed="true" stored="true" />
> </fields>
>
>  <uniqueKey>id</uniqueKey>
>  <defaultSearchField>id</defaultSearchField>
>
>
>
> And this is what my XML document looks like:
>
> <document>
>  <category>
>  <reference>123456</reference>
>  <author>Authori name</author>
>  <BODY>
>  <P>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lorem elit,
>  lacinia ac blandit ac, tristique et ante. Phasellus varius varius felis ut vestibulum</P>
>  <P>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lorem elit,
>  lacinia ac blandit ac, tristique et ante. Phasellus varius varius felis ut vestibulum</P>
>  <P>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lorem elit,
>  lacinia ac blandit ac, tristique et ante. Phasellus varius varius felis ut vestibulum</P>
>  </BODY>
>  </category>
> </document>
>



-- 
-----------------------------------------------------
Noble Paul | Principal Engineer| AOL | http://aol.com
