On Wed, Feb 4, 2009 at 1:35 AM, Fergus McMenemie wrote:
> <entity name="x"
> dataSource="myfilereader"
> processor="XPathEntityProcessor"
> url="${jc.fileAbsolutePath}"
> stream="false"
> forEach="/record">
> <field column="para" xpath="/record/sect1/para" />
> <field column="para" xpath="/record/list/listitem/para" />
> <field column="para" xpath="/a/b/c/para" />
> <field column="para" xpath="/d/e/f/g/para" />
>
> Below is the line from my schema.xml
>
> <field name="para" type="text" indexed="true" stored="true"
> multiValued="true"/>
>
> Now a given document will only have one style of layout, and of course
> the /a/b/c /d/e/f/g stuff is made up. For a document that has a single
> <para>Hello world</para> element I see search results as follows, the
> one <para> string seems to have been entered into the index four times.
> I only saw duplicate results before adding the extra made-up stuff.
>
>
I think there is something fishy with the XPathEntityProcessor. For now, I
think you can work around it by giving each field a different 'column' value
and adding the attribute name="para" to each of them.
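
A rough, untested sketch of what I mean, reusing the entity and xpaths from
your mail. The column names para1..para4 are placeholders I made up; any
distinct names should do, as long as name="para" points each one at the
schema field:

```xml
<entity name="x"
        dataSource="myfilereader"
        processor="XPathEntityProcessor"
        url="${jc.fileAbsolutePath}"
        stream="false"
        forEach="/record">
  <!-- para1..para4 are hypothetical column names, unique per xpath.
       name="para" maps every column onto the same schema field. -->
  <field column="para1" name="para" xpath="/record/sect1/para" />
  <field column="para2" name="para" xpath="/record/list/listitem/para" />
  <field column="para3" name="para" xpath="/a/b/c/para" />
  <field column="para4" name="para" xpath="/d/e/f/g/para" />
</entity>
```

Since 'para' is multiValued in your schema.xml, values from whichever xpath
matches should still end up in the one field, hopefully without the repeats.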
--
Regards,
Shalin Shekhar Mangar.