Has anyone tried using Nutch/Lucene for parametric search? My idea is
to have each crawled document be an XML document containing a small set
of data fields, for instance:
crawleddoc1.xml:
<data>
<instance id="1">
<type>sometype</type>
<date>Jan 1, 2006</date>
<value>5</value>
</instance>
</data>
crawleddoc2.xml:
<data>
<instance id="2">
<type>sometype</type>
<date>Jan 2, 2006</date>
<value>70</value>
</instance>
</data>
I would then like to run searches such as "<date> between two values
and <value> = 5". The result would just be the location of each
matching XML doc (probably the URL of the doc).
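
To make the intended behaviour concrete, here is a rough sketch of the query semantics in plain Python. It does not use Nutch or Lucene at all; the example.com URLs, the DOCS dictionary and the function names are made up for illustration. In an index I assume this would become a range restriction on a date field combined with an exact match on a value field.

```python
import xml.etree.ElementTree as ET
from datetime import date, datetime

# Hypothetical crawl results: URL -> XML body (URLs are placeholders).
DOCS = {
    "http://example.com/crawleddoc1.xml": (
        '<data><instance id="1"><type>sometype</type>'
        "<date>Jan 1, 2006</date><value>5</value></instance></data>"
    ),
    "http://example.com/crawleddoc2.xml": (
        '<data><instance id="2"><type>sometype</type>'
        "<date>Jan 2, 2006</date><value>70</value></instance></data>"
    ),
}


def parse_fields(xml_text):
    """Extract the typed fields from the single <instance> in one document."""
    inst = ET.fromstring(xml_text).find("instance")
    return {
        "type": inst.findtext("type"),
        # Dates look like "Jan 1, 2006" in the sample documents.
        "date": datetime.strptime(inst.findtext("date"), "%b %d, %Y").date(),
        "value": int(inst.findtext("value")),
    }


def search(docs, date_from, date_to, value):
    """Return URLs of docs whose date is in [date_from, date_to] and whose value matches."""
    hits = []
    for url, xml_text in docs.items():
        fields = parse_fields(xml_text)
        if date_from <= fields["date"] <= date_to and fields["value"] == value:
            hits.append(url)
    return sorted(hits)


if __name__ == "__main__":
    print(search(DOCS, date(2006, 1, 1), date(2006, 1, 31), 5))
```

The point is only that the query returns document locations, not the field values themselves.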
Is this bending Nutch/Lucene too far towards relational data
processing, or is it something that could be achieved while still taking
advantage of everything Lucene and Nutch offer?
Any input is greatly appreciated!
Thanks,
Jerry