Has anyone tried using Nutch/Lucene for parametric search? My idea is to have each crawled document be an XML document containing some set of data, for instance:

crawleddoc1.xml:
<data>
  <instance id="1">
    <type>sometype</type>
    <date>Jan 1, 2006</date>
    <value>5</value>
  </instance>
</data>

crawleddoc2.xml:
<data>
  <instance id="2">
    <type>sometype</type>
    <date>Jan 2, 2006</date>
    <value>70</value>
  </instance>
</data>

I would then like to run searches such as "<date> between two values AND <value> = 5". The result would just be the location of each matching XML doc (maybe the URL to the doc).
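
To make the query semantics concrete, here is a rough sketch in plain Python of the filter I have in mind. It does not use Nutch or Lucene at all; it only shows the expected behaviour on the two sample docs. The names DOCS, parse_fields and search are made up for illustration, and the date format string assumes every <date> looks like "Jan 1, 2006".

```python
from datetime import date, datetime
import xml.etree.ElementTree as ET

# The two sample documents from above, keyed by their (hypothetical) location.
DOCS = {
    "crawleddoc1.xml": """<data>
  <instance id="1">
    <type>sometype</type>
    <date>Jan 1, 2006</date>
    <value>5</value>
  </instance>
</data>""",
    "crawleddoc2.xml": """<data>
  <instance id="2">
    <type>sometype</type>
    <date>Jan 2, 2006</date>
    <value>70</value>
  </instance>
</data>""",
}


def parse_fields(xml_text):
    """Return a list of (date, value) pairs, one per <instance> element."""
    root = ET.fromstring(xml_text)
    fields = []
    for inst in root.iter("instance"):
        # Assumes English month abbreviations, e.g. "Jan 1, 2006".
        d = datetime.strptime(inst.findtext("date").strip(), "%b %d, %Y").date()
        v = int(inst.findtext("value"))
        fields.append((d, v))
    return fields


def search(docs, start, end, value):
    """Return locations of docs with an instance where start <= date <= end and value matches."""
    hits = []
    for location, xml_text in docs.items():
        if any(start <= d <= end and v == value for d, v in parse_fields(xml_text)):
            hits.append(location)
    return sorted(hits)


if __name__ == "__main__":
    # <date> between Jan 1 and Jan 31, 2006 AND <value> = 5
    print(search(DOCS, date(2006, 1, 1), date(2006, 1, 31), 5))
```

This scans every document on each query, which is exactly what I would hope an index avoids; the open question is whether Nutch/Lucene can answer the same date-range plus exact-value combination from indexed fields.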

Is this bending Nutch/Lucene too far towards relational data processing, or is it something that could be achieved while still taking advantage of everything Lucene and Nutch offer?

Any input is greatly appreciated!
Thanks,
Jerry
