Sorry for the long silence and thanks for the help.
I've found the plugins you mentioned and set up nutch to use them. The
result is somewhat confusing, though. For one thing, my date: and
type: queries still returned no results. Weirder still, when I used luke
to inspect the index contents, I saw the new fields. luke would display
the top-ranking terms for both the "date" and "type" fields, and a search
like "date:20051030" would yield dozens of results, but the "string value"
of the "date" and "type" fields was not available, even though I
found the documents in question using that exact field as a key.

I'll see what I come up with using 0.8 as I need the .xls and .zip
support, anyway.

t.n.a.

On 7/20/06, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote:
You'd have to enable index-more and query-more plugins, I believe.
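
For reference, plugins are switched on through the plugin.includes
property, which can be overridden in conf/nutch-site.xml. The fragment
below is only a sketch: the two "more" entries are the point, and the rest
of the pattern is an assumed example, so start from whatever value your own
nutch-default.xml has. Since index-more works at indexing time, the index
would presumably have to be rebuilt before the new fields show up.

```xml
<!-- conf/nutch-site.xml: override of plugin.includes (illustrative value) -->
<property>
  <name>plugin.includes</name>
  <!-- Assumption: copy the existing value from nutch-default.xml and add
       index-more and query-more; the other entries are only examples. -->
  <value>protocol-http|urlfilter-regex|parse-(text|html)|index-(basic|more)|query-(basic|site|url|more)</value>
</property>
```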

> -----Original Message-----
> From: Tomi NA [mailto:[EMAIL PROTECTED]
> Sent: 2006-7-19 10:01
> To: nutch-user@lucene.apache.org
> Subject: missing, but declared functionality
>
> These kinds of queries return no results:
>
> date:19980101-20061231
> type:pdf
> type:application/pdf
>
> From the release changes documents (0.7-0.7.2), I assumed
> these would work.
> Upon index inspection (using the luke tool), I see there are no fields
> marked "date" or "type" (although I gather a query like type:pdf is
> interpreted as url:*.pdf). The fields I have are:
> anchor
> boost
> content
> digest
> docNo
> host
> segment
> site
> title
> url
>
> I ran the index process with very little special configuration: some
> filetype filtering and the like.
> Am I missing something?
> The files are served over a samba share. I plan to serve them through
> a web server instead, because of the security implications of using the
> file:// protocol. Can the creation and last modification dates be
> retrieved over http:// at all?
>
> TIA,
> t.n.a.
>
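On the last question in the quoted message: HTTP itself has no notion of a
creation date, but servers commonly send a Last-Modified response header,
so the modification date is usually available. A minimal sketch of reading
it with a HEAD request follows. The class name, method names and the sample
header value are made up for illustration, and it assumes the server
actually sends the header.

```java
import java.net.HttpURLConnection;
import java.net.URI;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

public class LastModifiedProbe {

    // Parse an HTTP-date such as "Sun, 30 Oct 2005 12:00:00 GMT" into an Instant.
    static Instant parseLastModified(String headerValue) {
        return ZonedDateTime
                .parse(headerValue, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant();
    }

    // Send a HEAD request and return the Last-Modified time,
    // or null if the server does not send that header.
    static Instant fetchLastModified(String address) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        try {
            String value = conn.getHeaderField("Last-Modified");
            return value == null ? null : parseLastModified(value);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0) {
            // URL supplied by the caller, e.g. a document on your own web server.
            System.out.println(fetchLastModified(args[0]));
        } else {
            // No URL given: just show the parsing step on a sample header value.
            System.out.println(parseLastModified("Sun, 30 Oct 2005 12:00:00 GMT"));
        }
    }
}
```

Run with a URL as the only argument to query a real server. Without
arguments it only parses the sample value, so no network access is needed.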
