I've run into this problem, too, and ended up breaking the big
directories into several smaller ones, with 1000 files each, and then
creating a separate index for each one.

A pain, I know, but it's been the only thing that worked for us.

On the bright side, if you organize the directories by subject, you can
allow folks to search only one index at a time...
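
A rough sketch of that setup, using the CFINDEX attributes from the original
message below. The part01 through part06 subdirectory names, the pnp_
collection names, and the form.keyword variable are made up for illustration,
and the sketch assumes each collection has already been created (for example
in the CF Administrator). Passing a comma-separated list of collections to
CFSEARCH is how I understand the tag to work; check it against the docs for
your CF version.

```cfm
<!--- Hypothetical layout: the big folder split into part01..part06,
      each holding no more than 1000 files --->
<cfset baseDir = "e:\inetpub\wwwroot\websitename\documents\pnp\working\">
<cfset parts = "part01,part02,part03,part04,part05,part06">

<!--- Index each subdirectory into its own (pre-created) collection --->
<cfloop list="#parts#" index="part">
    <cfindex collection="pnp_#part#"
             action="update"
             type="path"
             key="#baseDir##part#\"
             extensions=".doc,.xls,.ppt,.pdf">
</cfloop>

<!--- Search all of them at once; name a single collection instead
      to limit the search to one subject --->
<cfsearch name="search"
          collection="pnp_part01,pnp_part02,pnp_part03,pnp_part04,pnp_part05,pnp_part06"
          criteria="#form.keyword#">
```

If the subdirectories are split by subject rather than by count, the same loop
applies; only the names in the list change.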

Ian

-----Original Message-----
From: Jackson Moore [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 29, 2002 8:40 AM
To: CF-Talk
Subject: Verity Index type=path


I have been using Verity on a couple websites and have been pretty pleased
thus far.  However, I recently encountered a problem when trying to index a
large directory.  (This is the Verity that comes with CF4.5 - not the new
CF5 version).

I have a directory with 5740 files in it - all Microsoft Word 97 documents.
When I create a new collection and use CFINDEX to index the documents, it
never adds all 5740, and the count isn't even consistent from run to run:
I've seen 5728, 5732, 5735, and 5736.

I'm using the following CFINDEX code:

<cfindex collection="pnpworking2"
         action="update"
         type="path"
         key="e:\inetpub\wwwroot\websitename\documents\pnp\working\"
         extensions=".doc,.xls,.ppt,.pdf">

I'm confirming the number of files added by searching for a keyword and
outputting #search.RECORDSSEARCHED#.

I have also tried using the CF Administrator to index the directory and get
the SAME results.

If I start with a new collection and use CFDIRECTORY to get a listing of the
files in the directory and then loop over them one by one adding them to the
collection using CFINDEX and type=file, the server times out before
finishing - I've let it run for 30 minutes.

In other places I use Verity, I don't have more than 1000 files in a single
directory, and I do not have this problem.

Questions:

1.  Is there a limit to the number of files Verity can index in a single
directory?
2.  Maybe all the files are getting indexed, but the .RECORDSSEARCHED is
incorrect?
3.  I considered that it may be an issue where Verity can't access a
document if it's in use by another process, but I made a copy of the
directory and moved it somewhere else on the server and am still having the
same problem.  Are there other possible services that could access these
files (preventing Verity from indexing them) I need to be aware of?

(Platform NT4Sp6a/IIS4/CF4.5.1SP2)

Thanks for any help you can provide,

Jackson Moore
[EMAIL PROTECTED]


