Use "index -i -u URL" for new documents only. This command just inserts the URL into the database; it does not index it.
If your script can access the MySQL database, you can set the field "urlword.next_index_time" to a small value (such as 1) for the documents that you want to reindex. In that case you should set a large value for the "Period" parameter in "aspseek.conf" for all URLs.
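
As a rough illustration of the database approach above, here is a minimal Python sketch that builds the UPDATE statement for a list of changed URLs. The table and field "urlword.next_index_time" come from this message; the column name "url" and the function name build_reindex_update are assumptions for illustration, so check them against your own schema before running anything.

```python
def build_reindex_update(urls, table="urlword", url_column="url"):
    """Return (sql, params) marking the given URLs as due for reindexing.

    Assumption: the URL text is stored in a column named `url_column`
    of `table`; only `next_index_time` is taken from the thread.
    """
    urls = list(urls)
    if not urls:
        raise ValueError("no URLs given")
    # One placeholder per URL; values are passed separately so the
    # database driver handles quoting.
    placeholders = ", ".join(["%s"] * len(urls))
    sql = (
        f"UPDATE {table} SET next_index_time = 1 "
        f"WHERE {url_column} IN ({placeholders})"
    )
    return sql, tuple(urls)
```

With a DB-API driver that uses "%s" placeholders, the result would be passed as cursor.execute(sql, params), followed by a normal index run to pick up the documents that are now due.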
 
Alexander
 
----- Original Message -----
Sent: Saturday, March 03, 2001 6:49 AM
Subject: Re: [aseek-users] configuration quesiton

This seems to be working, but it is still taking a long time to index my entire site.  Can I run

index -i -u http://www.yoursite.com/changingdir/subdir/changeddocument.html

for each new or changed document as part of a script?  (I already have a script that goes out and finds new and changed documents, so I could add this step to it.)  If I can do this, do I need to do anything to load the delta files (like an index -D), or will the index -i -u ... do that after each file is inserted into the database?

Then once per week I can set it up to rewalk the entire site.

Would this work, or would it cause problems in my database?

Thanks

Dan

Kir Kolyshkin wrote:

To limit reindexing, use the -u option; its argument is a URL mask in SQL form.
In your case it could be http://www.yoursite.com/rapidly_changing_dir1%

So you would run index -u nightly, and index without options once a week to
reindex everything.
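
To see which URLs a mask like the one above would select, here is a small Python sketch. It assumes "SQL form" means the usual SQL LIKE wildcards ("%" for any run of characters, "_" for any single character); the function names are made up for illustration and are not part of ASPseek.

```python
import re


def sql_mask_to_regex(mask):
    """Translate a SQL LIKE-style mask into a compiled regular expression."""
    parts = []
    for ch in mask:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def mask_matches(mask, url):
    """Return True if the whole URL matches the mask."""
    return sql_mask_to_regex(mask).fullmatch(url) is not None
```

For example, mask_matches("http://www.yoursite.com/rapidly_changing_dir1%", url) is true for any URL under that directory. Note that under LIKE rules the underscores in the directory name are themselves single-character wildcards, so the mask is slightly broader than it looks.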

Daniell Freed wrote:
>
> I need some advice about setting up aspseek.  I have a working installation of
> aspseek, but I am looking to optimize how it works for my particular needs.
>
> I have a single site that has 4 main directories that need to be indexed; all
> together there are about 200,000 documents.  2 of these directories contain
> documents that don't ever change, and they take up about 70% of the total number
> of documents.  The other 2 directories change daily; there are generally
> anywhere from 50 to 300 new or changed documents every day.  (These documents
> are WordPerfect and Word documents that are converted to HTML nightly as
> part of a cron job using some custom Perl scripts and a conversion tool called
> wp2html).  I need to update the changing directories nightly so I can search on
> these new and changed documents.
>
> When I initially ran index, the database was created just fine and I was able to
> search the documents that I needed.  Then I started running nightly index jobs
> that took about 30 to 40 minutes to run, but I wasn't seeing any changes to the
> old documents, and it didn't really look like any new documents were being added
> either (all of the documents contain last modified dates that I was using to
> search on).  After poking around in the aspseek.conf file I discovered the
> period command was set to 7d (7 days) and I figured that was my problem, so I
> lowered this to 6h (6 hours).  Now my index is running but it is taking a really
> long time to run (6 hours so far).  Looking at the logs.txt file, it looks like
> it is indexing everything from scratch (the queued docs count is up to over
> 100,000 documents).
>
> Is there a way that I can configure AspSeek to only look for updates in the 2
> directories that contain changes?  Or can I configure searchd to search 2
> different databases at the same time when a search request is made?
>
> Or (and this is a more complicated question) can I call index to insert or
> update a single document at a time?  If this works then I can just add this to
> my conversion script because it already goes through and finds new and changed
> documents as part of its process.
>
> My goal here is to be able to run these update scripts overnight so that any
> changes made the previous day are searchable.
>
> Thanks for the advice.
>
> --
> Daniell Freed
> Computer Services
> Dewitt, Ross, & Stevens S.C.
>
> He who fights with monsters might take care
> lest he thereby become a monster.
> And if you gaze for long into an abyss,
> the abyss gazes also into you.
>
> Beyond Good and Evil
> Friedrich Wilhelm Nietzsche
>
>

--  [EMAIL PROTECTED]  http://kir.sever.net ICQ 7551596  --
Join CCAUWM - Citizens' Campaign for Abolition of the Use
of the Word Microsoft (or of Microsoft Word - you choose)

 
