I have yet another question then.

If I use index -a -u <url> to index a specific document, will it load the delta files (and do the other end-of-index work) every time I issue the command? Or can I hold those steps off until I am done, and then run a single index -D -B -K at the end of the entire process?

Thanks again.
Dan

Achilleas Mantzios wrote:

On Sat, 03 Mar 2001, you wrote:
>
> You should use -i -u URL for new documents only. This command just inserts the URL into the database; it doesn't index it.
> If your script can access the MySQL database, you can set the field "urlword.next_index_time" to a small value (1) for the documents that you want to reindex. In that case you should specify a large value for the "Period" parameter in "aspseek.conf" for all URLs.
>
> Alexander
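
A rough sketch of the database approach described above, for illustration only. It uses Python's built-in sqlite3 with an in-memory table as a stand-in for the real MySQL database, so it can be run without touching a live index. The table name "urlword" and the field "next_index_time" come from the message above; the column name "url", the helper name mark_for_reindex, and the sample rows are assumptions, so check them against the actual schema before adapting this.

```python
import sqlite3


def mark_for_reindex(conn, url_pattern):
    """Set next_index_time to 1 for every row whose URL matches a SQL LIKE pattern.

    Assumption: the table has a text column named "url". Only the table name
    and "next_index_time" are taken from the thread.
    """
    cur = conn.execute(
        "UPDATE urlword SET next_index_time = 1 WHERE url LIKE ?",
        (url_pattern,),
    )
    conn.commit()
    return cur.rowcount  # number of rows that were marked


if __name__ == "__main__":
    # In-memory stand-in for the real database (hypothetical, minimal schema).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE urlword (url TEXT, next_index_time INTEGER)")
    conn.executemany(
        "INSERT INTO urlword VALUES (?, ?)",
        [
            ("http://www.yoursite.com/changingdir/a.html", 999999999),
            ("http://www.yoursite.com/staticdir/b.html", 999999999),
        ],
    )
    marked = mark_for_reindex(conn, "http://www.yoursite.com/changingdir/%")
    print("rows marked for reindexing:", marked)
```

With a MySQL driver the placeholder style is usually %s rather than ?, and the connection would come from that driver instead of sqlite3.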

I think index -a -u <url> does the job (resetting the next index time to now and
indexing just the URLs matching <url>).
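
If the conversion script already knows which documents changed, the per-document calls could be generated along these lines. This is only a sketch: it builds command lines in the index -a -u <url> form discussed here and prints them instead of running anything. The helper name build_index_commands and the sample URL list are made up for illustration, and it says nothing about when the delta files get loaded.

```python
import shlex


def build_index_commands(urls, index_bin="index"):
    """Return one argument list per distinct URL, in the `index -a -u <url>` form.

    Blank entries and duplicates are skipped so that each changed document
    is submitted only once.
    """
    commands = []
    seen = set()
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue
        seen.add(url)
        commands.append([index_bin, "-a", "-u", url])
    return commands


if __name__ == "__main__":
    # Hypothetical output of a script that finds new and changed documents.
    changed = [
        "http://www.yoursite.com/changingdir/subdir/changeddocument.html",
    ]
    for argv in build_index_commands(changed):
        # Dry run: print the shell-quoted command rather than executing it.
        print(shlex.join(argv))
```

To actually execute the commands, each argument list could be passed to subprocess.run(argv, check=True) once the printed output looks right.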

>
> ----- Original Message -----
>   From: Daniell Freed
>   To: [EMAIL PROTECTED]
>   Sent: Saturday, March 03, 2001 6:49 AM
>   Subject: Re: [aseek-users] configuration quesiton
>
>
>   This seems to be working, but it is still taking a long time to index my entire site.  Can I do a
>   index -i -u http://www.yoursite.com/changingdir/subdir/changeddocument.html
>
>   for each new or changed document as part of a script?  (I already have a script that goes out and finds new and changed documents, so I could add this step to it.)  If I can do this, do I need to do anything to load the delta files (like an index -D), or will index -i -u ... do that after each file is inserted into the database?
>
>   Then once per week I can set it up to rewalk the entire site.
>
>   Would this work, or would it cause problems in my database?
>
>   Thanks
>
>   Dan
>
>   Kir Kolyshkin wrote:
>
>     To limit reindexing, use the -u option; its argument is a URL mask in SQL form.
>     In your case it can be http://www.yoursite.com/rapidly_changing_dir1%
>     So you would run index -u nightly, and index without options to reindex everything
>     every week.
>
>     Daniell Freed wrote:
>     >
>     > I need some advice about setting up aspseek.  I have a working installation of
>     > aspseek, but I am looking to optimize how it works for my particular needs.
>     >
>     > I have a single site that has 4 main directories that need to be indexed; all
>     > together there are about 200,000 documents.  2 of these directories contain
>     > documents that don't ever change, and they take up about 70% of the total number
>     > of documents.  The other 2 directories change daily; there are generally
>     > anywhere from 50 to 300 new or changed documents every day.  (These documents
>     > are WordPerfect and Word documents that have been converted to HTML nightly as
>     > part of a cron job using some custom Perl scripts and a conversion tool called
>     > wp2html).  I need to update the changing directories nightly so I can search on
>     > these new and changed documents.
>     >
>     > When I initially ran index, the database was created just fine and I was able to
>     > search the documents that I needed.  Then I started running nightly index jobs
>     > that took about 30 to 40 minutes to run, but I wasn't seeing any changes to the
>     > old documents, and it didn't really look like any new documents were being added
>     > either (all of the documents contain last modified dates that I was using to
>     > search on).  After poking around in the aspseek.conf file I discovered the
>     > period command was set to 7d (7 days) and I figured that was my problem, so I
>     > lowered this to 6h (6 hours).  Now my index is running but it is taking a really
>     > long time to run (6 hours so far).  Looking at the logs.txt file, it looks like
>     > it is indexing everything from scratch (the queued docs count is up to over
>     > 100,000 documents).
>     >
>     > Is there a way that I can configure AspSeek to only look for updates in the 2
>     > directories that contain changes?  Or can I configure searchd to search 2
>     > different databases at the same time when a search request is made?
>     >
>     > Or (and this is a more complicated question) can I call index to insert or
>     > update a single document at a time?  If this works then I can just add this to
>     > my conversion script because it already goes through and finds new and changed
>     > documents as part of its process.
>     >
>     > My goal here is to be able to run these update scripts overnight so that any
>     > changes made the previous day are searchable.
>     >
>     > Thanks for the advice.
>     >
>     > --
>     > Daniell Freed
>     > Computer Services
>     > Dewitt, Ross, & Stevens S.C.
>     >
>     > He who fights with monsters might take care
>     > lest he thereby become a monster.
>     > And if you gaze for long into an abyss,
>     > the abyss gazes also into you.
>     >
>     > Beyond Good and Evil
>     > Friedrich Wilhelm Nietzsche
>     >
>     >
>
>     --  [EMAIL PROTECTED]  http://kir.sever.net ICQ 7551596  --
>     Join CCAUWM - Citizens' Campaign for Abolition of the Use
>     of the Word Microsoft (or of Microsoft Word - you choose)
>


--
Achilleas Mantzios
Application Developer
Eurisko A.E.
Pindarou 9
106 71 Athens
Tel: +301 3633362
Fax: +301 3633074
e-mail: [EMAIL PROTECTED]

