On Sat, 03 Mar 2001, you wrote:
> 
> You should use -i -u URL for new documents only. This command just inserts the URL
> into the database, but doesn't index it.
> If you can access the MySQL database from your script, you can set the field
> "urlword.next_index_time" to a small value (1) for the documents that you want to
> reindex. In that case you should specify a large value for the "Period" parameter in
> "aspseek.conf" for all URLs.
> 
> Alexander

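A minimal sketch of the database approach quoted above, in Python. The table and field names (urlword, next_index_time) come from Alexander's message; the "url" column name, the helper mark_for_reindex, and the in-memory SQLite database standing in for the real MySQL database are assumptions made so the example runs on its own. With a MySQL driver the placeholder style would typically be %s rather than ?.

```python
import sqlite3


def mark_for_reindex(conn, url_mask, when=1):
    """Set next_index_time to a small value for every URL matching url_mask.

    url_mask is an SQL LIKE pattern, so % matches any run of characters.
    The "url" column name is an assumption; check it against your schema.
    """
    cur = conn.execute(
        "UPDATE urlword SET next_index_time = ? WHERE url LIKE ?",
        (when, url_mask),
    )
    conn.commit()
    return cur.rowcount


# Stand-in table with only the two columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urlword (url TEXT, next_index_time INTEGER)")
conn.executemany(
    "INSERT INTO urlword VALUES (?, ?)",
    [
        ("http://www.yoursite.com/changingdir/a.html", 984000000),
        ("http://www.yoursite.com/changingdir/b.html", 984000000),
        ("http://www.yoursite.com/staticdir/c.html", 984000000),
    ],
)

changed = mark_for_reindex(conn, "http://www.yoursite.com/changingdir/%")
print(changed, "rows marked for reindexing")
```

Only the rows under changingdir are touched; the documents that never change keep their original next_index_time.
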
I think index -a -u <url> does the job (resetting the next index time to now and
indexing just the URLs matching <url>).
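
If the conversion script already knows which documents changed, it could build one index -a -u call per document. The sketch below is only an illustration in Python: the helper names reindex_command and reindex_changed, the dry_run switch, and the assumption that the binary is reachable as "index" on the PATH are mine, not part of ASPseek. The -a and -u options are used exactly as described above.

```python
import subprocess


def reindex_command(url, index_bin="index"):
    """Build the argument list for reindexing the URLs matching `url`."""
    # Per this thread, the -u argument is a URL mask in SQL form, so a
    # literal % or _ in a document URL would act as a wildcard.
    return [index_bin, "-a", "-u", url]


def reindex_changed(urls, dry_run=True, index_bin="index"):
    """Return the commands for all changed URLs; run them unless dry_run."""
    commands = [reindex_command(u, index_bin) for u in urls]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop on the first failure
    return commands


changed_docs = [
    "http://www.yoursite.com/changingdir/subdir/changeddocument.html",
]
commands = reindex_changed(changed_docs)  # dry run: nothing is executed
for cmd in commands:
    print(" ".join(cmd))
```

With dry_run left at True the script only prints what it would run, which makes it easy to check the list before wiring it into a cron job.
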

> 
> ----- Original Message ----- 
>   From: Daniell Freed 
>   To: [EMAIL PROTECTED] 
>   Sent: Saturday, March 03, 2001 6:49 AM
>   Subject: Re: [aseek-users] configuration quesiton
> 
> 
>   This seems to be working, but it is still taking a long time to index my entire site.
>   Can I do an
>   index -i -u http://www.yoursite.com/changingdir/subdir/changeddocument.html 
> 
>   for each new or changed document as part of a script (I already have a script that
>   goes out and finds new and changed documents, so I could add this step to it)?  If I
>   can do this, do I need to do anything to load the delta files (like an index -D), or
>   will the index -i -u ... do that after each file is inserted into the database?
> 
>   Then once per week I can set it up to rewalk the entire site. 
> 
>   Would this work, or would it cause problems in my database? 
> 
>   Thanks 
> 
>   Dan 
> 
>   Kir Kolyshkin wrote: 
> 
>     To limit reindexing, use the -u option; its argument is a URL mask in SQL form.
>     In your case it can be http://www.yoursite.com/rapidly_changing_dir1% 
>     So you'll run index -u nightly, and index without options to reindex everything 
>     every week. 
> 
>     Daniell Freed wrote: 
>     > 
>     > I need some advice about setting up aspseek.  I have a working installation of 
>     > aspseek, but I am looking to optimize how it works for my particular needs. 
>     > 
>     > I have a single site that has 4 main directories that need to be indexed; all 
>     > together there are about 200,000 documents.  2 of these directories contain 
>     > documents that don't ever change, and they take up about 70% of the total number 
>     > of documents.  The other 2 directories change daily; there are generally 
>     > anywhere from 50 to 300 new or changed documents every day.  (These documents 
>     > are WordPerfect and Word documents that are converted to HTML nightly as 
>     > part of a cron job using some custom Perl scripts and a conversion tool called 
>     > wp2html.)  I need to update the changing directories nightly so I can search on 
>     > these new and changed documents. 
>     > 
>     > When I initially ran index, the database was created just fine and I was able to 
>     > search the documents that I needed.  Then I started running nightly index jobs 
>     > that took about 30 to 40 minutes to run, but I wasn't seeing any changes to the 
>     > old documents, and it didn't really look like any new documents were being added 
>     > either (all of the documents contain last modified dates that I was using to 
>     > search on).  After poking around in the aspseek.conf file I discovered the 
>     > period command was set to 7d (7 days) and I figured that was my problem, so I 
>     > lowered this to 6h (6 hours).  Now my index is running but it is taking a really 
>     > long time to run (6 hours so far).  Looking at the logs.txt file, it looks like 
>     > it is indexing everything from scratch (the queued docs count is up to over 
>     > 100,000 documents). 
>     > 
>     > Is there a way that I can configure AspSeek to only look for updates in the 2 
>     > directories that contain changes?  Or can I configure searchd to search 2 
>     > different databases at the same time when a search request is made? 
>     > 
>     > Or (and this is a more complicated question) can I call index to insert or 
>     > update a single document at a time?  If this works then I can just add this to 
>     > my conversion script because it already goes through and finds new and changed 
>     > documents as part of its process. 
>     > 
>     > My goal here is to be able to run these update scripts overnight so that any 
>     > changes made the previous day are searchable. 
>     > 
>     > Thanks for the advice. 
>     > 
>     > -- 
>     > Daniell Freed 
>     > Computer Services 
>     > Dewitt, Ross, & Stevens S.C. 
>     > 
>     > He who fights with monsters might take care 
>     > lest he thereby become a monster. 
>     > And if you gaze for long into an abyss, 
>     > the abyss gazes also into you. 
>     > 
>     > Beyond Good and Evil 
>     > Friedrich Wilhelm Nietzsche 
>     > 
>     > 
> 
>     --  [EMAIL PROTECTED]  http://kir.sever.net ICQ 7551596  -- 
>     Join CCAUWM - Citizens' Campaign for Abolition of the Use 
>     of the Word Microsoft (or of Microsoft Word - you choose)
> 
> -- 
> Daniell Freed
> Computer Services
> Dewitt, Ross, & Stevens S.C.
> 
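
Kir's nightly/weekly split quoted above could be driven from a single cron entry that decides each night what to run. The following is a rough Python sketch, not a tested recipe: the function name commands_for, the CHANGING_MASKS list, and the choice of Sunday for the full run are all assumptions. Whether documents whose next index time has not yet expired get picked up by a plain index -u depends on the Period setting discussed in this thread, so -a may need to be added.

```python
import datetime

# SQL-form URL masks for the directories that change daily (example value
# taken from Kir's message; replace with your own directories).
CHANGING_MASKS = ["http://www.yoursite.com/rapidly_changing_dir1%"]


def commands_for(day, full_weekday=6, index_bin="index"):
    """Return the index commands to run on `day`.

    On the chosen weekday (Monday is 0, Sunday is 6) run index with no
    options to rewalk everything; on other days limit it with -u.
    """
    if day.weekday() == full_weekday:
        return [[index_bin]]
    return [[index_bin, "-u", mask] for mask in CHANGING_MASKS]


for cmd in commands_for(datetime.date.today()):
    print(" ".join(cmd))
```

The function only builds the argument lists; passing each one to subprocess.run would actually start the indexer.
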

-- 
Application Developer
Eurisko A.E.
e-mail: [EMAIL PROTECTED]
