If I use index -a -u <url> to index a specific document, will it load the delta files (and do the rest of the end-of-index processing) after each invocation of the command, or can I stop it from doing those steps until I am done and then run an index -D -B -K once at the end of the entire process?
Thanks again.
Dan
Achilleas Mantzios wrote:
On Sat, 03 Mar 2001, you wrote:
>
> You should use -i -u URL for new documents only. This command just inserts the URL into the database, but doesn't index it.
> If you are able to access the MySQL database from your script, you can update the field "urlword.next_index_time" to a small value (1) for the documents that you want to reindex. In this case you should specify a large value for the "Period" parameter in "aspseek.conf" for all URLs.
>
> Alexander

I think index -a -u <url> does the job (resetting the next index time to now and indexing just the URLs matching <url>).
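
A minimal sketch of how a nightly script might issue that command once per new or changed document. It assumes the script has already written the changed URLs to a plain text file, one per line; the function name reindex_changed, the INDEX_CMD variable, and the list-file format are hypothetical and only for illustration. It uses only the -a -u form mentioned above and does not settle whether index merges the delta files after each call.

```shell
#!/bin/sh
# Sketch only: run "index -a -u <url>" once per URL listed in a file.
# INDEX_CMD is a hypothetical override so the loop can be dry-run
# (e.g. INDEX_CMD=echo) before touching the real database.
INDEX_CMD="${INDEX_CMD:-index}"

reindex_changed() {
    list_file="$1"
    count=0
    # Read one URL per line; also handle a last line without a newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines and lines starting with '#'.
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # Per the reply above, -a -u <url> reindexes just the matching URLs.
        $INDEX_CMD -a -u "$url" || echo "index failed for $url" >&2
        count=$((count + 1))
    done < "$list_file"
    echo "processed $count URLs"
}
```

For a dry run, set INDEX_CMD=echo and call reindex_changed with the path to the URL list; each command line is printed instead of executed.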
> ----- Original Message -----
> From: Daniell Freed
> To: [EMAIL PROTECTED]
> Sent: Saturday, March 03, 2001 6:49 AM
> Subject: Re: [aseek-users] configuration quesiton
>
>
> This seems to be working, but it is still taking a long time to index my entire site. Can I do an
> index -i -u http://www.yoursite.com/changingdir/subdir/changeddocument.html
>
> for each new or changed document as part of a script (I already have a script that goes out and finds new and changed documents, which I could add this step to)? If I can do this, do I need to do anything to load the delta files (like an index -D), or will the index -i -u ... do that after each file is inserted into the database?
>
> Then once per week I can set it up to rewalk the entire site.
>
> Would this work, or would it cause problems in my database?
>
> Thanks
>
> Dan
>
> Kir Kolyshkin wrote:
>
> To limit reindexing, use the -u option; its argument is a URL mask in SQL form,
> in your case it can be http://www.yoursite.com/rapidly_changing_dir1%
> So, you'll run index -u nightly, and index without options to reindex everything
> every week.
>
> Daniell Freed wrote:
> >
> > I need some advice about setting up aspseek. I have a working installation of
> > aspseek, but I am looking to optimize how it works for my particular needs.
> >
> > I have a single site that has 4 main directories that need to be indexed; all
> > together there are about 200,000 documents. 2 of these directories contain
> > documents that don't ever change, and they take up about 70% of the total number
> > of documents. The other 2 directories change daily; there are generally
> > anywhere from 50 to 300 new or changed documents every day. (These documents
> > are WordPerfect and Word documents that have been converted to html nightly as
> > part of a cron job using some custom perl scripts and a conversion tool called
> > wp2html). I need to update the changing directories nightly so I can search on
> > these new and changed documents.
> >
> > When I initially ran index, the database was created just fine and I was able to
> > search the documents that I needed. Then I started running nightly index jobs
> > that took about 30 to 40 minutes to run, but I wasn't seeing any changes to the
> > old documents, and it didn't really look like any new documents were being added
> > either (all of the documents contain last modified dates that I was using to
> > search on). After poking around in the aspseek.conf file I discovered the
> > period command was set to 7d (7 days) and I figured that was my problem, so I
> > lowered this to 6h (6 hours). Now my index is running but it is taking a really
> > long time to run (6 hours so far). Looking at the logs.txt file, it looks like
> > it is indexing everything from scratch (the queued docs count is up to over
> > 100,000 documents).
> >
> > Is there a way that I can configure AspSeek to only look for updates in the 2
> > directories that contain changes? Or can I configure searchd to search 2
> > different databases at the same time when a search request is made?
> >
> > Or (and this is a more complicated question) can I call index to insert or
> > update a single document at a time? If this works then I can just add this to
> > my conversion script because it already goes through and finds new and changed
> > documents as part of its process.
> >
> > My goal here is to be able to run these update scripts overnight so that any
> > changes made the previous day are searchable.
> >
> > Thanks for the advice.
> >
> > --
> > Daniell Freed
> > Computer Services
> > Dewitt, Ross, & Stevens S.C.
> >
> > He who fights with monsters might take care
> > lest he thereby become a monster.
> > And if you gaze for long into an abyss,
> > the abyss gazes also into you.
> >
> > Beyond Good and Evil
> > Friedrich Wilhelm Nietzsche
> >
> >
>
> -- [EMAIL PROTECTED] http://kir.sever.net ICQ 7551596 --
> Join CCAUWM - Citizens' Campaign for Abolition of the Use
> of the Word Microsoft (or of Microsoft Word - you choose)
>
>
------------------------------------------
Αχιλλέας Μάντζιος
Application Developer
Eurisko A.E.
Πινδάρου 9
106 71 Αθήνα
Tel: +301 3633362
Fax: +301 3633074
e-mail: [EMAIL PROTECTED]
