According to Pietro Palladino:
> Geoff Hutchison <[EMAIL PROTECTED]> writes:
> > On Wed, 12 Sep 2001, Pietro Palladino wrote:
> > > 1) when I really need to use more than one database?
> > 
> > * If you want more security for restrict/exclude than parameters to the
> > CGI.
> 
> Sorry, Geoff, could you explain this point in more detail, please?

See http://www.htdig.org/FAQ.html#q4.20 for a more detailed explanation
of the why and how of this.

> > * If you want to have "virtual hosting" for multiple sites. 
> 
> Ok, but in this case I could use the restrict option in a php wrapper.
> E.g. I have just one db and a List/Menu from which I can choose "site1", "site2"
> or "site3". When I press "Go", my page sends the choice to a new php file in
> which a switch statement selects which site to pass to the restrict
> option...
> If I use a php wrapper, are there other reasons that I don't know of (on this
> subject) for needing multiple databases? :-?

I think the basic point is that if the restrict and exclude parameters to
htsearch do what you want, then you don't need multiple databases.

The main reason you'd pick separate databases over restrict/exclude
would be security (see FAQ 4.20).  The second would be because you
want a complete physical separation of sites.  E.g. if you're hosting
virtual sites for different companies, you'd want them to be able to
control the indexing of their own sites, and handle the maintenance of
their databases themselves, so it would likely be better to have separate
databases for each company.

A third reason might be performance or system limitation issues.  E.g.,
if a combined database of all the sites you index would be too big, or
too slow to search, you may prefer to maintain separate ones.
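
To make the two approaches concrete, here is a minimal sketch of what
a wrapper might send to htsearch in each case.  It is in Python rather
than php, and the site names, URL prefixes and config names are made up
for illustration.  The only things taken from htsearch itself are the
"words", "restrict" and "config" input parameters.

```python
from urllib.parse import urlencode

# Hypothetical site table: the names, URL prefixes and config names
# below are assumptions for illustration, not from any real setup.
SITES = {
    "site1": {"restrict": "http://www.example.com/site1/", "config": "site1"},
    "site2": {"restrict": "http://www.example.com/site2/", "config": "site2"},
    "site3": {"restrict": "http://www.example.com/site3/", "config": "site3"},
}


def query_with_restrict(words, site):
    """One shared database; narrow the results with htsearch's restrict."""
    return urlencode({"words": words, "restrict": SITES[site]["restrict"]})


def query_with_config(words, site):
    """One database per site; pick it with htsearch's config parameter."""
    return urlencode({"words": words, "config": SITES[site]["config"]})


if __name__ == "__main__":
    print(query_with_restrict("spinal cord", "site1"))
    print(query_with_config("spinal cord", "site1"))
```

Either way, the wrapper looks the menu choice up in a fixed table
instead of passing the user's input straight through, so an unknown
site name fails with a KeyError rather than reaching htsearch.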

> > * If you want to "mix and match" multiple categories independent of URLs
> >   (e.g. a site might want databases based on states or regions, which
> >   each have collections of multiple URLs)
> 
> This point is also a little obscure. Could you explain it in more detail?

I think he means if you want to categorize pages in ways that would be
too complex to express in the form of simple restrict and exclude URL
substrings, then having separate databases may give you more control
over the categorization.  If restrict/exclude do the job for you, this
wouldn't be an issue.

> > > 2) Actually I have 3 conf files. Which one will I need to use when I
> > > run rundig?
> > 
> > That depends on what you're intending to do. It's hard to say more
> > without more details from you.
> 
> :-) Sorry, you are right. Well, I use a php wrapper to let people search in 3
> different subtrees of our site. For each kind of search I use different
> restrict/exclude options and 3 different conf files. Though I don't really need
> 3 conf files (2 of them are alike), I chose this way because I thought that I
> might need all of them in the future. They differ only in the
> "allow_in_form" option. Now the question is: do I need a new conf file just for
> digging?

See http://www.htdig.org/attrs.html#allow_in_form, and you'll see that
this attribute is only used by htsearch.  If your config files differ only
in the setting of this attribute, then as far as htdig and htmerge are
concerned, they all have the same settings for any attribute that matters.
If they're all essentially the same, then no, you don't really need another
one for digging, as long as the htdig and htmerge-specific attributes
are set as you need them.
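
If you want to double-check that your config files really differ only
in htsearch-specific attributes, a rough sketch like the following
could do it.  The parser is a simplification I'm assuming is good
enough for typical files (it handles "name: value" lines, "#" comment
lines and backslash continuations only), and the ignore list holds just
allow_in_form; you'd extend it with any other attributes that
attrs.html lists as used only by htsearch.

```python
def parse_conf(text):
    """Parse a simplified ht://Dig-style config into a dict of attributes."""
    attrs = {}
    pending = ""
    for raw in text.splitlines():
        line = pending + raw.strip()
        pending = ""
        if line.endswith("\\"):
            # Trailing backslash: the value continues on the next line.
            pending = line[:-1] + " "
            continue
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            # Collapse runs of whitespace so layout differences don't count.
            attrs[name.strip()] = " ".join(value.split())
    return attrs


def differing_attributes(conf_a, conf_b, ignore=("allow_in_form",)):
    """Names of attributes whose values differ, skipping the ignored ones."""
    a, b = parse_conf(conf_a), parse_conf(conf_b)
    names = (set(a) | set(b)) - set(ignore)
    return sorted(n for n in names if a.get(n) != b.get(n))
```

An empty list means that, apart from the ignored attributes, the two
files say the same thing, so either one should do for digging.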

> > > 3) I didn't understand a thing. If I use a "hierarchy" of weights
> > > like this one:
> > ...
> > > Am I right in saying that words in the text are more important than
> > > ones in the title... and so on with the headings? Is it worth
> > > specifying them, or is it better to use the default ones? Logically,
> > > is it better to give more weight to the words inside the text?
> > 
> > The weights you gave would certainly have the effects you describe. I
> > think most people find that headers, titles, meta keywords are more
> > accurate, succinct descriptions of the document contents than the
> > normal text. So the default values are at least a better starting point
> > than those that you described--but these can vary depending on how
> > people code their pages. (For example, if people use <font> tags rather
> > than <h1> to set a "header," then the header_factor isn't that useful.)
> 
> Right! Unfortunately my situation is the worse case :-) The pages of our
> site don't have proper titles and they have no meta keywords, so what do you think
> about my choice? Any suggestions?

If your pages don't have any meta keywords, then the setting of
keywords_factor doesn't really matter, as it won't be used.  Same thing
for title_factor if none of the pages have titles.  On the other hand,
if the pages have titles that are not relevant to their contents, or
that might produce misleading search results, then you may want to set
title_factor to something low, or 0.  Same thing for other scoring
factors if they correspond to misleading or inappropriate elements on
your pages (e.g. meta keywords, meta descriptions, link descriptions,
headings, etc.)
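
As a toy illustration of why a factor doesn't matter when the element
is absent, and why 0 switches an element off, here is a deliberately
simplified model: a weighted sum of word occurrences per location.
This is an assumption for illustration only, not htsearch's actual
scoring code, and the factor values are made up rather than the real
defaults.

```python
def score(occurrences, factors):
    """Toy score: occurrences of a word in each location times its factor."""
    return sum(count * factors.get(name, 0)
               for name, count in occurrences.items())


# Illustrative factor values only; check attrs.html for the real defaults.
factors = {"title_factor": 100, "keywords_factor": 100, "text_factor": 1}

# A page with no meta keywords: keywords_factor never contributes.
page = {"title_factor": 1, "text_factor": 3}
print(score(page, factors))

# Titles judged misleading: setting title_factor to 0 removes their effect.
print(score(page, dict(factors, title_factor=0)))
```

In this model the first page scores the same whatever keywords_factor
is set to, since it has no keyword occurrences to multiply.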

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
FAQ: http://htdig.sourceforge.net/FAQ.html
