According to DI Peter Burgstaller:
> I'm trying to index a site that has language dependent content.
> The naming convention for files there is <filename>.html.en for the
> English content, <filename>.html.de for German, <filename>.html.it for
> Italian, etc. (you get the idea).
>
> Now I have a problem with index.html.<lang> since url.normalize calls
> for url.removeIndex which makes <site>/index.html.en and <site>/index.html.de
> -> <site>/ so I only get one language.
>
> Any ideas how to solve that problem?
I have a couple of ideas. The problem stems from the fact that the
CompareWord method that removeIndex uses is happy to match a
substring, as long as that substring is a "word", which it thinks it
is in this case because of the period that follows index.html. The
simple fix is to change the setting of the remove_default_doc attribute
from the default of index.html to either index.html.en or index.html.de,
so that only one of these is stripped off.
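
As a rough illustration, the setting might look like this in the
configuration file. The choice of the English variant is only an
example, and the line assumes the usual "attribute: value" syntax:

```
remove_default_doc: index.html.en
```
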
If that is unsatisfactory, e.g. if you have some directories which just
have an index.html without the .<lang> suffix, then the more elegant
fix is to change URL::removeIndex() to be more exacting about what it
matches. E.g., you could change the section that reads:

    if (defaultdoc->hasPattern() &&
        defaultdoc->CompareWord(path.sub(filename)))
        path.chop(path.length() - filename);

to something like this:

    int which, length;
    if (defaultdoc->hasPattern() &&
        defaultdoc->CompareWord(path.sub(filename), &which, &length) &&
        path.length() == filename+length)
        path.chop(length);

If you try this, please let me know if it works for you, and I'll
incorporate it into 3.2.0b3.
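
To make the difference between the two matching rules concrete, here is
a minimal standalone sketch. It does not use the ht://Dig String or
pattern-matching classes at all: removeIndexLoose and removeIndexStrict
are hypothetical helpers written with std::string, and the "word" test
is only a simplified approximation of what CompareWord does.

```cpp
#include <cctype>
#include <string>

// Offset of the final path component (the part after the last '/').
static std::string::size_type filenameOffset(const std::string &path) {
    std::string::size_type slash = path.rfind('/');
    return (slash == std::string::npos) ? 0 : slash + 1;
}

// Loose rule (approximates the current behaviour): strip the final
// component if it starts with defaultDoc and the match ends at a word
// boundary, i.e. at the end of the path or before a non-alphanumeric
// character such as '.'.
std::string removeIndexLoose(std::string path, const std::string &defaultDoc) {
    std::string::size_type filename = filenameOffset(path);
    if (!defaultDoc.empty() &&
        path.compare(filename, defaultDoc.size(), defaultDoc) == 0) {
        std::string::size_type end = filename + defaultDoc.size();
        bool wordBoundary = end == path.size() ||
            !std::isalnum(static_cast<unsigned char>(path[end]));
        if (wordBoundary)
            path.erase(filename);
    }
    return path;
}

// Strict rule (the idea behind the proposed change): strip the final
// component only if it is exactly defaultDoc, with nothing after it.
std::string removeIndexStrict(std::string path, const std::string &defaultDoc) {
    std::string::size_type filename = filenameOffset(path);
    if (!defaultDoc.empty() &&
        path.compare(filename, std::string::npos, defaultDoc) == 0)
        path.erase(filename);
    return path;
}
```

With defaultDoc set to "index.html", removeIndexLoose turns both
"/docs/index.html.en" and "/docs/index.html.de" into "/docs/", which is
the collision described above. removeIndexStrict leaves both of those
paths alone and still reduces a plain "/docs/index.html" to "/docs/".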
--
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930