On Mon, Mar 25, 2002 at 02:05:34PM -0500, Geoff Hutchison wrote:
> On Mon, 25 Mar 2002, Soon-Son Kwon wrote:
>
> > limit_urls_to: HOWTO Translations KoreanDoc
> >
> > But after modifying the limit_urls_to, the db size
> > grew much bigger than before.
>
> I would worry that with the limit_urls_to that you have set, you
> could end up heading off-site. If I had to take a guess from your domain
> name, I'd think perhaps you headed to the main LDP site and started
> indexing there.
In fact, I run the Korean LDP, and my server hosts some other websites
too, but I want htdig to index only those URLs on my own domain (kldp.org)
that contain the strings above.
So I set limit_urls_to to "kldp.org/HOWTO kldp.org/Translations
kldp.org/KoreanDoc" so that htdig would store information only for URLs
containing those strings, but the result was the same:
db.wordlist kept growing, up to 1.4GB, until it ate up all the remaining
disk space.
Can anyone please tell me how to make htdig index only
some specific directories?
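
To make clear what I expect the setting to do, here is a minimal Python
sketch. It assumes limit_urls_to is a plain substring test against the
full URL, which is my reading of the documentation and not a copy of
htdig's actual code. The function name url_allowed and the example.org
URL are hypothetical, for illustration only.

```python
# Sketch only: assumes limit_urls_to accepts a URL when any of its
# space-separated patterns occurs anywhere in that URL.
def url_allowed(url, patterns):
    """Return True if at least one pattern is a substring of the URL."""
    return any(p in url for p in patterns)

# The two settings tried so far, split on whitespace.
bare = "HOWTO Translations KoreanDoc".split()
scoped = "kldp.org/HOWTO kldp.org/Translations kldp.org/KoreanDoc".split()

# Hypothetical URLs: one off-site, one on kldp.org.
offsite = "http://www.example.org/HOWTO/index.html"
onsite = "http://kldp.org/HOWTO/index.html"

print(url_allowed(offsite, bare))    # bare pattern lets the off-site URL through
print(url_allowed(offsite, scoped))  # host-qualified pattern rejects it
print(url_allowed(onsite, scoped))   # on-site HOWTO URL is still accepted
```

If that assumption holds, the host-qualified patterns should already keep
the dig on kldp.org, so the growth may come from something else.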
> > Has anyone faced the same situation yet?
> > I am using somewhat old version (3.1.2) because I have a patch which
> > enables htdig to deal with 2-byte character data.
> >
> > I also want to know if the current htdig supports Asian (especially Korean)
> > characters or not. AFAIK, not yet but things may have changed. :-)
>
> No, but we'd certainly be very interested in that patch. We'd certainly
> like to support multi-byte character sets, but as none of the currently
> active developers:
> * has much multi-byte data to index
> * is familiar with programming Unicode/UTF-8 encodings
> * can easily test multi-byte indexing
> we haven't made much progress.
>
> Of course if you have a patch that can point the way towards this, we'd
> certainly give it a look and/or work with people to get multi-byte
> indexing working.
In fact, this patch is over 2 years old, and at the time it was rejected
because it handles only Korean, not Unicode/UTF-8.
It works only with 3.1.2, and its developer has stopped updating it.
--
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*
(o_ **WTFM**
(o_ (o_ //\
(/)_ (/)_ V_/_ http://kldp.org
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*