KH Aleksey,
AC Before looking for word in .index file dictd converts it to lower
AC case and removes non-alphanumeric characters from the word (if no
AC 00-database-allchars is found, of course). This is necessary to
AC ignore non-alphanumeric characters in search and make the search
AC
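The normalization described above can be sketched roughly as follows. This is only an illustration of the idea in Python, not dictd's actual C code: the function name normalize_headword and the allchars parameter are hypothetical, whitespace handling is simplified (the message does not address it), and lower-casing the word even when 00-database-allchars is present is an assumption.

```python
def normalize_headword(word: str, allchars: bool = False) -> str:
    """Approximate the query normalization described in the message.

    Hypothetical helper. `allchars` stands in for "the database has a
    00-database-allchars entry"; in that case nothing is stripped.
    Lower-casing in the allchars case is an assumption of this sketch.
    """
    lowered = word.lower()
    if allchars:
        return lowered
    # Keep only alphanumeric characters (Unicode-aware in Python 3).
    return "".join(ch for ch in lowered if ch.isalnum())


if __name__ == "__main__":
    print(normalize_headword("00-database-short"))
    print(normalize_headword("00-database-short", allchars=True))
```

Under these assumptions a query for 00-database-short is reduced to 00databaseshort before the index is searched, so an index that stores the headword with dashes would only match when the allchars behaviour is in effect.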
KH Aleksey,
KH Do you know why 00-database-allchars is necessary when
KH 00-database-allchars is present, in order for dictd to read
KH 00-database-short (and thus report the database name)? Is this a bug
KH or a feature?
Could you please show me 'head -n 30 buggy_db.index'?
P.S.
Note
Aleksey,
Note that if dictd databases were created by dictfmt with the --utf8
option but without --allchars, the 00-database-xxx entry will be present
in the .index file as 00databasexxx, i.e. with no dashes.
Indeed, s/-//g allows it to work without the 00-database-allchars.
Is this a documented feature,
KH Aleksey,
Note that if dictd databases were created by dictfmt with the --utf8
option but without --allchars, the 00-database-xxx entry will be
present in the .index file as 00databasexxx, i.e. with no dashes.
KH Indeed, s/-//g allows it to work without the
KH 00-database-allchars. Is this a
Aleksey,
AC Before looking for word in .index file dictd converts it to lower
AC case and removes non-alphanumeric characters from the word (if no
AC 00-database-allchars is found, of course). This is necessary to
AC ignore non-alphanumeric characters in search and make the search
AC
Thomas Ludovic,
I have been testing dictd 1.10.2, and as Aleksey suggested, it
serves UTF-8 dictionaries without needing a --locale=*.utf-8 line.
The unreadability of the 00-database-short entry when running UTF-8 is
still present, but I have made some progress with this. If a
KH Thomas:
What about upgrading the Debian package to dictd 1.10.1, as
suggested by Aleksey ?
KH Absolutely! I did not realize that 1.10 was out -- and it appears to
KH have been so since June. How embarrassing.
KH I will build a new package and let you know how it behaves.
I always
Thomas:
What about upgrading the Debian package to dictd 1.10.1, as
suggested by Aleksey ?
Absolutely! I did not realize that 1.10 was out -- and it appears to
have been so since June. How embarrassing.
I will build a new package and let you know how it behaves.
Kirk
The problem is that, given the way that the locales package works
(with locales built by locale-gen), there is no means that I know of
to create a dependency on there being a UTF-8 locale built.
TP I don't understand why locales are necessary to allow dictd to read
TP UTF-8
Hi Aleksey,
Aleksey Cheusov wrote:
It is easy. dictd prior to 1.10.1 uses the libc functions
isw{alpha,alnum,...} and tow{upper,lower}
which are locale sensitive.
If you dislike this, upgrade dictd to the latest versions.
Ok, thanks. Kirk, can you upgrade the dictd Debian package to the latest
Hi Thomas, Ludovic,
Ludovic:
Are you sure the sort order is problematic?
Thomas:
Quite strange, since revo works perfectly with dictd, word search
through indexes works.
It is only the 00-* headers missorted. This will have to be fixed,
but it is a minor thing. There is definitely a bug in
Hi,
Kirk Hilliard wrote:
AFAIK, dictd only uses the locale provided to determine sort order for
doing its search of the index. It might be possible to incorporate
UTF-8 character order into dictd itself. This would not be as elegant
from a programming standpoint, and Aleksey might not
Hi Kirk,
Kirk Hilliard wrote:
First, the index is not sorted correctly:
Quite strange, since revo works perfectly with dictd, word search
through indexes works.
The only reason that dictd needs to know locale info, is for sort
order. It looks as if the rest of the index is sorted
Hi Thomas,
Just a note to keep you up to date.
The work we did with indexing multiple alternative 00-database-short
entries is good, and something like it should be included in the revo
package, but the problem definitely lies elsewhere.
I finally downloaded your stuff, and even the database in
Thanks for all your explanations. It is clearer now, and I was able to
give it a try, but it still doesn't work.
[snip]
What I do in the debian/rules is the following:
..
That looks good to me.
I fear that I must have given you a solution for what I assumed was wrong,
not for what the actual problem
Hi Kirk,
Kirk Hilliard wrote:
OK, here is some stuff that should get it working. For production, we
will have to decide if you want to regenerate the indices fully, using
the upstream files and process (which upstream might want to adopt
themselves) or to make a more automated version of
Hi BTS, this is just a note to keep you up to date.
dictd should be able to handle multiple indices for a single database,
running as multiple dictionaries. A small change to dictdconfig allows
it to automatically write the additional database sections to db.list.
Regarding the 00-database-short
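To illustrate the idea of one database file served through several indices, here is a rough sketch of generating one database section per index. It is not dictdconfig's code. The function name database_sections, the file paths, and the database names are all hypothetical, and emitting a name "@..." line per section is an assumption based on the name @* directive discussed elsewhere in this thread.

```python
def database_sections(data_file, indexes):
    """Build dictd-style database sections, one per index file.

    `indexes` maps a database name to a tuple of
    (index_file, headword_holding_the_short_name).
    All names and paths passed in are illustrative only.
    """
    sections = []
    for db_name, (index_file, name_headword) in indexes.items():
        sections.append(
            f"database {db_name} {{\n"
            f'  data  "{data_file}"\n'
            f'  index "{index_file}"\n'
            f'  name  "@{name_headword}"\n'
            f"}}\n"
        )
    return "".join(sections)


if __name__ == "__main__":
    # Hypothetical layout: one compressed database, two language indices.
    print(database_sections(
        "/usr/share/dictd/revo.dict.dz",
        {
            "revo-cs": ("/usr/share/dictd/revo-cs.index", "00-database-short-cs"),
            "revo-de": ("/usr/share/dictd/revo-de.index", "00-database-short-de"),
        },
    ))
```

Each generated section points at the same data file but a different index, which is how a single database could appear to clients as several dictionaries.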
Hi,
Kirk Hilliard wrote:
Presumably you are adding these before the index is built.
No, I'm not, because the indexes are not built: they are included in the
upstream package. Is this a problem ? Should the
00-database-short-$LANGUAGE string be defined also in indexes ?
It is in dictd, but
Hi Thomas,
Presumably you are adding these before the index is built.
No, I'm not, because the indexes are not built: they are included in
the upstream package. Is this a problem ? Should the
00-database-short-$LANGUAGE string be defined also in indexes ?
Yes, the server retrieves the
Hi,
Kirk Hilliard wrote:
Yes, the server retrieves the 00-database-short* entries using a
regular lookup, and thus uses the index. Also, any changes to the
database file that affects the offsets of entries would require an
index rebuild. It should be possible to append entries to the
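Why a change in offsets forces an index rebuild can be seen from a small sketch of an index lookup. It assumes the usual dictd index layout of tab-separated headword, offset, and length, with the two numbers written in dictd's base64-style digits. The helper names are hypothetical, and the linear scan is a simplification (the real server searches the sorted index).

```python
# Digit alphabet assumed for the offset and length fields of a .index line.
B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def b64_to_int(digits: str) -> int:
    """Decode a base-64 number, most significant digit first."""
    value = 0
    for ch in digits:
        value = value * 64 + B64.index(ch)
    return value


def parse_index_line(line: str):
    """Split 'headword<TAB>offset<TAB>length' into its decoded parts."""
    headword, offset, length = line.rstrip("\n").split("\t")[:3]
    return headword, b64_to_int(offset), b64_to_int(length)


def lookup(index_lines, data: bytes, headword: str):
    """Return the bytes the index points at for `headword`, or None."""
    for line in index_lines:
        word, offset, length = parse_index_line(line)
        if word == headword:
            return data[offset:offset + length]
    return None


if __name__ == "__main__":
    data = b"AAAAhello world"
    index = ["hello\tE\tF\n"]          # offset 4, length 5
    print(lookup(index, data, "hello"))
    # Prepending bytes to the data without rebuilding the index
    # makes the same index entry point at the wrong bytes.
    print(lookup(index, b"XX" + data, "hello"))
```

The second lookup shows the failure mode: the index entry is unchanged, but the bytes at that offset are no longer the definition.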
Hi,
[ I'm adding Ludovic Courtès as Cc: of this mail ]
Kirk Hilliard wrote:
You will need to add the desired database names to the database as
definitions for 00-database-short-cs and 00-database-short-de.
Does that work for you?
Yes it seems to work for us:
- we can avoid the symlinks
Hi,
I'll give this patch a try this evening and let you know. Thanks!
Good. I tested dictdconfig, but did not build a database with
multiple 00-database-short-ext entries. From dictd's source, it
appears that name @* directives are implemented and should work, but
I'll feel more confident
Hi,
Kirk Hilliard wrote:
You will need to add the desired database names to the database as
definitions for 00-database-short-cs and 00-database-short-de.
I've fixed the dict-revo Debian package, so that:
1) it doesn't generate useless symlinks, since dictdconfig now allows
multiple
Hi Thomas,
I've fixed the dict-revo Debian package, so that:
1) it doesn't generate useless symlinks, since dictdconfig now allows
multiple indexes for the same database
2) it adds 00-database-short-$LANGUAGE entries at the beginning of the
database. Here's the beginning of the database file
Package: dictd
Version: 1.9.15-1
Severity: wishlist
Hi,
Ludovic Courtès ([EMAIL PROTECTED]) and I are currently trying to
package « Reta Vortaro », an Esperanto dictionary, available under the
GNU GPL at http://www.uni-leipzig.de/esperanto/voko/tgz/index.html.
This dictionary is a bit
Ludovic Courtès ([EMAIL PROTECTED]) and I are currently trying to
package « Reta Vortaro », an Esperanto dictionary
[snip]
a single .dict.dz file ... has several .index files
Great! I think that we can handle it all by modifying dictdconfig a bit.
dictdconfig is driven by index files --