Hi Daniel,

Code points are not the only way to sort it. However, a comparison
function does need to be defined that compares two words and reports
which one sorts later. It has to be used consistently, from module
creation to frontend. SWORD could provide a library of predefined
comparators, but you would need one for each sort order you wanted
(which approaches one per language).
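To make the comparator idea concrete, here is a rough sketch in Python (not SWORD code). The alphabet table and the names vi_sort_key, compare, build_index and find are all made up for illustration, and the Vietnamese handling is simplified: tone marks are ignored and a leading apostrophe is skipped. The only point is that the module builder and the frontend share one key, so a binary search still works on the stored order.

```python
import bisect
import unicodedata

# Hypothetical table: Vietnamese base letters in alphabetical order.
VI_ALPHABET = "aăâbcdđeêghiklmnoôơpqrstuưvxy"
VI_RANK = {ch: i for i, ch in enumerate(VI_ALPHABET)}
# Combining tone marks: grave, acute, tilde, hook above, dot below.
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def vi_sort_key(word):
    """Return a key that orders words by VI_ALPHABET instead of code point."""
    text = word.lower().lstrip("'")           # skip leading apostrophes
    decomposed = unicodedata.normalize("NFD", text)
    toneless = "".join(ch for ch in decomposed if ch not in TONE_MARKS)
    letters = unicodedata.normalize("NFC", toneless)  # keeps ă, â, ê, ô, ơ, ư
    # Characters outside the table sort after every known letter.
    ranks = tuple(VI_RANK.get(ch, len(VI_ALPHABET) + ord(ch)) for ch in letters)
    # The original word breaks ties so the order is total.
    return (ranks, word)


def compare(a, b):
    """Classic comparator: -1 if a sorts first, 1 if b sorts first, else 0."""
    ka, kb = vi_sort_key(a), vi_sort_key(b)
    return (ka > kb) - (ka < kb)


def build_index(words):
    """Module-creation side: store entries in comparator order."""
    return sorted(words, key=vi_sort_key)


def find(index, query):
    """Frontend side: binary search using the same key as build_index."""
    keys = [vi_sort_key(w) for w in index]
    return bisect.bisect_left(keys, vi_sort_key(query))
```

With this key, "đi" lands between "dê" and "em" rather than after "em", and "'tis" sorts under "t". A real frontend would not rebuild the key list on every lookup as find does here; that is only to keep the sketch short.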
Personally, I don't find that sorted order is particularly important in
dictionaries - I would type in a word, and then hope that if it is a
different form of the word it would be relatively close. Some frontends
may not give the ability to type in words, though. But I haven't used
dictionaries in other languages, so it may be different for them -
especially once diacritics are involved.

The reasons why dictionaries are different from Bibles are:
1) Bibles have a known structure, which is hardcoded in the key type
   (this is going to be able to change soon, for alternate versification,
   though - probably leading to less efficient modules)
2) Dictionaries can be much, much larger - Webster's is a 14 MB download
   compressed, as compared to the WEB's ~1.5 MB

That's not to say the dictionaries can't be done more efficiently than
they are currently. Looking at the code, they could be quicker for the
(common?) case of incrementing a module. Currently they do a binary
search for every increment. Further, they could probably be optimized
for key retrieval - which is the really important thing here. (For
example by storing the keys separately, uncompressed, 1 key per line)

God Bless,
Ben
-------------------------------------------------------------------------------------------
The Lord is not slow to fulfill his promise as some count slowness,
but is patient toward you, not wishing that any should perish,
but that all should reach repentance.
2 Peter 3:9 (ESV)

On Thu, Sep 18, 2008 at 11:21 AM, Daniel Owens <[EMAIL PROTECTED]> wrote:

> Is code point order the ONLY way to sort dictionary entries? Surely
> there is a solution which would retain the printed or intended order of
> dictionary entries without giving up lots of efficiency. If not, I
> think users would prefer a correctly ordered but slower dictionary to
> one which is fast but jumbled up.
>
> At the very least, even if dictionaries aren't sorted by the printed
> order, they should AT LEAST be in alphabetical order.
> To me that is a non-negotiable for a dictionary--people depend on
> dictionaries being in the right order, and code point order disturbs
> that for some languages. Here are a couple of ideas:
> - Could a configuration file of some sort be created to define a
>   sorted order for a given language that would actually be in
>   alphabetical order?
> - Could a dictionary index be created to handle large dictionaries
>   which allows for the retention of the correct order of entries
>   (whether that is the printed order or alphabetical order)?
> - Bibles are not ordered by code point, and we are able to search them
>   fairly quickly. Do dictionaries need to be compiled in a fashion
>   similar to Bibles?
>
> As it stands, dictionaries are NOT displayed in alphabetical order (at
> least not Vietnamese, and apparently Farsi), which at best looks silly
> to the user and at worst means you have to manually hunt around to find
> the right entry, making a Genbook more efficient for the user in the
> end. But then you lose the dictionary lookup feature.
>
> Daniel
>
> Ben Morgan wrote:
>
> The issue with ordering as I understand it is that if it is in (some
> form of) sorted order, you can use binary search to find entries.
> If you want order retained, it is best to use a genbook - but it won't
> be as efficient, and may not have as good UI support.
> With huge English dictionaries (like Webster's, for instance) this
> becomes very important.
>
> From BPBible's perspective, dictionary handling is done as follows:
> 1. Read the index of the dictionary and divide by 4 or 6 to get the
>    length (depending on the driver)
> 2. Set the virtual list length to the dictionary length
> 3. When any item is displayed in the virtual list, it retrieves it from
>    the module.
> 4. When the user starts typing in the text box above, it does a binary
>    search to find which item to display.
>
> 4 is already quite slow enough on big dictionaries - by having it
> unsorted, it would make it quite a lot slower, I imagine.
> All the keys from the module would have to be read in, which takes a
> while.
>
> God Bless,
> Ben
> -------------------------------------------------------------------------------------------
> The Lord is not slow to fulfill his promise as some count slowness,
> but is patient toward you, not wishing that any should perish,
> but that all should reach repentance.
> 2 Peter 3:9 (ESV)
>
> On Thu, Sep 18, 2008 at 12:43 AM, Daniel Owens <[EMAIL PROTECTED]> wrote:
>
> mention that byte ordering does some strange things to Vietnamese
> dictionaries. The Vietnamese script is a Latin script, but because it
> uses some odd characters code point ordering results in illogical and
> non-alphabetical entry ordering. For example, the "d" with a line
> through it (đ) gets relegated to near the end of the dictionary instead
> of after the regular "d" or anything with an apostrophe at the
> beginning of a word or phrase gets moved to the top of the list
> regardless of the first letter (such as 'tis). I am supportive of the
> IIRC general opinion. Let the module creator worry about the ordering.
> Otherwise you get some very strange dictionary behavior.
>
> ------------------------------
>
> _______________________________________________
> sword-devel mailing list: [EMAIL PROTECTED]
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
>
> --
> PMBX license 1502
>
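The lookup steps quoted above can be sketched roughly as follows (Python). This assumes "divide by 4 or 6" means dividing the index size in bytes by a fixed record size, and fetch_key is a hypothetical callback standing in for reading one key from the module; it is not a SWORD or BPBible API. The keys are assumed to be stored in the same order the comparison uses (plain string order here).

```python
import bisect


def entry_count(index_size_bytes, record_size):
    """Step 1: number of entries = index size / bytes per index record."""
    if record_size not in (4, 6):
        raise ValueError("record_size is assumed to be 4 or 6")
    return index_size_bytes // record_size


class LazyKeys:
    """Steps 2 and 3: a list-like view that reads a key only when asked."""

    def __init__(self, count, fetch_key):
        self._count = count
        self._fetch_key = fetch_key  # hypothetical: position -> key string

    def __len__(self):
        return self._count

    def __getitem__(self, position):
        if not 0 <= position < self._count:
            raise IndexError(position)
        return self._fetch_key(position)


def first_match(keys, typed):
    """Step 4: binary search for the first key >= the typed text."""
    return bisect.bisect_left(keys, typed)
```

On a sorted module with 1000 entries this touches at most 10 keys per lookup. If the keys were not in sorted order, a binary search would not be valid, and every key would have to be read to find the match.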
