You're not missing anything. These kinds of problems are best handled by 
normalization.

In my earlier post I suggested that we normalize a hyphenated word, say 
"God-ward", to its parts and the whole: "God", "ward" and "Godward".
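
As a rough sketch of that idea (Python for brevity, not JSword or Lucene 
code): split the token on any hyphen-like character, then emit the parts 
followed by the joined whole. The function name and the set of characters 
treated as hyphens are my assumptions for illustration; the en dash is 
included because that is what the KJV module was reported to use.

```python
import re

# Characters treated as hyphen-like joiners (an assumption for this sketch):
# hyphen-minus, hyphen U+2010, non-breaking hyphen U+2011, en dash U+2013.
HYPHENS = "-\u2010\u2011\u2013"
_SPLIT = re.compile("[" + re.escape(HYPHENS) + "]+")


def expand_hyphenated(token):
    """Return the parts of a hyphenated token followed by the joined whole.

    A token with no hyphen-like character is returned as a one-item list.
    """
    parts = [p for p in _SPLIT.split(token) if p]
    if len(parts) < 2:
        return parts
    return parts + ["".join(parts)]


print(expand_hyphenated("God-ward"))
```

If the same expansion is applied both when indexing and when parsing the 
search key, the user no longer needs to know which dash a module uses.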

Solving backward compatibility is fairly simple. Store a version number with 
the built index. If it doesn't match the value the normalizer expects, the 
index is invalid and can't be used. JSword has the code for such a mechanism, 
but it hasn't been woven in. One could go deeper than a single coarse-grained 
version number and have a version number for each feature that is part of an 
index.
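
To illustrate the per-feature variant, here is a minimal sketch (again 
Python, and not the mechanism that exists in JSword). The feature names and 
version numbers are hypothetical; the point is only that an index is usable 
when every version the normalizer expects matches what was recorded at build 
time.

```python
# Versions the current normalizer expects, one per feature.
# The names and numbers are hypothetical, chosen for illustration.
EXPECTED_VERSIONS = {"index": 2, "hyphen_normalization": 1}


def index_is_usable(stored, expected=EXPECTED_VERSIONS):
    """Return True only if every expected feature version matches the
    version recorded with the built index.

    A feature missing from `stored` counts as a mismatch, so an index
    built before the feature existed is rejected.
    """
    return all(stored.get(name) == version for name, version in expected.items())


# An index built before hyphen normalization was introduced.
old_index = {"index": 2}
print(index_is_usable(old_index))
```

A frontend that gets a mismatch can then decide to rebuild the index or fall 
back to searching without it.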

In Him,
        DM
On Mar 3, 2013, at 8:36 AM, Chris Burrell <ch...@burrell.me.uk> wrote:

> I still think normalisation of what is searched for would be good, in that it 
> basically means the user sees the results that he is looking for.
> 
> I understand the concern for backwards compatibility and perhaps that means 
> frontends should be able to turn this normalisation off. But looking ahead, 
> for new front-ends, front-ends that can make rebuilding indexes part of the 
> upgrade to a new version and for all new downloads of frontends, this has to 
> be a benefit.
> 
> Not normalising seems to me like perpetuating an existing problem into all 
> new downloads from this day forth. Or am I missing something?
> Chris
> 
> 
> 
> On 3 March 2013 12:53, Jonathan Morgan <jonmmor...@gmail.com> wrote:
> Another possibly related normalisation problem which BPBible at least has an 
> open issue about is Caesar vs. Cæsar.  Theoretically I guess you want either 
> search to match both forms.  I don't know how Lucene etc. deals with this (if 
> at all).
> 
> Jon
> 
> 
> On Mon, Feb 25, 2013 at 2:48 AM, David Haslam <dfh...@googlemail.com> wrote:
> In the KJV module, if you want to search for [say] the hyphenated name
> "Maher–shalal–hash–baz", you first have to be aware that this module uses
> the ndash in place of the hyphen.
> 
> btw.  It's not so easy to enter the ndash from a keyboard, and probably even
> harder in an Android tablet or mobile.
> 
> If you use an ordinary hyphen-minus in the search key for this module,
> you don't find anything with "Exact phrase".
> If you use "Multi-word", you do find "Maher" highlighted in the found verse.
> (e.g. using Xiphos).
> 
> For modules in general, however, the user cannot usually know in advance
> whether hyphenated words use the ndash, the hyphen or something else.
> 
> Has anyone else looked into this aspect of the search feature?
> 
> David
> 
> 
> 
> 
> 
> --
> View this message in context: 
> http://sword-dev.350566.n4.nabble.com/Searching-for-hyphenated-words-tp4652016.html
> Sent from the SWORD Dev mailing list archive at Nabble.com.
> 
> _______________________________________________
> sword-devel mailing list: sword-devel@crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
> 
