Re: [sword-devel] Languages without a space between words
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
___
sword-devel mailing list: sword-devel@crosswire.org
http://crosswire.org/mailman/listinfo/sword-devel
Instructions to unsubscribe/change your settings at above page
Re: [sword-devel] Languages without a space between words
Thanks, Troy.

One detail you omitted is that search ignores punctuation and is case-insensitive for bicameral scripts. E.g., an exact phrase search of the KJV for “verily verily” will find “Verily, verily, …”

Referring to the earlier discussion, would SWORD search count the ZWNJ as a space?

David

Sent from Proton Mail for iOS

On Tue, Apr 18, 2023 at 01:08, Troy A. Griffitts wrote:

> Great suggestions all. One thing to interject: SWORD raw search simply looks
> for needles in a haystack; it doesn't break words at all in the haystack.
> The multi-word search type breaks the needles up at spaces, e.g., if you
> search for "God love world" and specify multi-word, you effectively get a
> search for 3 needles. The "phrase" search type takes the search term as one
> needle. Whether that would be more or less useful here, I'll let the
> language-informed determine.
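As a rough illustration of the matching behaviour described above (punctuation ignored, case folded), here is a minimal Python sketch. It is not SWORD code: `normalize` and `phrase_match` are hypothetical names, and dropping every Unicode punctuation character is an assumption about what "ignores punctuation" means. It also shows how Python's own Unicode tables classify ZWNJ (U+200C), which says nothing about how SWORD itself treats that character.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def normalize(text: str) -> str:
    """Case-fold, drop punctuation, and collapse whitespace runs to one space."""
    kept = []
    for ch in text.casefold():
        # Unicode general categories starting with "P" are punctuation.
        if unicodedata.category(ch).startswith("P"):
            continue
        kept.append(ch)
    return " ".join("".join(kept).split())


def phrase_match(haystack: str, phrase: str) -> bool:
    """Exact-phrase test on the normalized forms (hypothetical helper)."""
    return normalize(phrase) in normalize(haystack)


if __name__ == "__main__":
    print(phrase_match("Verily, verily, I say unto you", "verily verily"))
    # How Python classifies ZWNJ: a format character, not whitespace.
    print(unicodedata.category(ZWNJ), ZWNJ.isspace())
    # str.split() therefore does not break on it.
    print(len(("a" + ZWNJ + "b").split()))
```

In Python, ZWNJ has general category Cf (format) and is not whitespace, so a whitespace-based splitter would not see it as a word break unless it is told to. Whether SWORD does so would need checking against its source.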
Re: [sword-devel] Languages without a space between words
Great suggestions all. One thing to interject: SWORD raw search simply looks for needles in a haystack; it doesn't break words at all in the haystack. The multi-word search type breaks the needles up at spaces, e.g., if you search for "God love world" and specify multi-word, you effectively get a search for 3 needles. The "phrase" search type takes the search term as one needle. Whether that would be more or less useful here, I'll let the language-informed determine.

On 4/17/23 11:24, Greg Hellings wrote:

> Yes, that looks like the type of thing, although that is for Lucene (Java).
> I don't know the status of CLucene's implementation of that, nor of Xapian's.
> But that would be the proper place for such processing to occur. If those
> libraries do not have one, interested parties could submit one. They could
> probably develop it inside the SWORD library first, to be sure it's doing
> what they want it to do (I believe those filters are designed to be pluggable
> by the calling application), before submitting it to those projects for
> inclusion.
>
> --Greg
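The needle-in-a-haystack model described above can be sketched in a few lines of Python. This is an illustration only, not the SWORD implementation: the function names are hypothetical, and requiring that every needle be present in a verse is an assumption (the message only says the query becomes three needles). Unspaced English stands in for a script without word boundaries.

```python
from typing import Dict, List


def phrase_search(verses: Dict[str, str], query: str) -> List[str]:
    """Treat the whole query as a single needle (plain substring test)."""
    return [ref for ref, text in verses.items() if query in text]


def multiword_search(verses: Dict[str, str], query: str) -> List[str]:
    """Split the query at spaces into needles; assume all must occur."""
    needles = query.split()
    return [
        ref
        for ref, text in verses.items()
        if all(needle in text for needle in needles)
    ]


if __name__ == "__main__":
    # Second entry mimics text written without spaces between words.
    verses = {
        "spaced": "For God so loved the world",
        "unspaced": "ForGodsolovedtheworld",
    }
    print(multiword_search(verses, "God love world"))
    print(phrase_search(verses, "God so loved"))
```

Because the haystack is never split into words, the multi-word search still finds the unspaced verse as long as the user types spaces between the needles. The phrase search only matches when the query is typed exactly as the text is stored, spaces included or omitted.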
Re: [sword-devel] Languages without a space between words
Yes, that looks like the type of thing, although that is for Lucene (Java). I don't know the status of CLucene's implementation of that, nor of Xapian's. But that would be the proper place for such processing to occur. If those libraries do not have one, interested parties could submit one. They could probably develop it inside the SWORD library first, to be sure it's doing what they want it to do (I believe those filters are designed to be pluggable by the calling application), before submitting it to those projects for inclusion.

--Greg

On Mon, Apr 17, 2023 at 1:12 PM David Haslam wrote:

> Thanks, Greg.
>
> I just came across this:
>
> https://lucene.apache.org/core/3_2_0/api/contrib-analyzers/org/apache/lucene/analysis/th/ThaiWordFilter.html
>
> Is that the kind of thing you were thinking of?
>
> David
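To make the "pluggable filter" idea above concrete, here is a minimal Python sketch of an inverted index whose tokenizer is supplied by the caller. It does not use the CLucene or Xapian APIs; every name is hypothetical. The character-bigram tokenizer is a stand-in technique chosen for brevity, not something proposed in the thread, and a real Thai analyzer would be dictionary- or model-based.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set

Tokenizer = Callable[[str], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    """Default behaviour: one token per whitespace-delimited run."""
    return text.split()


def bigram_tokenizer(text: str) -> List[str]:
    """Stand-in for a language-specific filter: overlapping character pairs."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


def build_index(verses: Dict[str, str], tokenizer: Tokenizer) -> Dict[str, Set[str]]:
    """Map each token to the set of verse references containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for ref, text in verses.items():
        for token in tokenizer(text):
            index[token].add(ref)
    return dict(index)


if __name__ == "__main__":
    verses = {"v1": "abcd"}  # unspaced text: a single run of characters
    print(sorted(build_index(verses, whitespace_tokenizer)))
    print(sorted(build_index(verses, bigram_tokenizer)))
```

With the whitespace tokenizer the whole unspaced verse becomes one index term, which is why word searches fail on such text. Swapping in a different tokenizer changes the index without touching `build_index`, which is the property a pluggable analyzer is meant to provide.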
Re: [sword-devel] Languages without a space between words
Thanks, Greg.

I just came across this:

https://lucene.apache.org/core/3_2_0/api/contrib-analyzers/org/apache/lucene/analysis/th/ThaiWordFilter.html

Is that the kind of thing you were thinking of?

David

Sent from Proton Mail for iOS

On Mon, Apr 17, 2023 at 17:51, Greg Hellings <greg.helli...@gmail.com> wrote:

> I don't believe you're going to get that sort of feature directly in the
> engine's simple search.
>
> However, if you're using a build of the library that utilizes CLucene or
> Xapian, then that should be the function of those libraries. They are
> supposed to be able to handle all of that type of functionality if the
> language has a corresponding contribution to that library. It might be better
> to check in with them.
>
> --Greg
Re: [sword-devel] Languages without a space between words
I don't believe you're going to get that sort of feature directly in the engine's simple search.

However, if you're using a build of the library that utilizes CLucene or Xapian, then that should be the function of those libraries. They are supposed to be able to handle all of that type of functionality if the language has a corresponding contribution to that library. It might be better to check in with them.

--Greg

On Mon, Apr 17, 2023 at 11:46 AM David Haslam wrote:

> Unlike Hebrew and Arabic, etc., none of the names of the Thai Unicode
> characters contain the word FINAL. Likewise for Myanmar letters.
>
> A possible way forward might be to run one of the several word
> segmentation programs on the text of the ThaiKJV.
>
> Examples: KuCut, DeepCut, AttaCut
>
> This should insert a Unicode zero width non-joiner (ZWNJ) as a word
> separator.
>
> NB. The module would have to be updated using the segmented source text.
>
> Visually, the resulting text would display the same as the original, but
> the module would be amenable to indexing for word searches.
>
> A difficulty that might then arise is how the front-end user might enter
> the search query for an exact phrase search type (containing more than one
> word). Other search types (all words, any word) might be OK as is.
>
> Aside: The KuCut method developed in 2004 was originally trained using the
> text of the ThaiKJV.
>
> Regards,
>
> David
Re: [sword-devel] Languages without a space between words
Unlike Hebrew and Arabic, etc., none of the names of the Thai Unicode characters contain the word FINAL. Likewise for Myanmar letters.

A possible way forward might be to run one of the several word segmentation programs on the text of the ThaiKJV.

Examples: KuCut, DeepCut, AttaCut

This should insert a Unicode zero width non-joiner (ZWNJ) as a word separator.

NB. The module would have to be updated using the segmented source text.

Visually, the resulting text would display the same as the original, but the module would be amenable to indexing for word searches.

A difficulty that might then arise is how the front-end user might enter the search query for an exact phrase search type (containing more than one word). Other search types (all words, any word) might be OK as is.

Aside: The KuCut method developed in 2004 was originally trained using the text of the ThaiKJV.

Regards,

David

Sent from Proton Mail for iOS

On Mon, Apr 17, 2023 at 17:16, Peter Von Kaehne <ref...@gmx.net> wrote:

> Do Thai, Burmese, etc. use end forms for letters? If so, are these encoded
> as such?
>
> Peter
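The proposal above (segment the source text, store an invisible separator between words, then index on that separator) can be sketched as follows. This is a toy under stated assumptions: `segment` is a hypothetical greedy longest-match segmenter over a tiny hand-made lexicon, standing in for tools such as KuCut, DeepCut or AttaCut, whose real interfaces are not used here. Unspaced English stands in for Thai.

```python
from typing import List, Set

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, the separator proposed in the thread


def segment(text: str, lexicon: Set[str]) -> List[str]:
    """Greedy longest match; unknown characters become one-character tokens."""
    max_len = max(map(len, lexicon), default=1)
    words: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def mark_boundaries(text: str, lexicon: Set[str]) -> str:
    """Return the text with ZWNJ inserted between segmented words."""
    return ZWNJ.join(segment(text, lexicon))


def index_words(marked: str) -> List[str]:
    """Recover the word list by splitting on the stored separator."""
    return marked.split(ZWNJ)


if __name__ == "__main__":
    lexicon = {"in", "the", "beginning"}
    marked = mark_boundaries("inthebeginning", lexicon)
    print(index_words(marked))
    # Removing the separators gives back the original character sequence.
    print(marked.replace(ZWNJ, "") == "inthebeginning")
```

The phrase-search difficulty raised above shows up directly: the stored text now contains separators the user cannot see or easily type, so a query would need the same segmentation applied (or the separators stripped from both sides) before an exact-phrase comparison. As a side note, Unicode provides ZERO WIDTH SPACE (U+200B) specifically for marking invisible word boundaries, which may be worth weighing against ZWNJ.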
Re: [sword-devel] Languages without a space between words
I would think that for search to work there needs to be a way of knowing where words end.

Options are:

1) If there is another way of knowing where words end (end forms encoded into the text), then we could adapt the search.

2) If there is not, then translators should consider adding ZWNJ spaces between words for this purpose. We could then, per language, use that as a word separator.

Peter

Sent: Monday, 17 April 2023 at 16:47
From: "David Haslam"
To: sword-devel@crosswire.org
Subject: [sword-devel] Languages without a space between words

> How (if at all) does the SWORD API generate a search index for a module
> that is for a language without a space between words?
>
> Please consider how best to generate a useful search index for modules that
> are for Bible translations in languages that have no spaces between words.
>
> Example: CrossWire module ThaiKJV
>
> See
> https://en.wikipedia.org/wiki/Category:Writing_systems_without_word_boundaries
>
> Has this ever been considered before?
>
> Best regards,
>
> David
Re: [sword-devel] Languages without a space between words
Do Thai, Burmese, etc. use end forms for letters? If so, are these encoded as such?

Peter

Sent: Monday, 17 April 2023 at 16:47
From: "David Haslam"
To: sword-devel@crosswire.org
Subject: [sword-devel] Languages without a space between words

> How (if at all) does the SWORD API generate a search index for a module
> that is for a language without a space between words?
>
> Please consider how best to generate a useful search index for modules that
> are for Bible translations in languages that have no spaces between words.
>
> Example: CrossWire module ThaiKJV
>
> See
> https://en.wikipedia.org/wiki/Category:Writing_systems_without_word_boundaries
>
> Has this ever been considered before?
>
> Best regards,
>
> David
[sword-devel] Languages without a space between words
How (if at all) does the SWORD API generate a search index for a module that is for a language without a space between words?

Please consider how best to generate a useful search index for modules that are for Bible translations in languages that have no spaces between words.

Example: CrossWire module ThaiKJV

See https://en.wikipedia.org/wiki/Category:Writing_systems_without_word_boundaries

Has this ever been considered before?

Best regards,

David

Sent from Proton Mail for iOS