Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-12-10 Thread David Haslam
There are some languages in which the apostrophe is used as a letter of the alphabet rather than an item of punctuation, e.g. Somali, in which the apostrophe represents the /Alef/. See http://en.wikipedia.org/wiki/Somali_alphabet Guessing that our Lucene indexing method generally strips out such

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-12-10 Thread DM Smith
IIRC, the StandardAnalyzer that SWORD uses doesn't allow for that. It has its own fixed handling of punctuation. As I've said before, the analyzer is only good for English-like languages. In Him, DM On Dec 10, 2012, at 11:17 AM, David Haslam dfh...@googlemail.com wrote:

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-27 Thread Peter von Kaehne
Developers' Collaboration Forum sword-devel@crosswire.org Subject: Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version On Mon, Nov 26, 2012 at 11:15 PM, Nic Carter niccar...@mac.com wrote: My understanding is that we are currently locked into a really old version

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Chris Little
You're talking about vowels, not shaping. Shaping in Arabic changes the shape of the letter according to its context in the word (initial, medial, final, or isolated). I imagine unshaped Arabic would be very difficult to read. Arabic without vowel marks, on the other hand, is standard. I
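The distinction between shaping and vowel marks can be checked at the code point level. The following is a minimal sketch using only Python's standard `unicodedata` module; the helper names `unshape` and `is_vowel_mark` are made up for illustration and are not part of SWORD or JSword. It assumes that "shaped" text means the contextual glyphs encoded in the Arabic Presentation Forms-B block, which NFKC normalization folds back to the base letter, whereas vowel marks such as FATHA are separate nonspacing combining characters.

```python
import unicodedata

# Contextual (shaped) forms of the letter YEH in Arabic Presentation Forms-B.
YEH_FORMS = {
    "isolated": "\ufef1",
    "final": "\ufef2",
    "initial": "\ufef3",
    "medial": "\ufef4",
}


def unshape(text: str) -> str:
    """Fold presentation-form code points back to base letters via NFKC."""
    return unicodedata.normalize("NFKC", text)


def is_vowel_mark(ch: str) -> bool:
    """True for nonspacing combining marks (category Mn), e.g. FATHA U+064E."""
    return unicodedata.category(ch) == "Mn"


if __name__ == "__main__":
    for name, form in YEH_FORMS.items():
        base = unshape(form)
        print(f"{name}: U+{ord(form):04X} -> U+{ord(base):04X}")
    print("FATHA is a vowel mark:", is_vowel_mark("\u064e"))
    print("YEH is a vowel mark:", is_vowel_mark("\u064a"))
```

All four contextual forms map to the same base letter U+064A, so shaping is a rendering matter; the vowel marks are what add extra code points to the stored text.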

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread pola ashraf
someone in this list report them about all this discussion :) So now we know the problem and the solution . Date: Mon, 26 Nov 2012 01:05:16 -0800 From: chris...@crosswire.org To: sword-devel@crosswire.org Subject: Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version You're

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Peter von Kaehne
@crosswire.org Subject: Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version Sorry for choosing the wrong word. This Wikipedia article covers the topic: https://en.wikipedia.org/wiki/Arabic_diacritics Thanks Chris for your reply about the filter. Actually I don't

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread David Haslam
Which (I suppose) would have been a patch to the SWORD API? So a similar patch would be necessary in principle to JSword? David

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Peter von Kaehne
From: David Haslam dfh...@googlemail.com So a similar patch would be necessary in principle to JSword ??? No. If And Bible does not have a problem, then JSword does its job correctly. Peter

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Greg Hellings
On Mon, Nov 26, 2012 at 6:22 AM, Peter von Kaehne ref...@gmx.net wrote: From: David Haslam dfh...@googlemail.com So a similar patch would be necessary in principle to JSword ??? No. If And Bible does not have a problem, then JSword does its job correctly. However, BibleTime would require

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread DM Smith
Correct. JSword uses Lucene's filter for the language, which does more normalization than the StandardAnalyzer which SWORD uses exclusively. The StandardAnalyzer should only be used for unaccented latinate text. Same with the SimpleAnalyzer. (In Lucene, an analyzer is a filter chain which
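The idea of an analyzer as a chain of filters can be illustrated with a toy model. This is a sketch in plain Python, not Lucene's or CLucene's API; `make_analyzer` and `strip_marks` are hypothetical names, and the whitespace tokenizer is a simplifying assumption. It shows why a chain without diacritic folding fails to match a vocalized indexed word against an unvocalized query, while a chain with that extra filter matches.

```python
import unicodedata
from typing import Callable, Iterable, List

TokenFilter = Callable[[str], str]


def strip_marks(token: str) -> str:
    """Drop nonspacing combining marks (category Mn) from a token."""
    return "".join(ch for ch in token if unicodedata.category(ch) != "Mn")


def make_analyzer(filters: Iterable[TokenFilter]) -> Callable[[str], List[str]]:
    """Build an analyzer: whitespace tokenizer followed by a filter chain."""
    chain = list(filters)

    def analyze(text: str) -> List[str]:
        tokens = text.split()
        for token_filter in chain:
            tokens = [token_filter(t) for t in tokens]
        return tokens

    return analyze


# A chain that only lowercases, and one that also folds diacritics.
plain_analyzer = make_analyzer([str.lower])
folding_analyzer = make_analyzer([str.lower, strip_marks])

indexed_word = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # vocalized form
query_word = "\u064a\u0633\u0648\u0639"  # unvocalized form

if __name__ == "__main__":
    print(plain_analyzer(indexed_word) == plain_analyzer(query_word))
    print(folding_analyzer(indexed_word) == folding_analyzer(query_word))
```

The same chain has to run over both the indexed text and the query; otherwise the stored terms and the search terms are normalized differently and cannot match.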

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Greg Hellings
On Mon, Nov 26, 2012 at 8:12 AM, DM Smith dmsm...@crosswire.org wrote: Correct. JSword uses Lucene's filter for the language, which does more normalization than the StandardAnalyzer which SWORD uses exclusively. The StandardAnalyzer should only be used for unaccented latinate text. Same

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread pola ashraf
: Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version On Mon, Nov 26, 2012 at 8:12 AM, DM Smith dmsm...@crosswire.org wrote: Correct. JSword uses Lucene's filter for the language, which does more normalization than the StandardAnalyzer which SWORD uses exclusively

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-26 Thread Nic Carter
My understanding is that we are currently locked into a really old version of the C library that is no longer being maintained. Instead we need to port SWORD to use the current version of the library, which is actively being maintained... I gather some work has been done on this but I'm not sure

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-25 Thread pola ashraf
I think Arabic shapes add extra Unicode characters that's why the 2 same words - i mentioned before - don't give the same results -- Any Arabic search problem is unconnected to shaping. Modules are routinely created and stored in a normalised format, user entries, e.g. for

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-25 Thread pola ashraf
Using a comparison tool from ICU the two strings resulted in different character numbers Words to compare يَسُوعَ يسوع Which is the Name of JESUS Christ in Arabic but one is shaped and the other isn't Words converted to HEX Format \u064a \u064e \u0633 \u064f \u0648 \u0639 \u064e \u064a \u0633
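The comparison described above can be reproduced without ICU. The following is a minimal sketch using Python's standard `unicodedata` module, with the code points taken from the message; `code_points` and `strip_marks` are illustrative helper names, and removing every character of category Mn is an assumption about what "ignoring the marks" should mean, not a description of what any CrossWire front end does.

```python
import unicodedata

VOCALIZED = "\u064a\u064e\u0633\u064f\u0648\u0639\u064e"  # with vowel marks
PLAIN = "\u064a\u0633\u0648\u0639"  # without vowel marks


def code_points(text: str) -> list:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def strip_marks(text: str) -> str:
    """Decompose the text, then drop nonspacing marks (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    print(code_points(VOCALIZED))
    print(code_points(PLAIN))
    print("raw strings equal:", VOCALIZED == PLAIN)
    print("equal after stripping marks:", strip_marks(VOCALIZED) == PLAIN)
```

The vocalized string has seven code points and the plain one has four; the three extra ones are U+064E, U+064F and U+064E. Once they are removed the two strings compare equal, which is the normalization a search index would need to apply to both the text and the query.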

[sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread pola ashraf
Hi, Sorry for posting a lot these days :) I have been doing many searches, readings and experiments on CrossWire programs and modules. I found that I can't search in the SVD Bible, since all words are shaped while I type unshaped search words. For example, searching for يسوع is not equal يَسُوعَ

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread David Haslam
Pola, For several very valid reasons we never copy e-Sword source texts to make SWORD modules hosted by CrossWire. /Please do not go down that route/. David

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread David Haslam
Pola wrote: "Currently I am thinking of using mod2osis to extract the OSIS source text, then using any program that can remove all Arabic shapes, then packaging it again using osis2mod." Please understand that a round trip using mod2osis and osis2mod is highly deprecated. Information will always be lost due to

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread David Haslam
Pola wrote, The permanent solution is to make search indexes ignore all Arabic shapes Indeed, this would be true for all similar scripts that used glyph shaping. Not just those in the Arabic/Persian family either. The fundamental problem has been identified and described. We really do need a

Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version

2012-11-24 Thread pola ashraf
for accuracy of some words Date: Sat, 24 Nov 2012 09:12:25 -0800 From: dfh...@googlemail.com To: sword-devel@crosswire.org Subject: Re: [sword-devel] Search bug New Arabic Bible, Not Shaped SVD Version Pola wrote, The permanent solution is to make search indexes ignore all Arabic shapes