Assuming I've understood you correctly... It should be as transparent as possible. Biblical scholars are not *necessarily* technically savvy. Witness the fact that a number of my colleagues still type Hebrew 'backwards' using legacy systems... One should not presume technical prowess; at best, one can presume Biblical literacy.
K

----- Original Message -----
From: "Jony Rosenne" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, July 26, 2003 2:27 AM
Subject: RE: Yerushala(y)im - or Biblical Hebrew

> I don't think that it is important that the user not be aware of the
> encoding, since it is only intended for Biblical scholars.
>
> Jony
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> > [mailto:[EMAIL PROTECTED] On Behalf Of Kenneth Whistler
> > Sent: Saturday, July 26, 2003 3:50 AM
> > To: [EMAIL PROTECTED]
> > Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> > Subject: Re: Yerushala(y)im - or Biblical Hebrew
> >
> > Peter wrote:
> >
> > > One thought: Ken has suggested CGJ be used to prevent reordering of
> > > combining marks in fixed position classes such as the Hebrew vowels,
> > > and also suggested that users should not need to be aware of the need
> > > for CGJ for this purpose but that software can be implemented in a way
> > > that hides that detail. I'm not sure how that will work,
> >
> > Details TBD, of course, but the essence of it is that you
> > want the user experience of inserting patah + hiriq
> > to correspond to the backing store insertion of <patah, CGJ,
> > hiriq>, without making them explicitly have to know about or
> > type a "CGJ" key. There are various input and editing
> > strategies to accomplish this -- effectively the problem is
> > similar to other needs to tuck hidden characters away in the
> > backing store for bidirectional text.
> >
> > The situation for searching is a little different. While the
> > editing tools may be smart about the Biblical Hebrew points,
> > a typical query widget might not, so in that instance, you
> > want a query on <patah, hiriq> to match the repository store
> > instance of <patah, CGJ, hiriq>. Well, format controls and
> > some other characters (including CGJ) are ordinarily supposed
> > to be ignored for searching -- unless you have specialized
> > tailorings for them. So the ordinary strategy would be to
> > keep the repository normalized, and then before local
> > comparison against the query string, strip out the CGJ for
> > the match. The situation is more complicated if the query
> > string doesn't use a CGJ *and* gets normalized. In that
> > situation, you lose the distinction in order, of course, but
> > the search strategy should be to strip out the CGJ locally
> > and renormalize. That could result in false positive matches,
> > of course, but at least you will find what you were looking for.
> >
> > > but it's making me wonder if
> > > effectively we'd be looking at some amendment to the normalization
> > > algorithms to insert CGJ in certain enumerated contexts.
> >
> > No.
> >
> > --Ken
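
For illustration only, the search strategy Ken describes in the quoted message (strip the CGJ locally, then renormalize before comparing) might look roughly like this in Python. This is a sketch, not anything proposed in the thread: the helper names search_key and matches, the choice of NFC, and the sample lamed/final-mem string are all assumptions made for the example.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER (combining class 0)
LAMED = "\u05DC"      # HEBREW LETTER LAMED
PATAH = "\u05B7"      # HEBREW POINT PATAH (combining class 17)
HIRIQ = "\u05B4"      # HEBREW POINT HIRIQ (combining class 14)
FINAL_MEM = "\u05DD"  # HEBREW LETTER FINAL MEM


def search_key(text: str) -> str:
    """Hypothetical helper: drop CGJ, then renormalize so that the
    remaining marks fall back into canonical order."""
    return unicodedata.normalize("NFC", text.replace(CGJ, ""))


def matches(query: str, stored: str) -> bool:
    """Hypothetical helper: compare query and stored text on their
    CGJ-free, renormalized forms."""
    return search_key(query) in search_key(stored)


# Repository text: normalized, with CGJ keeping patah ahead of hiriq.
stored = unicodedata.normalize("NFC", LAMED + PATAH + CGJ + HIRIQ + FINAL_MEM)

# Query typed without CGJ; normalizing it reorders the marks to hiriq, patah.
query = LAMED + PATAH + HIRIQ + FINAL_MEM
normalized_query = unicodedata.normalize("NFC", query)

naive_hit = query in stored              # plain substring test, no stripping
tolerant_hit = matches(query, stored)    # strip CGJ and renormalize first
reordered_hit = matches(normalized_query, stored)
# A query that really meant <hiriq, patah> also matches: the false
# positive mentioned in the quoted message.
false_positive = matches(LAMED + HIRIQ + PATAH + FINAL_MEM, stored)
```

Here the plain substring test fails because of the intervening CGJ, while the stripped and renormalized comparison succeeds for both the raw and the normalized query. The price is that mark order is no longer distinguished, so the <hiriq, patah> query matches as well.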