> > If the input is in
> > multiple (Indic) scripts, and let's assume that the audience
> > (which may be a single person just asking for a sorted list
> > of his/her files) can read the Indic scripts used, it may be
> > helpful to interleave. (But I will not push this.)
> 
>       Now let's assume that person can't read all the scripts.  Then they
> get lots of unintelligible garbage in their sort.  This, and the upside is
> "may be helpful".  Which side did you say you're making the case for?

Garbage in, garbage out. If you didn't want unintelligible garbage in the
output, you shouldn't have put it in the input, and no sort procedure is
going to remove it. The user who can't read all the scripts is not the
interesting case here, because it doesn't really matter to them whether the
garbage is interfiled or at the end.

What's the actual usage pattern for multi-lingual sorts? Possibly the most 
common case, IMO, is a collection of Serbian or Tibetan or Sanskrit or Hebrew 
data in mixed scripts; the most convenient thing to do there is to interfile.
Another common case is computer directory listings in English and some other
language, which should probably be separate; but that's Latin, which is out
of the scope of this discussion. Again, a Serbian user would probably like
Latin and Cyrillic interfiled, and someone working on paleo-Hebrew or Sanskrit
would probably like their characters interfiled. I've never seen a multi-script
index; is there any real legacy behavior here, besides computer programs which
were forced to do something?
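
To make the two behaviours concrete for the Serbian case, here is a rough
Python sketch. It is only an illustration under stated assumptions: the
word list is made up, `latin_key` is a hypothetical helper, and the
interfiled order is faked by transliterating Cyrillic to Latin and comparing
the results as plain strings. A real implementation would tailor a proper
collation table instead, and this shortcut does not order letters with
diacritics (č, š, ž, ...) correctly.

```python
# Hypothetical Serbian Cyrillic -> Latin table (lowercase only).
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ",
    "е": "e", "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k",
    "л": "l", "љ": "lj", "м": "m", "н": "n", "њ": "nj", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "ћ": "ć", "у": "u",
    "ф": "f", "х": "h", "ц": "c", "ч": "č", "џ": "dž", "ш": "š",
}


def latin_key(word):
    """Sort key: lowercase the word and map Cyrillic letters to Latin."""
    return "".join(CYR_TO_LAT.get(ch, ch) for ch in word.lower())


# Made-up sample data: place names in mixed Latin and Cyrillic.
words = ["Zemun", "Београд", "Novi Sad", "Крагујевац", "Subotica", "Ваљево"]

# Code point order: every Latin entry sorts before every Cyrillic entry.
grouped = sorted(words)

# Interfiled: both scripts share one alphabetical sequence.
interfiled = sorted(words, key=latin_key)

print(grouped)
print(interfiled)
```

With this data the first list keeps Novi Sad, Subotica and Zemun ahead of all
the Cyrillic entries, while the second runs Београд, Крагујевац, Novi Sad,
Subotica, Ваљево, Zemun, which is what I mean by interfiling.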