Hi,

On Sat, 2005-12-24 at 11:10 +0100, Mark Wielaard wrote:
> But you are right that we could/should probably add most things from the
> weak and neutral category, which we know won't "disrupt"
> left-to-rightness:
>
> DIRECTIONALITY_EUROPEAN_NUMBER (EN)
> DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR (ES)
> DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR (ET)
> DIRECTIONALITY_ARABIC_NUMBER (AN)
> DIRECTIONALITY_COMMON_NUMBER_SEPARATOR (CS)
> DIRECTIONALITY_SEGMENT_SEPARATOR (S)
> DIRECTIONALITY_WHITESPACE (WS)
>
> I am not sure we should test for the others. I have been conservative
> with the above list (just so I don't have to read the whole bidi
> algorithm description). The idea behind requiresBidi() is that it is a
> quick way to determine whether to do full bidirectional analysis or not
> (or actually if the whole paragraph text is written left-to-right). So
> false positives aren't really a problem. It just means that you have to
> follow the full algorithm to get the full answer.

This patch implements the above.

2005-12-31  Mark Wielaard  <[EMAIL PROTECTED]>

    * java/text/Bidi.java (requiresBidi): Also test against
    character types L, EN, ES, ET, AN, CS, S and WS.

Committed,

Mark

Index: java/text/Bidi.java
===================================================================
RCS file: /cvsroot/classpath/classpath/java/text/Bidi.java,v
retrieving revision 1.1
diff -u -r1.1 Bidi.java
--- java/text/Bidi.java 23 Dec 2005 18:27:21 -0000 1.1
+++ java/text/Bidi.java 31 Dec 2005 10:44:10 -0000
@@ -44,24 +44,32 @@
  * TODO/FIXME Only one method <code>requiresBidi</code> is implemented
  * for now by using <code>Character</code>. The full algorithm is <a
  * href="http://www.unicode.org/unicode/reports/tr9/">Unicode Standard
- * Annex #9: The Bidirectional Algorithm</a>. A ful implementation is
+ * Annex #9: The Bidirectional Algorithm</a>. A full implementation is
  * <a href="http://fribidi.org/">GNU FriBidi</a>.
  */
 public class Bidi
 {
   /**
    * Returns false if all characters in the text between start and end
-   * are all left-to-right text. WARNING, this implementation is
-   * slow, it calls <code>Character.getDirectionality(char)</code> on
-   * all characters.
+   * are all left-to-right text. This implementation is just calls
+   * <code>Character.getDirectionality(char)</code> on all characters
+   * and makes sure all characters are either explicitly left-to-right
+   * or neutral in directionality (character types L, EN, ES, ET, AN,
+   * CS, S and WS).
    */
   public static boolean requiresBidi(char[] text, int start, int end)
   {
-    final int LEFT_TO_RIGHT = Character.DIRECTIONALITY_LEFT_TO_RIGHT;
     for (int i = start; i < end; i++)
       {
-        char c = text[i];
-        if (Character.getDirectionality(c) != LEFT_TO_RIGHT)
+        byte dir = Character.getDirectionality(text[i]);
+        if (dir != Character.DIRECTIONALITY_LEFT_TO_RIGHT
+            && dir != Character.DIRECTIONALITY_EUROPEAN_NUMBER
+            && dir != Character.DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
+            && dir != Character.DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
+            && dir != Character.DIRECTIONALITY_ARABIC_NUMBER
+            && dir != Character.DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
+            && dir != Character.DIRECTIONALITY_SEGMENT_SEPARATOR
+            && dir != Character.DIRECTIONALITY_WHITESPACE)
           return true;
       }
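
For anyone who wants to try the check outside of Classpath, here is a minimal standalone sketch of the same test. The class name RequiresBidiSketch and the helper isLeftToRightSafe are made up for illustration and are not part of the patch; the trailing "return false" is assumed, since the hunk above ends before that line. Only Character.getDirectionality(char) and the DIRECTIONALITY_* constants come from the real API.

```java
// Illustrative sketch only; not the Classpath source.
class RequiresBidiSketch
{
  // Hypothetical helper: true for the character types the patch accepts
  // (L, EN, ES, ET, AN, CS, S and WS).
  static boolean isLeftToRightSafe(byte dir)
  {
    switch (dir)
      {
      case Character.DIRECTIONALITY_LEFT_TO_RIGHT:
      case Character.DIRECTIONALITY_EUROPEAN_NUMBER:
      case Character.DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR:
      case Character.DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR:
      case Character.DIRECTIONALITY_ARABIC_NUMBER:
      case Character.DIRECTIONALITY_COMMON_NUMBER_SEPARATOR:
      case Character.DIRECTIONALITY_SEGMENT_SEPARATOR:
      case Character.DIRECTIONALITY_WHITESPACE:
        return true;
      default:
        return false;
      }
  }

  // Same shape as the patched method: report true as soon as a character
  // outside the accepted set is seen.
  static boolean requiresBidi(char[] text, int start, int end)
  {
    for (int i = start; i < end; i++)
      if (!isLeftToRightSafe(Character.getDirectionality(text[i])))
        return true;
    // Assumed fall-through result; this line is not visible in the diff.
    return false;
  }

  public static void main(String[] args)
  {
    char[] latin = "abc 123, 4.5".toCharArray();   // L, WS, EN and CS only
    char[] hebrew = "abc \u05D0".toCharArray();    // ends with HEBREW LETTER ALEF (R)
    System.out.println(requiresBidi(latin, 0, latin.length));
    System.out.println(requiresBidi(hebrew, 0, hebrew.length));
  }
}
```

Characters in categories not on the list, such as other neutrals, still make the method return true. That is the harmless kind of false positive mentioned in the quoted mail: the caller just falls back to the full algorithm.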
_______________________________________________
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches