Author: larry
Date: Sat Mar 10 09:42:53 2007
New Revision: 14327

Modified:
   doc/trunk/design/syn/S02.pod

Log:
Clarifications on StrPos and StrLen requested by putter++.


Modified: doc/trunk/design/syn/S02.pod
==============================================================================
--- doc/trunk/design/syn/S02.pod        (original)
+++ doc/trunk/design/syn/S02.pod        Sat Mar 10 09:42:53 2007
@@ -589,13 +589,23 @@
 graphemes, or characters in some language.  For all builtin operations,
 all C<Str> positions are reported as position objects, not integers.
 These C<StrPos> objects point into a particular string at a particular
-location independent of abstraction level.  The subtraction of two
-C<StrPos> objects gives a C<StrLen> object, which is still not an
-integer, because the string between two positions also has multiple
-integer interpretations depending on the units.  A given C<StrLen>
-may know that it represents 18 bytes, 7 codepoints, and 3 graphemes,
-but it knows this lazily because it actually just hangs onto the two
-C<StrPos> objects.  (It's much like a C<Range> object in that respect.)
+location independent of abstraction level, either by tracking the
+string and position directly, or by generating an abstraction-level
+independent representation of the offset from the beginning of the
+string that will give the same results if applied to the same string
+in any context.  This is assuming the string isn't modified in the
+meanwhile; a C<StrPos> is not a "marker" and is not required to follow
+changes to a mutable string.
+
+The subtraction of two C<StrPos> objects gives a C<StrLen> object,
+which is also not an integer, because the string between two positions
+also has multiple integer interpretations depending on the units.
+A given C<StrLen> may know that it represents 18 bytes, 7 codepoints,
+3 graphemes, and 1 letter in Malayalam, but it might only know this
+lazily because it actually just hangs onto the two C<StrPos> endpoints
+within the string that in turn may or may not just lazily point into
+the string.  (The lazy implementation of C<StrLen> is much like a
+C<Range> object in that respect.)
 
 If you use integers as arguments where position objects are expected,
 it will be assumed that you mean the units of the current lexically
@@ -607,6 +617,11 @@
 Of course, such a dimensional number will fail if used on a string
 that doesn't provide the appropriate abstraction level.
 
+If a C<StrPos> or C<StrLen> is forced into a numeric context, it will
+assume the units of the current Unicode abstraction level.  It is
+erroneous to pass such a non-dimensional number to a routine that
+would interpret it with the wrong units.
+
 =item *
 
 A C<Buf> is a stringish view of an array of

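The lazy StrLen described in the first hunk can be illustrated with a small sketch. This is not Perl 6 and not the Synopsis's implementation; it is a Python model under stated assumptions: a position is stored as a codepoint offset, bytes mean UTF-8 bytes, and graphemes are approximated by skipping combining marks (not full Unicode segmentation). All class and method names are hypothetical.

```python
import unicodedata


class StrPos:
    """A position in a string (assumption: stored as a codepoint offset)."""

    def __init__(self, string: str, offset: int):
        if not 0 <= offset <= len(string):
            raise ValueError("offset outside string")
        self.string = string
        self.offset = offset

    def __sub__(self, other: "StrPos") -> "StrLen":
        if not isinstance(other, StrPos):
            return NotImplemented
        if self.string != other.string:
            raise ValueError("positions point into different strings")
        # No counting happens here; the StrLen only keeps the endpoints.
        return StrLen(other, self)


class StrLen:
    """Span between two StrPos endpoints; each unit count is computed on demand.

    This sketch assumes the end position is not before the start position.
    """

    def __init__(self, start: StrPos, end: StrPos):
        self.start = start
        self.end = end

    def _span(self) -> str:
        return self.start.string[self.start.offset:self.end.offset]

    def in_codepoints(self) -> int:
        return len(self._span())

    def in_bytes(self) -> int:
        # Assumption: "bytes" means the UTF-8 encoding of the span.
        return len(self._span().encode("utf-8"))

    def in_graphemes(self) -> int:
        # Crude approximation: count only codepoints that are not combining marks.
        return sum(1 for ch in self._span() if unicodedata.combining(ch) == 0)


text = "ne\u0301e"  # "née" written with a decomposed accent
length = StrPos(text, len(text)) - StrPos(text, 0)
```

The same `length` object answers differently depending on the unit requested, which is why a bare integer is ambiguous unless the abstraction level is known.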