On Apr 30, 2004, at 9:02 AM, Larry Wall wrote:

On Fri, Apr 30, 2004 at 08:38:18AM -0700, Jeff Clites wrote:
: On Apr 28, 2004, at 5:01 AM, Dan Sugalski wrote:
:
: >At 3:17 AM -0700 4/28/04, Jeff Clites wrote:
: >>On Apr 23, 2004, at 2:43 PM, Dan Sugalski wrote:
: >>
: >>>For example, consider the following:
: >>>
: >>> use Unicode;
: >>> open FOO, "foo.txt", :charset(latin-3);
: >>> open BAR, "bar.txt", :charset(big5);
: >>> $filehandle = 0;
: >>> while (<>) {
: >>>     if ($filehandle++) {
: >>>         print FOO $_;
: >>>     } else {
: >>>         print BAR $_;
: >>>     }
: >>>     $filehandle %= 2;
: >>> }
: >>
: >>What's the input record separator here?
: >
: >The filehandle default, which depends on the encoding and character
: >set of the input data, or so Larry's told me.
:
: So the nature of my question here is that I assume the input record
: separator will be set as a string, with something similar to: $/ = "\n"
: or $/ = "----" or whatever.
....
$/ is gone. But if there were a $/, it would do the Right Thing. :-)

(Which, in Perl 6, is to have consistent Unicode semantics regardless
of the supposed encoding of the string.)

Arguably, this discussion should be happening in p6l rather than p6i...

Well, the implementation point I was getting at, which perhaps I should have stated more clearly up front, is this: if one gets to specify a default input record separator, and that's done as a string, then (in Dan's plan as stated) you're going to have to transcode your input record separator and your input stream to a common character set/encoding. So you're paying the computational price that Dan indicated the above code could avoid. If the input record separator is instead specified as a byte sequence rather than as a string, then the plan I had in mind would also avoid the overhead of "decoding" the string.
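
A rough sketch of the two cases, in Python purely for illustration. The function names are hypothetical, and nothing here is Parrot's or Perl 6's actual API. It only shows where the decoding cost lands: a string separator forces the stream and the separator into a common form before matching, while a byte-sequence separator can be matched against the raw bytes directly.

```python
from typing import List


def split_with_string_separator(raw: bytes, encoding: str, separator: str) -> List[str]:
    """Separator given as a string (hypothetical helper).

    The input bytes are decoded to Unicode so that stream and separator
    share a common representation before matching. This decode step is
    the transcoding cost under discussion.
    """
    text = raw.decode(encoding)
    return text.split(separator)


def split_with_byte_separator(raw: bytes, separator: bytes) -> List[bytes]:
    """Separator given as a byte sequence (hypothetical helper).

    Records are found by matching raw bytes, so nothing is decoded.
    The records come back still in the stream's original encoding.
    """
    return raw.split(separator)


# Example input: two records separated by "----", stored as Latin-3 bytes.
raw = "\u0109u----\u015di".encode("iso8859_3")

decoded_records = split_with_string_separator(raw, "iso8859_3", "----")
raw_records = split_with_byte_separator(raw, b"----")
```

In the first function every input byte goes through the decoder whether or not the caller ever needs Unicode; in the second, decoding can be deferred or skipped entirely.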


So the implementation point was that this example doesn't seem to argue that Dan's plan has a performance advantage.

JEff


