In perl.git, the branch blead has been updated

<http://perl5.git.perl.org/perl.git/commitdiff/01f0ef6b2ef087bcb4c6dc352861e0ed728c3ecb?hp=069689156823734d3af603278e2629028f1f9e54>

- Log -----------------------------------------------------------------
commit 01f0ef6b2ef087bcb4c6dc352861e0ed728c3ecb
Author: Aristotle Pagaltzis <[email protected]>
Date:   Wed Jul 13 16:56:20 2016 +0200

    perlfunc: fix seek/tell/sysseek byte offset note akwardness

M pod/perlfunc.pod

commit b7173f14c9ed466859bda312f73514f450866a77
Author: Aristotle Pagaltzis <[email protected]>
Date:   Wed Jul 13 16:56:15 2016 +0200

    perlfunc: unrearrange sysseek doc to prepare next patch

M pod/perlfunc.pod
-----------------------------------------------------------------------

Summary of changes:
 pod/perlfunc.pod | 70 +++++++++++++++++++++++++++++---------------------------
 1 file changed, 36 insertions(+), 34 deletions(-)

diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 4f2d3a1..5a4c503 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -6648,12 +6648,13 @@ C<SEEK_CUR>, and C<SEEK_END> (start of the file, current position, end
 of the file) from the L<Fcntl> module. Returns C<1> on success, false
 otherwise.

-Note the I<in bytes>: even if the filehandle has been set to
-operate on characters (for example by using the C<:encoding(utf8)> open
-layer), L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> will take byte offsets,
-not character offsets (because implementing that would render
-L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> and
-L<C<tell>|/tell FILEHANDLE> rather slow).
+Note the emphasis on bytes: even if the filehandle has been set to operate
+on characters (for example using the C<:encoding(utf8)> I/O layer), the
+L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
+L<C<tell>|/tell FILEHANDLE>, and
+L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
+family of functions use byte offsets, not character offsets,
+because seeking to a character offset would be very slow in a UTF-8 file.

 If you want to position the file for
 L<C<sysread>|/sysread FILEHANDLE,SCALAR,LENGTH,OFFSET> or
@@ -8531,33 +8532,19 @@ X<sysseek> X<lseek>

 =for Pod::Functions +5.004 position I/O pointer on handle used with sysread and syswrite

-Sets FILEHANDLE's system position in bytes using L<lseek(2)>. FILEHANDLE may
+Sets FILEHANDLE's system position I<in bytes> using L<lseek(2)>. FILEHANDLE may
 be an expression whose value gives the name of the filehandle. The values
 for WHENCE are C<0> to set the new position to POSITION; C<1> to set the it
 to the current position plus POSITION; and C<2> to set it to EOF plus
 POSITION, typically negative.

-For WHENCE, you may also use the constants C<SEEK_SET>, C<SEEK_CUR>,
-and C<SEEK_END> (start of the file, current position, end of the file)
-from the L<Fcntl> module. Use of the constants is also more portable
-than relying on 0, 1, and 2.
-
-Returns the new position in bytes, or the undefined value on failure. A
-position of zero is returned as the string C<"0 but true">; thus
-L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> returns
-true on success and false on failure, yet you can still easily determine
-the new position.
-
-For example to define a C<systell> function:
-
-    use Fcntl 'SEEK_CUR';
-    sub systell { sysseek($_[0], 0, SEEK_CUR) }
-
-Note the I<in bytes>: even if the filehandle has been set to operate
-on characters (for example by using the C<:encoding(utf8)> I/O layer),
-L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> will take and return byte
-offsets, not character offsets (because implementing that would render
-L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> unacceptably slow).
+Note the emphasis on bytes: even if the filehandle has been set to operate
+on characters (for example using the C<:encoding(utf8)> I/O layer), the
+L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
+L<C<tell>|/tell FILEHANDLE>, and
+L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
+family of functions use byte offsets, not character offsets,
+because seeking to a character offset would be very slow in a UTF-8 file.

 L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> bypasses normal
 buffered IO, so mixing it with reads other than
@@ -8569,6 +8556,20 @@ L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
 L<C<tell>|/tell FILEHANDLE>, or
 L<C<eof>|/eof FILEHANDLE> may cause confusion.

+For WHENCE, you may also use the constants C<SEEK_SET>, C<SEEK_CUR>,
+and C<SEEK_END> (start of the file, current position, end of the file)
+from the L<Fcntl> module. Use of the constants is also more portable
+than relying on 0, 1, and 2. For example to define a "systell" function:
+
+    use Fcntl 'SEEK_CUR';
+    sub systell { sysseek($_[0], 0, SEEK_CUR) }
+
+Returns the new position, or the undefined value on failure. A position
+of zero is returned as the string C<"0 but true">; thus
+L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE> returns
+true on success and false on failure, yet you can still easily determine
+the new position.
+
 =item system LIST
 X<system> X<shell>

@@ -8704,12 +8705,13 @@ error. FILEHANDLE may be an expression whose value gives the name of
 the actual filehandle. If FILEHANDLE is omitted, assumes the file
 last read.

-Note the I<in bytes>: even if the filehandle has been set to
-operate on characters (for example by using the C<:encoding(utf8)> open
-layer), L<C<tell>|/tell FILEHANDLE> will return byte offsets, not
-character offsets (because that would render
-L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE> and
-L<C<tell>|/tell FILEHANDLE> rather slow).
+Note the emphasis on bytes: even if the filehandle has been set to operate
+on characters (for example using the C<:encoding(utf8)> I/O layer), the
+L<C<seek>|/seek FILEHANDLE,POSITION,WHENCE>,
+L<C<tell>|/tell FILEHANDLE>, and
+L<C<sysseek>|/sysseek FILEHANDLE,POSITION,WHENCE>
+family of functions use byte offsets, not character offsets,
+because seeking to a character offset would be very slow in a UTF-8 file.

 The return value of L<C<tell>|/tell FILEHANDLE> for the standard streams
 like the STDIN depends on the operating system: it may return -1 or

--
Perl5 Master Repository
