Re: [PATCH] Bindings for ‘sendfile’

2013-04-18 Thread Thien-Thi Nguyen
() l...@gnu.org (Ludovic Courtès)
() Wed, 17 Apr 2013 18:27:04 +0200

   So that was over a PF_INET/SOCK_STREAM socket, right?

Yes.  FWIW, ttn-do sizzweb can also operate over PF_UNIX/SOCK_STREAM
and i recall doing some testing (w/ similar results) there.

   Would it occur systematically, or just once in a while?

Sorry, i can't recall precisely.  My vague memory is that on the client
(apt-get) side, i saw some packages download completely, no problem,
about 20% of the time.  I suppose it would be good to collect some hard
data, draw graphs, tweak kernel knobs, publish papers, etc.  Some day...

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgptPOYhoF4WT.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-17 Thread Ludovic Courtès
Thien-Thi Nguyen t...@gnuvola.org skribis:

 Anyway, i hope Guile grows a ‘sendfile-some’ (or whatever) so that i can
 adapt my programs to use it instead of the ttn-do ‘sendfile’:

I’m open to the idea.  To get a better understanding, do you have
examples of real applications where it makes a difference?  I.e.,
concrete cases of recurring short writes?

Ludo’.



Re: [PATCH] Bindings for ‘sendfile’

2013-04-17 Thread Thien-Thi Nguyen
() l...@gnu.org (Ludovic Courtès)
() Wed, 17 Apr 2013 14:52:40 +0200

   concrete cases of recurring short writes?

The original application was ttn-do serve-debiso, which configures
ttn-do sizzweb to serve debian ISOs (or rather, the various component
.deb files and metadata that apt-get requests from the loopback-mounted
ISO) over the home network.

Specifically, proc ‘prep-file-transfer’ in module ‘(ttn-do sizzweb)’
calls ‘sendfile’ in a loop.  Admittedly, that code could be replaced by
‘sendfile/never-fewer’, as the retval is not used in a particularly
intelligent way, but anyway, the point is, at time of writing and early
testing *sans looping*, apt-get on the other computer would stall and
eventually die, complaining.  That's when i learned to read and think
about sendfile(2) more carefully than i had before.

It would be nice to point to a ChangeLog entry that embodies this ah
ha! moment, but as it turned out, the testing was done before commit.
Oh well, life goes on...

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgpaljvkq1He6.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-17 Thread Ludovic Courtès
Thien-Thi Nguyen t...@gnuvola.org skribis:

 The original application was ttn-do serve-debiso, which configures
 ttn-do sizzweb to serve debian ISOs (or rather, the various component
 .deb files and metadata that apt-get requests from the loopback-mounted
 ISO) over the home network.

 Specifically, proc ‘prep-file-transfer’ in module ‘(ttn-do sizzweb)’
 calls ‘sendfile’ in a loop.  Admittedly, that code could be replaced by
 ‘sendfile/never-fewer’, as the retval is not used in a particularly
 intelligent way, but anyway, the point is, at time of writing and early
 testing *sans looping*, apt-get on the other computer would stall and
 eventually die, complaining.  That's when i learned to read and think
 about sendfile(2) more carefully than i had before.

So that was over a PF_INET/SOCK_STREAM socket, right?

Would it occur systematically, or just once in a while?

Ludo’.



Re: [PATCH] Bindings for ‘sendfile’

2013-04-16 Thread Ludovic Courtès
Thien-Thi Nguyen t...@gnuvola.org skribis:

 My reading of sendfile(2) is that it does its best to send as much as
 possible, but does not guarantee sending everything.  What it does
 succeed in sending, it reports to the caller.  The caller loops as
 desired, after evaluating (in some caller-meaningful way) the returned
 information.

Just a note, this information is definitely not conveyed by the man page:

   sendfile()  copies  data  between one file descriptor and another.  

   [...]

   count is the number of bytes to copy between the file descriptors.

Of course the return type hints at a write(2)-like interface, but the
truth is that short writes had to be evidenced by experimental
validation, so to speak.

Ludo’.




Re: [PATCH] Bindings for ‘sendfile’

2013-04-16 Thread Thien-Thi Nguyen
() l...@gnu.org (Ludovic Courtès)
() Tue, 16 Apr 2013 18:31:00 +0200

   Thien-Thi Nguyen t...@gnuvola.org skribis:

My reading of sendfile(2) is that it does its best to send as much
as possible, but does not guarantee sending everything.  What it
does succeed in sending, it reports to the caller.  The caller
loops as desired, after evaluating (in some caller-meaningful way)
the returned information.

   Just a note, this information is definitely not conveyed by the man page:

  sendfile()  copies  data  between one file descriptor and another.  

  [...]

  count is the number of bytes to copy between the file descriptors.

Yes, that part is what you tell (request of) the OS, not what the OS
tells (reports to) you.  I think the manpage is not remiss here.

   Of course the return type hints at a write(2)-like interface, but the
   truth is that short writes had to be evidenced by experimental
   validation, so to speak.

Well, i don't know what the truth is, but certainly it can be easy to
misinterpret the sendfile(2) manpage if you don't take all of it into
account.  The major hint is actually:

   If offset is not NULL, then it points to a variable holding the
   file offset from which sendfile() will start reading data from
   in_fd.  When sendfile() returns, this variable will be set to the
   offset of the byte following the last byte that was read.

That is basically another way of saying that the syscall does its best
to send as much as possible, but does not guarantee sending everything.

This is also conveyed in the general context of system programming,
where an OS that guarantees a single-syscall full write of a possibly
multi-gigabyte file is an OS that can be easily bludgeoned to death by
mischievous (or simply misbehaving) userland programs, the eventual
consequence being OS writers tend to avoid making such guarantees
(especially for i/o, where the Real World is known to be Real Weird).

Anyway, i hope Guile grows a ‘sendfile-some’ (or whatever) so that i can
adapt my programs to use it instead of the ttn-do ‘sendfile’:

http://www.gnuvola.org/software/ttn-do/ttn-do.html.gz#zz-sys-linux-gnu

Until then, happy hacking!

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgpzVMVmsl0vt.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-14 Thread Thien-Thi Nguyen
() Mark H Weaver m...@netris.org
() Wed, 10 Apr 2013 07:26:32 -0400

   Regarding the proposed low-level interface (which I will call
   'sendfile-some' for now), I have a question: can it actually be used
   to write a robust asynchronous server?

I can't imagine why not, although i certainly don't pretend to be an
expert on async server programming.  My experience is mostly on the
client side, w/ Emacs as the front-end.  I would be surprised if
sendfile(2) is NOT used in robust asynchronous servers written in C.

   Is it guaranteed to never block for more than a short time?  I don't
   know the answer.

Me neither, although i think this depends on ‘O_NONBLOCK’, which is
established on open(2).

   If the answer is no, then it seems to me that this would eliminate
   the most compelling reason for a 'sendfile-some' API.

Why?

   Or how about the other potential use case you gave: to keep stats on
   how much is sent per chunk.  What is the meaning of chunk?  If
   so, is sendfile(2) guaranteed to return once for each chunk?  If not,
   then what is the meaning of the statistics you would gather?

I'd rather not get into particulars in this line of discussion, if you
don't mind, for lack of time.  Instead, i'll just ask you to try to
imagine a robust P2P architecture (every node as both client and server)
that does NOT care about flow control, and spew some philosophy here
(feel free to ignore)...

My reading of sendfile(2) is that it does its best to send as much as
possible, but does not guarantee sending everything.  What it does
succeed in sending, it reports to the caller.  The caller loops as
desired, after evaluating (in some caller-meaningful way) the returned
information.

All i desire is the semantics of a Scheme ‘sendfile’ not deviate from
that of the syscall sendfile(2); i judge not implication of the name
for the statistical majority, only fidelity, when it comes to syscalls.

Control, not coddling, please -- why should C programmers have all the fun?

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgpQcy7YAESkx.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-14 Thread Thien-Thi Nguyen
() l...@gnu.org (Ludovic Courtès)
() Wed, 10 Apr 2013 22:56:38 +0200

   Well, I sympathize with the idea of sticking to the underlying
   syscall semantics; yet, for my own uses of ‘sendfile’, I can only
   think of cases all I want is to send the file.

I understand; it's easy to project one's limitations into the code.

   (Besides, it seems to be unusual to get ‘sendfile’ to send less than
   asked to.  The ‘pipe’ example mentioned in the manual would show up
   at most once every 10 iterations with our test case.)

API design is the art of making possible both the usual and the unusual.

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgpYsAoa76Ym_.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-10 Thread Mark H Weaver
Thien-Thi Nguyen t...@gnuvola.org writes:

 Another way to think about it is: A ‘sendfile/all’ can be implemented in
 terms of a ‘sendfile/non-looping’ but not the other way around.

 So overall, i think hiding partial i/o is a mistake.  This is just one
 instance of that, it seems (‘write’ and ‘put-bytevector’ being others).
 Unlike the others, however, there is still opportunity for change.

I acknowledge that it might be useful to add a lower-level interface for
'sendfile', perhaps called 'sendfile-some', analogous to the R6RS
'get-bytevector-some'.  We can still do that.  However, I continue to
believe that 'sendfile' should do what its name implies, and what 99% of
users will want and expect.

The alternative would most likely result in the vast majority of code
written using 'sendfile' to be sporadically unreliable.  Even Ludovic
had trouble getting the loop right, and we all know that he's an
exceptionally talented and careful hacker.

Regarding the proposed low-level interface (which I will call
'sendfile-some' for now), I have a question: can it actually be used to
write a robust asynchronous server?  Is it guaranteed to never block for
more than a short time?  I don't know the answer.  If the answer is
no, then it seems to me that this would eliminate the most compelling
reason for a 'sendfile-some' API.

Or how about the other potential use case you gave: to keep stats on how
much is sent per chunk.  What is the meaning of chunk?  If so, is
sendfile(2) guaranteed to return once for each chunk?  If not, then what
is the meaning of the statistics you would gather?

 Regards,
   Mark



Re: [PATCH] Bindings for ‘sendfile’

2013-04-10 Thread Ludovic Courtès
Thien-Thi Nguyen t...@gnuvola.org skribis:

 So overall, i think hiding partial i/o is a mistake.

Well, I sympathize with the idea of sticking to the underlying syscall
semantics; yet, for my own uses of ‘sendfile’, I can only think of cases
all I want is to send the file.

(Besides, it seems to be unusual to get ‘sendfile’ to send less than
asked to.  The ‘pipe’ example mentioned in the manual would show up at
most once every 10 iterations with our test case.)

Ludo’.




Re: [PATCH] Bindings for ‘sendfile’

2013-04-09 Thread Thien-Thi Nguyen
() l...@gnu.org (Ludovic Courtès)
() Sun, 07 Apr 2013 21:53:26 +0200

   +In other cases, the libc function may send fewer bytes than
   +@var{count}---for instance because @var{out} is a slow or limited
   +device, such as a pipe.  When that happens, Guile's @code{sendfile}
   +automatically retries until exactly @var{count} bytes were sent or an
   +error occurs.

A short write is an opportunity for the caller to Do Something Else
(i.e., go asynchronous).  I think that is more useful than internalizing
the looping.  To accomodate both usage patterns, you could leave the
low-level proc as-is (as-was :-D) and provide another proc that loops.

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgpDM0vxKLrBC.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-09 Thread Ludovic Courtès
Hi!

Thien-Thi Nguyen t...@gnuvola.org skribis:

 () l...@gnu.org (Ludovic Courtès)
 () Sun, 07 Apr 2013 21:53:26 +0200

+In other cases, the libc function may send fewer bytes than
+@var{count}---for instance because @var{out} is a slow or limited
+device, such as a pipe.  When that happens, Guile's @code{sendfile}
+automatically retries until exactly @var{count} bytes were sent or an
+error occurs.

 A short write is an opportunity for the caller to Do Something Else
 (i.e., go asynchronous).  I think that is more useful than internalizing
 the looping.  To accomodate both usage patterns, you could leave the
 low-level proc as-is (as-was :-D) and provide another proc that loops.

Well, I hesitated a lot.  My initial inclination was to do what you say,
and let users handle the looping.

After a long discussion on IRC was Mark, I was mostly convinced that
leaving that to users is questionable.

First, because one obviously expects the procedure to send the file, not
just part of it.  Second, those who did not forget to implement the loop
can easily get it wrong.  Third, if you really want to interleave I/O
and computation, you probably don’t want to use sendfile(2) in the first
place.  Fourth, ‘write’, ‘put-bytevector’, etc. also hide short writes.

My 4¢,
Ludo’.




Re: [PATCH] Bindings for ‘sendfile’

2013-04-09 Thread Thien-Thi Nguyen
() l...@gnu.org (Ludovic Courtès)
() Tue, 09 Apr 2013 17:02:03 +0200

   After a long discussion on IRC was Mark, I was mostly convinced that
   leaving that to users is questionable.

User behavior in the wild is always questionable.  :-D

   First, because one obviously expects the procedure to send the file,
   not just part of it.  Second, those who did not forget to implement
   the loop can easily get it wrong.  Third, if you really want to
   interleave I/O and computation, you probably don’t want to use
   sendfile(2) in the first place.  Fourth, ‘write’, ‘put-bytevector’,
   etc. also hide short writes.

Yes, the overall intent is to send all the file.  However, it's not wise
to assume caller doesn't want to keep stats (for example) on how much of
the file is sent per chunk.  That is not a heavy computation, but still
quite valid.  (Imagine an adaptive server that has write access to the
kernel network and virtual memory subsystem knobs.)

The same thinking applies to any internalized looping.  The design
principle to pursue is that the lowest level should leave the most
control to the caller, w/o losing the essence of the functionality.
[Insert quote from that wily programmer, A. Einstein, here. :-D]
[Insert analogous bufferbloat problem, here.]

(Of course, higher-level convenience interfaces are always welcome, once
the possibility of control (by caller) is guaranteed.)

Another way to think about it is: A ‘sendfile/all’ can be implemented in
terms of a ‘sendfile/non-looping’ but not the other way around.

So overall, i think hiding partial i/o is a mistake.  This is just one
instance of that, it seems (‘write’ and ‘put-bytevector’ being others).
Unlike the others, however, there is still opportunity for change.

-- 
Thien-Thi Nguyen
GPG key: 4C807502


pgpLchJZ8HpcU.pgp
Description: PGP signature


Re: [PATCH] Bindings for ‘sendfile’

2013-04-07 Thread Ludovic Courtès
There were sporadic failures of the sendfile/pipe tests on Hydra.  I
managed to reproduce them and noticed that sometimes sendfile(2) would
return 64KiB, which is much less than what we asked for.

Although the man page doesn’t mention it, this can happen when writing
to a slow or limited device, like a pipe (pretty much like write(2)).

I was hesitant whether to stick to libc behavior (return the number of
bytes sent and let the user handle it), or to handle it directly in our
‘sendfile’ procedure.  After discussion with Mark on IRC, I became
convinced that the latter is preferable, so here’s the patch.

I’ll commit it shortly if there are no objections.

Ludo’.

From 42459c382ca512c927b6228c1557afe8e10aedae Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= l...@gnu.org
Date: Sun, 7 Apr 2013 21:47:48 +0200
Subject: [PATCH] Change `sendfile' to loop until everything has been sent.

* libguile/filesys.c (scm_sendfile)[HAVE_SYS_SENDFILE_H 
  HAVE_SENDFILE]: Compare RESULT with C_COUNT.  Loop until C_COUNT bytes
  have been sent.  Return SCM_UNSPECIFIED.
* doc/ref/posix.texi (File System): Update the description.  Explain the
  new semantics.
---
 doc/ref/posix.texi |   11 +--
 libguile/filesys.c |   36 +---
 2 files changed, 38 insertions(+), 9 deletions(-)

diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
index 45f320f..13f3a48 100644
--- a/doc/ref/posix.texi
+++ b/doc/ref/posix.texi
@@ -806,9 +806,10 @@ The return value is unspecified.
 @deffn {Scheme Procedure} sendfile out in count [offset]
 @deffnx {C Function} scm_sendfile (out, in, count, offset)
 Send @var{count} bytes from @var{in} to @var{out}, both of which
-are either open file ports or file descriptors.  When
+must be either open file ports or file descriptors.  When
 @var{offset} is omitted, start reading from @var{in}'s current
-position; otherwise, start reading at @var{offset}.
+position; otherwise, start reading at @var{offset}.  The return
+value is unspecified.
 
 When @var{in} is a port, it is often preferable to specify @var{offset},
 because @var{in}'s offset as a port may be different from the offset of
@@ -824,6 +825,12 @@ In some cases, the @code{sendfile} libc function may return
 @code{EINVAL} or @code{ENOSYS}.  In that case, Guile's @code{sendfile}
 procedure automatically falls back to doing a series of @code{read} and
 @code{write} calls.
+
+In other cases, the libc function may send fewer bytes than
+@var{count}---for instance because @var{out} is a slow or limited
+device, such as a pipe.  When that happens, Guile's @code{sendfile}
+automatically retries until exactly @var{count} bytes were sent or an
+error occurs.
 @end deffn
 
 @findex rename
diff --git a/libguile/filesys.c b/libguile/filesys.c
index d318ae7..069c7ad 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -,9 +,10 @@ SCM_DEFINE (scm_copy_file, copy-file, 2, 0, 0,
 SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
 	(SCM out, SCM in, SCM count, SCM offset),
 	Send @var{count} bytes from @var{in} to @var{out}, both of which 
-	are either open file ports or file descriptors.  When 
+	must be either open file ports or file descriptors.  When 
 	@var{offset} is omitted, start reading from @var{in}'s current 
-	position; otherwise, start reading at @var{offset}.)
+	position; otherwise, start reading at @var{offset}.  The return 
+	value is unspecified.)
 #define FUNC_NAME s_scm_sendfile
 {
 #define VALIDATE_FD_OR_PORT(cvar, svar, pos)	\
@@ -1139,9 +1140,31 @@ SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
 #if defined HAVE_SYS_SENDFILE_H  defined HAVE_SENDFILE
   /* The Linux-style sendfile(2), which is different from the BSD-style.  */
 
-  result = sendfile_or_sendfile64 (out_fd, in_fd,
-   SCM_UNBNDP (offset) ? NULL : c_offset,
-   c_count);
+  {
+size_t total;
+off_t *offset_ptr;
+
+offset_ptr = SCM_UNBNDP (offset) ? NULL : c_offset;
+
+/* On Linux, when OUT_FD is a file, everything is transferred at once and
+   RESULT == C_COUNT.  However, when OUT_FD is a pipe or other slow
+   device, fewer bytes may be transferred, hence the loop.  */
+for (total = 0, result = 0; total  c_count  result = 0; )
+  {
+	result = sendfile_or_sendfile64 (out_fd, in_fd, offset_ptr,
+	 c_count - total);
+	if (result  0)
+	  {
+	total += result;
+	if (offset_ptr == NULL)
+	  offset_ptr = c_offset;
+	*offset_ptr = total;
+	  }
+	else if (result  0  errno == EINTR)
+	  /* Keep going.  */
+	  result = 0;
+  }
+  }
 
   /* Quoting the Linux man page: In Linux kernels before 2.6.33, out_fd
  must refer to a socket.  Since Linux 2.6.33 it can be any file.
@@ -1183,10 +1206,9 @@ SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
 	result += obtained;
   }
 
-return scm_from_size_t (result);
   }
 
-  return scm_from_ssize_t (result);
+  return SCM_UNSPECIFIED;
 
 #undef VALIDATE_FD_OR_PORT
 }
-- 
1.7.10.4



Re: [PATCH] Bindings for ‘sendfile’

2013-04-07 Thread Ludovic Courtès
I just pushed a new version, which reinstates the return value, and
appropriately detects EOF condition on the input (by just returning 0).

Comments welcome!

Ludo’.



Re: [PATCH] Bindings for ‘sendfile’

2013-03-25 Thread Ludovic Courtès
Mark H Weaver m...@netris.org skribis:

 The sendfile commit (fbac7c6113056bc6ee85996b10bdc08325c742a5) has
 caused the following build failures on Hydra.

Thanks for the notification.

 On GNU/Linux: (both i686-linux and x86_64-linux without threads)

 http://hydra.nixos.org/build/4463700/log/raw
 http://hydra.nixos.org/build/4463695/log/raw

 checking sys/sendfile.h usability... yes
 checking sys/sendfile.h presence... yes
 checking for sys/sendfile.h... yes
 [...]
 checking for sendfile... yes

 ERROR: filesys.test: sendfile: pipe - arguments: ((system-error #f ~A 
 (Function not implemented) (38)))
 ERROR: filesys.test: sendfile: pipe with offset - arguments: ((system-error 
 #f ~A (Function not implemented) (38)))

Oops, fixed by 45417ab.

 On FreeBSD: (x86_64-freebsd without threads)

[...]

 On Darwin: (x86_64-darwin, both with and without threads)

 http://hydra.nixos.org/build/4463702/log/raw
 http://hydra.nixos.org/build/4463698/log/raw

 checking sys/sendfile.h usability... no
 checking sys/sendfile.h presence... no
 checking for sys/sendfile.h... no
 [...]
 checking for sendfile... yes

It turns out that the BSDs have their own flavor:

  http://www.unix.com/man-page/FreeBSD/2/sendfile/
  
https://developer.apple.com/library/mac/#documentation/Darwin/Reference/Manpages/man2/sendfile.2.html

We should support it, eventually, but for now commit 11ed427 just makes
sure we don’t mistakenly attempt to use it.

Thanks,
Ludo’.



Re: [PATCH] Bindings for ‘sendfile’

2013-03-23 Thread Mark H Weaver
Hi Ludovic,

The sendfile commit (fbac7c6113056bc6ee85996b10bdc08325c742a5) has
caused the following build failures on Hydra.

Thanks,
  Mark


==
On GNU/Linux: (both i686-linux and x86_64-linux without threads)

http://hydra.nixos.org/build/4463700/log/raw
http://hydra.nixos.org/build/4463695/log/raw

checking sys/sendfile.h usability... yes
checking sys/sendfile.h presence... yes
checking for sys/sendfile.h... yes
[...]
checking for sendfile... yes

ERROR: filesys.test: sendfile: pipe - arguments: ((system-error #f ~A 
(Function not implemented) (38)))
ERROR: filesys.test: sendfile: pipe with offset - arguments: ((system-error #f 
~A (Function not implemented) (38)))

==
On FreeBSD: (x86_64-freebsd without threads)

http://hydra.nixos.org/build/4463693/log/raw

checking sys/sendfile.h usability... no
checking sys/sendfile.h presence... no
checking for sys/sendfile.h... no
[...]
checking for sendfile... yes

filesys.c: In function 'scm_sendfile':
filesys.c:1140: warning: implicit declaration of function 'sendfile'

ERROR: filesys.test: sendfile: file - arguments: ((system-error sendfile ~A 
(Bad file descriptor) (9)))
ERROR: filesys.test: sendfile: file with offset - arguments: ((system-error 
sendfile ~A (Bad file descriptor) (9)))
ERROR: filesys.test: sendfile: pipe - arguments: ((system-error #f ~A 
(Function not implemented) (78)))
ERROR: filesys.test: sendfile: pipe with offset - arguments: ((system-error #f 
~A (Function not implemented) (78)))

==
On Darwin: (x86_64-darwin, both with and without threads)

http://hydra.nixos.org/build/4463702/log/raw
http://hydra.nixos.org/build/4463698/log/raw

checking sys/sendfile.h usability... no
checking sys/sendfile.h presence... no
checking for sys/sendfile.h... no
[...]
checking for sendfile... yes

../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c: In function 
'scm_sendfile':
../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c:1142:8: warning: passing 
argument 3 of 'sendfile' makes integer from pointer without a cast [enabled by 
default]
/usr/include/sys/socket.h:619:5: note: expected 'off_t' but argument is of type 
'scm_t_off *'
../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c:1142:8: warning: passing 
argument 4 of 'sendfile' makes pointer from integer without a cast [enabled by 
default]
/usr/include/sys/socket.h:619:5: note: expected 'off_t *' but argument is of 
type 'size_t'
../../guile-2.0.7.227-86faf-dirty/libguile/filesys.c:1142:8: error: too few 
arguments to function 'sendfile'
/usr/include/sys/socket.h:619:5: note: declared here
make[3]: *** [libguile_2.0_la-filesys.lo] Error 1



Re: [PATCH] Bindings for ‘sendfile’

2013-03-22 Thread Mark H Weaver
l...@gnu.org (Ludovic Courtès) writes:

 Mark H Weaver m...@netris.org skribis:

 Since the code below will behave badly if 'c_count' does not fit in an
 'ssize_t', we should validate here that it _does_ fit.

 Oops, indeed.  (Note that sendfile(2) and write(2) have that problem:
 they take a size_t and return a ssize_t...)

 There was also the other issue of making sure we use the right function,
 depending on _FILE_OFFSET_BITS  co.

 Here are the changes compared to the previous patch:

[...]

 WDYT?

Looks good to me! :)

Mark



Re: [PATCH] Bindings for ‘sendfile’

2013-03-21 Thread Ludovic Courtès
Hi Mark,

Mark H Weaver m...@netris.org skribis:

 l...@gnu.org (Ludovic Courtès) writes:
 I plan to commit the patch below, which adds bindings for ‘sendfile’.

 Comments?

 Looks great to me, modulo one comment below.

Thanks for the quick review!

 I especially like the fact that although it can make use of the
 non-standard Linux syscall, it works properly on all systems.  In
 response to suggestions by others that we create a linux module:
 I'd prefer to follow the good precedent set by Ludovic here.

Yeah.  (And experience has shown that POSIX largely follows GNU/Linux
nowadays.)

 +  size_t c_count;
 +  off_t c_offset;
 +  ssize_t result;
 +  int in_fd, out_fd;
 +
 +  VALIDATE_FD_OR_PORT (out_fd, out, 1);
 +  VALIDATE_FD_OR_PORT (in_fd, in, 2);
 +  c_count = scm_to_size_t (count);

 Since the code below will behave badly if 'c_count' does not fit in an
 'ssize_t', we should validate here that it _does_ fit.

Oops, indeed.  (Note that sendfile(2) and write(2) have that problem:
they take a size_t and return a ssize_t...)

There was also the other issue of making sure we use the right function,
depending on _FILE_OFFSET_BITS  co.

Here are the changes compared to the previous patch:

diff --git a/libguile/filesys.c b/libguile/filesys.c
index 097b03a..6804db9 100644
--- a/libguile/filesys.c
+++ b/libguile/filesys.c
@@ -102,6 +102,10 @@
 # include sys/sendfile.h
 #endif
 
+/* Glibc's `sendfile' function.  */
+#define sendfile_or_sendfile64			\
+  CHOOSE_LARGEFILE (sendfile, sendfile64)
+
 #include full-read.h
 #include full-write.h
 
@@ -1123,7 +1127,7 @@ SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
 }
 
   size_t c_count;
-  off_t c_offset;
+  scm_t_off c_offset;
   ssize_t result;
   int in_fd, out_fd;
 
@@ -1133,20 +1137,20 @@ SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
   c_offset = SCM_UNBNDP (offset) ? 0 : scm_to_off_t (offset);
 
 #ifdef HAVE_SENDFILE
-  result = sendfile (out_fd, in_fd,
+  result = sendfile_or_sendfile64 (out_fd, in_fd,
    SCM_UNBNDP (offset) ? NULL : c_offset,
    c_count);
 
   /* Quoting the Linux man page: In Linux kernels before 2.6.33, out_fd
  must refer to a socket.  Since Linux 2.6.33 it can be any file.
- Fall back to read(2) and write(2) such an error happens.  */
+ Fall back to read(2) and write(2) when such an error occurs.  */
   if (result  0  errno != EINVAL  errno != ENOSYS)
 SCM_SYSERROR;
   else if (result  0)
 #endif
   {
 char buf[8192];
-size_t left;
+size_t result, left;
 
 if (!SCM_UNBNDP (offset))
   {
@@ -1173,6 +1177,8 @@ SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
 
 	result += obtained;
   }
+
+return scm_from_size_t (result);
   }
 
   return scm_from_ssize_t (result);

WDYT?

Thanks,
Ludo’.


Re: [PATCH] Bindings for ‘sendfile’

2013-03-21 Thread Andy Wingo
On Thu 21 Mar 2013 10:40, l...@gnu.org (Ludovic Courtès) writes:

 l...@gnu.org (Ludovic Courtès) writes:
 I plan to commit the patch below, which adds bindings for
‘sendfile’.

Should probably go in gnulib at some point, no?  Looks good tho :)
-- 
http://wingolog.org/



Re: [PATCH] Bindings for ‘sendfile’

2013-03-21 Thread Andrew Gaylard

On 03/21/13 11:15, Ludovic Courtès wrote:

Noah Lavine noah.b.lav...@gmail.com skribis:

I've thought for a while that if I had time (which I know I won't) I would
make a module called (linux) with bindings for non-POSIX Linux kernel
features. What do you think of this idea? If so, what do you think of
putting sendfile there and expanding it with other functions as we need
them?

I’ve thought about it, but ended up with making sendfile work whether or
not the syscall is available (just like glibc does, after all).

So for this particular case, I’d rather keep it in the global name
space.  There’s also the untold argument that even if sendfile(2) is
unavailable, the loop written in C is going to be faster than the
equivalent bytecode.
Just another datapoint: Solaris 10 has sendfile.  So it's not just a 
Linux feature.


--
Andrew



Re: [PATCH] Bindings for ‘sendfile’

2013-03-21 Thread Ludovic Courtès
Andy Wingo wi...@pobox.com skribis:

 On Thu 21 Mar 2013 10:40, l...@gnu.org (Ludovic Courtès) writes:

 l...@gnu.org (Ludovic Courtès) writes:
 I plan to commit the patch below, which adds bindings for
‘sendfile’.

 Should probably go in gnulib at some point, no?

Yes, you’re right.  I’ll see what I can do.

Ludo’.



Re: [PATCH] Bindings for ‘sendfile’

2013-03-21 Thread Noah Lavine
Hello,

Yes, you're completely right - making it work on all platforms is much
better than what I had proposed. I'm glad you're doing this.

Thanks,
Noah


On Thu, Mar 21, 2013 at 5:15 AM, Ludovic Courtès l...@gnu.org wrote:

 Hi Noah,

 Noah Lavine noah.b.lav...@gmail.com skribis:

  I've thought for a while that if I had time (which I know I won't) I
 would
  make a module called (linux) with bindings for non-POSIX Linux kernel
  features. What do you think of this idea? If so, what do you think of
  putting sendfile there and expanding it with other functions as we need
  them?

 I’ve thought about it, but ended up with making sendfile work whether or
 not the syscall is available (just like glibc does, after all).

 So for this particular case, I’d rather keep it in the global name
 space.  There’s also the untold argument that even if sendfile(2) is
 unavailable, the loop written in C is going to be faster than the
 equivalent bytecode.

 FWIW, I plan to integrate the Linux bindings I wrote for “boot-to-Guile”
 eventually:


 http://git.savannah.gnu.org/cgit/guix.git/tree/gnu/packages/patches/guile-linux-syscalls.patch

 These are not defined in POSIX, but apart from the Linux module
 syscalls, they (that is, mount(2) and the networking ioctls) happen to
 be supported by glibc on all 3 kernels, and also by other libcs.  Thus,
 I’d rather keep them in the global name space as well.

 WDYT?

 Thanks,
 Ludo’.



Re: [PATCH] Bindings for ‘sendfile’

2013-03-20 Thread Noah Lavine
Hi,

sendfile looks very useful!

I've thought for a while that if I had time (which I know I won't) I would
make a module called (linux) with bindings for non-POSIX Linux kernel
features. What do you think of this idea? If so, what do you think of
putting sendfile there and expanding it with other functions as we need
them?

That would make it very clear when writing a program which functions were
portable and which weren't.

Best,
Noah


On Wed, Mar 20, 2013 at 6:21 PM, Ludovic Courtès l...@gnu.org wrote:

 Hi,

 I plan to commit the patch below, which adds bindings for ‘sendfile’.

 Comments?

 Ludo’.




Re: [PATCH] Bindings for ‘sendfile’

2013-03-20 Thread Nala Ginrut
On Wed, 2013-03-20 at 23:21 +0100, Ludovic Courtès wrote:
 Hi,
 
 I plan to commit the patch below, which adds bindings for ‘sendfile’.
 
 Comments?
 

As a server-develop fan, I definitely love it.
Besides, can we add more linux-specific features and add them into a
place like (ice-9 linux)?

 Ludo’.
 
 differences between files attachment
 (0001-Add-bindings-for-Linux-s-sendfile.patch), the patch
 From a10f5d5d69d63495cab5432d858b1af52a2bacbf Mon Sep 17 00:00:00 2001
 From: =?UTF-8?q?Ludovic=20Court=C3=A8s?= l...@gnu.org
 Date: Wed, 20 Mar 2013 23:04:11 +0100
 Subject: [PATCH] Add bindings for Linux's `sendfile'.
 
 * configure.ac: Check for sys/sendfile.h and `sendfile'.
 * libguile/filesys.c (scm_sendfile): New function.
 * libguile/filesys.h (scm_sendfile): New declaration.
 * test-suite/tests/filesys.test (sendfile): New test prefix.
 * doc/ref/posix.texi (File System): Document `sendfile'.
 ---
  configure.ac  |   20 --
  doc/ref/posix.texi|   23 +++
  libguile/filesys.c|   85 
 +
  libguile/filesys.h|4 +-
  test-suite/tests/filesys.test |   70 -
  5 files changed, 195 insertions(+), 7 deletions(-)
 
 diff --git a/configure.ac b/configure.ac
 index 42de733..bcfc1a6 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -647,12 +647,13 @@ AC_SUBST([SCM_I_GSC_HAVE_STRUCT_DIRENT64])
  # this file instead of fenv.h
  #   process.h - mingw specific
  #   sched.h - missing on MinGW
 +#   sys/sendfile.h - non-POSIX, found in glibc
  #
  AC_CHECK_HEADERS([complex.h fenv.h io.h libc.h limits.h memory.h process.h 
 string.h \
  sys/dir.h sys/ioctl.h sys/select.h \
  sys/time.h sys/timeb.h sys/times.h sys/stdtypes.h sys/types.h \
  sys/utime.h time.h unistd.h utime.h pwd.h grp.h sys/utsname.h \
 -direct.h machine/fpu.h sched.h])
 +direct.h machine/fpu.h sched.h sys/sendfile.h])
  
  # complex double is new in C99, and complex is only a keyword if
  # complex.h is included
 @@ -744,10 +745,21 @@ AC_CHECK_HEADERS([assert.h crt_externs.h])
  #   _NSGetEnviron - Darwin specific
  #   strcoll_l, newlocale - GNU extensions (glibc), also available on Darwin
  #   fork - unavailable on Windows
 -#   utimensat: posix.1-2008
 -#   sched_getaffinity, sched_setaffinity: GNU extensions (glibc)
 +#   utimensat - posix.1-2008
 +#   sched_getaffinity, sched_setaffinity - GNU extensions (glibc)
 +#   sendfile - non-POSIX, found in glibc
  #
 -AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid fesetround 
 ftime ftruncate fchown fchmod getcwd geteuid getsid gettimeofday gmtime_r 
 ioctl lstat mkdir mknod nice pipe _pipe readdir_r readdir64_r readlink rename 
 rmdir select setegid seteuid setlocale setpgid setsid sigaction siginterrupt 
 stat64 strftime strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname 
 waitpid strdup system usleep atexit on_exit chown link fcntl ttyname getpwent 
 getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp index 
 bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron strcoll strcoll_l 
 newlocale utimensat sched_getaffinity sched_setaffinity])
 +AC_CHECK_FUNCS([DINFINITY DQNAN cexp chsize clog clog10 ctermid  
 \
 +  fesetround ftime ftruncate fchown fchmod getcwd geteuid getsid \
 +  gettimeofday gmtime_r ioctl lstat mkdir mknod nice pipe _pipe  
 \
 +  readdir_r readdir64_r readlink rename rmdir select setegid seteuid \
 +  setlocale setpgid setsid sigaction siginterrupt stat64 strftime\
 +  strptime symlink sync sysconf tcgetpgrp tcsetpgrp times uname waitpid  
 \
 +  strdup system usleep atexit on_exit chown link fcntl ttyname getpwent  
 \
 +  getgrent kill getppid getpgrp fork setitimer getitimer strchr strcmp   
 \
 +  index bcopy memcpy rindex truncate unsetenv isblank _NSGetEnviron  \
 +  strcoll strcoll_l newlocale utimensat sched_getaffinity\
 +  sched_setaffinity sendfile])
  
  AM_CONDITIONAL([HAVE_FORK], [test x$ac_cv_func_fork = xyes])
  
 diff --git a/doc/ref/posix.texi b/doc/ref/posix.texi
 index d659cf3..ca02093 100644
 --- a/doc/ref/posix.texi
 +++ b/doc/ref/posix.texi
 @@ -803,6 +803,29 @@ Copy the file specified by @var{oldfile} to 
 @var{newfile}.
  The return value is unspecified.
  @end deffn
  
 +@deffn {Scheme Procedure} sendfile out in count [offset]
 +@deffnx {C Function} scm_sendfile (out, in, count, offset)
 +Send @var{count} bytes from @var{in} to @var{out}, both of which
 +are either open file ports or file descriptors.  When
 +@var{offset} is omitted, start reading from @var{in}'s current
 +position; otherwise, start reading at @var{offset}.
 +
 +When @var{in} is a port, it is often preferable to specify @var{offset},
 +because @var{in}'s offset as a port may be different from the offset of
 +its underlying file descriptor.
 +
 +On systems that support it, such as GNU/Linux, this procedure uses the
 +@code{sendfile} libc 

Re: [PATCH] Bindings for ‘sendfile’

2013-03-20 Thread Mark H Weaver
Hi Ludovic,

l...@gnu.org (Ludovic Courtès) writes:
 I plan to commit the patch below, which adds bindings for ‘sendfile’.

 Comments?

Looks great to me, modulo one comment below.

I especially like the fact that although it can make use of the
non-standard Linux syscall, it works properly on all systems.  In
response to suggestions by others that we create a linux module:
I'd prefer to follow the good precedent set by Ludovic here.

 diff --git a/libguile/filesys.c b/libguile/filesys.c
 index 282ff31..097b03a 100644
 --- a/libguile/filesys.c
 +++ b/libguile/filesys.c
 @@ -98,6 +98,14 @@
  
  #define NAMLEN(dirent)  strlen ((dirent)-d_name)
  
 +#ifdef HAVE_SYS_SENDFILE_H
 +# include sys/sendfile.h
 +#endif
 +
 +#include full-read.h
 +#include full-write.h

I didn't know about Gnulib's 'full-read' and 'full-write'.  Nice!
We should probably make more use of these throughout the tree.

 +SCM_DEFINE (scm_sendfile, sendfile, 3, 1, 0,
 + (SCM out, SCM in, SCM count, SCM offset),
 + Send @var{count} bytes from @var{in} to @var{out}, both of which 
 + are either open file ports or file descriptors.  When 
 + @var{offset} is omitted, start reading from @var{in}'s current 
 + position; otherwise, start reading at @var{offset}.)
 +#define FUNC_NAME s_scm_sendfile
 +{
 +#define VALIDATE_FD_OR_PORT(cvar, svar, pos) \
 +  if (scm_is_integer (svar)) \
 +cvar = scm_to_int (svar);\
 +  else   \
 +{\
 +  SCM_VALIDATE_OPFPORT (pos, svar);  \
 +  scm_flush (svar);  \
 +  cvar = SCM_FPORT_FDES (svar);  \
 +}
 +
 +  size_t c_count;
 +  off_t c_offset;
 +  ssize_t result;
 +  int in_fd, out_fd;
 +
 +  VALIDATE_FD_OR_PORT (out_fd, out, 1);
 +  VALIDATE_FD_OR_PORT (in_fd, in, 2);
 +  c_count = scm_to_size_t (count);

Since the code below will behave badly if 'c_count' does not fit in an
'ssize_t', we should validate here that it _does_ fit.

 +  c_offset = SCM_UNBNDP (offset) ? 0 : scm_to_off_t (offset);
 +
 +#ifdef HAVE_SENDFILE
 +  result = sendfile (out_fd, in_fd,
 +  SCM_UNBNDP (offset) ? NULL : c_offset,
 +  c_count);
 +
 +  /* Quoting the Linux man page: In Linux kernels before 2.6.33, out_fd
 + must refer to a socket.  Since Linux 2.6.33 it can be any file.
 + Fall back to read(2) and write(2) such an error happens.  */
 +  if (result  0  errno != EINVAL  errno != ENOSYS)
 +SCM_SYSERROR;
 +  else if (result  0)
 +#endif
 +  {
 +char buf[8192];
 +size_t left;
 +
 +if (!SCM_UNBNDP (offset))
 +  {
 + if (SCM_PORTP (in))
 +   scm_seek (in, offset, scm_from_int (SEEK_SET));
 + else
 +   lseek_or_lseek64 (in_fd, c_offset, SEEK_SET);
 +  }
 +
 +for (result = 0, left = c_count; result  c_count; )

If 'c_count's does not fit in a 'ssize_t', then this loop will go
forever and 'result' will wrap around to negative numbers and undefined
C behavior.

 +  {
 + size_t asked, obtained;
 +
 + asked = SCM_MIN (sizeof buf, left);
 + obtained = full_read (in_fd, buf, asked);
 + if (obtained  asked)
 +   SCM_SYSERROR;
 +
 + left -= obtained;
 +
 + obtained = full_write (out_fd, buf, asked);
 + if (obtained  asked)
 +   SCM_SYSERROR;
 +
 + result += obtained;
 +  }
 +  }
 +
 +  return scm_from_ssize_t (result);
 +
 +#undef VALIDATE_FD_OR_PORT
 +}
 +#undef FUNC_NAME
 +
  #endif /* HAVE_POSIX */

Other than that, looks beautiful to me :)

Thanks!
  Mark



Re: [PATCH] Bindings for ‘sendfile’

2013-03-20 Thread Mark H Weaver
Mark H Weaver m...@netris.org writes:
 +for (result = 0, left = c_count; result  c_count; )

 If 'c_count's does not fit in a 'ssize_t', then this loop will go
 forever and 'result' will wrap around to negative numbers and undefined
 C behavior.

Having just consulted the relevant C standards, I see that I was wrong
to claim that the loop will spin forever.  If a signed integer is
compared with an unsigned integer of equal rank, the signed integer is
converted to unsigned first.

The latter part of my statement (about undefined behavior) was true, and
invalidates any other predictions.  If 'result' overflows (and it will
if 'c_count' does not fit in a 'ssize_t' and nothing else goes wrong)
then the behavior of the program is completely undefined.  Modern C
compilers are increasingly fond of taking advantage of that fact, as
demonstrated by the recent discovery that they were optimizing out our
overflow checks.

http://stackoverflow.com/questions/14495636/strange-multiplication-behavior-in-guile-scheme-interpreter
http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=2355f01709eadfd5350c510cdb095b4e3f71f17c

  Mark