"Dennis Lee Bieber" <wlfr...@ix.netcom.com> wrote in message
news:mailman.12.1442794762.28679.python-l...@python.org...
On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
<james.harri...@gmail.com> declaimed the following:
There are a few things and more crop up as time goes on. For example,
over TCP it would be helpful to have a function to receive a specific
number of bytes or one to read bytes until reaching a certain delimiter
such as newline or zero or space etc. Even better would be to be able to
use the iteration protocol so you could just code next() and get the
next such chunk of read in a for loop. When sending it would be good to
just say to send a bunch of bytes but know that you will get told how
many were sent (or didn't get sent) if it fails. sock.sendall() doesn't
do that.
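
A rough sketch of the "tell me how many were sent" idea, built on plain
sock.send(). The names send_all_counted and PartialSendError are made up
for illustration and are not part of the socket module. Note that "sent"
here only means "accepted by the local stack", not "delivered to the
peer".

```python
import socket


class PartialSendError(OSError):
    """Hypothetical exception recording how many bytes were sent."""

    def __init__(self, sent, total, cause):
        super().__init__(f"sent {sent} of {total} bytes: {cause}")
        self.sent = sent
        self.total = total


def send_all_counted(sock, data):
    """Send all of data; on failure report the count sent so far."""
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        try:
            # send() may accept fewer bytes than offered, so loop.
            sent += sock.send(view[sent:])
        except OSError as exc:
            raise PartialSendError(sent, len(view), exc) from exc
    return sent
```

A caller can catch PartialSendError and inspect .sent to decide where to
resume or what to report.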
Note that the "buffer size" option on a TCP socket.recv() gives you
your "specific number of bytes" -- if available at that time.
"If" is a big word!
AIUI the buffer size is not guaranteed to relate to the number of bytes
returned except that you won't/shouldn't(!) get more than the buffer
size.
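
Since recv() may hand back fewer bytes than the buffer size, getting a
specific number of bytes means looping. A minimal sketch, assuming a
blocking TCP (stream) socket; recv_exactly is a hypothetical helper name,
not a socket method.

```python
import socket


def recv_exactly(sock, n):
    """Read exactly n bytes from a stream socket or raise ConnectionError."""
    chunks = []
    remaining = n
    while remaining > 0:
        # recv() returns at most `remaining` bytes, possibly fewer.
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty result on a stream socket means the peer closed.
            raise ConnectionError(
                f"peer closed with {remaining} bytes outstanding")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```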
I wouldn't want to use .recv(1) though to implement your "reaching a
certain delimiter"... Much better to read as much as available and
search it for the delimiter.
Yes, that's what I do at the moment. I keep a block of bytes, add any
new stuff to it and scan it for delimiters.
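
Roughly, the keep-a-block-and-scan approach can be wrapped in a generator
so that a for loop or next() yields one delimited chunk at a time. This
is only a sketch of the idea, not my actual code; iter_delimited and its
parameters are names invented for the example, and it assumes a blocking
stream socket.

```python
import socket


def iter_delimited(sock, delimiter=b"\n", bufsize=4096):
    """Yield chunks ending in delimiter; the delimiter is left on."""
    buffer = b""
    while True:
        pos = buffer.find(delimiter)
        if pos >= 0:
            end = pos + len(delimiter)
            yield buffer[:end]
            buffer = buffer[end:]
            continue
        # No complete chunk buffered yet: read whatever is available.
        data = sock.recv(bufsize)
        if not data:
            break  # peer closed the connection
        buffer += data
    if buffer:
        # Trailing bytes that arrived without a final delimiter.
        yield buffer
```

Usage would be along the lines of "for line in iter_delimited(sock):".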
I'll confess, adding a .readln() FOR TCP ONLY, might be a nice extension
over BSD sockets (might need to allow option for whether line-ends are
Internet standard <cr><lf> or some other marker, and whether they should
be converted upon reading to the native format for the host).
Akira Li pointed out that there is just such an extension: makefile.
Scanning to <lf> is what I do just now as that includes <cr><lf> too and
I leave them on the string. IIRC file.readline works in the same way.
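
For reference, a small sketch of the makefile route, assuming a blocking
stream socket. read_lines is just an illustrative wrapper name. In binary
mode the file object splits on <lf> and leaves the line ending on each
line.

```python
import socket


def read_lines(sock):
    """Yield lines from a stream socket via a buffered file object."""
    # makefile("rb") wraps the socket in a binary buffered reader.
    with sock.makefile("rb") as reader:
        for line in reader:
            yield line
```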
I thought UDP would deliver (or drop) a whole datagram but cannot find
anything in the Python documentation to guarantee that. In fact the
documentation for the send() call says that apps are responsible for
checking that all data has been sent. They may mean that to apply to
stream protocols only but it doesn't state that. (Of course, UDP
datagrams are limited in size so the call may validly indicate
incomplete transmission even when the first part of a big message is
sent successfully.)
Looking in the wrong documentation <G>
You probably should be looking at the UDP RFC. Or maybe just
http://www.diffen.com/difference/TCP_vs_UDP
"""
Packets are sent individually and are checked for integrity only if they
arrive. Packets have definite boundaries which are honored upon receipt,
meaning a read operation at the receiver socket will yield an entire
message as it was originally sent.
"""
I would rather see it in the Python docs because we program to the
language standard and there can be - and often are, for good reason -
areas where Python does not work in the same way as underlying systems.
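
For what it's worth, the boundary behaviour in the quoted passage can at
least be observed from Python on the loopback interface. This is an
experiment rather than a guarantee: udp_roundtrip is a made-up helper,
the messages are assumed small, and loopback is assumed not to drop or
reorder them.

```python
import socket


def udp_roundtrip(messages, timeout=2.0):
    """Send each message as one datagram; return what each read yields."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose
        receiver.settimeout(timeout)     # do not hang if one is lost
        address = receiver.getsockname()
        for message in messages:
            sender.sendto(message, address)
        received = []
        for _ in messages:
            # Each recvfrom() returns one datagram, never a merged pair.
            data, _peer = receiver.recvfrom(65535)
            received.append(data)
        return received
    finally:
        sender.close()
        receiver.close()
```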
Even if the IP layer has to fragment a UDP packet to meet limits of the
transport media, it should put them back together on the other end
before passing it up to the UDP layer. To my knowledge, UDP does not
have a size limit on the message (well -- a 16-bit length field in the
UDP header). But since it /is/ "got it all" or "dropped" with no
inherent confirmation, one would have to embed their own protocol within
it -- sequence numbers with ACK/NAK, for example. Problem: if using
LARGE UDP packets, this protocol would mean having LARGE resends should
packets be dropped or arrive out of sequence (and since the ACK/NAK
could be dropped too, you may have to handle the case of a duplicated
packet -- also large).
Yes, it was the 16-bit limitation that I was talking about.
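
To put numbers on that limit: the 16-bit UDP length field counts the
8-byte UDP header as well as the data, and over IPv4 the 16-bit total
length field also has to cover an IP header of at least 20 bytes. The
constant names below are just for illustration.

```python
LENGTH_FIELD_BITS = 16        # width of the UDP and IPv4 length fields
UDP_HEADER_BYTES = 8
IPV4_MIN_HEADER_BYTES = 20    # IPv4 header without options

# Largest value a 16-bit length field can hold.
max_length = 2 ** LENGTH_FIELD_BITS - 1

# Payload limit implied by the UDP length field alone.
max_udp_payload = max_length - UDP_HEADER_BYTES

# Payload limit when the datagram travels in a single IPv4 packet.
max_udp_payload_ipv4 = (
    max_length - IPV4_MIN_HEADER_BYTES - UDP_HEADER_BYTES)
```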
TCP is a stream protocol -- the protocol will ensure that all data
arrives, and that it arrives in order, but does not enforce any
boundaries on the data; what started as a relatively large packet at one
end may arrive as lots of small packets due to intermediate transport
limits (one can visualize a worst case: each TCP packet is broken up to
fit Hollerith cards; 20 bytes for header and 60 bytes of data -- then
fed to a reader and sent on AS-IS). Boundaries are the end-user
responsibility... line endings (look at SMTP, where an email message
ends on a line containing just a ".") or embedded length counter (not
the TCP packet length).
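
As an illustration of the embedded length counter option, here is a
minimal sketch that prefixes each message with a 4-byte big-endian
length. The header format and the names send_message and recv_message
are choices made for the example, not any standard.

```python
import socket
import struct

_HEADER = struct.Struct("!I")  # 4-byte unsigned length, network order


def send_message(sock, payload):
    """Send one message as a length header followed by the payload."""
    sock.sendall(_HEADER.pack(len(payload)) + payload)


def _recv_exactly(sock, n):
    """Read exactly n bytes from the stream or raise ConnectionError."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        buf += chunk
    return bytes(buf)


def recv_message(sock):
    """Receive one length-prefixed message, however TCP chunked it."""
    (length,) = _HEADER.unpack(_recv_exactly(sock, _HEADER.size))
    return _recv_exactly(sock, length)
```

With this framing an empty message is distinct from end of stream: the
former arrives as a header holding zero, the latter as no header at all.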
Yes.
Receiving no bytes is taken as indicating the end of the communication.
That's OK for TCP but not for UDP so there should be a way to
distinguish between the end of data and receiving an empty datagram.
I don't believe UDP supports a truly empty datagram (length of 0) --
presuming a sending stack actually sends one, the receiving stack will
probably drop it as there is no data to pass on to a client (there is a
PR at work because we have a UDP driver that doesn't drop 0-length
messages, but also can't deliver them -- so the circular buffer might
fill with undeliverable headers)
As others have pointed out, UDP implementations do seem to work with
zero-byte datagrams properly. Again, I would rather see that in the
Python documentation which is what, effectively, forms a contract that
we should be able to rely on.
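
A quick way to try the zero-byte case from Python, as an experiment on
the loopback interface rather than anything the documentation promises.
empty_datagram_roundtrip is a hypothetical name, and the result reflects
whatever the local stack does.

```python
import socket


def empty_datagram_roundtrip(timeout=2.0):
    """Send a zero-byte UDP datagram to ourselves and report the read."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS choose
        receiver.settimeout(timeout)     # raises if nothing arrives
        sender.sendto(b"", receiver.getsockname())
        # recvfrom() also returns the source address, so an empty
        # datagram is still a distinct event with a known sender.
        data, address = receiver.recvfrom(65535)
        sender_port = sender.getsockname()[1]
        return data, address, sender_port
```

On stacks that deliver the datagram, data comes back as b"" together
with the sender's address; if the stack dropped it, the timeout would
raise instead.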
James