"Dennis Lee Bieber" <wlfr...@ix.netcom.com> wrote in message news:mailman.12.1442794762.28679.python-l...@python.org...
On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
<james.harri...@gmail.com> declaimed the following:



There are a few things and more crop up as time goes on. For example,
over TCP it would be helpful to have a function to receive a specific
number of bytes or one to read bytes until reaching a certain delimiter such as newline or zero or space etc. Even better would be to be able to
use the iteration protocol so you could just code next() and get the
next such chunk of read in a for loop. When sending it would be good to
just say to send a bunch of bytes but know that you will get told how
many were sent (or didn't get sent) if it fails. Sock.sendall() doesn't
do that.
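
A minimal sketch of the two helpers described above, assuming a blocking TCP socket. The names recv_exactly and send_counted are made up for illustration and are not part of the socket module. Note that the count from send_counted is only what was handed to the local stack, not what the peer has received.

```python
import socket


def recv_exactly(sock, count):
    """Read exactly `count` bytes, or raise EOFError if the peer closes first."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than requested
        if not chunk:  # an empty result means the peer closed the stream
            raise EOFError("closed with %d bytes outstanding" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_counted(sock, data):
    """Try to send all of `data`.

    Returns (bytes_sent, error); error is None when everything was sent.
    Unlike sendall(), the caller learns how far the send got before failing.
    """
    view = memoryview(data)
    sent = 0
    try:
        while sent < len(view):
            sent += sock.send(view[sent:])
    except OSError as exc:
        return sent, exc
    return sent, None
```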

Note that the "buffer size" option on a TCP socket.recv() gives you
your "specific number of bytes" -- if available at that time.

"If" is a big word!

AIUI the buffer size is not guaranteed to relate to the number of bytes returned except that you won't/shouldn't(!) get more than the buffer size.

I wouldn't want to use .recv(1) though to implement your "reaching a
certain delimiter"... Much better to read as much as available and search
it for the delimiter.

Yes, that's what I do at the moment. I keep a block of bytes, add any new stuff to it and scan it for delimiters.
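
The keep-a-block-and-scan approach can be wrapped in a generator so that a for loop or next() hands back one delimited chunk at a time. This is only a sketch under the assumption of a blocking stream socket; iter_delimited is a hypothetical name, and the delimiter is left on each chunk.

```python
import socket


def iter_delimited(sock, delimiter=b"\n", bufsize=4096):
    """Yield successive chunks from `sock`, each ending with `delimiter`.

    Any unterminated tail left over when the peer closes is yielded last.
    """
    pending = b""
    while True:
        end = pending.find(delimiter)
        if end != -1:
            end += len(delimiter)
            yield pending[:end]       # one complete chunk, delimiter included
            pending = pending[end:]
            continue                  # the buffer may hold further chunks
        data = sock.recv(bufsize)     # read whatever is available, up to bufsize
        if not data:                  # peer closed the connection
            break
        pending += data
    if pending:
        yield pending
```

Usage would look like "for line in iter_delimited(conn): ...", where conn is an already-connected socket.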

I'll confess, adding a .readln() FOR TCP ONLY, might
be a nice extension over BSD sockets (might need to allow option for
whether line-ends are Internet standard <cr><lf> or some other marker, and whether they should be converted upon reading to the native format for the
host).

Akira Li pointed out that there is just such an extension: makefile. Scanning to <lf> is what I do just now as that includes <cr><lf> too and I leave them on the string. IIRC file.readline works in the same way.
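
For reference, a small sketch of the makefile route, assuming a blocking socket and binary mode. read_lines is a hypothetical wrapper; makefile() itself is the real socket method. Lines come back split on <lf> with the line ending left in place.

```python
import socket


def read_lines(sock):
    """Return every line readable from `sock` until the peer closes.

    Binary mode splits on b"\n" only, so a <cr><lf> pair stays on the line.
    """
    with sock.makefile("rb") as reader:
        return list(reader)  # iterating a file object yields one line at a time
```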

I thought UDP would deliver (or drop) a whole datagram but cannot find
anything in the Python documentation to guarantee that. In fact
documentation for the send() call says that apps are responsible for
checking that all data has been sent. They may mean that to apply to
stream protocols only but it doesn't state that. (Of course, UDP
datagrams are limited in size so the call may validly indicate
incomplete transmission even when the first part of a big message is
sent successfully.)

Looking in the wrong documentation <G>

You probably should be looking at the UDP RFC. Or maybe just

http://www.diffen.com/difference/TCP_vs_UDP

"""
Packets are sent individually and are checked for integrity only if they arrive. Packets have definite boundaries which are honored upon receipt,
meaning a read operation at the receiver socket will yield an entire
message as it was originally sent.
"""
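
A quick way to check the boundary behaviour on a given machine is to send several datagrams over loopback and see what each recvfrom() returns. This is a sketch only: udp_roundtrip is a made-up name, and it assumes the loopback interface delivers these few small datagrams in order without dropping any, which UDP itself does not promise.

```python
import socket


def udp_roundtrip(messages, bufsize=65535, timeout=2.0):
    """Send each message as its own datagram over loopback, then return
    what each successive recvfrom() call produced."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(timeout)      # fail rather than hang on a lost datagram
        addr = receiver.getsockname()
        for msg in messages:
            sender.sendto(msg, addr)
        received = []
        for _ in messages:
            data, _peer = receiver.recvfrom(bufsize)  # one call, one datagram
            received.append(data)
        return received
    finally:
        sender.close()
        receiver.close()
```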

I would rather see it in the Python docs because we program to the language standard and there can be - and often are, for good reason - areas where Python does not work in the same way as underlying systems.

Even if the IP layer has to fragment a UDP packet to meet limits of the transport media, it should put them back together on the other end before passing it up to the UDP layer. To my knowledge, UDP does not have a size limit on the message (well -- a 16-bit length field in the UDP header). But since it /is/ "got it all" or "dropped" with no inherent confirmation, one would have to embed their own protocol within it -- sequence numbers with ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol would mean having LARGE resends should packets be dropped or arrive out of
sequence (and since the ACK/NAK could be dropped too, you may have to
handle the case of a duplicated packet -- also large).
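
To make the sequence-number idea concrete, here is a rough sketch of the receiving side only. Everything in it is an assumption for illustration: the 4-byte header layout, the names make_packet and Receiver, and the choice of a cumulative "next expected" acknowledgement. It is not a complete reliable-delivery protocol (no timers, no retransmission).

```python
import struct

SEQ = struct.Struct("!I")  # assumed header: 4-byte big-endian sequence number


def make_packet(seq, payload):
    """Prefix `payload` with its sequence number."""
    return SEQ.pack(seq) + payload


class Receiver:
    """Delivers packets in order; duplicates and out-of-order ones are dropped."""

    def __init__(self):
        self.expected = 0  # sequence number of the next packet to deliver

    def handle(self, packet):
        """Process one datagram.

        Returns (next_expected, payload_or_None). The first item is the
        acknowledgement to send back; the second is None when the packet
        was a duplicate or arrived ahead of a missing one.
        """
        (seq,) = SEQ.unpack_from(packet)
        delivered = None
        if seq == self.expected:
            delivered = packet[SEQ.size:]
            self.expected += 1
        return self.expected, delivered
```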

Yes, it was the 16-bit limitation that I was talking about.

TCP is a stream protocol -- the protocol will ensure that all data
arrives, and that it arrives in order, but does not enforce any boundaries
on the data; what started as a relatively large packet at one end may
arrive as lots of small packets due to intermediate transport limits (one can visualize a worst case: each TCP packet is broken up to fit Hollerith cards; 20 bytes for header and 60 bytes of data -- then fed to a reader and sent on AS-IS). Boundaries are the end-user responsibility... line endings (look at SMTP, where an email message ends on a line containing just a ".")
or embedded length counter (not the TCP packet length).
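
The embedded-length-counter option might look like the sketch below. The 4-byte big-endian prefix and the function names are assumptions chosen for illustration, not any standard. One side effect is that a zero-length message becomes distinguishable from the peer closing the stream.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length


def send_message(sock, payload):
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exactly(sock, count):
    """Read exactly `count` bytes or raise EOFError if the stream ends early."""
    buf = bytearray()
    while len(buf) < count:
        chunk = sock.recv(count - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-message")
        buf += chunk
    return bytes(buf)


def recv_message(sock):
    """Receive one message, however TCP happened to split it in transit."""
    (length,) = HEADER.unpack(_recv_exactly(sock, HEADER.size))
    return _recv_exactly(sock, length)
```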

Yes.

Receiving no bytes is taken as indicating the end of the communication.
That's OK for TCP but not for UDP so there should be a way to
distinguish between the end of data and receiving an empty datagram.

I don't believe UDP supports a truly empty datagram (length of 0) --
presuming a sending stack actually sends one, the receiving stack will
probably drop it as there is no data to pass on to a client (there is a PR at work because we have a UDP driver that doesn't drop 0-length messages,
but also can't deliver them -- so the circular buffer might fill with
undeliverable headers)

As others have pointed out, UDP implementations do seem to work with zero-byte datagrams properly. Again, I would rather see that in the Python documentation which is what, effectively, forms a contract that we should be able to rely on.
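
For anyone wanting to check their own platform, a sketch along these lines sends a zero-byte datagram over loopback and reports what recvfrom() returns. send_empty_datagram is a made-up name, and the behaviour is whatever the local stack does, not something the Python docs promise. The point to notice is that recvfrom() also returns the sender's address, so an empty datagram still shows up as an event with a peer attached, unlike the end-of-stream case on TCP.

```python
import socket


def send_empty_datagram(timeout=2.0):
    """Send a zero-byte UDP datagram over loopback.

    Returns (data, peer_address) as produced by recvfrom(); raises a
    timeout error if nothing arrives within `timeout` seconds.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(timeout)
        sender.sendto(b"", receiver.getsockname())
        return receiver.recvfrom(1024)
    finally:
        sender.close()
        receiver.close()
```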

James

--
https://mail.python.org/mailman/listinfo/python-list
