Re: Lightwight socket IO wrapper

James Harris Tue, 22 Sep 2015 12:53:47 -0700

"Dennis Lee Bieber" <wlfr...@ix.netcom.com> wrote in message news:mailman.12.1442794762.28679.python-l...@python.org...

On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
<james.harri...@gmail.com> declaimed the following:
There are a few things and more crop up as time goes on. For example,
over TCP it would be helpful to have a function to receive a specific
number of bytes or one to read bytes until reaching a certain delimiter
such as newline or zero or space etc. Even better would be to be able to
use the iteration protocol so you could just code next() and get the
next such chunk of read in a for loop. When sending it would be good to
just say to send a bunch of bytes but know that you will get told how
many were sent (or didn't get sent) if it fails. Sock.sendall() doesn't
do that.
Note that the "buffer size" option on a TCP socket.recv() gives you
your "specific number of bytes" -- if available at that time.


"If" is a big word!

AIUI the buffer size is not guaranteed to relate to the number of bytes returned except that you won't/shouldn't(!) get more than the buffer size.
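
To illustrate the point, here is a minimal sketch of a "receive a specific number of bytes" helper for a stream socket. The name recv_exactly is made up for this example and is not part of the socket module; it assumes a blocking, connected TCP-style socket and simply keeps calling recv() until the requested count has arrived.

```python
import socket


def recv_exactly(sock, n):
    """Return exactly n bytes from a stream socket, or raise EOFError.

    Hypothetical helper, not part of the standard library.
    """
    chunks = []
    remaining = n
    while remaining > 0:
        # recv() may return fewer bytes than asked for, never more.
        chunk = sock.recv(remaining)
        if not chunk:
            # b"" on a stream socket means the peer closed the connection.
            raise EOFError("connection closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

The caller gets either the full count or an exception, so a short read can never be mistaken for a complete one.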

I wouldn't want to use .recv(1) though to implement your "reaching a
certain delimiter"... Much better to read as much as available and search
it for the delimiter.

Yes, that's what I do at the moment. I keep a block of bytes, add any new stuff to it and scan it for delimiters.
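
Roughly, that approach can be sketched as below. This is an illustration only, not my actual code: the class name DelimitedReader and its parameters are invented for the example, and it assumes a blocking stream socket. It also shows the iteration-protocol idea mentioned earlier, so the chunks can be consumed in a for loop.

```python
import socket


class DelimitedReader:
    """Iterate over delimiter-terminated chunks read from a stream socket.

    Hypothetical class for illustration; the delimiter is left on each chunk.
    """

    def __init__(self, sock, delimiter=b"\n", bufsize=4096):
        self.sock = sock
        self.delimiter = delimiter
        self.bufsize = bufsize
        self.buffer = b""          # bytes received but not yet handed out

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            pos = self.buffer.find(self.delimiter)
            if pos >= 0:
                end = pos + len(self.delimiter)
                chunk, self.buffer = self.buffer[:end], self.buffer[end:]
                return chunk
            data = self.sock.recv(self.bufsize)   # read as much as is available
            if not data:                          # peer closed the connection
                if self.buffer:
                    # Hand back any unterminated tail once, then stop.
                    chunk, self.buffer = self.buffer, b""
                    return chunk
                raise StopIteration
            self.buffer += data
```

Usage would be along the lines of "for line in DelimitedReader(sock): ...". The buffer is rescanned from the start on every pass, which is fine for a sketch but wasteful for very long chunks.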

I'll confess, adding a .readln() FOR TCP ONLY, might
be a nice extension over BSD sockets (might need to allow option for
whether line-ends are Internet standard <cr><lf> or some other marker,
and whether they should be converted upon reading to the native format
for the host).

Akira Li pointed out that there is just such an extension: makefile. Scanning to <lf> is what I do just now as that includes <cr><lf> too and I leave them on the string. IIRC file.readline works in the same way.
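
For reference, a small sketch of the makefile route. socket.makefile() is the real standard-library method; the wrapper name read_lines is just for this example, and the socket is assumed to be blocking and connected.

```python
import socket


def read_lines(sock):
    """Yield lines, terminator included, from a connected stream socket.

    Hypothetical wrapper around the standard socket.makefile() method.
    """
    # "rb" gives a buffered binary file object layered over the socket.
    with sock.makefile("rb") as stream:
        # Iteration splits on b"\n", so a b"\r\n" ending stays on the line.
        for line in stream:
            yield line
```

Nothing is converted to the host's native line ending here; each line comes back exactly as it was sent.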

I thought UDP would deliver (or drop) a whole datagram but cannot find
anything in the Python documentation to guarantee that. In fact
documentation for the send() call says that apps are responsible for
checking that all data has been sent. They may mean that to apply to
stream protocols only but it doesn't state that. (Of course, UDP
datagrams are limited in size so the call may validly indicate
incomplete transmission even when the first part of a big message is
sent successfully.)

Looking in the wrong documentation <G>

You probably should be looking at the UDP RFC. Or maybe just

http://www.diffen.com/difference/TCP_vs_UDP

"""

Packets are sent individually and are checked for integrity only if they
arrive. Packets have definite boundaries which are honored upon receipt,
meaning a read operation at the receiver socket will yield an entire
message as it was originally sent.
"""

I would rather see it in the Python docs because we program to the language standard and there can be - and often are, for good reason - areas where Python does not work in the same way as underlying systems.
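
In the absence of a documented guarantee, the boundary behaviour can at least be observed on a given platform. The sketch below is an experiment, not a contract: the function name udp_roundtrip is invented, it uses the loopback interface only, and a lost datagram would show up as a socket timeout.

```python
import socket


def udp_roundtrip(messages, timeout=2.0):
    """Send each message as one UDP datagram over loopback and return
    the payload reported by each recvfrom() call.

    Hypothetical helper for experimenting; proves nothing about other stacks.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        receiver.settimeout(timeout)        # time out rather than block forever
        for message in messages:
            sender.sendto(message, receiver.getsockname())
        # One recvfrom() per datagram sent; the buffer is large enough for any payload.
        return [receiver.recvfrom(65535)[0] for _ in messages]
    finally:
        sender.close()
        receiver.close()
```

If datagram boundaries are honoured, each element of the returned list is one complete message and none are merged or split.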

Even if the IP layer has to fragment a UDP packet to meet limits of the
transport media, it should put them back together on the other end before
passing it up to the UDP layer. To my knowledge, UDP does not have a size
limit on the message (well -- a 16-bit length field in the UDP header). But
since it /is/ "got it all" or "dropped" with no inherent confirmation, one
would have to embed their own protocol within it -- sequence numbers with
ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
would mean having LARGE resends should packets be dropped or arrive out of
sequence (and since the ACK/NAK could be dropped too, you may have to
handle the case of a duplicated packet -- also large).


Yes, it was the 16-bit limitation that I was talking about.
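
For concreteness, the arithmetic behind that limit, assuming IPv4 with a minimum-size header and no options (the constant names are just for this example):

```python
# Sizes in bytes; IPv4 without options is assumed.
UDP_LENGTH_FIELD_BITS = 16
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20
IPV4_MAX_TOTAL_LENGTH = 2 ** 16 - 1     # IPv4 total-length field is also 16 bits

# The UDP length field counts the UDP header plus the payload.
max_udp_length = 2 ** UDP_LENGTH_FIELD_BITS - 1
max_payload_by_udp_field = max_udp_length - UDP_HEADER_BYTES

# Over IPv4 the whole packet, IP header included, must fit in 65535 bytes.
max_payload_over_ipv4 = IPV4_MAX_TOTAL_LENGTH - IPV4_HEADER_BYTES - UDP_HEADER_BYTES

print(max_payload_by_udp_field)   # 65527
print(max_payload_over_ipv4)      # 65507
```

So a send of anything larger than that cannot go out as a single datagram, whatever the application does.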

TCP is a stream protocol -- the protocol will ensure that all data
arrives, and that it arrives in order, but does not enforce any boundaries
on the data; what started as a relatively large packet at one end may
arrive as lots of small packets due to intermediate transport limits (one
can visualize a worst case: each TCP packet is broken up to fit Hollerith
cards; 20 bytes for header and 60 bytes of data -- then fed to a reader and
sent on AS-IS). Boundaries are the end-user responsibility... line endings
(look at SMTP, where an email message ends on a line containing just a ".")
or embedded length counter (not the TCP packet length).


Yes.
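
The embedded length counter option could be sketched like this. The 4-byte big-endian prefix is an arbitrary choice for the example, and send_message, recv_message and _recv_exactly are made-up names; a blocking stream socket is assumed.

```python
import socket
import struct

# Assumed wire format: 4-byte unsigned length in network byte order, then the payload.
HEADER = struct.Struct("!I")


def send_message(sock, payload):
    """Send one length-prefixed message (hypothetical helper)."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def _recv_exactly(sock, n):
    """Read exactly n bytes from the stream, or raise EOFError."""
    data = bytearray()
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise EOFError("connection closed mid-message")
        data += chunk
    return bytes(data)


def recv_message(sock):
    """Receive one length-prefixed message (hypothetical helper)."""
    (length,) = HEADER.unpack(_recv_exactly(sock, HEADER.size))
    return _recv_exactly(sock, length)
```

With the length sent explicitly, an empty message (length 0) is distinct from the end of the stream, which raises EOFError instead.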

Receiving no bytes is taken as indicating the end of the communication.
That's OK for TCP but not for UDP so there should be a way to
distinguish between the end of data and receiving an empty datagram.
I don't believe UDP supports a truly empty datagram (length of 0) --
presuming a sending stack actually sends one, the receiving stack will
probably drop it as there is no data to pass on to a client (there is a
PR at work because we have a UDP driver that doesn't drop 0-length messages,
but also can't deliver them -- so the circular buffer might fill with
undeliverable headers)

As others have pointed out, UDP implementations do seem to work with zero-byte datagrams properly. Again, I would rather see that in the Python documentation which is what, effectively, forms a contract that we should be able to rely on.
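
A quick way to check a particular stack is sketched below. It is an experiment over loopback only, the function name is invented, and the result depends on the operating system's UDP implementation rather than on anything the Python docs promise.

```python
import socket


def send_and_receive_empty_datagram(timeout=2.0):
    """Send a zero-length UDP datagram over loopback and return
    the (data, address) pair reported by recvfrom().

    Hypothetical helper; raises socket.timeout if nothing is delivered.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))     # port 0: let the OS pick a free port
        receiver.settimeout(timeout)        # time out rather than block forever
        sender.sendto(b"", receiver.getsockname())
        return receiver.recvfrom(1024)
    finally:
        sender.close()
        receiver.close()
```

Where the stack delivers the datagram, recvfrom() returns empty bytes together with the sender's address, so for UDP an empty result cannot be read as "end of data" the way it is for TCP.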


James

--
https://mail.python.org/mailman/listinfo/python-list
