Re: Lightwight socket IO wrapper

2015-09-22 Thread James Harris
"Dennis Lee Bieber"  wrote in message 
news:mailman.12.1442794762.28679.python-l...@python.org...

On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
 declaimed the following:




There are a few things and more crop up as time goes on. For example,
over TCP it would be helpful to have a function to receive a specific
number of bytes or one to read bytes until reaching a certain delimiter
such as newline or zero or space etc. Even better would be to be able to
use the iteration protocol so you could just code next() and get the
next such chunk of read in a for loop. When sending it would be good to
just say to send a bunch of bytes but know that you will get told how
many were sent (or didn't get sent) if it fails. Sock.sendall() doesn't
do that.


Note that the "buffer size" option on a TCP socket.recv() gives you
your "specific number of bytes" -- if available at that time.


"If" is a big word!

AIUI the buffer size is not guaranteed to relate to the number of bytes 
returned except that you won't/shouldn't(!) get more than the buffer 
size.
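
A minimal sketch of a "receive exactly N bytes" helper built on that
behaviour, assuming a connected TCP (stream) socket. The name recv_exact
is hypothetical and not part of the socket module:

```python
import socket


def recv_exact(sock, nbytes):
    """Hypothetical helper: loop on recv() until exactly nbytes have arrived."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        chunk = sock.recv(remaining)   # may return fewer bytes than requested
        if not chunk:                  # b"" on a stream socket: peer closed
            raise EOFError("closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Passing the remaining count as the buffer size means the helper never
reads past the requested amount, so nothing has to be pushed back.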



I wouldn't want to use .recv(1) though to implement your "reaching a
certain delimiter"... Much better to read as much as available and search
it for the delimiter.


Yes, that's what I do at the moment. I keep a block of bytes, add any 
new stuff to it and scan it for delimiters.
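
The approach described above (keep a block of bytes, append new data,
scan for delimiters) could be sketched roughly like this, with the
iteration protocol on top. The class name DelimitedReader and its
parameters are made up for illustration:

```python
import socket


class DelimitedReader:
    """Hypothetical wrapper: buffer recv() data and split it on a delimiter."""

    def __init__(self, sock, delimiter=b"\n", bufsize=4096):
        self.sock = sock
        self.delimiter = delimiter
        self.bufsize = bufsize
        self.buffer = b""

    def __iter__(self):
        return self

    def __next__(self):
        while True:
            index = self.buffer.find(self.delimiter)
            if index >= 0:
                end = index + len(self.delimiter)
                chunk, self.buffer = self.buffer[:end], self.buffer[end:]
                return chunk            # the delimiter is left on the chunk
            data = self.sock.recv(self.bufsize)
            if not data:                # peer closed the connection
                if self.buffer:
                    chunk, self.buffer = self.buffer, b""
                    return chunk        # final, unterminated chunk
                raise StopIteration
            self.buffer += data
```

With that in place, "for line in DelimitedReader(sock):" yields one
delimited chunk per iteration until the peer closes the connection.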



I'll confess, adding a .readln() FOR TCP ONLY, might
be a nice extension over BSD sockets (might need to allow option for
whether line-ends are Internet standard  or some other marker, and
whether they should be converted upon reading to the native format for the
host).


Akira Li pointed out that there is just such an extension: makefile. 
Scanning to  is what I do just now as that includes  too and 
I leave them on the string. IIRC file.readline works in the same way.



I thought UDP would deliver (or drop) a whole datagram but cannot find
anything in the Python documentation to guarantee that. In fact
documentation for the send() call says that apps are responsible for
checking that all data has been sent. They may mean that to apply to
stream protocols only but it doesn't state that. (Of course, UDP
datagrams are limited in size so the call may validly indicate
incomplete transmission even when the first part of a big message is
sent successfully.)


Looking in the wrong documentation 

You probably should be looking at the UDP RFC. Or maybe just

http://www.diffen.com/difference/TCP_vs_UDP

"""
Packets are sent individually and are checked for integrity only if they
arrive. Packets have definite boundaries which are honored upon receipt,
meaning a read operation at the receiver socket will yield an entire
message as it was originally sent.
"""


I would rather see it in the Python docs because we program to the 
language standard and there can be - and often are, for good reason - 
areas where Python does not work in the same way as underlying systems.


Even if the IP layer has to fragment a UDP packet to meet limits of the
transport media, it should put them back together on the other end before
passing it up to the UDP layer. To my knowledge, UDP does not have a size
limit on the message (well -- a 16-bit length field in the UDP header). But
since it /is/ "got it all" or "dropped" with no inherent confirmation, one
would have to embed their own protocol within it -- sequence numbers with
ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
would mean having LARGE resends should packets be dropped or arrive out of
sequence (and since the ACK/NAK could be dropped too, you may have to
handle the case of a duplicated packet -- also large).


Yes, it was the 16-bit limitation that I was talking about.


TCP is a stream protocol -- the protocol will ensure that all data
arrives, and that it arrives in order, but does not enforce any boundaries
on the data; what started as a relatively large packet at one end may
arrive as lots of small packets due to intermediate transport limits (one
can visualize a worst case: each TCP packet is broken up to fit Hollerith
cards; 20 bytes for header and 60 bytes of data -- then fed to a reader and
sent on AS-IS). Boundaries are the end-user responsibility... line endings
(look at SMTP, where an email message ends on a line containing just a ".")
or embedded length counter (not the TCP packet length).


Yes.

Receiving no bytes is taken as indicating the end of the communication.
That's OK for TCP but not for UDP so there should be a way to
distinguish between the end of data and receiving an empty datagram.


I don't believe UDP supports a truly empty datagram (length of 0) --
presuming a sending stack actually sends one, the receiving stack will
probably drop it as there is no data to pass on to a client (there is a PR
at work because we have a UDP driver that doesn't drop 0-length messages,
but also can't deliver them -- so the circular

Re: Lightwight socket IO wrapper

2015-09-22 Thread James Harris
"Marko Rauhamaa"  wrote in message 
news:8737y6cgp6@elektro.pacujo.net...

"James Harris" :


I agree with what you say. A zero-length UDP datagram should be
possible and not indicate end of input but is that guaranteed and
portable?


The zero-length payload size shouldn't be an issue, but UDP doesn't make
any guarantees about delivering the message. Your UDP application must
be prepared for some, most or all of the messages disappearing without
any error indication.

In practice, you'd end up implementing your own TCP on top of UDP
(retries, timeouts, acknowledgements, sequence numbers etc).


The unreliability of UDP was not the case in point here. Rather, it was 
about whether different platforms could be relied upon to deliver 
zero-length datagrams to the app if the datagrams got safely across the 
network.


James

--
https://mail.python.org/mailman/listinfo/python-list


Re: Lightwight socket IO wrapper

2015-09-22 Thread Gregory Ewing

Random832 wrote:


Isn't this technically the same problem as pressing ctrl-d at a terminal
- it's not _really_ the end of the input (you can continue reading
after), but it sends the program something it will interpret as such?


Yes. There's no concept of "closing the connection" with UDP,
because there's no connection. So if a read returns 0 bytes,
it must be because someone sent you a 0-length datagram.
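
For anyone who wants to check this on their own platform, a small
loopback experiment along these lines should do. It is only a sketch:
whether the empty datagram is delivered depends on the local stack, and
the timeout is there so the script does not hang if it is dropped:

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
receiver.settimeout(2.0)               # do not block forever if it is dropped
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

sender.sendto(b"", receiver.getsockname())   # a datagram with no payload
data, address = receiver.recvfrom(1024)

print(repr(data))    # an empty bytes object if the datagram was delivered
sender.close()
receiver.close()
```

A socket.timeout exception here would mean the datagram never reached
the application, which is the case James is asking about.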

--
Greg


Re: Lightwight socket IO wrapper

2015-09-22 Thread James Harris
"Akira Li" <4kir4...@gmail.com> wrote in message 
news:mailman.18.1442804862.28679.python-l...@python.org...

"James Harris"  writes:
...

There are a few things and more crop up as time goes on. For example,
over TCP it would be helpful to have a function to receive a specific
number of bytes or one to read bytes until reaching a certain
delimiter such as newline or zero or space etc.


The answer is sock.makefile('rb') then `file.read(nbytes)` returns a
specific number of bytes.


Thanks, I hadn't seen that. Now I know of it I see references to it all 
over the place but beforehand it was in hiding


It is exactly the type of convenience wrapper I was expecting Python to 
have but expected it to be in another module. It looks as though it will 
definitely cover some of the issues I had.
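
A short sketch of what that looks like in use. socket.socketpair() stands
in for a real TCP connection here, and the payload is invented for the
example:

```python
import socket

left, right = socket.socketpair()     # stand-in for a connected TCP socket
left.sendall(b"HEADER12first line\nsecond line\n")
left.close()

reader = right.makefile("rb")         # buffered binary file over the socket
header = reader.read(8)               # blocks until 8 bytes arrive or EOF
line1 = reader.readline()             # up to and including b"\n"
line2 = reader.readline()
rest = reader.read()                  # b"" once the peer has closed
reader.close()
right.close()
```

Note that the file object keeps its own buffer, so mixing reader.read()
with direct right.recv() calls on the same socket can lose data.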



`file.readline()` reads until newline (b'\n'). There is a Python issue:
"Add support for reading records with arbitrary separators to the
standard IO stack"
 http://bugs.python.org/issue1152248
See also
 http://bugs.python.org/issue17083

Perhaps, it is easier to implement read_until(sep) that is best suited
for a particular case.


OK.

...

When sending it would be good to just say to send a bunch of bytes but
know that you will get told how many were sent (or didn't get sent) if
it fails. Sock.sendall() doesn't do that.


sock.send() returns the number of bytes sent that may be less than given.

You could reimplement sock.sendall() to include the number of bytes
successfully sent in case of an error.


I know. As mentioned, I wondered if there were already such functions to 
save me using my own.
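
For reference, such a reimplementation might look roughly like this. Both
send_all_counted and PartialSendError are hypothetical names, not
anything in the standard library:

```python
import socket


class PartialSendError(OSError):
    """Hypothetical exception carrying the number of bytes already sent."""

    def __init__(self, sent, total, cause):
        super().__init__("sent %d of %d bytes: %s" % (sent, total, cause))
        self.sent = sent
        self.total = total


def send_all_counted(sock, data):
    """Like sendall(), but report progress if the send fails part-way."""
    view = memoryview(data)            # slicing a memoryview avoids copies
    sent = 0
    while sent < len(view):
        try:
            sent += sock.send(view[sent:])   # send() may accept only a part
        except OSError as exc:
            raise PartialSendError(sent, len(view), exc) from exc
    return sent
```

The count says how much the local stack accepted, not how much the peer
actually received, so it is only a lower-level hint for recovery.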


I thought UDP would deliver (or drop) a whole datagram but cannot find
anything in the Python documentation to guarantee that. In fact
documentation for the send() call says that apps are responsible for
checking that all data has been sent. They may mean that to apply to
stream protocols only but it doesn't state that. (Of course, UDP
datagrams are limited in size so the call may validly indicate
incomplete transmission even when the first part of a big message is
sent successfully.)

Receiving no bytes is taken as indicating the end of the
communication. That's OK for TCP but not for UDP so there should be a
way to distinguish between the end of data and receiving an empty
datagram.


There is no end of communication in UDP and therefore there is no end of
data. If you get zero bytes in return then it means that you've
received a zero-length datagram.

sock.recvfrom() is a thin wrapper around the corresponding C
function. You could read any docs you like about UDP sockets.

http://stackoverflow.com/questions/5307031/how-to-detect-receipt-of-a-0-length-udp-datagram


As mentioned to Dennis just now, I would prefer to write code to conform 
with the documented behaviour of Python and its libraries, as long as 
they were known to be reliable implementations of what was documented, 
of course.


I agree with what you say. A zero-length UDP datagram should be possible 
and not indicate end of input but is that guaranteed and portable? 
(Rhetorical.)  It seems not. Even the Linux man page for recv says: "If 
no  messages  are  available  at  the  socket, the receive calls wait 
for a message to arrive, unless the socket is nonblocking" In that 
case, of course, what it defines as a "message" - and whether it can be 
zero length or not - is not stated.



The recv calls require a buffer size to be supplied which is a
technical detail. A Python wrapper could save the programmer dealing
with that.


It is not just a buffer size. It is the maximum amount of data to be
received at once i.e., sock.recv() may return less but never more.


My point was that we might want to request the entire next line or next 
field of input and not know a maximum length. *C* programmers are used 
to giving buffers fixed sizes often because then they can avoid fiddling 
with memory management but Python normally does that for us. I was 
suggesting that the thin wrapper around the socket recv() call is too 
thin! The makefile() approach that you mentioned seems more Pythonesque, 
though.



You could use makefile() and read() if recv() is too low-level.


Yes.

James



Re: Lightwight socket IO wrapper

2015-09-22 Thread Marko Rauhamaa
"James Harris" :

> I agree with what you say. A zero-length UDP datagram should be
> possible and not indicate end of input but is that guaranteed and
> portable?

The zero-length payload size shouldn't be an issue, but UDP doesn't make
any guarantees about delivering the message. Your UDP application must
be prepared for some, most or all of the messages disappearing without
any error indication.

In practice, you'd end up implementing your own TCP on top of UDP
(retries, timeouts, acknowledgements, sequence numbers etc).


Marko


Re: Lightwight socket IO wrapper

2015-09-22 Thread Random832
On Tue, Sep 22, 2015, at 15:45, James Harris wrote:
> "Dennis Lee Bieber"  wrote in message 
> news:mailman.12.1442794762.28679.python-l...@python.org...
> > On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
> >  declaimed the following:
> >>Receiving no bytes is taken as indicating the end of the 
> >>communication.
> >>That's OK for TCP but not for UDP so there should be a way to
> >>distinguish between the end of data and receiving an empty datagram.
> >>
> > I don't believe UDP supports a truly empty datagram (length of 0) --
> > presuming a sending stack actually sends one, the receiving stack will
> > probably drop it as there is no data to pass on to a client (there is a PR
> > at work because we have a UDP driver that doesn't drop 0-length messages,
> > but also can't deliver them -- so the circular buffer might fill with
> > undeliverable headers)
> 
> As others have pointed out, UDP implementations do seem to work with 
> zero-byte datagrams properly. Again, I would rather see that in the 
> Python documentation which is what, effectively, forms a contract that 
> we should be able to rely on.

Isn't this technically the same problem as pressing ctrl-d at a terminal
- it's not _really_ the end of the input (you can continue reading
after), but it sends the program something it will interpret as such?


Re: Lightwight socket IO wrapper

2015-09-22 Thread Jorgen Grahn
On Mon, 2015-09-21, Cameron Simpson wrote:
> On 21Sep2015 10:34, Chris Angelico  wrote:
>>If you're going to add sequencing and acknowledgements to UDP,
>>wouldn't it be easier to use TCP and simply prefix every message with
>>a two-byte length?
>
> Frankly, often yes. That's what I do. (different length encoding, but 
> otherwise...)
>
> UDP's neat if you do not care if a packet fails to arrive and if you can 
> guarantee that your data fits in a packet in the face of different MTUs.

There's also the impact on your application. With TCP you need to
consider that you may block when reading or writing, and you'll be
using threads and/or a state machine driven by select() or something.
UDP is more fire-and-forget.

> I like TCP myself, most of the time. Another nice thing about TCP is that with
> a little effort you get to pack multiple data packets (or partial data packets)
> into a network packet, etc.

That, and also (again) the impact on the application.  With UDP you
can easily end up wasting a lot of time reading tiny datagrams one by
one.  It has often been a performance bottleneck for me, with certain
UDP-based protocols which cannot pack multiple application-level
messages into one datagram.

Although perhaps you tend not to use Python in those situations.

/Jorgen

-- 
  // Jorgen Grahn    O  o   .


Re: Lightwight socket IO wrapper

2015-09-22 Thread Jorgen Grahn
On Mon, 2015-09-21, Chris Angelico wrote:
> On Mon, Sep 21, 2015 at 6:38 PM, Marko Rauhamaa  wrote:
>> Chris Angelico :
>>
>>> On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa  wrote:
>>>> You can read a full buffer even if you have a variable-length length
>>>> encoding.
>>>
>>> Not sure what you mean there. Unless you can absolutely guarantee that
>>> you didn't read too much, or can absolutely guarantee that your
>>> buffering function will be the ONLY way anything reads from the
>>> socket, buffering is a problem.
>>
>> Only one reader can read a socket safely at any given time so mutual
>> exclusion is needed.
>>
>> If you read "too much," the excess can be put in the application's read
>> buffer where it is available for whoever wants to process the next
>> message.
>
> Oops, premature send - sorry! Trying again.
>
> Which works only if you have a single concept of "application's read
> buffer". That means that you have only one place that can ever read
> data. Imagine a protocol that mainly consists of lines of text
> terminated by CRLF, but allows binary data to be transmitted by
> sending "DATA N\r\n" followed by N arbitrary bytes. The simplest and
> most obvious way to handle the base protocol is to buffer your reads
> as much as possible, but that means potentially reading the beginning
> of the data stream along with its header. You therefore cannot use the
> basic read() method to read that data - you have to use something from
> your line-based wrapper, even though you are decidedly NOT using a
> line-based protocol at that point.
>
> That's what I mean by guaranteeing that your buffering function is the
> only way data gets read from the socket. Either that, or you need an
> underlying facility for un-reading a bunch of data - de-buffering and
> making it readable again.

The way it seems to me, reading a TCP socket always ends up as:

- keep an application buffer
- do one socket read and append to the buffer
- consume 0--more complete "entries" from the beginning
  of the buffer; keep the incomplete one which may exist
  at the end
- go back and read some more when there's a chance more data
  has arrived

So the buffer is a circular buffer of octets, which you chop up
by parsing it so you can see it as a circular buffer of complete and
incomplete entries or messages.

At that level, yes, the line-oriented data and the binary data would
coexist in the same application buffer.
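
To make that concrete, here is a rough sketch of one such buffer handling
Chris's example protocol (CRLF-terminated lines plus "DATA N" followed by
N raw bytes). The class EntryBuffer is invented for illustration, and it
uses a plain bytearray rather than a true circular buffer:

```python
class EntryBuffer:
    """Hypothetical parser for CRLF lines plus 'DATA N' binary blocks."""

    def __init__(self):
        self.buffer = bytearray()
        self.pending = None        # byte count of a binary block in progress

    def feed(self, data):
        """Append one socket read and return the entries it completed."""
        self.buffer += data
        entries = []
        while True:
            if self.pending is not None:
                if len(self.buffer) < self.pending:
                    break          # binary block still incomplete
                entries.append(("data", bytes(self.buffer[:self.pending])))
                del self.buffer[:self.pending]
                self.pending = None
                continue
            end = self.buffer.find(b"\r\n")
            if end < 0:
                break              # incomplete line stays in the buffer
            line = bytes(self.buffer[:end])
            del self.buffer[:end + 2]
            if line.startswith(b"DATA "):
                self.pending = int(line[5:])   # switch to binary mode
            else:
                entries.append(("line", line))
        return entries
```

The caller does one recv(), passes the result to feed(), handles whatever
entries come back, and repeats. The socket is only ever read in one place,
which is the guarantee Chris was asking for.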

/Jorgen

-- 
  // Jorgen Grahn    O  o   .


Re: Lightwight socket IO wrapper

2015-09-21 Thread Cameron Simpson

On 21Sep2015 12:40, Chris Angelico  wrote:

On Mon, Sep 21, 2015 at 11:55 AM, Cameron Simpson  wrote:

On 21Sep2015 10:34, Chris Angelico  wrote:

If you're going to add sequencing and acknowledgements to UDP,
wouldn't it be easier to use TCP and simply prefix every message with
a two-byte length?


Frankly, often yes. That's what I do. (different length encoding, but
otherwise...)


Out of interest, what encoding?


NB: this is for binary protocols.

I don't like embedding arbitrary size limits in protocols or data formats if I 
can easily avoid it. So (for my home grown binary protocols) I encode unsigned 
integers as big endian octets with the top bit meaning "another octet follows" 
and the bottom 7 bits going to the value. So my packets look like:


 encoded(length)data

For sizes below 128, one byte of length. For sizes 128-16383, two bytes. And so 
on. Compact yet unbounded.
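
A small sketch of that encoding as described, written from the
description above rather than copied from cs.serialise. All four function
names are hypothetical:

```python
import io


def encode_uint(value):
    """Encode a non-negative int, 7 bits per octet, most significant first."""
    if value < 0:
        raise ValueError("value must be non-negative")
    octets = [value & 0x7F]                    # last octet: top bit clear
    value >>= 7
    while value:
        octets.append(0x80 | (value & 0x7F))   # earlier octets: top bit set
        value >>= 7
    return bytes(reversed(octets))


def decode_uint(fp):
    """Read one encoded integer from a binary file-like object."""
    value = 0
    while True:
        octet = fp.read(1)
        if not octet:
            raise EOFError("stream ended inside an encoded integer")
        value = (value << 7) | (octet[0] & 0x7F)
        if not octet[0] & 0x80:                # top bit clear: last octet
            return value


def encode_packet(data):
    """encoded(length) followed by the data itself."""
    return encode_uint(len(data)) + data


def read_packet(fp):
    length = decode_uint(fp)
    data = fp.read(length)
    if len(data) < length:
        raise EOFError("stream ended inside a packet")
    return data


stream = io.BytesIO(encode_packet(b"abc") + encode_packet(b"x" * 200))
first = read_packet(stream)      # 1-octet length prefix
second = read_packet(stream)     # 2-octet length prefix
```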


My new protocols are probably going to derive from the scheme implemented in
the code cited below. "New" means as of some weeks ago, when I completely 
rewrote a painful ad hoc protocol of mine and pulled out the general features 
into what follows.


The actual packet format is implemented by the Packet class at the bottom of 
this:


 https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/serialise.py

Simple and flexible.

As for using that data format multiplexed with multiple channels, see the 
PacketConnection class here:


 https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/stream.py

Broadly, the packets are length[tag,flags[,channel#],payload] and one 
implements whatever semantics one needs on top of that.


You can see this exercised over UNIX pipes and TCP streams in the unit tests 
here:


 https://bitbucket.org/cameron_simpson/css/src/tip/lib/python/cs/stream_tests.py

On the subject of packet stuffing, my preferred loop for that is visible in the 
PacketConnection._send worker thread method, which goes:


   fp = self._send_fp
   Q = self._sendQ
   for P in Q:
     sig = (P.channel, P.tag, P.is_request)
     if sig in self.__sent:
       raise RuntimeError("second send of %s" % (P,))
     self.__sent.add(sig)
     write_Packet(fp, P)
     if Q.empty():
       fp.flush()
   fp.close()

In short: get packets from the queue and write them to the stream buffer. If 
the queue gets empty, _only then_ flush the buffer. This assures synchronicity 
in comms while giving the IO library a chance to fill a buffer with several 
packets.


Cheers,
Cameron Simpson 

ERROR 155 - You can't do that.  - Data General S200 Fortran error code list


Re: Lightwight socket IO wrapper

2015-09-21 Thread Michael Ströder
Marko Rauhamaa wrote:
> I recommend using socket.TCP_CORK with socket.TCP_NODELAY where they are
> available (Linux).

If these options are not available are both option constants also not
available? Or does the implementation have to look into sys.platform?

Ciao, Michael.



Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Chris Angelico :

> On Mon, Sep 21, 2015 at 4:27 PM, Cameron Simpson  wrote:
>> For sizes below 128, one byte of length. For sizes 128-16383, two bytes. And
>> so on. Compact yet unbounded.
>
> [...]
>
> It's generally a lot faster to do a read(2) than a loop with any
> number of read(1), and you get some kind of bound on your allocations.
> Whether that's important to you or not is another question, but
> certainly your chosen encoding is a good way of allowing arbitrary
> integer values.

You can read a full buffer even if you have a variable-length length
encoding.


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Michael Ströder :

> Marko Rauhamaa wrote:
>> I recommend using socket.TCP_CORK with socket.TCP_NODELAY where they
>> are available (Linux).
>
> If these options are not available are both option constants also not
> available? Or does the implementation have to look into sys.platform?

   >>> import socket
   >>> 'TCP_CORK' in dir(socket)
   True

The TCP_NODELAY option is available everywhere but has special semantics
with TCP_CORK.


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Chris Angelico
On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Mon, Sep 21, 2015 at 4:27 PM, Cameron Simpson  wrote:
>>> For sizes below 128, one byte of length. For sizes 128-16383, two bytes. And
>>> so on. Compact yet unbounded.
>>
>> [...]
>>
>> It's generally a lot faster to do a read(2) than a loop with any
>> number of read(1), and you get some kind of bound on your allocations.
>> Whether that's important to you or not is another question, but
>> certainly your chosen encoding is a good way of allowing arbitrary
>> integer values.
>
> You can read a full buffer even if you have a variable-length length
> encoding.

Not sure what you mean there. Unless you can absolutely guarantee that
you didn't read too much, or can absolutely guarantee that your
buffering function will be the ONLY way anything reads from the
socket, buffering is a problem.

ChrisA


Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Chris Angelico :

> On Mon, Sep 21, 2015 at 2:39 PM, Marko Rauhamaa  wrote:
>> Chris Angelico :
>>
>>> If you write a packet of data, then write another one, and another,
>>> and another, and another, without waiting for responses, Nagling
>>> should combine them automatically. [...]
>>
>> Unfortunately, Nagle and delayed ACK, which are both defaults, don't go
>> well together (you get nasty 200-millisecond hiccups).
>
> Only in the write-write-read scenario.

Which is the case you brought up. Ideally, application code should be
oblivious to the inner heuristics of the TCP implementation. IOW,
write-write-read is perfectly valid and shouldn't lead to performance
degradation.

Unfortunately, the socket API doesn't provide a standard way for the
application to tell the kernel that it is done sending for now. Linux's
TCP_CORK+TCP_NODELAY is a nonstandard way but does the job quite nicely.
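
A rough sketch of how an application might wrap that, degrading to a
no-op where socket.TCP_CORK does not exist. The helper name corked is
made up, and this version only toggles TCP_CORK; combining it with
TCP_NODELAY as suggested above is left out:

```python
import contextlib
import socket


@contextlib.contextmanager
def corked(sock):
    """Hypothetical helper: hold back partial frames while the block runs."""
    have_cork = hasattr(socket, "TCP_CORK")    # absent on e.g. Windows
    if have_cork:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    try:
        yield have_cork
    finally:
        if have_cork:
            # Clearing the option lets the kernel send whatever is queued.
            sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)
```

Usage would be along the lines of "with corked(sock): sock.sendall(header);
sock.sendall(body)", so leaving the block is the application's way of
saying it is done sending for now.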

>> As for the topic, TCP doesn't need wrappers to abstract away the
>> difficult bits. That's a superficially good idea that leads to
>> trouble.
>
> Depends what you're doing - if you're working with a higher level
> protocol like HTTP, then abstracting away the difficult bits of TCP is
> part of abstracting away the difficult bits of HTTP, and something
> like 'requests' is superb.

Naturally, a higher-level protocol hides the lower-level protocol. It in
turn has intricacies of its own. Unfortunately, Python's stdlib HTTP
facilities are too naive (ie, blocking, incompatible with asyncio) to be
usable.


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Chris Angelico
On Mon, Sep 21, 2015 at 4:27 PM, Cameron Simpson  wrote:
> I don't like embedding arbitrary size limits in protocols or data formats if
> I can easily avoid it. So (for my home grown binary protocols) I encode
> unsigned integers as big endian octets with the top bit meaning "another
> octet follows" and the bottom 7 bits going to the value. So my packets look
> like:
>
>  encoded(length)data
>
> For sizes below 128, one byte of length. For sizes 128-16383, two bytes. And
> so on. Compact yet unbounded.

Ah, the MIDI Variable-Length Integer. Decent.

It's generally a lot faster to do a read(2) than a loop with any
number of read(1), and you get some kind of bound on your allocations.
Whether that's important to you or not is another question, but
certainly your chosen encoding is a good way of allowing arbitrary
integer values.

ChrisA


Re: Lightwight socket IO wrapper

2015-09-21 Thread MRAB

On 2015-09-21 09:47, Marko Rauhamaa wrote:

Michael Ströder :


Marko Rauhamaa wrote:

Michael Ströder :


Marko Rauhamaa wrote:

I recommend using socket.TCP_CORK with socket.TCP_NODELAY where they
are available (Linux).


If these options are not available are both option constants also not
available? Or does the implementation have to look into sys.platform?


   >>> import socket
   >>> 'TCP_CORK' in dir(socket)
   True


On which platform was this done?


Python3 on Fedora 21.
Python2 on RHEL4.

Sorry, don't have non-Linux machines to try.


How to automagically detect whether TCP_CORK is really available on a
platform?


I sure hope 'TCP_CORK' in dir(socket) evaluates to False on non-Linux
machines.


On Windows 10:

Python 3.5.0 (v3.5.0:374f501f4567, Sep 13 2015, 02:27:37) [MSC v.1900 64 bit (AMD64)] on win32

Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> 'TCP_CORK' in dir(socket)
False
>>>



Re: Lightwight socket IO wrapper

2015-09-21 Thread Jorgen Grahn
On Mon, 2015-09-21, Dennis Lee Bieber wrote:
> On Sun, 20 Sep 2015 23:36:30 +0100, "James Harris"
>  declaimed the following:

...
>>I thought UDP would deliver (or drop) a whole datagram but cannot find 
>>anything in the Python documentation to guarantee that. In fact 
>>documentation for the send() call says that apps are responsible for 
>>checking that all data has been sent. They may mean that to apply to 
>>stream protocols only but it doesn't state that. (Of course, UDP 
>>datagrams are limited in size so the call may validly indicate 
>>incomplete transmission even when the first part of a big message is 
>>sent successfully.)
>>
>   Looking in the wrong documentation  
>
>   You probably should be looking at the UDP RFC. Or maybe just
>
> http://www.diffen.com/difference/TCP_vs_UDP
>
> """
> Packets are sent individually and are checked for integrity only if they
> arrive. Packets have definite boundaries which are honored upon receipt,
> meaning a read operation at the receiver socket will yield an entire
> message as it was originally sent.
> """
>
>   Even if the IP layer has to fragment a UDP packet to meet limits of the
> transport media, it should put them back together on the other end before
> passing it up to the UDP layer. To my knowledge, UDP does not have a size
> limit on the message (well -- a 16-bit length field in the UDP header).

So they are "limited in size" like the OP wrote.  (A TCP stream OTOH is
potentially infinite.)

But also, the IPv4 RFC says:

All hosts must be prepared to accept datagrams of up to 576 octets
(whether they arrive whole or in fragments).  It is recommended
that hosts only send datagrams larger than 576 octets if they have
assurance that the destination is prepared to accept the larger
datagrams.

As for "all or nothing" with UDP datagrams, you also have the socket
layer case where the user does read() into a 1000 octet buffer and the
datagram was 1200 octets.  With BSD sockets you can (if you try)
detect this, but the extra 200 octets are lost forever.
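
From Python, one way to do that detection is recvmsg(), which returns the
message flags alongside the data. A sketch over loopback, assuming a
Unix-like platform (recvmsg() is not available everywhere):

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
receiver.settimeout(2.0)
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"x" * 1200, receiver.getsockname())

# Ask for at most 1000 octets of a 1200-octet datagram.
data, ancdata, flags, address = receiver.recvmsg(1000)
truncated = bool(flags & socket.MSG_TRUNC)   # set when the tail was discarded

sender.close()
receiver.close()
```

Plain recv(1000) would return the same 1000 octets with no indication
that anything had been cut off.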

> But since it /is/ "got it all" or "dropped" with no inherent confirmation, one
> would have to embed their own protocol within it -- sequence numbers with
> ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
> would mean having LARGE resends should packets be dropped or arrive out of
> sequence (and since the ACK/NAK could be dropped too, you may have to
> handle the case of a duplicated packet -- also large).
>
>   TCP is a stream protocol -- the protocol will ensure that all data
> arrives, and that it arrives in order, but does not enforce any boundaries
> on the data; what started as a relatively large packet at one end may
> arrive as lots of small packets due to intermediate transport limits (one
> can visualize a worst case: each TCP packet is broken up to fit Hollerith
> cards; 20bytes for header and 60 bytes of data -- then fed to a reader and
> sent on AS-IS).

The problem is IMO more this: the chunks of data that the application
writes don't map to what the other application reads.  In the lower
layers, I don't expect TCP segments to be split, and IP fragmentation
(if it happens at all) operates at an even lower level.

However the end result is still just as you write:

> Boundaries are the end-user responsibility... line endings
> (look at SMTP, where an email message ends on a line containing just a ".")
> or embedded length counter (not the TCP packet length).
>
>>Receiving no bytes is taken as indicating the end of the communication. 
>>That's OK for TCP but not for UDP so there should be a way to 
>>distinguish between the end of data and receiving an empty datagram.
>>
>   I don't believe UDP supports a truly empty datagram (length of 0) --
> presuming a sending stack actually sends one, the receiving stack will
> probably drop it as there is no data to pass on to a client

UDP datagrams of length 0 work (just tried it on Linux).  There's
nothing special about it.

> (there is a PR
> at work because we have a UDP driver that doesn't drop 0-length messages,
> but also can't deliver them -- so the circular buffer might fill with
> undeliverable headers)

Those messages should be delivered to the receiving socket, in the
sense that they are sanity-checked, used to wake up the application
and mark the socket readable, fill up one entry in the read queue and
so on ...

Of course your system at work may have the rights to be more
restrictive, if it's special-purpose.

/Jorgen

-- 
  // Jorgen Grahn    O  o   .


Re: Lightwight socket IO wrapper

2015-09-21 Thread Cameron Simpson

On 21Sep2015 18:07, Chris Angelico  wrote:

On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa  wrote:

Chris Angelico :


On Mon, Sep 21, 2015 at 4:27 PM, Cameron Simpson  wrote:

For sizes below 128, one byte of length. For sizes 128-16383, two bytes. And
so on. Compact yet unbounded.


[...]

It's generally a lot faster to do a read(2) than a loop with any
number of read(1), and you get some kind of bound on your allocations.
Whether that's important to you or not is another question, but
certainly your chosen encoding is a good way of allowing arbitrary
integer values.


You can read a full buffer even if you have a variable-length length
encoding.


Not sure what you mean there. Unless you can absolutely guarantee that
you didn't read too much, or can absolutely guarantee that your
buffering function will be the ONLY way anything reads from the
socket, buffering is a problem.


I'm using buffered io streams, so that layer will be reading in chunks. Pulling 
things from that buffer with fp.read(1) is cheap enough for my use.


Cheers,
Cameron Simpson 


Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Chris Angelico :

> On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa  wrote:
>> You can read a full buffer even if you have a variable-length length
>> encoding.
>
> Not sure what you mean there. Unless you can absolutely guarantee that
> you didn't read too much, or can absolutely guarantee that your
> buffering function will be the ONLY way anything reads from the
> socket, buffering is a problem.

Only one reader can read a socket safely at any given time so mutual
exclusion is needed.

If you read "too much," the excess can be put in the application's read
buffer where it is available for whoever wants to process the next
message.


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Chris Angelico
On Mon, Sep 21, 2015 at 6:38 PM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa  wrote:
>>> You can read a full buffer even if you have a variable-length length
>>> encoding.
>>
>> Not sure what you mean there. Unless you can absolutely guarantee that
>> you didn't read too much, or can absolutely guarantee that your
>> buffering function will be the ONLY way anything reads from the
>> socket, buffering is a problem.
>
> Only one reader can read a socket safely at any given time so mutual
> exclusion is needed.
>
> If you read "too much," the excess can be put in the application's read
> buffer where it is available for whoever wants to process the next
> message.

Which works only if you have a single concept of "application's read
buffer". That means that you have only one place that can ever read
data. Imagine a


Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Chris Angelico :

> On Mon, Sep 21, 2015 at 6:38 PM, Marko Rauhamaa  wrote:
>> Only one reader can read a socket safely at any given time so mutual
>> exclusion is needed.
>>
>> If you read "too much," the excess can be put in the application's read
>> buffer where it is available for whoever wants to process the next
>> message.
>
> Which works only if you have a single concept of "application's read
> buffer".

Well, the socket's read buffer.


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Michael Ströder
Marko Rauhamaa wrote:
> Michael Ströder :
> 
>> Marko Rauhamaa wrote:
>>> I recommend using socket.TCP_CORK with socket.TCP_NODELAY where they
>>> are available (Linux).
>>
>> If these options are not available are both option constants also not
>> available? Or does the implementation have to look into sys.platform?
> 
> >>> import socket
> >>> 'TCP_CORK' in dir(socket)
> True

On which platform was this done?

To rephrase my question:
How to automagically detect whether TCP_CORK is really available on a platform?

'TCP_CORK' in dir(socket)
or catch AttributeError

sys.platform=='linux2'
hoping that Linux 2.1 or prior is not around anymore...

...

Ciao, Michael.


Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Michael Ströder :

> Marko Rauhamaa wrote:
>> Michael Ströder :
>> 
>>> Marko Rauhamaa wrote:
>>>> I recommend using socket.TCP_CORK with socket.TCP_NODELAY where they
>>>> are available (Linux).
>>>
>>> If these options are not available are both option constants also not
>>> available? Or does the implementation have to look into sys.platform?
>> 
>> >>> import socket
>> >>> 'TCP_CORK' in dir(socket)
>> True
>
> On which platform was this done?

Python3 on Fedora 21.
Python2 on RHEL4.

Sorry, don't have non-Linux machines to try.

> How to automagically detect whether TCP_CORK is really available on a
> platform?

I sure hope 'TCP_CORK' in dir(socket) evaluates to False on non-Linux
machines.
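
One way to avoid looking at sys.platform at all is to probe for the constant and then let setsockopt() have the final say. This is only a sketch: set_cork is a made-up helper name, and I am assuming that a missing constant or an OSError from setsockopt() is enough to treat the option as unavailable.

```python
import socket

# None on platforms whose socket module does not define the constant.
TCP_CORK = getattr(socket, "TCP_CORK", None)

def set_cork(sock, enabled):
    """Try to cork or uncork a TCP socket; return True only if the option was applied."""
    if TCP_CORK is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, TCP_CORK, 1 if enabled else 0)
    except OSError:
        # The constant exists in this build but the running system refused it.
        return False
    return True
```

The constant's presence reflects how the socket module was built, so the try/except is there to cover the case where the running system disagrees.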


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Marko Rauhamaa
Marko Rauhamaa :

> Chris Angelico :
>
>> On Mon, Sep 21, 2015 at 6:38 PM, Marko Rauhamaa  wrote:
>>> Only one reader can read a socket safely at any given time so mutual
>>> exclusion is needed.
>>>
>>> If you read "too much," the excess can be put in the application's read
>>> buffer where it is available for whoever wants to process the next
>>> message.
>>
>> Which works only if you have a single concept of "application's read
>> buffer".
>
> Well, the socket's read buffer.

To be exact, the application should associate a read buffer with each
socket.


Marko


Re: Lightwight socket IO wrapper

2015-09-21 Thread Chris Angelico
On Mon, Sep 21, 2015 at 6:38 PM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Mon, Sep 21, 2015 at 5:59 PM, Marko Rauhamaa  wrote:
>>> You can read a full buffer even if you have a variable-length length
>>> encoding.
>>
>> Not sure what you mean there. Unless you can absolutely guarantee that
>> you didn't read too much, or can absolutely guarantee that your
>> buffering function will be the ONLY way anything reads from the
>> socket, buffering is a problem.
>
> Only one reader can read a socket safely at any given time so mutual
> exclusion is needed.
>
> If you read "too much," the excess can be put in the application's read
> buffer where it is available for whoever wants to process the next
> message.

Oops, premature send - sorry! Trying again.

Which works only if you have a single concept of "application's read
buffer". That means that you have only one place that can ever read
data. Imagine a protocol that mainly consists of lines of text
terminated by CRLF, but allows binary data to be transmitted by
sending "DATA N\r\n" followed by N arbitrary bytes. The simplest and
most obvious way to handle the base protocol is to buffer your reads
as much as possible, but that means potentially reading the beginning
of the data stream along with its header. You therefore cannot use the
basic read() method to read that data - you have to use something from
your line-based wrapper, even though you are decidedly NOT using a
line-based protocol at that point.

That's what I mean by guaranteeing that your buffering function is the
only way data gets read from the socket. Either that, or you need an
underlying facility for un-reading a bunch of data - de-buffering and
making it readable again.
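
To make that concrete, here is a sketch of a reader that owns every read from the socket and serves both the line-based and the binary part of such a protocol from one buffer. SocketReader, read_message and the chunk size are hypothetical names and values chosen for the example; only the "DATA N\r\n" framing comes from the description above.

```python
import socket

class SocketReader:
    """Owns every read from one socket and keeps any surplus in a single buffer."""

    def __init__(self, sock, chunk_size=4096):
        self._sock = sock
        self._chunk_size = chunk_size
        self._buf = bytearray()

    def _fill(self):
        chunk = self._sock.recv(self._chunk_size)
        if not chunk:
            raise EOFError("peer closed the connection")
        self._buf += chunk

    def read_until(self, sep=b"\r\n"):
        """Return bytes up to and including sep, leaving the rest buffered."""
        while True:
            end = self._buf.find(sep)
            if end >= 0:
                break
            self._fill()
        end += len(sep)
        data = bytes(self._buf[:end])
        del self._buf[:end]
        return data

    def read_exact(self, n):
        """Return exactly n bytes, taking buffered surplus first."""
        while len(self._buf) < n:
            self._fill()
        data = bytes(self._buf[:n])
        del self._buf[:n]
        return data

def read_message(reader):
    """Return ('line', text) or ('data', payload) for the example protocol."""
    line = reader.read_until(b"\r\n")[:-2]
    if line.startswith(b"DATA "):
        return "data", reader.read_exact(int(line[5:]))
    return "line", line
```

The binary payload has to come through read_exact() on the same object, because part of it may already be sitting in the buffer after the header line was read.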

ChrisA


Lightwight socket IO wrapper

2015-09-20 Thread James Harris
I guess there have been many attempts to make socket IO easier to handle 
and a good number of those have been in Python.


The trouble with trying to improve something which is already well 
designed (and consciously left as is) is that the so-called improvement
can become much more complex and overly elaborate. That can apply to the 
initial idea, for sure, but when writing helper or convenience functions 
perhaps it applies more to the temptation to keep adding just a little 
bit extra. The end result can be overly elaborate such as a framework 
which is fine where such is needed but is overkill for simpler 
requirements.


Do you guys have any recommendations of some *lightweight* additions to 
Python socket IO before I write any more of my own? Something built in 
to Python would be much preferred over any modules which have to be 
added. I had in the back of my mind that there was a high-level 
socket-IO library - much as threading was added as a wrapper to the 
basic thread module - but I cannot find anything above socket. Is there 
any?


A current specific to illustrate where basic socket IO is limited: it 
normally provides no guarantees over how many bytes are transferred at a 
time (AFAICS that's true for both streams and datagrams) so the 
delimiting of messages/records needs to be handled by the sender and 
receiver. I do already handle some of this myself but I wondered if 
there was a prebuilt solution that I should be using instead - to save 
me adding just a little bit extra. ;-)


James



Re: Lightwight socket IO wrapper

2015-09-20 Thread James Harris
"Akira Li" <4kir4...@gmail.com> wrote in message 
news:mailman.37.1442754893.21674.python-l...@python.org...

"James Harris"  writes:


>> I guess there have been many attempts to make socket IO easier to
>> handle and a good number of those have been in Python.
>>
>> The trouble with trying to improve something which is already well
>> designed (and consciously left as is) is that the so-called improvement
>> can become much more complex and overly elaborate. That can apply to
>> the initial idea, for sure, but when writing helper or convenience
>> functions perhaps it applies more to the temptation to keep adding
>> just a little bit extra. The end result can be overly elaborate such
>> as a framework which is fine where such is needed but is overkill for
>> simpler requirements.
>>
>> Do you guys have any recommendations of some *lightweight* additions
>> to Python socket IO before I write any more of my own? Something built
>> in to Python would be much preferred over any modules which have to be
>> added. I had in the back of my mind that there was a high-level
>> socket-IO library - much as threading was added as a wrapper to the
>> basic thread module - but I cannot find anything above socket. Is
>> there any?
>
> Does ØMQ qualify as lightweight?


It's certainly interesting. It's puzzling, too. For example,

 http://zguide.zeromq.org/py:hwserver

The Python code there includes

 message = socket.recv()

but given that this is a TCP socket it doesn't look like there is any 
way for the stack to know how many bytes to return. Either ZeroMQ layers 
another end-to-end protocol on top of TCP (which would be no good) or it 
will be guessing (which would not be good either).


There are probably answers to that query but there is a lot of 
documentation, including on reliable communication, and that in itself 
makes ZeroMQ seem overkill, even if it can be persuaded to do what I 
want.


I am impressed that they show code in many languages. I may come back to 
it but for the moment it doesn't seem to be what I was looking for. And 
it is not built in.



>> A current specific to illustrate where basic socket IO is limited: it
>> normally provides no guarantees over how many bytes are transferred at
>> a time (AFAICS that's true for both streams and datagrams) so the
>> delimiting of messages/records needs to be handled by the sender and
>> receiver. I do already handle some of this myself but I wondered if
>> there was a prebuilt solution that I should be using instead - to save
>> me adding just a little bit extra. ;-)
>
> There are already convenience functions in stdlib such as
> sock.sendall(), sock.sendfile(), socket.create_connection() in addition
> to BSD Sockets API.
>
> If you want to extend this list and have specific suggestions; see
>   https://docs.python.org/devguide/stdlibchanges.html


That may be a bit overkill just now but it's a good suggestion.


> Or just describe your current specific issue in more detail here.


There are a few things and more crop up as time goes on. For example, 
over TCP it would be helpful to have a function to receive a specific 
number of bytes or one to read bytes until reaching a certain delimiter 
such as newline or zero or space etc. Even better would be to be able to 
use the iteration protocol so you could just code next() and get the 
next such chunk of read in a for loop. When sending it would be good to 
just say to send a bunch of bytes but know that you will get told how 
many were sent (or didn't get sent) if it fails. Sock.sendall() doesn't 
do that.


I thought UDP would deliver (or drop) a whole datagram but cannot find
anything in the Python documentation to guarantee that. In fact
documentation for the send() call says that apps are responsible for 
checking that all data has been sent. They may mean that to apply to 
stream protocols only but it doesn't state that. (Of course, UDP 
datagrams are limited in size so the call may validly indicate 
incomplete transmission even when the first part of a big message is 
sent successfully.)


Receiving no bytes is taken as indicating the end of the communication. 
That's OK for TCP but not for UDP so there should be a way to 
distinguish between the end of data and receiving an empty datagram.


The recv calls require a buffer size to be supplied which is a technical 
detail. A Python wrapper could save the programmer dealing with that.


Reminder to self: encoding issues.

None of the above is difficult to write and I have written the bits I 
need myself but, basically, there are things that would make socket IO 
easier and yet still compatible with more long-winded code. So I 
wondered if there were already some Python modules which were more 
convenient than what I found in the documentation.


James



Re: Lightwight socket IO wrapper

2015-09-20 Thread Chris Angelico
On Mon, Sep 21, 2015 at 10:19 AM, Dennis Lee Bieber
 wrote:
> Even if the IP layer has to fragment a UDP packet to meet limits of the
> transport media, it should put them back together on the other end before
> passing it up to the UDP layer. To my knowledge, UDP does not have a size
> limit on the message (well -- a 16-bit length field in the UDP header). But
> since it /is/ "got it all" or "dropped" with no inherent confirmation, one
> would have to embed their own protocol within it -- sequence numbers with
> ACK/NAK, for example. Problem: if using LARGE UDP packets, this protocol
> would mean having LARGE resends should packets be dropped or arrive out of
> sequence (and since the ACK/NAK could be dropped too, you may have to
> handle the case of a duplicated packet -- also large).
>

If you're going to add sequencing and acknowledgements to UDP,
wouldn't it be easier to use TCP and simply prefix every message with
a two-byte length?

UDP is great when order doesn't matter and each packet stands entirely
alone. DNS is a well-known example - the question "What is the IP
address for www.rosuav.com?" doesn't in any way affect the question
"What is the mail server for gmail.com?", so you fire off UDP packets
for each one, and get responses whenever you get them. UDP's also
perfect for a heartbeat system - you send out a packet every
however-often, and if the monitor hasn't heard from you in X seconds,
it starts alerting people. No need for responses of any kind there.
But for working with a stream, I usually find it's a lot easier to
build on top of TCP than UDP.

ChrisA


Re: Lightwight socket IO wrapper

2015-09-20 Thread Akira Li
"James Harris"  writes:

> I guess there have been many attempts to make socket IO easier to
> handle and a good number of those have been in Python.
>
> The trouble with trying to improve something which is already well
> designed (and consciously left as is) is that the so-called improvement
> can become much more complex and overly elaborate. That can apply to
> the initial idea, for sure, but when writing helper or convenience
> functions perhaps it applies more to the temptation to keep adding
> just a little bit extra. The end result can be overly elaborate such
> as a framework which is fine where such is needed but is overkill for
> simpler requirements.
>
> Do you guys have any recommendations of some *lightweight* additions
> to Python socket IO before I write any more of my own? Something built
> in to Python would be much preferred over any modules which have to be
> added. I had in the back of my mind that there was a high-level
> socket-IO library - much as threading was added as a wrapper to the
> basic thread module - but I cannot find anything above socket. Is
> there any?

Does ØMQ qualify as lightweight?

> A current specific to illustrate where basic socket IO is limited: it
> normally provides no guarantees over how many bytes are transferred at
> a time (AFAICS that's true for both streams and datagrams) so the
> delimiting of messages/records needs to be handled by the sender and
> receiver. I do already handle some of this myself but I wondered if
> there was a prebuilt solution that I should be using instead - to save
> me adding just a little bit extra. ;-)

There are already convenience functions in stdlib such as
sock.sendall(), sock.sendfile(), socket.create_connection() in addition
to BSD Sockets API.

If you want to extend this list and have specific suggestions; see
  https://docs.python.org/devguide/stdlibchanges.html

Or just describe your current specific issue in more detail here.



Re: Lightwight socket IO wrapper

2015-09-20 Thread Cameron Simpson

On 21Sep2015 10:34, Chris Angelico  wrote:

> If you're going to add sequencing and acknowledgements to UDP,
> wouldn't it be easier to use TCP and simply prefix every message with
> a two-byte length?


Frankly, often yes. That's what I do. (different length encoding, but 
otherwise...)


UDP's neat if you do not care if a packet fails to arrive and if you can
guarantee that your data fits in a packet in the face of different MTUs.

I like TCP myself, most of the time. Another nice thing about TCP is that with a
little effort you get to pack multiple data packets (or partial data packets)
into a network packet, etc.


Cheers,
Cameron Simpson 

If you lie to the compiler, it will get its revenge.- Henry Spencer


Re: Lightwight socket IO wrapper

2015-09-20 Thread Akira Li
"James Harris"  writes:
...
> There are a few things and more crop up as time goes on. For example,
> over TCP it would be helpful to have a function to receive a specific
> number of bytes or one to read bytes until reaching a certain
> delimiter such as newline or zero or space etc. 

The answer is sock.makefile('rb') then `file.read(nbytes)` returns a
specific number of bytes.

`file.readline()` reads until newline (b'\n'). There is a Python issue:
"Add support for reading records with arbitrary separators to the
standard IO stack"
  http://bugs.python.org/issue1152248
See also
  http://bugs.python.org/issue17083

Perhaps, it is easier to implement read_until(sep) that is best suited
for a particular case.

> Even better would be to be able to use the iteration protocol so you
> could just code next() and get the next such chunk of read in a for
> loop.

file is an iterator over lines i.e., next(file) works.
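
A small sketch of all three on one connection, using socket.socketpair() as a stand-in for a real TCP connection. The demo function and the sample bytes are made up for illustration.

```python
import socket

def demo():
    """Show readline(), read(n) and iteration on a file object made from a socket."""
    left, right = socket.socketpair()
    with left, right:
        left.sendall(b"HEAD\nabcdefline one\nline two\n")
        left.shutdown(socket.SHUT_WR)        # lets the reader see end-of-file
        with right.makefile("rb") as fp:
            first = fp.readline()            # up to and including b"\n"
            fixed = fp.read(6)               # exactly 6 bytes unless EOF comes first
            rest = [line for line in fp]     # the iteration protocol, line by line
    return first, fixed, rest
```

Once the file object exists, it is probably safest to do all reading through it, since it may already have buffered bytes that a direct sock.recv() would then never see.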

> When sending it would be good to just say to send a bunch of bytes but
> know that you will get told how many were sent (or didn't get sent) if
> it fails. Sock.sendall() doesn't do that.

sock.send() returns the number of bytes sent, which may be less than given.
You could reimplement sock.sendall() to include the number of bytes
successfully sent in case of an error.
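
Something along these lines, as a sketch. PartialSendError and send_all_counted are hypothetical names, not part of the socket module, and the count only says what the local stack accepted, not what the peer received.

```python
class PartialSendError(OSError):
    """Raised when a send loop fails part-way; records how much was handed to the stack."""

    def __init__(self, sent, total, cause):
        super().__init__("sent %d of %d bytes: %s" % (sent, total, cause))
        self.sent = sent
        self.total = total

def send_all_counted(sock, data):
    """Like sock.sendall(), but report the number of bytes sent if an error occurs."""
    view = memoryview(data)
    sent = 0
    try:
        while sent < len(view):
            sent += sock.send(view[sent:])   # send() may accept fewer bytes than offered
    except OSError as exc:
        raise PartialSendError(sent, len(view), exc) from exc
    return sent
```

Subclassing OSError means existing "except OSError" handlers still catch the failure, while new code can look at exc.sent.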

> I thought UDP would deliver (or drop) a whole datagram but cannot find
> anything in the Python documentation to guarantee that. In fact
> documentation for the send() call says that apps are responsible for
> checking that all data has been sent. They may mean that to apply to
> stream protocols only but it doesn't state that. (Of course, UDP
> datagrams are limited in size so the call may validly indicate
> incomplete transmission even when the first part of a big message is
> sent successfully.)
>
> Receiving no bytes is taken as indicating the end of the
> communication. That's OK for TCP but not for UDP so there should be a
> way to distinguish between the end of data and receiving an empty
> datagram.

There is no end of communication in UDP and therefore there is no end of
data. If you get zero bytes in return, it means that you've
received a zero-length datagram.

sock.recvfrom() is a thin wrapper around the corresponding C
function. You could read any docs you like about UDP sockets.
  
http://stackoverflow.com/questions/5307031/how-to-detect-receipt-of-a-0-length-udp-datagram

> The recv calls require a buffer size to be supplied which is a
> technical detail. A Python wrapper could save the programmer dealing
> with that.

It is not just a buffer size. It is the maximum amount of data to be
received at once i.e., sock.recv() may return less but never more.
You could use makefile() and read() if recv() is too low-level.

> Reminder to self: encoding issues.
>
> None of the above is difficult to write and I have written the bits I
> need myself but, basically, there are things that would make socket IO
> easier and yet still compatible with more long-winded code. So I
> wondered if there were already some Python modules which were more
> convenient than what I found in the documentation.
>
> James



Re: Lightwight socket IO wrapper

2015-09-20 Thread Gregory Ewing

Dennis Lee Bieber wrote:

> worst case: each TCP packet is broken up to fit Hollerith
> cards;


Or printed on strips of paper and tied to pigeons:

https://en.wikipedia.org/wiki/IP_over_Avian_Carriers

--
Greg


Re: Lightwight socket IO wrapper

2015-09-20 Thread Chris Angelico
On Mon, Sep 21, 2015 at 11:55 AM, Cameron Simpson  wrote:
> On 21Sep2015 10:34, Chris Angelico  wrote:
>>
>> If you're going to add sequencing and acknowledgements to UDP,
>> wouldn't it be easier to use TCP and simply prefix every message with
>> a two-byte length?
>
>
> Frankly, often yes. That's what I do. (different length encoding, but
> otherwise...)

Out of interest, what encoding? With most protocols, I would prefer to
encode in ASCII digits terminated by end-of-line, but for arbitrary
content you're packaging up, it's usually easier to read 2 bytes (or 4
or whatever you want to specify), then read that many bytes, and
that's your content. No buffering required - you'll never read past
the end of a packet.
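
As a sketch of that scheme, assuming a two-byte big-endian length prefix. The helper names recv_exact, send_message and recv_message are made up for the example.

```python
import socket
import struct

def recv_exact(sock, n):
    """Loop on recv() until exactly n bytes have arrived; recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining:
        chunk = sock.recv(remaining)      # never asks for more than is still needed
        if not chunk:
            raise EOFError("connection closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def send_message(sock, payload):
    """Send payload preceded by its length as an unsigned 16-bit big-endian integer."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a two-byte length prefix")
    sock.sendall(struct.pack("!H", len(payload)) + payload)

def recv_message(sock):
    """Read one length-prefixed message and return its payload."""
    (length,) = struct.unpack("!H", recv_exact(sock, 2))
    return recv_exact(sock, length)
```

Since recv_exact() never requests more than the current message still needs, nothing belonging to the next message is consumed and no separate read buffer is required.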

> UDP's neat if you do not care if a packet fails to arrive and if you can
> guarantee that your data fits in a packet in the face of different MTUs.
> I like TCP myself, most of the time. Another nice thing about TCP is that
> with a little effort you get to pack multiple data packets (or partial data
> packets) into a network packet, etc.

Emphatically - a little effort sometimes, and other times no effort at
all! If you write a packet of data, then write another one, and
another, and another, and another, without waiting for responses,
Nagling should combine them automatically. And even if they're not
deliberately queued by Nagle's Algorithm, packets can get combined for
other reasons. So, yeah! Definitely can help a lot with packet counts
on small writes.

ChrisA


Re: Lightwight socket IO wrapper

2015-09-20 Thread Marko Rauhamaa
Chris Angelico :

> On Mon, Sep 21, 2015 at 11:55 AM, Cameron Simpson  wrote:
>> Another nice thing about TCP is that with a little effort you get to
>> pack multiple data packets (or partial data packets) into a network
>> packet, etc.
>
> Emphatically - a little effort sometimes, and other times no effort at
> all! If you write a packet of data, then write another one, and
> another, and another, and another, without waiting for responses,
> Nagling should combine them automatically. And even if they're not
> deliberately queued by Nagle's Algorithm, packets can get combined for
> other reasons. So, yeah! Definitely can help a lot with packet counts
> on small writes.

Unfortunately, Nagle and delayed ACK, which are both defaults, don't go
well together (you get nasty 200-millisecond hiccups).

I recommend using socket.TCP_CORK with socket.TCP_NODELAY where they are
available (Linux). They give you Nagle without delayed ACK. See

   <http://linux.die.net/man/7/tcp>


As for the topic, TCP doesn't need wrappers to abstract away the
difficult bits. That's a superficially good idea that leads to trouble.


Marko


Re: Lightwight socket IO wrapper

2015-09-20 Thread Chris Angelico
On Mon, Sep 21, 2015 at 2:39 PM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Mon, Sep 21, 2015 at 11:55 AM, Cameron Simpson  wrote:
>>> Another nice thing about TCP is that with a little effort you get to
>>> pack multiple data packets (or partial data packets) into a network
>>> packet, etc.
>>
>> Emphatically - a little effort sometimes, and other times no effort at
>> all! If you write a packet of data, then write another one, and
>> another, and another, and another, without waiting for responses,
>> Nagling should combine them automatically. And even if they're not
>> deliberately queued by Nagle's Algorithm, packets can get combined for
>> other reasons. So, yeah! Definitely can help a lot with packet counts
>> on small writes.
>
> Unfortunately, Nagle and delayed ACK, which are both defaults, don't go
> well together (you get nasty 200-millisecond hiccups).

Only in the write-write-read scenario. If you write-read-write-read,
or if your reads don't depend on your writes, then Nagle + delayed ACK
works just fine. But if you write a bunch of stuff, then block waiting
for the other end to respond, and then write multiple times, and wait
for a response, _then_ the pair work badly together, yes.
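
Two common ways around the write-write-read case, sketched below: hand the stack the whole request in a single write, or switch Nagle off on that socket. The helper names send_request and disable_nagle are hypothetical, and which option suits a given application is a judgment call.

```python
import socket

def send_request(sock, *parts):
    """Join the pieces first so the stack sees one write instead of several small ones."""
    sock.sendall(b"".join(parts))

def disable_nagle(sock):
    """Turn off Nagle's algorithm on a TCP socket so small writes go out immediately."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```

For example, send_request(sock, header, body) followed by a blocking read is then a write-read pattern rather than write-write-read.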

> As for the topic, TCP doesn't need wrappers to abstract away the
> difficult bits. That's a superficially good idea that leads to trouble.

Depends what you're doing - if you're working with a higher level
protocol like HTTP, then abstracting away the difficult bits of TCP is
part of abstracting away the difficult bits of HTTP, and something
like 'requests' is superb. But if you're inventing your own protocol,
directly on top of a BSD socket, then I would agree - just call socket
functions directly. Otherwise you risk nasty surprises when your
file-like object has ridiculous performance problems.

ChrisA