On 17 Sep 2015, at 10:24, Motti Shneor <su...@bezeqint.net> wrote:

> Is there a specific penalty to NSStream’s method read:maxLength: ?

Yes.  In the case of a socket stream, -read:maxLength: is equivalent to the 
BSD-level <x-man-page://2/read> system call, so you do a round trip into the 
kernel each time you call it.  Whether that's a problem depends on a lot of 
factors.  For example, if message sizes are small, these kernel calls are 
likely to add up to a significant performance impact.  OTOH, if message sizes 
are large, the kernel calls may end up swamped by the cost of copying the 
data itself.

> Last, several guys mentioned GCDAsyncSocket. What is it? an open-source thing?


It's an open source thing: a third-party TCP socket library built on top of 
GCD (part of the CocoaAsyncSocket project).

You can also use Dispatch I/O <x-man-page://3/dispatch> directly, or in concert 
with NSStream (use NSStream to do the stream setup, then extract the socket 
from the stream, close the streams, and use Dispatch I/O for the I/O path).

When dealing with network performance I have a bunch of suggestions:

* Fundamentally, any user-space networking is limited by BSD Sockets, which 
requires at least one copy from the in-kernel socket buffer to your buffer in 
user space.  This is typically done by <x-man-page://2/read> or one of its 
equivalents.

The only real optimisation you can do at the system call boundary is to use 
<x-man-page://2/readv> to implement a 'scatter' read.  That may or may not be a 
win depending on how you structure things.

* If you control the on-the-wire protocol, you may be able to get significant 
benefits by changing that.   For example:

- Protocols like HTTP/1.x that require parsing line-delimited headers are a 
pain to optimise.

- A lot of the time networking performance isn't limited by CPU time or 
bandwidth but by poorly designed protocols that result in the performance being 
dependent on the network latency.  Again, HTTP/1.x is a major offender here.

In my experience, the *really* big wins in network performance generally come 
from fixing problems like this.

* You want to design your I/O structure to meet the needs of your client (that 
is, the code that's consuming the data you're reading).  For example, if your 
media engine expects you to give it data in large malloc'd buffers, that 
should be a major consideration in the design of your I/O structure.

Alternatively, if you can change how you supply data to your media engine, you 
could potentially avoid a bunch of overhead.  dispatch_data_t (which can be 
bridged to NSData) can really help here because it supports:

- non-contiguous data, allowing you to join two buffers

- subdata creation without copies, to allow you to efficiently split two buffers

- data flowing through multiple layers without copying

As a concrete example of this, consider an I/O subsystem that does this:

1. reads the data into a malloc'd block

2. creates a dispatch_data from that (dispatch_data_create)

3. parses the data

4. when it finds a message, creates a subdata to represent the message 
(dispatch_data_create_subrange) and passes that to the media engine

5. if there's data left at the end, saves that so that subsequent reads can 
join it to data from the next read (dispatch_data_create_concat)

You've done no copies (except the one required by BSD Sockets) and yet you're 
passing the data up to the client in a nice sequence of dispatch_data_t's.  
Those may be discontiguous, but it's up to the client as to whether they want 
to efficiently deal with discontiguous data or gather the data into one 
contiguous buffer.

Share and Enjoy
--
Quinn "The Eskimo!"                    <http://www.apple.com/developer/>
Apple Developer Relations, Developer Technical Support, Core OS/Hardware



 _______________________________________________
Macnetworkprog mailing list      (Macnetworkprog@lists.apple.com)