Re: [RFC] how to get the size of a malloc(9) block ?
On Nov 29, 2013, at 3:44 PM, jb wrote:

> Luigi Rizzo writes:
>
>> ...
>> There is a difference between applications peeking into
>> implementation details that should be hidden, and providing
>> instead limited and specific information through a well defined API.
>> ...
>
> Right.
>
> If you want to improve memory management, that is, have the system (kernel
> or user space) handle memory reallocation intelligently and transparently
> to the user, then aim at a well defined API:

Don’t forget:

* Request a block of “at least N bytes” and have the allocator tell you what it *really* allocated.

This allows applications to use memory more efficiently by taking advantage of over-allocation when it happens.

Tim

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
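A minimal userspace sketch of the "at least N bytes" request described above, assuming a libc that provides malloc_usable_size() (FreeBSD or glibc). The name alloc_at_least() is made up for illustration; nothing like it exists in either libc, and an allocator-native version would not need the second call.

```c
#include <stddef.h>
#include <stdlib.h>
#if defined(__FreeBSD__)
#include <malloc_np.h>	/* malloc_usable_size() on FreeBSD */
#else
#include <malloc.h>	/* malloc_usable_size() on glibc */
#endif

/*
 * Hypothetical helper: allocate at least `n` bytes and report through
 * `*actual` how many bytes the allocator says are usable in the block.
 * Returns NULL and sets *actual to 0 when the allocation fails.
 */
void *
alloc_at_least(size_t n, size_t *actual)
{
	void *p = malloc(n);

	*actual = (p != NULL) ? malloc_usable_size(p) : 0;
	return (p);
}
```

A caller would size its buffer bookkeeping from `*actual` rather than from `n`, and must still work when the two are equal.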
Re: [RFC] how to get the size of a malloc(9) block ?
What is the supposed definition of malloc_usable_size(p) in a hypothetical, upcoming C standard? With the rest of the C standard remaining the same, one could try:

Definition: The value of malloc_usable_size(p) is the amount of space allocated for object p, plus the amount of space after object p that can currently be written without messing up other objects or memory-management areas.

This definition is practically useless, because the slack space after the object could be claimed at any moment by calls to alloc()ish functions, perhaps by other threads, clobbering whatever data was written there.

Another attempt at defining something useful:

Definition: The value of malloc_usable_size(p) is the amount of space allocated for object p, plus the amount of space after object p that can be written without messing up other objects or memory-management areas, while alloc()ish functions are not called on p.

With this, one asks the question: How much is usually overallocated? In some implementations, usually just a few bytes (say, when the minimal allocation unit is 8 bytes); where it is more than that, the memory manager can be said to be quite space-leaky.

It appears that it's not possible to make a proper API with malloc_usable_size() included, at least when multi-threading is involved (i.e., in the modern world). However, it is still useful to create an API that supports the following cases:

- A program knows how to adapt to memory fragmentation without moving an ever-growing, but chainable, array of data.
- A program would become faster if it knew when moving is required; then the program could update various pointer-based (as opposed to array-index-based) references to the object being moved. (Just like when memory is defragmented in a garbage-collected programming language.)
- A program requires more memory in real time, which means it either receives more memory immediately and does something with it, or signals a real-time failure.

So new flags could be [1]:

- realloc_flags(p, s, REALLOCF_NO_MOVE): Resize object p, without moving it, to size s. With this restriction, when requesting more memory, and the specified amount isn't available, don't do anything (when requesting less memory, always succeed).
- realloc_flags(p, s, REALLOCF_NO_MOVE | REALLOCF_ELASTIC): Resize object p, without moving it, to size s. With this restriction, when requesting more memory, and the specified amount isn't available, reserve as much as possible (when requesting less memory, always succeed).

On the other hand, be advised of a hypothetical scenario in which realloc() would like to jump at the opportunity to move the object to a different space, say, for the purpose of condensing slack space when statistics show that allocated areas have plenty of holes. This means that the design of the new API can have more goals:

- The allocator implementation should be able to shape the workings of a program at quick-realloc points, for example, by coaxing it to call realloc() when memory is very scattered.
- The program should always be able to take advantage of a quick-realloc functionality, for example, to support certain real-time requirements of applications, to the extent reasonably possible within the implementation.

For this, there could be a REALLOCF_FORCE flag, to be used in real-time scenarios. Without the flag, the call may be rejected on the basis of some implementation-specific preference, such as anti-fragmentation.

Is there any insufficiency in this API, in anyone's mind?

[1] When such a distinction makes sense and is supported (not stubbed) in the current architecture, environment, implementation, etc.
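To make the two proposed flag combinations concrete, here is a rough userspace emulation. It is only a sketch under loud assumptions: realloc_flags() does not exist anywhere, the flag values are invented, the return convention (granted size, 0 for "refused") is my own choice, and the emulation can only look at slack already reported by malloc_usable_size() on FreeBSD or glibc. A real implementation would live inside the allocator and could also extend the block in place.

```c
#include <stddef.h>
#include <stdlib.h>
#if defined(__FreeBSD__)
#include <malloc_np.h>	/* malloc_usable_size() on FreeBSD */
#else
#include <malloc.h>	/* malloc_usable_size() on glibc */
#endif

/* Flag names follow the proposal; the numeric values are made up. */
#define REALLOCF_NO_MOVE	0x1
#define REALLOCF_ELASTIC	0x2

/*
 * Emulated, hypothetical realloc_flags(): the block is never moved.
 * Returns the number of bytes the caller may use at `p` afterwards,
 * or 0 when the request is refused and nothing changes.
 */
size_t
realloc_flags_sketch(void *p, size_t s, int flags)
{
	size_t usable;

	if (p == NULL || !(flags & REALLOCF_NO_MOVE))
		return (0);		/* only the no-move cases are sketched */
	usable = malloc_usable_size(p);
	if (s <= usable)
		return (s);		/* shrinking or fitting in place succeeds */
	if (flags & REALLOCF_ELASTIC)
		return (usable);	/* grant as much as is available */
	return (0);			/* plain NO_MOVE: do nothing */
}
```

Returning the granted size keeps the ELASTIC case usable from a single call; the proposal itself does not say how the result would be reported.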
10.0-RELEASE status update
Quick 10.0-RELEASE status update:

- iconv(3) changes have been made in head/, and merged to stable/10 today.
- Two MFCs are undergoing review, one of which I will commit right before updating the stable/10 branch name to reflect '-BETA4'.[1]
- Builds for 10.0-BETA4 will begin tomorrow.

The schedule page on the website will be updated to reflect the start of the -BETA4 builds, and the remainder of the schedule will be updated to reflect the adjustments after the delay.

[1] - Important note to those tracking stable/10: An update will be committed tomorrow that will disable automatic creation of pkg.conf(5). Those installing new systems from 10.0-BETA4 should experience no trouble, as the updated pkg(8) version (pkg-1.2.1) should be available around the same time 10.0-BETA4 is announced.

This affects the pkg(8) bootstrap functionality *only*. Those with bootstrapped pkg(8) will not be affected.

For those doing source-based upgrades from stable/10 who do *not* have pkg(8) already bootstrapped, there will be a brief (about 1 day, "brief" being relative) window of time where pkg(8) (versions less than 1.2) will not have a pre-configured pkg.conf(5). This will also be noted in the 10.0-BETA4 announcement mail.[2]

[2] - I believe this is a non-issue, since usr.sbin/pkg/config.c has pkg.FreeBSD.org set as the packagesite, but I am sure I am overlooking something obvious that will prove me wrong.

So, that is why the "important note" is longer than the actual status update. :-)

Thank you for your patience.

Glen
On behalf of: re@
Re: [RFC] how to get the size of a malloc(9) block ?
On Fri, Nov 29, 2013 at 5:02 PM, jb wrote:

> Luigi Rizzo writes:
>
> > ...
> > > If you want to improve memory management, that is, have the system
> > > (kernel or user space) handle memory reallocation intelligently and
> > > transparently to the user, then aim at a well defined API:
> > > - reallocate "with no copy", which means new space appended (taking
> > >   into account *usable size*, a hidden-to-user implementation detail),
> > >   if possible
> > > - otherwise fail, and let the user decide about reallocation "with
> > >   copy" or allocation of a new space
> >
> > i respectfully disagree :) but am not pushing to add ksize.
> > Just note that both mine and your "well defined API" leak details:
> >
> > yours is (A) "I may be overallocating but won't tell you how much";
> > mine is (B) "I may be overallocating and here is exactly how much".
> >
> > Now if I may make a comparison with going shopping,
> > I'd rather hear the final price from the seller (case B),
> > than having to guess by repeated trial and error,
> > which is what case A leads to if i really want to figure out.
> > ...
>
> This is not necessarily true - I omitted the details of reallocation
> implementation on purpose.
> From the caller's point of view, if it requested allocation of memory
> size, then that's what it wanted in the first place. If it got it, then
> there is no other info needed.

This is not what we are discussing. We are discussing the case where the caller, _before_ requesting extra memory, would like to know how much space is available to make different decisions such as

1. realloc unconditionally
2. give up
3. allocate a separate block and chain to it
4. reduce its requirements and live with what extra space is available (if any).

Your suggested flags support #1 and #2 directly, #3 can be simulated with realloc(NO_ALLOC) + malloc(), but prevent #4.

cheers
luigi

> Next, if the caller came to the conclusion that more would be needed, then
> it should ask for memory reallocation, trusting that the system will do it
> in the most efficient way.
> If the caller wants to influence that process, then proper option(s) are
> needed in reallocation API, e.g.:
> - with no copy
> - with copy
> That means one call with options, with a specific (wanted by user) result.
> Of course, thinking thru the options (default, mutual exclusion, etc) is
> an important process and subject to RFC.
> A user-empowering API. No magic, no hacks.
>
> So, how about Request-for-Enhancement to GNU C lib, and the ugly hacks
> will disappear quickly.
>
> jb

--
-+---
Prof. Luigi RIZZO, ri...@iet.unipi.it . Dip. di Ing. dell'Informazione
http://www.iet.unipi.it/~luigi/ . Universita` di Pisa
TEL +39-050-2211611 . via Diotisalvi 2
Mobile +39-338-6809875 . 56122 PISA (Italy)
-+---
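The four choices listed in the message can be written down as a small decision function. This is only one possible policy, sketched under assumptions: every name below (grow_action, choose_grow_action, min_extra, copy_limit, can_chain) is invented for illustration, and `usable` is assumed to come from something like ksize() or malloc_usable_size().

```c
#include <stddef.h>

/* The four choices from the message; the identifiers are made up. */
enum grow_action {
	GROW_REALLOC,	/* 1. realloc unconditionally */
	GROW_GIVE_UP,	/* 2. give up */
	GROW_CHAIN,	/* 3. allocate a separate block and chain to it */
	GROW_USE_SLACK	/* 4. live with the extra space already available */
};

/*
 * usable:     bytes the allocator reports for the block
 * used:       bytes of the block currently in use
 * min_extra:  smallest extra amount still worth having (>= 1)
 * copy_limit: largest `used` for which a copying realloc is acceptable
 * can_chain:  non-zero if the data structure supports chained blocks
 */
enum grow_action
choose_grow_action(size_t usable, size_t used, size_t min_extra,
    size_t copy_limit, int can_chain)
{
	size_t slack = (usable > used) ? usable - used : 0;

	if (slack >= min_extra)
		return (GROW_USE_SLACK);	/* cheapest: no allocator call */
	if (used <= copy_limit)
		return (GROW_REALLOC);		/* small enough to copy */
	if (can_chain)
		return (GROW_CHAIN);		/* avoid the big memcpy */
	return (GROW_GIVE_UP);
}
```

Choice 4 is the one that needs the usable size up front, which is the point being made against a flags-only realloc().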
Re: [RFC] how to get the size of a malloc(9) block ?
On Fri, Nov 29, 2013 at 4:49 PM, Adrian Chadd wrote:

> The reason I wouldn't implement this is to avoid having code that
> _relies_ on this behaviour in order to function or perform well.

nobody ever said (or could reasonably expect to do) that. Applications don't know if the allocator overallocates, so they have no hope unless they work even without the feature. This is only about giving them an option to improve performance in those (rare ?) cases where, as i showed, knowing the underlying allocation size may lead to better usage of memory.

> Heck, it may not even be portable to other operating systems. Except,
> Linux, I guess.

in userspace, as jb commented, all major OSes have it (malloc_usable_size() on FreeBSD and Linux, _msize() on Windows, malloc_size() on OSX). In the kernel, I have no idea, but porting kernel code across systems is a nightmare anyways...

cheers
luigi
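For the userspace side, the four functions named above can be hidden behind one wrapper. A minimal sketch, assuming the usual header locations on each platform; block_usable_size() and the USABLE_SIZE macro are made-up names, and the fallback branch assumes a glibc-style <malloc.h>.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(_WIN32)
#include <malloc.h>		/* _msize() */
#define USABLE_SIZE(p)	_msize(p)
#elif defined(__APPLE__)
#include <malloc/malloc.h>	/* malloc_size() */
#define USABLE_SIZE(p)	malloc_size(p)
#elif defined(__FreeBSD__)
#include <malloc_np.h>		/* malloc_usable_size() */
#define USABLE_SIZE(p)	malloc_usable_size(p)
#else
#include <malloc.h>		/* malloc_usable_size() on glibc */
#define USABLE_SIZE(p)	malloc_usable_size(p)
#endif

/*
 * Hypothetical wrapper: usable size of a malloc()ed block, 0 for NULL.
 * Callers must treat the result as a hint and work without any slack.
 */
size_t
block_usable_size(void *p)
{
	return ((p != NULL) ? (size_t)USABLE_SIZE(p) : 0);
}
```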
Re: [RFC] how to get the size of a malloc(9) block ?
Luigi Rizzo writes:

> ...
> > If you want to improve memory management, that is, have the system (kernel
> > or user space) handle memory reallocation intelligently and transparently
> > to the user, then aim at a well defined API:
> > - reallocate "with no copy", which means new space appended (taking into
> >   account *usable size*, a hidden-to-user implementation detail), if
> >   possible
> > - otherwise fail, and let the user decide about reallocation "with copy"
> >   or allocation of a new space
>
> i respectfully disagree :) but am not pushing to add ksize.
> Just note that both mine and your "well defined API" leak details:
>
> yours is (A) "I may be overallocating but won't tell you how much";
> mine is (B) "I may be overallocating and here is exactly how much".
>
> Now if I may make a comparison with going shopping,
> I'd rather hear the final price from the seller (case B),
> than having to guess by repeated trial and error,
> which is what case A leads to if i really want to figure out.
> ...

This is not necessarily true - I omitted the details of reallocation implementation on purpose.

From the caller's point of view, if it requested allocation of memory size, then that's what it wanted in the first place. If it got it, then there is no other info needed.

Next, if the caller came to the conclusion that more would be needed, then it should ask for memory reallocation, trusting that the system will do it in the most efficient way.

If the caller wants to influence that process, then proper option(s) are needed in reallocation API, e.g.:
- with no copy
- with copy

That means one call with options, with a specific (wanted by user) result. Of course, thinking through the options (default, mutual exclusion, etc) is an important process and subject to RFC.

A user-empowering API. No magic, no hacks.

So, how about Request-for-Enhancement to GNU C lib, and the ugly hacks will disappear quickly.

jb
Re: [RFC] how to get the size of a malloc(9) block ?
The reason I wouldn't implement this is to avoid having code that _relies_ on this behaviour in order to function or perform well.

Heck, it may not even be portable to other operating systems. Except Linux, I guess.

-adrian
Re: [RFC] how to get the size of a malloc(9) block ?
On Fri, Nov 29, 2013 at 3:44 PM, jb wrote:

> Luigi Rizzo writes:
>
> > ...
> > There is a difference between applications peeking into
> > implementation details that should be hidden, and providing
> > instead limited and specific information through a well defined API.
> > ...
>
> Right.
>
> If you want to improve memory management, that is, have the system (kernel
> or user space) handle memory reallocation intelligently and transparently
> to the user, then aim at a well defined API:
> - reallocate "with no copy", which means new space appended (taking into
>   account *usable size*, a hidden-to-user implementation detail), if
>   possible
> - otherwise fail, and let the user decide about reallocation "with copy"
>   or allocation of a new space

i respectfully disagree :) but am not pushing to add ksize.
Just note that both mine and your "well defined API" leak details:

yours is (A) "I may be overallocating but won't tell you how much";
mine is (B) "I may be overallocating and here is exactly how much".

Now if I may make a comparison with going shopping, I'd rather hear the final price from the seller (case B), than having to guess by repeated trial and error, which is what case A leads to if i really want to figure out.

> The malloc_usable_size() is a hack.
> The extra space allocated or not due to fragmentation, alignment, etc, is
> an internal by-product, irrelevant to original memory alloc request, and it
> should not be leaked, also because its details may change in future API
> implementations.
> So, these memory allocation functions leaking implementation details, and
> the two derived functions, ksize() and malloc_usable_size() (and other
> derivatives like malloc_size() in Mac OS X), are a violations of a clean,
> safe, and maintainable API.
>
> Note that malloc_usable_size() is a GNU C Library extension, not part of
> Single UNIX Specification.

Honestly i did not even know they existed until a few days ago; but the fact that many different systems have come out with similar extensions at least makes me wonder whether the SUS missed it.

cheers
luigi
Re: [RFC] how to get the size of a malloc(9) block ?
Luigi Rizzo writes:

> ...
> There is a difference between applications peeking into
> implementation details that should be hidden, and providing
> instead limited and specific information through a well defined API.
> ...

Right.

If you want to improve memory management, that is, have the system (kernel or user space) handle memory reallocation intelligently and transparently to the user, then aim at a well defined API:
- reallocate "with no copy", which means new space appended (taking into account *usable size*, a hidden-to-user implementation detail), if possible
- otherwise fail, and let the user decide about reallocation "with copy" or allocation of a new space

The malloc_usable_size() is a hack.

The extra space allocated or not due to fragmentation, alignment, etc, is an internal by-product, irrelevant to the original memory alloc request, and it should not be leaked, also because its details may change in future API implementations.

So, these memory allocation functions leaking implementation details, and the two derived functions, ksize() and malloc_usable_size() (and other derivatives like malloc_size() in Mac OS X), are violations of a clean, safe, and maintainable API.

Note that malloc_usable_size() is a GNU C Library extension, not part of the Single UNIX Specification.

jb
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
Sure, is there a TCP version of this patch floating around? How's it doing load balancing to multiple listeners?

-a

On 29 November 2013 11:28, Oleg Moskalenko wrote:
> It would be nice to have this feature compiled and supported in FreeBSD
> kernel by default.
>
> Thanks
> Oleg
Re: [RFC] how to get the size of a malloc(9) block ?
On Thu, Nov 28, 2013 at 7:13 AM, jb wrote:

> Luigi Rizzo writes:
>
> > ...
> > But I don't understand why you find ksize()/malloc_usable_size()
> > dangerous.
> > ...
>
> The original crime is commited when *usable size* (an implementation
> detail) is exported (leaked) to the caller.
> To be blunt, when a caller requests memory of certain size, and its
> request is satisfied, then it is not its business to learn details beyond
> that (and they should not be offered as well).
> The API should be sanitized, in kernel and user space.
> Otherwise, all kind of charlatans will try to play hair-raising games with
> it.
> If the caller wants to track the *requested size* programmatically, it is
> its business to do it and it can be done very easily.

There is a difference between applications peeking into implementation details that should be hidden, and providing instead limited and specific information through a well defined API.

In general (not in the specific code I am handling and not something I personally need), what the caller might want to do is optimize its requests according to how the system behaves, and it cannot do that without some help from the layer below.

I have seen the following types of comments in this thread:

- "you should get it right the first time and never realloc"
  Maybe, but then the offending api is realloc() not ksize()

- "build your own allocator"
  Yes i do it when it makes sense, but sometimes it is either overkill or a bad idea (as it loses opportunities for global optimizations, duplicates code, takes memory in subsystem-specific freelists...)

- "what if ksize()/malloc_usable_size() lies ?"
  Well, that would be a bug in the allocator: if it says the memory is usable, it must be usable, period.

- "rather than ksize() i'll give you a fix for one use case" (the NO_REALLOC flag to realloc()).
  This i think would be a mistake -- it acknowledges the need for exposing some information but then only provides a specific fix for one use case.

I'll just restate that there are multiple situations where an application might use some information on actual allocation sizes:

- when it needs to extend memory and has a choice between a cheap realloc() (if extra space is available), chaining blocks (when the memcpy would be too expensive), or giving up and living with whatever space is available.

- when it has freedom in picking the block size and so it wants to optimize its requests based on what the underlying allocator does. As an example, long ago FreeBSD was really suboptimal when you allocated blocks whose size was a power of 2, because the metadata was inline. These days, there is a different issue: powers of 2 are ok but blocks 2049 bytes and above seem to be padded to a multiple of 2048, leading to a huge overhead in some cases.

cheers
luigi
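The overhead from the padding described above is easy to work out. The sketch below only models the reported behaviour (sizes above 2048 rounded up to a multiple of 2048); it does not query any real allocator, and both function names are invented.

```c
#include <stddef.h>

/* Round `n` up to the next multiple of `step` (step must be > 0). */
size_t
round_up(size_t n, size_t step)
{
	return (((n + step - 1) / step) * step);
}

/* Bytes lost to padding when `n` is rounded up to a multiple of `step`. */
size_t
padding_waste(size_t n, size_t step)
{
	return (round_up(n, step) - n);
}
```

Under that model a 2049-byte request occupies round_up(2049, 2048) = 4096 bytes and wastes 2047 of them, just under half the block, while a 4096-byte request wastes nothing. A caller that knew this could pick 2048 or 4096 instead of 2049.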
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
And some better marketing from Dragonfly about it
http://forum.nginx.org/read.php?29,241283,241283 :)

On Fri, Nov 29, 2013 at 7:55 PM, Ermal Luçi wrote:
> Also some discussions and improvements to it.
>
> http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2013-09/msg00165.html

--
Ermal
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
Also some discussions and improvements to it.

http://unix.derkeiler.com/Mailing-Lists/FreeBSD/net/2013-09/msg00165.html

On Fri, Nov 29, 2013 at 7:42 PM, Ermal Luçi wrote:
> Well seems Dragonfly has some version of it already from commit [1].
> ...

--
Ermal
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
On Fri, Nov 29, 2013 at 6:59 PM, Tim Kientzle wrote:

> On Nov 29, 2013, at 4:04 AM, Ermal Luçi wrote:
>
> > Hello,
> >
> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> > share the same port and possibly listening ip …
>
> These flags are used with TCP-based servers.

Everyone has their own use case!

> I’ve used them to make software upgrades go more smoothly.
> Without them, the following often happens:
>
> * Old server stops. In the process, all of its TCP connections are closed.
>
> * Connections to old server remain in the TCP connection table until the
>   remote end can acknowledge.
>
> * New server starts.
>
> * New server tries to open port but fails because that port is “still in
>   use” by connections in the TCP connection table.
>
> With these flags, the new server can open the port even though
> it is “still in use” by existing connections.
>
> > This is not the case today.
> > Only multicast sockets seem to have the behaviour of broadcasting the
> > data to all sockets sharing the same properties through these options!
>
> That is what multicast is for.

Multicast has its defined scope and its applications, though I think it is interpreting the same socket options and respecting them for what they should do and how they should behave.

> If you want the same data sent to all listeners, then
> that is multicast behavior and you should be using
> a multicast socket.
>
> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>
> You’re trying to turn all UDP sockets with those options
> into multicast sockets.

Not really. The idea is how to support the use case of two daemons using the same port numbers but speaking different protocols. The best would be to merge these daemons, but in case you cannot, there should be some support for it. In the end there are only 65k ports :).

Probably a sysctl for the feature might be a further compromise on it?

> If you want a multicast socket, you should ask for one.
>
> Tim

--
Ermal
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
Well, it seems DragonFly already has some version of it, from commit [1].
In FreeBSD the framework for this is there, enabled by defining PCBGROUP. See also the explanation of it at [2] and [3].
It can achieve approximately the same features as SO_REUSEPORT on Linux.
The only things missing are the marketing behind it and, I think, better RSS support.
Judging by the dates, the support was there before Linux had it, so all of you looking for it can experiment with it.

What I was trying to accomplish was something other than a performance improvement, and maybe to put a sysctl behind it to make it more acceptable.

[1] http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/740d1d9f7b7bf9c9c021abb8197718d7a2d441c9
[2] http://fxr.watson.org/fxr/source/netinet/in_pcbgroup.c?im=bigexcerpts#L51
[3] http://lists.freebsd.org/pipermail/svn-src-head/2011-June/028190.html

On Fri, Nov 29, 2013 at 7:03 PM, Oleg Moskalenko wrote:

> Tim, you are wrong. Read what the definition of "multicast" is, and read how
> UDP and TCP sockets work in Linux 3.9+ kernels.
>
> Oleg
>
> On Fri, Nov 29, 2013 at 9:59 AM, Tim Kientzle wrote:
>
>> On Nov 29, 2013, at 4:04 AM, Ermal Luçi wrote:
>>
>> > Hello,
>> >
>> > since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
>> > share the same port and possibly listening ip …
>>
>> These flags are used with TCP-based servers.
>>
>> I’ve used them to make software upgrades go more smoothly.
>> Without them, the following often happens:
>>
>> * Old server stops. In the process, all of its TCP connections are closed.
>>
>> * Connections to old server remain in the TCP connection table until the
>>   remote end can acknowledge.
>>
>> * New server starts.
>>
>> * New server tries to open port but fails because that port is “still in
>>   use” by connections in the TCP connection table.
>>
>> With these flags, the new server can open the port even though
>> it is “still in use” by existing connections.
>>
>> > This is not the case today.
>> > Only multicast sockets seem to have the behaviour of broadcasting the data
>> > to all sockets sharing the same properties through these options!
>>
>> That is what multicast is for.
>>
>> If you want the same data sent to all listeners, then
>> that is multicast behavior and you should be using
>> a multicast socket.
>>
>> > The patch at [1] implements/corrects the behaviour for UDP sockets.
>>
>> You’re trying to turn all UDP sockets with those options
>> into multicast sockets.
>>
>> If you want a multicast socket, you should ask for one.
>>
>> Tim

--
Ermal
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
On Nov 29, 2013, at 4:04 AM, Ermal Luçi wrote:

> Hello,
>
> since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> share the same port and possibly listening ip …

These flags are used with TCP-based servers.

I’ve used them to make software upgrades go more smoothly.
Without them, the following often happens:

* Old server stops. In the process, all of its TCP connections are closed.

* Connections to old server remain in the TCP connection table until the
  remote end can acknowledge.

* New server starts.

* New server tries to open port but fails because that port is “still in
  use” by connections in the TCP connection table.

With these flags, the new server can open the port even though
it is “still in use” by existing connections.

> This is not the case today.
> Only multicast sockets seem to have the behaviour of broadcasting the data
> to all sockets sharing the same properties through these options!

That is what multicast is for.

If you want the same data sent to all listeners, then
that is multicast behavior and you should be using
a multicast socket.

> The patch at [1] implements/corrects the behaviour for UDP sockets.

You’re trying to turn all UDP sockets with those options
into multicast sockets.

If you want a multicast socket, you should ask for one.

Tim
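For readers who have not used the option, the restart scenario above comes down to one setsockopt() call made before bind(). The following is only a minimal sketch, not code from the thread: the function name listen_reuseaddr and the backlog of 16 are made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a TCP listening socket on `port` (host byte order) with
 * SO_REUSEADDR set, so that connections left over from a previous server
 * instance do not make bind() fail with "address already in use".
 * Returns the descriptor, or -1 on error.
 * Hypothetical helper; name and backlog are illustrative only.
 */
int
listen_reuseaddr(uint16_t port)
{
    struct sockaddr_in sin;
    int on = 1;
    int s;

    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == -1)
        return (-1);

    /* The option has to be set before bind() to influence it. */
    if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) == -1) {
        close(s);
        return (-1);
    }

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    sin.sin_port = htons(port);

    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 ||
        listen(s, 16) == -1) {
        close(s);
        return (-1);
    }
    return (s);
}
```

Passing port 0 lets the kernel choose a free port, which is convenient for trying the function out without privileges.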
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
On 11/29/13, 8:04 PM, Ermal Luçi wrote:

> Hello,
>
> since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> share the same port and possibly listening ip, you would expect if you bind
> two daemons with such options to the same port to see the same traffic on both!

This is not how I interpret it. I presume it is to allow two OUTGOING sessions from the same source.

> This is not the case today.
> Only multicast sockets seem to have the behaviour of broadcasting the data
> to all sockets sharing the same properties through these options!
>
> The patch at [1] implements/corrects the behaviour for UDP sockets.
> Is there anything to be corrected in that patch?
> Why has it not been provided before?
> Can it be committed to the tree?
> Any extra security checks for jails needed there?
>
> [1]
> https://github.com/pfsense/pfsense-tools/blob/master/patches/RELENG_10_0/udp_SO_REUSEADDR%2BPORT.diff
Re: [PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
On Fri, Nov 29, 2013 at 1:04 PM, Ermal Luçi wrote:

> Hello,
>
> since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to
> share the same port and possibly listening ip, you would expect if you bind
> two daemons with such options to the same port to see the same traffic on both!
>
> This is not the case today.
> Only multicast sockets seem to have the behaviour of broadcasting the data
> to all sockets sharing the same properties through these options!
>
> The patch at [1] implements/corrects the behaviour for UDP sockets.
> Is there anything to be corrected in that patch?
> Why has it not been provided before?
> Can it be committed to the tree?
> Any extra security checks for jails needed there?
>
> [1]
> https://github.com/pfsense/pfsense-tools/blob/master/patches/RELENG_10_0/udp_SO_REUSEADDR%2BPORT.diff
>
> --
> Ermal

I understood it as working sort of like it does for TCP, where packets from a given remote host+port all end up at exactly one of the local sockets?
If the idea is to split the workload over multiple threads, each holding its own socket listening on the same interface+port, wouldn't sending all packets to all sockets all the time be kind of counterproductive?

Of course, I haven't actually used it much; I might be wrong.

--
Daniel Nebdal
[PATCH] SO_REUSEADDR and SO_REUSEPORT behaviour
Hello,

since SO_REUSEADDR and SO_REUSEPORT are supposed to allow two daemons to share the same port and possibly listening ip, you would expect if you bind two daemons with such options to the same port to see the same traffic on both!

This is not the case today.
Only multicast sockets seem to have the behaviour of broadcasting the data to all sockets sharing the same properties through these options!

The patch at [1] implements/corrects the behaviour for UDP sockets.
Is there anything to be corrected in that patch?
Why has it not been provided before?
Can it be committed to the tree?
Any extra security checks for jails needed there?

[1] https://github.com/pfsense/pfsense-tools/blob/master/patches/RELENG_10_0/udp_SO_REUSEADDR%2BPORT.diff

--
Ermal
Re: [RFC] how to get the size of a malloc(9) block ?
On 11/29/13, 7:26 PM, Daniel Nebdal wrote:

> On Fri, Nov 29, 2013 at 11:59 AM, Gleb Smirnoff wrote:
>
>> On Thu, Nov 28, 2013 at 03:13:53PM +, jb wrote:
>> j> > But I don't understand why you find ksize()/malloc_usable_size() dangerous.
>> j> > ...
>> j>
>> j> The original crime is committed when *usable size* (an implementation detail)
>> j> is exported (leaked) to the caller.
>> j> To be blunt, when a caller requests memory of certain size, and its request is
>> j> satisfied, then it is not its business to learn details beyond that (and they
>> j> should not be offered as well).
>> j> The API should be sanitized, in kernel and user space.
>> j> Otherwise, all kinds of charlatans will try to play hair-raising games with it.
>> j> If the caller wants to track the *requested size* programmatically, it is its
>> j> business to do it and it can be done very easily.
>>
>> +1
>>
>> This is the kind of API that just shouldn't exist.
>>
>> --
>> Totus tuus, Glebius.
>
> Then again: Using the "overflow" memory is only going to bite them if the
> API lies: If the return value is exactly "the size of the block you got
> allocated and can safely use until you free it", using it will by
> definition be safe. If the allocator later changes to, say, always allocate
> exact byte ranges, or to allocating blocks but having the option to fragment
> them later - then the return value would have to shrink to match, and any
> program using it would still DTRT.
>
> I'm completely ambivalent about adding it, though - it's not something I
> need, it's more stuff that needs to be handled if you change/rewrite the
> allocator, and it's not my decision.

I think that if you want to play games with expanding buffers etc., then you should write your own allocator.
You asked for X bytes. You should expect that you get X bytes and nothing more... either that or you should have asked for more in the first place.

> --
> Daniel Nebdal
Re: [RFC] how to get the size of a malloc(9) block ?
On Fri, Nov 29, 2013 at 11:59 AM, Gleb Smirnoff wrote:

> On Thu, Nov 28, 2013 at 03:13:53PM +, jb wrote:
> j> > But I don't understand why you find ksize()/malloc_usable_size() dangerous.
> j> > ...
> j>
> j> The original crime is committed when *usable size* (an implementation detail)
> j> is exported (leaked) to the caller.
> j> To be blunt, when a caller requests memory of certain size, and its request is
> j> satisfied, then it is not its business to learn details beyond that (and they
> j> should not be offered as well).
> j> The API should be sanitized, in kernel and user space.
> j> Otherwise, all kinds of charlatans will try to play hair-raising games with it.
> j> If the caller wants to track the *requested size* programmatically, it is its
> j> business to do it and it can be done very easily.
>
> +1
>
> This is the kind of API that just shouldn't exist.
>
> --
> Totus tuus, Glebius.

Then again: Using the "overflow" memory is only going to bite them if the API lies: If the return value is exactly "the size of the block you got allocated and can safely use until you free it", using it will by definition be safe.
If the allocator later changes to, say, always allocate exact byte ranges, or to allocating blocks but having the option to fragment them later - then the return value would have to shrink to match, and any program using it would still DTRT.

I'm completely ambivalent about adding it, though - it's not something I need, it's more stuff that needs to be handled if you change/rewrite the allocator, and it's not my decision.

--
Daniel Nebdal
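As a user-space illustration of the return value being discussed, the sketch below asks the allocator how much room a block really has and reports the difference from the requested size. It assumes malloc_usable_size() is available, via <malloc_np.h> on FreeBSD or <malloc.h> with glibc; other platforms may not have it. The helper name slack_bytes is hypothetical. Note that the glibc manual page discourages relying on the excess bytes, so this only measures them and does not write into them.

```c
#include <stddef.h>
#include <stdlib.h>

#if defined(__FreeBSD__)
#include <malloc_np.h>  /* malloc_usable_size() on FreeBSD */
#else
#include <malloc.h>     /* assumption: glibc or a compatible libc */
#endif

/*
 * Return how many bytes beyond `requested` the allocator reports as
 * usable for the block at `p`, or 0 if it reports no excess.
 * `p` must come from malloc()/calloc()/realloc().
 * Hypothetical helper, for illustration only.
 */
size_t
slack_bytes(void *p, size_t requested)
{
    size_t usable = malloc_usable_size(p);

    return (usable > requested ? usable - requested : 0);
}
```

If the allocator changed to hand out exact sizes, slack_bytes() would simply return 0 and a caller that clamps to the reported size would keep working, which is the point made above.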
Re: [RFC] how to get the size of a malloc(9) block ?
On 28 Nov 2013, at 15:13, jb wrote:

> Luigi Rizzo writes:
>
>> ...
>> But I don't understand why you find ksize()/malloc_usable_size() dangerous.
>> ...
>
> The original crime is committed when *usable size* (an implementation detail)
> is exported (leaked) to the caller.
> To be blunt, when a caller requests memory of certain size, and its request is
> satisfied, then it is not its business to learn details beyond that (and they
> should not be offered as well).
> The API should be sanitized, in kernel and user space.
> Otherwise, all kinds of charlatans will try to play hair-raising games with it.
> If the caller wants to track the *requested size* programmatically, it is its
> business to do it and it can be done very easily.
>
> Some of these guys got it perfectly right:
> http://stackoverflow.com/questions/5813078/is-it-possible-to-find-the-memory-allocated-to-the-pointer-without-searching-fo

I disagree. I've encountered several occasions where either locality doesn't matter so much or I know the pointer is aliased, and I'd like to increase the size of a relatively large allocation. I have two choices:

- Call realloc(), potentially copying a lot of data
- Call malloc(), and chain two (or more) allocations together.

What I'd like to do is call realloc() if it's effectively free, or call malloc() in other cases. The malloc_usable_size() API is wrong though. In the kernel, realloc() already takes a flag, and an M_DONTALLOCATE flag would make more sense: enlarge the allocation if it can be done without doing the allocate-copy-free dance, but return NULL and leave the allocation unmodified if not.

David
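The second choice in the list above (malloc() and chain) can be sketched in portable user-space C as below. This is an illustration only: the struct names, CHUNK_MIN and the chain_* functions are invented here, and the proposed M_DONTALLOCATE flag does not exist, so the sketch never tries to grow a chunk in place and always starts a new chunk once the current one is full.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_MIN 4096  /* illustrative minimum chunk capacity */

struct chunk {
    struct chunk *next;
    size_t cap;             /* bytes available in data[] */
    size_t len;             /* bytes used in data[] */
    unsigned char data[];
};

struct chain {              /* zero-initialise before first use */
    struct chunk *head;
    struct chunk *tail;
    size_t total;           /* bytes stored across all chunks */
};

/*
 * Append n bytes without ever moving data that is already stored:
 * fill the tail chunk, then chain a fresh allocation.
 * Returns 0 on success, -1 if malloc() fails (bytes appended before the
 * failure stay in the chain).
 */
int
chain_append(struct chain *ch, const void *src, size_t n)
{
    const unsigned char *p = src;

    while (n > 0) {
        struct chunk *t = ch->tail;

        if (t == NULL || t->len == t->cap) {
            size_t cap = n > CHUNK_MIN ? n : CHUNK_MIN;
            struct chunk *c = malloc(sizeof(*c) + cap);

            if (c == NULL)
                return (-1);
            c->next = NULL;
            c->cap = cap;
            c->len = 0;
            if (t == NULL)
                ch->head = c;
            else
                t->next = c;
            ch->tail = t = c;
        }

        size_t room = t->cap - t->len;
        size_t take = n < room ? n : room;

        memcpy(t->data + t->len, p, take);
        t->len += take;
        ch->total += take;
        p += take;
        n -= take;
    }
    return (0);
}

/* Release every chunk and reset the chain to empty. */
void
chain_free(struct chain *ch)
{
    struct chunk *c = ch->head;

    while (c != NULL) {
        struct chunk *next = c->next;

        free(c);
        c = next;
    }
    ch->head = ch->tail = NULL;
    ch->total = 0;
}
```

The cost of this design is that the data is no longer contiguous, which is acceptable only in the cases named above, where locality does not matter much or the pointer is aliased and cannot move.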
Re: [RFC] how to get the size of a malloc(9) block ?
On Thu, Nov 28, 2013 at 03:13:53PM +, jb wrote:
j> > But I don't understand why you find ksize()/malloc_usable_size() dangerous.
j> > ...
j>
j> The original crime is committed when *usable size* (an implementation detail)
j> is exported (leaked) to the caller.
j> To be blunt, when a caller requests memory of certain size, and its request is
j> satisfied, then it is not its business to learn details beyond that (and they
j> should not be offered as well).
j> The API should be sanitized, in kernel and user space.
j> Otherwise, all kinds of charlatans will try to play hair-raising games with it.
j> If the caller wants to track the *requested size* programmatically, it is its
j> business to do it and it can be done very easily.

+1

This is the kind of API that just shouldn't exist.

--
Totus tuus, Glebius.
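The quoted claim that a caller can track the *requested size* itself can be illustrated with a small wrapper that stores the size in a header in front of each block. This is only one possible sketch in C11 user-space code; the names sized_malloc, sized_requested and sized_free are invented here and are not an existing API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Header stored immediately before the caller's block. The union with
 * max_align_t keeps the caller's pointer suitably aligned for any type.
 */
union size_hdr {
    size_t requested;
    max_align_t align;
};

/* Allocate n bytes and remember n. Returns NULL on failure or overflow. */
void *
sized_malloc(size_t n)
{
    union size_hdr *h;

    if (n > SIZE_MAX - sizeof(*h))
        return (NULL);
    h = malloc(sizeof(*h) + n);
    if (h == NULL)
        return (NULL);
    h->requested = n;
    return (h + 1);
}

/* Return the size originally requested for a block from sized_malloc(). */
size_t
sized_requested(const void *p)
{
    return (((const union size_hdr *)p - 1)->requested);
}

/* Free a block obtained from sized_malloc(); NULL is accepted. */
void
sized_free(void *p)
{
    if (p != NULL)
        free((union size_hdr *)p - 1);
}
```

Blocks from sized_malloc() must be released with sized_free() and never passed to free() or realloc() directly, since the pointer handed to the caller is not the one malloc() returned.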