Re: [networking-discuss] VirtualBox Network Driver problem with Marvell Yukon
Garrett D'Amore wrote: This looks like an OpenSolaris packaging problem, and honestly I don't know enough about it to be terribly helpful in understanding the problem. -- Garrett He's running build 111. yge was introduced in build 124; he'll need to upgrade. - Bart -- Bart Smaalders Solaris Kernel Performance ba...@cyber.eng.sun.com http://blogs.sun.com/barts You will contribute more with mercurial than with thunderbird. ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] [tools-compilers] Untunable tunables or: compiler optimizing global static int into constant value
James Carlson wrote: Chris Quenelle writes: The ELF symbols are different for global vs static symbols. Yep. If you have a list of known tunables, you should be able to audit this automatically by having the build scan the symbols after you compile. Sadly, no. Most things designed for use in /etc/system are ad-hoc. There's certainly no general list. (Outside of the 'ip' module, where we do some build time auditing for a different reason.) It's just a bit of arcana that designers and code reviewers sometimes need to know about, like the difference between .data and .bss. :-/ Of course, a declaration of

	#define	EXTERN_TUNABLE	/* Do not make static */

in a kernel header might help prevent this:

	EXTERN_TUNABLE int foo;

- Bart -- Bart Smaalders Solaris Kernel Performance ba...@cyber.eng.sun.com http://blogs.sun.com/barts You will contribute more with mercurial than with thunderbird. ___ networking-discuss mailing list networking-discuss@opensolaris.org
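A minimal sketch of the failure mode under discussion (not from the original thread; the module and tunable names are hypothetical): with static linkage, no writes within the module, and the optimizer able to see every use, the compiler may fold the initial value into each use, so a later poke of the symbol via /etc/system silently changes nothing.

	#include <sys/ddi.h>
	#include <sys/sunddi.h>

	static void my_handler(void *);

	/* hypothetical tunable: 'set mymod:my_timeout = 120' in /etc/system */
	static int my_timeout = 60;	/* 'static' is the bug being discussed */

	void
	start_timer(void)
	{
		/*
		 * Because my_timeout is static, never written, and fully
		 * visible to the optimizer, the constant 60 may be emitted
		 * here instead of a load, and the /etc/system setting is
		 * ignored.  Dropping 'static' (or using an EXTERN_TUNABLE
		 * annotation as above) keeps the load honest.
		 */
		(void) timeout(my_handler, NULL,
		    drv_usectohz(my_timeout * 1000000));
	}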
Re: [networking-discuss] New send and receive socket calls
James Carlson wrote: Yes, but I _thought_ it was clear that you wanted to get rid of select/poll in the path of the I/O, on the grounds that it results in too many system calls. Right? ... I'm saying that an interface that accepts multiple file descriptors in an attempt to avoid poll-and-figure-out-which-one behavior, but that handles only one kind of transaction (I/O) and only one kind of descriptor (a socket) might be too limited to use easily. What's the intended usage model? Kacheong's design involves either a call to port_getn or a call to select/poll, followed by a call to his new function, recvfrom_list():

	while (1) {
		select(...);		/* get list of active sockets */
		recvfrom_list();
		process_all_data();
	}

This technique will retrieve 1 available message on each socket at the cost of 2 system calls. If the traffic is not evenly distributed, more system calls are required, to the point where 2 system calls are required per message if all the traffic arrives on a single socket. This suffers from the following limitations:

* As noted above, it doesn't handle asymmetric loadings well.
* It does not easily admit the use of threads; only one thread can meaningfully poll on a set of fds at a time.
* Lack of buffer pre-posting mandates an additional copy of data when recvfrom_list() is called.

The asynchronous model is this:

	queue_list_of_async_recvfroms(port, ...);

	/* in one or more threads: */
	while (1) {
		port_getn(port, ...);
		process_all_data();
		queue_list_of_async_recvfroms(port, ...);
	}

In the case of evenly distributed inputs, this also results in 2 system calls to retrieve a message on each port. However, if we're clever enough to post more than one pending recvfrom on each socket (using more buffers), we can also handle asymmetric loadings w/o increasing work per message. We also gain the following:

* Buffers are pre-posted, so with a proper implementation and hardware, copies may be avoided. Even if copies are required, they can be done by other cpus/threads/io dma engines. This means that even single-threaded apps should see a significant reduction in UDP latency on MP hardware.
* Seamless threading is possible w/o any user locking required to manage pending IO lists; the event port mechanism provides a user-specified pointer to accompany each IO operation.
* Other pending IO operations (disk, poll(2), listen, accept, etc.) can be handled in the event loop, since event ports are designed to unify the dispatching paradigm.

Supporting asynchronous IO on sockets also admits significant performance wins on the transmit side, since zero copy is easily done. The downsides w/ asynchronous I/O are that some thought needs to be given to transmit/receive buffer management, and that it may represent a new programming pattern for networking developers. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
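A minimal sketch of the pre-posted receive model described above (not from the original thread), using interfaces that already exist for connected sockets: aio_read(3C) completions delivered to an event port via SIGEV_PORT. The recvfrom-style peer-address return being debated still needs the proposed new call, a real consumer would post several buffers per fd, and error/EOF handling is simplified for illustration.

	#include <port.h>
	#include <aio.h>
	#include <signal.h>
	#include <stdio.h>
	#include <stdlib.h>

	/*
	 * Pre-post a read on a connected socket; the completion arrives
	 * at the event port rather than requiring a poll/select pass.
	 */
	typedef struct rd {
		struct aiocb	rd_cb;
		port_notify_t	rd_pn;
		char		rd_buf[2048];
	} rd_t;

	static void
	post_read(int port, int fd, rd_t *rd)
	{
		rd->rd_pn.portnfy_port = port;
		rd->rd_pn.portnfy_user = rd;	/* returned in portev_user */
		rd->rd_cb.aio_fildes = fd;
		rd->rd_cb.aio_buf = rd->rd_buf;
		rd->rd_cb.aio_nbytes = sizeof (rd->rd_buf);
		rd->rd_cb.aio_offset = 0;
		rd->rd_cb.aio_sigevent.sigev_notify = SIGEV_PORT;
		rd->rd_cb.aio_sigevent.sigev_value.sival_ptr = &rd->rd_pn;
		if (aio_read(&rd->rd_cb) != 0) {
			perror("aio_read");
			exit(1);
		}
	}

	/* one or more threads can run this loop against the same port */
	static void
	event_loop(int port)
	{
		port_event_t ev;
		rd_t *rd;
		ssize_t len;

		for (;;) {
			if (port_get(port, &ev, NULL) != 0) {
				perror("port_get");
				exit(1);
			}
			rd = ev.portev_user;	/* our rd_t */
			len = aio_return(&rd->rd_cb);
			if (len <= 0)
				continue;	/* EOF/error handling elided */
			/* ... process len bytes of rd->rd_buf ... */
			post_read(port, rd->rd_cb.aio_fildes, rd); /* re-post */
		}
	}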
Re: [networking-discuss] New send and receive socket calls
David Edmondson wrote: If the event mechanism in the ICSC extended socket API were replaced with event ports, would the remainder fit your requirements? dme. I need to go back and read that again. I was concerned w/ the pre-registration requirements in the past. Do you have a pointer to the spec handy? Thanks - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] New send and receive socket calls
Kacheong Poon wrote: Bart Smaalders wrote: The savings here are basically the reduction of system call overhead; we still have per-filedescriptor costs on each select. Yes, the major issue is too many system calls. Also, such event loops are notoriously hard to thread; the select loop requires a dispatcher thread. Why not do the following: One reason for the simple function is that most existing code can be easily changed to use it. While this is probably not optimal, the impact can be bigger as more apps may be changed to use it. Define a new asynchronous call similar to lio_listio that permits the collection of struct sockaddr and socklen_t info. This would then generate async io completion events, which could be retrieved via an event port once the IO had completed. In lio_listio(), it returns when all the ops are complete (LIO_WAIT) or it returns immediately and an event is generated when all the ops are complete. This does not work well in the current context where the ops are for different sockets. So I suppose the above suggestion means that an event is generated for each op in the list. Correct? To use it, I suppose it can be something like

	/*
	 * Call the new function to associate the list
	 * of sockets with the port and pre-post the
	 * receive buffers.
	 */
	_new_socket_listio(port, socket_list, ...);
	while (1) {
		if (port_getn(...) == 0) {
			/* Data is in buffer for n sd */
			for (each sd in the event list)
				process_data(sd);
			...
			/*
			 * Re-associate the just processed
			 * list of sockets with the port.
			 */
			_new_socket_listio(...);
		}
	}

Comparing this with the use of recvfrom_list():

	/* Associate all sockets with the port. */
	for (...)
		port_associate(...);
	while (1) {
		if (port_getn(...) == 0) {
			/*
			 * Construct a receive list from
			 * the event list and call
			 * recvfrom_list().
			 */
			...
			recvfrom_list(...);
			for (each sd in the list) {
				process_data(sd);
				/* Re-associate the sd */
				port_associate(...);
			}
			...
		}
	}

Besides the pre-posting of buffers, I think one major difference is in the number of system calls made, as we need to call port_associate() repeatedly for recvfrom_list(). Maybe we need a port_associate_n()? The advantage of such a design is that by having multiple async read requests pending in the kernel on each fd, per-descriptor latencies are reduced significantly, since we don't have a delay due to returning from poll/select, scanning the descriptor list, calling back into the kernel w/ recvfrom_list, processing the results and then calling select/poll again. If we have something like port_associate_n(), I think most of the above problems are not true. But if an app does not use event port, it can still benefit from recvfrom_list(). This is a trade off. Also more threads can be brought to bear, a useful technique on CMT processors. Even threading-averse programmers can still benefit from multiple kernel threads copying out the data into the pre-posted receive buffers, and process the requests as they complete in a single thread. Lastly, read side I/O buffer pre-posting also enables utilization of more sophisticated NICs and such technologies as Intel's IOAT. I think the major (only?) benefit of this call is in the pre-posting of buffers. Moving forward, I guess it is needed. But with existing apps, I guess a recvfrom_list() is better. Maybe we should do both?
There is an inherent delay using recvfrom_list(); the packet needs to arrive, the waiting process needs to return from poll() and determine which descriptors have data, and then call recvfrom_list(), which will then validate each fd and attempt to copy out the data from each socket to the user-supplied buffer. For multi-threaded applications, only one thread can poll() effectively at a time; the work discovered must then be dispatched to other threads, the busy fds removed from the poll() list, and the remaining fds polled again. We cannot begin processing the next msg on a port until the previous one has been processed, since poll() is idempotent. If you use async io, on the other hand, the programmer can choose the number of buffers that he wants extant per fd, and IOs are queued for all of them. Multiple threads can
Re: [networking-discuss] New send and receive socket calls
Kacheong Poon wrote: The following loop is quite common in a lot of socket apps (the example is for reading data):

	select(on a bunch of sockets);
	foreach socket with read events {
		recv(data);
		process(data);
	}

In Solaris 10 with event port, the select() may be replaced by, say, port_getn(). But the result is the same: the app knows that there are a bunch of sockets ready to be handled. Currently, each socket needs to be handled independently. Stephen Uhler from Sun Lab suggested that we can have a new socket call which can handle a bunch of sockets at the same time. Doing this can save a lot of socket calls and context switches. Using recvfrom() as an example:

	struct rd_list {
		int		sd;
		void		*buf;
		size_t		buf_len;
		struct sockaddr	*from;
		socklen_t	*from_len;
		int		flags;
		int		*num_recv;
	};

	int recvfrom_list(struct rd_list[], size_t list_size);

Basically, struct rd_list contains all the parameters of the recvfrom() call. And the recvfrom_list() takes an array of struct rd_list. In return, it fills in each element with the appropriate data as if a non-blocking recvfrom() is called on each socket. We can also have recv_list() and recvmsg_list(). The idea can also be applied to the sending side. For example, we can have sendto_list(). What do people think of this proposal? Please comment. The savings here are basically the reduction of system call overhead; we still have per-filedescriptor costs on each select. Also, such event loops are notoriously hard to thread; the select loop requires a dispatcher thread. Why not do the following: Define a new asynchronous call similar to lio_listio that permits the collection of struct sockaddr and socklen_t info. This would then generate async io completion events, which could be retrieved via an event port once the IO had completed. The advantage of such a design is that by having multiple async read requests pending in the kernel on each fd, per-descriptor latencies are reduced significantly, since we don't have a delay due to returning from poll/select, scanning the descriptor list, calling back into the kernel w/ recvfrom_list, processing the results and then calling select/poll again. Also more threads can be brought to bear, a useful technique on CMT processors. Even threading-averse programmers can still benefit from multiple kernel threads copying out the data into the pre-posted receive buffers, and process the requests as they complete in a single thread. Lastly, read side I/O buffer pre-posting also enables utilization of more sophisticated NICs and such technologies as Intel's IOAT. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
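To make the proposed calling pattern concrete, here is a hypothetical use of the call (not from the original thread; recvfrom_list() never existed, the return convention is assumed, and sd[] and process_data() stand in for the application's own sockets and handler):

	#define	NSOCKS	8
	#define	BUFSZ	2048

	struct rd_list rl[NSOCKS];
	int nrecv[NSOCKS];
	char bufs[NSOCKS][BUFSZ];
	int i;

	for (i = 0; i < NSOCKS; i++) {
		rl[i].sd = sd[i];		/* sockets opened elsewhere */
		rl[i].buf = bufs[i];
		rl[i].buf_len = BUFSZ;
		rl[i].from = NULL;		/* connected; no peer address */
		rl[i].from_len = NULL;
		rl[i].flags = 0;
		rl[i].num_recv = &nrecv[i];	/* bytes received, per socket */
	}
	if (recvfrom_list(rl, NSOCKS) == 0) {
		for (i = 0; i < NSOCKS; i++)
			if (nrecv[i] > 0)
				process_data(sd[i], bufs[i], nrecv[i]);
	}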
Re: [networking-discuss] New send and receive socket calls
James Carlson wrote: Bart Smaalders writes: Kacheong Poon wrote: Define a new asynchronous call similar to lio_listio that permits the collection of struct sockaddr and socklen_t info. This would then generate async io completion events, which could be retrieved via an event port once the IO had completed. This seems reasonable to me, but I think the error semantics have to be worked out _very_ carefully. The sendfile/sendfilev fiasco gives a pretty good cautionary tale. As for making it sockets-only, I'd be worried that application programmers would be sufficiently constrained that they would end up being unable to use the resulting interface. Every time we've tried to come up with one of these special new interfaces, the first question asked is, great, but my polling loop code is currently pretty generic, so how do I get STREAMS, character devices, other I/O into this interface? We already can do async IO to regular files, pipes, raw disks, etc. using the existing async io routines. The *msg, etc., calls need additional passed-in/returned information (sender address and length), so the async IO calls to initiate such IO requests need to be different, and if we want kernel level parallelism sockfs needs modifications. However, there's nothing stopping one from writing a single loop that issues disk and udp network IO requests, and handles the resulting events, just as a single event port loop can handle async IO requests, timer completion events, etc., today. For connected sockets (tcp or udp) there doesn't appear to be any reason that the existing routines cannot be used. Right now we have no async IO support in sockfs, so libaio (now in libc actually, thanks raf) emulates true async IO w/ user-level threads behind your back. I agree that, as usual, the error semantics are critical. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] New send and receive socket calls
Andrew Gallatin wrote: Bart Smaalders writes: Lastly, read side I/O buffer pre-posting also enables utilization of more sophisticated NICs and such technologies as Intel's IOAT. Does OpenSolaris currently support IOAT? Thanks, Drew A project to use this is underway, from what I understand. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] New send and receive socket calls
Jeremy Harris wrote: Bart Smaalders wrote: Lastly, read side I/O buffer pre-posting also enables utilization of more sophisticated NICs and such technologies as Intel's IOAT. What happened to the Async-Sockets standardization effort? - Jeremy Harris I'm not sure where it stands now; I definitely had issues with yet another event model for Solaris. The necessity of pre-registering buffers was also really clunky; I guess that was felt to be needed for user-level dma attempts. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Nmap v4.20 and 802.1q
What does snoop do on the same interface? I assume of course you're running with appropriate privileges for the operation. - Bart This message posted from opensolaris.org ___ networking-discuss mailing list networking-discuss@opensolaris.org
[networking-discuss] Re: [perf-discuss] Followup on microoptimizing ip_addr_match()
Dan McDonald wrote: (This time, using e-mail instead of the web form...) Hello again! After what suggestions I saw (all on networking-discuss...), I put together a multiple-choice question. Consider the _LITTLE_ENDIAN section in this code fragment, which is known to be an improvement on SPARC:

= (Cut up to and including here.) =

#ifdef _LITTLE_ENDIAN
/* For little-endian, we really need to think about this. */

#if 1
/* Clever math - thanks Nico! */
#define	PREFIX_LOW32(pfxlen) \
	((((uint8_t)(0xFF00 >> ((pfxlen) & 0x7))) << ((pfxlen) & ~0x7)) | \
	    (0xFFFFFF >> ((31 - (pfxlen)) & ~0x7)))
#endif

#if 0
/* ntohl() the big-endian solution */
#define	PREFIX_LOW32(pfxlen)	ntohl(0xFFFFFFFF << (32 - (pfxlen)))
#endif

#if 0
/* or use a table lookup */
static uint32_t masks[] = {
	0x00000000, 0x00000080, 0x000000C0, 0x000000E0,
	0x000000F0, 0x000000F8, 0x000000FC, 0x000000FE,
	0x000000FF, 0x000080FF, 0x0000C0FF, 0x0000E0FF,
	0x0000F0FF, 0x0000F8FF, 0x0000FCFF, 0x0000FEFF,
	0x0000FFFF, 0x0080FFFF, 0x00C0FFFF, 0x00E0FFFF,
	0x00F0FFFF, 0x00F8FFFF, 0x00FCFFFF, 0x00FEFFFF,
	0x00FFFFFF, 0x80FFFFFF, 0xC0FFFFFF, 0xE0FFFFFF,
	0xF0FFFFFF, 0xF8FFFFFF, 0xFCFFFFFF, 0xFEFFFFFF
};
#define	PREFIX_LOW32(pfxlen)	(masks[pfxlen])
#endif

/*
 * sleazy prefix-length-based compare.
 * another inlining candidate..
 */
boolean_t
ip_addr_match(uint32_t *addr1, int pfxlen, uint32_t *addr2)
{
	while (pfxlen >= 32) {
		if (*addr1 != *addr2)
			return (B_FALSE);
		addr1++;
		addr2++;
		pfxlen -= 32;
	}
	return (pfxlen == 0 ||
	    (((*addr1 ^ *addr2) & PREFIX_LOW32(pfxlen)) == 0));
}

= (Cut up to and including here.) =

And here's the original code:

= (Cut up to and including here.) =

/*
 * sleazy prefix-length-based compare.
 * another inlining candidate..
 */
boolean_t
ip_addr_match(uint8_t *addr1, int pfxlen, in6_addr_t *addr2p)
{
	int offset = pfxlen >> 3;
	int bitsleft = pfxlen & 7;
	uint8_t *addr2 = (uint8_t *)addr2p;

	/*
	 * and there was much evil..
	 * XXX should inline-expand the bcmp here and do this 32 bits
	 * or 64 bits at a time..
	 */
	return ((bcmp(addr1, addr2, offset) == 0) &&
	    ((bitsleft == 0) ||
	    (((addr1[offset] ^ addr2[offset]) &
	    (0xff << (8 - bitsleft))) == 0)));
}

= (Cut up to and including here.) =

Experiments run using an IPsec and IKE test suite and using DTrace's FBT indicate that the original function outperforms both the table lookup and the ntohl() big-endian solution. I have my suspicions about our in-lining performance of ntohl(), but that's another topic. I haven't run Nico's clever math solution yet, but would like to know what the peanut gallery thinks. BTW, here are the two big buckets on an opteron box, with bcmp():

	Number of calls == 410221.
	128 |@@	308602
	256 |@@	99140

and with table-lookup:

	Number of calls == 412346.
	128 |	292893
	256 |@@@	117017

The htonl() ones were worse than table-lookup. Those two buckets account for the vast majority (> 99% of the sampled calls) of the calls to ip_addr_match(). The sparc gains are bigger than the opteron lossage. Here's bcmp():

	Number of calls == 398975.
	128 |@@	177120
	256 |	159558
	512 |@@	58487

and using the simple math-only solution:

	Number of calls == 397934.
	128 |	319443
	256 |@@	60991
	512 |@	5972

So I'm not sure what to do. Any clues are, as always, welcome! Thanks, Dan ___ perf-discuss mailing list [EMAIL PROTECTED] On x86, why not inline a bswap instruction? - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
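A sketch of that bswap suggestion (not from the original thread): build the big-endian mask and byte-swap it. The __builtin_bswap32() intrinsic is a GCC-ism used here as a stand-in for whatever inlining mechanism the build actually provides, and like the other variants it is only valid for 1 <= pfxlen <= 31; ip_addr_match() short-circuits pfxlen == 0 before the macro is evaluated, avoiding the undefined 32-bit shift.

	#include <stdint.h>

	/*
	 * Little-endian PREFIX_LOW32 as a single byte swap of the
	 * big-endian mask; on x86 this compiles to one bswap
	 * instruction instead of a table load.
	 */
	#define	PREFIX_LOW32(pfxlen) \
		__builtin_bswap32(0xFFFFFFFFu << (32 - (pfxlen)))

For example, pfxlen == 9 gives 0xFF800000 big-endian and 0x000080FF after the swap, matching masks[9] in the table above.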
Re: [networking-discuss] Problems with B64+ and Netopia DSL router (PPPoe)
John Weekley wrote: I recently moved my home firewall from an old B56 box (Gateway 400Mhz, 512 MB, iprb external NIC, rge0 internal NIC) with no changes to tcp settings via ndd to a B64 box (AMD 64 3200+, 2GB RAM, iprb external NIC, rge internal), also with no changes to tcp settings. To complicate matters, I was forced to upgrade an old Cayman 3220H router to a Netopia 3346N. Once I switched to the Netopia, I noticed, while snooping the external interface, a few ICMP Destination unreachable messages. I was able to ping and eventually connect to the site that was showing unreachable. It wasn't noticeable to users with the B56 firewall, but with B64 and B65, tcp connections frequently stall, and it takes a few attempts with firefox to establish the connection. OpenSolaris.org is one of the sites that exhibits this behavior, along with msnbc.com. I've set the MTU to 1462 on the external interface and disabled path MTU discovery, with no success. I'm stumped. What's changed between B56 and B64+ to cause this behavior? This sounds like an intermittent routing problem. How are you establishing your static routes? - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Adding a permanent host routing in S10/b63
Dan McDonald wrote: On Wed, May 23, 2007 at 12:30:24PM -0700, Sean Clarke wrote: I add a static route to a host using the following command: route add -host x.x.x.x y.y.y.y -static I would like that to be added automatically on boot up. In Nevada (and possibly one of the S10 updates), there's the -p flag to route(1m): route -p add -host x.x.x.x y.y.y.y Note that this route can and will disappear if the link goes down on build 55; I don't know if that behavior has been changed. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Adding a permanent host routing in S10/b63
James Carlson wrote: Bart Smaalders writes: Dan McDonald wrote: On Wed, May 23, 2007 at 12:30:24PM -0700, Sean Clarke wrote: I add a static route to a host using the following command: route add -host x.x.x.x y.y.y.y -static I would like that to be added automatically on boot up. In Nevada (and possibly one of the S10 updates), there's the -p flag to route(1m): route -p add -host x.x.x.x y.y.y.y Note that this route can and will disappear if the link goes down on build 55; I don't know if that behavior has been changed. I'm not sure what you're talking about. The route should disappear only if either (a) the user specifies an output interface with -ifp or (b) the route is non-static. Anything else ought to be filed as a bug. Note also that Solaris 10 build 63 (!) shouldn't be in use *anywhere*. It's left over from the Solaris 10 beta test cycle, and should have been *removed* per the license agreement long ago. If you have it somewhere, you shouldn't. Please upgrade (or reinstall) before doing anything else. (And if you're looking here for support, you should be using an OpenSolaris based distribution in order for your message to be on-topic. Otherwise, for S10 and updates, please contact Sun's support group.) I was talking about Nevada; sorry, misread the title. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] what is Deadlock: cycle in blocking chain
Tao Chen wrote: Hello, There is a deadlock condition in my driver code. I do not understand the message Deadlock: cycle in blocking chain; how does it happen? How do I avoid it? Tom

	panic[cpu0]/thread=ff0003eddc80: Deadlock: cycle in blocking chain
	ff0003eddaa0 genunix:turnstile_block+9f3 ()
	ff0003eddb20 unix:mutex_vector_enter+38d ()
	ff0003eddb50 qla:qla_link_state_machine+22 ()
	ff0003eddb70 qla:qla_timer+78 ()
	ff0003eddbd0 genunix:callout_execute+b1 ()
	ff0003eddc60 genunix:taskq_thread+1dc ()
	ff0003eddc70 unix:thread_start+8 ()

This message posted from opensolaris.org ___ networking-discuss mailing list networking-discuss@opensolaris.org Classic deadlock:

	thread A holds lock 1
	thread B holds lock 2
	thread A now attempts to acquire lock 2
	thread B now attempts to acquire lock 1

Solaris will detect an arbitrary number of threads in the blocking chain... You need to grab nested locks in a consistent order... - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
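A minimal sketch of the consistent-order rule (not from the original post; the locks are hypothetical and assumed initialized with mutex_init() at attach time). Pick one global order for any locks that can be held together and follow it everywhere, including in timer and taskq callbacks such as the qla_timer() path in the panic stack above:

	#include <sys/ksynch.h>

	static kmutex_t hw_lock;	/* level 1: always acquired first */
	static kmutex_t sm_lock;	/* level 2: only while hw_lock is held */

	void
	link_state_update(void *arg)
	{
		mutex_enter(&hw_lock);
		mutex_enter(&sm_lock);
		/* ... walk/modify shared state ... */
		mutex_exit(&sm_lock);
		mutex_exit(&hw_lock);
	}

	/*
	 * The deadlock arises when another code path takes the same
	 * pair in the reverse order:
	 *
	 *	mutex_enter(&sm_lock);
	 *	mutex_enter(&hw_lock);	<- cycle in blocking chain
	 *
	 * Either fix that path's order, or drop sm_lock before
	 * acquiring hw_lock.
	 */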
Re: [networking-discuss] solaris (nv 57) great on Shuttle SN85g but one problem
matthew wrote: kudos to the team that hammered out the new network config tool for the NV builds!!! it has been much needed. although manually editing the files is easy, this is just quicker and expected in today's world, especially when you have to work around os x server admins all day that complain about solaris. i have nv 57 installed on a shuttle SN85 and it works wonderfully, but i cannot get it to recognize the internal Nvidia network card... i have tried everything i can think of. does anyone know what cards will for sure work on solaris nv builds? i am looking to just buy a few inexpensive pci cards and try... the sun hwl does not seem to list any practical, recent or common cards that users are likely to have. suse 10 works great on this box, but i am pulling for solaris. Netgear GA311 works well OOTB, as does AirLink GB. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] non-blocking kernel sockets
For sockets for user programs, is there any interest in having async I/O just work(TM) for connected tcp sockets? - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
[networking-discuss] Re: Re: Re: problems with nge driver on build 53 on K8N mobo mobo
Has there been any progress on this bug? Thanks - Bart This message posted from opensolaris.org ___ networking-discuss mailing list networking-discuss@opensolaris.org
[networking-discuss] Re: [nwam-discuss] Network proxy
Kacheong Poon wrote: Calum Benson wrote: That's a bit of an over-generalisation. Most apps on the OpenSolaris desktop *do* use it. (Even Mozilla used to, although we seem to have decided not to carry that patch across into Firefox.) So perhaps you ought to clarify exactly what class of applications you're interested in solving the problem for...? I guess the class of apps is not really defined... We just want to make most apps, graphical or not, work with NWAM seamlessly if possible. And I guess most people will consider Firefox to be an important app to work with NWAM. Do you have a suggestion on this issue if patching Firefox to use the GConf key may not be an option? Has anyone considered creating a local proxy that would handle this? Apps would always use the local proxy; its configuration would be updated dynamically by NWAM/user. We could ameliorate possible performance issues by adding the ability to splice sockets together in the kernel once the connections were established. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Network proxy
Paul Jakma wrote: On Wed, 17 Jan 2007, Darren Reed wrote: It would be nice if we could change every instance of 'getenv(http_proxy)' with something just as simple that took into account:

- $http_proxy
- Bonjour
- WPAD
- whatever other external things need to be used here

and just link programs using that function to a special library if required. Wouldn't it make more sense to be able to dynamically change proxies transparent to the application? That way application restart wouldn't be necessary, and one OS proxy dialog would suffice. Could one use some ipfilter magic to redirect http traffic through a proxy? Or does running a local proxy make more sense? Its configuration would change, but local apps would always contact the local proxy. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [nwam-discuss] Re: [networking-discuss] Network proxy
James Carlson wrote: Bart Smaalders writes: suffice. Could one use some ipfilter magic to redirect http traffic through a proxy? There's no way to know which outbound traffic corresponds to http traffic. There's no requirement that remote sites actually put their web servers on port 80, and many do not. Of course; duh. I need to drink more coffee. Or does running a local proxy make more sense? Its configuration would change, but local apps would always contact the local proxy. That just moves the problem from one place to another. Perhaps the theory is that (unlike the browser) we can make this proxy work right ... ? Well, it does handle the restart problem, and does give us a single point of reconfiguration, much nicer than restarting all processes using the wrong environment variable... - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Re: problems with nge driver on build 53 on K8N mobo mobo
Jason J. W. Williams wrote: Hi David, Do you have a duplex-mismatch? -J On 1/11/07, David Helm [EMAIL PROTECTED] wrote: I too have a problem with the latest nge driver (version 1.14 according to modinfo) on the above mobo. (I'm running Solaris 10_U2 and Solaris 10_U3) The interface will not even plumb with dhcp (I use inetmenu to configure network interfaces) and if given a static IP, the interface plumbs and ifconfig says that everything is ok, but the interface cannot see anything on the network. Version 1.14 of the driver comes with the 122530-05 nge patch for Solaris x86. Interestingly enough, if the driver is rolled back to 1.13 (patch 122530-02) everything works just fine. Anyone have any clues as to why? Thanks -Dave I'm seeing the same problem on a K8. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Re: problems with nge driver on build 53 on K8N mobo mobo
Jason J. W. Williams wrote: Hi Guys, Didn't have time to write more at length last time. If the K8 is an nForce 4 chipset, you'll have connectivity problems if you're not auto-negotiating on both sides of the Ethernet cable. That might explain the problems. We had Sun swap all our X2100s for X2100 M2s because of the problem. I can post a full internal write-up from my company on the problem if that ends up being the issue. Best Regards, Jason Well, hmm... I've had no trouble w/ a tyan 2865, which is what's in a u20. The K8 is problematic, but that's plugged into a different switch (both are consumer gigabit devices). I'll try booting the k8 plugged into the same switch as my 2865 to rule out that problem. Thanks - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Driver for ASUS A8N-VM CSM NIC
Rainer Heilke wrote: Greetings, everyone. I hope this is an OK place to post this. I did a search on the forum, but didn't find anything applicable. I'm trying to find a NIC driver for the on-board Gigabit NIC of the ASUS A8N-VM CSM motherboard. This uses the NVIDIA nForce 430 chipset. I had thought this would be fairly standard, but I can't seem to get Solaris to understand it. I'm not too worried about getting the full speed out of it, as my network is only 100hdx. The unit is currently networked using a NIC card that Solaris sees just fine (installed after I found the onboard NIC wouldn't configure). Any help would be greatly appreciated. Thanks in advance, Rainer Which version of Solaris are you using? - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
[networking-discuss] Re: [osol-discuss] Project Proposal: Packet Event Framework (PEF)
Yu Xiangning wrote: Hello OpenSolaris folks, I would like to open an OpenSolaris project - Packet Event Framework (PEF), on behalf of the PEF project team. The Packet Event Framework project is a follow-on to FireEngine, which will provide a framework for fine-grain modularity of the network stack based on the execution of a series of event functions as specified by the IP classifier. PEF will provide the following benefits:

- Better observability, as the framework will support dynamic changes to an event vector.
- Improved performance due to:
  - Smaller code footprint, by executing only the code needed based on packet classification.
  - Iterative execution of events, which leads to smaller stack depth.
  - Fewer parses of the packet. Packet parsing is done once, at classification time.
- Support for multiple vertical perimeters (squeue_t), so a packet can traverse from one squeue_t to the next. Currently FireEngine supports a single IP squeue_t, requiring packet processing to use both STREAMS based queuing and FireEngine IP squeue_t queuing. As a result, a packet can be processed totally within PEF and does not require any STREAMS processing.
- CMT (Chip Multi-Threading) based processors will additionally benefit from the multiple squeue_t support through pipe-lined processing of a connection. Multiple cores and/or threads of a core can process different layers of the stack. Also, fanout of packets such that multiple packets can be processed simultaneously.

Thanks in advance for your support! - yxn ___ opensolaris-discuss mailing list opensolaris-discuss@opensolaris.org +1 Sounds like this will improve performance significantly, esp. as networks increase in performance faster than single cpu cores - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
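As an illustration of the "series of event functions" idea (purely hypothetical code; PEF's real interfaces were not published with this proposal, so every name below is invented):

	/* hypothetical sketch of an event vector chosen by the classifier */
	typedef struct packet packet_t;
	typedef enum { PE_CONTINUE, PE_CONSUMED, PE_DROPPED } pe_result_t;
	typedef pe_result_t (*pkt_event_fn_t)(packet_t *);

	/*
	 * The classifier parses the packet once and hands back a
	 * NULL-terminated vector holding only the event functions this
	 * flow needs (e.g. no IPsec entries for a cleartext TCP
	 * connection); the stack then iterates rather than recursing,
	 * which is where the smaller stack depth and code footprint
	 * described above would come from.
	 */
	static void
	pef_drain(packet_t *pkt, pkt_event_fn_t *vec)
	{
		int i;

		for (i = 0; vec[i] != NULL; i++)
			if (vec[i](pkt) != PE_CONTINUE)
				break;
	}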
Re: [networking-discuss] Network interface utilization calculation
Louwtjie Burger wrote: Hi there. Can the utilization (0 - 100%) of a network interface be calculated using kstat? This message posted from opensolaris.org ___ networking-discuss mailing list networking-discuss@opensolaris.org On some interfaces, yes; the more modern ones report sufficient stats to do this. On bge0, for example:

	module: bge	instance: 1
	name:	bge1	class:	net
		brdcstrcv	595886
		brdcstxmt	2147
		collisions	0
		crtime		47.871811198
		ierrors		25683
		ifspeed		1000000000
		ipackets	10360177
		ipackets64	10360177
		multircv	4918
		multixmt	3
		norcvbuf	0
		noxmtbuf	0
		obytes		326843204
		obytes64	891696
		oerrors		0
		opackets	9542902
		opackets64	9542902
		rbytes		228242390
		rbytes64	8818176982
		snaptime	260566.166158187
		unknowns	102008

You know the speed and the total bytes output and input, so you can calculate the %util easily. You can even figure out whether or not the link is in half-duplex mode. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
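A sketch of the calculation (not from the original post): sample rbytes64/obytes64 twice via libkstat(3LIB) and scale by ifspeed, which is reported in bits per second. The "bge" module, instance 1, and the 5 second interval are assumptions matching the example output above; compile with -lkstat. On a full-duplex link you might count each direction against ifspeed separately rather than summing, which is part of why utilization is fuzzy.

	#include <kstat.h>
	#include <stdio.h>
	#include <unistd.h>

	/* read one named uint64 statistic from module:instance:name */
	static uint64_t
	link_stat(kstat_ctl_t *kc, char *mod, int inst, char *name, char *stat)
	{
		kstat_t *ksp = kstat_lookup(kc, mod, inst, name);
		kstat_named_t *kn;

		if (ksp == NULL || kstat_read(kc, ksp, NULL) < 0 ||
		    (kn = kstat_data_lookup(ksp, stat)) == NULL)
			return (0);
		return (kn->value.ui64);
	}

	int
	main(void)
	{
		kstat_ctl_t *kc = kstat_open();
		uint64_t speed, r0, o0, r1, o1;
		int secs = 5;

		speed = link_stat(kc, "bge", 1, "bge1", "ifspeed");
		r0 = link_stat(kc, "bge", 1, "bge1", "rbytes64");
		o0 = link_stat(kc, "bge", 1, "bge1", "obytes64");
		(void) sleep(secs);
		r1 = link_stat(kc, "bge", 1, "bge1", "rbytes64");
		o1 = link_stat(kc, "bge", 1, "bge1", "obytes64");

		/* bytes moved in both directions as a fraction of line rate */
		(void) printf("util: %.1f%%\n",
		    100.0 * 8 * ((r1 - r0) + (o1 - o0)) /
		    (secs * (double)speed));
		(void) kstat_close(kc);
		return (0);
	}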
Re: [networking-discuss] Re: Network interface utilization calculation
Louwtjie Burger wrote: So, take obytes64/rbytes64 at 2 different time periods, let's say 5 seconds apart. Add the difference in bytes to get the total bytes transferred during the time period. Divide by 5 secs and you have the bytes/sec throughput. From here on you can just divide by the Gbit speed. A 1Gbit interface does 128MB/s ... but isn't this theoretical? Yes, I've seen 12MB/s on 10Mbit ethernet... but I seldom see 100MB/s on Gbit ethernet. What would be the correct value to consider as 100% utilization? I'd use 128MB/sec. Utilization is a fuzzy enough number as it is. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Wake On Lan support?
Peter Memishian wrote: Does/Will Solaris have Wake On Lan support for any NIC? I would like this for remote management of some of my systems. There are no documented interfaces at this point to enable Wake-on-LAN or control the magic packets. There is work going on as part of the power management efforts to support wake on lan... - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Handling thousands of TCP connections
Rao Shoaib wrote: Try using port events. man port_get(3C) and port_associate(3C). Rao Attached please find an example program on using event ports to handle tcp connections. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts

/*
 * event ports example - tcp echo service
 * echos all inputs, handles congestion, refusal to read
 * socket by clients, etc.  If stdin receives a EOF,
 * port_alert is used to cause all threads to exit.
 *
 * use mconnect or telnet to port 35000 to test
 * Truss the binary to really watch whats happening...
 *
 * cc -D_REENTRANT eports.c -o eports -lsocket
 *
 * Bart Smaalders 7/20/04
 */

#include <port.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <poll.h>
#include <pthread.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <strings.h>
#include <stdlib.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <thread.h>

#define	PORT_NUM	35000	/* arbitrary */
#define	BLEN		1024

typedef struct conn {
	int	(*cn_callback)(struct conn *);
	int	cn_fd;
	int	cn_total;	/* totals sent */
	int	cn_bcnt;	/* bytes to be sent */
	char	cn_buffer[BLEN];
} conn_t;

static int port;

/*ARGSUSED*/
static void *
thread_loop(void *arg)
{
	port_event_t ev;
	conn_t *ptr;

	(void) printf("Thread %d starting\n", pthread_self());

	/*CONSTCOND*/
	while (1) {
		if (port_get(port, &ev, NULL) < 0) {
			perror("port_get");
			exit(1);
		}
		if (ev.portev_source == PORT_SOURCE_FD) {
			ptr = (conn_t *)ev.portev_user;
			(void) ptr->cn_callback(ptr);
		} else
			break;
	}
	(void) printf("Thread %d exiting\n", pthread_self());
	return (arg);
}

static conn_t *
get_conn()
{
	conn_t *ptr = malloc(sizeof (conn_t));

	if (!ptr) {
		perror("malloc");
		exit(1);
	}
	bzero(ptr, sizeof (*ptr));
	return (ptr);
}

static int
echo_func(conn_t *ptr)
{
	int wrote;
	int red;

	/*
	 * if there's no pending data waiting to be echo'd back,
	 * we must be ready to read some
	 */
	if (ptr->cn_bcnt == 0) {	/* need to read */
		red = read(ptr->cn_fd, ptr->cn_buffer, BLEN);
		if (red <= 0) {
			(void) printf("Closing connection %d - "
			    "echoed %d bytes\n", ptr->cn_fd, ptr->cn_total);
			(void) close(ptr->cn_fd);
			free(ptr);
			return (0);
		}
		ptr->cn_bcnt = red;
	}
	/*
	 * if we have data, we need to write
	 */
	if (ptr->cn_bcnt > 0) {
		wrote = write(ptr->cn_fd, ptr->cn_buffer, ptr->cn_bcnt);
		if (wrote > 0)
			ptr->cn_total += wrote;
		if (wrote < 0) {
			if (errno != EAGAIN) {
				(void) printf("Closing connection %d - "
				    "echoed %d bytes\n", ptr->cn_fd,
				    ptr->cn_total);
				(void) close(ptr->cn_fd);
				free(ptr);
				return (0);
			}
			wrote = 0;
		}
		if (wrote < ptr->cn_bcnt) {
			if (wrote != 0) {
				(void) memmove(ptr->cn_buffer,
				    ptr->cn_buffer + wrote,
				    ptr->cn_bcnt - wrote);
				ptr->cn_bcnt -= wrote;
			}
			/*
			 * we managed to write some, but still have
			 * some left.  Wait for further drainage
			 */
			if (port_associate(port, PORT_SOURCE_FD, ptr->cn_fd,
			    POLLOUT, ptr) < 0) {
				perror("port_associate");
				exit(1);
			}
		} else {
			/*
			 * we wrote it all
			 * go back to reading
			 */
			ptr->cn_bcnt = 0;
			if (port_associate(port, PORT_SOURCE_FD, ptr->cn_fd,
			    POLLIN, ptr) < 0) {
				perror("port_associate");
				exit(1);
			}
		}
	}
	return (0);
}

static int
listen_func(conn_t *ptr)
{
	struct sockaddr_in addr;
	int alen = sizeof (addr);
	conn_t *new = get_conn();

	if ((new->cn_fd = accept(ptr->cn_fd, (struct sockaddr *)&addr,
	    &alen)) < 0) {
		perror("accept");
		exit(1);
	}
	new->cn_callback = echo_func;
	/*
	 * use non-blocking sockets so we don't hang threads if
	 * clients are not reading their return values
	 */
	if (fcntl(new->cn_fd, F_SETFL, O_NDELAY) < 0) {
		perror("fcntl");
		exit(1);
	}
	/*
	 * associate new tcp connection w/ port so we can get events from it
	 */
	if (port_associate(port, PORT_SOURCE_FD, new->cn_fd, POLLIN,
	    new) < 0) {
		perror("port_associate");
		exit(1);
	}
	/*
	 * re-associate listen_fd so we can accept further connections
	 */
	if (port_associate(port, PORT_SOURCE_FD, ptr->cn_fd, POLLIN,
	    ptr) < 0) {
		perror("port_associate");
		exit(1);
	}
	(void) printf("New connection %d\n", new->cn_fd);
	return (0);
}

/*ARGSUSED*/
int
main(int argc, char *argv[])
{
	int lsock;
	int optval;
	int i;
	pthread_t tid;
	struct sockaddr_in server;
	conn_t *ptr;

	(void) sigignore(SIGPIPE);

	if ((port = port_create()) < 0) {
		perror("port_create");
		exit(1);
	}
	if ((lsock = socket(AF_INET, SOCK_STREAM, 0)) < 0) {
		perror("socket:");
		exit(1);
	}
	server.sin_family = AF_INET;
	server.sin_addr.s_addr = htonl(INADDR_ANY);
	server.sin_port = htons(PORT_NUM);
	optval = 1;
	if (setsockopt(lsock, SOL_SOCKET, SO_REUSEADDR, (char *)&optval,
	    4) < 0) {
		perror("setsocketopt:");
		exit(1);
	}
	if (bind(lsock, (struct sockaddr *)&server, sizeof (server)) < 0) {
		perror("bind:");
		exit(2);
	}
	(void) listen(lsock, 10
Re: [networking-discuss] socket lib in solaris?
James Carlson wrote: The point is that having two is confusing, and of the two, 3XNET is better for modern applications (i.e., those attempting to use socket options), so I'd recommend it first. The other one seems to have little going for it but the well-known name and prestige location in the man page search path. So why are there two different libraries? Isn't the xnet functionality a proper superset of libsocket? - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Re: Packet Filtering Hooks Design Document Review
James Carlson wrote: Bart Smaalders writes: The rules in any single ipf.conf file should describe a consistent, safe set of ipfilter rules for a single operating state. They should be either all applied or none. I don't think it's as simple as that in general. Suppose my configuration says this:

	block in quick on foobar0 from ! 192.168.254.0/24 to any

Should the rule set fail to load if foobar0 doesn't exist in the system? What should it do if that interface shows up later? What should it do if I have (or later gain) *OTHER* interfaces on the system that are not listed in the current rules? As far as I know, there's currently no way to express the idea that any new interface should not be brought up unless there are matching filter rules ready to go for that interface, so it seems to me that there's a gap between the idea of an all or none policy and what would work. Or perhaps we should just say that

	block in all
	block out all

should be the first lines in all rule sets, thus blocking IO on interfaces not explicitly configured in the rule set. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Re: Packet Filtering Hooks Design Document Review
Darren Reed wrote: If you start putting settings in this file, what effect does a remove have on them? Or maybe only part of the file or certain lines are recognised, depending on command line switches, or it becomes a very messy situation. The sendmail.cf file is a great historical example of what happens when you merge setting options and policy. The rules in any single ipf.conf file should describe a consistent, safe set of ipfilter rules for a single operating state. They should be either all applied or none. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
Re: [networking-discuss] Re: [laptop-discuss] Re: [approach-discuss] first draft of Network Auto-Magic architecture
If we don't do automatic reconfigs of network state, is there a way to ensure that network operations depending on a currently unavailable network fail fast? If my wireless link has dropped, waiting for YP/DNS/LDAP to time out is going to be rather tiresome; I'd much rather it failed right away with a "no route to host" kind of message. - Bart -- Bart Smaalders Solaris Kernel Performance [EMAIL PROTECTED] http://blogs.sun.com/barts ___ networking-discuss mailing list networking-discuss@opensolaris.org
[networking-discuss] Re: Clearview IP-Level Observability Devices design review (due 11/8)
What are the tradeoffs in supporting snooping on logical devices, e.g.: snoop -I /dev/bge0:1 You wrote: Opening these devices will provide access to all IP packets with addresses associated with the interface. This includes both IPv4 and IPv6 traffic, and addresses hosted on logical interfaces. For this reason, there is no /dev/ipnet/eri0:1; instead opening /dev/ipnet/eri0 will provide all traffic that is destined for, or originating from, any address hosted on eri0. Clearly, the ability to filter is already there, as zones provide access to only their packets; would the difficulties in extending this to all logical addresses (in global and local zones) outweigh the utility? For example, is it useful to be able to snoop on a single zone's traffic on a shared physical interface in the global zone w/o invoking application-specific filtering mechanisms? It would seem to make debugging IPFilter configurations considerably easier, as well. - Bart This message posted from opensolaris.org ___ networking-discuss mailing list networking-discuss@opensolaris.org