Hey Steven,

Unfortunately the suggestions you listed haven't solved my problem. However, given that I'm still seeing very high avgTimedEventLatencyRatios, I'm considering moving to a distributed emulation in order to increase the resources available to the emulator. Is this something you would typically expect to make a big difference?
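For reference, this is roughly how I'm sampling the statistic on each node (node-1 is just an example hostname, and I'm assuming the '*' all scope picks up the relevant layer):

```shell
# Query every statistic from all layers on the node and pull out the
# timer latency ratio. As I understand it, the ratio relates how late
# timed events fire to the interval they were scheduled for.
emanesh node-1 get stat '*' all | grep avgTimedEventLatencyRatio
```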
Before I do that, I'd just like to confirm my understanding of some of the packet loss statistics recorded by emanesh. The columns output by "emanesh nodeid get table '*' mac UnicastPacketDropTable0" are:

NEM | SINR | Reg Id | Dst MAC | Queue Overflow | Bad Control | Bad Spectrum Query | Flow Control | Duplicate | Rx During Tx | Hidden Busy

Firstly, do the packet drops listed here refer to packets sent by the local NEM, packets received, or both?

Secondly, in my emulations, packets are only dropped due to SINR, Duplicate, Rx During Tx, and Hidden Busy. As I understand it, SINR indicates a packet was dropped because of interference, i.e. two or more packets were received concurrently at this node and the resulting SINR was too low for the packet to be received. Duplicates obviously refer to MAC-layer retries. Rx During Tx presumably refers to a packet received while this node is trying to send. What exactly does the 'Hidden Busy' field indicate? Is it something to do with hidden terminals?

Thanks for your help, hopefully I'll get this sorted soon!
Dan

> -----Original Message-----
> From: Steven Galgano [mailto:[email protected]]
> Sent: 24 November 2015 01:15
> To: O'Keeffe, Daniel
> Cc: [email protected]
> Subject: Re: [emane-users] Fidelity of 802.11 MAC implementation
>
> Dan,
>
> We achieve the best emulation performance using a CPU (core) per
> container, where each container is treated as an individual wireless
> node. Each node runs a single emane instance, the protocol/service
> instance(s) under test, data collection processes and traffic
> generator processes. Nodes only communicate using the emane OTA,
> event channel and test control system backchannel
> (start/shutdown/monitor). No other per-node processes run on the host
> or in other containers not specifically assigned to the node.
>
> Depending on the scenario (e.g., high traffic loads), we may assign
> additional CPUs to one or more of the nodes.
> Once assigned, those CPUs are only available to the assigned node
> (container). For some at-scale tests we assign multiple nodes to a
> single CPU, trading some amount of fidelity for node count.
>
> I would suggest, if you have not done so already, that you deploy
> your experiment on a server configured without a display manager
> (graphical user interface). This is one of the first recommendations
> we make to our customers.
>
> Also, verify that you are enabling realtime scheduling by using '-r'
> when you start emane.
>
> -steve
>
>
> On 11/19/2015 12:11 PM, Dan O'Keeffe wrote:
> > Hey Steven,
> >
> >>
> >> There are many factors that may be contributing to the latency
> >> spikes you are seeing. Most don't have anything to do with emane
> >> and are likely related to your host system.
> >
> > Have you any other advice as to typical factors/OS configuration
> > options you've seen that cause high latencies for emane, and/or how
> > I might diagnose them?
> >
> >> We have spent a fair amount of time tuning our emulation servers
> >> (LXC host systems) to achieve that goal. We don't run X on our
> >> servers, we disable all unneeded services and we usually assign
> >> one or more dedicated cores to each container running emane. The
> >> number of cores per container can increase based on whatever else
> >> is executing in the container. Some of our large scale network
> >> emulations trade fidelity for node count and assign more than one
> >> container per dedicated CPU.
> >>
> >> This type of tuning is not needed by all. It depends on how
> >> satisfied you are with the resulting emulation fidelity and
> >> whether or not the radio models in use have tight timing
> >> constraints.
> >>
> >> As a first step, if you are generating high traffic loads, you may
> >> benefit from trying the emane develop branch. See Pull 34 [1],
> >> which added functionality to short-circuit timer logic for
> >> enforcing transmit data rates.
> >> After that I would try assigning a dedicated core to each
> >> container.
> >>
> >> [1] https://github.com/adjacentlink/emane/pull/34
> >>
> >
> > OK, I tried your patch but it didn't have much effect on either the
> > latency spikes or the high avgTimedEventLatencyRatio statistics.
> >
> > If I allow the n emane daemons to share n cores, with every other
> > container service running on the remaining cores, I get a
> > negligible reduction in avgTimedEventLatencyRatio (from 1-2 to
> > about 0.75-1.75) and no effect on the latency spikes.
> >
> > If I run each emane daemon on its own dedicated core, with every
> > other container service shared across the other cores, I see a
> > massive increase in the avgTimedEventLatencyRatio (from 1-2 to
> > about 6-7), but again no impact on the latency spikes.
> >
> > Do you think there is any point in my assigning containers as a
> > whole to specific cores, or is it sufficient to assign the emane
> > daemons separately? The CPU utilization for the cores running my
> > emane daemons seems low enough to me (see attached pdf), although
> > the remaining cores are heavily loaded.
> >
> > Is it possible for the avgTimedEventLatencyRatios to be high if the
> > host machine/emulator cores aren't overloaded? For example, are
> > there any memory usage/IO stats you typically look at?
> >
> > Any help much appreciated.
> > Thanks,
> > Dan
> >
> >
> > _______________________________________________
> > emane-users mailing list
> > [email protected]
> > http://pf.itd.nrl.navy.mil/mailman/listinfo/emane-users
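P.S. In case it helps clarify my setup: for the dedicated-core runs, I'm launching each daemon roughly as follows (the core number and file name here are illustrative, not my exact configuration):

```shell
# Pin this node's emane daemon to core 3 and enable realtime
# scheduling with -r, as suggested; -d daemonizes the process.
# platform.xml stands in for the node's platform configuration file.
taskset -c 3 emane platform.xml -r -d
```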
