Hi Darrell,

Yes, I'm still interested in this series. I've had a lot of other work in the last few weeks. I hope that I'll have enough time to reply to the questions and concerns in the near future.
Best regards,
Ilya Maximets.

On 30.08.2017 01:45, Darrell Ball wrote:
> Hi Jan/Ilya/Bhanu,
> 
> Just wondering if we are still pursuing this patch series?
> 
> Thanks,
> Darrell
> 
> On 8/14/17, 8:33 AM, "[email protected] on behalf of Jan
> Scheurich" <[email protected] on behalf of
> [email protected]> wrote:
> 
> > > > We have tested the effect of turbo mode on TSC and there is none.
> > > > The TSC frequency remains at the nominal clock speed, no matter if
> > > > the core is clocked down or up. So, I believe for PMD threads
> > > > (where performance matters) TSC would be an adequate and efficient
> > > > clock.
> > > 
> > > It's highly platform dependent, and testing on a few systems doesn't
> > > guarantee anything.
> > > On the other hand, POSIX guarantees the monotonic characteristics of
> > > CLOCK_MONOTONIC.
> > 
> > TSC is also monotonic on a given core. Does CLOCK_MONOTONIC guarantee
> > any better accuracy than TSC for PMD threads?
> > 
> > > > On PMDs I am a bit concerned about the overhead/latency introduced
> > > > with the clock_gettime() system call, but I haven't done any
> > > > measurements to check the actual impact. Have you?
> > > 
> > > Have you seen my incremental patches?
> > > There is no overhead, because we're just replacing 'time_msec' with
> > > 'time_usec'.
> > > No difference except converting timespec to usec instead of msec.
> > 
> > I did look at your incremental patches and we will test their
> > performance. I was concerned about the system call cost on master
> > already before. Perhaps I'm paranoid, but I would like to double-check
> > by testing.
> > 
> > > > If we go for CLOCK_MONOTONIC in microsecond resolution, we should
> > > > make sure that the clock is read not more than once every iteration
> > > > (and cache the us value as 'now' in the pmd ctx struct as suggested
> > > > in your other patch). But then for consistency the XPS feature
> > > > should also use the PMD time in us resolution.
> > > 
> > > Again, please, look at my incremental patches.
> > As far as I could see you did, for example, not consistently adapt
> > tx_port->last_used to microsecond resolution.
> > 
> > > > For the non-PMD thread we could actually skip time-based output
> > > > batching completely. The packet rates and the frequency of calls to
> > > > dpif_netdev_run() in the main ovs-vswitchd thread are so low that
> > > > time-based flushing doesn't seem to make much sense.
> > > > Have you considered this option?
> > > > 
> > > > Below you can find an alternative incremental patch on top of your
> > > > RFC 4/4 that uses TSC on PMD. We will be comparing the two
> > > > alternatives for performance both with non-PMD guests (iperf3) as
> > > > well as PMD guests (DPDK testpmd).
> > > 
> > > In your version you need to move all the output_batching related code
> > > under #ifdef DPDK_NETDEV because it will break userspace networking
> > > if compiled without dpdk and output-max-latency != 0.
> > 
> > Not sure. Batching should implicitly be disabled because
> > cycles_counter() and cycles_per_microsecond() would both return zero.
> > But I agree that would be a fairly cryptic design. If we used TSC in
> > PMDs we should explicitly not do time-based tx batching on the non-PMD
> > thread.
> > 
> > Anyway, if the cost of the clock_gettime() system call proves
> > insignificant and our performance tests comparing our TSC-based with
> > your CLOCK_MONOTONIC-based implementation show equivalent results, we
> > can go for your approach.
> > 
> > BR, Jan
> > 
> > _______________________________________________
> > dev mailing list
> > [email protected]
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
