On Mon, Nov 18, 2013 at 04:32:34PM -0700, Alex Williamson wrote: > On Mon, 2013-11-18 at 17:55 -0500, Vlad Yasevich wrote: > > On 11/18/2013 05:40 PM, Alex Williamson wrote: > > > On Mon, 2013-11-18 at 17:07 -0500, Vlad Yasevich wrote: > > >> On 11/18/2013 04:33 PM, Alex Williamson wrote: > > >>> On Mon, 2013-11-18 at 15:57 -0500, Vlad Yasevich wrote: > > >>>> On 11/18/2013 03:33 PM, Alex Williamson wrote: > > >>>>> On Mon, 2013-11-18 at 15:09 -0500, Vlad Yasevich wrote: > > >>>>>> On 11/18/2013 02:58 PM, Alex Williamson wrote: > > >>>>>>> On Mon, 2013-11-18 at 21:47 +0200, Michael S. Tsirkin wrote: > > >>>>>>>> This reverts commit cd5be5829c1ce87aa6b3a7806524fac07ac9a757. > > >>>>>>>> Digging into hardware specs shows this does not > > >>>>>>>> actually make QEMU behave more like hardware. > > >>>>>>>> Let's stick to the tried heuristic for 1.7 and > > >>>>>>>> possibly revisit for 1.8. > > >>>>>>> > > >>>>>>> If this is broken, then so are these: > > >>>>>>> > > >>>>>>> 23c37c37f0280761072c23bf67d3a4f3c0ff25aa > > >>>>>>> 7c36507c2b8776266f50c5e2739bd18279953b93 > > >>>>>> > > >>>>>> These aren't really broken. They just assume that the high order > > >>>>>> writes will happen after the low order writes. > > >>>>>> > > >>>>>> In the case of e1000, this is a little more then an assumption > > >>>>>> (particularly in the case of nic initilization). > > >>>>> > > >>>>> But AIUI there's also a valid bit in that high order byte on e1000, so > > >>>>> reverting cd5be582 means we stuff a new mac into qemu less often, but > > >>>>> it's still only accurate some of the time. > > >>>> > > >>>> Yes, there is a slight issue with validity of mac at the time of > > >>>> processing packets. I have an outstanding question on the Intel > > >>>> list about this behavior with real HW. But, with e1000, the validity > > >>>> bit provides a much higher guarantee that a guest that will be > > >>>> setting the mac address will write the high register second to > > >>>> guarantee that when the valid bit is written, the mac is fully > > >>>> valid. As a result we don't really need the e1000 part of the > > >>>> cd5be5829. > > >>> > > >>> But doesn't that valid bit mean that a mac update will start and end > > >>> with a write to the high order register? So we're assuming: > > >>> > > >>> a) write RA + 1 (invalidate) > > >>> b) write RA (write low) > > >>> c) write RA + 1 (write high + valid) > > >>> > > >> > > >> No. On update, only steps b and c typically happen. Thus my question > > >> to the on the intel list. > > > > > > So perhaps the bit is some kind of data latch bit and the mac address > > > fields within those registers are effectively scratch until that bit is > > > written? > > > > > >>> Without cd5be5829 the only change is that we don't store a new mac into > > >>> the monitor at b). The mac stored in the monitor is still wrong from a) > > >>> until c). So it's ever so slightly less broken without cd5be5829. > > >> > > >> Since there is really no a), the mac in the monitor is only different > > >> after step b). since it's is incomplete and we expect step c), there > > >> is really no point in updating it. > > > > > > Great, so I have no argument against reverting, or just fixing, that > > > chunk of cd5be5829. Let's implement the latch bit too. > > > > > >>>>> > > >>>>>> In the case of RTL nic, it is just an assumption, but it hasn't > > >>>>>> been shown faulty yet. We do plan to address this a bit more > > >>>>>> thoroughly. > > >>>>> > > >>>>> So how is RTL less broken without cd5be582? AIUI the valid bit is off > > >>>>> in a separate register on RTL, so we have no guarantee about order of > > >>>>> updating the mac. Without cd5be582 the info in the monitor may be > > >>>>> permanently broken if the guest uses a write order other than what we > > >>>>> assume. > > >>>>> > > >>>> > > >>>> This one is actually not as bad either. RTL spec requires that > > >>>> receive register writes happen as 32 bit word writes. This is > > >>>> what linux and bsd drivers do, so from driver perspective, the > > >>>> issue is the same. What our emulation layer does is turn these > > >>>> 32 bit writes into 4 8-bit writes. This is likely due to some > > >>>> very broken and very old drivers, but I am not sure. > > >>>> > > >>>> So, the information in the monitor will be broken if the guest > > >>>> does: write_hi(); write_lo(); A part of me would really like > > >>>> to see a guest that does this :) > > >>> > > >>> So the argument for reverting here is that it seems unlikely that the > > >>> dwords get written in the hi->lo order and we'd rather the monitor get > > >>> stuck with the wrong mac forever > > >> > > >> For how many/which guests? I know it's not linux or BSD. I need to > > >> boot windows to see what it does, but I think it does the right thing. > > > > > > How many guests do you plan to test? > > > > I think the proposal was to see if anyone reports an issue :) > > > > > > > >>> than it show the wrong mac address for > > >>> a tiny window for everybody else? > > >> > > >> Yes, this would happen for everybody. If you are querying the output, > > >> you might see this and it will show up as 2 changes. > > >> > > >> We are talking about 2 "tiny" amounts here: > > >> On the one hand, we __might__ have guests that write mac and reverse > > >> order thus showing wrong address. > > >> On the other hand we have all the guests who will show the wrong address > > >> for a _short_ time. > > >> > > >> I have a hard time deciding, but have a slight preference for a small > > >> uncertainty (the # of backward writers is very small), rather then a > > >> small certainty (everyone will be effected for a small period of time). > > > > > > Why compromise? Why not implement the register that handles whether the > > > mac is valid on RTL? > > > > > > > I am working on this but it's too late for 1.7. > > > > For 1.7, we have a question: impact a very small number (# -> 0) of > > guests for a long time, or impact a everyone for a very short time. > > > > Original code did the former and fixed the original issue reported. > > The commit in question that we are talking about reverting did the later > > simply because we didn't know any better. > > This thread is getting too long. What I've learned from it is that > there are valid arguments backed by the spec to indicate why the version > of e1000 prior to cd5be582 was more correct and why we think it does not > update the monitor with incorrect mac information. rtl8139 is less > convincing, but all the drivers we know about behave in one way that > allows us to make an assumption about write order and avoid spurious, > incorrect mac address updates to the monitor. Had that information been > provided in the commit log we could have all spent the day doing > relevant work. Please send a v2. Thanks, > > Alex
I tweaked commit log before sending pull request. > > >>> I think you say something about > > >>> sub-optimal here... > > >>> > > >>>> The current code isn't perfect either. It still has a potential > > >>>> to show the wrong mac address in the monitor. I doesn't make > > >>>> a lot of sense to me to replace one sub-optimal solution > > >>>> with another sub-optimal solution, especially since no-one > > >>>> complained. > > >>> > > >>> Exactly, the code isn't perfect either way and this revert is just > > >>> replacing one sub-optimal solution for another. So why do it? > > >> > > >> Another part of this, for me, is that a change was made that we had a > > >> disagreement on and a different approach was being discussed. The > > >> revert is to bring us back to before this change. > > > > > > I was surprised to see it had been committed too, but that's a different > > > argument for revert than what's being presented here. Thanks, > > > > It is another argument for the revert non the less. > > > > Thanks > > -vlad > > > > > > > > Alex > > > > > >>>>>> The patch that was applied was controversial and more then 1 person > > >>>>>> expressed reservations. > > >>>>> > > >>>>> Understood, but it also raised and addressed a shortcoming in the > > >>>>> previous patches. If cd5be582 was controversial because the monitor > > >>>>> was > > >>>>> getting updated with incorrect mac addresses then this simple revert > > >>>>> doesn't solve that problem. Thanks, > > >>>>> > > >>>>> Alex > > >>>>> > > >>>>>>> > > >>>>>>> None of these change the behavior of hardware, they only change > > >>>>>>> when the > > >>>>>>> monitor gets told about mac address changes. I'd suggest either > > >>>>>>> add the > > >>>>>>> emulation described in each spec or revert all of them. A partial > > >>>>>>> revert is just noise. Thanks, > > >>>>>>> > > >>>>>>> Alex > > >>>>>>> > > >>>>>>>> > > >>>>>>>> Reported-by: Vlad Yasevich <vyase...@redhat.com> > > >>>>>>>> Cc: Amos Kong <ak...@redhat.com> > > >>>>>>>> Cc: Alex Williamson <alex.william...@redhat.com> > > >>>>>>>> --- > > >>>>>>>> hw/net/e1000.c | 2 +- > > >>>>>>>> hw/net/rtl8139.c | 5 ++++- > > >>>>>>>> 2 files changed, 5 insertions(+), 2 deletions(-) > > >>>>>>>> > > >>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c > > >>>>>>>> index ae63591..8387443 100644 > > >>>>>>>> --- a/hw/net/e1000.c > > >>>>>>>> +++ b/hw/net/e1000.c > > >>>>>>>> @@ -1106,7 +1106,7 @@ mac_writereg(E1000State *s, int index, > > >>>>>>>> uint32_t val) > > >>>>>>>> > > >>>>>>>> s->mac_reg[index] = val; > > >>>>>>>> > > >>>>>>>> - if (index == RA || index == RA + 1) { > > >>>>>>>> + if (index == RA + 1) { > > >>>>>>>> macaddr[0] = cpu_to_le32(s->mac_reg[RA]); > > >>>>>>>> macaddr[1] = cpu_to_le32(s->mac_reg[RA + 1]); > > >>>>>>>> qemu_format_nic_info_str(qemu_get_queue(s->nic), (uint8_t > > >>>>>>>> *)macaddr); > > >>>>>>>> diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c > > >>>>>>>> index 7f2b4db..5329f44 100644 > > >>>>>>>> --- a/hw/net/rtl8139.c > > >>>>>>>> +++ b/hw/net/rtl8139.c > > >>>>>>>> @@ -2741,7 +2741,10 @@ static void rtl8139_io_writeb(void *opaque, > > >>>>>>>> uint8_t addr, uint32_t val) > > >>>>>>>> > > >>>>>>>> switch (addr) > > >>>>>>>> { > > >>>>>>>> - case MAC0 ... MAC0+5: > > >>>>>>>> + case MAC0 ... MAC0+4: > > >>>>>>>> + s->phys[addr - MAC0] = val; > > >>>>>>>> + break; > > >>>>>>>> + case MAC0+5: > > >>>>>>>> s->phys[addr - MAC0] = val; > > >>>>>>>> qemu_format_nic_info_str(qemu_get_queue(s->nic), > > >>>>>>>> s->phys); > > >>>>>>>> break; > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>> > > >>>>> > > >>>>> > > >>>>> > > >>>> > > >>> > > >>> > > >>> > > >> > > > > > > > > > > > > >