On 02/02/16 11:40, Subramanian, Sriram (EG Servers Platform SW) wrote:
> Hi Laszlo,
> 
> Thanks for your feedback.
> 
> With [Patch 3/3], that fixes an issue where we could return
> EFI_DEVICE_ERROR (on UNDI returning PXE_STATCODE_BUFFER_FULL), [Patch
> 1/3] is no longer a pressing need.
> 
> Otherwise, with the bug described in [Patch 3/3], even if the UNDI
> returned PXE_STATCODE_BUFFER_FULL, we do *not* execute the second
> call tree you mentioned, because the status returned from SNP is
> *not* EFI_NOT_READY, but is EFI_DEVICE_ERROR.
> 
>> However, there is *another* call path leading to Snp->GetStatus(). It is 
>> right after the above Snp->Transmit():
>>
>>   MnpTransmit()                           [MnpMain.c]
>>     MnpBuildTxPacket()                    [MnpIo.c]
>>       MnpAllocTxBuf()                     [MnpConfig.c]
>>         IF no room in FreeTxBufList THEN:
>>           MnpRecycleTxBuf()               [MnpConfig.c]
>>             LOOP:
>>               Snp->GetStatus()
>>               MnpFreeTxBuf()              [MnpConfig.c]
>>     MnpSyncSendPacket()                   [MnpIo.c]
>>       Snp->Transmit()
>>       IF (Status == EFI_NOT_READY) THEN:
>>         MnpRecycleTxBuf()                 [MnpConfig.c]
>>           LOOP:
>>             Snp->GetStatus() <--------------------------------- here
>>             MnpFreeTxBuf()                [MnpConfig.c]
>>         Snp->Transmit()
> 
> That said, my other concern at that time for pushing to a smaller
> value (so GetStatus could be called sooner) was that we could expose
> potentially untested code in various UNDIs (if we take longer to call
> GetStatus), and so the we may end up seeing hangs, corruption, or
> other weird behavior (UNDIs may not return the correct statuses for
> the SNP and MNP to correctly retry/call GetStatus to clear the
> buffers). We actually have NICs that have started failing after SVN
> r19623 and r19624, and continue to fail even after applying the
> patches in this series, something we're working with the vendors to
> get them addressed.

In this case, I agree that patch #1 should be included as well, but the
commit message should be *very* clear that the change is being done only
as a workaround for incorrect UNDI drivers. The commit message should
also include your above summary.

Thanks!
Laszlo

> 
> Thanks,
> Sriram.
> 
> -----Original Message-----
> From: Fu, Siyuan [mailto:siyuan...@intel.com] 
> Sent: Tuesday, February 02, 2016 1:56 PM
> To: Laszlo Ersek; Subramanian, Sriram (EG Servers Platform SW)
> Cc: edk2-de...@ml01.01.org; Ye, Ting; Wu, Jiaxin; El-Haj-Mahmoud, Samer; Li, 
> Ruth; Hsiung, Harry L
> Subject: RE: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP 
> TX buffer pool.
> 
> Hi, Laszlo
> 
> Thanks for your detailed analysis, I will update the commit log for the patch 
> 2/3, and let's wait Sriram to see if he has any concern about 1/3.
> 
> Best Regards
> Siyuan
> 
> -----Original Message-----
> From: Laszlo Ersek [mailto:ler...@redhat.com] 
> Sent: Tuesday, February 2, 2016 4:12 PM
> To: Fu, Siyuan <siyuan...@intel.com>; Subramanian Sriram <srira...@hpe.com>
> Cc: edk2-de...@ml01.01.org; Ye, Ting <ting...@intel.com>; Wu, Jiaxin 
> <jiaxin...@intel.com>; El-Haj-Mahmoud, Samer <samer.el-haj-mahm...@hpe.com>; 
> Li, Ruth <ruth...@intel.com>; Hsiung, Harry L <harry.l.hsi...@intel.com>
> Subject: Re: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP 
> TX buffer pool.
> 
> Hello Siyuan,
> 
> On 02/02/16 02:03, Fu, Siyuan wrote:
>> Hi, Laszlo
>>
>> Yes it's to make it same as the MAX_XMIT_BUFFERS in PxeGetStatus(), 
>> and I agree that the MNP and SNP driver are independent with each 
>> other. I make this change just because some people in edk2 community 
>> have concern about the different length in MNP buffer pool and PXE 
>> recycle array, you could see attached email thread for details. IMHO, 
>> either 32 or 64 or any other different value doesn't make much 
>> different, MNP_TX_BUFFER_INCREASEMENT is just the initial length of 
>> the buffer pool, MNP will increase the pool size if run out of it, and 
>> recycle buffer from SNP if the transmitted buffer queue if full.
> 
> Thanks for the info. I think this discussion should have happened on-list to 
> begin with. If that's okay, I'd like to continue it on-list; I've now CC'd 
> the participants of the attached thread.
> 
> So as far as I can see, there are three concerns (I'll list them first, then 
> react to them separately):
> 
> (1) A potential issue where MNP will reject all caller requests to transmit, 
> because the UNDI underlying the SNP underlying the MNP has all outgoing 
> buffers full, and MNP will not try to recycle UNDI buffers (via recycling SNP 
> buffers).
> 
> In the words of Sriram:
> 
>> This is exactly my concern. We only issue the Snp->GetStatus() from 
>> MnpDxe after we exhaust the FreeTxBufList, so it may be a while before 
>> we call GetStatus again.
> 
> (2) In the implementation of Snp->Transmit() -- PxeTransmit() in 
> "MdeModulePkg/Universal/Network/SnpDxe/Transmit.c" -- we handle BUFFER_FULL 
> as a permanent error, under the "default:" label. According to UEFI v2.6, 
> E.4.18.7, we should cause our SNP implementation to "Call Get Status command 
> to empty buffer".
> 
> (3) In the implementation of Snp->GetStatus() -- PxeGetStatus() in 
> "MdeModulePkg/Universal/Network/SnpDxe/Get_status.c" --, we only provide UNDI 
> with 1 slot to report transmitted buffers in, even though we have room for 32 
> (MAX_XMIT_BUFFERS) in the PXE_DB_GET_STATUS.TxBuffer array.
> 
> First, this pessimizes the code -- it makes the loop at the end of
> PxeGetStatus() useless --, second, some non-conformant UNDI implementation 
> that Siyuan tested does not care about Snp->Cdb.DBsize, and *always* fills in 
> 32 slots.
> 
> --*--
> 
> Assuming I managed to state the problems correctly, here's what I think, in 
> reverse order:
> 
> ** I agree that (3) should be modified, if for nothing else then to avoid the 
> buffer overflow. However, the commit message of:
> 
>   MdeModulePkg: Update DBsize in SNP GetStatus command
> 
> should be changed to state both reasons explicitly:
> - make the loop at the end of PxeGetStatus() useful
> - prevent buffer overflow with the non-conformant UNDI driver Siyuan
>   tested
> 
> 
> ** I agree that (2) should be fixed. I think the currently proposed commit 
> message for patch
> 
>   MdeModulePkg: Correct one return status code in SNP Transmit function
> 
> is correct.
> 
> 
> ** I think that (3) is a non-issue. The situation described by Sriram in
> (1) -- which would amount to a "livelock", i.e., repeated
> Mnp->Transmit() calls without making progress -- is *not* possible.
> 
> I think Siyuan explained it already in the attached thread why it is not 
> possible, but let me try to explain it as well.
> 
> Namely, Sriram correctly identified the following call chain (all referenced 
> files are under "MdeModulePkg/Universal/Network/MnpDxe/"):
> 
>   MnpTransmit()                           [MnpMain.c]
>     MnpBuildTxPacket()                    [MnpIo.c]
>       MnpAllocTxBuf()                     [MnpConfig.c]
>         IF no room in FreeTxBufList THEN:
>           MnpRecycleTxBuf()               [MnpConfig.c]
>             LOOP:
>               Snp->GetStatus()
>               MnpFreeTxBuf()              [MnpConfig.c]
>     MnpSyncSendPacket()                   [MnpIo.c]
>       Snp->Transmit()
> 
> Sriram is correct that, if there *is* room in FreeTxBufList, then we will 
> *not* call MnpRecycleTxBuf() -- hence, we will not call
> Snp->GetStatus(). And then Snp->Transmit() could fail, if the underlying
> UNDI has no room.
> 
> However, there is *another* call path leading to Snp->GetStatus(). It is 
> right after the above Snp->Transmit():
> 
>   MnpTransmit()                           [MnpMain.c]
>     MnpBuildTxPacket()                    [MnpIo.c]
>       MnpAllocTxBuf()                     [MnpConfig.c]
>         IF no room in FreeTxBufList THEN:
>           MnpRecycleTxBuf()               [MnpConfig.c]
>             LOOP:
>               Snp->GetStatus()
>               MnpFreeTxBuf()              [MnpConfig.c]
>     MnpSyncSendPacket()                   [MnpIo.c]
>       Snp->Transmit()
>       IF (Status == EFI_NOT_READY) THEN:
>         MnpRecycleTxBuf()                 [MnpConfig.c]
>           LOOP:
>             Snp->GetStatus() <--------------------------------- here
>             MnpFreeTxBuf()                [MnpConfig.c]
>         Snp->Transmit()
> 
> If the underlying UNDI fails due to lack of buffer space, then the first
> Snp->Transmit() call will return EFI_NOT_READY. Then MnpSyncSendPacket()
> will immediately call MnpRecycleTxBuf() -- *regardless* of whether on entry 
> to MnpTransmit(), "FreeTxBufList" was empty or not --, and UNDI gets a chance 
> to return transmitted buffers.
> 
> And then Snp->Transmit() is retried.
> 
> Again, I think Siyuan explained it well in his email already:
> 
>> We do have method to know whether the UNDI queue is full. The UNDI 
>> Transmit command should return an error CDB.StatCode (BUFFER_FULL 
>> according to UEFI spec in E.4.18.7, while some UNDI return QUEUE_FULL 
>> instead of BUFFER_FULL per my test result), then the SNP->Transmit() 
>> should return EFI_NOT_READY to indicate the caller that the network 
>> interface is too busy to accept this transmit request (see
>> Sn->Transmit API description and return status table in UEFI spec).
>> That's why the MNP driver also call Snp->GetStatus() to recycle the 
>> buffer if a Snp->Transmit() return EFI_NOT_READY.
> 
> But I hope that the above call trees make it easier to understand.
> Therefore I propose to drop patch
> 
>   MdeModulePkg: Update the default size of MNP TX buffer pool
> 
> from the series.
> 
> Sriram, do you agree with this analysis?
> 
> 
> ** In summary:
> 
> - "[Patch 1/3] MdeModulePkg: Update the default size of MNP TX buffer pool."
> --> I propose to drop this patch. Sriram, do you agree?
> 
> - [edk2] [Patch 2/3] MdeModulePkg: Update DBsize in SNP GetStatus command.
> --> please explain in the commit message *both* reasons why this patch
> is needed (i.e., to make the loop in PxeGetStatus() useful, and to prevent 
> buffer overflow from non-conformant UNDI drivers).
> 
> With that spelled out, for patch 2:
> 
> Reviewed-by: Laszlo Ersek <ler...@redhat.com>
> 
> - [edk2] [Patch 3/3] MdeModulePkg: Correct one return status code in SNP 
> Transmit function.
> --> patch is good as-is.
> 
> Reviewed-by: Laszlo Ersek <ler...@redhat.com>
> 
> Thanks!
> Laszlo
> 
>>
>> Best Regards
>> Siyuan
>>
>> -----Original Message-----
>> From: Laszlo Ersek [mailto:ler...@redhat.com]
>> Sent: Tuesday, February 2, 2016 12:13 AM
>> To: Fu, Siyuan <siyuan...@intel.com>; edk2-de...@ml01.01.org
>> Cc: Ye, Ting <ting...@intel.com>; Wu, Jiaxin <jiaxin...@intel.com>
>> Subject: Re: [edk2] [Patch 1/3] MdeModulePkg: Update the default size of MNP 
>> TX buffer pool.
>>
>> On 02/01/16 03:51, Fu Siyuan wrote:
>>> This patch update the default MNP TX buffer increasement to 32, so 
>>> the default TX pool length is same as the maximum recycled buffer 
>>> numbers in one UNDI GetStatus command.
>>
>> Is that MAX_XMIT_BUFFERS in PxeGetStatus()?
>>
>>> Contributed-under: TianoCore Contribution Agreement 1.0
>>> Signed-off-by: Fu Siyuan <siyuan...@intel.com>
>>> CC: Ye Ting <ting...@intel.com>
>>> CC: Wu Jiaxin <jiaxin...@intel.com>
>>> ---
>>>  MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h
>>> b/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h
>>> index c66be64..51391af 100644
>>> --- a/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h
>>> +++ b/MdeModulePkg/Universal/Network/MnpDxe/MnpImpl.h
>>> @@ -27,7 +27,7 @@ WITHOUT WARRANTIES OR REPRESENTATIONS OF ANY KIND, EITHER 
>>> EXPRESS OR IMPLIED.
>>>  #define MNP_INIT_NET_BUFFER_NUM       512
>>>  #define MNP_NET_BUFFER_INCREASEMENT   64
>>>  #define MNP_MAX_NET_BUFFER_NUM        65536
>>> -#define MNP_TX_BUFFER_INCREASEMENT    64
>>> +#define MNP_TX_BUFFER_INCREASEMENT    32    // Same as the recycling Q 
>>> length for xmit_done in UNDI command.
>>>  #define MNP_MAX_TX_BUFFER_NUM         65536
>>>  
>>>  #define MNP_MAX_RCVD_PACKET_QUE_SIZE  256
>>>
>>
>> Out of curiosity: is this a bugfix or an optimization?
>>
>> If a bugfix, then can you please explain why those two things (i.e., 
>> MNP_TX_BUFFER_INCREASEMENT in MnpAllocTxBuf(), and MAX_XMIT_BUFFERS in
>> PxeGetStatus) should match?
>>
>> As a perf optimization I guess this makes sense, but otherwise, SNP 
>> internals and MNP should be independent, shouldn't they?
>>
>> Thanks
>> Laszlo
>>
> 

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel

Reply via email to