On 29/09/16(Thu) 16:20, Joseph Zatarski wrote:
> Yesterday, I was attempting to install 5.8 on a VAX architecture (a DEC
> VXT2000 with 18MB RAM and SPX graphics to be specific) by net booting using
> the VAX boot.mop file. I was having issues during the process, and I
> eventually used a network sniffer attempting to troubleshoot issues. I
> noticed that loading over NFS seemed to stop shortly after my VXT received
> an ARP request from my laptop, and also that the ARP reply seemed to be of
> the wrong length. This issue was also a highly intermittent issue, as well
> as variable in the ways it seemed to fail. Sometimes a full reboot would be
> required before an NFS download worked again, other times I could retry
> without rebooting, and rarely I would be able to boot without issues.
> 
> I know VAX is no longer supported, but I discovered issues in
> arch-independent sections of the source, so please keep reading.
> 
> I did find, I think, the source of the issue with ARP reply packet length.
> In libsa, the function arp_reply is what handles arp replies. Some comments
> in this section read:
> 
> /*
>  * Convert an ARP request into a reply and send it.
>  * Notes:  Re-uses buffer.  Pad to length = 46.
>  */
> 
> The length I see in the network sniffer is actually 64 bytes, which is 18
> bytes longer than the 46 mentioned here. A normal ARP reply or request
> (including ethernet frame) is actually 46 bytes, so this appears to be
> correct in that the length should ultimately be 46. This function accepts a
> pointer to an ethernet packet of type void, and casts it to a structure
> devised to hold an ARP request or reply, called ether_arp:
> 
>     struct ether_arp *arp = pkt;
> 
> This structure should be the size of the ethernet frame's 'payload' for an
> ARP packet, which comes out to be 28 bytes. This is 18 less than 46. The
> additional 18 is for the layer 2 ethernet overhead.
> 
> Near the end of the function, we find a call to sendether which 'pads' with
> 18 bytes:
> 
>     (void) sendether(d, pkt, sizeof(*arp) + 18,
>         arp->arp_tha, ETHERTYPE_ARP);
> 
> However, this is where we go wrong. In ether.c, we find the function
> sendether. Sendether has the following notes:
> 
> /* Caller must leave room for ethernet header in front!! */
> 
> And later:
> 
>     eh = (struct ether_header *)pkt - 1;
>     len += sizeof(*eh);
> 
> which is actually what does the padding here, meaning the '+ 18' in the
> call to sendether is redundant and wrong.
> 
> Looking at this reveals another potential issue. arp_reply passes *pkt to
> sendether, and sendether will overwrite some number of bytes before the
> buffer, which could be out the bounds of that buffer. Of course, it is also
> possible that the buffer that *pkt points inside actually has enough space
> within it before *pkt that this is a nonissue, but I need to investigate
> this further (starting with finding the caller of arp_reply) to determine
> whether it's an issue or not. Although weak evidence, data corruption
> caused by ARP reply would seem to explain the issues I had with netbooting
> my VXT2000.
> 
> To conclude, I believe that line 291 in /sys/lib/libsa/arp.c should be
> changed from:
> 
>     (void) sendether(d, pkt, sizeof(*arp) + 18,
> 
> to
> 
>     (void) sendether(d, pkt, sizeof(*arp),
> 
> in order to solve the packet length issue. Aside from this, I will continue
> looking into potential issues with exceeding buffer bounds when sendether
> is called from arp_reply.

Thanks for the analyse, did you try what you suggested?  On which archs?

Do you have a diff to share?

Reply via email to