Re: [Pvfs2-developers] here_string endecode bug?

Michael Moore Fri, 04 Sep 2009 07:20:05 -0700

Pete Wyckoff wrote:
> [email protected] wrote on Fri, 04 Sep 2009 09:29 -0400:
>> Pete Wyckoff wrote:
>>> [email protected] wrote on Thu, 03 Sep 2009 16:31 -0400:
>>>> But Pete, the problem is encode_string does not write 8 bytes of 0, it  
>>>> writes 4 bytes of 0.  If the buffer is 0's to start with what is here  
>>>> works fine, but if there is something else in the field you get garbage.
>>>>
>>>> Seems like a prudent move to simply zero out all 8 bytes instead of just  
>>>> 4, wouldn't you say?
>>> Thanks for explaining.  I'm looking at the wrong end of the problem,
>>> maybe.
>>>
>>> On the encode side, we have
>>>
>>>     #define encode_string(pptr,pbuf) do { \
>>>     u_int32_t len = 0; \
>>>     if (*pbuf) \
>>>         len = strlen(*pbuf); \
>>>     *(u_int32_t *) *(pptr) = htobmi32(len); \
>>>     if (len) { \
>>>         memcpy(*(pptr)+4, *pbuf, len+1); \
>>>         *(pptr) += roundup8(4 + len + 1); \
>>>     } else { \
>>>         *(u_int32_t *) (*(pptr)+4) = 0; \
>>>         *(pptr) += 8; \
>>>     } \
>>>     } while (0)
>> There's the mixup, we're looking at different versions of encode_string!
>> From a fresh 2.8.1 and the branch I'm working from I have the following
>> encode_string:
>>
>> #define encode_string(pptr,pbuf) do { \
>>     u_int32_t len = 0; \
>>     if (*pbuf) \
>>         len = strlen(*pbuf); \
>>     *(u_int32_t *) *(pptr) = htobmi32(len); \
>>     if (len) { \
>>         memcpy(*(pptr)+4, *pbuf, len+1); \
>>         int pad = roundup8(4 + len + 1) - (4 + len + 1); \
>>         *(pptr) += roundup8(4 + len + 1); \
>>         memset(*(pptr)-pad, 0, pad); \
>>     } else { \
>>         *(u_int32_t *) *(pptr) = 0; \
>>         *(pptr) += 8; \
>>     } \
>>
>> In the else branch there is no +4 in the *pptr assignment. Which branch
>> are you working from? It looks like this was already addressed somewhere
>> before?
> 
> I go to pvfs.org/fisheye.  Looking at cvs vers 1.27 now.  I was
> quoting the non-valgrind one.  Which seems to be different from the
> valgrind one you quote above in this important way.  My vote goes
> toward fixing the valgrind version to fix the off-by-4 bug you just
> found.  It's old, apparently, but would only hurt if you configured
> --with-valgrind.
> 
> Good catch.  Looking at that mess of #defines is always a challenge.
> 
>               -- Pete
>


The above snippet is the non-Valgrind version but both versions have the
issue in pre-1.27. It looks like the non-valgrind version just got fixed
by dbonnie in the 1.26->1.27 patch on Wednesday.

Thanks for your help!

Michael

>>> For a NULL pbuf, or a non-NULL pbuf that starts with'\0',
>>> we get:
>>>
>>>     len = 0
>>>     pptr[0..3] = 0
>>>     (take the else clause)
>>>     pptr[4..7] = 0
>>>     (pptr offset up by 8)
>>>
>>> That generates 8 bytes okay.
>>>
>>> For any other pbuf with len > 0, say 1 for "a\0", we get:
>>>
>>>     len = 1
>>>     pptr[0..3] = htonl(1)
>>>     (take the if clause)
>>>     pptr[4..7] = 'a' '\0' 'x' 'x'   /* 'x' == garbage */
>>>     (pptr offset up by 8)
>>>
>>> Is it those two bytes of 'x' that you don't like?  That's why
>>> we have the valgrind ifdef.  What am I missing?
>>>
>>> Michael's initial patch was:
>>>
>>>     131c131,134
>>>     <     memcpy(pbuf, *(pptr) + 4, len + 1); \
>>>     ---
>>>     > >     if( len ) \
>>>     > >         memcpy(pbuf, *(pptr) + 4, len + 1); \
>>>     > >     else \
>>>     > >         memcpy(pbuf, *(pptr), len + 1); \
>>>
>>> Unfortunately without the useful "-u -p" args to diff, but eyeing
>>> the source shows he's patching on the decode side.
>>>
>>>             -- Pete
>>>
>> Apologies for the lack of context on the diff but you're right it was on
>> the decode side. However, when you mentioned the encode side always
>> shipped 8 bytes I looked closer at the encode side and saw in this
>> version only 4 of the 8 bytes were getting set to '\0'.
>>
>> Michael
>>
>>>> Pete Wyckoff wrote:
>>>>> [email protected] wrote on Thu, 03 Sep 2009 13:44 -0400:
>>>>>> Pete Wyckoff wrote:
>>>>>>> [email protected] wrote on Thu, 03 Sep 2009 11:51 -0400:
>>>>>>>> In looking at some issues I was having with the encoding of PVFS_dirent
>>>>>>>> structs in requests I saw an inconsistency in how here_strings are
>>>>>>>> encoded and decoded. encode_string memcpys strings starting at 
>>>>>>>> *(pptr)+4
>>>>>>>> unless it's length 0 in which case it sets *(pptr) to 0. However,
>>>>>>>> decode_here_string always copys from *(pptr) + 4. So, if d_name is an
>>>>>>>> empty string when encoded d_name gets 1 byte of *(pptr)+4 instead of 0
>>>>>>>> on decoding.
>>>>>>>>
>>>>>>>> The fix is just to handle decoding like encoding. Is there a reason for
>>>>>>>> always copying to *(pptr)+4 in decode_here_string? Is this something
>>>>>>>> that should be changed?
>>>>>>> encode_string always ships at least 8 bytes.  For a null string, that's 
>>>>>>> 8
>>>>>>> bytes of zeroes.  Decoding a null "here" string will use one of
>>>>>>> those zero bytes to set pbuf[0] = '\0'.  I figured it would be nice
>>>>>>> to make sure the string was set to NULL.
>>>>>>>
>>>>>> I see where 8 bytes are always shipped in encode_string but I'm not
>>>>>> seeing where 8 bytes of '\0' get encoded for a NULL string or if the
>>>>>> length of the string is 0.
>>>>>>
>>>>>> However, adding an 8 byte memset worth of '\0' to *(pptr) in the else
>>>>>> (length 0) branch of encode_string also resolves the problem I was
>>>>>> seeing. So, your point may make for a better solution.
>>>>>     uint8_t *wiredata;
>>>>>     char mystr[20];
>>>>>     decode_here_string(&wiredata, mystr);
>>>>>
>>>>>     wiredata -> 0 0 0 0 0 0 0 0
>>>>>     mystr unititialized
>>>>>
>>>>> 131            memcpy(pbuf, *(pptr) + 4, len + 1); \
>>>>>
>>>>>     memcpy(pbuf, <bunch of zeroes>, 1)
>>>>>     equiv to
>>>>>
>>>>>     pbuf[0] = '\0';
>>>>> or
>>>>>     mystr[0] = '\0';
>>>>>
>>>>> right?  It's a _here_ string, you don't want to destroy the pointer,
>>>>> you want to put a zero into it to nullify the string.
>>>>>
>>>>> The comment does say "odd variation".  I don't remember where this
>>>>> even gets used.
>>>>>
>>>>>           -- Pete
>>>>> _______________________________________________
>>>>> Pvfs2-developers mailing list
>>>>> [email protected]
>>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Re: [Pvfs2-developers] here_string endecode bug?

Reply via email to