> However, I don't know how your suggestion can be implemented

Basically, using a sort of C++ reflection. See 
https://github.com/garbageslam/visit_struct, which is C++11 compatible, and 
additional libraries it points to.

> so feel free to suggest a prototype and/or submit a PR :-)

Surely not right away, but we'll see :)


Yaron.
________________________________
From: Antoine Pitrou <anto...@python.org>
Sent: Monday, September 19, 2022 4:09 AM
To: dev@arrow.apache.org <dev@arrow.apache.org>
Subject: Re: apparently misleading test assertion printout


Le 19/09/2022 à 10:05, Yaron Gvili a écrit :
> Hi Antoine,
>
> Thanks. I understand it is a GoogleTest printout, but do you agree this 
> _default_ printing can be misleading? I believe it'd be better to change this 
> default printing, if possible. In addition, I think it is likely possible to 
> automate the non-default printout of structs (with memory holes) that are 
> defined in Arrow. WDYT?

I agree it's misleading.  However, I don't know how your suggestion can
be implemented, so feel free to suggest a prototype and/or submit a PR :-)

Regards

Antoine.


>
>
> Yaron.
> ________________________________
> From: Antoine Pitrou <anto...@python.org>
> Sent: Monday, September 19, 2022 3:57 AM
> To: dev@arrow.apache.org <dev@arrow.apache.org>
> Subject: Re: apparently misleading test assertion printout
>
>
> Hi Yaron,
>
> This is what GoogleTest does when it doesn't know how to print out a
> value. Guidance to fix this at:
> https://github.com/google/googletest/blob/main/docs/advanced.md#teaching-googletest-how-to-print-your-values
>
> Regards
>
> Antoine.
>
>
> Le 19/09/2022 à 09:54, Yaron Gvili a écrit :
>> Hi,
>>
>> In my local code, I observed a test assertion printout that seems misleading.
>>
>> The printout looks like this:
>>
>> Expected equality of these values:
>>     expected_empty_segment
>>       Which is: 24-byte object <00-00 00-00 00-00 00-00 02-00 00-00 00-00 
>> 00-00 00-00 00-00 00-00 00-00>
>>     empty_segment
>>       Which is: 24-byte object <00-00 00-00 00-00 00-00 02-00 00-00 00-00 
>> 00-00 01-9F 2F-9B 90-51 F6-94>
>>
>> It was caused by this line:
>>
>> ASSERT_EQ(expected_empty_segment, empty_segment)
>>
>> The comparison of the two instances involves this code:
>>
>> struct ARROW_EXPORT GroupingSegment {
>>     int64_t offset;
>>     int64_t length;
>>     bool is_open;
>> };
>>
>> inline bool operator==(const GroupingSegment& segment1, const 
>> GroupingSegment& segment2) {
>>     return segment1.offset == segment2.offset && segment1.length == 
>> segment2.length &&
>>         segment1.is_open == segment2.is_open;
>> }
>>
>> The corresponding values of these instances are:
>>
>> GroupingSegment{0, 2, false}
>> GroupingSegment{0, 2, true}
>>
>> What seems misleading is that the printout says each instance is a 24-bytes 
>> object, which is technically true due to alignment of the struct to its 
>> largest member's size, but effectively only the first 17-bytes are valid. 
>> Taking 24-bytes, the printout includes 7 extra bytes to the right. In this 
>> case, the extra bytes are garbage, which changes from one run to the next, 
>> and it is clear they are uninitialized. However, in general, it is not easy 
>> to distinguish between uninitialized and overran memory.
>>
>> In the above simple case, it is easy enough to figure out what's going on, 
>> but I can imagine that in complex cases, especially ones where memory 
>> overruns are suspected, a misleading printout could be much harder to figure 
>> out.
>>
>> To improve, a possible solution for this is to define a template function 
>> returning the list (or bitmap) of byte-offsets that are valid for a given 
>> instance. For example, one could define that a struct {bool, int32_t} has 
>> valid byte-offsets {0, 4, 5, 6, 7} and this could likely even be automated, 
>> if desirable. Then, these byte-offsets could be used in a printout that 
>> marks the invalid bytes, e.g., the instance {true, 0x01010101} could be 
>> printed as "<01-xx xx-xx xx-xx 01-01 01-01>". An instance for which the 
>> template function is not defined could be printed like currently but with a 
>> warning that some printed bytes may be uninitialized.
>>
>>
>> Cheers,
>> Yaron.
>>
>

Reply via email to