> However, I don't know how your suggestion can be implemented Basically, using a sort of C++ reflection. See https://github.com/garbageslam/visit_struct, which is C++11 compatible, and additional libraries it points to.
> so feel free to suggest a prototype and/or submit a PR :-) Surely not right away, but we'll see :) Yaron. ________________________________ From: Antoine Pitrou <anto...@python.org> Sent: Monday, September 19, 2022 4:09 AM To: dev@arrow.apache.org <dev@arrow.apache.org> Subject: Re: apparently misleading test assertion printout Le 19/09/2022 à 10:05, Yaron Gvili a écrit : > Hi Antoine, > > Thanks. I understand it is a GoogleTest printout, but do you agree this > _default_ printing can be misleading? I believe it'd be better to change this > default printing, if possible. In addition, I think it is likely possible to > automate the non-default printout of structs (with memory holes) that are > defined in Arrow. WDYT? I agree it's misleading. However, I don't know how your suggestion can be implemented, so feel free to suggest a prototype and/or submit a PR :-) Regards Antoine. > > > Yaron. > ________________________________ > From: Antoine Pitrou <anto...@python.org> > Sent: Monday, September 19, 2022 3:57 AM > To: dev@arrow.apache.org <dev@arrow.apache.org> > Subject: Re: apparently misleading test assertion printout > > > Hi Yaron, > > This is what GoogleTest does when it doesn't know how to print out a > value. Guidance to fix this at: > https://github.com/google/googletest/blob/main/docs/advanced.md#teaching-googletest-how-to-print-your-values > > Regards > > Antoine. > > > Le 19/09/2022 à 09:54, Yaron Gvili a écrit : >> Hi, >> >> In my local code, I observed a test assertion printout that seems misleading. >> >> The printout looks like this: >> >> Expected equality of these values: >> expected_empty_segment >> Which is: 24-byte object <00-00 00-00 00-00 00-00 02-00 00-00 00-00 >> 00-00 00-00 00-00 00-00 00-00> >> empty_segment >> Which is: 24-byte object <00-00 00-00 00-00 00-00 02-00 00-00 00-00 >> 00-00 01-9F 2F-9B 90-51 F6-94> >> >> It was caused by this line: >> >> ASSERT_EQ(expected_empty_segment, empty_segment) >> >> The comparison of the two instances involves this code: >> >> struct ARROW_EXPORT GroupingSegment { >> int64_t offset; >> int64_t length; >> bool is_open; >> }; >> >> inline bool operator==(const GroupingSegment& segment1, const >> GroupingSegment& segment2) { >> return segment1.offset == segment2.offset && segment1.length == >> segment2.length && >> segment1.is_open == segment2.is_open; >> } >> >> The corresponding values of these instances are: >> >> GroupingSegment{0, 2, false} >> GroupingSegment{0, 2, true} >> >> What seems misleading is that the printout says each instance is a 24-bytes >> object, which is technically true due to alignment of the struct to its >> largest member's size, but effectively only the first 17-bytes are valid. >> Taking 24-bytes, the printout includes 7 extra bytes to the right. In this >> case, the extra bytes are garbage, which changes from one run to the next, >> and it is clear they are uninitialized. However, in general, it is not easy >> to distinguish between uninitialized and overran memory. >> >> In the above simple case, it is easy enough to figure out what's going on, >> but I can imagine that in complex cases, especially ones where memory >> overruns are suspected, a misleading printout could be much harder to figure >> out. >> >> To improve, a possible solution for this is to define a template function >> returning the list (or bitmap) of byte-offsets that are valid for a given >> instance. For example, one could define that a struct {bool, int32_t} has >> valid byte-offsets {0, 4, 5, 6, 7} and this could likely even be automated, >> if desirable. Then, these byte-offsets could be used in a printout that >> marks the invalid bytes, e.g., the instance {true, 0x01010101} could be >> printed as "<01-xx xx-xx xx-xx 01-01 01-01>". An instance for which the >> template function is not defined could be printed like currently but with a >> warning that some printed bytes may be uninitialized. >> >> >> Cheers, >> Yaron. >> >