On 11/3/2021 07:00, Tvrtko Ursulin wrote:
On 22/10/2021 00:40, john.c.harri...@intel.com wrote:
From: John Harrison <john.c.harri...@intel.com>

The sysfs file read helper does not actually report any errors if a
realloc fails. It just silently returns a 'valid' but truncated
buffer. This then leads to the decode of the buffer failing in random
ways. So, add a check for ENOMEM being generated during the read.

Signed-off-by: John Harrison <john.c.harri...@intel.com>
---
  tests/i915/gem_exec_capture.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/tests/i915/gem_exec_capture.c b/tests/i915/gem_exec_capture.c
index e373d24ed..8997125ee 100644
--- a/tests/i915/gem_exec_capture.c
+++ b/tests/i915/gem_exec_capture.c
@@ -131,9 +131,11 @@ static int check_error_state(int dir, struct offset *obj_offsets, int obj_count,
      char *error, *str;
      int blobs = 0;
+    errno = 0;
      error = igt_sysfs_get(dir, "error");
      igt_sysfs_set(dir, "error", "Begone!");
      igt_assert(error);
+    igt_assert(errno != ENOMEM);

igt_sysfs_get:

    len = 64;
...
                newbuf = realloc(buf, 2*len);

Maybe the problem is that the doubling gets out of hand. How big are your buffers? Perhaps you could improve the library function instead to grow less aggressively.

The buffers generally end up at 2GB in size, with the capture being about 1.8GB (on the particular system I happen to be testing on).

I considered various options such as doubling until a given size and then just incrementing by fixed amounts. But where do you draw the line? 1MB, 128MB, 1GB, 128GB? If the final result needs to be 128GB (which you cannot know until you have finished reading and resizing) and you are allocating in 1MB chunks then it is going to take a very long time to get there. I ended up leaving it as a straight double on the grounds that it is the best compromise between overallocation and taking ridiculous numbers of steps.
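To put rough numbers on that trade-off, here is a small sketch that counts how many realloc() calls each policy needs. The helper names (steps_doubling, steps_hybrid) and the 1MB switch-over point are purely illustrative and not part of IGT; only the 64-byte starting size comes from the igt_sysfs_get excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Count the resizes needed to grow from 'start' to at least 'target'
 * when the buffer is doubled every time it fills up. */
static unsigned int steps_doubling(uint64_t start, uint64_t target)
{
	unsigned int steps = 0;
	uint64_t len = start;

	/* Guard against a zero start (would never grow) and overflow. */
	assert(start > 0 && target <= UINT64_MAX / 2);

	while (len < target) {
		len *= 2;
		steps++;
	}

	return steps;
}

/* Hypothetical hybrid policy: double while below 'chunk', then grow
 * by a fixed 'chunk' per resize. */
static uint64_t steps_hybrid(uint64_t start, uint64_t target, uint64_t chunk)
{
	uint64_t steps = 0;
	uint64_t len = start;

	assert(start > 0 && chunk > 0);
	assert(target <= UINT64_MAX / 2 && chunk <= UINT64_MAX / 2);

	while (len < target) {
		len = (len < chunk) ? len * 2 : len + chunk;
		steps++;
	}

	return steps;
}
```

Starting from 64 bytes, steps_doubling() gives 25 resizes for a capture of about 1.8GB (ending at a 2GB buffer, as seen above) and 31 for 128GB. With a 1MB fixed chunk, steps_hybrid() needs 131085 resizes to reach 128GB, and each resize may also copy the whole buffer if realloc has to move it.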




And at the same time perhaps the bug is this:

                if (igt_debug_on(!newbuf))
                        break;
...
        return buf;

So failures to grow the buffer are ignored, while failures to allocate the initial one are not. Perhaps both should return NULL and then callers would not be surprised.

Or do you think someone relies on this current odd behaviour?

As per the commit description, this is exactly the problem. However, I do not know for certain that this is not intentional behaviour with something somewhere relying on it. And I really do not have the time to audit this. The vast majority of uses are reading teeny tiny files and don't care, but who knows what some particular test/config/platform/etc. might depend on. The fact that it explicitly says 'igt_debug_on' means that someone must have made a conscious decision not to assert. It's not like they just forgot to check for null being returned. That implies it is intentional and required.
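For reference, the stricter behaviour suggested above (return NULL when the buffer cannot be grown, same as when the initial allocation fails) could look roughly like the sketch below. This is not the IGT implementation: read_all_strict is a hypothetical name, it reads from a plain file descriptor rather than a sysfs directory and attribute, and whether existing callers would tolerate the change is exactly the open question.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper, not igt_sysfs_get(): read everything from fd into
 * a NUL-terminated heap buffer. On any failure the partial buffer is
 * freed and NULL is returned with errno set, so a caller never sees
 * truncated data that looks valid.
 */
static char *read_all_strict(int fd)
{
	size_t len = 64, used = 0;
	char *buf, *newbuf;
	ssize_t ret;
	int saved;

	assert(fd >= 0);

	buf = malloc(len);
	if (!buf)
		return NULL;

	for (;;) {
		/* Always leave one byte free for the terminating NUL. */
		ret = read(fd, buf + used, len - used - 1);
		if (ret < 0) {
			if (errno == EINTR)
				continue;
			saved = errno;
			free(buf);
			errno = saved;
			return NULL;
		}
		if (ret == 0)
			break; /* end of file */

		used += (size_t)ret;
		if (used < len - 1)
			continue;

		/* Buffer is full: double it, treating failure as fatal. */
		newbuf = len > SIZE_MAX / 2 ? NULL : realloc(buf, 2 * len);
		if (!newbuf) {
			free(buf);
			errno = ENOMEM;
			return NULL;
		}
		buf = newbuf;
		len *= 2;
	}

	buf[used] = '\0';
	return buf;
}
```

With a helper shaped like this, the existing igt_assert(error) in the test would catch the out-of-memory case on its own, without the separate errno check.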

John.


Regards,

Tvrtko

      igt_debug("%s\n", error);
        /* render ring --- user = 0x00000000 ffffd000 */

