Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Keith Packard
Jason Ekstrand  writes:

> You can have a full reviewed-by

You're too kind :-)

-- 
-keith


signature.asc
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Jason Ekstrand

On June 19, 2018 22:16:37 "Keith Packard"  wrote:


Jason Ekstrand  writes:


I don't think I have any more comments on this patch.  It's gross but I
think it should work.


I'll mark you down as 'Acked-by' then. Neither of us loves the
implementation; I'll see about creating the kernel infrastructure
necessary to supplant it.


You can have a full reviewed-by



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Keith Packard
Jason Ekstrand  writes:

> I don't think I have any more comments on this patch.  It's gross but I
> think it should work.

I'll mark you down as 'Acked-by' then. Neither of us loves the
implementation; I'll see about creating the kernel infrastructure
necessary to supplant it.

-- 
-keith


signature.asc
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Jason Ekstrand
On Tue, Jun 19, 2018 at 6:22 PM, Keith Packard  wrote:

> Jason Ekstrand  writes:
>
> > I suppose we probably shouldn't worry about current_time being greater
> than
> > INT64_MAX?  I guess if that happens we have pretty big problems...
>
> Nope, we've already given up on that working -- it's a couple hundred
> years of system uptime. Neither of us have any concerns in this area.
>
> >>timeout = MIN2(max_timeout, timeout);
> >>
> >>return (current_time + timeout);
> >> }
> >>
> >
> > Yeah, I think that's better.
>
> Cool. I'll wait for further review :-)
>

I don't think I have any more comments on this patch.  It's gross but I
think it should work.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Keith Packard
Jason Ekstrand  writes:

> I suppose we probably shouldn't worry about current_time being greater than
> INT64_MAX?  I guess if that happens we have pretty big problems...

Nope, we've already given up on that working -- it's a couple hundred
years of system uptime. Neither of us have any concerns in this area.

>>timeout = MIN2(max_timeout, timeout);
>>
>>return (current_time + timeout);
>> }
>>
>
> Yeah, I think that's better.

Cool. I'll wait for further review :-)

-- 
-keith


signature.asc
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Jason Ekstrand
On Tue, Jun 19, 2018 at 5:36 PM, Keith Packard  wrote:

> Jason Ekstrand  writes:
>
> > I still don't really like this but I agree that this code really should
> not
> > be getting hit very often so it's probably not too bad.  Maybe one day
> > we'll be able to drop the non-syncobj paths entirely.  Wouldn't that be
> > nice.
>
> I agree. What I want to have is kernel-side syncobj support for all of
> the WSI fences that we need here.
>
> I thought about using syncobjs and signaling them from user-space, but
> realized that we couldn't eliminate the non-syncobj path quite yet as
> that requires a new enough kernel. And, if we want to get to
> kernel-native WSI syncobjs, that would mean we'd eventually have three
> code paths. I think it's better to have one 'legacy' path, which works
> on all kernels, and then one 'modern' path which does what we want.
>
> > In the mean time, this is probably fine.  I did see one issue below
> > with time conversions that should be fixed though.
>
> Always a hard thing to get right.
>
> >> +static uint64_t anv_get_absolute_timeout(uint64_t timeout)
> >> +{
> >> +   if (timeout == 0)
> >> +  return 0;
> >> +   uint64_t current_time = gettime_ns();
> >> +
> >> +   timeout = MIN2(INT64_MAX - current_time, timeout);
> >> +
> >> +   return (current_time + timeout);
> >> +}
> >>
> >
> > This does not have the same behavior as the code it replaces.  In
> > particular, the old code saturates at INT64_MAX whereas this code does
> not
> > do so properly anymore.  If UINT64_MAX gets passed into timeout, I
> believe
> > that will be implicitly casted to signed and the MIN will yield -1 which
> > gets casted back to unsigned and you get UINT64_MAX again.
>
> It won't not get implicitly casted to signed; the implicit cast works
> the other way where  OP  implicitly casts the 
> operand to unsigned for types of the same 'rank' (where 'rank' is not
> quite equivalent to size). Here's a fine article on this:
>
> https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+
> Understand+integer+conversion+rules
>
> However, this is a rather obscure part of the ISO standard, and I don't
> think we should expect that people reading the code know it well.


This is why I insert casts like mad in these scenarios. :-)


> Adding
> a cast to make it clear what we want is a good idea. How about this?
>
> static uint64_t anv_get_absolute_timeout(uint64_t timeout)
> {
>if (timeout == 0)
>   return 0;
>uint64_t current_time = gettime_ns();
>uint64_t max_timeout = (uint64_t) INT64_MAX - current_time;
>

I suppose we probably shouldn't worry about current_time being greater than
INT64_MAX?  I guess if that happens we have pretty big problems...


>timeout = MIN2(max_timeout, timeout);
>
>return (current_time + timeout);
> }
>

Yeah, I think that's better.

--Jason
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Keith Packard
Jason Ekstrand  writes:

> I still don't really like this but I agree that this code really should not
> be getting hit very often so it's probably not too bad.  Maybe one day
> we'll be able to drop the non-syncobj paths entirely.  Wouldn't that be
> nice.

I agree. What I want to have is kernel-side syncobj support for all of
the WSI fences that we need here.

I thought about using syncobjs and signaling them from user-space, but
realized that we couldn't eliminate the non-syncobj path quite yet as
that requires a new enough kernel. And, if we want to get to
kernel-native WSI syncobjs, that would mean we'd eventually have three
code paths. I think it's better to have one 'legacy' path, which works
on all kernels, and then one 'modern' path which does what we want.

> In the mean time, this is probably fine.  I did see one issue below
> with time conversions that should be fixed though.

Always a hard thing to get right.

>> +static uint64_t anv_get_absolute_timeout(uint64_t timeout)
>> +{
>> +   if (timeout == 0)
>> +  return 0;
>> +   uint64_t current_time = gettime_ns();
>> +
>> +   timeout = MIN2(INT64_MAX - current_time, timeout);
>> +
>> +   return (current_time + timeout);
>> +}
>>
>
> This does not have the same behavior as the code it replaces.  In
> particular, the old code saturates at INT64_MAX whereas this code does not
> do so properly anymore.  If UINT64_MAX gets passed into timeout, I believe
> that will be implicitly casted to signed and the MIN will yield -1 which
> gets casted back to unsigned and you get UINT64_MAX again.

It won't not get implicitly casted to signed; the implicit cast works
the other way where  OP  implicitly casts the 
operand to unsigned for types of the same 'rank' (where 'rank' is not
quite equivalent to size). Here's a fine article on this:

https://wiki.sei.cmu.edu/confluence/display/c/INT02-C.+Understand+integer+conversion+rules

However, this is a rather obscure part of the ISO standard, and I don't
think we should expect that people reading the code know it well. Adding
a cast to make it clear what we want is a good idea. How about this?

static uint64_t anv_get_absolute_timeout(uint64_t timeout)
{
   if (timeout == 0)
  return 0;
   uint64_t current_time = gettime_ns();
   uint64_t max_timeout = (uint64_t) INT64_MAX - current_time;

   timeout = MIN2(max_timeout, timeout);

   return (current_time + timeout);
}

> This is a problem because the value we pass into the kernel ioctls is
> signed.  The behavior of the kernel *should* be infinite when timeout
> < 0 but, on some older kernels, -1 is treated as 0 and you get no
> timeout.

Yeah, we definitely want to cap the values to INT64_MAX.

>> -  else if (current_ns + _timeout > INT64_MAX)
>>
>
> I suspect we need to cast INT64_MAX to a uint64_t here to ensure we don't
> accidentally get an implicit conversion to signed.

Again, it's not necessary given the C conversion rules, but it is a good
way to clarify the intent.

-- 
-keith


signature.asc
Description: PGP signature
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 4/7] anv: Support wait for heterogeneous list of fences [v2]

2018-06-19 Thread Jason Ekstrand
I still don't really like this but I agree that this code really should not
be getting hit very often so it's probably not too bad.  Maybe one day
we'll be able to drop the non-syncobj paths entirely.  Wouldn't that be
nice.  In the mean time, this is probably fine.  I did see one issue below
with time conversions that should be fixed though.

On Thu, Jun 14, 2018 at 7:52 PM, Keith Packard  wrote:

> Handle the case where the set of fences to wait for is not all of the
> same type by either waiting for them sequentially (waitAll), or
> polling them until the timer has expired (!waitAll). We hope the
> latter case is not common.
>
> While the current code makes sure that it always has fences of only
> one type, that will not be true when we add WSI fences. Split out this
> refactoring to make merging that clearer.
>
> v2: Adopt Jason Ekstrand's coding conventions
>
> Declare variables at first use, eliminate extra whitespace between
> types and names. Wrap lines to 80 columns.
>
> Suggested-by: Jason Ekstrand 
>
> Signed-off-by: Keith Packard 
> ---
>  src/intel/vulkan/anv_queue.c | 105 +--
>  1 file changed, 88 insertions(+), 17 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_queue.c b/src/intel/vulkan/anv_queue.c
> index 6fe04a0a19d..8df99c84549 100644
> --- a/src/intel/vulkan/anv_queue.c
> +++ b/src/intel/vulkan/anv_queue.c
> @@ -459,12 +459,32 @@ gettime_ns(void)
> return (uint64_t)current.tv_sec * NSEC_PER_SEC + current.tv_nsec;
>  }
>
> +static uint64_t anv_get_absolute_timeout(uint64_t timeout)
> +{
> +   if (timeout == 0)
> +  return 0;
> +   uint64_t current_time = gettime_ns();
> +
> +   timeout = MIN2(INT64_MAX - current_time, timeout);
> +
> +   return (current_time + timeout);
> +}
>

This does not have the same behavior as the code it replaces.  In
particular, the old code saturates at INT64_MAX whereas this code does not
do so properly anymore.  If UINT64_MAX gets passed into timeout, I believe
that will be implicitly casted to signed and the MIN will yield -1 which
gets casted back to unsigned and you get UINT64_MAX again.  This is a
problem because the value we pass into the kernel ioctls is signed.  The
behavior of the kernel *should* be infinite when timeout < 0 but, on some
older kernels, -1 is treated as 0 and you get no timeout.

That said, I think I just saw a bug in the old code which I'll point out
below.


> +
> +static int64_t anv_get_relative_timeout(uint64_t abs_timeout)
> +{
> +   uint64_t now = gettime_ns();
> +
> +   if (abs_timeout < now)
> +  return 0;
> +   return abs_timeout - now;
> +}
> +
>  static VkResult
>  anv_wait_for_syncobj_fences(struct anv_device *device,
>  uint32_t fenceCount,
>  const VkFence *pFences,
>  bool waitAll,
> -uint64_t _timeout)
> +uint64_t abs_timeout_ns)
>  {
> uint32_t *syncobjs = vk_zalloc(>alloc,
>sizeof(*syncobjs) * fenceCount, 8,
> @@ -484,19 +504,6 @@ anv_wait_for_syncobj_fences(struct anv_device
> *device,
>syncobjs[i] = impl->syncobj;
> }
>
> -   int64_t abs_timeout_ns = 0;
> -   if (_timeout > 0) {
> -  uint64_t current_ns = gettime_ns();
> -
> -  /* Add but saturate to INT32_MAX */
> -  if (current_ns + _timeout < current_ns)
> - abs_timeout_ns = INT64_MAX;
> -  else if (current_ns + _timeout > INT64_MAX)
>

I suspect we need to cast INT64_MAX to a uint64_t here to ensure we don't
accidentally get an implicit conversion to signed.


> - abs_timeout_ns = INT64_MAX;
> -  else
> - abs_timeout_ns = current_ns + _timeout;
> -   }
> -
> /* The gem_syncobj_wait ioctl may return early due to an inherent
>  * limitation in the way it computes timeouts.  Loop until we've
> actually
>  * passed the timeout.
> @@ -665,6 +672,67 @@ done:
> return result;
>  }
>
> +static VkResult
> +anv_wait_for_fences(struct anv_device *device,
> +uint32_t fenceCount,
> +const VkFence *pFences,
> +bool waitAll,
> +uint64_t abs_timeout)
> +{
> +   VkResult result = VK_SUCCESS;
> +
> +   if (fenceCount <= 1 || waitAll) {
> +  for (uint32_t i = 0; i < fenceCount; i++) {
> + ANV_FROM_HANDLE(anv_fence, fence, pFences[i]);
> + switch (fence->permanent.type) {
> + case ANV_FENCE_TYPE_BO:
> +result = anv_wait_for_bo_fences(
> +   device, 1, [i], true,
> +   anv_get_relative_timeout(abs_timeout));
> +break;
> + case ANV_FENCE_TYPE_SYNCOBJ:
> +result = anv_wait_for_syncobj_fences(device, 1, [i],
> + true, abs_timeout);
> +break;
> + case ANV_FENCE_TYPE_NONE:
> +result = VK_SUCCESS;
> +break;
> +