Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Roland Scheidegger
Am 05.02.2016 um 16:08 schrieb Marek Olšák:
> On Fri, Feb 5, 2016 at 3:56 PM, Roland Scheidegger  wrote:
>> Am 05.02.2016 um 15:44 schrieb Marek Olšák:
>>> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
 On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> This is a subset of the generated tests which are known to fail
>> on everything except CPU emulation (AFAIK).
>> ---
>
> This is really awful. Committing a generated test, but with unknown
> bits chopped out is gross.
>
> If it were me, I'd want to understand why my hardware behaved
> differently -- not just hack up *different* tests and claim victory.
>
> FWIW, the generated tests pass on all Intel hardware exposing
> ARB_shading_language_packing. Gen7+ has native half-float support, and
> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
> built-ins into a pile of instructions.
>
> If you can identify how AMD hardware behaves differently and can prove
> that the generator needs to be relaxed or something, that's cool. But
> as is, I hate this patch.
>
> I can't find anything in the AMD docs (I looked at GCN3) about
> half-precision support, so I can't check my theory that AMD hardware
> rounds towards zero instead of to-nearest/even like Intel.

 Since the tests only fail with very small numbers, I think the problem
 is that denorms are disabled by radeonsi. I can try to confirm that.

 The hardware rounds to nearest even. The hw precision is:
 - unpack functions - 0 ULP
 - pack functions = 0.5 ULP
 - input and output denorms are flushed to 0
>>>
>>> Hey Matt,
>>>
>>> I have just confirmed that I was right. After I enable denormals in
>>> hw, the original tests pass. This means that this patch tests the
>>> packing functions but skips denormals.
>>>
>>> Not so awful now, is it? :)
>>>
>>> Sadly, I can't enable denormals on all chips, because they are slow.
>>>
>>> So if I add "-no-denormals" suffix into the test names, I can push this, 
>>> right?
>>>
>>
>> Can't you hack up the generator instead? By the looks of it
>> (gen_builtin_packing_tests.py) it has a list of values which result in
>> denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
>> add a test there which uses a different list, not including them.
>>
>> (That said, I'm a bit surprised for conversion to/from fp16 your hw
>> doesn't do fp16 denorms - they'd be required by d3d(11) as well.)
> 
> The hardware can do denormals, they are just slow on all chips except VI.
> 
> GLSL 4.50 doesn't require denormals, thus piglit shouldn't even contain
> tests for them.
> 

I'm not asking about denormals for ordinary operations, just conversion
to fp16 (any fp16 denorm is a fp32 normal). That would be along similar
lines what d3d10 already required - forbids fp32 denorms, but for
instance sampling fp16 surfaces requires you to handle the fp16 denorms
without flushing to zero (that's at least what the docs say - maybe you
can get away without it...). FWIW x86 half-float conversion instructions
(vcvtph2ps, vcvtps2ph) work that way too - even if you have denorms
disabled, that instruction will still produce fp16 denorms (and convert
fp16 denorms to fp32 normals), you can configure rounding mode but
there's no way to disable fp16 denorms.
Albeit gcn3 supports fp16 natively (that is it has actual operations
using them not just conversion), so I suppose it makes sense it would
flush them on conversion too...
But I think you're right glsl shouldn't require them (albeit it is
pretty silent on that topic as far as I can tell - certainly doesn't
require them for fp32).

Roland

___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit


Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Marek Olšák
On Fri, Feb 5, 2016 at 3:56 PM, Roland Scheidegger  wrote:
> Am 05.02.2016 um 15:44 schrieb Marek Olšák:
>> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
>>> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
 On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This is a subset of the generated tests which are known to fail
> on everything except CPU emulation (AFAIK).
> ---

 This is really awful. Committing a generated test, but with unknown
 bits chopped out is gross.

 If it were me, I'd want to understand why my hardware behaved
 differently -- not just hack up *different* tests and claim victory.

 FWIW, the generated tests pass on all Intel hardware exposing
 ARB_shading_language_packing. Gen7+ has native half-float support, and
 Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
 built-ins into a pile of instructions.

 If you can identify how AMD hardware behaves differently and can prove
 that the generator needs to be relaxed or something, that's cool. But
 as is, I hate this patch.

 I can't find anything in the AMD docs (I looked at GCN3) about
 half-precision support, so I can't check my theory that AMD hardware
 rounds towards zero instead of to-nearest/even like Intel.
>>>
>>> Since the tests only fail with very small numbers, I think the problem
>>> is that denorms are disabled by radeonsi. I can try to confirm that.
>>>
>>> The hardware rounds to nearest even. The hw precision is:
>>> - unpack functions - 0 ULP
>>> - pack functions = 0.5 ULP
>>> - input and output denorms are flushed to 0
>>
>> Hey Matt,
>>
>> I have just confirmed that I was right. After I enable denormals in
>> hw, the original tests pass. This means that this patch tests the
>> packing functions but skips denormals.
>>
>> Not so awful now, is it? :)
>>
>> Sadly, I can't enable denormals on all chips, because they are slow.
>>
>> So if I add "-no-denormals" suffix into the test names, I can push this, 
>> right?
>>
>
> Can't you hack up the generator instead? By the looks of it
> (gen_builtin_packing_tests.py) it has a list of values which result in
> denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
> add a test there which uses a different list, not including them.
>
> (That said, I'm a bit surprised for conversion to/from fp16 your hw
> doesn't do fp16 denorms - they'd be required by d3d(11) as well.)

The hardware can do denormals, they are just slow on all chips except VI.

GLSL 4.50 doesn't require denormals, thus piglit shouldn't even contain
tests for them.

Marek
___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit


Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Ilia Mirkin
On Fri, Feb 5, 2016 at 10:48 AM, Roland Scheidegger  wrote:
> But I think you're right glsl shouldn't require them (albeit it is
> pretty silent on that topic as far as I can tell - certainly doesn't
> require them for fp32).

Perhaps it doesn't require them, but at least Unigine Heaven 4.0
misrenders if you don't flush denorms to 0.

  -ilia
___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit


Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Roland Scheidegger
Am 05.02.2016 um 15:44 schrieb Marek Olšák:
> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
>> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
>>> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
 From: Marek Olšák 

 This is a subset of the generated tests which are known to fail
 on everything except CPU emulation (AFAIK).
 ---
>>>
>>> This is really awful. Committing a generated test, but with unknown
>>> bits chopped out is gross.
>>>
>>> If it were me, I'd want to understand why my hardware behaved
>>> differently -- not just hack up *different* tests and claim victory.
>>>
>>> FWIW, the generated tests pass on all Intel hardware exposing
>>> ARB_shading_language_packing. Gen7+ has native half-float support, and
>>> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
>>> built-ins into a pile of instructions.
>>>
>>> If you can identify how AMD hardware behaves differently and can prove
>>> that the generator needs to be relaxed or something, that's cool. But
>>> as is, I hate this patch.
>>>
>>> I can't find anything in the AMD docs (I looked at GCN3) about
>>> half-precision support, so I can't check my theory that AMD hardware
>>> rounds towards zero instead of to-nearest/even like Intel.
>>
>> Since the tests only fail with very small numbers, I think the problem
>> is that denorms are disabled by radeonsi. I can try to confirm that.
>>
>> The hardware rounds to nearest even. The hw precision is:
>> - unpack functions - 0 ULP
>> - pack functions = 0.5 ULP
>> - input and output denorms are flushed to 0
> 
> Hey Matt,
> 
> I have just confirmed that I was right. After I enable denormals in
> hw, the original tests pass. This means that this patch tests the
> packing functions but skips denormals.
> 
> Not so awful now, is it? :)
> 
> Sadly, I can't enable denormals on all chips, because they are slow.
> 
> So if I add "-no-denormals" suffix into the test names, I can push this, 
> right?
> 

Can't you hack up the generator instead? By the looks of it
(gen_builtin_packing_tests.py) it has a list of values which result in
denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
add a test there which uses a different list, not including them.

(That said, I'm a bit surprised for conversion to/from fp16 your hw
doesn't do fp16 denorms - they'd be required by d3d(11) as well.)

Roland

___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit


Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Roland Scheidegger
Am 05.02.2016 um 17:04 schrieb Marek Olšák:
> On Fri, Feb 5, 2016 at 4:48 PM, Roland Scheidegger  wrote:
>> Am 05.02.2016 um 16:08 schrieb Marek Olšák:
>>> On Fri, Feb 5, 2016 at 3:56 PM, Roland Scheidegger  
>>> wrote:
 Am 05.02.2016 um 15:44 schrieb Marek Olšák:
> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
>> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
>>> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
 From: Marek Olšák 

 This is a subset of the generated tests which are known to fail
 on everything except CPU emulation (AFAIK).
 ---
>>>
>>> This is really awful. Committing a generated test, but with unknown
>>> bits chopped out is gross.
>>>
>>> If it were me, I'd want to understand why my hardware behaved
>>> differently -- not just hack up *different* tests and claim victory.
>>>
>>> FWIW, the generated tests pass on all Intel hardware exposing
>>> ARB_shading_language_packing. Gen7+ has native half-float support, and
>>> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
>>> built-ins into a pile of instructions.
>>>
>>> If you can identify how AMD hardware behaves differently and can prove
>>> that the generator needs to be relaxed or something, that's cool. But
>>> as is, I hate this patch.
>>>
>>> I can't find anything in the AMD docs (I looked at GCN3) about
>>> half-precision support, so I can't check my theory that AMD hardware
>>> rounds towards zero instead of to-nearest/even like Intel.
>>
>> Since the tests only fail with very small numbers, I think the problem
>> is that denorms are disabled by radeonsi. I can try to confirm that.
>>
>> The hardware rounds to nearest even. The hw precision is:
>> - unpack functions - 0 ULP
>> - pack functions = 0.5 ULP
>> - input and output denorms are flushed to 0
>
> Hey Matt,
>
> I have just confirmed that I was right. After I enable denormals in
> hw, the original tests pass. This means that this patch tests the
> packing functions but skips denormals.
>
> Not so awful now, is it? :)
>
> Sadly, I can't enable denormals on all chips, because they are slow.
>
> So if I add "-no-denormals" suffix into the test names, I can push this, 
> right?
>

 Can't you hack up the generator instead? By the looks of it
 (gen_builtin_packing_tests.py) it has a list of values which result in
 denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
 add a test there which uses a different list, not including them.

 (That said, I'm a bit surprised for conversion to/from fp16 your hw
 doesn't do fp16 denorms - they'd be required by d3d(11) as well.)
>>>
>>> The hardware can do denormals, they are just slow on all chips except VI.
>>>
>>> GLSL 4.50 doesn't require denormals, thus piglit shouldn't even contain
>>> tests for them.
>>>
>>
>> I'm not asking about denormals for ordinary operations, just conversion
>> to fp16 (any fp16 denorm is a fp32 normal). That would be along similar
>> lines what d3d10 already required - forbids fp32 denorms, but for
>> instance sampling fp16 surfaces requires you to handle the fp16 denorms
>> without flushing to zero (that's at least what the docs say - maybe you
>> can get away without it...). FWIW x86 half-float conversion instructions
>> (vcvtph2ps, vcvtps2ph) work that way too - even if you have denorms
>> disabled, that instruction will still produce fp16 denorms (and convert
>> fp16 denorms to fp32 normals), you can configure rounding mode but
>> there's no way to disable fp16 denorms.
>> Albeit gcn3 supports fp16 natively (that is it has actual operations
>> using them not just conversion), so I suppose it makes sense it would
>> flush them on conversion too...
>> But I think you're right glsl shouldn't require them (albeit it is
>> pretty silent on that topic as far as I can tell - certainly doesn't
>> require them for fp32).
> 
> There is no point in enabling denormals for the conversion
> instructions if all previous FP instructions flush denormals to 0 and
> so do the following FP instructions.
I think you didn't get the point: any fp16 denormal is a fp32 normal
(and similarly, any fp32 denormal is fp16 zero). And glsl (well the
non-es variants anyway) don't deal with anything but fp32 really, you
cannot manipulate these packed fp16 values any other way than unpacking
to fp32 (well you can of course, but not really in a "float way").
Thus, even if you have no denormals for fp32, fp16 denorms are still
useful for packing, unpacking.

> If the conversion instructions
> were the only FP instructions manipulating with the source data in the
> shader, it would make sense, but this is a very rare use case of 

Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Marek Olšák
On Fri, Feb 5, 2016 at 7:11 PM, Roland Scheidegger  wrote:
> Am 05.02.2016 um 17:04 schrieb Marek Olšák:
>> On Fri, Feb 5, 2016 at 4:48 PM, Roland Scheidegger  
>> wrote:
>>> Am 05.02.2016 um 16:08 schrieb Marek Olšák:
 On Fri, Feb 5, 2016 at 3:56 PM, Roland Scheidegger  
 wrote:
> Am 05.02.2016 um 15:44 schrieb Marek Olšák:
>> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
>>> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
 On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This is a subset of the generated tests which are known to fail
> on everything except CPU emulation (AFAIK).
> ---

 This is really awful. Committing a generated test, but with unknown
 bits chopped out is gross.

 If it were me, I'd want to understand why my hardware behaved
 differently -- not just hack up *different* tests and claim victory.

 FWIW, the generated tests pass on all Intel hardware exposing
 ARB_shading_language_packing. Gen7+ has native half-float support, and
 Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
 built-ins into a pile of instructions.

 If you can identify how AMD hardware behaves differently and can prove
 that the generator needs to be relaxed or something, that's cool. But
 as is, I hate this patch.

 I can't find anything in the AMD docs (I looked at GCN3) about
 half-precision support, so I can't check my theory that AMD hardware
 rounds towards zero instead of to-nearest/even like Intel.
>>>
>>> Since the tests only fail with very small numbers, I think the problem
>>> is that denorms are disabled by radeonsi. I can try to confirm that.
>>>
>>> The hardware rounds to nearest even. The hw precision is:
>>> - unpack functions - 0 ULP
>>> - pack functions = 0.5 ULP
>>> - input and output denorms are flushed to 0
>>
>> Hey Matt,
>>
>> I have just confirmed that I was right. After I enable denormals in
>> hw, the original tests pass. This means that this patch tests the
>> packing functions but skips denormals.
>>
>> Not so awful now, is it? :)
>>
>> Sadly, I can't enable denormals on all chips, because they are slow.
>>
>> So if I add "-no-denormals" suffix into the test names, I can push this, 
>> right?
>>
>
> Can't you hack up the generator instead? By the looks of it
> (gen_builtin_packing_tests.py) it has a list of values which result in
> denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
> add a test there which uses a different list, not including them.
>
> (That said, I'm a bit surprised for conversion to/from fp16 your hw
> doesn't do fp16 denorms - they'd be required by d3d(11) as well.)

 The hardware can do denormals, they are just slow on all chips except VI.

 GLSL 4.50 doesn't require denormals, thus piglit shouldn't even contain
 tests for them.

>>>
>>> I'm not asking about denormals for ordinary operations, just conversion
>>> to fp16 (any fp16 denorm is a fp32 normal). That would be along similar
>>> lines what d3d10 already required - forbids fp32 denorms, but for
>>> instance sampling fp16 surfaces requires you to handle the fp16 denorms
>>> without flushing to zero (that's at least what the docs say - maybe you
>>> can get away without it...). FWIW x86 half-float conversion instructions
>>> (vcvtph2ps, vcvtps2ph) work that way too - even if you have denorms
>>> disabled, that instruction will still produce fp16 denorms (and convert
>>> fp16 denorms to fp32 normals), you can configure rounding mode but
>>> there's no way to disable fp16 denorms.
>>> Albeit gcn3 supports fp16 natively (that is it has actual operations
>>> using them not just conversion), so I suppose it makes sense it would
>>> flush them on conversion too...
>>> But I think you're right glsl shouldn't require them (albeit it is
>>> pretty silent on that topic as far as I can tell - certainly doesn't
>>> require them for fp32).
>>
>> There is no point in enabling denormals for the conversion
>> instructions if all previous FP instructions flush denormals to 0 and
>> so do the following FP instructions.
> I think you didn't get the point: any fp16 denormal is a fp32 normal
> (and similarly, any fp32 denormal is fp16 zero). And glsl (well the
> non-es variants anyway) don't deal with anything but fp32 really, you
> cannot manipulate these packed fp16 values any other way than unpacking
> to fp32 (well you can of course, but not really in a "float way").
> Thus, even if you have no denormals for fp32, fp16 denorms are still
> useful for packing, 

Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Ian Romanick
On 02/05/2016 08:04 AM, Marek Olšák wrote:
> On Fri, Feb 5, 2016 at 4:48 PM, Roland Scheidegger  wrote:
>> Am 05.02.2016 um 16:08 schrieb Marek Olšák:
>>> On Fri, Feb 5, 2016 at 3:56 PM, Roland Scheidegger  
>>> wrote:
 Am 05.02.2016 um 15:44 schrieb Marek Olšák:
> On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
>> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
>>> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
 From: Marek Olšák 

 This is a subset of the generated tests which are known to fail
 on everything except CPU emulation (AFAIK).
 ---
>>>
>>> This is really awful. Committing a generated test, but with unknown
>>> bits chopped out is gross.
>>>
>>> If it were me, I'd want to understand why my hardware behaved
>>> differently -- not just hack up *different* tests and claim victory.
>>>
>>> FWIW, the generated tests pass on all Intel hardware exposing
>>> ARB_shading_language_packing. Gen7+ has native half-float support, and
>>> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
>>> built-ins into a pile of instructions.
>>>
>>> If you can identify how AMD hardware behaves differently and can prove
>>> that the generator needs to be relaxed or something, that's cool. But
>>> as is, I hate this patch.
>>>
>>> I can't find anything in the AMD docs (I looked at GCN3) about
>>> half-precision support, so I can't check my theory that AMD hardware
>>> rounds towards zero instead of to-nearest/even like Intel.
>>
>> Since the tests only fail with very small numbers, I think the problem
>> is that denorms are disabled by radeonsi. I can try to confirm that.
>>
>> The hardware rounds to nearest even. The hw precision is:
>> - unpack functions - 0 ULP
>> - pack functions = 0.5 ULP
>> - input and output denorms are flushed to 0
>
> Hey Matt,
>
> I have just confirmed that I was right. After I enable denormals in
> hw, the original tests pass. This means that this patch tests the
> packing functions but skips denormals.
>
> Not so awful now, is it? :)
>
> Sadly, I can't enable denormals on all chips, because they are slow.
>
> So if I add "-no-denormals" suffix into the test names, I can push this, 
> right?
>

 Can't you hack up the generator instead? By the looks of it
 (gen_builtin_packing_tests.py) it has a list of values which result in
 denorm f16 values (make_inputs_for_pack_half_2x16). Presumably you could
 add a test there which uses a different list, not including them.

 (That said, I'm a bit surprised for conversion to/from fp16 your hw
 doesn't do fp16 denorms - they'd be required by d3d(11) as well.)
>>>
>>> The hardware can do denormals, they are just slow on all chips except VI.
>>>
>>> GLSL 4.50 doesn't require denormals, thus piglit shouldn't even contain
>>> tests for them.
>>
>> I'm not asking about denormals for ordinary operations, just conversion
>> to fp16 (any fp16 denorm is a fp32 normal). That would be along similar
>> lines what d3d10 already required - forbids fp32 denorms, but for
>> instance sampling fp16 surfaces requires you to handle the fp16 denorms
>> without flushing to zero (that's at least what the docs say - maybe you
>> can get away without it...). FWIW x86 half-float conversion instructions
>> (vcvtph2ps, vcvtps2ph) work that way too - even if you have denorms
>> disabled, that instruction will still produce fp16 denorms (and convert
>> fp16 denorms to fp32 normals), you can configure rounding mode but
>> there's no way to disable fp16 denorms.
>> Albeit gcn3 supports fp16 natively (that is it has actual operations
>> using them not just conversion), so I suppose it makes sense it would
>> flush them on conversion too...
>> But I think you're right glsl shouldn't require them (albeit it is
>> pretty silent on that topic as far as I can tell - certainly doesn't
>> require them for fp32).
> 
> There is no point in enabling denormals for the conversion
> instructions if all previous FP instructions flush denormals to 0 and
> so do the following FP instructions. If the conversion instructions

Sure there is.  It is very common for applications to load fp16 values
from textures, buffers, etc. to reduce bandwidth, then convert the fp16
to fp32.  That is, after all, the entire reason these intrinsics were
added to GLSL.  That's also the reason for the DX requirements.

Or to put this another way... do you really want to debug an application
rendering bug related to this? :)

> were the only FP instructions manipulating with the source data in the
> shader, it would make sense, but this is a very rare use case of GLSL
> intrinsics that, to my knowledge, aren't used by any app. The 

Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-05 Thread Marek Olšák
On Fri, Feb 5, 2016 at 10:57 AM, Marek Olšák  wrote:
> On Fri, Feb 5, 2016 at 1:55 AM, Matt Turner  wrote:
>> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
>>> From: Marek Olšák 
>>>
>>> This is a subset of the generated tests which are known to fail
>>> on everything except CPU emulation (AFAIK).
>>> ---
>>
>> This is really awful. Committing a generated test, but with unknown
>> bits chopped out is gross.
>>
>> If it were me, I'd want to understand why my hardware behaved
>> differently -- not just hack up *different* tests and claim victory.
>>
>> FWIW, the generated tests pass on all Intel hardware exposing
>> ARB_shading_language_packing. Gen7+ has native half-float support, and
>> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
>> built-ins into a pile of instructions.
>>
>> If you can identify how AMD hardware behaves differently and can prove
>> that the generator needs to be relaxed or something, that's cool. But
>> as is, I hate this patch.
>>
>> I can't find anything in the AMD docs (I looked at GCN3) about
>> half-precision support, so I can't check my theory that AMD hardware
>> rounds towards zero instead of to-nearest/even like Intel.
>
> Since the tests only fail with very small numbers, I think the problem
> is that denorms are disabled by radeonsi. I can try to confirm that.
>
> The hardware rounds to nearest even. The hw precision is:
> - unpack functions - 0 ULP
> - pack functions = 0.5 ULP
> - input and output denorms are flushed to 0

Hey Matt,

I have just confirmed that I was right. After I enable denormals in
hw, the original tests pass. This means that this patch tests the
packing functions but skips denormals.

Not so awful now, is it? :)

Sadly, I can't enable denormals on all chips, because they are slow.

So if I add "-no-denormals" suffix into the test names, I can push this, right?

Marek
___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit


Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-04 Thread Roland Scheidegger
Am 05.02.2016 um 01:55 schrieb Matt Turner:
> On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
>> From: Marek Olšák 
>>
>> This is a subset of the generated tests which are known to fail
>> on everything except CPU emulation (AFAIK).
>> ---
> 
> This is really awful. Committing a generated test, but with unknown
> bits chopped out is gross.
> 
> If it were me, I'd want to understand why my hardware behaved
> differently -- not just hack up *different* tests and claim victory.
> 
> FWIW, the generated tests pass on all Intel hardware exposing
> ARB_shading_language_packing. Gen7+ has native half-float support, and
> Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
> built-ins into a pile of instructions.
I don't think that's surprising. The glsl lowering code (and the core
mesa half float conversion code) were written to match intel hw as far
as I know. And the tests were later written to match those rounding
semantics :-). But, the test passes on nvidia blob as well, so it's not
quite that unreasonable. It would be nice though if the test could tell
if it's failing because it doesn't work at all or because some rounding
is a bit different (which may or may not be allowed by glsl, maybe
someone can figure that out...).

Roland



> 
> If you can identify how AMD hardware behaves differently and can prove
> that the generator needs to be relaxed or something, that's cool. But
> as is, I hate this patch.
> 
> I can't find anything in the AMD docs (I looked at GCN3) about
> half-precision support, so I can't check my theory that AMD hardware
> rounds towards zero instead of to-nearest/even like Intel.
> ___
> Piglit mailing list
> Piglit@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/piglit
> 

___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit


Re: [Piglit] [Mesa-dev] [PATCH] Add (un)packHalf tests which don't fail on GCN

2016-02-04 Thread Matt Turner
On Thu, Feb 4, 2016 at 10:50 AM, Marek Olšák  wrote:
> From: Marek Olšák 
>
> This is a subset of the generated tests which are known to fail
> on everything except CPU emulation (AFAIK).
> ---

This is really awful. Committing a generated test, but with unknown
bits chopped out is gross.

If it were me, I'd want to understand why my hardware behaved
differently -- not just hack up *different* tests and claim victory.

FWIW, the generated tests pass on all Intel hardware exposing
ARB_shading_language_packing. Gen7+ has native half-float support, and
Gen6 uses the lowering code in lower_packing_builtins.cpp to turn the
built-ins into a pile of instructions.

If you can identify how AMD hardware behaves differently and can prove
that the generator needs to be relaxed or something, that's cool. But
as is, I hate this patch.

I can't find anything in the AMD docs (I looked at GCN3) about
half-precision support, so I can't check my theory that AMD hardware
rounds towards zero instead of to-nearest/even like Intel.
___
Piglit mailing list
Piglit@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/piglit