Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Andre McCurdy
On Tue, Jun 12, 2018 at 4:32 PM, Herve Jourdain  wrote:
> Hi Andre,
>
> I believe I did say always present on armv8 and armv7, I did not mean before 
> that.

Right. The point of considering older architecture levels was that IF
we were to drop support for armv4 (I'm not necessarily suggesting that
we do) then we could simplify the tuning files quite a lot, since then
every supported ARM core would support Thumb.

> Having separate tunes for thumb support was necessary on previous 
> architectures where it was optional, but it persisted for architectures which 
> made thumb mandatory.
> I’m not even advocating removing the tune option for previous architectures 
> that would normally not require it, but I believe we should get rid of it for 
> new ARM architectures.

We all seem to be strongly agreeing with each other :-)

BTW, I don't know how much history you're aware of from when this
topic has been discussed previously on the list (it's come up a few
times...). There are people who've created distros etc which do
actually rely on being able to define (for example) an armv7a machine
without Thumb support, even though such a thing doesn't actually
exist. So, as much as we might all agree that removing the option to
disable Thumb support for armv7a etc might be "the right thing to do",
in practice there are going to be people who object to it.

> Cheers,
> Herve
>
>> On 13 Jun 2018, at 04:39, Andre McCurdy  wrote:
>>
>>> On Tue, Jun 12, 2018 at 1:00 PM, Mark Hatle  
>>> wrote:
>>>> On 6/12/18 10:49 AM, Herve Jourdain wrote:
>>>> Hi,
>>>>
>>>> So I agree with you about restricting to what gcc can support, that's
>>>> actually my proposal (actually, probably a subset of what gcc can support).
>>>> So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a,
>>>> armv8.2-a, armv8.3-a, armv8.4-a.
>>>> Then, you can add the supported options with a "+" after the architecture.
>>>> Options supported for armv8-a are: '+crc', '+simd', '+crypto',
>>>> '+nocrypto', '+nofp'
>>>> Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto',
>>>> '+nofp'
>>>> Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml',
>>>> '+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
>>>> Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto',
>>>> '+dotprod', '+nocrypto', '+nofp'
>>>>
>>>> As you can see, proposals for armv8-a, whether my previous one, the new
>>>> one here, or even the one I have updated and used in production, just
>>>> capture the existing complexity and do not add to it.
>>>> Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add
>>>> more options down the line.
>>>
>>> Sounds a lot like the above would be TUNE_FEATURES to me..  (even if we
>>> don't necessarily define a tune that uses them -- if it's standard another
>>> layer certainly could.)
>>>
>>>> Regarding fpu, gcc supports the following for armv8: fp-armv8,
>>>> neon-fp-armv8, and crypto-neon-fp-armv8.
>>>>
>>>> Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’,
>>>> ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’,
>>>> ‘cortex-a73’, ‘cortex-a75’.
>>>>
>>>> I personally would like to keep tuning for a specific CPU as much as
>>>> possible (again I'm working closely with various ARM-based SoCs, so my
>>>> opinion might be tainted).
>>>
>>> That's a lot of options, but if we focus on TUNE_FEATURES, I think it's a bit
>>> more reasonable to support all of this.. (maybe that is what needs to be
>>> done in the future as well for other architectures.. focus on the 'gcc'
>>> behavior and generate TUNE_FEATURES matching the compiler.)
>>>
>>> I'd like Khem's opinion on how crazy of an idea that is.
>>>
>>>> One thing that could be done to simplify things would be to just use the
>>>> cpu, and add the options to it. Gcc supports adding options to the cpu.
>>>> '+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
>>>> '+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’,
>>>> ‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’
>>>>
>>>> That could simplify the tune settings, but would give less control than
>>>> what we currently have.
>>>> As you might have guessed, I do put a specific emphasis on the crypto
>>>> option, and on the neon option, which are the most interesting for armv8
>>>> in my opinion.
>>>
>>> In the past 'crypto' options have only been assembly.. if that's changed it
>>> has definitely opened up a new facet in all of this work.
>>>
>>>> Regarding thumb, always adding it to the tune without creating specific
>>>> variants with or without thumb makes sense, since the tune is normally
>>>> about the SoC capabilities, and armv7 and armv8 both support it.
>>>> You can always select whether you want thumb or not by setting
>>>> ARM_INSTRUCTION_SET appropriately at the distro level.
>>>
>>> Yes, that might be needed now that thumb is theoretically always supposed 
>>> to be
>>> 

Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Herve Jourdain
Hi Andre,

I believe I did say always present on armv8 and armv7, I did not mean before 
that.
Having separate tunes for thumb support was necessary on previous architectures 
where it was optional, but it persisted for architectures which made thumb 
mandatory.
I’m not even advocating removing the tune option for previous architectures 
that would normally not require it, but I believe we should get rid of it for 
new ARM architectures.

Cheers,
Herve

> On 13 Jun 2018, at 04:39, Andre McCurdy  wrote:
> 
>> On Tue, Jun 12, 2018 at 1:00 PM, Mark Hatle  wrote:
>>> On 6/12/18 10:49 AM, Herve Jourdain wrote:
>>> Hi,
>>> 
>>> So I agree with you about restricting to what gcc can support, that's 
>>> actually my proposal (actually, probably a subset of what gcc can support).
>>> So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a, 
>>> armv8.2-a, armv8.3-a, armv8.4-a.
>>> Then, you can add the supported options with a "+" after the architecture.
>>> Options supported for armv8-a are: '+crc', '+simd', '+crypto', '+nocrypto', 
>>> '+nofp'
>>> Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto', 
>>> '+nofp'
>>> Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml', 
>>> '+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
>>> Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto', 
>>> '+dotprod', '+nocrypto', '+nofp'
>>> 
>>> As you can see, proposals for armv8-a, whether my previous one, the new one
>>> here, or even the one I have updated and used in production, just capture
>>> the existing complexity and do not add to it.
>>> Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add
>>> more options down the line.
>> 
>> Sounds a lot like the above would be TUNE_FEATURES to me..  (even if we don't
>> necessarily define a tune that uses them -- if it's standard another layer
>> certainly could.)
>> 
>>> Regarding fpu, gcc supports the following for armv8: fp-armv8, 
>>> neon-fp-armv8, and crypto-neon-fp-armv8.
>>> 
>>> Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’, 
>>> ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’, 
>>> ‘cortex-a73’, ‘cortex-a75’.
>>> 
>>> I personally would like to keep tuning for a specific CPU as much as 
>>> possible (again I'm working closely with various ARM-based SoCs, so my 
>>> opinion might be tainted).
>> 
>> That's a lot of options, but if we focus on TUNE_FEATURES, I think it's a bit
>> more reasonable to support all of this.. (maybe that is what needs to be
>> done in the future as well for other architectures.. focus on the 'gcc'
>> behavior and generate TUNE_FEATURES matching the compiler.)
>> 
>> I'd like Khem's opinion on how crazy of an idea that is.
>> 
>>> One thing that could be done to simplify things would be to just use the 
>>> cpu, and add the options to it. Gcc supports adding options to the cpu.
>>> '+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
>>> '+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, 
>>> ‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’
>>> 
>>> That could simplify the tune settings, but would give less control than 
>>> what we currently have.
>>> As you might have guessed, I do put a specific emphasis on the crypto 
>>> option, and on the neon option, which are the most interesting for armv8 in 
>>> my opinion.
>> 
>> In the past 'crypto' options have only been assembly.. if that's changed it
>> has definitely opened up a new facet in all of this work.
>> 
>>> Regarding thumb, always adding it to the tune without creating specific 
>>> variants with or without thumb makes sense, since the tune is normally 
>>> about the SoC capabilities, and armv7 and armv8 both support it.
>>> You can always select whether you want thumb or not by setting 
>>> ARM_INSTRUCTION_SET appropriately at the distro level.
>> 
>> Yes, that might be needed now that thumb is theoretically always supposed to
>> be present.
> 
> It's not _always_ present - it's missing for armv4 CPUs such as StrongARM.
> 
> However the option has been unnecessarily propagated into tuning files
> for higher architecture levels where support for Thumb _is_ always
> present.

-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Andre McCurdy
On Tue, Jun 12, 2018 at 2:43 PM, Mark Hatle  wrote:
> On 6/12/18 3:39 PM, Andre McCurdy wrote:
>> On Tue, Jun 12, 2018 at 1:00 PM, Mark Hatle  wrote:
>>> On 6/12/18 10:49 AM, Herve Jourdain wrote:
>>>> Hi,
>>>>
>>>> So I agree with you about restricting to what gcc can support, that's
>>>> actually my proposal (actually, probably a subset of what gcc can support).
>>>> So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a,
>>>> armv8.2-a, armv8.3-a, armv8.4-a.
>>>> Then, you can add the supported options with a "+" after the architecture.
>>>> Options supported for armv8-a are: '+crc', '+simd', '+crypto',
>>>> '+nocrypto', '+nofp'
>>>> Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto',
>>>> '+nofp'
>>>> Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml',
>>>> '+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
>>>> Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto',
>>>> '+dotprod', '+nocrypto', '+nofp'
>>>>
>>>> As you can see, proposals for armv8-a, whether my previous one, the new
>>>> one here, or even the one I have updated and used in production, just
>>>> capture the existing complexity and do not add to it.
>>>> Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add
>>>> more options down the line.
>>>
>>> Sounds a lot like the above would be TUNE_FEATURES to me..  (even if we
>>> don't necessarily define a tune that uses them -- if it's standard another
>>> layer certainly could.)
>>>
>>>> Regarding fpu, gcc supports the following for armv8: fp-armv8,
>>>> neon-fp-armv8, and crypto-neon-fp-armv8.
>>>>
>>>> Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’,
>>>> ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’,
>>>> ‘cortex-a73’, ‘cortex-a75’.
>>>>
>>>> I personally would like to keep tuning for a specific CPU as much as
>>>> possible (again I'm working closely with various ARM-based SoCs, so my
>>>> opinion might be tainted).
>>>
>>> That's a lot of options, but if we focus on TUNE_FEATURES, I think it's a bit
>>> more reasonable to support all of this.. (maybe that is what needs to be
>>> done in the future as well for other architectures.. focus on the 'gcc'
>>> behavior and generate TUNE_FEATURES matching the compiler.)
>>>
>>> I'd like Khem's opinion on how crazy of an idea that is.
>>>
>>>> One thing that could be done to simplify things would be to just use the
>>>> cpu, and add the options to it. Gcc supports adding options to the cpu.
>>>> '+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
>>>> '+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’,
>>>> ‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’
>>>>
>>>> That could simplify the tune settings, but would give less control than
>>>> what we currently have.
>>>> As you might have guessed, I do put a specific emphasis on the crypto
>>>> option, and on the neon option, which are the most interesting for armv8
>>>> in my opinion.
>>>
>>> In the past 'crypto' options have only been assembly.. if that's changed it
>>> has definitely opened up a new facet in all of this work.
>>>
>>>> Regarding thumb, always adding it to the tune without creating specific
>>>> variants with or without thumb makes sense, since the tune is normally
>>>> about the SoC capabilities, and armv7 and armv8 both support it.
>>>> You can always select whether you want thumb or not by setting
>>>> ARM_INSTRUCTION_SET appropriately at the distro level.
>>>
>>> Yes, that might be needed now that thumb is theoretically always supposed
>>> to be present.
>>
>> It's not _always_ present - it's missing for armv4 CPUs such as StrongARM.
>
> Always present on -modern- ARM processors.. ARMv7 (Cortex) and newer AFAIK.
> I'm not referring to older cores.

OK. Thanks for clarifying.

>> However the option has been unnecessarily propagated into tuning files
>> for higher architecture levels where support for Thumb _is_ always
>> present.
>>
>


Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Mark Hatle
On 6/12/18 3:39 PM, Andre McCurdy wrote:
> On Tue, Jun 12, 2018 at 1:00 PM, Mark Hatle  wrote:
>> On 6/12/18 10:49 AM, Herve Jourdain wrote:
>>> Hi,
>>>
>>> So I agree with you about restricting to what gcc can support, that's 
>>> actually my proposal (actually, probably a subset of what gcc can support).
>>> So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a, 
>>> armv8.2-a, armv8.3-a, armv8.4-a.
>>> Then, you can add the supported options with a "+" after the architecture.
>>> Options supported for armv8-a are: '+crc', '+simd', '+crypto', '+nocrypto', 
>>> '+nofp'
>>> Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto', 
>>> '+nofp'
>>> Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml', 
>>> '+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
>>> Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto', 
>>> '+dotprod', '+nocrypto', '+nofp'
>>>
>>> As you can see, proposals for armv8-a, whether my previous one, the new one
>>> here, or even the one I have updated and used in production, just capture
>>> the existing complexity and do not add to it.
>>> Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add
>>> more options down the line.
>>
>> Sounds a lot like the above would be TUNE_FEATURES to me..  (even if we don't
>> necessarily define a tune that uses them -- if it's standard another layer
>> certainly could.)
>>
>>> Regarding fpu, gcc supports the following for armv8: fp-armv8, 
>>> neon-fp-armv8, and crypto-neon-fp-armv8.
>>>
>>> Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’, 
>>> ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’, 
>>> ‘cortex-a73’, ‘cortex-a75’.
>>>
>>> I personally would like to keep tuning for a specific CPU as much as 
>>> possible (again I'm working closely with various ARM-based SoCs, so my 
>>> opinion might be tainted).
>>
>> That's a lot of options, but if we focus on TUNE_FEATURES, I think it's a bit
>> more reasonable to support all of this.. (maybe that is what needs to be
>> done in the future as well for other architectures.. focus on the 'gcc'
>> behavior and generate TUNE_FEATURES matching the compiler.)
>>
>> I'd like Khem's opinion on how crazy of an idea that is.
>>
>>> One thing that could be done to simplify things would be to just use the 
>>> cpu, and add the options to it. Gcc supports adding options to the cpu.
>>> '+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
>>> '+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, 
>>> ‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’
>>>
>>> That could simplify the tune settings, but would give less control than 
>>> what we currently have.
>>> As you might have guessed, I do put a specific emphasis on the crypto 
>>> option, and on the neon option, which are the most interesting for armv8 in 
>>> my opinion.
>>
>> In the past 'crypto' options have only been assembly.. if that's changed it
>> has definitely opened up a new facet in all of this work.
>>
>>> Regarding thumb, always adding it to the tune without creating specific 
>>> variants with or without thumb makes sense, since the tune is normally 
>>> about the SoC capabilities, and armv7 and armv8 both support it.
>>> You can always select whether you want thumb or not by setting 
>>> ARM_INSTRUCTION_SET appropriately at the distro level.
>>
>> Yes, that might be needed now that thumb is theoretically always supposed to
>> be present.
> 
> It's not _always_ present - it's missing for armv4 CPUs such as StrongARM.

Always present on -modern- ARM processors.. ARMv7 (Cortex) and newer AFAIK.  I'm
not referring to older cores.

> However the option has been unnecessarily propagated into tuning files
> for higher architecture levels where support for Thumb _is_ always
> present.
> 



Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Andre McCurdy
On Tue, Jun 12, 2018 at 1:00 PM, Mark Hatle  wrote:
> On 6/12/18 10:49 AM, Herve Jourdain wrote:
>> Hi,
>>
>> So I agree with you about restricting to what gcc can support, that's 
>> actually my proposal (actually, probably a subset of what gcc can support).
>> So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a, armv8.2-a, 
>> armv8.3-a, armv8.4-a.
>> Then, you can add the supported options with a "+" after the architecture.
>> Options supported for armv8-a are: '+crc', '+simd', '+crypto', '+nocrypto', 
>> '+nofp'
>> Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto', '+nofp'
>> Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml', 
>> '+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
>> Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto', 
>> '+dotprod', '+nocrypto', '+nofp'
>>
>> As you can see, proposals for armv8-a, whether my previous one, the new one
>> here, or even the one I have updated and used in production, just capture
>> the existing complexity and do not add to it.
>> Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add more
>> options down the line.
>
> Sounds a lot like the above would be TUNE_FEATURES to me..  (even if we don't
> necessarily define a tune that uses them -- if it's standard another layer
> certainly could.)
>
>> Regarding fpu, gcc supports the following for armv8: fp-armv8, 
>> neon-fp-armv8, and crypto-neon-fp-armv8.
>>
>> Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’, 
>> ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’, 
>> ‘cortex-a73’, ‘cortex-a75’.
>>
>> I personally would like to keep tuning for a specific CPU as much as 
>> possible (again I'm working closely with various ARM-based SoCs, so my 
>> opinion might be tainted).
>
> That's a lot of options, but if we focus on TUNE_FEATURES, I think it's a bit
> more reasonable to support all of this.. (maybe that is what needs to be done
> in the future as well for other architectures.. focus on the 'gcc' behavior
> and generate TUNE_FEATURES matching the compiler.)
>
> I'd like Khem's opinion on how crazy of an idea that is.
>
>> One thing that could be done to simplify things would be to just use the 
>> cpu, and add the options to it. Gcc supports adding options to the cpu.
>> '+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
>> '+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, 
>> ‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’
>>
>> That could simplify the tune settings, but would give less control than what 
>> we currently have.
>> As you might have guessed, I do put a specific emphasis on the crypto 
>> option, and on the neon option, which are the most interesting for armv8 in 
>> my opinion.
>
> In the past 'crypto' options have only been assembly.. if that's changed it
> has definitely opened up a new facet in all of this work.
>
>> Regarding thumb, always adding it to the tune without creating specific 
>> variants with or without thumb makes sense, since the tune is normally about 
>> the SoC capabilities, and armv7 and armv8 both support it.
>> You can always select whether you want thumb or not by setting 
>> ARM_INSTRUCTION_SET appropriately at the distro level.
>
> Yes, that might be needed now that thumb is theoretically always supposed to
> be present.

It's not _always_ present - it's missing for armv4 CPUs such as StrongARM.

However the option has been unnecessarily propagated into tuning files
for higher architecture levels where support for Thumb _is_ always
present.


Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Mark Hatle
On 6/12/18 10:49 AM, Herve Jourdain wrote:
> Hi,
> 
> So I agree with you about restricting to what gcc can support, that's 
> actually my proposal (actually, probably a subset of what gcc can support).
> So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a, armv8.2-a, 
> armv8.3-a, armv8.4-a.
> Then, you can add the supported options with a "+" after the architecture.
> Options supported for armv8-a are: '+crc', '+simd', '+crypto', '+nocrypto', 
> '+nofp'
> Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto', '+nofp'
> Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml', 
> '+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
> Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto', '+dotprod', 
> '+nocrypto', '+nofp'
> 
> As you can see, proposals for armv8-a, whether my previous one, the new one
> here, or even the one I have updated and used in production, just capture the
> existing complexity and do not add to it.
> Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add more
> options down the line.

Sounds a lot like the above would be TUNE_FEATURES to me..  (even if we don't
necessarily define a tune that uses them -- if it's standard another layer
certainly could.)

> Regarding fpu, gcc supports the following for armv8: fp-armv8, neon-fp-armv8, 
> and crypto-neon-fp-armv8.
> 
> Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’, 
> ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’, 
> ‘cortex-a73’, ‘cortex-a75’.
> 
> I personally would like to keep tuning for a specific CPU as much as possible 
> (again I'm working closely with various ARM-based SoCs, so my opinion might 
> be tainted).

That's a lot of options, but if we focus on TUNE_FEATURES, I think it's a bit
more reasonable to support all of this.. (maybe that is what needs to be done in
the future as well for other architectures.. focus on the 'gcc' behavior and
generate TUNE_FEATURES matching the compiler.)
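
A minimal sketch of that idea, written in Python rather than BitBake. It assumes hypothetical TUNE_FEATURES names (`crc`, `simd`, `crypto`) that mirror gcc's '+' options; neither those names nor the helper `march_from_tune_features` come from the patch series.

```python
# Hypothetical TUNE_FEATURES names, chosen to match gcc's '+' extension names.
# The tuple order fixes the order in which extensions are appended.
GCC_EXTENSION_FEATURES = ("crc", "simd", "crypto")


def march_from_tune_features(tune_features, base_arch="armv8-a"):
    """Build a -march= string from a space-separated TUNE_FEATURES value.

    Tokens that are not in GCC_EXTENSION_FEATURES (for example a feature
    naming the base architecture) are ignored here.
    """
    enabled = set(tune_features.split())
    extensions = [f for f in GCC_EXTENSION_FEATURES if f in enabled]
    return "-march=" + base_arch + "".join("+" + ext for ext in extensions)


print(march_from_tune_features("armv8a crc crypto"))  # -march=armv8-a+crc+crypto
```

With a mapping of this shape, another layer could define a tune simply by listing features, without oe-core having to ship one tune per combination.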

I'd like Khem's opinion on how crazy of an idea that is.

> One thing that could be done to simplify things would be to just use the cpu, 
> and add the options to it. Gcc supports adding options to the cpu.
> '+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
> '+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, 
> ‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’
> 
> That could simplify the tune settings, but would give less control than what 
> we currently have.
> As you might have guessed, I do put a specific emphasis on the crypto option, 
> and on the neon option, which are the most interesting for armv8 in my 
> opinion.

In the past 'crypto' options have only been assembly.. if that's changed it has
definitely opened up a new facet in all of this work.

> Regarding thumb, always adding it to the tune without creating specific 
> variants with or without thumb makes sense, since the tune is normally about 
> the SoC capabilities, and armv7 and armv8 both support it.
> You can always select whether you want thumb or not by setting 
> ARM_INSTRUCTION_SET appropriately at the distro level.

Yes, that might be needed now that thumb is theoretically always supposed to be
present.

--Mark

> Cheers,
> Herve
> 
> -----Original Message-----
> From: Mark Hatle [mailto:mark.ha...@windriver.com]
> Sent: Tuesday, 12 June 2018 16:32
> To: Herve Jourdain; 'Koen Kooi'; 'Randy Li'
> Cc: 'OE-core'
> Subject: Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex 
> processors
> 
> On 6/12/18 4:30 AM, Herve Jourdain wrote:
>> Hi,
>>
>> I believe I'm the "original author" of some patch attempt at tackling this 
>> problem, more than a year ago, as referenced in this series.
>> And I understand why everyone, Khem being the first and not the only one, 
>> would like some "simpler" things for ARM.
>> But the problem is that ARM-based SoCs are very diverse, and ARM does have a 
>> number of optional IP blocks (such as crypto, but neon is another one, and 
>> there are others), defined for each architecture. Then ARM defines some 
>> "standard" SoCs (like cortex-A53, cortex-A57, ...) which may set some of 
>> those optional IPs as required for that SoC, and the rest still as optional.
>> And SoC vendors decide what optional IPs they will implement or not...
> 
> Simplification is a goal in this, but as you said, not always reasonable with 
> a processor designed to be customized.
> 
> Typically true customization (vendor specific) doesn't belong in the oe-core 
> tune files, but stuff that is architecturally defined may.
> 
>> So when we're talking "cortex-A53

Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Herve Jourdain
Hi,

So I agree with you about restricting to what gcc can support, that's actually 
my proposal (actually, probably a subset of what gcc can support).
So for armv8, gcc supports, as architectures: armv8-a, armv8.1-a, armv8.2-a, 
armv8.3-a, armv8.4-a.
Then, you can add the supported options with a "+" after the architecture.
Options supported for armv8-a are: '+crc', '+simd', '+crypto', '+nocrypto', 
'+nofp'
Options supported for armv8.1-a are: '+simd', '+crypto', '+nocrypto', '+nofp'
Options supported for armv8.2-a and armv8.3-a are: '+fp16', '+fp16fml', 
'+simd', '+crypto', '+dotprod', '+nocrypto', '+nofp'
Options supported for armv8.4-a are: '+fp16', '+simd', '+crypto', '+dotprod', 
'+nocrypto', '+nofp'
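
A rough sketch of how those per-architecture option lists could be checked before building a `-march=` value. The table is transcribed from the lists above and has not been verified against any particular gcc release; the names `MARCH_EXTENSIONS` and `march_flag` are made up for the illustration.

```python
# Options per architecture, as listed above (not checked against a gcc release).
_V82_V83 = {"fp16", "fp16fml", "simd", "crypto", "dotprod", "nocrypto", "nofp"}
MARCH_EXTENSIONS = {
    "armv8-a": {"crc", "simd", "crypto", "nocrypto", "nofp"},
    "armv8.1-a": {"simd", "crypto", "nocrypto", "nofp"},
    "armv8.2-a": _V82_V83,
    "armv8.3-a": _V82_V83,
    "armv8.4-a": {"fp16", "simd", "crypto", "dotprod", "nocrypto", "nofp"},
}


def march_flag(arch, extensions=()):
    """Return '-march=<arch>+<ext>...', rejecting options not listed for arch."""
    allowed = MARCH_EXTENSIONS.get(arch)
    if allowed is None:
        raise ValueError("unknown architecture: " + arch)
    rejected = [ext for ext in extensions if ext not in allowed]
    if rejected:
        raise ValueError(arch + " does not list: " + ", ".join(rejected))
    return "-march=" + arch + "".join("+" + ext for ext in extensions)


print(march_flag("armv8-a", ["crc", "crypto"]))  # -march=armv8-a+crc+crypto
```

Asking for '+crc' on armv8.1-a raises an error in this sketch, simply because the list above does not mention it for that architecture.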

As you can see, proposals for armv8-a, whether my previous one, the new one
here, or even the one I have updated and used in production, just capture the
existing complexity and do not add to it.
Support for armv8.1-a, armv8.2-a, armv8.3-a, armv8.4-a will only add more
options down the line.

Regarding fpu, gcc supports the following for armv8: fp-armv8, neon-fp-armv8, 
and crypto-neon-fp-armv8.

Regarding cpu, I believe that the armv8 supported ones are: ‘cortex-a32’, 
‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, ‘cortex-a57’, ‘cortex-a72’, 
‘cortex-a73’, ‘cortex-a75’.

I personally would like to keep tuning for a specific CPU as much as possible 
(again I'm working closely with various ARM-based SoCs, so my opinion might be 
tainted).

One thing that could be done to simplify things would be to just use the cpu, 
and add the options to it. Gcc supports adding options to the cpu.
'+nofp' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’ and ‘cortex-a55’
'+crypto' for ‘cortex-a32’, ‘cortex-a35’, ‘cortex-a53’, ‘cortex-a55’, 
‘cortex-a57’, ‘cortex-a72’, ‘cortex-a73’, ‘cortex-a75’

That could simplify the tune settings, but would give less control than what we 
currently have.
As you might have guessed, I do put a specific emphasis on the crypto option, 
and on the neon option, which are the most interesting for armv8 in my opinion.

Regarding thumb, always adding it to the tune without creating specific 
variants with or without thumb makes sense, since the tune is normally about 
the SoC capabilities, and armv7 and armv8 both support it.
You can always select whether you want thumb or not by setting 
ARM_INSTRUCTION_SET appropriately at the distro level.
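
To illustrate that split between tune and distro, here is a small sketch. It is only a model of the idea and does not reproduce OE-core's actual logic; the function name and the "thumb" feature token are assumptions. The tune states that Thumb is available, and ARM_INSTRUCTION_SET picks between gcc's `-mthumb` and `-marm`.

```python
def instruction_set_flag(tune_features, arm_instruction_set="arm"):
    """Pick -mthumb or -marm.

    tune_features: space-separated features of the tune (what the SoC can do).
    arm_instruction_set: the distro-level choice, "arm" or "thumb".
    Thumb is only used when the distro asks for it and the tune offers it.
    """
    has_thumb = "thumb" in tune_features.split()
    if arm_instruction_set == "thumb" and has_thumb:
        return "-mthumb"
    return "-marm"


print(instruction_set_flag("armv7a thumb", "thumb"))  # -mthumb
print(instruction_set_flag("armv7a thumb", "arm"))    # -marm
print(instruction_set_flag("armv4", "thumb"))         # -marm
```

If "thumb" is always part of the armv7 and armv8 tunes, the separate with-thumb and without-thumb tune variants carry no extra information in this model: the distro setting alone decides.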

Cheers,
Herve

-----Original Message-----
From: Mark Hatle [mailto:mark.ha...@windriver.com]
Sent: Tuesday, 12 June 2018 16:32
To: Herve Jourdain; 'Koen Kooi'; 'Randy Li'
Cc: 'OE-core'
Subject: Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex 
processors

On 6/12/18 4:30 AM, Herve Jourdain wrote:
> Hi,
> 
> I believe I'm the "original author" of some patch attempt at tackling this 
> problem, more than a year ago, as referenced in this series.
> And I understand why everyone, Khem being the first and not the only one, 
> would like some "simpler" things for ARM.
> But the problem is that ARM-based SoCs are very diverse, and ARM does have a 
> number of optional IP blocks (such as crypto, but neon is another one, and 
> there are others), defined for each architecture. Then ARM defines some 
> "standard" SoCs (like cortex-A53, cortex-A57, ...) which may set some of 
> those optional IPs as required for that SoC, and the rest still as optional.
> And SoC vendors decide what optional IPs they will implement or not...

Simplification is a goal in this, but as you said, not always reasonable with a 
processor designed to be customized.

Typically true customization (vendor specific) doesn't belong in the oe-core 
tune files, but stuff that is architecturally defined may.

> So when we're talking "cortex-A53", it's not necessarily the same cortex-A53 
> for all SoC vendors.
> 
> GCC does support all that complexity. So the main question is, do we want to 
> be able to generate code that could take advantage of the optional IPs 
> present on a SoC? Or do we prefer to settle for the least common denominator?

I think this is the key.  What combinations does GCC support (actually generate
code for?)   If GCC can't generate code for that combination, then I don't
believe it belongs as a tune in OE-Core, unless there is a compelling argument 
that assembly level functions will be common enough to justify it.

> As someone who is close to the SoC, I definitely would prefer to be able to 
> take advantage of the optional IPs present on an ARM SoC, and I'd rather have 
> a system that can at least support that even if it's slightly more complex. 
> This said, once it's done, most people won't look under the hood but just use 
> it, so the complexity would end up being hidden - much like now with armv7.

And this is why my GCC statement is being made.  Most developers will define a 
tune, but will never go into the assembly realm.  Th

Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Mark Hatle
> within a single file.

IF the instruction scheduling, generated instructions, optimizations, etc are
truly different.. then we should call them armv81a, etc..  (I don't believe we
can use a '.' for various reasons..)   But if there is no difference in the
compiler behavior, or the generated code.. and it's just assembly level
instruction additions -- then I'm reluctant to add these tunes as they can give
a false impression.
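
A sketch of that naming rule, assuming the tune name is simply gcc's architecture string with the '.' and '-' removed (the helper name is hypothetical):

```python
def tune_name(march):
    """Turn a gcc architecture name into a tune name without '.' or '-'."""
    return march.replace(".", "").replace("-", "")


print(tune_name("armv8.1-a"))  # armv81a
print(tune_name("armv8-a"))    # armv8a
```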

> Thoughts? Can we talk this over, so we can have a chance to have a good 
> support for armv8-32 in oe, instead of everyone doing their own?
> 
> Cheers,
> Herve
> 
> -----Original Message-----
> From: openembedded-core-boun...@lists.openembedded.org
> [mailto:openembedded-core-boun...@lists.openembedded.org] On Behalf Of
> Koen Kooi
> Sent: Tuesday, 12 June 2018 11:01
> To: Randy Li 
> Cc: OE-core 
> Subject: Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex 
> processors
> 
> 
> 
>> On 9 Jun 2018, at 08:26, Randy Li wrote:
>>
>> I read the ARMv8 manual again, and it looks like hardware float is 
>> mandatory in Linux distributions and toolchain libraries. Some 
>> Cortex processors can be configured without FPU/NEON hardware, but I 
>> don't think they would be used in openembedded-core.
>>
>> So I can assume that NEON (SIMD) is always present, leaving only 
>> the crc and crypto instructions as optional here.
>>
>>
>> Randy Li (4):
>>  arch-armv8a.inc: add tune include for armv8
>>  tune-cortexa35: add tunes for ARM Cortex-A35
>>  tune-cortexa32: add tunes for ARM Cortex-A32
>>  tune-cortexa72: add tunes for ARM Cortex-A72
> 
> Having been forced to deal with the mess that’s 32-bit arm tunes: Let’s only 
> add implementation-specific tunes *after* having seen conclusive, 
> repeatable benchmark results. 90% of the 32-bit tune files are placebo effect 
> and just explode the number of package archs in your distro feed. The goal of 
> aarch64 was to stop being different for the sake of being different; let’s 
> not make a mess because we are used to messes.
> 
> regards,
> 
> Koen

-- 
___
Openembedded-core mailing list
Openembedded-core@lists.openembedded.org
http://lists.openembedded.org/mailman/listinfo/openembedded-core


Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Herve Jourdain
Hi,

I believe I'm the "original author" of some patch attempt at tackling this 
problem, more than a year ago, as referenced in this series.
And I understand why everyone, Khem being the first and not the only one, would 
like some "simpler" things for ARM.
But the problem is that ARM-based SoCs are very diverse, and ARM does have a 
number of optional IP blocks (such as crypto, but neon is another one, and 
there are others), defined for each architecture. Then ARM defines some 
"standard" SoCs (like cortex-A53, cortex-A57, ...) which may set some of those 
optional IPs as required for that SoC, and the rest still as optional.
And SoC vendors decide what optional IPs they will implement or not...

So when we're talking "cortex-A53", it's not necessarily the same cortex-A53 
for all SoC vendors.

GCC does support all that complexity. So the main question is, do we want to be 
able to generate code that could take advantage of the optional IPs present on 
a SoC? Or do we prefer to settle for the least common denominator?
As someone who is close to the SoC, I definitely would prefer to be able to 
take advantage of the optional IPs present on an ARM SoC, and I'd rather have a 
system that can at least support that even if it's slightly more complex. This 
said, once it's done, most people won't look under the hood but just use it, so 
the complexity would end up being hidden - much like now with armv7.

I've personally followed up on my patches from last year, and I now have a 
slightly modified/simplified version of them, which I've used to build some 
production-ready environments using cortex-a53/armv8 tunes, that trigger the 
optimization for cortex-a53 + neon. And if the SoC I'm working with had the 
crypto extension, I would be very happy to build for it, by just switching the 
tune I use for my cortex-a53 to the armv8 tune supporting crypto.

So I believe now may be a good time to talk this over again, because we're 
basically building for cortex-a53 with cortexa7/armv7ve, and that is not 
optimal in my opinion (for example, some instructions that were 
native in armv7ve are simulated in armv8).

One thing that I did come up with as a simplification was the handling of thumb. 
I don't think it needs to be an option anymore, since its support is mandatory in 
armv8 (but I think that was also the case in armv7). That simplifies things a 
bit, but changes nothing fundamental: you still need to carry the support for the 
optional IPs around...
And in addition to what I proposed to support last year, we now also have to 
add armv8.1a, armv8.2a, armv8.3a, armv8.4a (so far...), each of which has its 
own specificities/differences that make it unlikely they can be supported within a 
single file.

Thoughts? Can we talk this over, so we have a chance at good support 
for armv8-32 in oe, instead of everyone doing their own?

Cheers,
Herve

-----Original Message-----
From: openembedded-core-boun...@lists.openembedded.org 
[mailto:openembedded-core-boun...@lists.openembedded.org] On Behalf Of Koen Kooi
Sent: Tuesday, 12 June 2018 11:01
To: Randy Li 
Cc: OE-core 
Subject: Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex 
processors



> On 9 Jun 2018, at 08:26, Randy Li wrote:
> 
> I read the ARMv8 manual again, and it looks like hardware float is 
> mandatory in Linux distributions and toolchain libraries. Some 
> Cortex processors can be configured without FPU/NEON hardware, but I 
> don't think they would be used in openembedded-core.
> 
> So I can assume that NEON (SIMD) is always present, leaving only 
> the crc and crypto instructions as optional here.
> 
> 
> Randy Li (4):
>  arch-armv8a.inc: add tune include for armv8
>  tune-cortexa35: add tunes for ARM Cortex-A35
>  tune-cortexa32: add tunes for ARM Cortex-A32
>  tune-cortexa72: add tunes for ARM Cortex-A72

Having been forced to deal with the mess that’s 32-bit arm tunes: Let’s only 
add implementation-specific tunes *after* having seen conclusive, repeatable 
benchmark results. 90% of the 32-bit tune files are placebo effect and just 
explode the number of package archs in your distro feed. The goal of aarch64 was to 
stop being different for the sake of being different; let’s not make a mess 
because we are used to messes.

regards,

Koen


Re: [OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-12 Thread Koen Kooi


> On 9 Jun 2018, at 08:26, Randy Li wrote:
> 
> I read the ARMv8 manual again, and it looks like hardware float is mandatory
> in Linux distributions and toolchain libraries. Some Cortex
> processors can be configured without FPU/NEON hardware, but I don't
> think they would be used in openembedded-core.
> 
> So I can assume that NEON (SIMD) is always present, leaving only the
> crc and crypto instructions as optional here.
> 
> 
> Randy Li (4):
>  arch-armv8a.inc: add tune include for armv8
>  tune-cortexa35: add tunes for ARM Cortex-A35
>  tune-cortexa32: add tunes for ARM Cortex-A32
>  tune-cortexa72: add tunes for ARM Cortex-A72

Having been forced to deal with the mess that’s 32-bit arm tunes: Let’s only 
add implementation-specific tunes *after* having seen conclusive, repeatable 
benchmark results. 90% of the 32-bit tune files are placebo effect and just 
explode the number of package archs in your distro feed. The goal of aarch64 was to 
stop being different for the sake of being different; let’s not make a mess 
because we are used to messes.
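
A small sketch of why the number of package archs grows so quickly: if every
architecture level may be combined with every subset of optional features,
each added feature doubles the count. The arch and feature names below are
placeholders taken from this thread, not an actual list of OE-core package
archs, and the helper names are hypothetical.

```python
from itertools import combinations

def feature_sets(optional):
    # Every subset of the optional features, from none to all of them.
    for size in range(len(optional) + 1):
        yield from combinations(optional, size)

def candidate_package_archs(arch_levels, optional):
    # One candidate name per (arch level, feature subset) pair.
    names = []
    for arch in arch_levels:
        for features in feature_sets(optional):
            names.append("-".join((arch,) + features))
    return names

archs = candidate_package_archs(
    ["armv8a", "armv81a", "armv82a", "armv83a", "armv84a"],
    ["crc", "crypto"],
)
print(len(archs))  # 5 arch levels x 4 feature subsets = 20
```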

regards,

Koen


[OE-core] [PATCH v2 0/4] Add tune for ARMv8 and some cortex processors

2018-06-09 Thread Randy Li
I read the ARMv8 manual again, and it looks like hardware float is mandatory
in Linux distributions and toolchain libraries. Some Cortex
processors can be configured without FPU/NEON hardware, but I don't
think they would be used in openembedded-core.

So I can assume that NEON (SIMD) is always present, leaving only the
crc and crypto instructions as optional here.


Randy Li (4):
  arch-armv8a.inc: add tune include for armv8
  tune-cortexa35: add tunes for ARM Cortex-A35
  tune-cortexa32: add tunes for ARM Cortex-A32
  tune-cortexa72: add tunes for ARM Cortex-A72

 meta/conf/machine/include/arm/arch-armv8.inc  |  1 -
 meta/conf/machine/include/arm/arch-armv8a.inc | 22 ++
 meta/conf/machine/include/tune-cortexa32.inc  | 15 +++
 meta/conf/machine/include/tune-cortexa35.inc  | 15 +++
 meta/conf/machine/include/tune-cortexa72.inc  | 12 
 5 files changed, 64 insertions(+), 1 deletion(-)
 delete mode 100644 meta/conf/machine/include/arm/arch-armv8.inc
 create mode 100644 meta/conf/machine/include/arm/arch-armv8a.inc
 create mode 100644 meta/conf/machine/include/tune-cortexa32.inc
 create mode 100644 meta/conf/machine/include/tune-cortexa35.inc
 create mode 100644 meta/conf/machine/include/tune-cortexa72.inc

-- 
2.14.3
