Hi Sam,

Yes, you can test Turboshaft Reducers directly. You can find examples in
test/unittests/compiler/turboshaft/
<https://source.chromium.org/chromium/chromium/src/+/main:v8/test/unittests/compiler/turboshaft/>.
Some examples:
  - Basic tests:
https://source.chromium.org/chromium/chromium/src/+/main:v8/test/unittests/compiler/turboshaft/control-flow-unittest.cc
  - Bit more complex tests:
https://source.chromium.org/chromium/chromium/src/+/main:v8/test/unittests/compiler/turboshaft/late-load-elimination-reducer-unittest.cc

You'll see that we don't have that many unittests for Reducers yet. As a
result, the framework may be lacking some useful features. If this happens,
feel free to add whatever you need :)

Cheers,
Darius
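
For readers following the thread: the "recursively pairwise" semantics
discussed in the quoted messages below can be sketched in plain C++ roughly
as follows. This is only an illustration under stated assumptions. The names
PairwiseAddStep and PairwiseReduce are hypothetical and are not V8 or
Turboshaft APIs, and the step helper does not model any particular
instruction exactly. It just adds adjacent lanes and halves the number of
active lanes until one value is left.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a V8 API): one pairwise step over the first
// `active` lanes. Lane i of the result holds v[2*i] + v[2*i + 1]; all
// remaining lanes are zero-initialised and ignored by the caller.
template <typename T, std::size_t N>
std::array<T, N> PairwiseAddStep(const std::array<T, N>& v,
                                 std::size_t active) {
  assert(active <= N && active % 2 == 0);
  std::array<T, N> out{};
  for (std::size_t i = 0; i < active / 2; ++i) {
    out[i] = v[2 * i] + v[2 * i + 1];
  }
  return out;
}

// Hypothetical helper (not a V8 API): repeats the pairwise step until a
// single lane remains. For four lanes this computes (a + b) + (c + d),
// which for floating point is not the same order as ((a + b) + c) + d.
template <typename T, std::size_t N>
T PairwiseReduce(std::array<T, N> v) {
  static_assert(N > 0 && (N & (N - 1)) == 0,
                "lane count must be a power of two");
  for (std::size_t active = N; active > 1; active /= 2) {
    v = PairwiseAddStep(v, active);
  }
  return v[0];
}
```

Fixing the evaluation order this way is what would let a backend pick
whichever instruction sequence gives the same result, as the quoted message
suggests for pairwise adds or a single across-lanes add for integers.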


On Fri, 12 Jul 2024 at 11:18, Sam Parker-Haynes <[email protected]> wrote:

> Hi,
>
> I think I've got a reasonable implementation of this, where I'm performing
> the reduction in machine-optimization-reducer.h. Is there a way of testing
> turboshaft reducers directly, or will I need to write a mjsunit test?
>
> cheers
>
> On Wednesday, June 5, 2024 at 4:34:50 PM UTC+1 Sam Parker-Haynes wrote:
>
>> Okay, good!!
>>
>> So, although I'm wanting to generate horizontal reduction operations, I'm
>> currently thinking about lowering these to pairwise instructions, such as
>> SSE/AVX haddp and Neon faddp. The semantics of the TS op will be of a
>> recursively pairwise operation so targets should be able to lower them to a
>> variety of optimised sequences, which does mean we'd be able to use addv
>> for ints on aarch64.
>>
>> Thanks again,
>> Sam
>>
>> On Wednesday, June 5, 2024 at 4:04:36 PM UTC+1 [email protected] wrote:
>>
>>> And one more thing that will be nicer in a Reducer than in the
>>> instruction selector: you don't have to worry about CanCover :o :o :o
>>>
>>> Btw, as far as I can tell, there are no corresponding Intel operations
>>> for vaddvq (which I guess is what you want to generate), but I think that
>>> it's still better in a reducer than in the ISEL directly. Maybe add an #ifdef
>>> V8_TARGET_ARCH_ARM64 around the arm64-specific opcodes that you define.
>>>
>>> Cheers,
>>> Darius
>>> On Wednesday, June 5, 2024 at 4:56:56 PM UTC+2 Matthias Liedtke wrote:
>>>
>>>> Hi,
>>>>
>>>> I quickly synced with Darius:
>>>> 1) In general it makes sense to do the matching on the graph itself
>>>> (i.e. in a reducer) assuming this is a generic pattern for which there
>>>> might also be specialized / optimized instructions on other architectures.
>>>> 2) Intel is working on a re-vectorization pass to replace 128 bit SIMD
>>>> operations with 256 bit SIMD operations. So, if these optimized "add +
>>>> shuffle" operations exist on intel as well, there would be a clear benefit
>>>> in doing it in a reducer that could then potentially run prior to the
>>>> revectorization (which would require additional modifications to the
>>>> revectorizer).
>>>>
>>>> In general it's advisable to have as few architecture-specific code
>>>> paths in the reducers as possible, so the operations shouldn't be
>>>> overfitted to some arm64-only instructions.
>>>> Still, having some SIMD operations with clear semantics in the graph
>>>> that only exist on some architectures is fine.
>>>>
>>>> I don't think the overhead of pattern matching on the graph is likely
>>>> to be more effort or slower than pattern matching during instruction
>>>> selection.
>>>> Given the complexity of arm64 and x64 ISel code, I'm happy about
>>>> anything that isn't added on top of that. :)
>>>>
>>>> Cheers,
>>>> Matthias
>>>>
>>>> On Wed, Jun 5, 2024 at 3:59 PM Sam Parker-Haynes <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I'd like to add some pattern matching, for Turboshaft, to recognise
>>>>> add + shuffle patterns which correspond to a horizontal pairwise 
>>>>> reduction.
>>>>> I've started doing this with wasm::SimdShuffle helpers and then during
>>>>> arm64 instruction selection, but it feels like the pattern matching should
>>>>> be done in a generic place too... So, I was thinking about adding
>>>>> four more kinds (I32x4, I64x2, F32x4 and F64x2 PairwiseReduction)
>>>>> to Simd128UnaryOp and then performing the combining in
>>>>> machine-optimization-reducer.
>>>>>
>>>>> Does this sound reasonable enough..? Or is the overhead of plumbing
>>>>> this into the TS IR likely going to be significantly more complicated than
>>>>> backend pattern matching?
>>>>>
>>>>> Thanks,
>>>>> Sam
>>>>>
>>>>> --
>>>>> --
>>>>> v8-dev mailing list
>>>>> [email protected]
>>>>> http://groups.google.com/group/v8-dev
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "v8-dev" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/v8-dev/2a9c3fcd-ee78-4877-9587-2ccb3b0a59e6n%40googlegroups.com
>>>>> <https://groups.google.com/d/msgid/v8-dev/2a9c3fcd-ee78-4877-9587-2ccb3b0a59e6n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>> .
>>>>>
