Hi there,
Nico and Matthias have pretty much said it all already, but for completeness
and to give an example, here are the CLs that ported the x64 instruction
selector unit tests:
- https://chromium-review.googlesource.com/c/v8/v8/+/5368860
- https://chromium-review.googlesource.com/c/v8/v8/+/5
And one more thing that will be nicer in a Reducer than in the instruction
selector: you don't have to worry about CanCover :o :o :o
Btw, as far as I can tell, there is no corresponding Intel operation
for vaddvq (which I guess is what you want to generate), but I think that
it's still better
Hi,
In general, LLVM and ahead-of-time compilers have all the time in the world
to optimize a function, while Turbofan tries to save every millisecond it
can (it's not quite true: LLVM also tries to compile somewhat quickly, but
it's orders of magnitude slower than Turbofan). As a result, Turbo
)" instead of "X / (C2 / C1)".
>
> Nah they probably meant just that. (X * 2) / 10 is X / (10 / 2) aka X / 5,
> not X * (10 / 2) aka X * 5.
>
> It would be the same as X * (C1 / C2) though, that's another possible
> simplification (if things don't ov
Hi,
Interestingly, even
function Foo(s) {
  return s + "42";
}
is enough to reach this case.
Basically, as Leszek said, this happens when we have generic JSAdd and we
can infer that one of the inputs is a String.
In your example ("a + b"), we don't go into this path because we cann
Hi,
> With a turbofan-only flow, the overflow operation and the projection will
be duplicated for the second branch, and everything is fine.
I'm not sure which phase could duplicate an Int32AddWithOverflow. I don't
think that the Instruction Selector does this. It could be an artifact of
schedu
Arf sorry, I read your initial message a bit too quickly and missed that
you were trying to remove a CanCover in VisitWordCompareZero.
> Surely we should be able to combine an operation even if it has multiple
users?
Yes, we should (but we don't). We've been thinking about relaxing some
CanCov
Hi,
First: a Turboshaft phase that uses OptimizationPhase creates a copy of the
graph. The old graph is referred to as input_graph and the newly created
graph is the output_graph. This works by reducing each operation one by one
(where each reduction creates a new operation in the output_graph)
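
To make the copying scheme concrete, here is a toy sketch of the idea. It is not actual Turboshaft code: `Op`, `Graph`, and `CopyWithReduction` are made-up names for illustration, and it assumes every operation only refers to inputs that appear earlier in the graph (so no loops). Each operation of the input graph is visited in order, reduced, and the result is emitted into the output graph; a mapping from old to new indices lets later operations find their already-copied inputs.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-ins for illustration only, not the real Turboshaft classes.
enum class Kind { kConstant, kAdd };

struct Op {
  Kind kind;
  int value = 0;        // used by kConstant
  std::size_t lhs = 0;  // input indices, used by kAdd
  std::size_t rhs = 0;
};

using Graph = std::vector<Op>;

// Builds a fresh output graph by reducing each operation of the input graph
// one by one. The input graph itself is never modified.
Graph CopyWithReduction(const Graph& input_graph) {
  Graph output_graph;
  // old_to_new[i] is the index in output_graph of the operation that
  // replaced input_graph[i].
  std::vector<std::size_t> old_to_new(input_graph.size());

  for (std::size_t i = 0; i < input_graph.size(); ++i) {
    const Op& op = input_graph[i];
    Op new_op = op;
    if (op.kind == Kind::kAdd) {
      // Inputs were visited earlier, so their new indices are already known.
      new_op.lhs = old_to_new[op.lhs];
      new_op.rhs = old_to_new[op.rhs];
      const Op& l = output_graph[new_op.lhs];
      const Op& r = output_graph[new_op.rhs];
      // Example reduction: fold an addition of two constants.
      if (l.kind == Kind::kConstant && r.kind == Kind::kConstant) {
        new_op = Op{Kind::kConstant, l.value + r.value};
      }
    }
    output_graph.push_back(new_op);
    old_to_new[i] = output_graph.size() - 1;
  }
  return output_graph;
}
```

Feeding it Constant(1), Constant(2), Add(#0, #1) yields an output graph whose third operation is a constant 3. The two original constants are still copied over, since this sketch does no dead code elimination.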
Hi,
From a quick look, it seems Simd128LaneMemoryOps are only created in
wasm/turboshaft-graph-interface.cc, and there, the offset is always 0.
Then, there are no phases at all that reduce (or even look at, for that
matter) Simd128LaneMemoryOp, so when we reach the ISEL, the offset should
sti
Hi Sam,
> This seems to suggest that block->loop_end() is the first block inside
the loop, is this true?! If so, I hope it's clear why this is confusing :)
The comment on GetLoopEndRpo explains what loop_end is:
// In Turbofan, the `block->loop_end()` refers to the first after
(outside)
//
Hi,
`raw_assembler()->IntPtrAddWithOverflow(a, b)` goes to `INTPTR_BINOP(Int,
AddWithOverflow)` in raw-machine-assembler.h line 639, which expands to:
```
Node* IntPtrAddWithOverflow(Node* a, Node* b) {
return kSystemPointerSize == 8 ? Int64AddWithOverflow(a, b)
Hi,
> have a dynamic branch looking at the CPU feature in the generated
assembly.
I would suggest doing this.
You can look for instance at `IncsspqIfSupported` in builtins-x64.cc to see
how you can retrieve CpuFeatures at runtime from builtins. And another
example is the `supports_wasm_simd_1