Hi Sam, At least for now such a simplified setup for assembling Turboshaft instructions manually in a gtest and running the instruction selection on it doesn't exist. It should probably be feasible to define a graph builder for such test cases that basically just provides access to the turboshaft::Assembler. The test environment would need to provide some meaningful turboshaft::PipelineData.
On our side this hasn't been a priority yet. Best regards, Matthias On Thu, Nov 9, 2023 at 2:22 PM Sam Parker-Haynes <[email protected]> wrote: > Aaaaaaah, thank you! Well, now it seems obvious :) > > I think, for now, I will leave the continuation alone and special-case > projections to still require CanCover. > > My question remains around testing. I can currently test specific isel > patterns for Turbofan, is there anything like that for Turboshaft? > > Thanks again, > Sam > > On Thursday, November 9, 2023 at 12:00:07 PM UTC [email protected] > wrote: > >> Arf sorry, I read a bit too quickly your initial message and missed that >> you were trying to remove a CanCover in VisitWordCompareZero. >> >> > Surely we should be able to combine an operation even if it has >> multiple users? >> >> Yes, we should (but we don't). We've been thinking about relaxing some >> CanCovers for this purpose for a while, but haven't had time to get to it >> just yet. It's not surprising that it currently doesn't work, but it would >> be nice to make it work. >> >> > So could Overflow operations and/or Projections[1] be special-cased..? >> And why is there a difference here with a normal comparison? >> >> First, I think that very few operations besides Overflow binops have >> projections. Only Calls come to mind, and they are fairly different with >> regard to VisitWordCompareZero and CanCover (you probably never want to >> cover a Call). So, Overflow binops are different in the sense that they >> have 2 outputs and in particular an overflow output. >> >> Then, for a more useful explanation of what it going on: >> In VisitWordCompareZero, the continuation is a ForBranch when coming from >> a VisitBranch (which is your case). With such continuation, the overflow >> bit is never Defined even if it's used (see VisitBinop: it clearly has a >> single output regardless of the continuation, and EmitWithContinuation only >> defines an additional output for Set continuations), so it's not surprising >> that you end up with a virtual register (the one for the overflow bit) >> being used but not defined: you use it in the 2nd branch, but the 1st >> branch doesn't define it. >> If you look at VisitInt32AddWithOverflow, you'll see that it uses a >> VisitBinop (like VisitWordCompareZero does for branches on >> Int32AddWithOverflow's) with a ForSet continuation, and this ForSet >> continuation causes EmitWithContinuation to emit the overflow bit. >> >> In order to allow VisitWordCompareZero to cover Int32AddWithOverflow when >> the overflow bit has other uses (which should all be already Defined), >> you'll need to change something about the continuation used in >> VisitWordCompareZero. A ForBranchAndSet continuation sounds reasonable, but >> I haven't looked at the details. >> >> Happy to help if you have other questions :) >> >> Cheers, >> Darius >> >> On Thursday, November 9, 2023 at 11:31:24 AM UTC+1 [email protected] >> wrote: >> >>> I can confirm that by disabling value numbering, a second >>> Int32AddWithOverflow does exist: >>> >>> --- BLOCK B1158 id1159 <- B1153, B1157 --- >>> 5768: Phi[kRepFloat64](5753, 5767) >>> 5769: Float64RoundDown(5768) >>> 5770: Int64Constant[19] >>> 5771: Load[kRepTaggedPointer|kTypeAny](27, 5770) >>> 5772: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3) >>> 5773: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, >>> sparse:^^](5690, 5689) >>> 5774: TypedStateValues[, sparse:..] >>> 5775: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, >>> sparse:^^......](5773, 5774) >>> 5776: TypedStateValues[, sparse:.] >>> 5777: FrameState[UNOPTIMIZED_FRAME, 92, PokeAt(0), 0x191e0015eab5 >>> <SharedFunctionInfo test_div>](5772, 5775, 5776, 2, 8, 0) >>> 5778: Int32AddWithOverflow(5689, 5689) >>> 5779: Projection[0](5778) >>> 5780: Projection[1](5778) >>> 5781: Branch[Unspecified, False](5780) -> B1160, B1159 >>> >>> So could Overflow operations and/or Projections[1] be special-cased..? >>> And why is there a difference here with a normal comparison? >>> >>> Thanks, >>> Sam >>> >>> On Thursday, November 9, 2023 at 9:50:53 AM UTC Sam Parker-Haynes wrote: >>> >>>> Hi, >>>> >>>> Thanks both! >>>> >>>> Looking again at the IR, I wonder if it is control related. This is the >>>> well-behaved IR, where both instances of the Int32AddWithOverflow have a >>>> control input. >>>> >>>> --- BLOCK B1145 id5053 <- B1143, B1144 --- >>>> >>>> 43: Merge(42, 13024) >>>> >>>> 53: Phi[kRepWord32](18, 13025, 43) >>>> >>>> 52: Phi[kRepWord32](17, 13021, 43) >>>> >>>> 51: EffectPhi(15, 13024, 43) >>>> >>>> 13034: ChangeInt32ToFloat64(53) >>>> 6670: Load[kRepTaggedPointer|kTypeAny](6114, 23562, 51, 43) >>>> 13026: Int32AddWithOverflow(53, 53, 43) >>>> >>>> 13027: Projection[1](13026, 43) >>>> >>>> 13028: Branch[Machine, False](13027, 43) -> B1147, B1146 >>>> >>>> --- BLOCK B1158 id5027 <- B1157, B1153 --- >>>> 13061: Merge(13054, 13044) >>>> 13063: Phi[kRepFloat64](13056, 13060, 13061) >>>> 13062: EffectPhi(13056, 78, 13061) >>>> 6672: Load[kRepTaggedPointer|kTypeAny](6111, 23562, 13062, 13061) >>>> 13068: Int32AddWithOverflow(53, 53, 13061) >>>> 13069: Projection[1](13068, 13061) >>>> 13070: Branch[Machine, False](13069, 13061) -> B1160, B1159 >>>> >>>> However, with Turboshaft, the blocks look like this: >>>> >>>> --- BLOCK B1145 id1146 <- B1143, B1144 --- >>>> 5473: Phi[kRepWord32](5449, 5470) >>>> 5474: Phi[kRepWord32](5450, 5466) >>>> 5475: Int64Constant[19] >>>> 5476: Load[kRepTaggedPointer|kTypeAny](28, 5475) >>>> 5477: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3) >>>> 5478: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, >>>> sparse:^^](5474, 5473) >>>> 5479: TypedStateValues[, sparse:..] >>>> 5480: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, >>>> kRepTagged|kTypeAny, kRepTagged|kTypeAny, kRepTagged|kTypeAny, >>>> sparse:^^^^^...](5478, 5479, 27, 26, 25) >>>> 5481: TypedStateValues[, sparse:.] >>>> 5482: FrameState[UNOPTIMIZED_FRAME, 59, PokeAt(0), 0x191e0015eab5 >>>> <SharedFunctionInfo test_div>](5477, 5480, 5481, 2, 8, 0) >>>> 5483: Int32AddWithOverflow(5473, 5473) >>>> 5484: Projection[0](5483) >>>> 5485: Projection[1](5483) >>>> 5486: Branch[Unspecified, False](5485) -> B1147, B1146 >>>> >>>> --- BLOCK B1158 id1159 <- B1153, B1157 --- >>>> 5549: Phi[kRepFloat64](5534, 5548) >>>> 5550: Float64RoundDown(5549) >>>> 5551: Int64Constant[19] >>>> 5552: Load[kRepTaggedPointer|kTypeAny](27, 5551) >>>> 5553: TypedStateValues[kRepTagged|kTypeAny, sparse:^](3) >>>> 5554: TypedStateValues[kRepWord32|kTypeInt32, kRepWord32|kTypeInt32, >>>> sparse:^^](5474, 5473) >>>> 5555: TypedStateValues[, sparse:..] >>>> 5556: TypedStateValues[kRepTagged|kTypeAny, kRepTagged|kTypeAny, >>>> sparse:^^......](5554, 5555) >>>> 5557: TypedStateValues[, sparse:.] >>>> 5558: FrameState[UNOPTIMIZED_FRAME, 92, PokeAt(0), 0x191e0015eab5 >>>> <SharedFunctionInfo test_div>](5553, 5556, 5557, 2, 8, 0) >>>> 5559: Branch[Unspecified, False](5485) -> B1160, B1159 >>>> >>>> >>>> I have seen that Turboshaft has an EmitProjectionReducer ( >>>> https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/assembler.h;drc=f5a7962861b208e9cf82e61c1fa9f8dc0d216a87;l=606) >>>> and seems to be closely related to ValueNumbering. Without control inputs, >>>> I can't see why ValueNumbering, or something similar, wouldn't remove the >>>> second instance. >>>> >>>> > It could be an artifact of scheduling >>>> >>>> I will have a look at the scheduler. >>>> >>>> > However, this should be prevented by this check >>>> >>>> Yes, it's that check that I'm trying to remove :) and this works for >>>> everything but projections. Surely we should be able to combine an >>>> operation even if it has multiple users? >>>> >>>> Regards, >>>> Sam >>>> >>>> On Wednesday, November 8, 2023 at 5:20:37 PM UTC [email protected] >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> > With a turbofan-only flow, the overflow operation and the projection >>>>> will be duplicated for the second branch, and everything is fine. >>>>> >>>>> I'm not sure which phase could duplicate a Int32AddWithOverflow. I >>>>> don't think that the Instruction Selector does this. It could be an >>>>> artifact of scheduling (which happens before effect control linearization >>>>> or before instruction selection). I'm not sure it's a conscious >>>>> optimization though, or just a side-effect of the scheduler. >>>>> >>>>> > Is it legal for all projected values to be live across basic blocks? >>>>> If so, who should be responsible for ensuring values are live where they >>>>> need to be? >>>>> >>>>> Yes it is legal (and probably happens fairly often). As to who should >>>>> ensure that the values are live... It sounds like here the bug is in the >>>>> Instruction Selector. What probably happens is that when emitting the 1st >>>>> branch, we decide that it can cover the Projection[1] and don't need to >>>>> emit it, which is wrong. However, this should be prevented by this check >>>>> https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/backend/x64/instruction-selector-x64.cc;l=4278: >>>>> when user=the_first_branch and value=projection[1], CanCover should return >>>>> false, because the projection has another use. >>>>> >>>>> If you can provide an example that we can reproduce, I'd be happy to >>>>> have a look, but without being able to reproduce, we're just speculating >>>>> :/ >>>>> >>>>> > I don't understand why it has a control input >>>>> >>>>> Projections don't really need control inputs, but I think that having >>>>> control inputs in projections is somewhat useful for scheduling in >>>>> Turbofan. We trying removing the control input ( >>>>> https://crrev.com/c/4570503), and performance regressed in some cases. >>>>> >>>>> Cheers, >>>>> Darius >>>>> >>>>> On Wednesday, November 8, 2023 at 6:02:40 PM UTC+1 Matthias Liedtke >>>>> wrote: >>>>> >>>>>> Hi Sam, >>>>>> >>>>>> Honestly I don't feel knowledgeable enough to answer this question, >>>>>> so I hope some Turbofan experts can chime in. >>>>>> This line >>>>>> <https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/common-operator.cc;l=1622> >>>>>> defines >>>>>> that the Projection in Turbofan has one value input, a control input and >>>>>> a >>>>>> value output but no control output. >>>>>> I don't understand why it has a control input or what >>>>>> it exactly means that an operation has a control input vs. let's say a >>>>>> Select operation that doesn't have one. >>>>>> Still, I'd assume that the Branch in B4 should have a control input >>>>>> that at some point leads to the same control input as "2: Projection [1]" >>>>>> and therefore I'd assume that it's valid as the control input of the >>>>>> projection dominates the branch that uses it. >>>>>> Looking at the schedule, as the Projection is in B0, it is guaranteed >>>>>> to be "executed" before B4, so the virtual register should be defined and >>>>>> should be "available" / usable by all instructions following itself in B0 >>>>>> and in all instructions in blocks that are reachable from B0 directly or >>>>>> indirectly. >>>>>> I don't think there is anything limiting the liveness of a virtual >>>>>> register, so as long as a block A dominates another Block B, all virtual >>>>>> registers defined in A are usable in B. >>>>>> So your snippet looks totally fine based on my (limited) >>>>>> understanding of Turbofan. >>>>>> >>>>>> However, I'd assume it should be >>>>>> 1: Projection[0] (0) >>>>>> 2: Projection[1] (0) >>>>>> and not >>>>>> 1: Projection[0] (0) >>>>>> 2: Projection[1] (1) >>>>>> (i.e. both projections should have Int32AddWithOverflow as an input) >>>>>> >>>>>> Best regards, >>>>>> Matthias >>>>>> >>>>>> >>>>>> On Wed, Nov 8, 2023 at 5:24 PM Sam Parker-Haynes <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Hi Matthias, >>>>>>> >>>>>>> My RecreateSchedule example looks like this: >>>>>>> >>>>>>> B0: >>>>>>> 0: Int32AddWithOverflow >>>>>>> 1: Projection[0] (0) >>>>>>> 2: Projection[1] (1) >>>>>>> Branch (2) >>>>>>> ... >>>>>>> >>>>>>> B4: >>>>>>> Branch (2) <-- how should this branch receive (2) >>>>>>> >>>>>>> I know I said it looks okay at the IR level, but since the IR isn't >>>>>>> documented that may not be true ;) So, what are the semantics of a >>>>>>> Projection..? >>>>>>> - Is the Projection[1] of an Overflow operation a part of the >>>>>>> control chain? >>>>>>> - Is it legal for all projected values to be live across basic >>>>>>> blocks? If so, who should be responsible for ensuring values are live >>>>>>> where >>>>>>> they need to be? >>>>>>> >>>>>>> Thanks, >>>>>>> Sam >>>>>>> >>>>>>> On Wednesday, November 8, 2023 at 2:15:58 PM UTC Sam Parker-Haynes >>>>>>> wrote: >>>>>>> >>>>>>> Also, are any instruction selection tests being added for >>>>>>> Turboshaft? Something like the equivalent of >>>>>>> test/unittests/compiler/arm64/instruction-selector-arm64-unittest.cc? >>>>>>> >>>>>>> On Wednesday, November 8, 2023 at 1:26:12 PM UTC Sam Parker-Haynes >>>>>>> wrote: >>>>>>> >>>>>>> EDIT: only passes with --no-turboshaft >>>>>>> >>>>>>> On Wednesday, November 8, 2023 at 1:20:03 PM UTC Sam Parker-Haynes >>>>>>> wrote: >>>>>>> >>>>>>> Hi Matthias, >>>>>>> >>>>>>> Ah, I hadn't seen that, so yes I've been tweaking the OG selector >>>>>>> function for arm64. On the x64 side of things, everything is fine with >>>>>>> 'pure' turboshaft pipeline but, as with my arm64 checkout, the issue >>>>>>> arises >>>>>>> when modifying the turbofan selector function and using >>>>>>> ---no-turboshaft. >>>>>>> >>>>>>> So, I guess the problem is with the RecreateSchedule? >>>>>>> >>>>>>> Thanks, >>>>>>> Sam >>>>>>> >>>>>>> On Wednesday, November 8, 2023 at 1:12:56 PM UTC [email protected] >>>>>>> wrote: >>>>>>> >>>>>>> Hi Sam, >>>>>>> >>>>>>> I don't have any insights on x64 for VisitWordCompareZero but as a >>>>>>> quick note, the TurboshaftAdapter instruction selection implementation >>>>>>> for >>>>>>> VisitWordCompareZero for arm64 was just added yesterday by me ( >>>>>>> https://chromium-review.googlesource.com/c/v8/v8/+/5002003). >>>>>>> >>>>>>> So, just to clarify, I assume you are talking about the Turbofan >>>>>>> {Node* / TurbofanAdapter} based instruction selection with Turboshaft >>>>>>> optimizations followed by a RecreateSchedule? >>>>>>> If you do changes for the TurbofanAdapter instruction selection on >>>>>>> arm64 for code that already has an implementation for the >>>>>>> TurboshaftAdapter, could you add a TODO(mliedtke) on the Turboshaft >>>>>>> implementation with some nice description and CC me on the gerrit >>>>>>> changes, >>>>>>> so these don't get lost when we switch over to the TurboshaftAdapter? >>>>>>> >>>>>>> Thanks and best regards, >>>>>>> Matthias >>>>>>> >>>>>>> On Wed, Nov 8, 2023 at 1:54 PM Sam Parker-Haynes <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I've been investigating rematerialising flag setting instructions in >>>>>>> VisitWordCompareZero, just by removing the CanCover check, and >>>>>>> something is >>>>>>> going wrong with turboshaft. >>>>>>> >>>>>>> I have an arithmetic overflow operation, Int32AddWithOverflow, and >>>>>>> the overflow check is used by two branch operations. With a >>>>>>> turbofan-only >>>>>>> flow, the overflow operation and the projection will be duplicated for >>>>>>> the >>>>>>> second branch, and everything is fine. But with turboshaft, neither the >>>>>>> overflow operation or projection are duplicated and the second branch >>>>>>> uses >>>>>>> the original projection[1] value. This looks fine at the IR level, but >>>>>>> ends >>>>>>> up broken after isel when the register allocator comes across a virtual >>>>>>> register without a definition. This happens for both x64 and arm64, so >>>>>>> I'm >>>>>>> assuming turboshaft is making some assumptions, based on the current >>>>>>> behaviour, that are non-obvious to me. >>>>>>> >>>>>>> Any ideas? >>>>>>> >>>>>>> Thanks! >>>>>>> Sam >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> -- >>>>>>> v8-dev mailing list >>>>>>> [email protected] >>>>>>> http://groups.google.com/group/v8-dev >>>>>>> --- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups "v8-dev" group. >>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>> send an email to [email protected]. >>>>>>> To view this discussion on the web visit >>>>>>> https://groups.google.com/d/msgid/v8-dev/175a4ba2-9d33-45bb-84e3-91d41a722846n%40googlegroups.com >>>>>>> <https://groups.google.com/d/msgid/v8-dev/175a4ba2-9d33-45bb-84e3-91d41a722846n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>> . >>>>>>> >>>>>>> -- >>>>>>> -- >>>>>>> v8-dev mailing list >>>>>>> [email protected] >>>>>>> http://groups.google.com/group/v8-dev >>>>>>> --- >>>>>>> You received this message because you are subscribed to the Google >>>>>>> Groups "v8-dev" group. >>>>>>> To unsubscribe from this group and stop receiving emails from it, >>>>>>> send an email to [email protected]. >>>>>>> >>>>>> To view this discussion on the web visit >>>>>>> https://groups.google.com/d/msgid/v8-dev/46693e6c-1216-493b-a104-2c9d41075987n%40googlegroups.com >>>>>>> <https://groups.google.com/d/msgid/v8-dev/46693e6c-1216-493b-a104-2c9d41075987n%40googlegroups.com?utm_medium=email&utm_source=footer> >>>>>>> . >>>>>>> >>>>>> -- > -- > v8-dev mailing list > [email protected] > http://groups.google.com/group/v8-dev > --- > You received this message because you are subscribed to the Google Groups > "v8-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/d/msgid/v8-dev/3aabdaf2-7f9c-4b54-8b09-25441fb4b277n%40googlegroups.com > <https://groups.google.com/d/msgid/v8-dev/3aabdaf2-7f9c-4b54-8b09-25441fb4b277n%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- -- v8-dev mailing list [email protected] http://groups.google.com/group/v8-dev --- You received this message because you are subscribed to the Google Groups "v8-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/CACx9iAOu%3Dv5W1G%3D1%3DO0X2r_joCCHmUb%3D3rZvMRLdsDeKtiYL4A%40mail.gmail.com.
