On Fri, 26 Sep 2025 22:34:03 GMT, Chen Liang <[email protected]> wrote:

>> Hi @liach , 
>> 
>> Currently, Unsafe.put* APIs expect to operate on a mutable value; without 
>> Unsafe.makePrivateBuffer, there is no way to transition a value object to 
>> larval state.
>> 
>> [image: https://github.com/user-attachments/assets/af826cda-55e1-4b0c-a2ea-62592f7623d6]
>> 
>> 
>> Here is a typical update kernel for the nary operation fallback 
>> implementation. 
>> 
>> [image: https://github.com/user-attachments/assets/4a31baa7-52b8-4e0b-8c42-924407bb5665]
>> 
>> 
>> 
>> **Here are some relevant FAQs on the need for multifield annotation.**
>> 
>> Q. Why do we need @multifield annotated field in VectorPayloads and not just 
>> retain the array-backed backing storage?
>> A.  Even currently, Vector instances are immutable: each modification 
>> or application of an operation generates a new vector. 
>>       Each new vector has a distinct backing storage in the form of an 
>> array; thus, no two vectors ever share their backing storage, which makes 
>> vectors an immutable quantity. 
>>       
>>      Vector<Float>  newVector  =  Vec1.lanewise(VectorOperators.ADD, Vec2);
>>      
>> Since arrays are always allocated over the heap, they carry an identity, 
>> which is the distinctive heap address for each new backing storage array.
>> 
>> This contradicts the philosophy of value type instances, which are 
>> identity-free; the compiler treats two values with the same contents as 
>> equivalent entities and is free to substitute one with another. 
>> 
>> By replacing the existing array-backed storage with @multifield annotated 
>> storage, we ensure that the payload adheres to true value semantics. A 
>> @multifield is seen as a bundle of fields, and the encapsulating payload is 
>> a value class; unlike an array, a multifield is never allocated explicit 
>> heap storage. 
>> 
>> Here is an example code
>> 
>>  
>> [image: https://github.com/user-attachments/assets/14cb3fbc-15cf-461e-846e-b200f744f793]
>> 
>> 
>> Even though Payload is a value class, its two instances with the same 
>> backing storage are not equal, because arrays have identity.
>> By treating vectors as value objects, we expect that two vectors with the 
>> same contents should be equal.
>> 
>> Q.  Is there any alternative to @multifield?
>> A.  All we need to ensure is that the backing storage has no identity.  
>> Thus, we could have multiple primitive type fields in the payload, one for 
>> each lane of the vector. 
>> 
>>  
>> <img width="450" height="310" al...
>
> Hi @jatin-bhateja, as I know, the larval bit is completely unused in current 
> Valhalla - I believe what we should do with that assert is to simply remove 
> it. I am thinking of some internal API that accepts a Consumer that handles 
> the "early larval" API via unsafe before returning it.
> 
> I thought the merge of lworld into vector would be trivial, but it turned out 
> troublesome - can you push the merge as soon as it is ready? I am more than 
> happy to help you migrate off makePrivateBuffer and this to-be-removed larval 
> bit.

Hi @liach ,

I am working on refreshing the lworld+vector branch and also thinking through
some post-merge improvements.
For the record, I am adding my notes in this comment.

Post-merge improvements:
    - Replace VectorBox and VectorBoxAllocate nodes with InlineTypeNode.
    - New Ideal transforms to sweep intermediate logic after inline expansion.

  Code snippet:


Source:
       vec1 = fromArray(SRC1)              - [1]
       vec2 = fromArray(SRC2)              - [2]
       res = vec1.lanewise(MUL,vec2)       - [3]
       res.intoArray(RES)                  - [4]


  Intermediate representation for [1] and [2]
       
       LoadVector 
          |
          v
       InlineTypeNode (VectorPayload)
          |
          v
       InlineTypeNode (Double256Vector)
          |
          v
       CastPP (NotNull) 
          |
          v
       CheckCastPP (Double256Vector)
        |
        v     
       [x] -> AddP -> LoadVector
        |                  |
        |                  v
        |             InlineTypeNode (VectorPayload)      [[ make_from_oop ]]
        |                  |
        | oop              v
        |-----------> InlineTypeNode (Double256Vector)
                           |
                           v
                          [y]


  Intermediate representation for [3]

           [y]         [y]
            |           |
            v           v
      VectorUnbox   VectorUnbox
            |         /
            |        /
            |       /
            |      /
            |     /
            |    /
            |   /
            [Mul]
              |
              v
        InlineTypeNode (VectorPayloadMF)
              |
              v
        InlineTypeNode (Double256Vector)
              |
              v
            CastPP
              |
              v
          CheckCastPP
              |
              v    
             [x] -> AddP -> LoadVector
              |                  |
              |                  v
              |             InlineTypeNode (VectorPayload)      [[ make_from_oop ]]
              |                  |
              | oop              v
              |-----------> InlineTypeNode (Double256Vector)
                                 |
                                 v
                                [y]
 
  Intermediate representation for [4]

         [y]
          |
          v
    VectorUnboxNode
          |
          v
     StoreVector

New transforms for boxing + unboxing optimizations:
-------------------------------------------------------------

- Unboxing will simply be reduced to fetching the correct vector IR from its
preceding input graph snippet.
- In contexts where boxing is needed, we will rely on process_inline_types for
buffering.

   Q. Do we have sufficient infrastructure for lazy buffering during
process_inline_types in the downstream flow?
   A. No. Buffering is done upfront because it needs precise memory and control
edges. Currently, the compiler performs eager buffering in scenarios that
mandate non-scalarized forms of value objects, i.e. non-flat field assignment
or passing value type arguments when InlineTypePassFieldsAsArgs is not set.
       
     Q. What if we do upfront buffering of the InlineTypeNode corresponding to
vector boxes?
     A. A buffer and its associated InlineTypeNode must be in sync at all
times. Any field update of a value object creates a new value. This can only be
achieved through a new value instance allocation, i.e. it must go through an
intermediate larval state. All the fields of a value instance are final, so
we cannot use putfield to update their contents; getfield returns the
initializing value of the value instance field.
         
        MyValue (value type)
            - float f1;
            - float f2;
            - float f3;
            - float f4;

        val1 = new MyValue(0.0f, 1.0f, 2.0f, 3.0f);
            -->  new MyValue (oop)
                  |
                  --> make_from_oop(oop) post construction

        val2 = new MyValue(val1.f1, 2.0f, val1.f3, val1.f4)  // getfield val1.f1 will pass 0.0f
                  |                                          // getfield val1.f3 will pass 2.0f
                  |                                          // getfield val1.f4 will pass 3.0f
                  |
                  --> make_from_oop(oop) post construction

        return val2.f1 + val2.f2;  // returns 0.0f + 2.0f

        Thus there will be no consumer of the InlineTypeNodes, and they will be
        swept out along with their allocations.
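
The copy-on-update behaviour in the MyValue example can be sketched in plain
Java. This is only an illustration under stated assumptions: a record stands in
for a Valhalla value class (records have final fields but still carry
identity), and the names MyValueSketch, withF2 and sumOfFirstTwo are
hypothetical and not part of the lworld+vector branch.

```java
// Hypothetical stand-in: a record models the final-field, copy-on-update
// behaviour of the MyValue value class from the notes above.
final class MyValueSketch {

    record MyValue(float f1, float f2, float f3, float f4) {
        // "Updating" f2 builds a fresh instance; the receiver is untouched.
        MyValue withF2(float newF2) {
            return new MyValue(f1, newF2, f3, f4);
        }
    }

    // Mirrors the val1/val2 sequence: only f1 and f2 of val2 are consumed.
    static float sumOfFirstTwo() {
        MyValue val1 = new MyValue(0.0f, 1.0f, 2.0f, 3.0f);
        MyValue val2 = val1.withF2(2.0f);   // new instance, val1.f2 stays 1.0f
        return val2.f1() + val2.f2();       // 0.0f + 2.0f
    }

    public static void main(String[] args) {
        System.out.println(sumOfFirstTwo());
    }
}
```

Because only the two float results escape sumOfFirstTwo, neither MyValue
instance needs to be materialized, which is the situation in which the notes
expect the InlineTypeNodes and their allocations to be swept out.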

      Q. In the new code model, is an InlineTypeNode always backed by an
allocation?
      A. An InlineTypeNode is created from the initialized this pointer passed as
Parm0 of the constructor, unless it is created through make_from_multi, i.e. from
scalarized argument parameters.


(gdb) l
889         if (cg->method()->is_object_constructor() && receiver != nullptr && gvn().type(receiver)->is_inlinetypeptr()) {
890           InlineTypeNode* non_larval = InlineTypeNode::make_from_oop(this, receiver, gvn().type(receiver)->inline_klass());
891           // Relinquish the oop input, we will delay the allocation to the point it is needed, see the
892           // comments in InlineTypeNode::Ideal for more details
893           non_larval = non_larval->clone_if_required(&gvn(), nullptr);
894           non_larval->set_oop(gvn(), null());
895           non_larval->set_is_buffered(gvn(), false);
896           non_larval = gvn().transform(non_larval)->as_InlineType();
897           map()->replace_edge(receiver, non_larval);
898         }


     
  [Action Item]
        - Try to always link the backing oop to the newly scalarized
          InlineTypeNode. Since the new code model always creates a new larval
          value on every field modification, the oop and the scalarized nodes
          are always in sync.

  [Results]
         Working on stand-alone tests; still need to run the performance and
         validation suites for regressions.
            - Functional validation is almost fine.
            - Also, most of the InlineTypeNodes are swept out unless there is a
              specific context that needs a materialized value.

-------------

PR Comment: https://git.openjdk.org/valhalla/pull/1593#issuecomment-3392084635
