ABataev added inline comments.
================ Comment at: openmp/libomptarget/src/omptarget.cpp:233 MapperComponents - .Components[target_data_function == targetDataEnd ? I : E - I - 1]; + .Components[target_data_function == targetDataEnd ? E - I - 1 : I]; MapperArgsBase[I] = C.Base; ---------------- ABataev wrote: > ye-luo wrote: > > ABataev wrote: > > > ye-luo wrote: > > > > ABataev wrote: > > > > > ye-luo wrote: > > > > > > ABataev wrote: > > > > > > > ye-luo wrote: > > > > > > > > ye-luo wrote: > > > > > > > > > ye-luo wrote: > > > > > > > > > > ABataev wrote: > > > > > > > > > > > ABataev wrote: > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > > > ABataev wrote: > > > > > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > > > > > ABataev wrote: > > > > > > > > > > > > > > > > > > ABataev wrote: > > > > > > > > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > > > > > > > > > ABataev wrote: > > > > > > > > > > > > > > > > > > > > > > ye-luo wrote: > > > > > > > > > > > > > > > > > > > > > > > ABataev wrote: > > > > > > > > > > > > > > > > > > > > > > > > grokos wrote: > > > > > > > > > > > > > > > > > > > > > > > > > What is the current status of > > > > > > > > > > > > > > > > > > > > > > > > > the order of the arguments > > > > > > > > > > > > > > > > > > > > > > > > > clang emits? Is it still > > > > > > > > > > > > > > > > > > > > > > > > > necessary to traverse > > > > > > > > > > > > > > > > > > > > > > > > > arguments in reverse order > > > > > > > > > > > > > > > > > > > > > > > > > here? > > > > > > > > > > > > > > > > > > > > > > > > Yes, still required > > > > > > > > > > > > > > > > > > > > > > > Based on the conversation in > > > > > > > > > > > > > > > > > > > > > > > https://reviews.llvm.org/D85216 > > > > > > > > > > > > > > > > > > > > > > > This line of code neither before > > > > > > > > > > > > > > > > > > > > > > > nor after the change plays well. > > > > > > > > > > > > > > > > > > > > > > > Shall we fix the order in > > > > > > > > > > > > > > > > > > > > > > > targetDataEnd first? > > > > > > > > > > > > > > > > > > > > > > This change is part of this patch > > > > > > > > > > > > > > > > > > > > > > and cannot be committed separately. > > > > > > > > > > > > > > > > > > > > > I mean could you fix that issue as a > > > > > > > > > > > > > > > > > > > > > parent of this patch? > > > > > > > > > > > > > > > > > > > > > This change is part of this patch and > > > > > > > > > > > > > > > > > > > > > cannot be committed separately. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > If fixing the reordering is part of > > > > > > > > > > > > > > > > > > > > this patch, I should have seen > > > > > > > > > > > > > > > > > > > > "target_data_function == targetDataEnd > > > > > > > > > > > > > > > > > > > > ?" branches disappear. > > > > > > > > > > > > > > > > > > > Nope, just with this patch. It reorders > > > > > > > > > > > > > > > > > > > the maps and need to change the cleanup > > > > > > > > > > > > > > > > > > > order too. > > > > > > > > > > > > > > > > > > It works just like > > > > > > > > > > > > > > > > > > constructors/destructors: allocate in > > > > > > > > > > > > > > > > > > direct order, deallocate in reversed to > > > > > > > > > > > > > > > > > > correctly handle map order. > > > > > > > > > > > > > > > > > The description says that "present and alloc > > > > > > > > > > > > > > > > > mappings are processed first and then all > > > > > > > > > > > > > > > > > others." > > > > > > > > > > > > > > > > > Why the order of arguments in > > > > > > > > > > > > > > > > > targetDataBegin, targetDataEnd and > > > > > > > > > > > > > > > > > targetDataUpdate all get reversed. > > > > > > > > > > > > > > > > Because this is for mappers. Mapper maps are > > > > > > > > > > > > > > > > ordered by the compiler in the direct order > > > > > > > > > > > > > > > > (alloc, maps, delete) but when we need to do > > > > > > > > > > > > > > > > exit, we need to release the data in reversed > > > > > > > > > > > > > > > > order (deletes, maps, allocs). > > > > > > > > > > > > > > > I was not making the question clear. My question > > > > > > > > > > > > > > > about "reverse" is not about having a reverse > > > > > > > > > > > > > > > order for targetDataBegin. My question was about > > > > > > > > > > > > > > > "reversing" from the the old code. Your change > > > > > > > > > > > > > > > put the opposite order for targetDataBegin, > > > > > > > > > > > > > > > targetDataEnd and targetDataUpdate cases. > > > > > > > > > > > > > > > I was not making the question clear. My question > > > > > > > > > > > > > > > about "reverse" is not about having a reverse > > > > > > > > > > > > > > > order for targetDataBegin. My question was about > > > > > > > > > > > > > > > "reversing" from the the old code. Your change > > > > > > > > > > > > > > > put the opposite order for targetDataBegin, > > > > > > > > > > > > > > > targetDataEnd and targetDataUpdate cases. > > > > > > > > > > > > > > > > > > > > > > > > > > > > typo correction > > > > > > > > > > > > > > I was not making the question clear. My question > > > > > > > > > > > > > > about "reverse" is not about having a reverse order > > > > > > > > > > > > > > for **targetDataEnd**. My question was about > > > > > > > > > > > > > > "reversing" from the the old code. Your change put > > > > > > > > > > > > > > the opposite order for targetDataBegin, > > > > > > > > > > > > > > targetDataEnd and targetDataUpdate cases. > > > > > > > > > > > > > My separate question specifically for targetDataEnd > > > > > > > > > > > > > is the following. > > > > > > > > > > > > > > > > > > > > > > > > > > In target(), we call > > > > > > > > > > > > > ``` > > > > > > > > > > > > > targetDataBegin(args) > > > > > > > > > > > > > { // forward order > > > > > > > > > > > > > for (int32_t i = 0; i < arg_num; ++i) { ... } > > > > > > > > > > > > > } > > > > > > > > > > > > > launch_kernels > > > > > > > > > > > > > targetDataEnd(args) > > > > > > > > > > > > > { // reverse order > > > > > > > > > > > > > for (int32_t I = ArgNum - 1; I >= 0; --I) { } > > > > > > > > > > > > > } > > > > > > > > > > > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > At a mapper, > > > > > > > > > > > > > ``` > > > > > > > > > > > > > targetDataMapper > > > > > > > > > > > > > { > > > > > > > > > > > > > // generate args_reverse in reverse order for > > > > > > > > > > > > > targetDataEnd > > > > > > > > > > > > > targetDataEnd(args_reverse) > > > > > > > > > > > > > } > > > > > > > > > > > > > ``` > > > > > > > > > > > > > Are we actually getting the original forward order > > > > > > > > > > > > > due to one reverse in targetDataMapper and second > > > > > > > > > > > > > reverse in targetDataEnd? Is this the desired > > > > > > > > > > > > > behavior? This part confused me. Do I miss something? > > > > > > > > > > > > > Could you explain a bit? > > > > > > > > > > > > Yes, something like this. targetDataEnd reverses the > > > > > > > > > > > > order of mapping arrays. But mapper generator always > > > > > > > > > > > > generates mapping arrays in the direct order (it fills > > > > > > > > > > > > mapping arrays that later processed by the > > > > > > > > > > > > targetDataEnd function). We could fix this by passing > > > > > > > > > > > > extra Boolean flag to the generator function but it > > > > > > > > > > > > means the redesign of the mappers. That's why we have > > > > > > > > > > > > to reverse it in the libomptarget. > > > > > > > > > > > You can check it yourself. Apply the patch, restore the > > > > > > > > > > > original behavior in libomptarget and run libomptarget > > > > > > > > > > > tests. Mapper related tests will crash. > > > > > > > > > > Stick with mapper generator always generating mapping > > > > > > > > > > arrays in the direct order. The targetDataMapper reverse > > > > > > > > > > the mapping array and then passes args_reverse into > > > > > > > > > > targetDataEnd. Inside targetDataEnd, mapping > > > > > > > > > > Yes, something like this. targetDataEnd reverses the order > > > > > > > > > > of mapping arrays. But mapper generator always generates > > > > > > > > > > mapping arrays in the direct order (it fills mapping arrays > > > > > > > > > > that later processed by the targetDataEnd function). We > > > > > > > > > > could fix this by passing extra Boolean flag to the > > > > > > > > > > generator function but it means the redesign of the > > > > > > > > > > mappers. That's why we have to reverse it in the > > > > > > > > > > libomptarget. > > > > > > > > > > > > > > > > > > Stick with mapper generator always generating mapping arrays > > > > > > > > > in the direct order. > > > > > > > > > > > > > > > > > > In the targetDataBegin case, targetDataMapper keep direct > > > > > > > > > order args and calls targetDataBegin(args) and > > > > > > > > > targetDataBegin process args in direct order. > > > > > > > > > > > > > > > > > > In the targetDataEnd case, targetDataMapper reverses the > > > > > > > > > mapping array and then passes args_reverse into > > > > > > > > > targetDataEnd. Inside targetDataEnd, args_reverse are > > > > > > > > > processed in reverse order. So targetDataEnd is actually > > > > > > > > > processing the args in original direct order. This seems > > > > > > > > > contradictory to the constructor/deconstructor like behavior > > > > > > > > > that all the mappings must be processed in the actual reverse > > > > > > > > > order in targetDataEnd. > > > > > > > > > > > > > > > > > > This is my understanding. The current code should be wrong > > > > > > > > > but obviously the current code is working. So why the current > > > > > > > > > code is working? what is inconsistent in my analysis. Could > > > > > > > > > you point out the missing piece. > > > > > > > > > You can check it yourself. Apply the patch, restore the > > > > > > > > > original behavior in libomptarget and run libomptarget tests. > > > > > > > > > Mapper related tests will crash. > > > > > > > > > > > > > > > > For sure without this line, tests would crash and that is why > > > > > > > > you included this line of change in the patch. Since you made > > > > > > > > the change, you could explain why, right? > > > > > > > I changed and simplified codegen for the mapper generator without > > > > > > > changing its interface. I could do this because of the new > > > > > > > ordering, before we could not rely on it. But it also requires a > > > > > > > change in the runtime. > > > > > > > targetDataEnd calls targetDataMapper and targetDataMapper fills > > > > > > > the array in the direct order, but targetDataEnd processes them > > > > > > > in the reverse order, but mapper generator does not know about > > > > > > > it. It also has to generate the data in the reverse order, just > > > > > > > like targetDataEnd does. > > > > > > > > > > > > > > Before this patch mapper generator tried to do some ordering but > > > > > > > it was not always correct. It was not expecting something like > > > > > > > map(alloc:s) map(s.a) because it was not allowed by the compiler. > > > > > > > That's why it worked before and won't work with this patch. > > > > > > > PS. The change in the mapper generator is also required and > > > > > > > cannot be separated. Without this mappers tests won't work. > > > > > > I played a bit with your patch. > > > > > > ``` > > > > > > #pragma omp target exit data map(from: c.a[0:NUM], c.b[0:NU2M]) > > > > > > map(delete: c) > > > > > > ``` > > > > > > I put NUM=1024 and NU2M = 2048. > > > > > > LIBOMPTARGET_DEBUG reports > > > > > > ``` > > > > > > Libomptarget --> Entry 0: Base=0x00007fff064080a8, > > > > > > Begin=0x00007fff064080a8, Size=16, Type=0x0, Name=(null) > > > > > > Libomptarget --> Entry 1: Base=0x00007fff064080a8, > > > > > > Begin=0x0000000000f9cbd0, Size=4096, Type=0x1000000000012, > > > > > > Name=(null) > > > > > > Libomptarget --> Entry 2: Base=0x00007fff064080b0, > > > > > > Begin=0x0000000000f86e10, Size=8192, Type=0x1000000000012, > > > > > > Name=(null) > > > > > > Libomptarget --> Entry 3: Base=0x00007fff064080a8, > > > > > > Begin=0x00007fff064080a8, Size=16, Type=0x1000000000008, Name=(null) > > > > > > ``` > > > > > > Since targetDataEnd internally reverse the processing order, could > > > > > > you confirm that the frontend was emitting entries 3,2,1,0? > > > > > > I'm wondering if the frontend could emit 3, 0, 1, 2 so the > > > > > > processing order is 2,1,0,3? The spec requires struct element > > > > > > processed before the struct in "target exit data" > > > > > No, the frontend emits in the order 0, 1, 2, 3. targetDataEnd process > > > > > in reversed order 3, 2, 1, 0, but the mapper does not know about it > > > > > and still emits the data in the order 0, 1, 2, 3. > > > > > And it is only for mappers! > > > > > So, say you have an extra map something like map(a). > > > > > ``` > > > > > map (a) map(mapper(id), tofrom: c) > > > > > ``` > > > > > where mapper for с does something like you wrote. > > > > > > > > > > In this case the order would be 0, 1, 2, 3, 4, where 0 is mapping of > > > > > a and 1-4 is mapping of c. > > > > > When we need to delete the data, the mapper still would generate > > > > > 1,2,3,4 + 0 for mapping of a, but targetDataEnd expects 4,3,2,1,0. > > > > > That's why we have to reverse the mapping data, produced by the > > > > > mapper generator for targetDataEnd. > > > > Let us leave the mapper case aside which has extra mess. > > > > Double checked that "Libomptarget --> Entry 0" is printed at the > > > > __tgt_target_data_end_mapper. So the order is as you said 0, 1, 2, 3 > > > > from the frontend. What is the "Entry 0"? more specifically is the > > > > difference between entry 0 and 3? Entry 0 seems to be an implicit map > > > > while 3 is explicit. > > > > > > > > #pragma omp target exit data map(from: c.a[0:NUM], c.b[0:NU2M]) > > > > map(delete: c) > > > > the "map(delete: c)" has some state machine to protect the delete due > > > > to ordering. > > > > So I'm wondering why the frontend must issue both 0 and 3. Can the > > > > front end fuse 0 and 3? > > > > I mean the frontend generates 3, 1, 2 and the runtime processing 2,1,3 > > > > without the deleting issue? > > > Entry 0 is the address that should be passed to the kernel (for the > > > captured variable, that's why it is marked as TGT_TARGET_PARAM - target > > > kernel argument), entries 1-3 are the actual mappings. Yes, I assume the > > > frontend can fuse these entries in many cases, but it is different > > > problem that should be addressed in a separate patch. > > There is no kernel associated enter/exit data... > > > > Thanks for answering all my puzzles. If you think some of my questions can > > be better answered by some documentation, please point me if there are any. > > > > I think both the frontend and runtime needs to be further consolidated to > > reduce side effects in future patches. The current patch in the runtime > > library part looks good to me for the current need. > > > > Please ping appropriate reviewers for the frontend change, so we can keep > > this patch moving. > No problem! > Yes, there is no kernel, missed it, but still it is the reason. The codegen > you see is not directly related to this patch, this is just how the mapping > currently works, it is just not quite optimal. Sure, it can be improved in > many cases but just like I said it is different problem that should be > addressed in separate patch(es). Also, I believe some of the optimizations > can be implemented in OpenMPOpt pass. For the documentations, there is design description for the mappers https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx. For the mapping rules, see the comments in generateInfoForComponentList function Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D86119/new/ https://reviews.llvm.org/D86119 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits