Yicong-Huang commented on PR #5179:
URL: https://github.com/apache/texera/pull/5179#issuecomment-4529506515

   Thanks for the PR! It's great to see multiple designs of macros. We 
definitely need this feature!
   
   Let's talk about the experience before diving into code.
   
   
   ### create a macro
   It is not shown or clear to me how to create a macro from scratch? 
   
   In @Xiao-zhen-Liu 's version (#5115), user highlights three operators and 
right click to show the context menu, then click "create Marco", then the three 
operators disappear and replaced with a macro operator. The Marco operator is 
same shape/size of other normal operators. 
   
   I do like this experience. 
   
   ### Representation of a macro and reuse a macro operator 
   In this version, user drags a macro operator (which is a marker, instead of 
an operator), and selects options from a drop down menu inside property panel.
   
   It is not clear how to reuse a marco in #5115. 
   
   I would prefer not to have Marco as a marker. instead, we have always been 
discussing that a Marco operator should just be the same as other logical 
operators. For instance, you can view `Hash Join` (we should rename it to Join) 
operator as a macro. It gets expanded into build and probe operators later 
after compile. I will be great if Marco operators can just look similar to 
other normal operators (like inside #5115). Most importantly, Marco operators 
should have ports, that is derived from the sub-dag that it is representing. 
For instance, if a macro operator contains a sub-dag that has 2 inputs, and 3 
output branches, then it will be good for the Marco to have 2 input ports and 3 
output ports. This way user can use the macro operator without knowing its 
internal shape. During execution, user can see the macro's aggregated 
statistics instead of internal operators' statistics. 
   
   ### Display the macro internal sub-Dag and edit it
   In this version, the macro's content (internal sub-DAG) is displayed 
inplace. It is more like the grouping view Zuozhi implemented before in #754. 
And user can drag and drop a new operator into existing macro to edit it. I 
assume user can also delete one operator or reconnect an edge between 
operators. 
   
   In Xiaozhen's #5115, user needs to click on the macro to jump to another 
workflow view, where it expands into the internal dag. The internal view has a 
dummy source operator and a dummy sink operator for user to understand. I think 
this experience is also mimicking rapidminer/knime.  It is not clear if user 
can edit the internal sub-DAG in that view.
   
   I do like the experience to jump to another workflow view to edit Marco's 
internal shape, because usually once a macro is created, it will be likely to 
be reused by many workflows. To edit a macro in place of a particular workflow 
would also affect the same macro being used in other workflows (think of 
modifying a function's definition at its callsite, other invocations places of 
the same function would also be affected). The marco is better to be edited in 
a stand alone view. 
   
   However I do not like the dummy source and dummy sink in #5115. I think we 
should prioritize ports and just show input and output ports are sufficient.
   
   ### Other ideas
   - I think it maybe good idea to show "My macros" in the operator properties 
directly and let users drag a Marco operator similarly as dragging a normal 
operator out. This can save the changes on the property panel. Think about 
this: how many times would user select a marco (marker) and uses the drop down 
to change it to another macro implementation? I think if in any cases user need 
to use another macro, they should just drag it out from the operator menu. 
   
   
   
   Ok those are all my high level comments regarding this PR. I am sure we will 
have much more detailed comments going deeper.
   
   So let's talk about breaking into smaller PRs. I still think this PR is too 
large: it has too many design choices that include creating a Marco, reuse a 
previously created macro, representation of a macro, editing a Marco, runtime 
behavior of the Marco. There are other life cycles which are not clear or being 
discussed: removing of a Marco, disable a macro, backend compilation of macro, 
persistence of macro (how to represent it in DB), search of a Marco (indexing) 
.... There are also other things we need to change, such as documentation, 
tests, tutorials of how to use Marco, etc.
   
   The right size of a PR, is we can discuss and review for a single or two 
design choices. Making the scope smaller can make the discussion easier. For 
instance, I would recommend you have one PR about creating a macro (without 
saving it or reuse it), then we can discussion about the experience of creating 
a macro in the PR. Then you can have a PR to persist the Marco into db, we can 
discuss where to store macro information and how to make sure search can index 
it. Then you can have a PR to introduce experience of how user can reuse a 
previously stored macro in a new workflow. Then another PR to edit a previously 
macro. For each feature, you can split into frontend, backend to make size even 
smaller.  I think you can see my point now. Making PRs smaller is not the goal: 
it is because smaller PRs can get proper reviews and discussions on your 
changes.  
   
   You may worry about that: these are partial features, can we merge into 
main? yes we can, just give a config that macro feature is turned off by 
default in main, so it doesn't affect any other features. After we have more 
PRs to implement the whole life cycle, fix bugs, add documentations, we then 
can turn on the feature by default to let user see and use it. 
   
   I hope this make sense to you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to