GitHub user gidhubuser255 added a comment to the discussion: Parallelization 
Enhancement Ideas

2) Ok, so here is how I would foresee this working:

Normally a Hamilton function's dependencies are determined during graph build 
by inspecting the function arguments and mapping each argument name to a 
function. With the newly proposed "driver aware" functions, graph build would 
instead invoke the function with a pseudo-driver object that records a 
dependency each time it's called with driver.call('dep'), and then exits the 
function when it reaches the Collect. During execution you would then invoke 
the function with a different pseudo-driver object that pulls each result from 
a pre-populated dict. It's also easy to check whether a function is driver 
aware, so the graph builder can switch between the standard dependency 
collection method and the newly proposed one.
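
To make the two stages concrete, here is a minimal sketch of what I have in 
mind. All of the names here (RecordingDriver, ExecutionDriver, 
discover_dependencies, is_driver_aware) are made up for illustration and are 
not existing Hamilton APIs. I'm also assuming Collect is exposed as 
driver.collect() and that a function is "driver aware" when its first 
parameter is named driver, both of which are just one possible convention.

```python
import inspect
from typing import Any, Callable, Dict, List


class _StopAtCollect(Exception):
    """Raised at collect() during graph build to leave the function early."""


class RecordingDriver:
    """Hypothetical graph-build pseudo-driver: only records what is requested."""

    def __init__(self) -> None:
        self.dependencies: List[str] = []

    def call(self, name: str, type: Any = None) -> Any:
        self.dependencies.append(name)
        return None  # no real values exist at build time

    def collect(self) -> None:
        raise _StopAtCollect()


class ExecutionDriver:
    """Hypothetical execution pseudo-driver: serves pre-computed results."""

    def __init__(self, results: Dict[str, Any]) -> None:
        self._results = results

    def call(self, name: str, type: Any = None) -> Any:
        return self._results[name]

    def collect(self) -> None:
        pass  # nothing to do at execution time


def is_driver_aware(fn: Callable) -> bool:
    """Assumed convention: the first parameter is named 'driver'."""
    params = list(inspect.signature(fn).parameters)
    return bool(params) and params[0] == "driver"


def discover_dependencies(fn: Callable) -> List[str]:
    """Run fn up to collect() and return the dependencies it asked for."""
    recorder = RecordingDriver()
    try:
        fn(recorder)
    except _StopAtCollect:
        pass
    return recorder.dependencies


def A(driver):
    b = driver.call('B', type=int)
    c = driver.call('C', type=int)
    driver.collect()
    return b + c
```

With this, discover_dependencies(A) gives the dependency names for the graph 
build, and A(ExecutionDriver({'B': 1, 'C': 2})) runs the body for real. In 
this sketch a driver aware function that never reaches collect() would carry 
on past the calls with None values during graph build, so the real thing 
would need to handle that somehow.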

From an end user POV it would look something like this. Take the existing 
Hamilton function:

```
def A(B: int, C: int):
    return B + C
```

Could be rewritten as:

```
# Driver type would need to be a Protocol or something, since graph build and
# execution get different pseudo-drivers. Doesn't really matter though, since
# the type here wouldn't be used for anything.
def A(driver: Driver):
    # not sure how best to do the types, something like this maybe
    b = driver.call('B', type=int)
    # or maybe type could be specified like this
    c = driver.type(int).call('C')
    Collect()  # exit here in graph build stage, ignore in execution stage
    return b + c
```

In this case there would be no point in doing so, but you could then add any 
arguments to A's signature and pass them through as regular parameters to 
other calls. This means that the Node object definition also needs to contain 
these parameter values. You can also see here that the concept doesn't need to 
be restricted to parallelization. It could still be used in non-parameterized 
cases where you want some control flow to determine which dependencies to 
call, e.g.:

```
def A(driver):
    b = driver.call('B', type=int)
    if some_condition:  # placeholder for whatever check picks the branch
        c = driver.call('C1', type=int)
    else:
        c = driver.call('C2', type=int)
    Collect()
    return b + c
```

That said, most such cases can already be handled by config.when.
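
Going back to the point about passing regular parameters through to other 
calls, here is a rough sketch of why the Node definition would need to carry 
the parameter values. Everything in it is hypothetical: NodeKey, the keyword 
arguments accepted by call, driver.collect() standing in for Collect, and the 
discover helper are only there to illustrate the idea. None of it is existing 
Hamilton API.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass(frozen=True)
class NodeKey:
    """Hypothetical node identity: function name plus bound parameter values."""
    name: str
    params: Tuple[Tuple[str, Any], ...] = ()


class _StopAtCollect(Exception):
    """Raised at collect() during graph build to leave the function early."""


class RecordingDriver:
    """Hypothetical graph-build pseudo-driver that records parameterized calls."""

    def __init__(self) -> None:
        self.requested: List[NodeKey] = []

    def call(self, name: str, **params: Any) -> None:
        # Sort the parameters so the same call always yields the same key.
        self.requested.append(NodeKey(name, tuple(sorted(params.items()))))

    def collect(self) -> None:
        raise _StopAtCollect()


def A(driver, n: int):
    # n is a regular parameter; each call to B is a distinct node.
    parts = [driver.call('B', index=i) for i in range(n)]
    driver.collect()
    return sum(parts)


def discover(fn: Callable, **params: Any) -> List[NodeKey]:
    """Run fn with its regular parameters up to collect()."""
    recorder = RecordingDriver()
    try:
        fn(recorder, **params)
    except _StopAtCollect:
        pass
    return recorder.requested
```

Here discover(A, n=2) records two separate B nodes, one per index value, 
which is the fan-out you would then be free to run in parallel. Keying nodes 
on name plus parameter values is just one way to keep them distinct.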


GitHub link: 
https://github.com/apache/hamilton/discussions/1412#discussioncomment-14821836
