GitHub user gidhubuser255 added a comment to the discussion: Parallelization
Enhancement Ideas
2) OK, so here is how I would foresee this working:
Normally a Hamilton function's dependencies are determined during graph build
by inspecting the function's arguments and mapping each argument name to a
function. With the proposed "driver aware" functions, graph build would instead
invoke the function with a pseudo-driver object that records a dependency each
time driver.call('dep') is called, and the function would exit when it reaches
Collect(). During execution, the function would be invoked with a different
pseudo-driver object that pulls each result from a pre-populated dict. It is
easy to check whether a function is driver aware, so the graph builder can
switch between the standard dependency collection method and the new proposed
method.
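As a rough sketch of that switch, assuming a function counts as driver aware
when it has a parameter named `driver` (this convention and the helper names
`is_driver_aware` and `standard_dependencies` are made up for illustration,
they are not part of Hamilton):

```python
import inspect


def is_driver_aware(fn) -> bool:
    # Assumed convention: a parameter literally named "driver" marks the
    # function as driver aware.
    return "driver" in inspect.signature(fn).parameters


def standard_dependencies(fn) -> list:
    # Standard method: every argument name is treated as a dependency name.
    return list(inspect.signature(fn).parameters)


def plain_node(B: int, C: int) -> int:
    return B + C


def driver_aware_node(driver):
    ...
```

A graph builder could call `is_driver_aware` first and only fall back to
`standard_dependencies` when it returns False.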
From an end user POV it would look something like this. Take the existing
Hamilton function:
```
def A(B: int, C: int):
    return B + C
```
Could be rewritten as:
```
def A(driver: Driver):
    # Driver would need to be a Protocol or something, since graph build and
    # execution get different pseudo-drivers. Doesn't really matter though,
    # since the type here wouldn't be used for anything.
    # Not sure how best to do the types, something like this maybe:
    b = driver.call('B', type=int)
    # Or maybe the type could be specified like this:
    c = driver.type(int).call('C')
    Collect()  # exit here in graph build stage, ignore in execution stage
    return b + c
```
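To make the two stages concrete, here is a minimal sketch of how the two
pseudo-drivers could behave. Everything in it is an assumption for
illustration: `RecordingDriver`, `ReplayDriver`, `collect_dependencies`,
`execute`, and the exception-based `Collect()` are hypothetical names, not
existing Hamilton APIs.

```python
class _CollectReached(Exception):
    """Raised by Collect() during graph build to stop the function early."""


_in_graph_build = False  # hypothetical module-level flag set by the builder


def Collect():
    # Exits the function during graph build; does nothing during execution.
    if _in_graph_build:
        raise _CollectReached()


class RecordingDriver:
    """Graph-build pseudo-driver: records each requested dependency."""

    def __init__(self):
        self.dependencies = []

    def call(self, name, type=None):
        self.dependencies.append(name)
        return None  # placeholder; must not be used before Collect()


class ReplayDriver:
    """Execution pseudo-driver: serves results from a pre-populated dict."""

    def __init__(self, results):
        self.results = results

    def call(self, name, type=None):
        return self.results[name]


def collect_dependencies(fn):
    global _in_graph_build
    recorder = RecordingDriver()
    _in_graph_build = True
    try:
        fn(recorder)
    except _CollectReached:
        pass  # expected: the function stopped at Collect()
    finally:
        _in_graph_build = False
    return recorder.dependencies


def execute(fn, results):
    return fn(ReplayDriver(results))


def A(driver):
    b = driver.call('B', type=int)
    c = driver.call('C', type=int)
    Collect()
    return b + c


deps = collect_dependencies(A)
value = execute(A, {'B': 1, 'C': 2})
```

In this sketch, a driver-aware function that forgets to call Collect() would
keep running during graph build with the None placeholders, so a real version
would probably want to detect that case.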
In this case there would be no point in doing so, but you could then add
arbitrary arguments to A's signature and pass them through as regular
parameters to other calls. This means the Node object definition would also
need to contain these parameter values. You can also see here that the concept
need not be restricted to parallelization. It could also be used in
non-parameterized cases where you want some control flow to determine which
dependencies to call, e.g.:
```
def A(driver):
    b = driver.call('B', type=int)
    if <some condition>:
        c = driver.call('C1', type=int)
    else:
        c = driver.call('C2', type=int)
    Collect()
    return b + c
```
But in most cases this can already be handled by config.when.
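A small sketch of what graph build would record in the control-flow case. The
names are hypothetical (`RecordingDriver`, `recorded_dependencies`, `make_A`,
`use_c1`), and Collect() here is a build-stage-only stand-in that always stops
the function. One assumption worth calling out: the condition has to be
evaluable at graph build time, since the values returned by driver.call are
only placeholders at that point.

```python
class _CollectReached(Exception):
    pass


def Collect():
    # Build-stage-only stand-in: always stops the function.
    raise _CollectReached()


class RecordingDriver:
    def __init__(self):
        self.dependencies = []

    def call(self, name, type=None):
        self.dependencies.append(name)
        return None  # placeholder value during graph build


def recorded_dependencies(fn):
    recorder = RecordingDriver()
    try:
        fn(recorder)
    except _CollectReached:
        pass
    return recorder.dependencies


def make_A(use_c1):
    # use_c1 stands in for <some condition>; it is known at build time.
    def A(driver):
        b = driver.call('B', type=int)
        if use_c1:
            c = driver.call('C1', type=int)
        else:
            c = driver.call('C2', type=int)
        Collect()
        return b + c
    return A


deps_when_true = recorded_dependencies(make_A(True))
deps_when_false = recorded_dependencies(make_A(False))
```

Only the branch that actually runs during graph build ends up in the recorded
dependencies.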
GitHub link:
https://github.com/apache/hamilton/discussions/1412#discussioncomment-14821836