GitHub user mitstake added a comment to the discussion: Parallelization
Enhancement Ideas
Regarding equivalence, it's probably easiest to show it with the simple ABC
example.
```
def A():
    return 3

def B(A):
    return A / 3

def C(A, B):
    return A ** 2 * B
```
This works in both Hamilton and darl. However, with darl you can also
equivalently define the above as:
```
def A():
    return 3

def B(ngn):
    a = ngn.A()
    ngn.collect()
    return a / 3

def C(ngn):
    a = ngn.A()
    b = ngn.B()
    ngn.collect()
    return a ** 2 * b
```
(You can also mix the styles throughout your different functions)
This way, instead of parsing the signature to identify the dependencies, you
call the function itself during the graph build step. However, in the graph
build step you only run up to `ngn.collect()` and then you exit. During the
build step, `ngn` just records the names called on it, which are then traversed
to collect their dependencies recursively until your graph is built.
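To make the build step concrete, here is a minimal toy sketch of the idea. It is not darl's actual implementation: `Recorder`, `CollectSignal`, `direct_deps` and `build_graph` are all hypothetical names made up for illustration, and the only assumption carried over is that `collect()` stops the function once its calls have been recorded.

```python
import inspect


class CollectSignal(Exception):
    """Raised by collect() to stop a function once its calls are recorded."""


class Recorder:
    """Hypothetical stand-in for `ngn` during the graph build step."""

    def __init__(self):
        self.called = []

    def __getattr__(self, name):
        # Only reached for unknown attributes, e.g. rec.A or rec.B.
        def stub(*args):
            self.called.append(name)
            return None  # no real value exists yet at build time
        return stub

    def collect(self):
        raise CollectSignal


def direct_deps(func):
    params = list(inspect.signature(func).parameters)
    if params[:1] != ["ngn"]:
        return params  # signature style: parameter names are the dependencies
    rec = Recorder()
    try:
        func(rec)  # runs only up to rec.collect()
    except CollectSignal:
        pass
    return rec.called


def build_graph(target, registry):
    """Map each reachable node name to the names it depends on."""
    graph, stack = {}, [target]
    while stack:
        name = stack.pop()
        if name in graph:
            continue
        graph[name] = direct_deps(registry[name])
        stack.extend(graph[name])
    return graph


def A():
    return 3

def B(ngn):
    a = ngn.A()
    ngn.collect()
    return a / 3

def C(ngn):
    a = ngn.A()
    b = ngn.B()
    ngn.collect()
    return a ** 2 * b


registry = {"A": A, "B": B, "C": C}
print(build_graph("C", registry))  # {'C': ['A', 'B'], 'B': ['A'], 'A': []}
```

Because `direct_deps` falls back to the parameter names when the first parameter is not `ngn`, the same toy also handles functions written in the signature style.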
The benefit of doing it this way is that you can parameterize your functions,
which among other things lets you use basic constructs like for loops to build
nodes in your graph. E.g.:
```
def USGDP(ngn):
    gdp = 0
    for state in ALL_STATES:
        gdp += ngn.StateGDP(state)
    ngn.collect()
    return gdp

def StateGDP(ngn, state):
    return len(state)
```
This will create a graph like:
<img width="638" height="202" alt="image"
src="https://github.com/user-attachments/assets/be70bcb7-6934-4b0b-a127-3760e9280387"
/>
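As a rough illustration of how a loop turns into one node per argument, here is a toy build step that keys nodes by `(name, args)`. This is a sketch under assumptions, not darl's internals: `Recorder`, `Placeholder`, `CollectSignal` and `build_graph` are hypothetical, and `ALL_STATES` is made-up sample data.

```python
ALL_STATES = ["NY", "Ohio", "Texas"]  # hypothetical sample data


class CollectSignal(Exception):
    """Raised by collect() to end the build-time run of a function."""


class Placeholder:
    """Build-time stand-in for a result; absorbs `+` so the loop keeps going."""

    def __add__(self, other):
        return self

    __radd__ = __add__


class Recorder:
    """Hypothetical build-time `ngn`: logs (name, args) for every call."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def stub(*args):
            self.calls.append((name, args))
            return Placeholder()
        return stub

    def collect(self):
        raise CollectSignal


def USGDP(ngn):
    gdp = 0
    for state in ALL_STATES:
        gdp += ngn.StateGDP(state)
    ngn.collect()
    return gdp

def StateGDP(ngn, state):
    return len(state)


def build_graph(target, registry):
    """Map each (name, args) node to the (name, args) nodes it depends on."""
    graph, stack = {}, [target]
    while stack:
        node = stack.pop()
        if node in graph:
            continue
        name, args = node
        rec = Recorder()
        try:
            # In this toy, a function that never calls collect() runs to the end.
            registry[name](rec, *args)
        except CollectSignal:
            pass
        graph[node] = rec.calls
        stack.extend(rec.calls)
    return graph


graph = build_graph(("USGDP", ()), {"USGDP": USGDP, "StateGDP": StateGDP})
for node, deps in graph.items():
    print(node, "<-", deps)
```

With three states this yields four nodes: one `StateGDP` node per state, each feeding the single `USGDP` node.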
With this there's no need for queues or anything. It's not really
parallelization; it's just another way to define nodes in the graph. Thus
"nested parallelization" is nothing special: you can just define whatever loops
wherever you want. Actual parallel execution is left to whatever executor you
want to plug in, like dask or ray.
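To illustrate the split between graph definition and execution, here is a small sketch that runs an already-built graph with a standard-library thread pool standing in for dask or ray. Everything in it is an assumption for illustration: the hard-coded `graph`, the `compute` and `run` helpers, and the sample `STATES` are hypothetical, and `graphlib` requires Python 3.9+.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical, already-built graph: node -> nodes it depends on.
STATES = ["NY", "Ohio", "Texas"]
state_nodes = [("StateGDP", s) for s in STATES]
graph = {node: [] for node in state_nodes}
graph["USGDP"] = state_nodes


def compute(node, results):
    """Evaluate one node, reading dependency values from `results`."""
    if node == "USGDP":
        return sum(results[dep] for dep in graph[node])
    _, state = node
    return len(state)


def run(graph, max_workers=4):
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every node returned here has all of its dependencies computed,
            # so the whole batch can be submitted at once.
            ready = sorter.get_ready()
            futures = {node: pool.submit(compute, node, results) for node in ready}
            for node, future in futures.items():
                results[node] = future.result()
                sorter.done(node)
    return results


print(run(graph)["USGDP"])
```

The scheduler only sees nodes and edges, so it does not matter whether the `StateGDP` nodes came from a loop, a nested loop, or were written out by hand.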
GitHub link:
https://github.com/apache/hamilton/discussions/1412#discussioncomment-15662461