GitHub user jernejfrank added a comment to the discussion: APIs for updating
underlying function of a node (2 part question)
Hey, cool ideas.
> ### (1) Updating function logic
> For reference consider the following DAG
>
> ```
> def A(B: int, C: int) -> int:
>     return B + C
>
> def B(D: int) -> int:
>     return D + 1
>
> def C(D: int) -> int:
>     return D + 2
>
> def D() -> int:
>     return 7
>
> dr = driver.Builder().with_modules(__import__('__main__')).build()
> dr.execute('A') # returns 17
> ```
> **Questions:**
>
> 1. Is there a similar existing mechanism to this in Hamilton?
> 2. If not could it be added?
>
At the moment Hamilton does everything via modules (there are things brewing to
enable passing functions in directly, but that's not there yet). You can
override `B` by defining it in a separate module and then calling
`.allow_module_overrides()` during the build (check out this
[example](https://github.com/apache/hamilton/tree/main/examples/module_overrides)).
In your case:
```python
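# new_module.py
from hamilton.function_modifiers import config

# Resolved only when the driver config contains use_alternative_b=True.
@config.when(name='B', use_alternative_b=True)
def B(C: int, D: int) -> int:
    return C + D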
```
and
```python
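# main.py
from hamilton import driver

def A(B: int, C: int) -> int:
    return B + C

def B(D: int) -> int:
    return D + 1

def C(D: int) -> int:
    return D + 2

def D() -> int:
    return 7

if __name__ == "__main__":
    import __main__ as main
    import new_module

    dr = (
        driver.Builder()
        .with_modules(main, new_module)  # later modules override earlier ones
        .with_config({"use_alternative_b": True})  # lets @config.when in new_module.py resolve
        .allow_module_overrides()
        .build()
    )
    dr.execute(['A'])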
```
A word of warning: if you use multiple modules that define functions with the
same name, Hamilton will throw an error by default. With the
`.allow_module_overrides()` flag, order matters: later modules override earlier
ones, so `.with_modules(main, new_module) != .with_modules(new_module, main)`.
The flag is also global, so if you use multiple modules you need to be careful
not to override something you don't want to.
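
To make the ordering point concrete, here is a small pure-Python toy model of the behaviour described above. It is not Hamilton's implementation: `collect_functions`, `B_main` and `B_new` are made-up names, and `SimpleNamespace` objects stand in for real modules. It only mimics "duplicate names are an error unless overrides are allowed, and then the later module wins".

```python
from types import SimpleNamespace


def collect_functions(modules, allow_overrides=False):
    """Toy stand-in for graph building: map node name -> function."""
    nodes = {}
    for module in modules:
        for name, fn in vars(module).items():
            if not callable(fn):
                continue
            if name in nodes and not allow_overrides:
                raise ValueError(f"duplicate node name: {name!r}")
            nodes[name] = fn  # a later module replaces an earlier one
    return nodes


def B_main(D: int) -> int:  # hypothetical: the original B
    return D + 1


def B_new(C: int, D: int) -> int:  # hypothetical: the overriding B
    return C + D


# Namespaces standing in for main.py and new_module.py.
main = SimpleNamespace(B=B_main)
new_module = SimpleNamespace(B=B_new)

nodes_ab = collect_functions([main, new_module], allow_overrides=True)
nodes_ba = collect_functions([new_module, main], allow_overrides=True)
```

In this sketch `nodes_ab["B"]` is the new function while `nodes_ba["B"]` is the original one, and calling `collect_functions([main, new_module])` without the flag raises. Because the flag applies to every name, any other accidental duplicate would be silently replaced in the same way.
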
> ### (2) Updating function logic in a specific branch
> **Questions:**
>
> 1. Is there a good existing way to achieve this? I'm thinking there's
> probably a way using the existing `Parallelizable` and `Collect` mechanism
> which can already dynamically generate new nodes to add to the graph, and
> which could be used to do the bifurcation I mentioned?
> 2. Would it be possible to add the ability to do this with something more
> like the fn_graph style api, something like `driver.update(X=X__prime,
> in_path=['B']) `

I am not sure I fully understand the second part, but I'll give it a go; let me
know if I missed something. Hamilton creates the full graph, but it only walks /
executes the nodes that are relevant for the requested output (in your case
`A`). You cannot really control which branch of the graph executes; only the
nodes leading to the requested output will run. In your case, `B` will always
execute because `A` needs it (same for `C`), all the way down to `X`. What you
do have control over is what the individual nodes are, and there are a couple
of ways to organise your code to manage multiple potential branches:
1. Define "static" nodes (things that you think will not change) in a separate
module and organise "dynamic" functions into different modules. At build time
you can then choose which module to load to create the desired nodes, much like
having a feature catalog.
2. Use the `subdag` decorator to re-use functions
([example](https://github.com/apache/hamilton/tree/main/examples/reusing_functions))
and run custom DAGs essentially as a node.
3. Similar to `subdag`, but for simpler cases the `pipe` family might be easier
to use
([example](https://github.com/apache/hamilton/tree/main/examples/scikit-learn/species_distribution_modeling)).
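
As a rough illustration of option 1, and of the point that only nodes upstream of the requested output run, here is another pure-Python toy. Again, this is a sketch and not how Hamilton works internally: `build_nodes`, `execute`, `unused`, `default_b` and `alternative_b` are hypothetical names, and namespaces stand in for modules chosen at build time.

```python
import inspect
from types import SimpleNamespace


def build_nodes(*modules):
    """Merge the callables of each namespace into one name -> function map."""
    nodes = {}
    for module in modules:
        nodes.update({k: v for k, v in vars(module).items() if callable(v)})
    return nodes


def execute(nodes, target, cache, executed):
    """Compute `target`, recursing only into the parameters it names."""
    if target not in cache:
        fn = nodes[target]
        kwargs = {
            param: execute(nodes, param, cache, executed)
            for param in inspect.signature(fn).parameters
        }
        cache[target] = fn(**kwargs)
        executed.append(target)  # record that this node actually ran
    return cache[target]


# "Static" nodes that are not expected to change.
def A(B: int, C: int) -> int:
    return B + C


def C(D: int) -> int:
    return D + 2


def D() -> int:
    return 7


def unused(D: int) -> int:  # part of the graph, but not upstream of A
    return D * 100


static = SimpleNamespace(A=A, C=C, D=D, unused=unused)

# Two interchangeable "dynamic" namespaces, each providing its own B.
default_b = SimpleNamespace(B=lambda D: D + 1)
alternative_b = SimpleNamespace(B=lambda C, D: C + D)

executed = []
result_default = execute(build_nodes(static, default_b), "A", {}, executed)

executed_alt = []
result_alt = execute(build_nodes(static, alternative_b), "A", {}, executed_alt)
```

Here `result_default` is 17 and `result_alt` is 25, and in both runs `unused` never appears in the executed list. The "branch" is decided by which namespace gets loaded when the graph is built, not by anything at execution time.
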
GitHub link:
https://github.com/apache/hamilton/discussions/1397#discussioncomment-14567925