GitHub user jernejfrank added a comment to the discussion: APIs for updating 
underlying function of a node (2 part question)

Hey, cool ideas.

> ### (1) Updating function logic
> For reference consider the following DAG
> 
> ```
> def A(B: int, C: int) -> int:
>     return B + C
> 
> def B(D: int) -> int:
>     return D + 1
> 
> def C(D: int) -> int:
>     return D + 2
> 
> def D() -> int:
>     return 7
> 
> dr = driver.Builder().with_modules(__import__('__main__')).build()
> dr.execute('A')  # returns 17
> ```

> **Questions:**
> 
> 1. Is there a similar existing mechanism to this in Hamilton?
> 2. If not could it be added?
> 

At the moment Hamilton does everything via modules (there are things brewing to 
enable passing functions in directly, but we're not there yet). There is a way 
to override `B` by defining it in a separate module and then using 
`.allow_module_overrides()` during the building process (check out this 
[example](https://github.com/apache/hamilton/tree/main/examples/module_overrides)).
In your case:

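```python
# new_module.py
from hamilton.function_modifiers import config


# Resolved only when the driver config sets use_alternative_b=True.
@config.when(name='B', use_alternative_b=True)
def B(C: int, D: int) -> int:
    return C + D
```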

and

```python
# main.py
from hamilton import driver

def A(B: int, C: int) -> int:
    return B + C

def B(D: int) -> int:
    return D + 1

def C(D: int) -> int:
    return D + 2

def D() -> int:
    return 7

if __name__ == "__main__":
    import __main__ as main
    import new_module

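    dr = (
        driver.Builder()
        .with_modules(main, new_module)  # order matters here
        .allow_module_overrides()
        # Assumption: the @config.when on new_module.B needs this key to resolve.
        .with_config({"use_alternative_b": True})
        .build()
    )
    dr.execute(['A'])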
```

A word of warning: by default, if you load multiple modules that define 
functions with the same name, Hamilton will throw an error. With the 
`.allow_module_overrides()` flag, order matters 
(`.with_modules(main, new_module) != .with_modules(new_module, main)`), and the 
flag is global, so if you use multiple modules you need to be careful not to 
override something you don't want to.
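To make the ordering point concrete, here is a toy model in plain Python. It is only a sketch of the behaviour described above, not Hamilton's actual implementation: `collect_nodes` and the two stand-in "modules" are hypothetical, and it assumes the later module wins when overrides are allowed.

```python
# Toy model of name resolution across modules; NOT Hamilton's implementation.
from types import SimpleNamespace


def collect_nodes(*modules, allow_overrides=False):
    """Merge the public callables of several 'modules' into one name -> function map."""
    nodes = {}
    for module in modules:
        for name, fn in vars(module).items():
            if not callable(fn) or name.startswith("_"):
                continue
            if name in nodes and not allow_overrides:
                raise ValueError(f"duplicate node name: {name!r}")
            nodes[name] = fn  # with overrides allowed, the later module wins
    return nodes


# Two stand-in "modules" that both define B (hypothetical contents).
main = SimpleNamespace(B=lambda D: D + 1, D=lambda: 7)
new_module = SimpleNamespace(B=lambda D: D * 10)

# Without the flag, the duplicate name is rejected.
try:
    collect_nodes(main, new_module)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# With the flag, the result depends on the order the modules are passed in.
first_order = collect_nodes(main, new_module, allow_overrides=True)
second_order = collect_nodes(new_module, main, allow_overrides=True)
```

In this sketch `first_order["B"]` is the `new_module` version and `second_order["B"]` is the `main` version, which is why swapping the arguments changes the resulting graph.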


> ### (2) Updating function logic in a specific branch

> **Questions:**
> 
> 1. Is there a good existing way to achieve this? I'm thinking there's 
> probably a way using the existing `Parallelizable` and `Collect` mechanism 
> which can already dynamically generate new nodes to add to the graph, and 
> which could be used to do the bifurcation I mentioned?
> 2. Would it be possible to add the ability to do this with something more 
> like the fn_graph style api, something like `driver.update(X=X__prime, 
> in_path=['B']) `

I am not sure I fully understand the second part, but I can give it a go; let 
me know if I missed something. Hamilton will create the full graph, but it 
will only walk / execute nodes that are relevant for a specified output (in 
your case `A`). You cannot really control which branch of the graph will 
execute, only the nodes that lead to the requested output. In your case, `B` 
will always execute because `A` needs it (same for `C`), all the way down to 
`X`. What you do have control over is what the individual nodes are, and there 
are a couple of ways to organise your code to manage multiple potential branches:

1. Define "static" nodes (things that you think will not change) in a separate 
module and organise "dynamic" functions into different modules. At build time 
you can then choose which module to load to create the desired nodes, like 
having a feature catalog.
2. Use the `subdag` decorator to re-use functions 
([example](https://github.com/apache/hamilton/tree/main/examples/reusing_functions)) 
and run a custom DAG essentially as a node.
3. Similar to `subdag`, for simpler stuff the `pipe` family might be easier to 
use 
([example](https://github.com/apache/hamilton/tree/main/examples/scikit-learn/species_distribution_modeling)).
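As a rough illustration of the "only nodes upstream of the requested output get executed" point, here is a small standalone sketch. It does not use Hamilton at all: the `execute` helper, the `NODES` registry and the extra `unused` node are hypothetical, and it resolves dependencies from parameter names in the same spirit as the DAG in the question.

```python
# Toy sketch of "only execute what the requested output needs"; not Hamilton's internals.
import inspect

executed = []  # records which nodes actually ran


def A(B: int, C: int) -> int:
    executed.append("A")
    return B + C


def B(D: int) -> int:
    executed.append("B")
    return D + 1


def C(D: int) -> int:
    executed.append("C")
    return D + 2


def D() -> int:
    executed.append("D")
    return 7


def unused(D: int) -> int:
    # Part of the graph, but not upstream of A, so it should never run.
    executed.append("unused")
    return D * 100


NODES = {fn.__name__: fn for fn in (A, B, C, D, unused)}


def execute(target, cache=None):
    """Compute `target` by first computing the nodes named by its parameters."""
    cache = {} if cache is None else cache
    if target not in cache:
        fn = NODES[target]
        kwargs = {p: execute(p, cache) for p in inspect.signature(fn).parameters}
        cache[target] = fn(**kwargs)
    return cache[target]


result = execute("A")
```

Requesting `A` pulls in `B`, `C` and `D` (each once, thanks to the cache) while `unused` is never called. Swapping which function is registered under a given name changes the node itself, but not which nodes are walked.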

GitHub link: 
https://github.com/apache/hamilton/discussions/1397#discussioncomment-14567925
