eric-czech commented on issue #1358:
URL: https://github.com/apache/hamilton/issues/1358#issuecomment-3175800355

   > can you specify a little more on what the use-case is?
   
   Certainly.  It would involve small- to medium-scale workflows on large 
datasets for:
   
   - Running data integration and processing workflows via Pandas, Spark, Dask, 
Xarray, Ray, etc.
   - Running training jobs on SLURM clusters and/or Neocloud providers
   - Running inference and post-processing pipelines (with or without 
accelerators)
   
   Hamilton seems like an obvious fit for handling the large number of small 
steps involved in, e.g., pre-processing.  Longer term, I'm also interested in 
seeing to what extent it may be helpful for orchestrating work across mixed 
hardware and multiple cloud providers.  To be clear, I don't expect it to do 
much other than solve for building DAGs and offering some reasonable semantics 
over retries and caching of expensive results.  Provisioning, configuration, 
pickling functions, validating schemas, etc. are all things I would expect 
other tools to do -- I'm only really looking to Hamilton to define workflows 
via Python rather than a DSL or API complicated enough to be called a DSL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
