OK, so not even I was aware of 2 of those things. Hans, perhaps your reply could be put into the docs?
Thad
https://www.linkedin.com/in/thadguidry/
https://calendly.com/thadguidry/

On Tue, Nov 19, 2024 at 6:32 PM hansva (via GitHub) <[email protected]> wrote:

> GitHub user hansva added a comment to the discussion: HOP Sizing
>
> Hi @xProga,
>
> I'm afraid the answer to all this is "it depends". That's why we don't
> have these guidelines.
>
> Some of the things that are known:
> - Each action/transform (or transform copy) will create a processor thread
> - This means the maximum number of active transforms equals the maximum
>   number of threads the CPU supports (-1 for the main process). When the
>   number of active transforms is higher, thread switching will occur
> - So our recommendation is to keep your pipelines as compact as possible
>   (~30 transforms sounds like a sane rule)
> - Each transform has a configurable buffer
> - Each transform (copy) has an input buffer. Compared to other tools, we
>   do not load all data into memory and move it from one transform to the
>   next. We have a buffer system (default 10K rows); transforms fill those
>   buffers and get pushback signals to stop processing/fetching data
> - This means the amount of active memory = rows in buffers x (columns x
>   data type size)
> - There are a couple of exceptions, e.g. `Sort Rows` needs to have all
>   data, so it will load all rows, but it has a configurable buffer to
>   spool data off to disk
>
> Hop Web:
> Users can run workflows and pipelines inside Hop Web; it is not a
> client/server application. This does imply that these instances need enough
> resources to run the processes locally.
>
> Hop Server:
> This one is mainly used as a remote extension to local development. It is
> a stateless server, so workloads do not survive restarts. It does not have
> scheduling.
>
> We recommend using short-lived containers for actual workload scheduling
> using an orchestration tool of your choice.
>
> Hope this helps.
>
> GitHub link:
> https://github.com/apache/hop/discussions/4586#discussioncomment-11303562
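
For anyone who wants to play with the numbers from Hans's reply, here is a rough back-of-the-envelope sketch in Python. It only restates the two rules of thumb quoted above (buffer memory = rows in buffers x columns x data type size, and transform copies versus CPU threads minus one). The per-type byte sizes, the function names, and the idea of counting one full buffer per transform copy are my own assumptions for illustration, not anything taken from Hop itself.

```python
import os

# Assumed average in-memory size per value, in bytes. These are illustrative
# guesses, NOT measurements of Hop's internal row format.
ASSUMED_BYTES_PER_TYPE = {
    "Integer": 8,
    "Number": 8,
    "Boolean": 1,
    "Date": 8,
    "String": 50,
}

DEFAULT_BUFFER_ROWS = 10_000  # the "default 10K rows" mentioned in the reply


def estimate_row_bytes(column_types):
    """Sum the assumed size of every column in a single row."""
    return sum(ASSUMED_BYTES_PER_TYPE[t] for t in column_types)


def estimate_buffer_bytes(column_types, buffer_count, buffer_rows=DEFAULT_BUFFER_ROWS):
    """Worst case: every one of `buffer_count` input buffers is completely full."""
    return buffer_rows * buffer_count * estimate_row_bytes(column_types)


def exceeds_cpu_threads(transform_copies, cpu_threads=None):
    """True when there are more transform copies than CPU threads minus one."""
    if cpu_threads is None:
        cpu_threads = os.cpu_count() or 1
    return transform_copies > max(cpu_threads - 1, 1)


if __name__ == "__main__":
    # Hypothetical pipeline: 30 transforms, rows of 5 integers and 5 strings.
    columns = ["Integer"] * 5 + ["String"] * 5
    total = estimate_buffer_bytes(columns, buffer_count=30)
    print(f"Estimated buffer memory: {total / 1_000_000:.0f} MB")
    print("More copies than threads:", exceeds_cpu_threads(30))
```

With those assumed sizes, 30 full buffers of 10K rows come to roughly 87 MB, which ignores `Sort Rows` and the other exceptions that hold all rows. Treat it as an order-of-magnitude check only; real row sizes depend heavily on string lengths and JVM overhead.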
