I need to load a PFA (Portable Format for Analytics) document that can be
around 30 GB and later process it with Hadrian, the JVM implementation of
PFA (https://github.com/opendatagroup/hadrian).

I would like to execute this transformation step on a single, specific
worker of the cluster, since I don't want to load 30 GB on every worker
node. Unfortunately, Hadrian cannot be run in a distributed way.

So my question is: is there a way to do some routing with Flink so that
this particular transformation step is always executed on the same worker
node?
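
For concreteness, here is a minimal sketch of what I have in mind, using
Flink's Java DataStream API. PfaScoringFunction and loadPfaEngine are
placeholders of mine, not Hadrian's actual API; the idea is to load the
model once in open() and pin the operator to one parallel instance:

    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PfaScoringJob {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> input = env.fromElements("record-1", "record-2");

            // Force a single parallel instance of this operator, so the
            // 30 GB PFA document is loaded exactly once (in one task slot)
            // instead of on every worker.
            input.map(new PfaScoringFunction("/data/model.pfa"))
                 .setParallelism(1)
                 .print();

            env.execute("PFA scoring with Hadrian");
        }

        public static class PfaScoringFunction extends RichMapFunction<String, String> {
            private final String pfaPath;
            private transient Object engine; // Hadrian PFA engine, loaded lazily

            PfaScoringFunction(String pfaPath) {
                this.pfaPath = pfaPath;
            }

            @Override
            public void open(Configuration parameters) {
                // Hypothetical helper: load the PFA document into a Hadrian
                // engine once per parallel instance (here: once in total).
                this.engine = loadPfaEngine(pfaPath);
            }

            @Override
            public String map(String record) {
                // Score the record with the Hadrian engine (placeholder).
                return record + " -> scored";
            }

            private static Object loadPfaEngine(String path) {
                // Placeholder for something like Hadrian's PFAEngine factory.
                return new Object();
            }
        }
    }

As far as I understand, setParallelism(1) only guarantees a single
parallel instance (so the model is loaded once), but not which
TaskManager it lands on; that last part is exactly what I'm unsure about.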

Perhaps my approach is completely wrong, so if anybody has any suggestions
I would be more than happy to hear them. :)

Thanks
