Hi there,

I'm working on getting some complex Scalding flows running on top of Flink. I've gotten pretty far along with Cyrille Chépélov's Cascading 3 branch. However, the Distributed Cache support in Scalding assumes Hadoop, and our jobs use it heavily.
I've been working on adding that support. I've at least succeeded in registering the files by proxying between the Scalding/Hadoop config and the Flink Config and back again. Where I'm stuck is getting them back out inside the running flow: I just cannot figure out how to get an instance of org.apache.flink.api.common.cache.DistributedCache so I can call getFile(), or, for that matter, how to get a RuntimeContext at all inside the Cascading/Scalding job. I've spent a bunch of time poring over the cascading-flink code and I still can't figure it out. Can somebody point me in the right direction? (A small sketch of the plain-Flink pattern I'm after is at the bottom of this mail.) I'll post the code as a Scalding PR once I have it working.

Thanks,
-M

--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: mil...@adfin.com
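P.S. In case it helps frame the question, here is roughly what I mean by the plain-Flink pattern. This is just a sketch with a placeholder file path and symbolic name; registerCachedFile() and getRuntimeContext().getDistributedCache().getFile() are the standard Flink calls I'm trying to reach from inside the planned Cascading operators, where I don't control the Flink function classes.

import org.apache.flink.api.scala._
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.configuration.Configuration
import java.io.File

object DistCacheSketch {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Register a file under a symbolic name; Flink ships it to every TaskManager.
    // (Path and name here are just placeholders.)
    env.registerCachedFile("hdfs:///tmp/lookup.txt", "lookup")

    val words = env.fromElements("a", "b", "c")

    val tagged = words.map(new RichMapFunction[String, String] {
      private var cached: File = _

      override def open(parameters: Configuration): Unit = {
        // The DistributedCache hangs off the RuntimeContext, which is only
        // reachable from inside a rich function at runtime; this is exactly
        // the handle I can't get to from the Cascading/Scalding side.
        cached = getRuntimeContext.getDistributedCache.getFile("lookup")
      }

      override def map(value: String): String =
        s"$value -> ${cached.getAbsolutePath}"
    })

    tagged.print()
  }
}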