Best practices for sharing/maintaining large resource files for Spark jobs

Dmitry Goldenberg Mon, 11 Jan 2016 12:14:35 -0800

We have a number of Spark jobs deployed, along with a few large resource
files such as a dictionary for lookups or a statistical model.


Right now, these files are bundled into the Spark job jars themselves, which
will eventually make the jars too bloated to deploy.

What are some best practices for maintaining and sharing large resource
files like these across jobs?

Thanks.
