Hi Yan,

Happy to contribute something back to this amazing project! I'll go through the guide and submit a patch tomorrow.
Best,
José Luis

On Thu, May 21, 2015 at 11:17 PM, Yan Fang <yanfang...@gmail.com> wrote:

> Hi José,
>
> Thank you. If you can contribute a patch for this fix (SAMZA-688
> <https://issues.apache.org/jira/browse/SAMZA-688>), it would be very
> helpful. And here
> <https://cwiki.apache.org/confluence/display/SAMZA/Contributor%27s+Corner>
> is the guide for contributing.
>
> Cheers,
>
> Fang, Yan
> yanfang...@gmail.com
>
> On Thu, May 21, 2015 at 8:38 PM, Yi Pan <nickpa...@gmail.com> wrote:
>
> > Hi, Jose,
> >
> > Thanks a lot! I have opened a JIRA to support that: SAMZA-688.
> >
> > -Yi
> >
> > On Thu, May 21, 2015 at 8:03 PM, José Barrueta <j...@stormpath.com> wrote:
> >
> > > Hi all,
> > >
> > > Once we figured out the problem, we were able to easily come up with a
> > > solution for it.
> > >
> > > Basically, we want to be able to set the `yarn.package.path` property to
> > > look for an artifact over `https`. When we did this, we ran into this
> > > exception:
> > >
> > > Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
> > >     at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
> > >     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
> > >     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
> > >     at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
> > >     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
> > >
> > > First we looked at the actual Yarn Resource Manager and made sure it
> > > supported the https file system. After a while we looked at the
> > > YarnJobFactory code and found the current implementation:
> > >
> > > class YarnJobFactory extends StreamJobFactory {
> > >   def getJob(config: Config) = {
> > >     // TODO fix this. needed to support http package locations.
> > >     val hConfig = new YarnConfiguration
> > >     hConfig.set("fs.http.impl", classOf[HttpFileSystem].getName)
> > >
> > >     new YarnJob(config, hConfig)
> > >   }
> > > }
> > >
> > > As I said, after this it was easy to fix the issue: we just created
> > > our own YarnJobFactory.
> > >
> > > /**
> > >  * YarnJobFactory is an implementation based on Samza's
> > >  * {@link org.apache.samza.job.yarn.YarnJobFactory} implementation.
> > >  *
> > >  * @since 0.1.0
> > >  */
> > > public class YarnJobFactory implements StreamJobFactory {
> > >
> > >   @Override
> > >   public StreamJob getJob(Config config) {
> > >
> > >     Configuration yarnConfig = new YarnConfiguration();
> > >     yarnConfig.set("fs.http.impl",
> > >         org.apache.samza.util.hadoop.HttpFileSystem.class.getName());
> > >     yarnConfig.set("fs.https.impl",
> > >         org.apache.samza.util.hadoop.HttpFileSystem.class.getName());
> > >
> > >     return new YarnJob(config, yarnConfig);
> > >   }
> > > }
> > >
> > > This one supports both schemes, http and https. I noticed the TODO
> > > comment in the current implementation. Is there a way I can contribute
> > > to enhance it? I'm thinking the Samza configuration could specify the
> > > scheme and map it to a FileSystem instance.
> > >
> > > Best,
> > >
> > > Jose Luis
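
For the configuration-driven idea at the end of the thread, here is a rough sketch of one way it could look. It is only an illustration under stated assumptions, not the SAMZA-688 patch: the `yarn.package.fs.<scheme>.impl` key prefix and the `SchemeMappingSketch` class are hypothetical names made up for this example, and a plain `Map` stands in for Samza's `Config`. Only the `fs.<scheme>.impl` key form and the `HttpFileSystem` class name come from the code quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemeMappingSketch {

    // Hypothetical Samza-side key shape: yarn.package.fs.<scheme>.impl
    static final String PREFIX = "yarn.package.fs.";
    static final String SUFFIX = ".impl";

    /**
     * Translates entries such as "yarn.package.fs.https.impl" into the
     * Hadoop-style key "fs.https.impl", keeping the configured class name.
     * Entries that do not match the hypothetical key shape are ignored.
     */
    static Map<String, String> toHadoopKeys(Map<String, String> samzaConfig) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : samzaConfig.entrySet()) {
            String key = entry.getKey();
            boolean matches = key.startsWith(PREFIX)
                    && key.endsWith(SUFFIX)
                    && key.length() > PREFIX.length() + SUFFIX.length();
            if (matches) {
                String scheme = key.substring(PREFIX.length(), key.length() - SUFFIX.length());
                result.put("fs." + scheme + SUFFIX, entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("yarn.package.fs.http.impl", "org.apache.samza.util.hadoop.HttpFileSystem");
        config.put("yarn.package.fs.https.impl", "org.apache.samza.util.hadoop.HttpFileSystem");
        config.put("job.name", "demo"); // unrelated entry, skipped by the translation

        // Each resulting pair is what a job factory would pass to hConfig.set(key, value).
        toHadoopKeys(config).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With something like this, the job factory would loop over the translated entries and call `set` on the `YarnConfiguration` for each one, so supporting a new scheme becomes a configuration change rather than a code change.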