Hi there, I am a potentially new contributor, so don't spend too much time on me. However, I would like to give this a try. The reason is that a connection between Glue and Spark would be nice to have at my work. We run our own Spark clusters rather than EMR, and right now our Spark jobs can't benefit from the Glue metastore. This is not a huge problem, because we keep strict naming conventions and use ORC, but it would still be nice for our user base.
As you can guess, our cluster runs on AWS. I have a good amount of experience with the AWS SDKs and a reasonable amount with Scala. I am, however, a beginner with Spark and have never contributed before.

As far as I can see, I need to implement ExternalCatalog for Glue, and Glue seems to support all operations specified in the trait. That includes user-defined functions, which surprised me, because Athena doesn't support them. I can see some obstacles, e.g. how to deal with permissions. I will therefore study the Hive ExternalCatalog. Can I take that as the leading example?

I also saw that there was prior work mentioned on the mailing list (http://apache-spark-developers-list.1001551.n3.nabble.com/A-new-external-catalog-td23394.html), but unfortunately there is no code.

Would this be a suitable project to pick up? I thought it might be, because it is kind of on the edge of Spark.

Thanks in advance for your time!

Greets,
Edgar Klerks