After spending a couple of days fighting with a new Spark installation, getting Spark and Hadoop version numbers to match everywhere, I have a suggestion I'd like to put out there.
Can we put the Hadoop version against which the Spark jars were built into the version number? I noticed that the Cloudera Maven repo has started doing this (https://repository.cloudera.com/artifactory/cloudera-repos/org/apache/spark/spark-core_2.10/) - sadly, only for the CDH 5.x versions, not for the 4.x versions for which they also publish Spark parcels. I see no sign of it in the central Maven repo, though. Is this perhaps already done in some other repo I don't know about?

It would save us a lot of time and grief to be able simply to point a project's build at the right version, instead of having to rebuild and deploy Spark manually.

--
Nathan Kronenfeld
Senior Visualization Developer
Oculus Info Inc
2 Berkeley Street, Suite 600, Toronto, Ontario M5A 4J5
Phone: +1-416-203-3003 x 238
Email: nkronenf...@oculusinfo.com
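P.S. For illustration only, here is roughly what a downstream sbt build could declare if the Hadoop version were encoded in the published version string. Both version strings below are hypothetical - the "-hadoop2.3.0" qualifier is the kind of thing being proposed and does not exist in Maven Central, and the CDH string just shows the pattern Cloudera's repo uses - so don't copy them literally:

    // build.sbt - hypothetical sketch, not real coordinates in Maven Central.
    // %% appends the Scala version, so this resolves to spark-core_2.10.
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.2-hadoop2.3.0"

    // The Cloudera-style pattern, with the CDH release encoded in the version
    // (illustrative value; check the repo linked above for actual versions):
    resolvers += "Cloudera" at "https://repository.cloudera.com/artifactory/cloudera-repos/"
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0-cdh5.1.0"

The point is just that a project could pick the Spark build matching its Hadoop cluster by version string alone, with no custom Spark rebuild.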