I finally found a way out of the ClassLoader maze! It took me some time to understand how it works; I think it is worth documenting in a separate thread.

We're trying to add an external utility.jar, which contains CSVRecordParser, and we added the jar to the executors through the sc.addJar API.

If the instance of CSVRecordParser is created without reflection, it raises a *ClassNotFoundException*.

    data.mapPartitions(lines => {
      val csvParser = new CSVRecordParser(delimiter.charAt(0))
      lines.foreach(line => {
        val lineElems = csvParser.parseLine(line)
      })
      ...
      ...
    })

If the instance of CSVRecordParser is created through reflection, it works.

    data.mapPartitions(lines => {
      // Resolve the class through the context class loader of the task thread
      val loader = Thread.currentThread.getContextClassLoader
      val CSVRecordParser =
        loader.loadClass("com.alpine.hadoop.ext.CSVRecordParser")

      val csvParser = CSVRecordParser.getConstructor(Character.TYPE)
        .newInstance(delimiter.charAt(0).asInstanceOf[Character])

      val parseLine = CSVRecordParser
        .getDeclaredMethod("parseLine", classOf[String])

      lines.foreach(line => {
        val lineElems =
          parseLine.invoke(csvParser, line).asInstanceOf[Array[String]]
      })
      ...
      ...
    })

This is identical to this question:
http://stackoverflow.com/questions/7452411/thread-currentthread-setcontextclassloader-without-using-reflection

It's not intuitive for users to load external classes through reflection, but the couple of available solutions are very tricky:

1) messing around with the system class loader by calling systemClassLoader.addURL through reflection, or
2) forking another JVM to add the jars to the classpath before the bootstrap loader.

Any thoughts on fixing it properly?

@Xiangrui, the netlib-java jniloader is loaded from netlib-java through reflection, so this problem will not be seen there.

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
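
P.S. A minimal sketch of the reflective lookup above, pulled into a small helper so the closure body stays readable. The object name ContextReflection and its loadClass method are hypothetical, not part of Spark or of utility.jar, and java.lang.StringBuilder only stands in for a class shipped through sc.addJar so that the snippet runs without the external jar.

```scala
object ContextReflection {
  /** Load a class by name through the current thread's context class loader.
    * Falls back to the loader that defined this helper when no context
    * loader is set. (Hypothetical helper, for illustration only.)
    */
  def loadClass(name: String): Class[_] = {
    val ctx = Thread.currentThread.getContextClassLoader
    val loader = if (ctx != null) ctx else getClass.getClassLoader
    Class.forName(name, true, loader)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in for com.alpine.hadoop.ext.CSVRecordParser: a JDK class
    // with a one-argument constructor, looked up purely by name.
    val cls = loadClass("java.lang.StringBuilder")
    val instance = cls.getConstructor(classOf[String])
      .newInstance("a,b")
      .asInstanceOf[AnyRef]

    // Invoke a method reflectively, as the parseLine call does above.
    val toStringMethod = cls.getMethod("toString")
    println(toStringMethod.invoke(instance))
  }
}
```

Inside mapPartitions, the same helper would be called once per partition, so the class lookup cost is not paid per line. This only tidies the workaround; it does not remove the need for reflection.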