I can reproduce the error with Spark 1.0-RC and YARN (CDH-5). The
reflection approach mentioned by DB didn't work either. I checked the
distributed cache on a worker node and found the jar there. The jar is
also listed in the Environment tab of the web UI. The workaround is to
build an assembly jar.
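
A rough way to tell "the jar was shipped" apart from "the class is visible
to the loader that needs it" is to probe the class loaders from inside a
task. The sketch below is only an assumption about how one might check
this; ClassVisibility and its methods are hypothetical helper names, not
part of Spark.

```scala
object ClassVisibility {
  /** True if `className` resolves through `loader` (the class is not initialized). */
  def visibleFrom(loader: ClassLoader, className: String): Boolean =
    try {
      // A null loader means the bootstrap loader, which Class.forName accepts.
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException => false
      case _: NoClassDefFoundError   => false
    }

  /** Checks the thread context loader and the loader that defined `anchor`. */
  def report(className: String, anchor: Class[_]): Map[String, Boolean] = Map(
    "contextClassLoader"  -> visibleFrom(Thread.currentThread.getContextClassLoader, className),
    "definingClassLoader" -> visibleFrom(anchor.getClassLoader, className)
  )
}
```

Calling something like
ClassVisibility.report("com.alpine.hadoop.ext.CSVRecordParser", getClass)
inside mapPartitions and logging the result would show which loader, if
any, can see the class on the executor.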

DB, could you create a JIRA and describe what you have found so far? Thanks!

Best,
Xiangrui

On Sat, May 17, 2014 at 1:29 AM, Mridul Muralidharan <mri...@gmail.com> wrote:
> Can you try moving your mapPartitions to another class/object which is
> referenced only after sc.addJar?
>
> I suspect the ClassNotFoundException is thrown while loading the class
> containing mapPartitions, before sc.addJar is executed.
>
> In general though, dynamic loading of classes means you use reflection to
> instantiate them, since the expectation is that you don't know which
> implementation provides the interface ... If you know it statically
> a priori, you bundle it in your classpath.
>
> Regards
> Mridul
> On 17-May-2014 7:28 am, "DB Tsai" <dbt...@stanford.edu> wrote:
>
>> Finally found a way out of the ClassLoader maze! It took me some time to
>> understand how it works; I think it is worth documenting in a separate
>> thread.
>>
>> We're trying to use an external utility.jar, which contains
>> CSVRecordParser, and we added the jar to the executors through the
>> sc.addJar API.
>>
>> If the instance of CSVRecordParser is created without reflection, it
>> raises a *ClassNotFoundException*.
>>
>> data.mapPartitions(lines => {
>>     val csvParser = new CSVRecordParser(delimiter.charAt(0))
>>     lines.foreach(line => {
>>       val lineElems = csvParser.parseLine(line)
>>     })
>>     ...
>>     ...
>> })
>>
>>
>> If the instance of CSVRecordParser is created through reflection, it works.
>>
>> data.mapPartitions(lines => {
>>     val loader = Thread.currentThread.getContextClassLoader
>>     val CSVRecordParser =
>>         loader.loadClass("com.alpine.hadoop.ext.CSVRecordParser")
>>
>>     val csvParser = CSVRecordParser.getConstructor(Character.TYPE)
>>         .newInstance(delimiter.charAt(0).asInstanceOf[Character])
>>
>>     val parseLine = CSVRecordParser
>>         .getDeclaredMethod("parseLine", classOf[String])
>>
>>     lines.foreach(line => {
>>        val lineElems = parseLine.invoke(csvParser, line)
>>            .asInstanceOf[Array[String]]
>>     })
>>     ...
>>     ...
>> })
>>
>>
>> This is the same issue as in this question:
>>
>> http://stackoverflow.com/questions/7452411/thread-currentthread-setcontextclassloader-without-using-reflection
>>
>> It's not intuitive for users to load external classes through reflection,
>> but the couple of alternatives available are very tricky: 1) messing
>> around with the system class loader by calling its addURL method through
>> reflection, or 2) forking another JVM so that the jars are on its
>> classpath before it bootstraps.
>>
>> Any thoughts on fixing it properly?
>>
>> @Xiangrui,
>> the netlib-java jniloader is loaded by netlib-java through reflection, so
>> this problem does not show up there.
>>
>> Sincerely,
>>
>> DB Tsai
>> -------------------------------------------------------
>> My Blog: https://www.dbtsai.com
>> LinkedIn: https://www.linkedin.com/in/dbtsai
>>
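
For reference, Mridul's point above (know the interface statically, load
the implementation by name) can be sketched as follows. This is a minimal
illustration under assumptions, not the actual utility.jar API:
RecordParser, SimpleCsvParser and ParserLoader are hypothetical names, and
the sketch assumes the trait is bundled in the application jar so that it
is always on the static classpath, while only the implementation comes from
the jar added at runtime. Reflection is then needed once, for construction,
and every later call is an ordinary typed method call instead of
Method.invoke.

```scala
// Hypothetical interface; assumed to be bundled with the application itself.
trait RecordParser {
  def parseLine(line: String): Array[String]
}

// Stand-in for an implementation that would normally live in the added jar.
class SimpleCsvParser(delimiter: Char) extends RecordParser {
  def parseLine(line: String): Array[String] = line.split(delimiter)
}

object ParserLoader {
  /** Loads `className` through `loader` and instantiates it via its (Char) constructor. */
  def load(
      className: String,
      delimiter: Char,
      loader: ClassLoader = Thread.currentThread.getContextClassLoader
  ): RecordParser = {
    val cls = Class.forName(className, true, loader)
    cls.getConstructor(classOf[Char])      // classOf[Char] is the primitive char type
      .newInstance(Char.box(delimiter))    // box the argument for the reflective call
      .asInstanceOf[RecordParser]          // from here on, calls are statically typed
  }
}
```

Inside mapPartitions one would call ParserLoader.load(...) once per
partition and then use parser.parseLine(line) directly on each line.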
