Reflection does work, but you need to get the loader with `val
loader = Thread.currentThread.getContextClassLoader`, which is set by the
Spark executor. Our team has verified this and uses it as a workaround.
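
For illustration, here is a minimal sketch of how the reflective calls could be wrapped once so the per-record code stays short. `ReflectiveCall` and `bind` are hypothetical names, not a Spark API, and the sketch assumes the target class has a public one-argument constructor and a public method taking a single String. It is demonstrated on a JDK class so that it runs without Spark or the external jar.

```scala
import java.lang.reflect.Method

object ReflectiveCall {
  // Hypothetical helper, not a Spark API: build an instance of `className`
  // through the thread context class loader and return a function that
  // invokes the one-argument String method `methodName` on that instance.
  def bind(
      className: String,
      ctorArgType: Class[_],
      ctorArg: AnyRef,
      methodName: String): String => AnyRef = {
    // Use the context class loader; fall back to this object's own loader
    // if the current thread has none set.
    val loader = Option(Thread.currentThread.getContextClassLoader)
      .getOrElse(getClass.getClassLoader)
    val cls = Class.forName(className, true, loader)
    val instance = cls.getConstructor(ctorArgType)
      .newInstance(ctorArg)
      .asInstanceOf[AnyRef]
    // Look the method up once; only invoke() runs per call afterwards.
    val method: Method = cls.getMethod(methodName, classOf[String])
    (arg: String) => method.invoke(instance, arg)
  }
}

// Stand-in usage with a JDK class instead of the external parser.
val indexOf = ReflectiveCall.bind(
  "java.lang.StringBuilder", classOf[String], "hello world", "indexOf")
println(indexOf("world"))
```

Inside mapPartitions the same helper would presumably be called with "com.alpine.hadoop.ext.CSVRecordParser", Character.TYPE, the boxed delimiter, and "parseLine". It should be called inside the closure rather than on the driver, since java.lang.reflect.Method is not serializable.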



Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Sun, May 18, 2014 at 9:46 AM, Xiangrui Meng <men...@gmail.com> wrote:

> Btw, I tried
>
> rdd.map { i =>
>   System.getProperty("java.class.path")
> }.collect()
>
> but didn't see the jars added via "--jars" on the executor classpath.
>
> -Xiangrui
>
> On Sat, May 17, 2014 at 11:26 PM, Xiangrui Meng <men...@gmail.com> wrote:
> > I can reproduce the error with Spark 1.0-RC and YARN (CDH-5). The
> > reflection approach mentioned by DB didn't work either. I checked the
> > distributed cache on a worker node and found the jar there. It is also
> > in the Environment tab of the WebUI. The workaround is making an
> > assembly jar.
> >
> > DB, could you create a JIRA and describe what you have found so far?
> > Thanks!
> >
> > Best,
> > Xiangrui
> >
> > On Sat, May 17, 2014 at 1:29 AM, Mridul Muralidharan <mri...@gmail.com>
> > wrote:
> >> Can you try moving your mapPartitions to another class/object which is
> >> referenced only after sc.addJar?
> >>
> >> I suspect the ClassNotFoundException is thrown while loading the class
> >> containing mapPartitions, before sc.addJar is executed.
> >>
> >> In general, though, dynamic loading of classes means you use reflection
> >> to instantiate them, since the expectation is that you don't know which
> >> implementation provides the interface. If you know it statically a
> >> priori, you bundle it in your classpath.
> >>
> >> Regards
> >> Mridul
> >> On 17-May-2014 7:28 am, "DB Tsai" <dbt...@stanford.edu> wrote:
> >>
> >>> Finally found a way out of the ClassLoader maze! It took me some time
> >>> to understand how it works; I think it is worth documenting in a
> >>> separate thread.
> >>>
> >>> We're trying to add an external utility.jar, which contains
> >>> CSVRecordParser, and we added the jar to the executors through the
> >>> sc.addJar API.
> >>>
> >>> If the instance of CSVRecordParser is created without reflection, it
> >>> raises *ClassNotFoundException*.
> >>>
> >>> data.mapPartitions(lines => {
> >>>     val csvParser = new CSVRecordParser(delimiter.charAt(0))
> >>>     lines.foreach(line => {
> >>>       val lineElems = csvParser.parseLine(line)
> >>>     })
> >>>     ...
> >>>     ...
> >>> })
> >>>
> >>>
> >>> If the instance of CSVRecordParser is created through reflection, it
> >>> works.
> >>>
> >>> data.mapPartitions(lines => {
> >>>     val loader = Thread.currentThread.getContextClassLoader
> >>>     val CSVRecordParser =
> >>>         loader.loadClass("com.alpine.hadoop.ext.CSVRecordParser")
> >>>
> >>>     val csvParser = CSVRecordParser.getConstructor(Character.TYPE)
> >>>         .newInstance(delimiter.charAt(0).asInstanceOf[Character])
> >>>
> >>>     val parseLine = CSVRecordParser
> >>>         .getDeclaredMethod("parseLine", classOf[String])
> >>>
> >>>     lines.foreach(line => {
> >>>        val lineElems = parseLine.invoke(csvParser, line)
> >>>            .asInstanceOf[Array[String]]
> >>>     })
> >>>     ...
> >>>     ...
> >>> })
> >>>
> >>>
> >>> This is the same issue as in this question:
> >>>
> >>> http://stackoverflow.com/questions/7452411/thread-currentthread-setcontextclassloader-without-using-reflection
> >>>
> >>> It's not intuitive for users to load external classes through
> >>> reflection, but the couple of alternative solutions, including 1)
> >>> messing with the system class loader by calling
> >>> systemClassLoader.addURL through reflection, or 2) forking another JVM
> >>> to add the jars to the classpath before the bootstrap loader, are very
> >>> tricky.
> >>>
> >>> Any thoughts on fixing it properly?
> >>>
> >>> @Xiangrui,
> >>> netlib-java jniloader is loaded from netlib-java through reflection, so
> >>> this problem will not be seen.
> >>>
> >>> Sincerely,
> >>>
> >>> DB Tsai
> >>> -------------------------------------------------------
> >>> My Blog: https://www.dbtsai.com
> >>> LinkedIn: https://www.linkedin.com/in/dbtsai
> >>>
>
