[ https://issues.apache.org/jira/browse/TOREE-464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexander Anokhin updated TOREE-464:
------------------------------------
    Description:
Running in non local mode actions on RDDs fail when RDDs hold custom non-primitive objects. It affects
* current master version 0.3.0 ([https://github.com/apache/incubator-toree])

as well as released versions
* RC1 ([https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc1]),
* RC2 ([https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc2]) and
* RC3 ([https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc3]).

*Example:*

Cell 1: case class A(i: Int)
Cell 2: val events = sc.parallelize((1 to 5).toSeq).map { i => A(i) }
Cell 3: println(events.count())
results "java.lang.NoClassDefFoundError: Could not initialize class ..."

However, it does work if code from the cells 1 and 2 is combined into one cell. In that case actions on such RDDs work correctly, but case class definition should be added to every cell with problem RDD.

*Example:*

Cell 1: case class A(i: Int)
        val events = sc.parallelize((1 to 5).toSeq).map { i => A(i) }
Cell 2: println(events.count())
results "5"

  was:
Running in non local mode actions on RDDs fail when RDDs hold custom non-primitive objects. It affects current master version 0.3.0 (https://github.com/apache/incubator-toree) as well as released versions RC1 (https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc1), RC2 (https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc2) and RC3 (https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc3).

Example:
```
Cell 1: case class A(i: Int)
Cell 2: val events = sc.parallelize((1 to 5).toSeq).map \{ i => A(i) }
Cell 3: println(events.count())
results "java.lang.NoClassDefFoundError: Could not initialize class ..."
```

However, it does work if code from the cells 1 and 2 is combined into one cell. In that case actions on such RDDs work correctly, but case class definition should be added to every cell with problem RDD.

Example:
```
Cell 1: case class A(i: Int)
        val events = sc.parallelize((1 to 5).toSeq).map \{ i => A(i) }
Cell 2: println(events.count())
results "5"
```


> Failing actions on RDDs with non-primitive objects
> --------------------------------------------------
>
>                 Key: TOREE-464
>                 URL: https://issues.apache.org/jira/browse/TOREE-464
>             Project: TOREE
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: Alexander Anokhin
>            Priority: Major
>         Attachments: test_case.ipynb
>
>
> Running in non local mode actions on RDDs fail when RDDs hold custom
> non-primitive objects. It affects
> * current master version 0.3.0 ([https://github.com/apache/incubator-toree])
> as well as released versions
> * RC1
> ([https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc1]),
> * RC2
> ([https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc2])
> and
> * RC3
> ([https://github.com/apache/incubator-toree/releases/tag/v0.2.0-incubating-rc3]).
> *Example:*
> Cell 1: case class A(i: Int)
> Cell 2: val events = sc.parallelize((1 to 5).toSeq).map { i => A(i) }
> Cell 3: println(events.count())
> results "java.lang.NoClassDefFoundError: Could not initialize class ..."
> However, it does work if code from the cells 1 and 2 is combined into one
> cell. In that case actions on such RDDs work correctly, but case class
> definition should be added to every cell with problem RDD.
> *Example:*
> Cell 1: case class A(i: Int)
>         val events = sc.parallelize((1 to 5).toSeq).map { i => A(i) }
> Cell 2: println(events.count())
> results "5"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)