Reading JSON in Pyspark throws scala.MatchError

2015-10-02 Thread balajikvijayan
Running Windows 8.1, Python 2.7.x, Scala 2.10.5, Spark 1.4.1. I'm trying to read a large quantity of JSON data from a couple of files, and I receive a scala.MatchError when I do so. The JSON, Python code, and stack trace are all shown below. JSON: { "dataunit": { "page_view": {
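
One thing that may be worth ruling out, though nothing in this thread confirms it as the cause of the scala.MatchError: Spark's JSON reader expects each line of the input file to be one complete, self-contained JSON object, so a pretty-printed document spread over several lines is not read as a single record. A minimal sketch of flattening such a document with the standard json module is below. The helper name to_json_lines and the field names "outer" and "inner" are hypothetical, chosen only for illustration.

```python
import json


def to_json_lines(text):
    """Return `text` re-serialized with one JSON object per line.

    `text` may hold a single JSON object or an array of objects.
    (Hypothetical helper, not part of Spark or PySpark.)
    """
    data = json.loads(text)
    # Treat a lone object as a one-element list of records.
    records = data if isinstance(data, list) else [data]
    # json.dumps without `indent` emits each record on a single line.
    return "\n".join(json.dumps(record, sort_keys=True) for record in records)


# Illustrative input only; the field names are made up.
pretty_printed = """
{
  "outer": {
    "inner": 1
  }
}
"""

print(to_json_lines(pretty_printed))
```

Writing the returned string to a file gives input in the one-object-per-line layout, which can then be passed to sqlContext.read.json(path) in Spark 1.4. If the error persists on such input, the layout of the file is not the problem.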

Re: pyspark-Failed to run first

2015-09-29 Thread balajikvijayan
Any updates on this issue? A cursory search shows that others are still experiencing it. I'm seeing this occur on trivial data sets in PySpark; however, the same jobs run successfully in Scala. While this is an acceptable workaround, I would like to know if this item is on the Spark roadmap