On Fri, Sep 12, 2014 at 8:55 AM, Brad Miller <bmill...@eecs.berkeley.edu> wrote:
> Hi Davies,
>
> Thanks for the quick fix. I'm sorry to send out a bug report on release day
> - 1.1.0 really is a great release. I've been running the 1.1 branch for a
> while and there's definitely lots of good stuff.
>
> For the workaround, I think you may have meant:
>
>     srdd2 = SchemaRDD(srdd._jschema_rdd.coalesce(N, False, None), sqlCtx)

Yes, thanks for the correction.

> Note:
> "_schema_rdd" -> "_jschema_rdd"
> "false" -> "False"
>
> That workaround seems to work fine (in that I've observed the correct number
> of partitions in the web-ui, although I haven't tested it beyond that).
>
> Thanks!
> -Brad
>
> On Thu, Sep 11, 2014 at 11:30 PM, Davies Liu <dav...@databricks.com> wrote:
>>
>> This is a bug. I have created an issue to track it:
>> https://issues.apache.org/jira/browse/SPARK-3500
>>
>> Also, there is a PR to fix this: https://github.com/apache/spark/pull/2369
>>
>> Before the next bugfix release, you can work around this by:
>>
>>     srdd = sqlCtx.jsonRDD(rdd)
>>     srdd2 = SchemaRDD(srdd._schema_rdd.coalesce(N, false, None), sqlCtx)
>>
>>
>> On Thu, Sep 11, 2014 at 6:12 PM, Brad Miller <bmill...@eecs.berkeley.edu>
>> wrote:
>> > Hi All,
>> >
>> > I'm having some trouble with the coalesce and repartition functions for
>> > SchemaRDD objects in pyspark. When I run:
>> >
>> >     sqlCtx.jsonRDD(sc.parallelize(['{"foo":"bar"}',
>> >     '{"foo":"baz"}'])).coalesce(1)
>> >
>> > I get this error:
>> >
>> >     Py4JError: An error occurred while calling o94.coalesce. Trace:
>> >     py4j.Py4JException: Method coalesce([class java.lang.Integer, class
>> >     java.lang.Boolean]) does not exist
>> >
>> > For context, I have a dataset stored in a parquet file, and I'm using
>> > SQLContext to make several queries against the data. I then register
>> > the results of these queries as new tables in the SQLContext.
>> > Unfortunately each new table has the same number of partitions as the
>> > original (despite being much smaller). Hence my interest in coalesce
>> > and repartition.
>> >
>> > Has anybody else encountered this bug? Is there an alternate workflow I
>> > should consider?
>> >
>> > I am running the 1.1.0 binaries released today.
>> >
>> > best,
>> > -Brad
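
For reference, the corrected workaround from this thread could be wrapped in a small helper along the lines below. This is only a sketch for Spark 1.1.0, for use until the fix for SPARK-3500 is released. The name coalesce_schema_rdd and its wrapper parameter are made up for illustration and are not part of pyspark. The underlying call is the one confirmed above; as far as I can tell, the third argument (None) fills in the implicit ordering parameter of the JVM-side coalesce method, which is why the two-argument call from Python fails.

```python
def coalesce_schema_rdd(srdd, sql_ctx, num_partitions, wrapper=None):
    """Coalesce a SchemaRDD by calling the JVM-side method directly.

    Hypothetical helper (not part of pyspark) wrapping the workaround
    discussed in this thread for SPARK-3500 on Spark 1.1.0.
    """
    if wrapper is None:
        # Imported lazily so the helper can be defined without Spark installed.
        # Assumes the Spark 1.1.0 layout, where SchemaRDD lives in pyspark.sql.
        from pyspark.sql import SchemaRDD
        wrapper = SchemaRDD

    # Pass all three arguments explicitly: the partition count,
    # shuffle=False, and None for the third (ordering) parameter.
    jschema_rdd = srdd._jschema_rdd.coalesce(num_partitions, False, None)

    # Wrap the resulting Java object back into a Python-side SchemaRDD.
    return wrapper(jschema_rdd, sql_ctx)
```

With a live SQLContext this would be used as srdd2 = coalesce_schema_rdd(sqlCtx.jsonRDD(rdd), sqlCtx, N). The wrapper argument exists only so the helper can be exercised without a running JVM.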