Hi all,
Just wanted to thank everyone for the Dataset API - most of the time we
only see bug reports on these lists ;o).
- For some context: this weekend I was updating the SQL chapters of
my book - they still had all the ugliness of SchemaRDD,
registerTempTable, take(10).foreach(println)
and
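As a quick sketch of what that cleanup looks like in Spark 2.x (illustrative only; the DataFrame contents and the SparkSession named `spark` are my own assumptions, not from the original mail):

```scala
// Illustrative sketch of the 1.x-to-2.x migration mentioned above.
// Assumes a Spark 2.x SparkSession already in scope as `spark`.
import spark.implicits._

val df = Seq((1, "a"), (2, "b")).toDF("id", "value")

// Spark 1.x style: df.registerTempTable("t"); df.take(10).foreach(println)
// Spark 2.x replacements:
df.createOrReplaceTempView("t")          // supersedes registerTempTable
spark.sql("SELECT * FROM t").show(10)    // supersedes take(10).foreach(println)
```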
Hi, Yash,
It should work.
val df = spark.range(1, 5)
  .select('id + 1 as 'p1, 'id + 2 as 'p2, 'id + 3 as 'p3, 'id + 4 as 'p4,
          'id + 5 as 'p5, 'id as 'b)
  .selectExpr("p1", "p2", "p3", "p4", "p5", "CAST(b AS STRING) AS s")
  .coalesce(1)
df.write.partitionBy("p1", "p2", "p3", "p4",
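For anyone trying this locally, here is a hedged sketch of a self-contained variant; the SparkSession setup, the reduced column list, and the /tmp output path are my own illustrative additions, not from the original mail:

```scala
// Sketch only: a runnable, trimmed-down variant of the snippet above.
// The session setup and output path are illustrative assumptions.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[*]")
  .appName("partitionBy-sketch")
  .getOrCreate()
import spark.implicits._

val df = spark.range(1, 5)
  .select('id + 1 as 'p1, 'id as 'b)
  .selectExpr("p1", "CAST(b AS STRING) AS s")
  .coalesce(1)

// Each partition column becomes a directory level, e.g. .../p1=2/part-*.parquet
df.write.mode("overwrite").partitionBy("p1").parquet("/tmp/partitionBy-sketch")
```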
+1
Best regards,
Jacek Laskowski
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
On Sat, Jun 18, 2016 at 9:13 AM, Reynold Xin wrote:
> Looks like that's resolved now.
>
> I
Going to go ahead and start working on the docs assuming this gets
merged https://github.com/apache/spark/pull/13592. Opened a JIRA
https://issues.apache.org/jira/browse/SPARK-16046
I'm having some issues building the docs: the Java docs fail to build. The
output when it fails is here:
On Sat, Jun 18, 2016 at 6:13 AM, Pedro Rodriguez
wrote:
> using Datasets (e.g. using $ to select columns).
Or even my favourite one - the tick ` :-)
Jacek
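For readers unfamiliar with the shorthand being discussed: `$` builds a Column via spark.implicits._, and the backtick quotes identifiers inside SQL expressions. A small illustrative sketch (the DataFrame and column names are my own, not from the thread):

```scala
// Illustrative sketch of $-columns and backtick-quoted identifiers.
// Assumes a SparkSession already in scope as `spark`.
import spark.implicits._

val people = Seq((1, "Alice"), (2, "Bob")).toDF("id", "first name")

people.select($"id").show()              // $"..." creates a Column

// Backticks quote identifiers with spaces or reserved words in SQL text.
people.selectExpr("`first name`").show()
```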
Looks like that's resolved now.
I will wait till Sunday to cut rc2 to give people more time to find issues
with rc1.
On Fri, Jun 17, 2016 at 10:58 AM, Marcelo Vanzin
wrote:
> -1 (non-binding)
>
> SPARK-16017 shows a severe perf regression in YARN compared to 1.6.1.
>
> On
Please go for it!
On Friday, June 17, 2016, Pedro Rodriguez wrote:
> I would be open to working on the Dataset documentation if no one else is
> already working on it. Thoughts?
>
> On Fri, Jun 17, 2016 at 11:44 PM, Cheng Lian