I filed an SPIP for this at
https://issues.apache.org/jira/browse/SPARK-24258. Let’s discuss!
On Wed, Apr 18, 2018 at 23:33 Leif Walsh wrote:
> I agree we should reuse as much as possible. For PySpark, I think the
> obvious choices of Breeze and numpy arrays already made make a lot of
> sense, I
I agree we should reuse as much as possible. For PySpark, I think the
obvious choices of Breeze and numpy arrays already made make a lot of
sense. I’m not sure about the other language bindings and would defer to
others.
I was under the impression that UDTs were gone and (probably?) not coming
bac
Thanks for the thoughts! We've gone back and forth quite a bit about local
linear algebra support in Spark. For reference, there have been some
discussions here:
https://issues.apache.org/jira/browse/SPARK-6442
https://issues.apache.org/jira/browse/SPARK-16365
https://issues.apache.org/jira/brows
Hi all,
I’ve been playing around with the Vector and Matrix UDTs in pyspark.ml and
I’ve found myself wanting more.
There is a minor issue: with Arrow serialization enabled, these
types don’t serialize properly in Python UDF calls or in toPandas. There’s
a natural representation for the