Re: [EXTERNAL] RDD.pipe() for binary data

2022-07-16 Thread Yuhao Zhang
Hi Shay, Thanks for your reply! I would very much like to use PySpark. However, my project depends on GraphX, which is only available in the Scala API as far as I know. So I'm locked into Scala and trying to find a way out. I wonder if there's a way to work around it. Best regards, Yuhao Zhang
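
For readers following the RDD.pipe() route while staying in Scala: pipe() passes each RDD element to the external process as a line of text on stdin and reads lines of text back from stdout, so binary float arrays need a text-safe encoding. Below is a minimal sketch of what the Python side could look like. It assumes (this is not from the thread) that the Scala side writes one array per line as Base64-encoded little-endian float64 values; the names decode_line, encode_line, process and main are hypothetical, and process is only a placeholder for the custom routines.

```python
import base64
import struct
import sys


def decode_line(line):
    """Decode one Base64 text line into a list of floats.

    Assumption: the payload is a packed run of little-endian float64 values.
    """
    raw = base64.b64decode(line.strip())
    count = len(raw) // 8  # 8 bytes per float64
    return list(struct.unpack("<%dd" % count, raw))


def encode_line(values):
    """Encode a sequence of floats as one Base64 text line (no newline)."""
    raw = struct.pack("<%dd" % len(values), *values)
    return base64.b64encode(raw).decode("ascii")


def process(values):
    """Placeholder for the custom Python routine; here it doubles each value."""
    return [2.0 * v for v in values]


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one encoded array per line, process it, write one encoded line back."""
    for line in stdin:
        if not line.strip():
            continue  # skip blank lines
        stdout.write(encode_line(process(decode_line(line))) + "\n")
    stdout.flush()
```

A worker script built from this sketch would end with a call to main(). On the Scala side, the RDD would be mapped to the same Base64 strings, passed through rdd.pipe("python3 worker.py") (worker.py being a hypothetical file name), and the returned lines decoded again. Base64 adds roughly a third to the payload size, which is the price of keeping the data line-safe.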

RDD.pipe() for binary data

2022-07-08 Thread Yuhao Zhang
Hi All, I'm currently working on a project that involves transferring data between Spark 3.x (I use Scala) and a Python runtime. In Spark, the data is stored in an RDD as arrays/vectors of floating-point numbers, and I have custom routines written in Python to process them. On the Spark side, I also have some