Re: Spark Write BinaryType Column as continues file to S3

Bjørn Jørgensen Fri, 08 Apr 2022 08:31:36 -0700

The upcoming Spark 3.3 release will include a new SQL function:
https://github.com/apache/spark/commit/25dd4254fed71923731fd59838875c0dd1ff665a
I hope this helps.
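
For the row-wise packing step described in the question, here is a minimal sketch in plain Java with no Spark dependency. The class name `RowPacker`, the helpers `packRow` and `concat`, the choice of 8-byte big-endian doubles, and the sample values are all assumptions made for illustration; the real layout depends on what the consumer of the binary file expects. Inside Spark, the body of `packRow` is the kind of logic a UDF returning a binary column would contain. Keeping rows in order and writing the result to S3 are not covered here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;
import java.util.List;

class RowPacker {

    // Hypothetical helper: encodes one row of numeric values as
    // 8 bytes per value, big-endian (assumed layout, not from the thread).
    static byte[] packRow(double[] values) {
        ByteBuffer buf = ByteBuffer
                .allocate(values.length * Double.BYTES)
                .order(ByteOrder.BIG_ENDIAN);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    // Hypothetical helper: concatenates the per-row byte arrays, in the
    // given order, into one continuous byte sequence.
    static byte[] concat(List<byte[]> rows) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] row : rows) {
            out.write(row, 0, row.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Two sample rows with two values each (made-up data).
        byte[] first = packRow(new double[] {1.0, 2.0});
        byte[] second = packRow(new double[] {3.0, 4.0});
        byte[] all = concat(Arrays.asList(first, second));
        System.out.println("total bytes: " + all.length);
    }
}
```

A fixed-width layout like this lets a reader of the file locate row `i` at offset `i * rowWidth` without any delimiter, but that only holds if the rows are concatenated in a well-defined order.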


On Fri, 8 Apr 2022 at 17:14, Philipp Kraus <
philipp.kraus.flashp...@gmail.com> wrote:

> Hello,
>
> I have a data frame with numerical data in Spark 3.1.1 (Java) which
> should be converted to a binary file.
> My idea is to create a UDF that generates a byte array from the
> numerical values, so that I can apply it to each row of the data frame
> and then get a new column with row-wise binary data.
> Once this is done, I would like to write this column as a continuous byte
> stream to a file stored in an S3 bucket.
>
> So my question is: is the UDF approach a good idea, and is it possible
> to write this continuous byte stream directly to S3, or is there any
> built-in functionality for this?
> What is a good strategy to do this?
>
> Thanks for your help
