Re: read a binary file and save in another location

Russell Jurney Thu, 09 Mar 2023 12:07:53 -0800

Yeah, that's the right answer!
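
For the archive, here is a minimal sketch of the shutil approach. It assumes the file sits on a filesystem the Python process can reach directly (local disk or a mounted share, not HDFS or S3). The copy_binary helper is a made-up name for illustration, not something from Spark or from the linked article.

```python
import shutil
from pathlib import Path


def copy_binary(src: str, dst_dir: str) -> Path:
    """Copy one file byte-for-byte into dst_dir and return the new path.

    Hypothetical helper for illustration only.
    """
    target_dir = Path(dst_dir)
    # Create the destination directory (and any parents) if it is missing.
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy2 copies the file contents and tries to preserve
    # metadata such as the modification time.
    return Path(shutil.copy2(src, target_dir / Path(src).name))
```

With the placeholder paths from the thread, a call would look like copy_binary("/path/to/data/image.png", "/new/path/to/data"), where image.png is only an example file name. No SparkSession is involved, so this runs wherever the script itself runs.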

Thanks,
Russell Jurney @rjurney <http://twitter.com/rjurney>
russell.jur...@gmail.com LI <http://linkedin.com/in/russelljurney> FB
<http://facebook.com/jurney> datasyndrome.com Book a time on Calendly
<https://calendly.com/rjurney_personal/30min>



On Thu, Mar 9, 2023 at 10:14 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Does this need to be done in PySpark at all?
>
>
> How about copying the file with Python's shutil package instead?
>
>
> https://sparkbyexamples.com/python/how-to-copy-files-in-python/
>
>
>
>    view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
>
>  https://en.everybodywiki.com/Mich_Talebzadeh
>
>
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
>
> On Thu, 9 Mar 2023 at 17:46, Russell Jurney <russell.jur...@gmail.com>
> wrote:
>
>> https://spark.apache.org/docs/latest/sql-data-sources-binaryFile.html
>>
>> This says "Binary file data source does not support writing a DataFrame
>> back to the original files." which I take to mean this isn't possible...
>>
>> I haven't done this, but going from the docs, it would be:
>>
>> (spark.read.format("binaryFile").option("pathGlobFilter", "*.png")
>>     .load("/path/to/data")
>>     .write.format("binaryFile").save("/new/path/to/data"))
>>
>> Looking at the DataFrameWriter code on the master branch
>> <https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala>
>> to see if there is a binaryFile format option...
>>
>> At this point I get lost. I can't figure out how this works either, but
>> hopefully I have helped define the problem. The format() method of
>> DataFrameWriter isn't documented
>> <https://spark.apache.org/docs/3.1.3/api/java/org/apache/spark/sql/DataFrameWriter.html#format-java.lang.String->.
>>
>> Russell Jurney @rjurney <http://twitter.com/rjurney>
>> russell.jur...@gmail.com LI <http://linkedin.com/in/russelljurney> FB
>> <http://facebook.com/jurney> datasyndrome.com Book a time on Calendly
>> <https://calendly.com/rjurney_personal/30min>
>>
>>
>> On Thu, Mar 9, 2023 at 12:52 AM second_co...@yahoo.com.INVALID
>> <second_co...@yahoo.com.invalid> wrote:
>>
>>> Is there any example of how to read a binary file using PySpark and save
>>> it in another location, i.e. a copy feature?
>>>
>>>
>>> Thank you,
>>> Teoh
>>>
>>
