Yeah, that's the right answer!

Thanks,
Russell Jurney @rjurney <http://twitter.com/rjurney>
russell.jur...@gmail.com LI <http://linkedin.com/in/russelljurney> FB
<http://facebook.com/jurney> datasyndrome.com Book a time on Calendly
<https://calendly.com/rjurney_personal/30min>
On Thu, Mar 9, 2023 at 10:14 AM Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Does this need any action in PySpark?
>
> How about importing using the shutil package?
>
> https://sparkbyexamples.com/python/how-to-copy-files-in-python/
>
> view my Linkedin profile
> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>
> https://en.everybodywiki.com/Mich_Talebzadeh
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
> On Thu, 9 Mar 2023 at 17:46, Russell Jurney <russell.jur...@gmail.com>
> wrote:
>
>> https://spark.apache.org/docs/latest/sql-data-sources-binaryFile.html
>>
>> This says "Binary file data source does not support writing a DataFrame
>> back to the original files." which I take to mean this isn't possible...
>>
>> I haven't done this, but going from the docs, it would be:
>>
>> spark.read.format("binaryFile").option("pathGlobFilter", "*.png").load("/path/to/data").write.format("binaryFile").save("/new/path/to/data")
>>
>> Looking at the DataFrameWriter code on master branch
>> <https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala>
>> for DataFrameWriter, let's see if there is a binaryFile format option...
>>
>> At this point I get lost. I can't figure out how this works either, but
>> hopefully I have helped define the problem. The format() method of
>> DataFrameWriter isn't documented
>> <https://spark.apache.org/docs/3.1.3/api/java/org/apache/spark/sql/DataFrameWriter.html#format-java.lang.String->.
>>
>> On Thu, Mar 9, 2023 at 12:52 AM second_co...@yahoo.com.INVALID
>> <second_co...@yahoo.com.invalid> wrote:
>>
>>> Is there an example of how to read a binary file using PySpark and save
>>> it to another location, i.e. a copy feature?
>>>
>>> Thank you,
>>> Teoh
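
For anyone finding this thread later: the shutil suggestion above could look roughly like the sketch below. It uses only the Python standard library, with no Spark involved. The function name copy_matching_files and the "*.png" default are my own choices for illustration, and it assumes both directories are on a filesystem the Python process can see directly (local disk or a mounted volume). It would not apply as written to HDFS or object-store paths.

```python
import shutil
from pathlib import Path


def copy_matching_files(src_dir, dst_dir, pattern="*.png"):
    """Copy every file in src_dir whose name matches pattern into dst_dir.

    Hypothetical helper for illustration; files are copied byte for byte,
    so it works for binary content such as images.
    """
    src = Path(src_dir)
    dst = Path(dst_dir)
    # Create the destination directory (and any parents) if it is missing.
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for path in sorted(src.glob(pattern)):
        if path.is_file():
            target = dst / path.name
            # copy2 copies the contents and tries to preserve file metadata.
            shutil.copy2(path, target)
            copied.append(target)
    return copied
```

With the placeholder paths used earlier in the thread, the call would be copy_matching_files("/path/to/data", "/new/path/to/data"). It returns the list of destination paths it wrote, which is handy for checking that nothing was skipped.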