Okay, will contact Sebastiaan then, thanks.

Sitaraman

From: Aldrin <[email protected]>
Date: Thursday, February 9, 2023 at 12:09 PM
To: [email protected] <[email protected]>
Subject: Re: Transferring a spark data frame from Java to Python using Arrow, ArrowFlight.

I don't know the details of the work, but my reading of the abstract aligns with your interpretation.
But Section II B says, "Implementing our connector in core Spark (JVM) circumvents all aforementioned inefficiencies and shortcomings. We can therefore use our connector with all programming languages that Spark supports." My thought was that if they have code to go in one direction, it would be helpful for figuring out how to go the other direction.

A quick look through the repo makes me think that [1] should be relevant. This is not my work, so if it seems relevant and you need more in-depth help, you would probably be better off contacting the authors. The owner of the repo is [2].

[1]: https://github.com/Sebastiaan-Alvarez-Rodriguez/arrow-spark-publication/blob/master/arrow-spark-connector/src/main/scala/org/arrowspark/spark/rdd/ArrowRDD.scala
[2]: https://github.com/Sebastiaan-Alvarez-Rodriguez

Aldrin Montana
Computer Science PhD Student
UC Santa Cruz

On Thu, Feb 9, 2023 at 10:47 AM Vilayannur Sitaraman <[email protected]> wrote:
Thanks, Aldrin, for the pointers. Did I understand the effort correctly, in that it deals with accessing Arrow-enabled data via Spark? What I have is a Java-based Spark DataFrame, and I need to go the other direction: convert this DataFrame to an Arrow format so that I can serve it via Arrow Flight. Do you think this could be achieved with the arrow-spark module you have pointed to? Thanks for your suggestions.

Sitaraman

From: Aldrin <[email protected]>
Date: Thursday, February 9, 2023 at 10:31 AM
To: [email protected] <[email protected]>
Subject: Re: Transferring a spark data frame from Java to Python using Arrow, ArrowFlight.

Hello!

This repo [1] and this paper [2] may be relevant.

[1]: https://github.com/Sebastiaan-Alvarez-Rodriguez/arrow-spark-publication
[2]: https://arxiv.org/pdf/2106.13020.pdf

Aldrin Montana
Computer Science PhD Student
UC Santa Cruz

On Wed, Feb 8, 2023 at 7:11 PM Vilayannur Sitaraman <[email protected]> wrote:

Hi, I just successfully wrote my
first Flight server and client, which transfer data read from an Arrow file from a Java server to a Python client. I would like to be able to transfer a Spark DataFrame created in Java to Python using Arrow and Arrow Flight. If I can convert a Spark DataFrame created in Java to the Arrow file format, then I can use the Flight server and Python client above to do the transfer. But I am not sure how to convert a Spark DataFrame created in Java to Arrow format in a Java module. Any help or pointers appreciated.

Sitaraman
