DJeyCodeX opened a new pull request #27506: Demo Pipeline of Migrating OnPremise Database to Spark & Hadoop
URL: https://github.com/apache/spark/pull/27506

### What changes were proposed in this pull request?

This PR consists of the following:

1. Motivation: many organisations face difficulties migrating their existing databases, such as MySQL, to the Hadoop & Spark ecosystem.
2. A demo pipeline covering three use cases:
   **Case 1: Storing the data in HDFS part files and then reading them in Spark**
   **Case 2: Converting the data to Parquet format and then reading the Parquet files in Spark**
   **Special Case: Analysing the data in Spark directly from MySQL, without storing it in HDFS**
3. Finally, after all the aggregations in Spark, a reporting dashboard generated using Tableau.

This code may help Spark users who want to perform a similar migration.

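The three use cases listed above can be sketched in PySpark roughly as follows. This is a minimal illustration, not the code from the PR: the helper names, the comma-delimited layout of the HDFS part files, and the MySQL Connector/J 8.x driver class are all assumptions made for the example, and every function expects an already created `SparkSession` to be passed in.

```python
from typing import Dict


def mysql_jdbc_options(host: str, port: int, database: str, table: str,
                       user: str, password: str) -> Dict[str, str]:
    """Build the option map for Spark's built-in JDBC data source.

    Hypothetical helper; all argument values are supplied by the caller.
    """
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Assumption: MySQL Connector/J 8.x, whose jar must be on Spark's classpath.
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def read_mysql_table(spark, options: Dict[str, str]):
    """Special case: load the table straight from MySQL, with no HDFS copy."""
    return spark.read.format("jdbc").options(**options).load()


def read_hdfs_part_files(spark, hdfs_dir: str):
    """Case 1: read part files that were previously stored in HDFS.

    Assumption: the part files are comma-delimited text without a header row.
    """
    return spark.read.csv(hdfs_dir, header=False, inferSchema=True)


def convert_to_parquet(spark, df, parquet_dir: str):
    """Case 2: rewrite the data as Parquet, then read the Parquet files back."""
    df.write.mode("overwrite").parquet(parquet_dir)
    return spark.read.parquet(parquet_dir)
```

With a session obtained from `SparkSession.builder.getOrCreate()`, a caller would typically build the options once, load the table with `read_mysql_table`, and run the aggregations on the returned DataFrame before exporting the results for the dashboard.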
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
