vinothchandar commented on a change in pull request #937: [HUDI-291] Simplify quickstart documentation
URL: https://github.com/apache/incubator-hudi/pull/937#discussion_r331274747
##########
File path: docs/quickstart.md
##########
@@ -3,196 +3,186 @@ title: Quickstart
 keywords: hudi, quickstart
 tags: [quickstart]
 sidebar: mydoc_sidebar
-toc: false
+toc: true
 permalink: quickstart.html
 ---
 <br/>
-To get a quick peek at Hudi's capabilities, we have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0)
-that showcases this on a docker based setup with all dependent systems running locally. We recommend you replicate the same setup
-and run the demo yourself, by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi,
-refer to [migration guide](migration_guide.html).
-If you have Hive, Hadoop, Spark installed already & prefer to do it on your own setup, read on.
+This guide provides a quick peak at Hudi's capabilities using simple spark-shell. Using Spark datasources, this guide
+walks through code snippets that allows you to insert and update a Hudi table of default Storage type:
+ [Copy on Write](https://hudi.apache.org/concepts.html#copy-on-write-storage).
+After each write operation we show how to read the data. We will also be looking at how to query a Hudi table incrementally.
-## Download Hudi
+We have put together a [demo video](https://www.youtube.com/watch?v=VhNgUsxdrD0) that showcases this on a docker based
+setup with all dependent systems running locally. We recommend you replicate the same setup and run the demo yourself,
+by following steps [here](docker_demo.html). Also, if you are looking for ways to migrate your existing data to Hudi,
+refer to [migration guide](migration_guide.html).
-Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line
+For the quickstart, you would need to build Hudi spark bundle jar and provide that to the spark shell as shown below.
-```
-$ mvn clean install -DskipTests -DskipITs
-```
-
-Hudi works with Hive 2.3.x or higher versions. As long as Hive 2.x protocol can talk to Hive 1.x, you can use Hudi to
-talk to older hive versions.
-
-For IDE, you can pull in the code into IntelliJ as a normal maven project.
-You might want to add your spark jars folder to project dependencies under 'Module Setttings', to be able to run from IDE.
-
-
-### Version Compatibility
+## Build Hudi spark bundle jar
-Hudi requires Java 8 to be installed on a *nix system. Hudi works with Spark-2.x versions.
-Further, we have verified that Hudi works with the following combination of Hadoop/Hive/Spark.
-
-| Hadoop | Hive | Spark | Instructions to Build Hudi |
-| ---- | ----- | ---- | ---- |
-| Apache hadoop-2.[7-8].x | Apache hive-2.3.[1-3] | spark-2.[1-3].x | Use "mvn clean install -DskipTests" |
-
-If your environment has other versions of hadoop/hive/spark, please try out Hudi
-and let us know if there are any issues.
-
-## Generate Sample Dataset
-
-### Environment Variables
-
-Please set the following environment variables according to your setup. We have given an example setup with CDH version
+Hudi requires Java 8 to be installed on a *nix system.
+Check out [code](https://github.com/apache/incubator-hudi) and normally build the maven project, from command line:

Review comment:
include git clone command as well? so someone can just keep copy pasting.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services