rahil-c commented on code in PR #7819:
URL: https://github.com/apache/hudi/pull/7819#discussion_r1093844152


##########
website/docs/cli.md:
##########
@@ -5,10 +5,22 @@ last_modified_at: 2021-08-18T15:59:57-04:00
 ---
 
 ### Local set up
-Once hudi has been built, the shell can be fired by via  `cd hudi-cli && ./hudi-cli.sh`. A hudi table resides on DFS, in a location referred to as the `basePath` and
+Once hudi has been built, the shell can be fired by via  `cd hudi-cli && ./hudi-cli.sh`.
+
+Optionally in release `0.13.0` we have now added another way of launching the `hudi cli`, which is using the `hudi-cli-bundle`.
+There are a couple of requirements when using this approach such as having `spark` installed locally on your machine. 
+It is required to use a spark distribution with hadoop dependencies packaged such as `spark-3.3.1-bin-hadoop2.tgz` from https://archive.apache.org/dist/spark/.
+We also recommend you set an env variable `$SPARK_HOME` to the path of where spark is installed on your machine. 
+One important thing to note is that the `hudi-spark-bundle` should also be present when using the `hudi-cli-bundle`.  

Review Comment:
   Good question. Ideally, the paths of the cli bundle and the spark bundle 
should be inferred by the logic of the script: 
https://github.com/apache/hudi/blob/master/packaging/hudi-cli-bundle/hudi-cli-with-bundle.sh#L23
   
   Users can also set these env vars themselves in their shell, but that is 
not required. 
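
To illustrate the "use the env var if set, otherwise infer" behavior described above, here is a minimal sketch. It is not the code from `hudi-cli-with-bundle.sh`: `resolve_jar` is a hypothetical helper, and the directory layout and glob patterns are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Sketch only: resolve_jar is a hypothetical helper, not part of Hudi.
# Usage: resolve_jar "<explicit value or empty>" "<directory>" "<glob>"
resolve_jar() {
  local explicit="$1" dir="$2" pattern="$3"
  if [ -n "$explicit" ]; then
    # The caller already provided a path (e.g. via an exported env var),
    # so honor it as-is and skip inference.
    printf '%s\n' "$explicit"
    return 0
  fi
  # Otherwise infer: take the first matching jar in the directory,
  # skipping sources and javadoc jars.
  local candidate
  for candidate in "$dir"/$pattern; do
    case "$candidate" in
      *sources*|*javadoc*) continue ;;
    esac
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  # Nothing matched: signal failure so the caller can report it.
  return 1
}
```

A launcher could then call it as `CLI_BUNDLE_JAR="$(resolve_jar "${CLI_BUNDLE_JAR:-}" "$DIR/target" 'hudi-cli-bundle*.jar')"`. The variable name `CLI_BUNDLE_JAR` and the `target` path are assumed here; check the linked script for the names it actually reads.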
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.