xBis7 opened a new pull request, #6483: URL: https://github.com/apache/hadoop/pull/6483
### Description of PR

There are special branches for running Hadoop in Docker:

* [docker-hadoop-runner-latest](https://github.com/apache/hadoop/tree/docker-hadoop-runner-latest)
* [docker-hadoop-runner-jdk11](https://github.com/apache/hadoop/tree/docker-hadoop-runner-jdk11)
* [docker-hadoop-runner-jdk8](https://github.com/apache/hadoop/tree/docker-hadoop-runner-jdk8)
* [docker-hadoop-runner](https://github.com/apache/hadoop/tree/docker-hadoop-runner)
* [docker-hadoop-3](https://github.com/apache/hadoop/tree/docker-hadoop-3)
* [docker-hadoop-2](https://github.com/apache/hadoop/tree/docker-hadoop-2)

These branches run specific versions of Hadoop. For example, branch `docker-hadoop-3` runs `hadoop-3.3.6`.

This patch adds a similar setup under the main source tree that runs the latest trunk. This will be useful for testing code changes and debugging. It will also make it easier for new Hadoop contributors to test the project and get familiar with it.

The files were originally copied from the branch `docker-hadoop-3` and were modified so that the code comes from trunk and Maven, rather than the user, builds the Docker image.

More specifically, in branch `docker-hadoop-3`:

* The Dockerfile downloads Hadoop release 3.3.6, un-tars it inside the containers, and sets it up
* The user has to build the Docker image before starting the Docker environment

With these changes, in trunk:

* The user doesn't have to build the image; Maven does this while packaging the project
* Maven generates a directory `hadoop-<version>` under `hadoop-dist/target`
* That directory contains the same files as a release
* The compose files are placed under that location
* The files from that directory are mounted as Docker volumes. That way they are present in the containers when the environment starts and are also used for starting the services (namenode, etc.)

Jira: https://issues.apache.org/jira/browse/HADOOP-18682

### How was this patch tested?
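The log comparison described in this section could also be scripted. The snippet below is only a sketch and is not part of this patch: the helper name `has_marker` and the regular expression are assumptions, with the line layout inferred from the two sample NameNode log lines quoted in this description.

```python
import re

# Assumed layout, inferred from the sample lines in this PR description:
# "<date> <time> <LEVEL> <Logger>:<line> - <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>\w+):(?P<line>\d+) - (?P<message>.*)$"
)


def has_marker(log_lines, marker, logger="NameNode"):
    """Return True if any message logged by `logger` starts with `marker`."""
    for raw in log_lines:
        match = LOG_LINE.match(raw.strip())
        if match and match["logger"] == logger and match["message"].startswith(marker):
            return True
    return False


# Sample lines copied from this PR description.
before = ["2024-01-22 09:30:57 INFO NameNode:1846 - createNameNode []"]
after = ["2024-01-22 09:58:44 INFO NameNode:1846 - xbis: createNameNode []"]

print(has_marker(before, "xbis:"))  # False: log from the unmodified build
print(has_marker(after, "xbis:"))   # True: log from the rebuilt trunk code
```

In practice the lines would come from the container logs rather than from hard-coded lists; how those logs are collected is left out here because it depends on the local setup.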
This patch was tested manually in a local environment. It adds a setup for running Hadoop in Docker. A `README.md` has been added with instructions for testing the changes locally.

I've also created a patch and tested it against trunk with `dev-support/bin/test-patch /tmp/1.patch`.

To test that the Docker environment uses the latest code from trunk, I added a prefix to this log message

```
2024-01-22 09:30:57 INFO NameNode:1846 - createNameNode []
```

and then built the project again, started the Docker environment, and checked the logs

```
2024-01-22 09:58:44 INFO NameNode:1846 - xbis: createNameNode []
```

### For code changes:

- [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [X] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
