robertwb commented on code in PR #30879:
URL: https://github.com/apache/beam/pull/30879#discussion_r1566531486


##########
contributor-docs/code-change-guide.md:
##########
@@ -0,0 +1,519 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+This guide is for Beam users and developers changing and testing Beam code.
+Specifically, this guide provides information about:
+
+1. Testing code changes locally
+
+2. Building Beam artifacts with modified Beam code and using the modified code for pipelines
+
+# Repository structure
+
+The Apache Beam GitHub repository (Beam repo) is, for the most part, a "mono repo":
+it contains everything in the Beam project, including the SDK, test
+infrastructure, dashboards, the [Beam website](https://beam.apache.org),
+the [Beam Playground](https://play.beam.apache.org), and so on.
+
+## Gradle quick start
+
+The Beam repo is a single Gradle project that contains all components, including Python,
+Go, the website, etc. It is useful to familiarize yourself with the Gradle project structure:
+https://docs.gradle.org/current/userguide/multi_project_builds.html
+
+### Gradle key concepts
+
+Gradle uses the following key concepts:
+
+* **project**: a folder that contains the `build.gradle` file
+* **task**: an action defined in the `build.gradle` file
+* **plugin**: runs in the project's `build.gradle` and contains predefined tasks and hierarchies
+
+For example, common tasks for a Java project or subproject include:
+
+- `compileJava`
+- `compileTestJava`
+- `test`
+- `integrationTest`
+
+To run a Gradle task, the command is `./gradlew -p <project path> <task>` or `./gradlew :project:path:task_name`. For example:
+
+```
+./gradlew -p sdks/java/core compileJava
+
+./gradlew :sdks:java:harness:test
+```
+
+### Gradle project configuration: Beam specific
+
+* A **huge** plugin, `buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin`, manages everything.
+
+In each Java project or subproject, the `build.gradle` file starts with:
+
+```groovy
+
+apply plugin: 'org.apache.beam.module'
+
+applyJavaNature( ... )
+```
+
+Relevant usage of `BeamModulePlugin` includes:
+* Manage Java dependencies
+* Configure projects (Java, Python, Go, Proto, Docker, gRPC, Avro, and so on)
+  * Java -> `applyJavaNature`; Python -> `applyPythonNature`, and so on
+  * Define common custom tasks for each type of project
+    * `test`: run Java unit tests
+    * `spotlessApply`: format Java code
+
+## Code paths
+
+The following are example code paths relevant for SDK development:
+
+* `sdks/java` Java SDK
+  * `sdks/java/core` Java core
+  * `sdks/java/harness` SDK harness (entrypoint of SDK container)
+
+* `runners` runner support, written in Java. For example:

Review Comment:
   These are only relevant for Java. It might make sense to state that for 
everything but Java, all relevant files are in sdks/LANG, whereas for Java the 
code is split up more widely. 
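
   To illustrate the distinction, here is a minimal sketch (not part of the PR) that guesses the SDK language from a repo-relative path. The function name `sdk_language` and the set `JAVA_ROOTS_OUTSIDE_SDKS` are hypothetical, and the set lists only `runners`, the one Java location outside `sdks/` that the quoted guide names; the real list is likely longer.

```python
from pathlib import PurePosixPath
from typing import Optional

# Assumption: top-level directories outside sdks/ that hold Java code.
# Only `runners` is named in the quoted guide; this is not exhaustive.
JAVA_ROOTS_OUTSIDE_SDKS = {"runners"}


def sdk_language(repo_path: str) -> Optional[str]:
    """Guess which SDK language a repo-relative path belongs to."""
    parts = PurePosixPath(repo_path).parts
    # For every language, sdks/LANG/... maps directly to LANG.
    if len(parts) >= 2 and parts[0] == "sdks":
        return parts[1]
    # Java code is also spread across other top-level directories.
    if parts and parts[0] in JAVA_ROOTS_OUTSIDE_SDKS:
        return "java"
    # Anything else (workflows, website, ...) is not tied to one SDK here.
    return None
```

   With this, `sdk_language("sdks/python/apache_beam/io")` gives `"python"`, while both `sdks/java/core` and `runners/direct-java` map to `"java"`.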



##########
contributor-docs/code-change-guide.md:
##########
@@ -0,0 +1,519 @@
+* `runners` runner support, written in Java. For example:
+  * `runners/direct-java` Java direct runner
+  * `runners/flink-java` Java Flink runner
+  * `runners/google-cloud-dataflow-java` Dataflow runner (job submission, translation, etc.)
+    * `runners/google-cloud-dataflow-java/worker` Worker on Dataflow legacy runner
+
+* `sdks/python` contains the setup file and scripts to trigger test suites
+  * `sdks/python/apache_beam` the actual Beam package
+    * `sdks/python/apache_beam/runners/worker` SDK worker harness entrypoint, state sampler
+    * `sdks/python/apache_beam/io` I/O connectors
+    * `sdks/python/apache_beam/transforms` most "core" components
+    * `sdks/python/apache_beam/ml` Beam ML
+    * `sdks/python/apache_beam/runners` runner implementations and wrappers
+    * ...
+
+* `sdks/go` Go SDK
+
+* `.github/workflows` GitHub Actions workflows (for example, tests that run on a PR). Most
+  workflows run a single Gradle command. Check which command is running for
+  a test so that you can run the same command locally during development.
+
+## Environment setup
+
+To set up local development environments, first see the [Contributing guide](../CONTRIBUTING.md).
+If you plan to use Dataflow, see the [Google Cloud documentation](https://cloud.google.com/dataflow/docs/quickstarts/create-pipeline-java) to set up `gcloud` credentials.
+
+To check if your environment is set up, follow these steps:
+
+Your `PATH` needs to have the following elements configured.

Review Comment:
   These are not needed for all development, e.g. one can do much Go/Python 
development without java installed, and vice versa. 
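
   A per-SDK check could make this concrete. The following is only a sketch (not part of the PR): the names `TOOLS_BY_SDK` and `missing_tools` are hypothetical, and the tool lists are illustrative assumptions rather than the guide's authoritative requirements.

```python
import shutil

# Assumption: illustrative executables per SDK, not an official list.
TOOLS_BY_SDK = {
    "java": ["java"],
    "python": ["python3"],
    "go": ["go"],
}


def missing_tools(sdk, which=shutil.which):
    """Return the tools for `sdk` that are not found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in TOOLS_BY_SDK[sdk] if which(tool) is None]


if __name__ == "__main__":
    for sdk in TOOLS_BY_SDK:
        missing = missing_tools(sdk)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{sdk}: {status}")
```

   Running the script reports each SDK separately, so a missing `java` does not block someone who only works on Go or Python.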



##########
contributor-docs/code-change-guide.md:
##########
@@ -0,0 +1,519 @@
+Your `PATH` needs to have the following elements configured.
+
+* A Java environment (any supported Java version, preferably Java 8 as of 2024).
+  * This environment is needed for all development, because Beam is a Gradle project that uses the JVM.
+  * Recommended: Use [sdkman](https://sdkman.io/install) to manage Java versions.
+* A Python environment (any supported Python version)
+  * Needed for Python SDK development
+  * Recommended: Use [`pyenv`](https://github.com/pyenv/pyenv) and
+    a [virtual environment](https://docs.python.org/3/library/venv.html) to manage Python versions.
+* A Go environment. Install the latest Go version.
+  * Needed for Go SDK development and SDK container changes (for all SDKs), because
+  the container entrypoint scripts are written in Go.
+* A Docker environment.

Review Comment:
   You don't need docker (and, consequently, go) for cross-language anymore. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
