zentol opened a new pull request #7605: [FLINK-11476][PROTOTYPE] Port Kafka E2E test to Java
URL: https://github.com/apache/flink/pull/7605

## What is the purpose of the change

This PR is a prototype for writing Java-based end-to-end tests. It contains a port of the Kafka end-to-end test.

## "Brief" change log

The first few commits (labeled as hotfixes) are cleanups and extensions around existing code.

`[hotfix][build] Remove unnecessary version tag`

Removes a redundant version tag.

`[hotfix][tests] Remove forced step logging in AutoClosableProcess`

The convenience methods for creating processes forced the user to pass a description of what the process is supposed to be doing. This is tiresome to work with, the output doesn't look as nice as dedicated log messages would, and it is inconsistent for non-blocking operations. As such, the logging has been removed. Additionally, this commit gives the caller control over whether the process should inherit the parent's IO streams.

`[hotfix][tests] Extend Tar wrapper to support strip argument`

A simple extension of the Tar wrapper to support additional functionality. This is useful for archives that only contain a single directory in the root.

`[hotfix][tests] Add modified ExternalResource class`

This commit adds a modified version of the JUnit `ExternalResource`. It is an interface instead of an abstract class and allows resources to behave differently in their `after` method depending on whether the test succeeded. This will be used by the `FlinkDistribution` to back up logs for failed tests.

`[hotfix][tests] Backup logs for failed tests`

Modifies the `FlinkDistribution` to implement our modified `ExternalResource` interface instead, and adds backup logic for logs if a test fails.
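For illustration, a success-aware `ExternalResource` interface along these lines might look roughly as follows. This is a hypothetical sketch (method names and the demo class are invented here, not taken from the PR):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: unlike JUnit's abstract ExternalResource class, this is
// an interface, so implementations can still extend other classes, and the
// "after" behavior can depend on whether the test succeeded.
interface TestResource {
    void before() throws Exception;

    // Called when the test succeeded.
    default void afterTestSuccess() {}

    // Called when the test failed; a FlinkDistribution-like resource could
    // back up its logs here before cleaning up.
    default void afterTestFailure() {
        afterTestSuccess();
    }
}

public class TestResourceDemo {
    public static void main(String[] args) throws Exception {
        List<String> events = new ArrayList<>();
        TestResource resource = new TestResource() {
            @Override
            public void before() {
                events.add("before");
            }

            @Override
            public void afterTestFailure() {
                events.add("backup-logs");
            }
        };

        resource.before();
        resource.afterTestFailure(); // simulate a failed test
        System.out.println(events);  // -> [before, backup-logs]
    }
}
```

A JUnit `TestRule` adapter could then invoke `afterTestSuccess` or `afterTestFailure` based on whether the wrapped statement threw.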
`[hotfix][tests] Support starting individual Job-/TaskManagers`

A small extension of the `FlinkDistribution` to expose `jobmanager.sh start` and `taskmanager.sh start`.

`[hotfix][tests] Support job submissions`

Extends the `FlinkDistribution` to support running jars.

Now to the actual prototype:

`[FLINK-11463][tests] Add utilities`

Adds a number of utilities used in subsequent commits, kept in a separate commit to make re-ordering of commits easier.

* `TestUtils#getResourceJar` is used for creating absolute paths for files in a module's `target` directory, like copied dependencies.
* `OperatingSystemRestriction` provides a common way to disable tests based on the operating system. This will be commonly needed for E2E tests until we get rid of all shell-script and unix-tool dependencies.
* `ParameterProperty` is used for retrieving and parsing parameters from system properties.
* `FactoryUtils#loadAndInvokeFactory` is used by the core resource factory methods to load and process factories.

`[FLINK-11465][tests] Add FlinkResource`

Adds a common `FlinkResource` interface for interacting with Flink, along with a `LocalStandaloneFlinkResource` implementation that sets up local standalone clusters. The interfaces are intentionally kept bare-bones. Tests can make use of a `FlinkResource` like this:

```
@Rule
public final FlinkResource flink = FlinkResource.get();
```

This will return one `FlinkResource` implementation. Which implementation is loaded depends on the available `FlinkResourceFactory` implementations (loaded via ServiceLoader) and the system properties that have been set. As of right now this will always return a `LocalStandaloneFlinkResource`, as it is the only existing implementation. The `LocalStandaloneFlinkResource` is backed by a `FlinkDistribution` and is rather straightforward.

`[FLINK-11464][tests] Add DownloadCache`

Adds a utility for downloading and caching artifacts. The access/loading mechanism works identically to the `FlinkResource`.
Three implementations are provided in this PR. All of them support downloading arbitrary files, but they differ in their caching behavior.

* `LolCache` - The default implementation. Does not cache anything.
* `PersistingDownloadCache` - The intended default implementation for local setups. Will be used if the `cache-dir` property is set to a local directory to be used as the cache location. Identifies files by the hash of the URL, and encodes the hash and download date into the file name. Supports time-based cleanup of cached entries by setting the `cache-ttl` property to a `Period`, e.g. `P1D` (one day), when running maven.
* `TravisDownloadCache` - An implementation intended for Travis. Similar to the `PersistingDownloadCache`, but uses a build-number-based TTL instead of a time-based one.

`[FLINK-11466][tests] Add KafkaResource`

Adds a common `KafkaResource` interface for interacting with Kafka, along with a `LocalStandaloneKafkaResource` implementation that downloads and sets up a local cluster with the bundled zookeeper. This should behave similarly to the Kafka cluster setup in our bash end-to-end tests (see `kafka-common.sh`). The access/loading mechanism works identically to the `FlinkResource`.

`[FLINK-11468][tests] Setup surefire execution`

Sets up a separate surefire execution for end-to-end tests in `flink-end-to-end-tests`.

* e2e-tests must be suffixed with `ITCase`.
* e2e-tests must be annotated with `@Category(TravisGroupX.class)` to have them run in a specific Travis job.
* e2e-tests that rely on hadoop should also be marked as such: `@Category({TravisGroupX.class, Hadoop.class})`.

`[FLINK-11467][kafka][tests] Port Kafka Streaming E2E test`

Actually ports the Kafka streaming E2E test to Java, building on the previous commits. Effectively a 1:1 port of `test-streaming-kafka-common.sh`.

## Verifying this change

Individual utilities are not tested.
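As an illustration of the `PersistingDownloadCache` scheme described above (URL hash and download date encoded into the file name, `Period`-based TTL), here is a hedged sketch; the class, method names, and format below are invented for this example and are not the PR's code:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.time.LocalDate;
import java.time.Period;

// Hypothetical sketch of a persisting download cache's naming scheme:
// entries are identified by a hash of the URL, and the download date is
// encoded into the file name so a time-based cleanup can parse it back out.
public class DownloadCacheNamingDemo {
    static String cacheFileName(String url, LocalDate downloadDate) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(url.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString() + "__" + downloadDate;
    }

    static boolean isExpired(String fileName, Period ttl, LocalDate now) {
        // Recover the download date from the file name and apply the TTL.
        LocalDate downloadDate =
            LocalDate.parse(fileName.substring(fileName.indexOf("__") + 2));
        return downloadDate.plus(ttl).isBefore(now);
    }

    public static void main(String[] args) throws Exception {
        String name = cacheFileName(
            "https://archive.apache.org/kafka.tgz", LocalDate.of(2019, 1, 1));
        System.out.println(name.endsWith("__2019-01-01"));
        // With a one-day TTL ("P1D"), an entry from Jan 1 is expired by Jan 3.
        System.out.println(isExpired(name, Period.parse("P1D"), LocalDate.of(2019, 1, 3)));
    }
}
```

`Period.parse("P1D")` is how a `cache-ttl` value like `P1D` from the maven command line would map onto `java.time`.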
The Kafka test can be executed by running `mvn verify -Dcategories="org.apache.flink.tests.util.categories.TravisGroup1" -DdistDir=<path to flink-dist>` in `flink-streaming-kafka-driver`. All tests can be executed by running `mvn verify -Dcategories="" -DdistDir=<path to flink-dist>` in `flink-end-to-end-tests`.
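These `-D` flags arrive in the tests as system properties, which is what the `ParameterProperty` utility mentioned earlier is for. A minimal sketch of such a helper might look as follows (the class shape and the `e2e.retries` property are invented for this example):

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: reads a value from a system property and parses it
// with a converter function, falling back to a default if the property is
// unset (e.g. when a -D flag was not passed to maven).
class ParameterProperty<V> {
    private final String propertyName;
    private final Function<String, V> converter;

    ParameterProperty(String propertyName, Function<String, V> converter) {
        this.propertyName = propertyName;
        this.converter = converter;
    }

    Optional<V> get() {
        return Optional.ofNullable(System.getProperty(propertyName)).map(converter);
    }

    V getOrDefault(V defaultValue) {
        return get().orElse(defaultValue);
    }
}

public class ParameterPropertyDemo {
    public static void main(String[] args) {
        ParameterProperty<Integer> retries =
            new ParameterProperty<>("e2e.retries", Integer::parseInt);

        System.out.println(retries.getOrDefault(3)); // property unset -> default

        System.setProperty("e2e.retries", "5");
        System.out.println(retries.getOrDefault(3)); // property set -> parsed value
    }
}
```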
