[
https://issues.apache.org/jira/browse/FLINK-8973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16411383#comment-16411383
]
ASF GitHub Bot commented on FLINK-8973:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/5750#discussion_r176726476
--- Diff: flink-end-to-end-tests/test-scripts/common.sh ---
@@ -39,6 +39,93 @@ cd $TEST_ROOT
export TEST_DATA_DIR=$TEST_INFRA_DIR/temp-test-directory-$(date +%S%N)
echo "TEST_DATA_DIR: $TEST_DATA_DIR"
+function revert_default_config() {
+ sed 's/^ //g' > ${FLINK_DIR}/conf/flink-conf.yaml << EOL
+ #==============================================================================
+ # Common
+ #==============================================================================
+
+ jobmanager.rpc.address: localhost
+ jobmanager.rpc.port: 6123
+ jobmanager.heap.mb: 1024
+ taskmanager.heap.mb: 1024
+ taskmanager.numberOfTaskSlots: 1
+ parallelism.default: 1
+
+ #==============================================================================
+ # Web Frontend
+ #==============================================================================
+
+ web.port: 8081
+EOL
+}
+
+function create_ha_conf() {
+
+ # create the masters file (only one currently).
+ # This must have all the masters to be used in HA.
+ echo "localhost:8081" > ${FLINK_DIR}/conf/masters
+
+ # then move on to create the flink-conf.yaml
+
+ if [ -e $TEST_DATA_DIR/recovery ]; then
+ echo "File ${TEST_DATA_DIR}/recovery exists. Deleting it..."
+ rm -rf $TEST_DATA_DIR/recovery
+ fi
+
+ sed 's/^ //g' > ${FLINK_DIR}/conf/flink-conf.yaml << EOL
+ #==============================================================================
+ # Common
+ #==============================================================================
+
+ jobmanager.rpc.address: localhost
+ jobmanager.rpc.port: 6123
+ jobmanager.heap.mb: 1024
+ taskmanager.heap.mb: 1024
+ taskmanager.numberOfTaskSlots: 4
+ parallelism.default: 1
+
+ #==============================================================================
+ # High Availability
+ #==============================================================================
+
+ high-availability: zookeeper
+ high-availability.zookeeper.storageDir: file://${TEST_DATA_DIR}/recovery/
+ high-availability.zookeeper.quorum: localhost:2181
+ high-availability.zookeeper.path.root: /flink
+ high-availability.cluster-id: /test_cluster_one
+
+ #==============================================================================
+ # Web Frontend
+ #==============================================================================
+
+ web.port: 8081
+EOL
+}
+
+function start_ha_cluster {
+ echo "Setting up HA Cluster..."
+ create_ha_conf
+ start_local_zk
+ start_cluster
+}
+
+function start_local_zk {
+ while read server ; do
+ server=$(echo -e "${server}" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//') # trim
+
+ # match server.id=address[:port[:port]]
+ if [[ $server =~ ^server\.([0-9]+)[[:space:]]*\=[[:space:]]*([^: \#]+) ]]; then
+ id=${BASH_REMATCH[1]}
+ address=${BASH_REMATCH[2]}
--- End diff --
`address` seems to be unused
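
For illustration only, here is a minimal sketch of the trim-and-match step in isolation, assuming input lines of the form `server.id=address[:port[:port]]` as described in the diff's comment. The helper name `parse_zk_server_line` is hypothetical and not part of the PR. It prints both capture groups so it is easy to see what `BASH_REMATCH[2]` holds; if only the id is needed, the second group can simply stay unassigned.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not from the PR): parse one "server.N=host[:port[:port]]" line.
# Prints "<id> <address>" on a match and returns non-zero otherwise.
parse_zk_server_line() {
    local line="$1"

    # Trim leading and trailing whitespace, as the diffed loop does.
    line=$(printf '%s' "$line" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')

    # Keeping the pattern in a variable avoids quoting pitfalls inside [[ ... =~ ... ]].
    local re='^server\.([0-9]+)[[:space:]]*=[[:space:]]*([^: #]+)'

    if [[ $line =~ $re ]]; then
        local id=${BASH_REMATCH[1]}       # numeric server id
        local address=${BASH_REMATCH[2]}  # host part, up to the first ':', space, or '#'
        echo "${id} ${address}"
        return 0
    fi
    return 1
}

parse_zk_server_line "  server.1=localhost:2888:3888  "
```

Lines that do not start with `server.<number>` (for example comments or other settings) make the helper return non-zero, so a caller can skip them.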
> End-to-end test: Run general purpose job with failures in standalone mode
> -------------------------------------------------------------------------
>
> Key: FLINK-8973
> URL: https://issues.apache.org/jira/browse/FLINK-8973
> Project: Flink
> Issue Type: Sub-task
> Components: Tests
> Affects Versions: 1.5.0
> Reporter: Till Rohrmann
> Assignee: Kostas Kloudas
> Priority: Blocker
> Fix For: 1.5.0
>
>
> We should set up an end-to-end test which runs the general purpose job
> (FLINK-8971) in a standalone setting with HA enabled (ZooKeeper). When
> running the job, the job failures should be activated.
> Additionally, we should randomly kill Flink processes (cluster entrypoint and
> TaskExecutors). When killing them, we should also spawn new processes to make
> up for the loss.
> This end-to-end test case should run with all different state backend
> settings: {{RocksDB}} (full/incremental, async/sync), {{FsStateBackend}}
> (sync/async).
> We should then verify that the general purpose job is successfully recovered
> without data loss or other failures.
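
As a rough sketch of the settings matrix listed in the description, the combinations could be enumerated as follows. The function name `list_backend_settings` and the printed labels are assumptions made for illustration; they are not Flink configuration keys and do not come from the issue or the PR.

```shell
#!/usr/bin/env bash

# Hypothetical enumeration of the state backend settings named in the issue.
# Each output line is "<backend> [<checkpoint mode>] <snapshot mode>".
list_backend_settings() {
    local mode snapshot

    # RocksDB: full/incremental checkpoints, each with async/sync snapshots.
    for mode in full incremental; do
        for snapshot in async sync; do
            echo "rocksdb ${mode} ${snapshot}"
        done
    done

    # FsStateBackend: sync/async snapshots only.
    for snapshot in sync async; do
        echo "file ${snapshot}"
    done
}

list_backend_settings
```

A test driver could read these lines and run the general purpose job once per combination.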