Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-03-07 Thread via GitHub


XComp commented on PR #24463:
URL: https://github.com/apache/flink/pull/24463#issuecomment-1983532633

   I deleted the branch again because it doesn't enable the S3 credentials, of 
course. :facepalm: Anyway, it's better to replace it with Minio.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-03-07 Thread via GitHub


XComp closed pull request #24463: [FLINK-34324][test] Makes all s3 related 
operations being declared and called in a single location
URL: https://github.com/apache/flink/pull/24463


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-03-07 Thread via GitHub


flinkbot commented on PR #24463:
URL: https://github.com/apache/flink/pull/24463#issuecomment-1983274468

   
   ## CI report:
   
   * 2d78f0d7ea556f374bc7f9ba62953045713fe86e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-03-07 Thread via GitHub


XComp commented on PR #24244:
URL: https://github.com/apache/flink/pull/24244#issuecomment-1983251954

   Closing this one becaue [there are still 
issues](https://issues.apache.org/jira/browse/FLINK-34324?focusedCommentId=17814891=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17814891).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-03-07 Thread via GitHub


XComp closed pull request #24244: [FLINK-34324][test] Makes all s3 related 
operations being declared and called in a single location
URL: https://github.com/apache/flink/pull/24244


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-14 Thread via GitHub


XComp commented on PR #24244:
URL: https://github.com/apache/flink/pull/24244#issuecomment-1944268824

   @flinkbot run azure


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-05 Thread via GitHub


flinkbot commented on PR #24269:
URL: https://github.com/apache/flink/pull/24269#issuecomment-1928303748

   
   ## CI report:
   
   * 163efd8584796be84153823bb5180b6213bbecbb UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-05 Thread via GitHub


flinkbot commented on PR #24268:
URL: https://github.com/apache/flink/pull/24268#issuecomment-1928297908

   
   ## CI report:
   
   * e686b9960d39a69031df4f846bbb17a1565a7340 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-05 Thread via GitHub


XComp opened a new pull request, #24269:
URL: https://github.com/apache/flink/pull/24269

   1.17 backport PR for parent PR https://github.com/apache/flink/pull/24244


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-05 Thread via GitHub


XComp opened a new pull request, #24268:
URL: https://github.com/apache/flink/pull/24268

   1.18 backport PR for parent PR #24244 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-01 Thread via GitHub


zentol commented on code in PR #24244:
URL: https://github.com/apache/flink/pull/24244#discussion_r1474268805


##
flink-end-to-end-tests/test-scripts/test_file_sink.sh:
##
@@ -96,13 +51,54 @@ function get_complete_result {
 #   line number in part files
 ###
 function get_total_number_of_valid_lines {
-  if [ "${OUT_TYPE}" == "local" ]; then
-get_complete_result | wc -l | tr -d '[:space:]'
-  elif [ "${OUT_TYPE}" == "s3" ]; then
-s3_get_number_of_lines_by_prefix "${S3_PREFIX}" "part-"
-  fi
+  get_complete_result | wc -l | tr -d '[:space:]'
 }
 
+if [ "${OUT_TYPE}" == "s3" ]; then
+  source "$(dirname "$0")"/common_s3.sh
+  s3_setup hadoop
+
+  JOB_OUTPUT_PATH="s3://$IT_CASE_S3_BUCKET/$S3_PREFIX"
+  set_config_key "state.checkpoints.dir" 
"s3://$IT_CASE_S3_BUCKET/$S3_PREFIX-chk"
+  mkdir -p "$OUTPUT_PATH-chk"
+
+  # overwrites implementation for local runs
+  function get_complete_result {
+s3_get_by_full_path_and_filename_prefix "$OUTPUT_PATH" "$S3_PREFIX" 
"part-" true
+  }

Review Comment:
   I wonder why we are reading from OUTPUT_PATH and not JOB_OUTPUT_PATH 樂 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-01 Thread via GitHub


zentol commented on code in PR #24244:
URL: https://github.com/apache/flink/pull/24244#discussion_r1474261856


##
flink-end-to-end-tests/test-scripts/test_file_sink.sh:
##
@@ -96,13 +51,54 @@ function get_complete_result {
 #   line number in part files
 ###
 function get_total_number_of_valid_lines {
-  if [ "${OUT_TYPE}" == "local" ]; then
-get_complete_result | wc -l | tr -d '[:space:]'
-  elif [ "${OUT_TYPE}" == "s3" ]; then
-s3_get_number_of_lines_by_prefix "${S3_PREFIX}" "part-"
-  fi
+  get_complete_result | wc -l | tr -d '[:space:]'
 }
 
+if [ "${OUT_TYPE}" == "s3" ]; then
+  source "$(dirname "$0")"/common_s3.sh
+  s3_setup hadoop
+
+  JOB_OUTPUT_PATH="s3://$IT_CASE_S3_BUCKET/$S3_PREFIX"
+  set_config_key "state.checkpoints.dir" 
"s3://$IT_CASE_S3_BUCKET/$S3_PREFIX-chk"
+  mkdir -p "$OUTPUT_PATH-chk"
+
+  # overwrites implementation for local runs
+  function get_complete_result {
+s3_get_by_full_path_and_filename_prefix "$OUTPUT_PATH" "$S3_PREFIX" 
"part-" true
+  }

Review Comment:
   Or is that a no-op because in s3 mode we never write anything there...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-01 Thread via GitHub


zentol commented on code in PR #24244:
URL: https://github.com/apache/flink/pull/24244#discussion_r1474257337


##
flink-end-to-end-tests/test-scripts/test_file_sink.sh:
##
@@ -96,13 +51,54 @@ function get_complete_result {
 #   line number in part files
 ###
 function get_total_number_of_valid_lines {
-  if [ "${OUT_TYPE}" == "local" ]; then
-get_complete_result | wc -l | tr -d '[:space:]'
-  elif [ "${OUT_TYPE}" == "s3" ]; then
-s3_get_number_of_lines_by_prefix "${S3_PREFIX}" "part-"
-  fi
+  get_complete_result | wc -l | tr -d '[:space:]'
 }
 
+if [ "${OUT_TYPE}" == "s3" ]; then
+  source "$(dirname "$0")"/common_s3.sh
+  s3_setup hadoop
+
+  JOB_OUTPUT_PATH="s3://$IT_CASE_S3_BUCKET/$S3_PREFIX"
+  set_config_key "state.checkpoints.dir" 
"s3://$IT_CASE_S3_BUCKET/$S3_PREFIX-chk"
+  mkdir -p "$OUTPUT_PATH-chk"
+
+  # overwrites implementation for local runs
+  function get_complete_result {
+s3_get_by_full_path_and_filename_prefix "$OUTPUT_PATH" "$S3_PREFIX" 
"part-" true
+  }

Review Comment:
   This is missing the final line from the original get_complete_result 
function?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-02-01 Thread via GitHub


zentol commented on code in PR #24244:
URL: https://github.com/apache/flink/pull/24244#discussion_r1474255709


##
flink-end-to-end-tests/test-scripts/test_file_sink.sh:
##
@@ -96,13 +51,54 @@ function get_complete_result {
 #   line number in part files
 ###
 function get_total_number_of_valid_lines {
-  if [ "${OUT_TYPE}" == "local" ]; then
-get_complete_result | wc -l | tr -d '[:space:]'
-  elif [ "${OUT_TYPE}" == "s3" ]; then
-s3_get_number_of_lines_by_prefix "${S3_PREFIX}" "part-"
-  fi
+  get_complete_result | wc -l | tr -d '[:space:]'
 }
 
+if [ "${OUT_TYPE}" == "s3" ]; then
+  source "$(dirname "$0")"/common_s3.sh
+  s3_setup hadoop
+
+  JOB_OUTPUT_PATH="s3://$IT_CASE_S3_BUCKET/$S3_PREFIX"
+  set_config_key "state.checkpoints.dir" 
"s3://$IT_CASE_S3_BUCKET/$S3_PREFIX-chk"
+  mkdir -p "$OUTPUT_PATH-chk"
+
+  # overwrites implementation for local runs
+  function get_complete_result {
+s3_get_by_full_path_and_filename_prefix "$OUTPUT_PATH" "$S3_PREFIX" 
"part-" true
+  }
+
+  # overwrites implementation for local runs
+  function get_total_number_of_valid_lines {
+s3_get_number_of_lines_by_prefix "${S3_PREFIX}" "part-"
+  }
+
+  # make sure we delete the file at the end
+  function out_cleanup {
+s3_delete_by_full_path_prefix "$S3_PREFIX"
+s3_delete_by_full_path_prefix "${S3_PREFIX}-chk"
+rollback_openssl_lib
+  }
+
+  on_exit out_cleanup
+elif [ "${OUT_TYPE}" == "local" ]; then

Review Comment:
   How about we use this in the top-most branch so that the local setup is 
closer together?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-01-31 Thread via GitHub


flinkbot commented on PR #24244:
URL: https://github.com/apache/flink/pull/24244#issuecomment-1919180871

   
   ## CI report:
   
   * 8bf09a9c3c67b9561ebe461f130f3aebb2eb908a UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] [FLINK-34324][test] Makes all s3 related operations being declared and called in a single location [flink]

2024-01-31 Thread via GitHub


XComp opened a new pull request, #24244:
URL: https://github.com/apache/flink/pull/24244

   ## What is the purpose of the change
   
   * Fixes error message in test execution
   * unifies S3-related code in one location
   
   ## Brief change log
   
   * Moves the entire s3-related logic into the if block
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
 - If yes, how is the feature documented? not applicable


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org