ConverJens commented on pull request #13723:
URL: https://github.com/apache/beam/pull/13723#issuecomment-765258288


   @dandy10 I corrected my s3_access_key argument but still got the same error. 
   
   The weird thing is that I believe I have passed them on correctly, 
because if I switch to in_memory and one worker it works as expected and Beam 
writes to MinIO, so something seems slightly off.
   
   I think there is a discrepancy in how the options are passed on to the 
filesystem itself. The first line of the error message I'm getting starts 
with:
   ```
   "Error in _start_upload while inserting file 
s3://pipelines/tfx/trace_pipeline_e2e/FileBasedExampleGenWithDate/examples/1438/train/beam-temp-data_tfrecord-286f19285c8d11ebbb52a24bbfe454c5/319ecf64-85bb-46bd-afcd-a355144724a7.data_tfrecord.gz:
   ```
   which is the correct S3 path. Later on the same line it says:
   ```
   botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: \"
   
https://pipelines.s3.amazonaws.com/tfx/trace_pipeline_e2e/FileBasedExampleGenWithDate/examples/1438/train/beam-temp-data_tfrecord-286f19285c8d11ebbb52a24bbfe454c5/319ecf64-85bb-46bd-afcd-a355144724a7.data_tfrecord.gz?uploads
   ```
   which seems to indicate that the S3 client is still trying to reach 
pipelines.s3.amazonaws.com, which I assume is the default endpoint.
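
To make the mismatch concrete, here is a minimal sketch of the check I am doing by eye on the error message. It uses only the standard library. `EXPECTED_HOST` is a hypothetical placeholder for my MinIO host, not my real setting, and the helper names are made up for illustration. As far as I understand, `pipelines.s3.amazonaws.com` is what a virtual-hosted-style request for the bucket `pipelines` looks like when no custom endpoint reaches the client.

```python
from urllib.parse import urlparse

# Hypothetical placeholder for the MinIO endpoint the options should point at.
EXPECTED_HOST = "minio.example.internal"

# Endpoint URL copied from the ConnectTimeoutError in the log.
FAILING_URL = (
    "https://pipelines.s3.amazonaws.com/tfx/trace_pipeline_e2e/"
    "FileBasedExampleGenWithDate/examples/1438/train/"
    "beam-temp-data_tfrecord-286f19285c8d11ebbb52a24bbfe454c5/"
    "319ecf64-85bb-46bd-afcd-a355144724a7.data_tfrecord.gz?uploads"
)


def endpoint_host(url: str) -> str:
    """Return the hostname the client tried to connect to."""
    return urlparse(url.strip()).hostname or ""


def is_aws_default(url: str) -> bool:
    """True if the URL points at an amazonaws.com host."""
    host = endpoint_host(url)
    return host == "amazonaws.com" or host.endswith(".amazonaws.com")


if __name__ == "__main__":
    print("host in error:", endpoint_host(FAILING_URL))
    print("expected host:", EXPECTED_HOST)
    print("fell back to AWS default:", is_aws_default(FAILING_URL))
```

If the endpoint option had reached the client on the worker, I would expect the host in the error to match the configured one instead of an amazonaws.com host.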
   
   Do you have any idea why that is?
   
   Below is the full first line again.
   ```
   INFO:apache_beam.runners.portability.local_job_service:Worker: severity: 
ERROR timestamp {   seconds: 1611304997   nanos: 298857688 } message: "Error in 
_start_upload while inserting file 
s3://pipelines/tfx/trace_pipeline_e2e/FileBasedExampleGenWithDate/examples/1438/train/beam-temp-data_tfrecord-286f19285c8d11ebbb52a24bbfe454c5/319ecf64-85bb-46bd-afcd-a355144724a7.data_tfrecord.gz:
 Traceback (most recent call last):\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connection.py\", line 170, in 
_new_conn\n    (self._dns_host, self.port), self.timeout, **extra_kw\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/util/connection.py\", line 96, 
in create_connection\n    raise err\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/util/connection.py\", line 86, 
in create_connection\n    sock.connect(sa)\nsocket.timeout: timed out\n\nDuring 
handling of the above exception, another exception occurred:\n\nTraceback (most 
recent call last):\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/httpsession.py\", line 317, 
in send\n    
chunked=self._chunked(request.headers),\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py\", line 756, 
in urlopen\n    method, url, error=e, _pool=self, 
_stacktrace=sys.exc_info()[2]\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/util/retry.py\", line 506, in 
increment\n    raise six.reraise(type(error), error, _stacktrace)\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/packages/six.py\", line 735, 
in reraise\n    raise value\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py\", line 706, 
in urlopen\n    chunked=chunked,\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py\", line 382, 
in _make_request\n    self._validate_conn(conn)\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connectionpool.py\", line 
1010, in _validate_conn\n    conn.connect()\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connection.py\", line 353, in 
connect\n    conn = self._new_conn()\n  File 
\"/usr/local/lib/python3.7/dist-packages/urllib3/connection.py\", line 177, in 
_new_conn\n    % (self.host, 
self.timeout),\nurllib3.exceptions.ConnectTimeoutError: 
(<botocore.awsrequest.AWSHTTPSConnection object at 0x7f69952bde50>, 
\'Connection to pipelines.s3.amazonaws.com timed out. (connect 
timeout=60)\')\n\nDuring handling of the above exception, another exception 
occurred:\n\nTraceback (most recent call last):\n  File 
\"/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/clients/s3/boto3_client.py\",
 line 171, in create_multipart_upload\n    ContentType=request.mime_type)\n  
File \"/usr/local/lib/python3.7/dist-packages/botocore/client.py\", line 357, 
in _api_call\n    return self._make_api_call(operation_name, kwargs)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/client.py\", line 663, in 
_make_api_call\n    operation_model, request_dict, request_context)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/client.py\", line 682, in 
_make_request\n    return 
self._endpoint.make_request(operation_model, request_dict)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/endpoint.py\", line 102, in 
make_request\n    return self._send_request(request_dict, operation_model)\n  
File \"/usr/local/lib/python3.7/dist-packages/botocore/endpoint.py\", line 137, 
in _send_request\n    success_response, exception):\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/endpoint.py\", line 256, in 
_needs_retry\n    caught_exception=caught_exception, 
request_dict=request_dict)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/hooks.py\", line 356, in 
emit\n    return self._emitter.emit(aliased_event_name, **kwargs)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/hooks.py\", line 228, in 
emit\n    return self._emit(event_name, kwargs)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/hooks.py\", line 211, in 
_emit\n    response = handler(**kwargs)\n 
  File \"/usr/local/lib/python3.7/dist-packages/botocore/retryhandler.py\", 
line 183, in __call__\n    if self._checker(attempts, response, 
caught_exception):\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/retryhandler.py\", line 251, 
in __call__\n    caught_exception)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/retryhandler.py\", line 277, 
in _should_retry\n    return self._checker(attempt_number, response, 
caught_exception)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/retryhandler.py\", line 317, 
in __call__\n    caught_exception)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/retryhandler.py\", line 223, 
in __call__\n    attempt_number, caught_exception)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/retryhandler.py\", line 359, 
in _check_caught_exception\n    raise caught_exception\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/endpoint.py\", line 200, in 
_do_get_response\n    http_response = self._send(request)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/endpoint.py\", 
line 269, in _send\n    return self.http_session.send(request)\n  File 
\"/usr/local/lib/python3.7/dist-packages/botocore/httpsession.py\", line 341, 
in send\n    raise ConnectTimeoutError(endpoint_url=request.url, 
error=e)\nbotocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint 
URL: \"
   
https://pipelines.s3.amazonaws.com/tfx/trace_pipeline_e2e/FileBasedExampleGenWithDate/examples/1438/train/beam-temp-data_tfrecord-286f19285c8d11ebbb52a24bbfe454c5/319ecf64-85bb-46bd-afcd-a355144724a7.data_tfrecord.gz?uploads
   \"\n\nDuring handling of the above exception, another exception 
occurred:\n\nTraceback (most recent call last):\n  File 
\"/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/s3io.py\", line 
566, in _start_upload\n    response = 
self._client.create_multipart_upload(request)\n  File 
\"/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/clients/s3/boto3_client.py\",
 line 174, in create_multipart_upload\n    message = 
e.response[\'Error\'][\'Message\']\nAttributeError: \'ConnectTimeoutError\' 
object has no attribute \'response\'\n" instruction_id: "bundle_33" 
transform_id: "WriteSplit[train]/Write/Write/WriteImpl/WriteBundles" 
log_location: 
"/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/s3io.py:572" thread: 
"Thread-14" 
   ```
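
As a side note, the last exception in the log (`AttributeError: 'ConnectTimeoutError' object has no attribute 'response'`) looks like it hides the real timeout: the handler reads `e.response['Error']['Message']`, but a connection-level error carries no `response`. Below is a rough sketch of a more defensive way to extract the message. The two exception classes are hypothetical stand-ins so the snippet runs without botocore, and `error_message` is an illustrative helper, not something that exists in Beam.

```python
class FakeServiceError(Exception):
    """Stand-in for an error that carries a parsed service response."""

    def __init__(self, response):
        super().__init__(response["Error"]["Message"])
        self.response = response


class FakeConnectTimeoutError(Exception):
    """Stand-in for a connection-level error without a `response` attribute."""


def error_message(exc: Exception) -> str:
    """Prefer the service-provided message, fall back to str(exc)."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return response.get("Error", {}).get("Message", str(exc))
    return str(exc)


if __name__ == "__main__":
    service_error = FakeServiceError({"Error": {"Message": "Access Denied"}})
    timeout_error = FakeConnectTimeoutError("Connect timeout on endpoint URL")
    print(error_message(service_error))
    print(error_message(timeout_error))
```

With something along these lines the timeout itself would be reported, rather than the AttributeError raised while handling it.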
   



