ConverJens edited a comment on pull request #13180:
URL: https://github.com/apache/beam/pull/13180#issuecomment-762308020
@dandy10 @pabloem
Great work with this PR!
I'm trying to get S3 (Minio) to work with TFX. It works for every component
except the Beam ones, which fail with this error:
```
Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1213, in apache_beam.runners.common.DoFnRunner.process
  File "apache_beam/runners/common.py", line 742, in apache_beam.runners.common.PerWindowInvoker.invoke_process
  File "apache_beam/runners/common.py", line 867, in apache_beam.runners.common.PerWindowInvoker._invoke_process_per_window
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/iobase.py", line 1129, in process
    self.writer = self.sink.open_writer(init_result, str(uuid.uuid4()))
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/options/value_provider.py", line 135, in _f
    return fnc(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/filebasedsink.py", line 196, in open_writer
    return FileBasedSinkWriter(self, writer_path)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/filebasedsink.py", line 417, in __init__
    self.temp_handle = self.sink.open(temp_shard_path)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/options/value_provider.py", line 135, in _f
    return fnc(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/filebasedsink.py", line 138, in open
    return FileSystems.create(temp_path, self.mime_type, self.compression_type)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/filesystems.py", line 229, in create
    return filesystem.create(path, mime_type, compression_type)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/s3filesystem.py", line 171, in create
    return self._path_open(path, 'wb', mime_type, compression_type)
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/s3filesystem.py", line 151, in _path_open
    raw_file = s3io.S3IO(options=self._options).open(
  File "/usr/local/lib/python3.7/dist-packages/apache_beam/io/aws/s3io.py", line 63, in __init__
    raise ValueError('Must provide one of client or options')
ValueError: Must provide one of client or options
```
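If I read the last two frames correctly, `S3IO` was constructed with `options=self._options` and still complained that neither a client nor options was given, so `self._options` was presumably `None` at that point. Here is a minimal sketch of that guard to illustrate my reading. It is not Beam's actual code: the class name `S3IOStandIn` and the helper `try_open` are hypothetical, and the check is inferred only from the error message.

```python
class S3IOStandIn:
    """Hypothetical stand-in; mirrors only the check implied by the error message."""

    def __init__(self, client=None, options=None):
        # Assumption: the real constructor rejects the case where neither is given.
        if client is None and options is None:
            raise ValueError('Must provide one of client or options')
        self.client = client
        self.options = options


def try_open(options):
    """Return the error text, or None when construction succeeds."""
    try:
        S3IOStandIn(options=options)
    except ValueError as err:
        return str(err)
    return None


# Case 1: no options reach the constructor (what the traceback suggests).
print(try_open(None))
# Case 2: some options object is present, so the guard passes.
print(try_open({'s3_endpoint_url': 'minio-service.kubeflow:9000'}))
```

If this reading is right, the question becomes why the pipeline options did not reach `S3FileSystem` in the process that opened the writer, rather than whether the S3 flags themselves are valid.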
Do you have any idea what I'm doing wrong?
These are the Beam pipeline args that I'm supplying. I know for sure that at
least the multi-processing and number-of-workers arguments are applied:
```
'--direct_running_mode=multi_processing',
f'--direct_num_workers={NR_OF_CPUS}',
'--s3_endpoint_url=minio-service.kubeflow:9000',
f'--s3_access_key={ACCESS_KEY}',
f'--s3_secret_access_key={SECRET_ACCESS_KEY}',
'--s3_verify=False'
```
Help would be greatly appreciated!