[ 
https://issues.apache.org/jira/browse/BEAM-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17122896#comment-17122896
 ] 

Beam JIRA Bot commented on BEAM-6821:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> FileBasedSink is not creating file paths according to target filesystem
> -----------------------------------------------------------------------
>
>                 Key: BEAM-6821
>                 URL: https://issues.apache.org/jira/browse/BEAM-6821
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>    Affects Versions: 2.11.0
>         Environment: Windows 10
>            Reporter: Gregory Kovelman
>            Priority: P2
>              Labels: stale-P2
>          Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> The file path generated in the _open_writer_ method does not follow the
> target filesystem's conventions, because os.path.join is used instead of
> FileSystems.join.
> Extract from apache_beam\io\filebasedsink.py:
>  
> {code:python}
> def _create_temp_dir(self, file_path_prefix):
>   base_path, last_component = FileSystems.split(file_path_prefix)
>   if not last_component:
>     # Trying to re-split the base_path to check if it's a root.
>     new_base_path, _ = FileSystems.split(base_path)
>     if base_path == new_base_path:
>       raise ValueError('Cannot create a temporary directory for root path '
>                        'prefix %s. Please specify a file path prefix with '
>                        'at least two components.' % file_path_prefix)
>   path_components = [base_path,
>                      'beam-temp-' + last_component + '-' + uuid.uuid1().hex]
>   return FileSystems.join(*path_components)
>
> @check_accessible(['file_path_prefix', 'file_name_suffix'])
> def open_writer(self, init_result, uid):
>   # A proper suffix is needed for AUTO compression detection.
>   # We also ensure there will be no collisions with uid and a
>   # (possibly unsharded) file_path_prefix and a (possibly empty)
>   # file_name_suffix.
>   file_path_prefix = self.file_path_prefix.get()
>   file_name_suffix = self.file_name_suffix.get()
>   suffix = (
>       '.' + os.path.basename(file_path_prefix) + file_name_suffix)
>   return FileBasedSinkWriter(self, os.path.join(init_result, uid) + suffix)
> {code}
>  
>  
> This creates incompatibilities between, for example, Windows and GCS: on
> Windows, os.path.join inserts a backslash even when the target is a GCS path.
> Expected: gs://bucket/beam-temp-result-uuid/uid.result
> Actual: gs://bucket/beam-temp-result-uuid\\uid.result
> Replacing os.path.join with FileSystems.join fixes the issue.
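
The mismatch described above can be reproduced on any operating system with the standard library alone, since ntpath implements the rules that os.path follows on Windows. The sketch below is only an illustration: join_for_target is a hypothetical helper written for this example, not Beam's FileSystems.join, and the assumption that any path starting with "scheme://" should be joined with "/" is a simplification.

```python
import ntpath  # Windows path rules; this is what os.path resolves to on Windows
import re

# Assumption: a leading "scheme://" marks a remote, URL-style filesystem.
_SCHEME = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*://')


def join_for_target(base, *parts, local_join=ntpath.join):
    """Hypothetical helper (not a Beam API).

    Joins with '/' when `base` looks like a URL-style path, and falls back
    to `local_join` (Windows rules by default here) for everything else.
    """
    if _SCHEME.match(base):
        return '/'.join([base.rstrip('/')] + [p.strip('/') for p in parts])
    return local_join(base, *parts)


temp_dir = 'gs://bucket/beam-temp-result-uuid'

# What os.path.join yields on Windows: a backslash inside a GCS path.
windows_joined = ntpath.join(temp_dir, 'uid') + '.result'

# What a join that respects the target filesystem yields.
target_joined = join_for_target(temp_dir, 'uid') + '.result'

print(windows_joined)  # gs://bucket/beam-temp-result-uuid\uid.result
print(target_joined)   # gs://bucket/beam-temp-result-uuid/uid.result
```

The point of the sketch is that the separator should be chosen from the path being written to, not from the machine running the pipeline, which is what switching to FileSystems.join is reported to achieve.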



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
