[ 
https://issues.apache.org/jira/browse/BEAM-2857?focusedWorklogId=247157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247157
 ]

ASF GitHub Bot logged work on BEAM-2857:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/May/19 00:07
            Start Date: 23/May/19 00:07
    Worklog Time Spent: 10m 
      Work Description: pabloem commented on issue #8394: [BEAM-2857] 
Implementing WriteToFiles transform for fileio (Python)
URL: https://github.com/apache/beam/pull/8394#issuecomment-495018942
 
 
   > From what I understand deletion of temporary files is best effort. There 
is currently no mechanism to clean up old temporary files left over due to an 
error. Perhaps we could, in the future, add a TTL or some other way to tell if 
a temporary file can be safely deleted.
   
   Do you mean when the pipeline fails? (e.g. for batch?)
   
   You're right that if the pipeline fails, we won't be able to clean up 
after the bundles that did not run. But how could we know this ahead of 
time? : /
   
   Currently, all files will be cleaned up for any bundle that succeeds. If the 
bundle does not succeed, then we should not clean up the files. 
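
To make the TTL idea from the quoted comment concrete, here is a minimal sketch of what such a cleanup could look like. It is not part of this PR and does not use any Beam API: the function names, the use of file modification time as the age signal, and the local-filesystem calls are all assumptions made for illustration.

```python
import os
import time


def find_stale_temp_files(temp_dir, ttl_seconds, now=None):
    """Return files in temp_dir last modified more than ttl_seconds ago.

    Hypothetical helper: assumes temp files live in one local directory
    and that modification time is a good enough proxy for "abandoned".
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(temp_dir)):
        path = os.path.join(temp_dir, name)
        if os.path.isfile(path) and now - os.path.getmtime(path) > ttl_seconds:
            stale.append(path)
    return stale


def delete_stale_temp_files(temp_dir, ttl_seconds, now=None):
    """Best-effort deletion; returns the paths that were actually removed."""
    removed = []
    for path in find_stale_temp_files(temp_dir, ttl_seconds, now):
        try:
            os.remove(path)
            removed.append(path)
        except OSError:
            # Best effort: the file may already be gone or be locked.
            pass
    return removed
```

A TTL only bounds how long leftovers survive. It would have to be longer than the slowest bundle, otherwise files belonging to a bundle that is still running could be deleted.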
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 247157)
    Time Spent: 4.5h  (was: 4h 20m)

> Create FileIO in Python
> -----------------------
>
>                 Key: BEAM-2857
>                 URL: https://issues.apache.org/jira/browse/BEAM-2857
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py-core
>            Reporter: Eugene Kirpichov
>            Assignee: Pablo Estrada
>            Priority: Major
>              Labels: gsoc, gsoc2019, mentor
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Beam Java has a FileIO with operations: match()/matchAll(), readMatches(), 
> which together cover the majority of needs for general-purpose file 
> ingestion. Beam Python should have something similar.
> An early design document for this: https://s.apache.org/fileio-beam-python



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
