[ 
https://issues.apache.org/jira/browse/HADOOP-14365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15991249#comment-15991249
 ] 

Andrew Wang commented on HADOOP-14365:
--------------------------------------

Hi Steve, thanks for the JIRA and thoughtful comments.

We didn't have any specifications for output streams before, so I didn't see it 
as a blocker for HDFS-11170. Considering the complexity of the input stream 
specification, we likely need your help with defining the output stream spec. 
The input stream spec doesn't define parameters for {{open}} (e.g. buffer 
size), so if you could specify a few options for {{create}} as an example, 
someone else can iterate on that.

Could we start by splitting the desired enhancements into subtasks, and be 
clear about what is considered a release blocker (i.e. affects compatibility)? 
Fixing the default create-parent behavior is definitely a blocker, and I agree 
that we should move over the tests. However, I think the proposed {{setOption}} 
could be compatibly added later (and likely needs to be discussed), and I'd 
also like to treat the spec as something we can iterate on.
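
To make the create-parent question concrete, here is a minimal sketch that uses only java.nio, not Hadoop classes. The class {{CreateBuilderSketch}} and its {{createFile}}, {{recursive}} and {{overwrite}} methods are hypothetical names chosen for illustration and are not the HDFS-11170 API. The only point is that parent creation becomes an explicit opt-in rather than the default.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: not Hadoop code and not the HDFS-11170 API.
class CreateBuilderSketch {
  private final Path path;
  private boolean recursive = false; // parents are NOT created unless requested
  private boolean overwrite = false; // existing files are NOT replaced unless requested

  private CreateBuilderSketch(Path path) {
    this.path = path;
  }

  static CreateBuilderSketch createFile(Path path) {
    return new CreateBuilderSketch(path);
  }

  CreateBuilderSketch recursive() {
    this.recursive = true;
    return this;
  }

  CreateBuilderSketch overwrite(boolean overwrite) {
    this.overwrite = overwrite;
    return this;
  }

  // All side effects happen here, so options can be set in any order.
  OutputStream build() throws IOException {
    Path parent = path.getParent();
    if (recursive && parent != null) {
      Files.createDirectories(parent);
    }
    if (overwrite) {
      return Files.newOutputStream(path, StandardOpenOption.CREATE,
          StandardOpenOption.TRUNCATE_EXISTING, StandardOpenOption.WRITE);
    }
    return Files.newOutputStream(path, StandardOpenOption.CREATE_NEW,
        StandardOpenOption.WRITE);
  }

  // Tries the same missing-parent path with and without recursive().
  static String demo() {
    try {
      Path root = Files.createTempDirectory("create-sketch");
      Path target = root.resolve("a").resolve("b").resolve("data.bin");

      String byDefault;
      try (OutputStream out = createFile(target).build()) {
        byDefault = "created";
      } catch (NoSuchFileException e) {
        byDefault = "failed"; // parent directories do not exist
      }

      String withRecursive;
      try (OutputStream out = createFile(target).recursive().build()) {
        out.write(1);
        withRecursive = Files.exists(target) ? "created" : "failed";
      }
      return "default=" + byDefault + " recursive=" + withRecursive;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this sketch a new option could be added later as another fluent method without changing existing callers, which is the sense in which something like {{setOption}} looks compatible to add after the fact.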

Since we're planning to release 3.0.0-alpha3 soon, we also need a subtask to 
hide everything from the public API. We need the new create options added to 
the builder to make the balancer lock file work with EC, so it's inappropriate 
to revert.

> Stabilise FileSystem builder-based create API 
> ----------------------------------------------
>
>                 Key: HADOOP-14365
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14365
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.9.0
>            Reporter: Steve Loughran
>            Priority: Blocker
>
> HDFS-11170 added a builder-based create API for file creation which has a few 
> issues to work out before it can be considered ready for use
> 1. There is no specification in {{filesystem.md}} of what it is meant to do, 
> which means there's no public documentation on expected behaviour except in 
> the Javadocs, which consist of the sentences "Create a new 
> FSDataOutputStreamBuilder for the file with path" and "Base of specific file 
> system FSDataOutputStreamBuilder".
> I propose:
> # Give the new method a relevant name rather than just define the return 
> type, e.g. {{createFile()}}. 
> # {{filesystem.md}} to be extended with coverage of this method, and, sadly for 
> the authors, coverage of what the semantics of 
> {{FSDataOutputStreamBuilder.build()}} are.
> 2. There are only tests for HDFS and local, neither of them perfect. 
> Proposed: move to {{AbstractContractCreateTest}}, test for all filesystems, 
> fix tests and FS where appropriate. 
> 3. Add more tests to generate the failure conditions implied by the updated 
> filesystem spec, e.g. create over an existing file, create over a directory, 
> create with negative buffer size, negative block size, empty dest path, etc. 
> This will clarify when precondition checks are made, as well as whether they 
> are made at all. For example: should {{newFSDataOutputStreamBuilder()}} 
> validate the path immediately?
> 4. Add to {{FileContext}}.
> 5. Take the opportunity to look at the flaws in today's {{create()}} calls 
> and address them, rather than replicate. In particular, I'd like to end the 
> behaviour "create all parent dirs".

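As a rough illustration of the precondition question in item 3, the sketch below defers every check to a single {{validate()}} step that a {{build()}} call could run, instead of failing when an option is set. Everything in it is an assumption made for illustration: the class {{CreateOptionsSketch}}, its method names, and the default values are hypothetical, and the three checks are just the negative buffer size, negative block size and empty path cases named in the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: names and defaults are assumptions, not Hadoop's.
class CreateOptionsSketch {
  private final String path;
  private int bufferSize = 4096;               // assumed default
  private long blockSize = 128L * 1024 * 1024; // assumed default

  private CreateOptionsSketch(String path) {
    this.path = path;
  }

  // No validation here: the path is only checked in validate().
  static CreateOptionsSketch forPath(String path) {
    return new CreateOptionsSketch(path);
  }

  CreateOptionsSketch bufferSize(int bufferSize) {
    this.bufferSize = bufferSize;
    return this;
  }

  CreateOptionsSketch blockSize(long blockSize) {
    this.blockSize = blockSize;
    return this;
  }

  // Collects every violated precondition so a test can assert on each one.
  List<String> validate() {
    List<String> errors = new ArrayList<>();
    if (path == null || path.isEmpty()) {
      errors.add("empty path");
    }
    if (bufferSize < 0) {
      errors.add("negative buffer size");
    }
    if (blockSize < 0) {
      errors.add("negative block size");
    }
    return errors;
  }

  public static void main(String[] args) {
    System.out.println(forPath("").bufferSize(-1).blockSize(-1L).validate());
    System.out.println(forPath("/tmp/example").validate());
  }
}
```

Whether the real builder should check eagerly or at build time is exactly what the spec would need to state; the sketch only shows one of the two choices so that a contract test has something definite to assert against.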


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

