[
https://issues.apache.org/jira/browse/HADOOP-9565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150522#comment-15150522
]
Steve Loughran commented on HADOOP-9565:
----------------------------------------
I've been thinking about this, and wondering if we could get better
extensibility by providing a lookup operation where you ask for a specific
operation by name and get back a set of enum values describing its semantics:
{code}
// HDFS: change is visible, operation with check for existence is atomic
getOperationSemantics("create") = <SUPPORTED, O_1, CONSISTENT, ATOMIC,
    SYNCHRONOUS, PERSISTENT>
// s3
getOperationSemantics("create") = <SUPPORTED, O_1>
// HDFS: supported but no consistency guarantees
getOperationSemantics("append") = <SUPPORTED, PERSISTENT>
// s3 doesn't support append
getOperationSemantics("append") = <>
// HDFS
getOperationSemantics("delete") = <SUPPORTED, O_1, CONSISTENT, ATOMIC,
    SYNCHRONOUS, PERSISTENT>
// maybe async
getOperationSemantics("delete") = <SUPPORTED, O_N, PERSISTENT>
// HDFS
getOperationSemantics("rename") = <SUPPORTED, O_1, CONSISTENT, ATOMIC,
    SYNCHRONOUS, PERSISTENT>
// s3
getOperationSemantics("rename") = <SUPPORTED, CLIENT_SIDE, O_N, SYNCHRONOUS,
    PERSISTENT>
// s3
getOperationSemantics("OutputStream.close") = <SUPPORTED, O_N, ATOMIC,
    SYNCHRONOUS, PERSISTENT>
// HDFS
getOperationSemantics("OutputStream.close") = <SUPPORTED, O_1, ATOMIC,
    CONSISTENT, SYNCHRONOUS, PERSISTENT>
// HDFS
getOperationSemantics("OutputStream.write") = <SUPPORTED, O_N, PERSISTENT>
// s3
getOperationSemantics("OutputStream.write") = <SUPPORTED, O_1>
// HDFS
getOperationSemantics("OutputStream.flush") = <SUPPORTED, O_N, SYNCHRONOUS,
    PERSISTENT>
// s3: won't fail on the call, but it doesn't do anything
getOperationSemantics("OutputStream.flush") = <SUPPORTED, NO_OP, O_1>
{code}
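To make the shape of the lookup concrete, here is a rough Java sketch. Everything in it is an assumption for discussion: the {{Semantics}} enum, the {{OperationSemanticsSource}} interface, and the table-driven {{TableSemantics}} helper do not exist in Hadoop; only the flag names and the s3 values are taken from the listing above.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical flags, named after the values used in the listing. */
enum Semantics {
    SUPPORTED, O_1, O_N, CONSISTENT, ATOMIC, SYNCHRONOUS, PERSISTENT,
    CLIENT_SIDE, NO_OP
}

/** Hypothetical lookup interface; not part of the FileSystem API. */
interface OperationSemanticsSource {
    /** Returns the flags for the named operation; empty if unsupported or unknown. */
    Set<Semantics> getOperationSemantics(String operation);
}

/** Table-driven implementation that a store could populate once at init time. */
final class TableSemantics implements OperationSemanticsSource {
    private final Map<String, Set<Semantics>> table = new HashMap<>();

    /** Registers the flags for one operation and returns this for chaining. */
    TableSemantics declare(String operation, Semantics... flags) {
        Set<Semantics> set = EnumSet.noneOf(Semantics.class);
        Collections.addAll(set, flags);
        table.put(operation, Collections.unmodifiableSet(set));
        return this;
    }

    @Override
    public Set<Semantics> getOperationSemantics(String operation) {
        // Undeclared operations report no flags, i.e. not SUPPORTED.
        return table.getOrDefault(operation, Collections.<Semantics>emptySet());
    }
}

final class SemanticsTables {
    private SemanticsTables() {
    }

    /** A few of the s3 rows from the listing. */
    static OperationSemanticsSource s3Like() {
        return new TableSemantics()
            .declare("create", Semantics.SUPPORTED, Semantics.O_1)
            .declare("append")
            .declare("rename", Semantics.SUPPORTED, Semantics.CLIENT_SIDE,
                Semantics.O_N, Semantics.SYNCHRONOUS, Semantics.PERSISTENT)
            .declare("OutputStream.flush", Semantics.SUPPORTED,
                Semantics.NO_OP, Semantics.O_1);
    }
}
```

Returning an empty set for anything undeclared means a client probing for an operation a store has never heard of gets the same answer as "not supported", which seems like the safest default.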
I know it's potentially much more complex, especially for clients, but it does
expose all the information apps may possibly need.
Example: dfsclient can look for rename being O_1 and !CLIENT_SIDE; if not, it
bypasses rename and writes directly.
Another example: code trying to use create(overwrite=false) for locking could
check and fail if "create" wasn't atomic/persistent (i.e. the existence check
and the create are atomic, and the result is visible to all).
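Those two client-side checks might look roughly like this in Java. This is only a sketch: the {{Semantics}} enum is redeclared here so the snippet compiles on its own, and {{SemanticsChecks}}, {{canCommitByRename}} and {{requireLockableCreate}} are made-up names, not existing Hadoop methods.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical flags, named after the values used in the listing. */
enum Semantics {
    SUPPORTED, O_1, O_N, CONSISTENT, ATOMIC, SYNCHRONOUS, PERSISTENT,
    CLIENT_SIDE, NO_OP
}

final class SemanticsChecks {
    private SemanticsChecks() {
    }

    /** True when rename is supported, O(1) and not emulated on the client. */
    static boolean canCommitByRename(Set<Semantics> rename) {
        return rename.contains(Semantics.SUPPORTED)
            && rename.contains(Semantics.O_1)
            && !rename.contains(Semantics.CLIENT_SIDE);
    }

    /** Fails fast if create(overwrite=false) cannot safely back a lock file. */
    static void requireLockableCreate(Set<Semantics> create) {
        Set<Semantics> needed = EnumSet.of(
            Semantics.SUPPORTED, Semantics.ATOMIC, Semantics.PERSISTENT);
        if (!create.containsAll(needed)) {
            throw new UnsupportedOperationException(
                "create is not atomic/persistent: " + create);
        }
    }
}
```

With the values from the listing, {{canCommitByRename}} is true for the HDFS rename flags and false for the s3 ones, and {{requireLockableCreate}} throws for the s3 create flags {{<SUPPORTED, O_1>}}.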
> Add a Blobstore interface to add to blobstore FileSystems
> ---------------------------------------------------------
>
> Key: HADOOP-9565
> URL: https://issues.apache.org/jira/browse/HADOOP-9565
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs, fs/s3, fs/swift
> Affects Versions: 2.6.0
> Reporter: Steve Loughran
> Assignee: Thomas Demoor
> Labels: BB2015-05-TBR
> Attachments: HADOOP-9565-001.patch, HADOOP-9565-002.patch,
> HADOOP-9565-003.patch, HADOOP-9565-004.patch, HADOOP-9565-005.patch
>
>
> We can make explicit the fact that some {{FileSystem}} implementations are
> really blobstores, with different atomicity and consistency guarantees, by
> adding a {{Blobstore}} interface to them.
> This could also be a place to add a {{Copy(Path,Path)}} method, assuming that
> all blobstores implement a server-side copy operation as a substitute for
> rename.