[ 
https://issues.apache.org/jira/browse/HADOOP-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707096#comment-16707096
 ] 

Steve Loughran commented on HADOOP-15229:
-----------------------------------------

[~fabbri]

Thanks for the comments.

bq.  I'm self-funding my AWS usage these days and I need to set that up still.

Noted. If you sign up for a free MSDN account you'll get enough $0 capacity 
there to play with the wasb stuff; maybe ABFS as well.

bq. Thanks for pulling SelectBinding stuff into separate class.

I'm starting to stare at S3AFileSystem and then at Beck and Fowler's 
"Refactoring" (1999). No real plan yet, but it's too big and we could think 
about a clearer Model-View approach. The model is the real object store; the 
Hadoop FS API is just a view of it, *but only one view*. WriteOperationsHelper 
is a clear example of where we are exposing that underlying model; we should 
extend it to the read side, make it complete and then split stuff up. Of 
course, that'll need some async invoker stuff next. I'm not in a rush to do 
this, as it'll make backporting an all-or-nothing change *and* run the risk of 
rework-bloat. Something to think about, though.

See also: HADOOP-11867, scatter/gather


 

> Add FileSystem builder-based openFile() API to match createFile()
> -----------------------------------------------------------------
>
>                 Key: HADOOP-15229
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15229
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs, fs/azure, fs/s3
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15229-001.patch, HADOOP-15229-002.patch, 
> HADOOP-15229-003.patch, HADOOP-15229-004.patch, HADOOP-15229-004.patch, 
> HADOOP-15229-005.patch, HADOOP-15229-006.patch
>
>
> Replicate HDFS-1170 and HADOOP-14365 with an API to open files.
> A key requirement of this is not HDFS, it's to put in the fadvise policy for 
> working with object stores, where getting the decision to do a full GET and 
> TCP abort on seek vs smaller GETs is fundamentally different: the wrong 
> option can cost you minutes. S3A and Azure both have adaptive policies now 
> (first backward seek), but they still don't do it that well.
> Columnar formats (ORC, Parquet) should be able to say "fs.input.fadvise" 
> "random" as an option when they open files; I can imagine other options too.
> The Builder model of [~eddyxu] is the one to mimic, method for method. 
> Ideally with as much code reuse as possible
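
As a rough illustration of the builder-style open described in the issue summary, here is a minimal self-contained sketch in plain Java. It is not the Hadoop API: the class OpenFileSketch, its Stream and Builder types, the "normal" default policy and the set of known options are all hypothetical, and the opt()/must() method names are only assumed to mirror the createFile() builder. Only the "fs.input.fadvise" key and the "random" value come from the text above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a builder-based open; none of these types are Hadoop classes. */
public class OpenFileSketch {

  /** Options this pretend store understands (assumption for the sketch). */
  static final Set<String> KNOWN_OPTIONS = Collections.singleton("fs.input.fadvise");

  /** Stand-in for an input stream: it only records the path and the chosen read policy. */
  static final class Stream {
    final String path;
    final String policy;

    Stream(String path, String policy) {
      this.path = path;
      this.policy = policy;
    }
  }

  /** Collects options, then "opens" the file in build(). */
  static final class Builder {
    private final String path;
    private final Map<String, String> optional = new HashMap<>();
    private final Map<String, String> mandatory = new HashMap<>();

    Builder(String path) {
      this.path = path;
    }

    /** Optional hint: ignored if the store does not know the key. */
    Builder opt(String key, String value) {
      optional.put(key, value);
      return this;
    }

    /** Mandatory option: build() fails if the store does not know the key. */
    Builder must(String key, String value) {
      mandatory.put(key, value);
      return this;
    }

    Stream build() {
      for (String key : mandatory.keySet()) {
        if (!KNOWN_OPTIONS.contains(key)) {
          throw new IllegalArgumentException("Unknown mandatory option: " + key);
        }
      }
      // A mandatory setting wins over an optional one; "normal" is an assumed default.
      String policy = mandatory.getOrDefault("fs.input.fadvise",
          optional.getOrDefault("fs.input.fadvise", "normal"));
      return new Stream(path, policy);
    }
  }

  /** Entry point analogous in spirit to a filesystem's openFile(path). */
  static Builder openFile(String path) {
    return new Builder(path);
  }

  public static void main(String[] args) {
    // A columnar-format reader asking for random IO when it opens a file.
    Stream in = openFile("s3a://bucket/data.orc")
        .opt("fs.input.fadvise", "random")
        .build();
    System.out.println(in.path + " opened with policy " + in.policy);
  }
}
```

The split between opt() and must() is the design point of the sketch: a store that does not recognise an optional hint can still open the file, while an unrecognised mandatory option fails the open.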



