[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16378955#comment-16378955 ]

Rakesh R commented on HDFS-12090:
---------------------------------

[~ehiggs] Sorry for the delay in my response. I thought of replying after 
finishing the major code changes to support *external SPS*; two more 
sub-tasks remain, HDFS-13165 and HDFS-13166. I'm trying to push these in asap.
{quote}Agreed. The SPS underwent considerable changes in the past few months so 
it would not have been possible to track it.
{quote}
Yes, I understand the pain of learning the branch code multiple times. 
Actually, the review comments asking us to support *external SPS* came quite 
late, and they caused a lot of refactoring effort over the past few months. 
Now the {{Context}} interface is clearly defined to separate the internal and 
external SPS. Please let me know if I can help with understanding the SPS code 
flow. I believe the major code refactoring is completed and committed to the 
branch. Presently, [~daryn] has offered to take another round of reviews, and 
we are waiting on a few clarifications from him.
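Just to make the separation concrete, here is a minimal sketch of the idea; 
the method names and types below are my assumptions for illustration, not the 
actual {{Context}} signatures in the branch:
{code:java}
import java.io.IOException;

// Illustrative sketch only: method names and types are assumptions, not
// the branch's actual o.a.h.hdfs.server.namenode.sps.Context signatures.
public interface Context {

  /** Should the SPS service keep processing queued items? */
  boolean isRunning();

  /**
   * File metadata lookup. An internal SPS can read the namespace directly
   * inside the NameNode; an external SPS would answer the same call over
   * RPC, keeping the engine code identical in both modes.
   */
  FileStatusView getFileInfo(long inodeId) throws IOException;

  /** Placeholder for the file metadata SPS needs (assumed shape). */
  interface FileStatusView {
    long getFileId();
    byte getStoragePolicyId();
  }
}
{code}
The point is that one SPS engine works against this interface, while the 
internal and external implementations differ only in how they answer it.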
{quote}Are there hooks in the tracker for multinode-multipart uploads to 
perform the init and complete?
{quote}
No, there is no such hook provided. Probably you could add one for provided 
writes; a hypothetical shape is sketched below.
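Every name here is invented, since nothing like this exists in the branch 
today; it only illustrates where init/complete callbacks could sit:
{code:java}
import java.io.IOException;

// Purely hypothetical: no such hook exists in the branch. All names below
// are invented to illustrate init/complete callbacks for multinode-multipart
// uploads to a PROVIDED store.
public interface BlockMoveLifecycleHook {

  /** Invoked once before any part of a multipart upload is transferred. */
  UploadHandle initMultipartUpload(String providedPath) throws IOException;

  /** Invoked after all parts have landed, to stitch them into one object. */
  void completeMultipartUpload(UploadHandle handle) throws IOException;

  /** Opaque token identifying an in-flight multipart upload. */
  interface UploadHandle {
    String getId();
  }
}
{code}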
{quote}The two main interfaces we need to work with are scanAndCollectFileIds 
and submitMoveTask. These are very generic so we could potentially implement 
them
{quote}
One of his review comments was to use the existing {{datatransfer}} protocol 
rather than introducing a new command for {{submitMoveTask}}. I've replied to 
that with the SPS requirements. I think a separate command is very much 
required to do multinode-multipart uploads for provided store writes. How 
about adding your use case to the HDFS-10285 jira, so that we can discuss it 
together, define a basic move command, and later extend it to support provided 
store writes? A rough sketch of the two extension points follows.
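To anchor that discussion, here is roughly what the two interfaces could look 
like for provided-store writes. The signatures and the {{BlockMovingInfo}} 
shape are my assumptions for illustration, not the branch's actual ones:
{code:java}
import java.io.IOException;

// Sketch of the two extension points named above. Parameter and return
// types are assumptions for illustration, not the branch's signatures.
public interface SPSMoveExtensionPoints {

  /** Walk the tree under a root and queue the ids of files whose
      storage policy is not yet satisfied. */
  void scanAndCollectFileIds(long rootInodeId) throws IOException;

  /** Hand one scheduled block move to the transfer machinery, e.g. a
      provided-store multipart upload instead of a datanode transfer. */
  void submitMoveTask(BlockMovingInfo task) throws IOException;

  /** Assumed shape of a scheduled (source, target) block move. */
  class BlockMovingInfo {
    long blockId;
    String sourceStorage; // e.g. DISK on a datanode
    String targetStorage; // e.g. a PROVIDED store endpoint
  }
}
{code}
With a separate move command, {{submitMoveTask}} could drive a multipart 
upload end to end instead of a datanode-to-datanode transfer.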
{quote}I'm not sure if this would mean the Context API would need to change or 
if it can be done with the existing APIs since ExternalSPSFileIDCollector has 
access to a DistributedFileSystem which can perform the snapshotting and 
provide diffs.
{quote}
SPS is not based on snapshot diffs; it simply scans the file's blocks and 
schedules the block movements (src-target pairs), as sketched below. It is 
possible to extend the _ExternalSPSFileIDCollector_ interface and make the 
necessary modifications to support your case. In any case, {{Context}} and the 
other supporting interfaces are private, so they are open to internal changes.
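As a rough illustration of that scan-then-schedule flow (all names are mine, 
not the branch's; a real extension of _ExternalSPSFileIDCollector_ would walk 
the namespace over RPC rather than use this stub queue):
{code:java}
import java.util.ArrayDeque;
import java.util.Queue;

// Rough sketch with assumed names: scan collects candidate file ids,
// schedule emits one (src, target) pair per collected file.
public class ProvidedWriteCollectorSketch {

  private final Queue<Long> pendingFileIds = new ArrayDeque<>();

  /** Scan phase: collect candidate file ids under a directory (stubbed). */
  public void scan(long dirInodeId) {
    pendingFileIds.add(dirInodeId); // stand-in for a real directory walk
  }

  /** Schedule phase: emit a src-target pair for each collected file. */
  public void scheduleMoves() {
    while (!pendingFileIds.isEmpty()) {
      long fileId = pendingFileIds.poll();
      System.out.printf("schedule blocks of file %d: DISK -> PROVIDED%n",
          fileId);
    }
  }
}
{code}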

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>            Priority: Major
>         Attachments: HDFS-12090-Functional-Specification.001.pdf, 
> HDFS-12090-Functional-Specification.002.pdf, 
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, 
> HDFS-12090.0000.patch, HDFS-12090.0001.patch
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in 
> external storage systems accessible through HDFS. However, HDFS-9806 is 
> limited to data being read through HDFS. This JIRA will deal with how data 
> can be written to such {{PROVIDED}} storages from HDFS.


