[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2019-11-04 Thread Ewan Higgs (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Resolution: Won't Do
Status: Resolved  (was: Patch Available)

The intermediate step of putting the scheduler in the NN won't be done. The SPS 
will handle the scheduling of tasks.

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch, 
> HDFS-13777-HDFS-12090.007.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.
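
The cycle described in the issue (take snapshots, diff them, hand out work) can be sketched roughly as follows. This is a minimal illustration and not the code in the attached patches: the class and method names are hypothetical, each snapshot is modelled as a plain map from path to modification time, deletions are ignored, and round-robin assignment is an assumption chosen for brevity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from HDFS.
class BackupSchedulerSketch {

    // Returns the paths that are new or modified in 'current' relative to
    // 'previous', in sorted order. A snapshot is modelled as path -> mtime.
    static List<String> changedPaths(Map<String, Long> previous, Map<String, Long> current) {
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Long> entry : new TreeMap<>(current).entrySet()) {
            Long oldTime = previous.get(entry.getKey());
            if (oldTime == null || !oldTime.equals(entry.getValue())) {
                changed.add(entry.getKey());
            }
        }
        return changed;
    }

    // Assigns each changed path to a datanode in round-robin order.
    static Map<String, List<String>> assign(List<String> paths, List<String> datanodes) {
        if (datanodes.isEmpty()) {
            throw new IllegalArgumentException("no datanodes available");
        }
        Map<String, List<String>> work = new LinkedHashMap<>();
        for (String dn : datanodes) {
            work.put(dn, new ArrayList<>());
        }
        for (int i = 0; i < paths.size(); i++) {
            work.get(datanodes.get(i % datanodes.size())).add(paths.get(i));
        }
        return work;
    }
}
```

In a real scheduler the final step (updating the AliasMap once a datanode reports completion) would follow the assignment; it is left out here because the source does not describe its interface.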



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-10867) [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage

2019-11-04 Thread Ewan Higgs (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-10867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs resolved HDFS-10867.
---
Resolution: Later

This work will be done in a future ticket when we handle bidirectional stores 
(where the external store can create new files and have them show up in HDFS).

> [PROVIDED Phase 2] Block Bit Field Allocation of Provided Storage
> -
>
> Key: HDFS-10867
> URL: https://issues.apache.org/jira/browse/HDFS-10867
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Ewan Higgs
> Priority: Major
> Attachments: Block Bit Field Allocation of Provided Storage.pdf
>
>
> We wish to design and implement the following related features for provided 
> storage:
> # Dynamic mounting of provided storage within a Namenode (mount, unmount)
> # Mount multiple provided storage systems on a single Namenode.
> # Support updates to the provided storage system without having to regenerate 
> an fsimg.
> A mount in the namespace addresses a corresponding set of block data. When 
> unmounted, any block data associated with the mount becomes invalid and 
> (eventually) unaddressable in HDFS. As with erasure-coded blocks, efficient 
> unmounting requires that all blocks with that attribute be identifiable by 
> the block management layer.
> In this subtask, we focus on changes and conventions to the block management 
> layer. Namespace operations are covered in a separate subtask.






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2019-11-04 Thread Ewan Higgs (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

This command line work will be done in HDFS-14805.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-12090.007.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-11-04 Thread Ewan Higgs (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11639:
--
Resolution: Won't Do
Status: Resolved  (was: Patch Available)

As discussed, the Block Alias is no longer required in the block token.

> [PROVIDED Phase 2] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Priority: Major
> Attachments: HDFS-11639-HDFS-9806.001.patch, 
> HDFS-11639-HDFS-9806.002.patch, HDFS-11639-HDFS-9806.003.patch, 
> HDFS-11639-HDFS-9806.004.patch, HDFS-11639-HDFS-9806.005.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from, i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the PROVIDED storage type.
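
To make the four fields concrete, here is a minimal sketch of a value type carrying a URI, offset, length and nonce, with a simple byte encoding and decoding round trip. Everything in it is an assumption made for illustration: the class name is hypothetical, and plain DataOutputStream framing stands in for whatever encoding LocatedBlockProto and BlockTokenIdentifier would actually use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;

// Hypothetical value type; the real BlockAlias and its wire format differ.
final class BlockAliasSketch {
    final URI uri;       // where the external data lives
    final long offset;   // start of the block within that resource
    final long length;   // number of bytes in the block
    final byte[] nonce;  // opaque value used to detect a changed resource

    BlockAliasSketch(URI uri, long offset, long length, byte[] nonce) {
        this.uri = uri;
        this.offset = offset;
        this.length = length;
        this.nonce = nonce.clone();
    }

    // Serializes the fields in a fixed order.
    byte[] encode() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(uri.toString());
            out.writeLong(offset);
            out.writeLong(length);
            out.writeInt(nonce.length);
            out.write(nonce);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the fields back in the same order they were written.
    static BlockAliasSketch decode(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            URI uri = URI.create(in.readUTF());
            long offset = in.readLong();
            long length = in.readLong();
            byte[] nonce = new byte[in.readInt()];
            in.readFully(nonce);
            return new BlockAliasSketch(uri, offset, length, nonce);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```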






[jira] [Commented] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-11-04 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16966561#comment-16966561
 ] 

Ewan Higgs commented on HDFS-11639:
---

[~virajith] indeed, I don't think this is needed any more if we are no longer 
using the client as part of the interaction for writing data back to the 
provided storage and instead plan to use the heartbeat protocol to instruct 
DNs to perform the writes.

You can close this.







[jira] [Commented] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2019-10-23 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16957962#comment-16957962
 ] 

Ewan Higgs commented on HDFS-13777:
---

While we have patches for this to work in the namenode, the implementation at 
the time was expected to land in the then-pending SPS service. As the SPS has 
been released, this work could be picked apart and possibly moved directly to 
the SPS service.







[jira] [Updated] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-10-21 Thread Ewan Higgs (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11639:
--
Status: Open  (was: Patch Available)







[jira] [Updated] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-10-21 Thread Ewan Higgs (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11639:
--
Status: Patch Available  (was: Open)







[jira] [Comment Edited] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-10-21 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955929#comment-16955929
 ] 

Ewan Higgs edited comment on HDFS-11639 at 10/21/19 10:03 AM:
--

https://github.com/apache/hadoop/pull/1665 submitted as a rebase of this work 
onto HDFS-12090.


was (Author: ehiggs):
https://github.com/apache/hadoop/pull/1665 submitted as a rebase of this work.







[jira] [Commented] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-10-21 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955929#comment-16955929
 ] 

Ewan Higgs commented on HDFS-11639:
---

https://github.com/apache/hadoop/pull/1665 submitted as a rebase of this work.







[jira] [Commented] (HDFS-11639) [PROVIDED Phase 2] Encode the BlockAlias in the client protocol

2019-10-18 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954591#comment-16954591
 ] 

Ewan Higgs commented on HDFS-11639:
---

Note for myself as I page this back in: 

Part of the rebase here is to remove bpid from the FileRegionProto (which was 
in the previous patch). This is no longer used since HDFS-12713 moved it.







[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2019-10-11 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949533#comment-16949533
 ] 

Ewan Higgs commented on HDFS-12478:
---

The patch was rebased and a pull request opened here: 
https://github.com/apache/hadoop/pull/1648







[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2019-10-11 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16949411#comment-16949411
 ] 

Ewan Higgs commented on HDFS-12478:
---

There is overlapping work in HDFS-14805 that should be converged.







[jira] [Commented] (HDFS-14805) Mounting external stores in HDFS on-the-fly

2019-09-04 Thread Ewan Higgs (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922576#comment-16922576
 ] 

Ewan Higgs commented on HDFS-14805:
---

Hi,
I took a look at the attached design document and the approach looks good.

The design document covers command line tools. Will this supplant HDFS-12478 or 
does it depend on it?

> Mounting external stores in HDFS on-the-fly
> ---
>
> Key: HDFS-14805
> URL: https://issues.apache.org/jira/browse/HDFS-14805
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Virajith Jalaparti
> Priority: Major
> Attachments: dynamic-mounts-in-hdfs.pdf
>
>
> Provided storage (HDFS-9806) allows HDFS to address data in external storage 
> systems, including cloud stores. Data mounted in this manner, seamlessly, 
> appears to be part of HDFS for applications/clients. The external data can 
> also be cached by HDFS on local disks and SSDs, accelerating remote data 
> reads (HDFS-13069). 
> However, Provided storage was originally targeted at ephemeral HDFS 
> deployments in the cloud (e.g., Azure HDInsight). Long running HDFS clusters 
> are common in many other scenarios which can benefit from accessing data in 
> remote stores. This JIRA targets such scenarios and aims to provide the 
> ability to:
> (a) Dynamically mount external stores in a HDFS cluster while supporting high 
> availability.
> (b) Mount multiple remote stores simultaneously.
> (c) Reduce deployment overheads and simplify usability of Provided storage.






[jira] [Commented] (HDFS-13118) SnapshotDiffReport should provide the INode type

2019-08-18 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16909972#comment-16909972
 ] 

Ewan Higgs commented on HDFS-13118:
---

[~jojochuang], I rebased this and opened it as a GitHub pull request: 
https://github.com/apache/hadoop/pull/1313

Will take a look at the findbugs warnings.

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: snapshots
> Affects Versions: 3.0.0
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch, HDFS-13118.005.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.
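
As a rough illustration of the request, the sketch below shows a diff entry that carries the inode type alongside the change type, so a caller can filter the report without opening the snapshot. The enum and class names are hypothetical and are not the actual SnapshotDiffReport types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types; the real SnapshotDiffReport classes are shaped differently.
class TypedDiffSketch {
    enum DiffType { CREATE, MODIFY, DELETE, RENAME }
    enum INodeType { FILE, DIRECTORY, SYMLINK }

    static final class Entry {
        final DiffType diff;
        final INodeType inode;
        final String path;

        Entry(DiffType diff, INodeType inode, String path) {
            this.diff = diff;
            this.inode = inode;
            this.path = path;
        }
    }

    // Selects newly created files using only the report, with no snapshot access.
    static List<String> createdFiles(List<Entry> report) {
        List<String> paths = new ArrayList<>();
        for (Entry entry : report) {
            if (entry.diff == DiffType.CREATE && entry.inode == INodeType.FILE) {
                paths.add(entry.path);
            }
        }
        return paths;
    }
}
```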






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2019-08-18 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Open  (was: Patch Available)







[jira] [Commented] (HDFS-6708) StorageType should be encoded in the block token

2019-04-19 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16821969#comment-16821969
 ] 

Ewan Higgs commented on HDFS-6708:
--

{quote}Daryn Sharp do you have an opinion on backwards compatibility for the 
Writable case? I suspect older clients will ignore the extra fields harmlessly- 
these should be framed in the RPC- but I haven't actually verified.{quote}

It looks like upgrading the NN first means the fields are not harmlessly 
ignored, presumably because the hash is not regenerated when the StorageType 
is ignored (I haven't fully paged this topic back in). 

Upgrading DNs first would allow the DN to understand any new message that the 
NN might send. Documentation 
[here|https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html]
 indicates that NN should be upgraded first.

> StorageType should be encoded in the block token
> 
>
> Key: HDFS-6708
> URL: https://issues.apache.org/jira/browse/HDFS-6708
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: 2.4.1
> Reporter: Arpit Agarwal
> Assignee: Ewan Higgs
> Priority: Major
> Fix For: 3.0.0-alpha4
>
> Attachments: HDFS-6708.0001.patch, HDFS-6708.0002.patch, 
> HDFS-6708.0003.patch, HDFS-6708.0004.patch, HDFS-6708.0005.patch, 
> HDFS-6708.0006.patch, HDFS-6708.0007.patch, HDFS-6708.0008.patch, 
> HDFS-6708.0009.patch, HDFS-6708.0010.patch
>
>
> HDFS-6702 is adding support for file creation based on StorageType.
> The block token is used as a tamper-proof channel for communicating block 
> parameters from the NN to the DN during block creation. The StorageType 
> should be included in this block token.
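
A minimal sketch of why a signed token acts as a tamper-proof channel, and of one plausible reading of the compatibility problem raised in the comment: if the verifier rebuilds the identifier without a field the signer included, the recomputed MAC no longer matches. The identifier layout, the key and the choice of HmacSHA256 are assumptions for illustration; this is not the BlockTokenIdentifier code.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch of a MAC-protected token; not Hadoop's implementation.
class BlockTokenSketch {

    // Computes the token password as an HMAC over the serialized identifier.
    static byte[] sign(byte[] key, String identifier) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(identifier.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the HMAC from the identifier as the verifier sees it and
    // compares it with the password that came with the token.
    static boolean verify(byte[] key, String identifier, byte[] password) {
        return MessageDigest.isEqual(sign(key, identifier), password);
    }
}
```

With a shared key, a password signed over an identifier such as "blk_1|DISK" verifies against the same string but not against "blk_1" alone, which is the situation a reader that drops the extra field would be in.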






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2019-03-07 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786534#comment-16786534
 ] 

Ewan Higgs commented on HDFS-13186:
---

{quote}The HADOOP-15691 PathCapabilities patch is intended to allow callers to 
probe for a feature being available before making the API Call. This'd let you 
go{quote}
A capability model is much better, indeed.

{quote}Bear in mind I also want to move the MPU API to being async block 
uploads, complete calls. For the classic local and HDFS stores, these would 
actually be done in the current thread. For S3 they'd run in the thread pool, 
so you could trivially kick off a parallel upload of blocks from a single 
thread without even knowing that the FS impl worked that way.{quote}
This is a good idea. When designing the API I thought we wanted to stick to a 
synchronous model to be consistent with the rest of the APIs, but async is a 
much better fit for this since these are remote calls and we don't do anything 
like locking (which can be hairy in async code).
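
The async shape discussed here can be sketched with CompletableFuture. In this hypothetical example (the interface and method names are illustrative, not the Hadoop MultipartUploader API), a "local" uploader completes in the calling thread while a "remote" one runs on a thread pool, and the caller's code is the same in both cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical sketch; the names are not taken from Hadoop.
class AsyncPartUploadSketch {

    // Uploads one part and eventually yields the number of bytes stored.
    interface PartUploader {
        CompletableFuture<Integer> putPart(int partNumber, byte[] data);
    }

    // "Local" flavour: the work is done in the calling thread.
    static PartUploader inline() {
        return (partNumber, data) -> CompletableFuture.completedFuture(data.length);
    }

    // "Remote" flavour: each part is handed to a thread pool.
    static PartUploader pooled(Executor pool) {
        return (partNumber, data) -> CompletableFuture.supplyAsync(() -> data.length, pool);
    }

    // Kicks off every part first, then waits for all of them to finish.
    static int uploadAll(PartUploader uploader, List<byte[]> parts) {
        List<CompletableFuture<Integer>> pending = new ArrayList<>();
        for (int i = 0; i < parts.size(); i++) {
            pending.add(uploader.putPart(i + 1, parts.get(i)));
        }
        int totalBytes = 0;
        for (CompletableFuture<Integer> future : pending) {
            totalBytes += future.join();
        }
        return totalBytes;
    }
}
```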

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
> int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
> List<Pair<Integer, PartHandle>> handles, 
> UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle so they can be serialized and deserialized in the hadoop-hdfs 
> project without knowledge of how to deserialize e.g. S3A's version of an 
> UploadHandle and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.
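
The init / put-part / complete flow proposed above can be tried out against a small in-memory stand-in. This is a sketch under stated assumptions, not the Hadoop implementation: handles are plain strings and integers rather than opaque UploadHandle and PartHandle objects, storage is a map, and the final concatenation simply joins the parts in part-number order. It uses InputStream.readAllBytes, so it assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a multipart uploader; not Hadoop code.
class InMemoryMultipartSketch {
    // uploadId -> (partNumber -> bytes); TreeMap keeps parts ordered by number.
    private final Map<String, TreeMap<Integer, byte[]>> uploads = new HashMap<>();
    private final Map<String, byte[]> files = new HashMap<>();
    private int nextId = 0;

    // Starts an upload and returns a handle (here just a string).
    String multipartInit(String filePath) {
        String uploadId = filePath + "#" + (nextId++);
        uploads.put(uploadId, new TreeMap<>());
        return uploadId;
    }

    // Stores one part; parts may arrive in any order.
    int multipartPutPart(InputStream in, int partNumber, String uploadId) throws IOException {
        TreeMap<Integer, byte[]> parts = uploads.get(uploadId);
        if (parts == null) {
            throw new IOException("Unknown upload: " + uploadId);
        }
        parts.put(partNumber, in.readAllBytes());
        return partNumber;
    }

    // Joins the parts in part-number order, playing the role of the concat step.
    void multipartComplete(String filePath, String uploadId) throws IOException {
        TreeMap<Integer, byte[]> parts = uploads.remove(uploadId);
        if (parts == null) {
            throw new IOException("Unknown upload: " + uploadId);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] part : parts.values()) {
            out.write(part);
        }
        files.put(filePath, out.toByteArray());
    }

    String read(String filePath) {
        return new String(files.get(filePath), StandardCharsets.UTF_8);
    }

    // Uploads part 2 before part 1 to show that arrival order does not matter.
    static String demo() {
        try {
            InMemoryMultipartSketch store = new InMemoryMultipartSketch();
            String upload = store.multipartInit("/backup/file");
            store.multipartPutPart(
                new ByteArrayInputStream("world".getBytes(StandardCharsets.UTF_8)), 2, upload);
            store.multipartPutPart(
                new ByteArrayInputStream("hello ".getBytes(StandardCharsets.UTF_8)), 1, upload);
            store.multipartComplete("/backup/file", upload);
            return store.read("/backup/file");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```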






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2019-03-06 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785547#comment-16785547
 ] 

Ewan Higgs commented on HDFS-13186:
---

[~ste...@apache.org]
{quote}
Bad news; that new concat operation in raw local makes it possible to create 
files in a checksummed FS which don't have checksums: HADOOP-16150
{quote}
Would it make sense to make a checksumFS MPU that throws upon creation? I don't 
like the approach but using inheritance to remove functionality as checksum FS 
is doing is already broken.

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
> int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
> List<Pair<Integer, PartHandle>> handles, 
> UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle so they can be serialized and deserialized in the hadoop-hdfs project 
> without knowledge of how to deserialize e.g. S3A's version of an UploadHandle 
> and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.






[jira] [Commented] (HDFS-13186) [PROVIDED Phase 2] Multipart Uploader API

2019-03-06 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785546#comment-16785546
 ] 

Ewan Higgs commented on HDFS-13186:
---

[~fabbri]
{quote}What is the motivation for this?  Even if not part of FileSystem it is 
more surface area we need to deal with.{quote}

The motivation for this is to be able to write files to FileSystems in parallel 
and to survive upload failures without having to restart the entire upload. One 
immediate use case is that Tiered Storage can write files from datanodes to a 
synchronization endpoint without having to reassemble the files locally. The NN 
can initialize the write and tell the DNs to upload files and when they are 
done, the NN will commit the work. Further down the line, it's possible that a 
tool like DistCp could be written in terms of this uploader to allow 
users/admins to copy data from one HDFS system to another without having to 
stream blocks locally to a single worker on a single DN.
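
A rough sketch of that coordination, under loud assumptions: the class 
ParallelUploadSketch, the FlakyStore stand-in for an external store, and the 
retry limit are all hypothetical and none of this is Hadoop code. Worker 
threads play the role of DNs uploading parts in parallel, one part fails once 
and only that part is retried, and the caller (the NN in the description 
above) assembles the result once every part has landed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of parallel part uploads with per-part retry. */
final class ParallelUploadSketch {
  /** Stand-in for an external store; the first attempt at part 2 fails. */
  static final class FlakyStore {
    final Map<Integer, String> parts = new ConcurrentHashMap<>();
    final AtomicInteger attempts = new AtomicInteger();
    private final AtomicInteger failuresLeft = new AtomicInteger(1);

    void putPart(int partNumber, String data) {
      attempts.incrementAndGet();
      if (partNumber == 2 && failuresLeft.getAndDecrement() > 0) {
        throw new IllegalStateException("simulated failure for part " + partNumber);
      }
      parts.put(partNumber, data);
    }
  }

  /** Retries a single part; other parts are unaffected by its failures. */
  static void putWithRetry(FlakyStore store, int partNumber, String data,
      int maxAttempts) {
    for (int attempt = 1; ; attempt++) {
      try {
        store.putPart(partNumber, data);
        return;
      } catch (IllegalStateException e) {
        if (attempt >= maxAttempts) throw e;
      }
    }
  }

  /** Uploads every block as its own part in parallel, then "commits" by joining. */
  static String uploadAll(FlakyStore store, List<String> blocks) {
    ExecutorService workers = Executors.newFixedThreadPool(blocks.size());
    try {
      List<Future<?>> pending = new ArrayList<>();
      for (int i = 0; i < blocks.size(); i++) {
        final int partNumber = i + 1;
        final String data = blocks.get(i);
        pending.add(workers.submit(() -> putWithRetry(store, partNumber, data, 3)));
      }
      for (Future<?> f : pending) {
        f.get(); // wait until every part has been stored
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException("upload did not finish", e);
    } finally {
      workers.shutdown();
    }
    StringBuilder out = new StringBuilder();
    for (int p = 1; p <= blocks.size(); p++) {
      out.append(store.parts.get(p));
    }
    return out.toString();
  }
}
```

With three blocks the store sees four attempts in total: one per part plus a 
single retry of the part that failed, rather than a restart of the whole 
upload.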

> [PROVIDED Phase 2] Multipart Uploader API
> -
>
> Key: HDFS-13186
> URL: https://issues.apache.org/jira/browse/HDFS-13186
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Fix For: 3.2.0
>
> Attachments: HDFS-13186.001.patch, HDFS-13186.002.patch, 
> HDFS-13186.003.patch, HDFS-13186.004.patch, HDFS-13186.005.patch, 
> HDFS-13186.006.patch, HDFS-13186.007.patch, HDFS-13186.008.patch, 
> HDFS-13186.009.patch, HDFS-13186.010.patch
>
>
> To write files in parallel to an external storage system as in HDFS-12090, 
> there are two approaches:
>  # Naive approach: use a single datanode per file that copies blocks locally 
> as it streams data to the external service. This requires a copy for each 
> block inside the HDFS system and then a copy for the block to be sent to the 
> external system.
>  # Better approach: Single point (e.g. Namenode or SPS style external client) 
> and Datanodes coordinate in a multipart - multinode upload.
> This system needs to work with multiple back ends and needs to coordinate 
> across the network. So we propose an API that resembles the following:
> {code:java}
> public UploadHandle multipartInit(Path filePath) throws IOException;
> public PartHandle multipartPutPart(InputStream inputStream,
> int partNumber, UploadHandle uploadId) throws IOException;
> public void multipartComplete(Path filePath,
> List<Pair<Integer, PartHandle>> handles, 
> UploadHandle multipartUploadId) throws IOException;{code}
> Here, UploadHandle and PartHandle are opaque handles in the vein of 
> PathHandle so they can be serialized and deserialized in the hadoop-hdfs project 
> without knowledge of how to deserialize e.g. S3A's version of an UploadHandle 
> and PartHandle.
> In an object store such as S3A, the implementation is straightforward. In 
> the case of writing multipart/multinode to HDFS, we can write each block as a 
> file part. The complete call will perform a concat on the blocks.






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2019-02-17 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Patch Available  (was: Open)

007 - Rebased onto rebased HDFS-12090.

This patch depends on HDFS-13118.005.patch

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-12090.007.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Commented] (HDFS-13118) SnapshotDiffReport should provide the INode type

2019-02-17 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770372#comment-16770372
 ] 

Ewan Higgs commented on HDFS-13118:
---

005 - rebased onto rebased HDFS-12090 branch.

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch, HDFS-13118.005.patch
>
>
> Currently the snapshot diff report lists which inodes were added, 
> removed, renamed, etc. But to see what type each INode is, we need to 
> access the underlying snapshot, which is cumbersome to do 
> programmatically when the snapshot diff already has the information.
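
To illustrate why carrying the INode type in the report helps, here is a small 
sketch. It does not use the real SnapshotDiffReport classes: 
SnapshotDiffSketch, Entry, DiffType and INodeType are hypothetical names chosen 
for this example. With the type attached to each entry, a consumer can pick 
out, say, newly created files without opening the snapshot at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a diff report whose entries carry the inode type. */
final class SnapshotDiffSketch {
  enum DiffType { CREATE, MODIFY, DELETE, RENAME }
  enum INodeType { FILE, DIRECTORY, SYMLINK }

  static final class Entry {
    final DiffType diffType;
    final INodeType inodeType;
    final String path;

    Entry(DiffType diffType, INodeType inodeType, String path) {
      this.diffType = diffType;
      this.inodeType = inodeType;
      this.path = path;
    }
  }

  /** Paths of newly created files, selected purely from the report entries. */
  static List<String> createdFiles(List<Entry> report) {
    List<String> result = new ArrayList<>();
    for (Entry e : report) {
      if (e.diffType == DiffType.CREATE && e.inodeType == INodeType.FILE) {
        result.add(e.path);
      }
    }
    return result;
  }
}
```

Without the type field, the same filter would need one lookup against the 
snapshot per entry to tell files from directories.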






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2019-02-17 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Attachment: HDFS-12478-HDFS-12090.007.patch

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-12090.007.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2019-02-17 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2019-02-17 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Patch Available  (was: Open)

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch, HDFS-13118.005.patch
>
>
> Currently the snapshot diff report lists which inodes were added, 
> removed, renamed, etc. But to see what type each INode is, we need to 
> access the underlying snapshot, which is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2019-02-17 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Attachment: HDFS-13118.005.patch

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch, HDFS-13118.005.patch
>
>
> Currently the snapshot diff report lists which inodes were added, 
> removed, renamed, etc. But to see what type each INode is, we need to 
> access the underlying snapshot, which is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2019-02-17 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Open  (was: Patch Available)

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch, HDFS-13118.005.patch
>
>
> Currently the snapshot diff report lists which inodes were added, 
> removed, renamed, etc. But to see what type each INode is, we need to 
> access the underlying snapshot, which is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Commented] (HDFS-12090) Handling writes from HDFS to Provided storages

2019-02-15 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769465#comment-16769465
 ] 

Ewan Higgs commented on HDFS-12090:
---

Now that HDFS-13794 has been merged into the branch, I've rebased onto 
e0fe3d1ecaf859d0bf1a5b5223c4b2a56bfcde0e.

> Handling writes from HDFS to Provided storages
> --
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Virajith Jalaparti
>Priority: Major
> Attachments: External-SyncService-CreateFile.001.png, 
> HDFS-12090-Functional-Specification.001.pdf, 
> HDFS-12090-Functional-Specification.002.pdf, 
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, 
> HDFS-12090..patch, HDFS-12090.0001.patch
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in 
> external storage systems accessible through HDFS. However, HDFS-9806 is 
> limited to data being read through HDFS. This JIRA will deal with how data 
> can be written to such {{PROVIDED}} storages from HDFS.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-11-28 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16701666#comment-16701666
 ] 

Ewan Higgs commented on HDFS-13713:
---

{quote}Apart from some ambiguity about concurrency, is this ready to go 
in?{quote}
Yes!

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
>  of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.
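
As a rough picture of the model the bullets above describe, here is a sketch 
of a mapping from upload ID to upload with list, commit and abort, including 
the invalid paths that contract tests would need to probe. Everything in it is 
an assumption made for illustration: UploadModelSketch and its method names 
are hypothetical, an upload is reduced to its destination path, and failures 
are signalled with IllegalArgumentException rather than whatever the real 
specification mandates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model: a function mapping uploadId -> Upload (here, a path). */
final class UploadModelSketch {
  private final Map<String, String> uploads = new HashMap<>();
  private int nextId = 0;

  /** Starts an upload and returns its ID. */
  String init(String path) {
    String id = "upload-" + (nextId++);
    uploads.put(id, path);
    return id;
  }

  /** IDs of all uploads that are neither committed nor aborted. */
  Set<String> list() {
    return new TreeSet<>(uploads.keySet());
  }

  /** Commits an upload, removing it from the mapping; returns the destination. */
  String commit(String uploadId) {
    return removeOrFail(uploadId);
  }

  /** Aborts an upload, removing it from the mapping. */
  void abort(String uploadId) {
    removeOrFail(uploadId);
  }

  private String removeOrFail(String uploadId) {
    String path = uploads.remove(uploadId);
    if (path == null) {
      throw new IllegalArgumentException("unknown upload: " + uploadId);
    }
    return path;
  }
}
```

A contract test against such a model would assert the successful path 
(init, list, commit) and then the invalid ones: committing twice, or aborting 
an ID that was never issued, must fail rather than silently succeed.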






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-11-28 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Status: Open  (was: Patch Available)

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
>  of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-11-28 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Status: Patch Available  (was: Open)

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
>  of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Comment Edited] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-11-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685025#comment-16685025
 ] 

Ewan Higgs edited comment on HDFS-12478 at 11/13/18 10:52 AM:
--

006
- Factored out INodeType work.
- This patch now depends on HDFS-13118.004.patch so don't expect Jenkins to be 
able to apply it. 

[~virajith], please take a look at your earliest convenience.


was (Author: ehiggs):
006
- Factored out INodeType work.

[~virajith], please take a look at your earliest convenience.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-11-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685025#comment-16685025
 ] 

Ewan Higgs commented on HDFS-12478:
---

006
- Factored out INodeType work.

[~virajith], please take a look at your earliest convenience.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-11-13 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-11-13 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-11-13 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Attachment: HDFS-12478-HDFS-12090.006.patch

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-12090.006.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}
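
The `-remove` form above allows at most one of `-force` or `-flush`. A minimal sketch of that argument check follows. It is not taken from any HDFS-12478 patch: `AttachRemoveArgsSketch`, `RemoveRequest`, `Mode`, and `parse` are hypothetical names, and the positional argument is treated as an opaque mount identifier because its placeholder is missing from the description.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; not the DFSAdmin implementation.
final class AttachRemoveArgsSketch {
  enum Mode { DEFAULT, FORCE, FLUSH }

  static final class RemoveRequest {
    final String target; // assumed: some identifier of the mount to remove
    final Mode mode;

    RemoveRequest(String target, Mode mode) {
      this.target = target;
      this.mode = mode;
    }
  }

  // Parses the arguments that follow "-remove".
  static RemoveRequest parse(List<String> args) {
    String target = null;
    Mode mode = Mode.DEFAULT;
    for (String a : args) {
      if (a.equals("-force") || a.equals("-flush")) {
        if (mode != Mode.DEFAULT) {
          throw new IllegalArgumentException(
              "at most one of -force or -flush may be given");
        }
        mode = a.equals("-force") ? Mode.FORCE : Mode.FLUSH;
      } else if (target == null) {
        target = a;
      } else {
        throw new IllegalArgumentException("unexpected argument: " + a);
      }
    }
    if (target == null) {
      throw new IllegalArgumentException("missing mount to remove");
    }
    return new RemoveRequest(target, mode);
  }

  public static void main(String[] args) {
    RemoveRequest r = parse(Arrays.asList("backup1", "-flush"));
    System.out.println(r.target + " " + r.mode);
  }
}
```

Rejecting the combination up front keeps the two removal behaviours from being mixed by accident; what each flag actually does is left to the patch.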






[jira] [Commented] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681649#comment-16681649
 ] 

Ewan Higgs commented on HDFS-13118:
---

004
- Rebased from HDFS-12478 patch 005.

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.
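
The motivation above can be illustrated with a small self-contained sketch. It does not use the real Hadoop `SnapshotDiffReport` classes: `SnapshotDiffSketch`, `DiffEntry`, `createdFiles`, and the `INodeType` values shown are assumptions made for illustration. The point is only that once each entry carries its inode type, a consumer can filter the report without opening the snapshot.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; these are NOT the Hadoop SnapshotDiffReport classes.
final class SnapshotDiffSketch {
  enum DiffType { CREATE, DELETE, MODIFY, RENAME }

  // Assumed values for the inode type carried by each diff entry.
  enum INodeType { FILE, DIRECTORY, SYMLINK }

  static final class DiffEntry {
    final DiffType diffType;
    final INodeType inodeType;
    final String path;

    DiffEntry(DiffType diffType, INodeType inodeType, String path) {
      this.diffType = diffType;
      this.inodeType = inodeType;
      this.path = path;
    }
  }

  // Returns paths of newly created files, using only what the report contains.
  static List<String> createdFiles(List<DiffEntry> report) {
    List<String> out = new ArrayList<>();
    for (DiffEntry e : report) {
      if (e.diffType == DiffType.CREATE && e.inodeType == INodeType.FILE) {
        out.add(e.path);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<DiffEntry> report = new ArrayList<>();
    report.add(new DiffEntry(DiffType.CREATE, INodeType.DIRECTORY, "/data/logs"));
    report.add(new DiffEntry(DiffType.CREATE, INodeType.FILE, "/data/logs/a.log"));
    report.add(new DiffEntry(DiffType.DELETE, INodeType.FILE, "/data/old.log"));
    System.out.println(createdFiles(report));
  }
}
```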






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Attachment: HDFS-13118.004.patch

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Open  (was: Patch Available)

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Patch Available  (was: Open)

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch, HDFS-13118.004.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Commented] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681129#comment-16681129
 ] 

Ewan Higgs commented on HDFS-13118:
---

003
- Rebased patch and added INodeType to test calls that have been added in the 
meantime.

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-11-09 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681126#comment-16681126
 ] 

Ewan Higgs commented on HDFS-12478:
---

The patch for INodeType is available in HDFS-13118. It's now been rebased.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Patch Available  (was: Open)

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Attachment: HDFS-13118.003.patch

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Updated] (HDFS-13118) SnapshotDiffReport should provide the INode type

2018-11-09 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13118:
--
Status: Open  (was: Patch Available)

> SnapshotDiffReport should provide the INode type
> 
>
> Key: HDFS-13118
> URL: https://issues.apache.org/jira/browse/HDFS-13118
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: snapshots
>Affects Versions: 3.0.0
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13118.001.patch, HDFS-13118.002.patch, 
> HDFS-13118.003.patch
>
>
> Currently the snapshot diff report will list which inodes were added, 
> removed, renamed, etc. But to see what the INode actually is, we need to 
> actually access the underlying snapshot - and this is cumbersome to do 
> programmatically when the snapshot diff already has the information.






[jira] [Comment Edited] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669154#comment-16669154
 ] 

Ewan Higgs edited comment on HDFS-12478 at 10/30/18 6:24 PM:
-

005
- minor fixes while walking [~virajith] through the code over the phone.

Comment from [~virajith]: move SnapshotDiff.INodeType changes out of this patch 
and into another changeset.
Also: add a test or command line in TestDFSAdmin that calls DFSAdmin to run the 
commands (catching Unsupported exceptions right now).


was (Author: ehiggs):
005
- minor fixes while walking [~virajith] through the code over the phone.

Comment from [~virajith]: move SnapshotDiff.INodeType changes out of this patch 
and into another changeset.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669154#comment-16669154
 ] 

Ewan Higgs commented on HDFS-12478:
---

005
- minor fixes while walking [~virajith] through the code over the phone.

Comment from [~virajith]: move SnapshotDiff.INodeType changes out of this patch 
and into another changeset.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Attachment: HDFS-12478-HDFS-12090.005.patch

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-12090.005.patch, HDFS-12478-HDFS-9806.001.patch, 
> HDFS-12478-HDFS-9806.002.patch, HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Attachment: HDFS-12478-HDFS-12090.004.patch

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Attachment: (was: HDFS-12478-HDFS-12090.004.patch)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-13794) [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13794:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.
> --
>
> Key: HDFS-13794
> URL: https://issues.apache.org/jira/browse/HDFS-13794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13794-HDFS-12090.001.patch, 
> HDFS-13794-HDFS-12090.002.patch, HDFS-13794-HDFS-12090.003.patch, 
> HDFS-13794-HDFS-12090.004.patch
>
>
> When updating the BlockAliasMap we may need to deal with deleted blocks. 
> Otherwise the BlockAliasMap will grow indefinitely(!).
> Therefore, the BlockAliasMap.Writer needs a method for removing blocks.
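
As a rough illustration of why a removal method is needed, the sketch below models an alias map as an in-memory table. It is a stand-in only: `AliasMapSketch`, its `Writer` interface, `InMemoryAliasMap`, and the `(blockId, externalLocation)` shape are assumptions for this example and do not reflect the actual `BlockAliasMap.Writer` signatures.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the Hadoop BlockAliasMap API.
final class AliasMapSketch {
  // Assumed writer shape: store adds a mapping, remove drops it.
  interface Writer {
    void store(long blockId, String externalLocation);
    void remove(long blockId);
  }

  static final class InMemoryAliasMap implements Writer {
    private final Map<Long, String> entries = new HashMap<>();

    @Override
    public void store(long blockId, String externalLocation) {
      entries.put(blockId, externalLocation);
    }

    @Override
    public void remove(long blockId) {
      // Without this, entries for deleted blocks would accumulate forever.
      entries.remove(blockId);
    }

    int size() {
      return entries.size();
    }
  }

  public static void main(String[] args) {
    InMemoryAliasMap map = new InMemoryAliasMap();
    map.store(1L, "remote://bucket/file-a#0");
    map.store(2L, "remote://bucket/file-b#0");
    map.remove(1L); // block 1 was deleted
    System.out.println(map.size());
  }
}
```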






[jira] [Updated] (HDFS-13794) [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13794:
--
Attachment: HDFS-13794-HDFS-12090.004.patch

> [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.
> --
>
> Key: HDFS-13794
> URL: https://issues.apache.org/jira/browse/HDFS-13794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13794-HDFS-12090.001.patch, 
> HDFS-13794-HDFS-12090.002.patch, HDFS-13794-HDFS-12090.003.patch, 
> HDFS-13794-HDFS-12090.004.patch
>
>
> When updating the BlockAliasMap we may need to deal with deleted blocks. 
> Otherwise the BlockAliasMap will grow indefinitely(!).
> Therefore, the BlockAliasMap.Writer needs a method for removing blocks.






[jira] [Updated] (HDFS-13794) [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.

2018-10-30 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13794:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.
> --
>
> Key: HDFS-13794
> URL: https://issues.apache.org/jira/browse/HDFS-13794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13794-HDFS-12090.001.patch, 
> HDFS-13794-HDFS-12090.002.patch, HDFS-13794-HDFS-12090.003.patch, 
> HDFS-13794-HDFS-12090.004.patch
>
>
> When updating the BlockAliasMap we may need to deal with deleted blocks. 
> Otherwise the BlockAliasMap will grow indefinitely(!).
> Therefore, the BlockAliasMap.Writer needs a method for removing blocks.






[jira] [Commented] (HDFS-13794) [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.

2018-10-30 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669024#comment-16669024
 ] 

Ewan Higgs commented on HDFS-13794:
---

004
- Rebased the code onto updates in HDFS-12090.

> [PROVIDED Phase 2] Teach BlockAliasMap.Writer `remove` method.
> --
>
> Key: HDFS-13794
> URL: https://issues.apache.org/jira/browse/HDFS-13794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13794-HDFS-12090.001.patch, 
> HDFS-13794-HDFS-12090.002.patch, HDFS-13794-HDFS-12090.003.patch, 
> HDFS-13794-HDFS-12090.004.patch
>
>
> When updating the BlockAliasMap we may need to deal with deleted blocks. 
> Otherwise the BlockAliasMap will grow indefinitely(!).
> Therefore, the BlockAliasMap.Writer needs a method for removing blocks.






[jira] [Commented] (HDFS-12666) [PROVIDED Phase 2] Provided Storage Mount Manager (PSMM) mount

2018-10-24 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662343#comment-16662343
 ] 

Ewan Higgs commented on HDFS-12666:
---

This is subsumed into HDFS-13777 patch 7.

> [PROVIDED Phase 2] Provided Storage Mount Manager (PSMM) mount
> --
>
> Key: HDFS-12666
> URL: https://issues.apache.org/jira/browse/HDFS-12666
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-12666-HDFS-12090.001.patch
>
>
> Implement the Provided Storage Mount Manager. This is a service (thread) in 
> the Namenode that manages backup mounts, unmounts, snapshotting, and 
> monitoring the progress of backups.
> On mount, the mount manager writes XATTR information at the top level of the 
> mount to do the appropriate bookkeeping. This is done to maintain state in 
> case the Namenode falls over.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-24 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16662334#comment-16662334
 ] 

Ewan Higgs commented on HDFS-13713:
---

{quote} I'd expect it to be visible, yes?{quote}
Yes.

{quote}FWIW, I suspect that AWS will still bill you for those uncommitted 
parts, even if complete() simply discards the parts. {quote}
AFAIK, yes, you will be billed for parts that have not yet been realized.

{quote}How about we say

if you attempt >1 MPU to the same dest then it may be rejected, or it may be 
accepted. In the latter case, which upload becomes visible after the first 
completes{quote}

I'm not sure what you mean by "In the latter case, which upload becomes visible 
after the first completes." 

For example, we could say: 
"If you complete multiple uploads to the same destination at the same time, one 
of the files will be in the destination. Which file is available depends on 
the target storage system." As opposed to "there will be an error" 
or, worse, "the file will be corrupted".

My thoughts are that if a user writes multiple files to the same destination at 
the same time, then they either don't mind which one is 'the winner' or they 
should be advised to choose names that won't collide.

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-16 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16651665#comment-16651665
 ] 

Ewan Higgs commented on HDFS-13713:
---

HDFS-13713.008.patch (using HDFS prefix, not HADOOP prefix even though this 
also concerns S3AFilesystem)

008 
- Allow concurrent uploads for Local file system and HDFS.
- Reorder uploads in the concurrent case.
- Finalization methods (complete, abort) are not idempotent on HDFS (upload IDs 
are consumed), but they are briefly idempotent on S3 (a server-side GC reaps 
the upload IDs at a later time). Added an implementation-dependent 
boolean to determine which behaviour is expected for repeated completes and 
aborts that use an already burned upload ID.

{quote}We could maybe be vague about what happens, i.e. {quote}

We may need to leave this open because S3 has a behaviour that is not 
consistent with HDFS and it's not obvious that we would prefer one over the 
other. Let me explain:

1. In the contract tests it becomes obvious that in S3 the last-started 
successful upload is 'the winner'.

example: Given upload1 and upload2:

init 1
init 2 <-- last started upload
putpart 1
putpart 2
complete 2 <-- last started upload is complete - 'the winner'
complete 1 <-- never to be seen unless versioning is enabled

2. In HDFS the last completed upload is 'the winner'.

Example: given upload1 and upload2:

init 1
init 2
putpart 1
putpart 2
complete 2 <-- concat and copy into place - visible until complete1
complete 1 <-- concat and copy into place - 'the winner'

3. I don't know what WASB or GCS do so specifying based on S3 behaviour at this 
time could be undesirable.
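
The two orderings above can be compared with a toy model. This is a sketch for illustration only: `MpuWinnerModel` and its methods are names invented here, not part of the MultipartUploader API, and the two rules simply encode the behaviour described in points 1 and 2.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the two visibility rules; not a Hadoop or S3 API.
final class MpuWinnerModel {
  private final List<Integer> initOrder = new ArrayList<>();
  private final List<Integer> completeOrder = new ArrayList<>();

  // Record that an upload was initiated.
  void init(int uploadId) {
    initOrder.add(uploadId);
  }

  // Record that an upload was completed.
  void complete(int uploadId) {
    completeOrder.add(uploadId);
  }

  // S3-like rule: of the completed uploads, the one initiated last is visible.
  int lastStartedWinner() {
    for (int i = initOrder.size() - 1; i >= 0; i--) {
      if (completeOrder.contains(initOrder.get(i))) {
        return initOrder.get(i);
      }
    }
    throw new IllegalStateException("no completed upload");
  }

  // HDFS-like rule: every complete overwrites the destination, so the last one stays.
  int lastCompletedWinner() {
    if (completeOrder.isEmpty()) {
      throw new IllegalStateException("no completed upload");
    }
    return completeOrder.get(completeOrder.size() - 1);
  }
}
```

Replaying the sequence from the examples (init 1, init 2, complete 2, complete 1) gives upload 2 under the S3-like rule and upload 1 under the HDFS-like rule, which is the divergence a spec would have to allow for.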



> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-16 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Status: Open  (was: Patch Available)

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-16 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Status: Patch Available  (was: Open)

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-16 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Attachment: HDFS-13713.008.patch

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> HDFS-13713.008.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-10 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16644855#comment-16644855
 ] 

Ewan Higgs commented on HDFS-13713:
---

I'll make some time to look at this.

My thoughts are that the file commit should overwrite whatever data is there 
and if users don't want their files/objects clobbered they should choose names 
that won't be guessed.

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-10-02 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635374#comment-16635374
 ] 

Ewan Higgs commented on HDFS-13713:
---

{quote}Either the MPU rejects a second attempt to write to the destination 
(which the filesystem one now does){quote}

From a contract perspective, this should overwrite.

{quote}Fix up FileSystemMultipartUploader to detect concurrent uploads and fail 
(it just looks for the completion dir & rejects if it exists){quote} 

This should succeed.

{quote}New test for concurrent uploads and ordering of visibility (eventually 
the last upload committed will become and remain the final file){quote}
+1

{quote}In TestHDFSContractMultipartUploader.testUploadEmptyBlock() expect a 
failure, downgrade to WARN and skip(). This will need fixing in 3.3; for now 
just skip that test. See HDFS-13936{quote}
+1



> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HADOOP-13713-004.patch, HADOOP-13713-004.patch, 
> HADOOP-13713-005.patch, HADOOP-13713-006.patch, HADOOP-13713-007.patch, 
> HDFS-13713.001.patch, HDFS-13713.002.patch, HDFS-13713.003.patch, 
> multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13936) multipart upload to HDFS to support 0 byte upload

2018-10-02 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635356#comment-16635356
 ] 

Ewan Higgs commented on HDFS-13936:
---

I took the LGTM as +1. Merged.

> multipart upload to HDFS to support 0 byte upload
> -
>
> Key: HDFS-13936
> URL: https://issues.apache.org/jira/browse/HDFS-13936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, hdfs
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13936.01.patch
>
>
> MPUs to HDFS fail as you can't concat an empty block. 
> Whatever uploads to HDFS needs to recognise that specific case "0-byte file" 
> and rather than try and concat things, just create a 0-byte file there.
> Without this, you can't use MPU as a replacement for distcp or alternative 
> commit protocols.






[jira] [Commented] (HDFS-13936) multipart upload to HDFS to support 0 byte upload

2018-09-25 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628004#comment-16628004
 ] 

Ewan Higgs commented on HDFS-13936:
---

01

* If the parts in the FileSystemMultipartUploader are empty then only touch the 
file (create+close).
* Add a contract test to make sure this works across MPU implementations.
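
The first bullet can be illustrated with a local-filesystem analogy. This is a sketch under stated assumptions, not the code in the patch: it uses `java.nio.file` rather than the Hadoop `FileSystem` API, and `ZeroByteComplete.complete` is a name made up for this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Local-filesystem analogy of the zero-byte case; not FileSystemMultipartUploader itself.
final class ZeroByteComplete {
  // Concatenate the parts into dest, or only create an empty dest
  // when the parts contain no data at all.
  static void complete(List<Path> parts, Path dest) throws IOException {
    long total = 0;
    for (Path part : parts) {
      total += Files.size(part);
    }
    if (total == 0) {
      // "touch": create and immediately close, truncating any existing file
      Files.newOutputStream(dest).close();
      return;
    }
    try (OutputStream out = Files.newOutputStream(dest)) {
      for (Path part : parts) {
        Files.copy(part, out);
      }
    }
  }
}
```

The point of the special case is that the concat step is never reached with empty input, so the destination still ends up as a valid 0-byte file.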

> multipart upload to HDFS to support 0 byte upload
> -
>
> Key: HDFS-13936
> URL: https://issues.apache.org/jira/browse/HDFS-13936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, hdfs
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13936.01.patch
>
>
> MPUs to HDFS fail as you can't concat an empty block. 
> Whatever uploads to HDFS needs to recognise that specific case "0-byte file" 
> and rather than try and concat things, just create a 0-byte file there.
> Without this, you can't use MPU as a replacement for distcp or alternative 
> commit protocols.






[jira] [Updated] (HDFS-13936) multipart upload to HDFS to support 0 byte upload

2018-09-25 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13936:
--
Status: Patch Available  (was: Open)

> multipart upload to HDFS to support 0 byte upload
> -
>
> Key: HDFS-13936
> URL: https://issues.apache.org/jira/browse/HDFS-13936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, hdfs
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13936.01.patch
>
>
> MPUs to HDFS fail as you can't concat an empty block. 
> Whatever uploads to HDFS needs to recognise that specific case "0-byte file" 
> and rather than try and concat things, just create a 0-byte file there.
> Without this, you can't use MPU as a replacement for distcp or alternative 
> commit protocols.






[jira] [Updated] (HDFS-13936) multipart upload to HDFS to support 0 byte upload

2018-09-25 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13936:
--
Attachment: HDFS-13936.01.patch

> multipart upload to HDFS to support 0 byte upload
> -
>
> Key: HDFS-13936
> URL: https://issues.apache.org/jira/browse/HDFS-13936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, hdfs
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13936.01.patch
>
>
> MPUs to HDFS fail as you can't concat an empty block. 
> Whatever uploads to HDFS needs to recognise that specific case "0-byte file" 
> and rather than try and concat things, just create a 0-byte file there.
> Without this, you can't use MPU as a replacement for distcp or alternative 
> commit protocols.






[jira] [Commented] (HDFS-13936) multipart upload to HDFS to support 0 byte upload

2018-09-24 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625847#comment-16625847
 ] 

Ewan Higgs commented on HDFS-13936:
---

Right, but you could touch a file with no implication of the file being open. 
This is used in HDFS for sentinel files (not usually used for files you would 
want to concat).
{code:java}
hdfs dfs -touch /empty-file{code}
 

The best option seems to be to check if there is only a single part and if that 
single part is empty, just touch the destination.

> multipart upload to HDFS to support 0 byte upload
> -
>
> Key: HDFS-13936
> URL: https://issues.apache.org/jira/browse/HDFS-13936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, hdfs
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Major
>
> MPUs to HDFS fail as you can't concat an empty block. 
> Whatever uploads to HDFS needs to recognise that specific case "0-byte file" 
> and rather than try and concat things, just create a 0-byte file there.
> Without this, you can't use MPU as a replacement for distcp or alternative 
> commit protocols.






[jira] [Commented] (HDFS-13936) multipart upload to HDFS to support 0 byte upload

2018-09-24 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625761#comment-16625761
 ] 

Ewan Higgs commented on HDFS-13936:
---

Is there anything in the contract of HDFS specifying that it must fail when 
concatting empty files? If so, is it a bad idea to allow this behaviour?

> multipart upload to HDFS to support 0 byte upload
> -
>
> Key: HDFS-13936
> URL: https://issues.apache.org/jira/browse/HDFS-13936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, hdfs
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Major
>
> MPUs to HDFS fail as you can't concat an empty block. 
> Whatever uploads to HDFS needs to recognise that specific case "0-byte file" 
> and rather than try and concat things, just create a 0-byte file there.
> Without this, you can't use MPU as a replacement for distcp or alternative 
> commit protocols.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-17 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617263#comment-16617263
 ] 

Ewan Higgs commented on HDFS-13713:
---

[~ste...@apache.org], any feedback for patch 003?

Thanks

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> HDFS-13713.003.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613158#comment-16613158
 ] 

Ewan Higgs commented on HDFS-13777:
---

007
* Patch depends on HDFS-12478-HDFS-12090.004.patch and 
HDFS-13794-HDFS-12090.003.patch
* Moves the protocol work for the command line interface to HDFS-12478.

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch, 
> HDFS-13777-HDFS-12090.007.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-12 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch, 
> HDFS-13777-HDFS-12090.007.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-12 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch, 
> HDFS-13777-HDFS-12090.007.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-12 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Attachment: HDFS-13777-HDFS-12090.007.patch

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch, 
> HDFS-13777-HDFS-12090.007.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-12 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Attachment: (was: HDFS-13777-HDFS-12090.007.patch)

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-12 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Attachment: HDFS-13777-HDFS-12090.007.patch

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch, 
> HDFS-13777-HDFS-12090.007.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Comment Edited] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609373#comment-16609373
 ] 

Ewan Higgs edited comment on HDFS-12478 at 9/10/18 3:17 PM:


004
* Rebase onto HDFS-12090 branch.
* Add protocol code for also adding INodeType to SnapshotDiffReport as it's part 
of the protocol here. This resulted in a lot of code changes as I plumbed the 
new argument through the system.
* Having the code here will allow the patch in HDFS-13777 to lose some girth.


was (Author: ehiggs):
004
* Rebase onto HDFS-12090 branch.
* Add protocol code for also adding INodeType to SnapshotDiffReport as it's part 
of the protocol here. This resulted in a lot of code changes as I plumbed the 
new argument through the system.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Commented] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609373#comment-16609373
 ] 

Ewan Higgs commented on HDFS-12478:
---

004
* Rebase onto HDFS-12090 branch.
* Add protocol code for also adding INodeType to SnapshotDiffReport as it's part 
of the protocol here.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Comment Edited] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609373#comment-16609373
 ] 

Ewan Higgs edited comment on HDFS-12478 at 9/10/18 3:15 PM:


004
* Rebase onto HDFS-12090 branch.
* Add protocol code for also adding INodeType to SnapshotDiffReport as it's part 
of the protocol here. This resulted in a lot of code changes as I plumbed the 
new argument through the system.


was (Author: ehiggs):
004
* Rebase onto HDFS-12090 branch.
* Add protocol code for also adding INodeType to SnapshotDiffReport as it's part 
of the protocol here.

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Attachment: HDFS-12478-HDFS-12090.004.patch

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Updated] (HDFS-12478) [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup mounts

2018-09-10 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-12478:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Command line tools for managing Provided Storage Backup 
> mounts
> -
>
> Key: HDFS-12478
> URL: https://issues.apache.org/jira/browse/HDFS-12478
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Minor
> Attachments: HDFS-12478-HDFS-12090.004.patch, 
> HDFS-12478-HDFS-9806.001.patch, HDFS-12478-HDFS-9806.002.patch, 
> HDFS-12478-HDFS-9806.003.patch
>
>
> This is a task for implementing the command line interface for attaching a 
> PROVIDED storage backup system (see HDFS-9806, HDFS-12090).
> # The administrator should be able to mount a PROVIDED storage volume from 
> the command line. 
> {code}hdfs attach -create [-name ]   path (external)>{code}
> # Whitelist of users who are able to manage mounts (create, attach, detach).
> # Be able to interrogate the status of the attached storage (last time a 
> snapshot was taken, files being backed up).
> # The administrator should be able to remove an attached PROVIDED storage 
> volume from the command line. This simply means that the synchronization 
> process no longer runs. If the administrator has configured their setup to no 
> longer have local copies of the data, the blocks in the subtree are simply no 
> longer accessible as the external file store system is currently inaccessible.
> {code}hdfs attach -remove  [-force | -flush]{code}






[jira] [Commented] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-10 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16608925#comment-16608925
 ] 

Ewan Higgs commented on HDFS-13777:
---

006 
- Rebased patch onto HDFS-12090 {{06477abcd93eb988b4afd0a2dff549e67e0dbd85}}
- Still need to split this up

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-10 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Status: Patch Available  (was: Open)

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-10 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Status: Open  (was: Patch Available)

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-09-10 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Attachment: HDFS-13777-HDFS-12090.006.patch

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch, HDFS-13777-HDFS-12090.006.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-05 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16604207#comment-16604207
 ] 

Ewan Higgs commented on HDFS-13713:
---

003
* Update specifications based on [~ste...@apache.org]'s comments.
* Add tests to check for conditions discussed.
* Add precondition checks to verify that part numbers are distinct and that no 
directory has been placed in the way before the commit took place.

Given that we now check explicitly for distinct part numbers, I'd be amenable 
to changing from {{List>}} to {{Map}}.
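
As a rough illustration of the distinct-part-number precondition mentioned above, the check could look like the sketch below. The generic parameters of the list type are lost in this archive, so the sketch simply assumes a list of (part number, handle) pairs expressed with java.util.Map.Entry; the class and method names (PartNumberCheckSketch, checkDistinctPartNumbers) are hypothetical and not taken from the patch.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a client-side precondition check; not the patch's code. */
class PartNumberCheckSketch {
  /**
   * Rejects a parts list in which any part number occurs more than once.
   * The handle type H is left generic because the real type is an assumption.
   */
  static <H> void checkDistinctPartNumbers(List<Map.Entry<Integer, H>> parts) {
    Set<Integer> seen = new HashSet<>();
    for (Map.Entry<Integer, H> part : parts) {
      // Set.add returns false when the part number was already present.
      if (!seen.add(part.getKey())) {
        throw new IllegalArgumentException(
            "Duplicate part number: " + part.getKey());
      }
    }
  }
}
```

A map keyed by part number would make this check unnecessary, since duplicate keys cannot be represented at all.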

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> HDFS-13713.003.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-05 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Attachment: HDFS-13713.003.patch

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> HDFS-13713.003.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-05 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Status: Patch Available  (was: Open)

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> HDFS-13713.003.patch, multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-04 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603504#comment-16603504
 ] 

Ewan Higgs commented on HDFS-13713:
---

{quote}duplicate entries: should the s3a one do a check & fail consistently? Or 
call out that it's a MUST fail with IllegalArgumentException or IOE. (I'd 
prefer a consistent IllegalArgumentException as this check is straightforward 
to do client side){quote}
Agreed. I think {{IllegalArgumentException}} fits best here.

{quote}add a marker to stop a file going in there later. {quote} Goodness, I 
hope we don't have to do that. A feature here is that no file exists until the 
complete method is called! 

If you're working on a system where people will trample your destination files 
with directories, I would prefer the onus to be on the client to create names 
that won't be interfered with (e.g. containing a UUID).
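
A minimal sketch of the client-side naming idea, assuming the client is free to choose its own destination file name: append a random UUID so that an unrelated writer is unlikely to create a directory at the same path. The helper name uniqueName and the "-" separator are made up for this example.

```java
import java.util.UUID;

/** Hypothetical helper; the name and separator are illustrative assumptions. */
class UniqueDestinationSketch {
  /** Returns baseName followed by "-" and a freshly generated random UUID. */
  static String uniqueName(String baseName) {
    return baseName + "-" + UUID.randomUUID();
  }
}
```

For example, uniqueName("part-results") yields a name of the form part-results-<uuid>, with a different suffix on each call.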

{quote}Or, and its an interesting thought: don't do the checks at init time, 
but postpone them until commit. {quote}
Yes, it's a feature here that the destination file doesn't exist until the 
complete method is called. So it makes sense that this is when all the checks 
happen. The parent directory for a file needs to exist at init time though 
because that's where we put the temp directory with the parts.

{quote}All this is straightforward to test, obviously, which is why having 
consistent exceptions is nice. (and its why I like these FS specs, they 
identify those corner cases we can trivially derive tests from, and, use as the 
reference when trying to decide whether a test failure is a bug in the FS vs 
the test itself{quote} Absolutely!

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-04 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603489#comment-16603489
 ] 

Ewan Higgs commented on HDFS-13713:
---

[~goiri], yes, there is an HDFS implementation. See 
{{org.apache.hadoop.fs.FileSystemMultipartUploader}}. There is no example yet 
of a use since this is a primitive that will be used by forthcoming work 
(HDFS-12090). But it has wide applicability (e.g. DistCP) so it was submitted 
to trunk and not on the HDFS-12090 branch.

{quote}Does it make sense to add a pointer to the S3 implementation as an 
example?{quote}
Maybe, but would pointing to the S3 implementation preempt the possibilities of 
having a wasb and/or adl implementation?

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-04 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602927#comment-16602927
 ] 

Ewan Higgs commented on HDFS-13713:
---

{quote}something about how all listed handles must be in 
keys(FS.uploads(uploadHandle).parts){quote}

{quote}what if there are handles in the MPU which aren't included in the 
map?{quote}
They are not part of the resulting file and should be cleaned up by the MPU 
uploader implementation if the underlying backing store does not do this 
automatically.

{quote}what if there are duplicate entries in the hand in the parts list 
provided?{quote}
This fails in HDFS with {{org.apache.hadoop.HadoopIllegalArgumentException: 
concat: at least two of the source files are the same}}
This fails in S3 with {{AmazonS3Exception: The list of parts was not in 
ascending order.}}


{quote}what if, at the point of completion, there is now a directory at the 
destination?{quote}
AFAICS, this is a race condition that can only be left as implementation-dependent 
undefined behaviour, since some backing stores don't have directories 
and the FileSystem API doesn't have transactions or CaS to only write a file if 
nothing is already there.

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch, 
> multipartuploader.md
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-12090) Handling writes from HDFS to Provided storages

2018-09-04 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602798#comment-16602798
 ] 

Ewan Higgs commented on HDFS-12090:
---

Now that the SPS (HDFS-10285) has been merged, I've rebased onto 
211034a6c22dd4ebe697481ea4d57b5eb932fa08.

> Handling writes from HDFS to Provided storages
> --
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Virajith Jalaparti
>Priority: Major
> Attachments: External-SyncService-CreateFile.001.png, 
> HDFS-12090-Functional-Specification.001.pdf, 
> HDFS-12090-Functional-Specification.002.pdf, 
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, 
> HDFS-12090..patch, HDFS-12090.0001.patch
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in 
> external storage systems accessible through HDFS. However, HDFS-9806 is 
> limited to data being read through HDFS. This JIRA will deal with how data 
> can be written to such {{PROVIDED}} storages from HDFS.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-03 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602095#comment-16602095
 ] 

Ewan Higgs commented on HDFS-13713:
---

002
* Explanation of how the API should work.
* Use some backticks to format class names.

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-09-03 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Attachment: HDFS-13713.002.patch

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch, HDFS-13713.002.patch
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> mode|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]l
>  of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-08-29 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16596919#comment-16596919
 ] 

Ewan Higgs commented on HDFS-13713:
---

001
- Basic documentation file in 
{{hadoop-common-project/hadoop-common/src/site/markdown/filesystem}}.
- This patch needs help with the specification notation.

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Updated] (HDFS-13713) Add specification of Multipart Upload API to FS specification, with contract tests

2018-08-29 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13713:
--
Attachment: HDFS-13713.001.patch

> Add specification of Multipart Upload API to FS specification, with contract 
> tests
> --
>
> Key: HDFS-13713
> URL: https://issues.apache.org/jira/browse/HDFS-13713
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.2.0
>Reporter: Steve Loughran
>Assignee: Ewan Higgs
>Priority: Blocker
> Attachments: HDFS-13713.001.patch
>
>
> There's nothing in the FS spec covering the new API. Add it in a new .md file
> * add FS model with the notion of a function mapping (uploadID -> Upload), 
> the operations (list, commit, abort). The [TLA+ 
> model|https://issues.apache.org/jira/secure/attachment/12865161/objectstore.pdf]
> of HADOOP-13786 shows how to do this.
> * Contract tests of not just the successful path, but all the invalid ones.
> * implementations of the contract tests of all FSs which support the new API.






[jira] [Commented] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-08-18 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16584940#comment-16584940
 ] 

Ewan Higgs commented on HDFS-13777:
---

005
- Removed the AliasMap work, as it's covered in the patch for HDFS-13794.

I'll remove the mountmanager work and client protocol in a subsequent patch.

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.
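
The cycle described above (snapshot, snapshot diff, distribute work, update the AliasMap on completion) can be pictured with the small Python sketch below. It is a simplified illustration under stated assumptions and not the code in the attached patches: snapshots are modelled as plain dicts from path to version, work is handed out round-robin, deletions are ignored, and the names `snapshot_diff`, `schedule_backup`, `sync_once` and the `remote_prefix` parameter are all hypothetical.

```python
from itertools import cycle


def snapshot_diff(old, new):
    """Return paths created or modified between two snapshots.

    Each snapshot is a dict mapping path -> version (an assumption of
    this sketch); deleted paths are not reported.
    """
    return sorted(p for p, v in new.items() if old.get(p) != v)


def schedule_backup(changed_paths, datanodes):
    """Assign each changed path to a datanode in round-robin order."""
    assignments = {dn: [] for dn in datanodes}
    for path, dn in zip(changed_paths, cycle(datanodes)):
        assignments[dn].append(path)
    return assignments


def sync_once(old, new, datanodes, alias_map, remote_prefix):
    """Run one cycle: diff, distribute, then record completed backups."""
    changed = snapshot_diff(old, new)
    assignments = schedule_backup(changed, datanodes)
    for dn, paths in assignments.items():
        for path in paths:
            # Stand-in for a datanode reporting that its backup finished;
            # only then is the alias map entry written.
            alias_map[path] = remote_prefix + path
    return assignments
```

Round-robin is used here only because it is the simplest policy to show; the issue text does not say how work should be balanced across Datanodes.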






[jira] [Updated] (HDFS-13777) [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.

2018-08-18 Thread Ewan Higgs (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-13777:
--
Attachment: HDFS-13777-HDFS-12090.005.patch

> [PROVIDED Phase 2] Scheduler in the NN for distributing DNA_BACKUP work.
> 
>
> Key: HDFS-13777
> URL: https://issues.apache.org/jira/browse/HDFS-13777
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ewan Higgs
>Assignee: Ewan Higgs
>Priority: Major
> Attachments: HDFS-13777-HDFS-12090.001.patch, 
> HDFS-13777-HDFS-12090.002.patch, HDFS-13777-HDFS-12090.003.patch, 
> HDFS-13777-HDFS-12090.005.patch
>
>
> When the SyncService is running, it should periodically take snapshots, make 
> a snapshotdiff, and then distribute DNA_BACKUP work to the Datanodes (See 
> HDFS-13421). Upon completion of the work, the NN should update the AliasMap.





