[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
[ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HDFS-12090:
----------------------------------
    Labels: pull-request-available  (was: )

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Virajith Jalaparti
> Priority: Major
> Labels: pull-request-available
> Attachments: External-SyncService-CreateFile.001.png, HDFS-12090-Functional-Specification.001.pdf, HDFS-12090-Functional-Specification.002.pdf, HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, HDFS-12090..patch, HDFS-12090.0001.patch
> Time Spent: 10m
> Remaining Estimate: 0h
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in
> external storage systems accessible through HDFS. However, HDFS-9806 is
> limited to data being read through HDFS. This JIRA will deal with how data
> can be written to such {{PROVIDED}} storages from HDFS.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: External-SyncService-CreateFile.001.png
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-12090.0001.patch
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-12090..patch

Attached is a work-in-progress patch for the backup case. It's not beautiful, but we'd like to share it as early as possible so that the direction we're taking is visible. It contains a test ({{TestGrandSyncServiceTask}}) that starts a client and a backup {{MiniDFSCluster}} and can back files up from the client to the backup. It's very early days, but it should show the direction.

Some of the caveats:
* This patch applies onto 928964102029e96406f5482e8900802f38164501.
* We pulled in HDFS-10285-consolidated-merge-patch-03.patch from the StoragePolicySatisfier (SPS) work. We initially used this patch with the intention of working with the SPS. However, because the SyncService works at the file level and not the block level, the SPS APIs couldn't be used directly. We can therefore rebase and remove our dependency on this patch.
* The design of the {{SyncServiceSatisfier}} and {{SyncServiceSatisfierWorker}} is based on the idea that we will have an object store on the back end, which means that all operations could be reordered. When we run HDFS mapping to HDFS, however, operations cannot be reordered!
* The rename and delete tests are flaky (they don't work). Success/failure reporting hasn't been implemented, so the tests check whether files were deleted before the work has been done on the back end.
* The snapshotting mechanism doesn't wait until the previous snapshot has been completely synchronised before looking to create more work.
* The MountManager works using paths. This will break if the local directory is moved. It should therefore accept a path but apply the sync service functionality by holding onto an INode.
* Snapshot tests appear to have been broken.
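The reordering caveat in the message above can be illustrated with a small sketch. This is not code from the patch: {{ToyNamespace}}, {{SyncOp}} and {{OrderingDemo}} are hypothetical names, and the toy namespace is only a stand-in for a path-based backup file system such as a second HDFS cluster. Under those assumptions, it shows that replaying a create/rename/create sequence in its original order succeeds, while swapping the first two operations makes the rename fail because its source path does not exist yet.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

/** Toy path-keyed namespace; a hypothetical stand-in for the backup file system. */
class ToyNamespace {
    private final TreeMap<String, String> files = new TreeMap<>();

    void create(String path, String data) {
        files.put(path, data);
    }

    void rename(String src, String dst) {
        String data = files.remove(src);
        if (data == null) {
            // A rename can only be applied once its source exists.
            throw new IllegalStateException("rename of missing path: " + src);
        }
        files.put(dst, data);
    }

    /** Returns the existing paths in sorted order. */
    List<String> paths() {
        return new ArrayList<>(files.keySet());
    }
}

/** One namespace operation to be replayed on the backup (hypothetical). */
interface SyncOp {
    void applyTo(ToyNamespace ns);
}

class OrderingDemo {
    /** The operations as they happened on the source, in order. */
    static List<SyncOp> log() {
        List<SyncOp> ops = new ArrayList<>();
        ops.add(ns -> ns.create("/a", "v1"));
        ops.add(ns -> ns.rename("/a", "/b"));
        ops.add(ns -> ns.create("/a", "v2"));
        return ops;
    }

    /** Replays the operations on an empty namespace and returns the resulting paths. */
    static List<String> replay(List<SyncOp> ops) {
        ToyNamespace ns = new ToyNamespace();
        for (SyncOp op : ops) {
            op.applyTo(ns);
        }
        return ns.paths();
    }

    /** Returns true if replaying the operations hits an operation that cannot be applied. */
    static boolean replayFails(List<SyncOp> ops) {
        try {
            replay(ops);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<SyncOp> inOrder = log();
        System.out.println("in order: " + replay(inOrder));

        // Swap the create and the rename, as a reordering scheduler might.
        List<SyncOp> swapped = new ArrayList<>(inOrder);
        Collections.swap(swapped, 0, 1);
        System.out.println("reordered replay fails: " + replayFails(swapped));
    }
}
```

A scheduler that is free to reorder work would need something extra, for example per-path dependency tracking, before it could be used against a path-based target; that is a possible direction only, not something the patch is stated to do.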
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-12090-Functional-Specification.003.pdf

Attaching a version of the specification with fewer formatting issues.
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment:     (was: HDFS-9806 Functional Specification (003).pdf)
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-9806 Functional Specification (003).pdf

Attaching version of specification which was not affected by Word's creative PDF distillation.
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-12090-Functional-Specification.002.pdf

Attaching updated version of Functional Specification with some cleanups by [~virajith].
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-12090-Functional-Specification.001.pdf

Attaching functional specification. This should be a good entry point for anyone trying to understand what we're trying to do in this project.
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment: HDFS-12090 Functional Specification.pdf

Attaching Functional Specification with a description of the expected command line and results. This should be an entry point for people new to the project.
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Ewan Higgs updated HDFS-12090:
------------------------------
    Attachment:     (was: HDFS-12090 Functional Specification.pdf)
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Virajith Jalaparti updated HDFS-12090:
--------------------------------------
    Attachment: HDFS-12090-design.001.pdf

Posting the first version of the design document on how writes/updates to {{PROVIDED}} storage will be handled. The primary use case of this feature will be data backup from HDFS to other storage systems (either object stores like S3 or a system that implements the {{org.apache.hadoop.fs.FileSystem}} API).
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Virajith Jalaparti updated HDFS-12090:
--------------------------------------
    Description: HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read through HDFS. This JIRA will deal with how data can be written to such {{PROVIDED}} storages from HDFS.  (was: HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read through HDFS. This JIRA is to keep track of how data can be written to such {{PROVIDED}} storages from HDFS.)
[jira] [Updated] (HDFS-12090) Handling writes from HDFS to Provided storages
Virajith Jalaparti updated HDFS-12090:
--------------------------------------
    Description: HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read through HDFS. This JIRA is to keep track of how data can be written to such {{PROVIDED}} storages from HDFS.  (was: HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external storage systems accessible through HDFS. This JIRA is to keep track of how data can be written to such {{PROVIDED}} storages from HDFS.)

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
> Key: HDFS-12090
> URL: https://issues.apache.org/jira/browse/HDFS-12090
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Virajith Jalaparti
> Assignee: Virajith Jalaparti
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in
> external storage systems accessible through HDFS. However, HDFS-9806 is
> limited to data being read through HDFS. This JIRA is to keep track of how
> data can be written to such {{PROVIDED}} storages from HDFS.