[jira] [Commented] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system

2017-04-05 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958353#comment-15958353
 ] 

Thomas Demoor commented on HDFS-9806:
-

We will post an updated design doc next week.

Quick status update: 
* General infrastructure, protocol changes and read path are almost done
* Write path and dynamic mounting are ongoing

> Allow HDFS block replicas to be provided by an external storage system
> --
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous 
> storage systems. The guarantees and semantics provided by these systems are 
> often similar, but not identical to those of 
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
>  Any client accessing multiple storage systems is responsible for reasoning 
> about each system independently, and must propagate and renew credentials for 
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to 
> immutable file regions, opaque IDs, or other tokens that represent a 
> consistent view of the data. While correctness for arbitrary operations 
> requires careful coordination between stores, in practice we can provide 
> workable semantics with weaker guarantees.
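
To make the mapping idea above concrete, here is a minimal, hypothetical sketch (the class name, fields, and nonce are assumptions for illustration, not the design's actual data structures) of a token that ties a block replica to an immutable region of an object in an external store:

{code:java}
import java.util.Objects;

/**
 * Hypothetical sketch: a token tying an HDFS block to an immutable region
 * of an object in an external (PROVIDED) store. The nonce (e.g. an ETag or
 * version id) captures the remote object's version so stale data can be
 * detected.
 */
public final class ProvidedBlockAlias {
  private final long blockId;      // HDFS block id
  private final String remotePath; // key/path in the external store
  private final long offset;       // start of the immutable region
  private final long length;       // length of the region
  private final String nonce;      // opaque version token (e.g. ETag)

  public ProvidedBlockAlias(long blockId, String remotePath,
                            long offset, long length, String nonce) {
    this.blockId = blockId;
    this.remotePath = Objects.requireNonNull(remotePath);
    this.offset = offset;
    this.length = length;
    this.nonce = Objects.requireNonNull(nonce);
  }

  /** True if the version observed in the external store still matches. */
  public boolean matches(String observedNonce) {
    return nonce.equals(observedNonce);
  }
}
{code}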






[jira] [Commented] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2016-11-03 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632918#comment-15632918
 ] 

Thomas Demoor commented on HDFS-11026:
--

Ewan's stack traces match [~andrew.wang]'s remarks.
[~daryn], once HDFS-11096 is resolved, we expect the current patch to work 
across 2.x and 3.0.

Thanks for looking at our patch.

> Convert BlockTokenIdentifier to use Protobuf
> 
>
> Key: HDFS-11026
> URL: https://issues.apache.org/jira/browse/HDFS-11026
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs, hdfs-client
>Affects Versions: 2.9.0, 3.0.0-alpha1
>Reporter: Ewan Higgs
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-11026.002.patch, blocktokenidentifier-protobuf.patch
>
>
> {{BlockTokenIdentifier}} currently uses a {{DataInput}}/{{DataOutput}} 
> (basically a {{byte[]}}) and manual serialization to get data into and out of 
> the encrypted buffer (in {{BlockKeyProto}}). Other TokenIdentifiers (e.g. 
> {{ContainerTokenIdentifier}}, {{AMRMTokenIdentifier}}) use Protobuf. The 
> {{BlockTokenIdentifier}} should use Protobuf as well so it can be expanded 
> more easily and will be consistent with the rest of the system.
> NB: Release of this will require a version update since 2.8.x won't be able 
> to decipher {{BlockKeyProto.keyBytes}} from 2.8.y.
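
For context, a minimal, self-contained sketch of the manual {{DataInput}}/{{DataOutput}} style the issue wants to move away from (simplified, hypothetical fields; not the actual {{BlockTokenIdentifier}} fields or Hadoop APIs). Every field must be written and read in exactly the same order, which is what makes the format hard to evolve; a Protobuf message tags each field instead and tolerates additions:

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Simplified, hypothetical token with hand-rolled serialization. */
public class ManualTokenIdentifier {
  private long expiryDate;
  private int keyId;
  private String userId = "";

  public byte[] toBytes() throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeLong(expiryDate); // field order is part of the wire format
      out.writeInt(keyId);
      out.writeUTF(userId);
    }
    return bytes.toByteArray();
  }

  public static ManualTokenIdentifier fromBytes(byte[] data) throws IOException {
    ManualTokenIdentifier id = new ManualTokenIdentifier();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      id.expiryDate = in.readLong(); // must mirror toBytes() exactly
      id.keyId = in.readInt();
      id.userId = in.readUTF();
    }
    return id;
  }
}
{code}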






[jira] [Commented] (HDFS-7343) A comprehensive and flexible storage policy engine

2016-07-26 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15393680#comment-15393680
 ] 

Thomas Demoor commented on HDFS-7343:
-

It seems to me that this (partly) overlaps with HDFS-10285, which already has a 
design doc. [~drankye], as you've been active in both tickets, do you think they 
should be linked? And which parts are exclusive to the current ticket?

> A comprehensive and flexible storage policy engine
> --
>
> Key: HDFS-7343
> URL: https://issues.apache.org/jira/browse/HDFS-7343
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Kai Zheng
>Assignee: Kai Zheng
>
> As discussed in HDFS-7285, it would be better to have a comprehensive and 
> flexible storage policy engine considering file attributes, metadata, data 
> temperature, storage type, EC codec, available hardware capabilities, 
> user/application preferences, etc.






[jira] [Commented] (HDFS-9806) Allow HDFS block replicas to be provided by an external storage system

2016-05-20 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292968#comment-15292968
 ] 

Thomas Demoor commented on HDFS-9806:
-

Thanks [~chris.douglas] for the architecture doc. Very interesting feature.
 
First, the way we interpreted the document, the external (PROVIDED) storage is 
the source of truth, so any changes there should be reflected in HDFS, and any 
inconsistencies that arise should be resolved in favour of the external store. 
With that in mind, we have some questions, mostly relating to the following two 
paragraphs in section 3.4:
{quote}
Periodically, and/or when a particular directory or file is accessed on the 
Namenode, the Namenode queries the PROVIDED store to validate its cache. If the 
ID changed since its last update, the Namenode updates the corresponding 
metadata and block information.
The Datanode is also responsible for verifying the nonce when servicing read 
requests. Without this check, it may return data that does not match the record 
in the Namenode (e.g., if another file is renamed onto the same path in the 
external store).
{quote}
Questions:
# If the Namenode accesses the PROVIDED storage to update its mapping, shouldn't it also update the nonce data at the same time and instruct the Datanode to refresh too? Or is the intention for the Namenode to only update the directory information and not the actual nonce data for the files? (If so, how could the Namenode apply heuristics to detect "promoting output to a parent directory"?)
# How should this work in the face of Storage Policies? For example, with a StoragePolicy of {SSD, DISK, PROVIDED} it seems to us that it would make sense for the Namenode to use a HEAD request (or equivalent) to check whether the data is still valid; if so, it tells the client to talk to the Datanode with the replica on SSD, otherwise the data needs to be refreshed across all three Datanodes (see the sketch after this list). As the Namenode currently manages replication requests, it seems natural for it to also trigger the requests that refresh data from the PROVIDED storage system.
# When you say "periodically and/or when a particular directory or file is accessed on the Namenode", do you mean this is something to be configured, or just that it hasn't been decided whether both are required? We think periodic validation is required, since it is the only way to clean up directory listings containing files that have been removed from the PROVIDED storage. On access, it makes sense to always make a HEAD request (or equivalent) to ensure the entry isn't stale.
# Finally, do you anticipate changes to the wire protocol between the Namenode 
and Datanode?
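
To make question 2 concrete, here is a minimal sketch of the HEAD-style validation we have in mind; the {{ExternalStore}} interface and {{headNonce}} method are our own hypothetical names, not part of the design doc:

{code:java}
/**
 * Hypothetical view of the PROVIDED store: a HEAD-like call that returns
 * only the current version token (e.g. ETag) of a remote object.
 */
interface ExternalStore {
  String headNonce(String remotePath);
}

/** Sketch of the staleness check discussed in question 2. */
class ProvidedValidation {
  /**
   * Returns true if the nonce recorded by the Namenode still matches the
   * external store, i.e. local replicas (SSD/DISK) can be served as-is;
   * false means the data must be refreshed from the PROVIDED storage.
   */
  static boolean stillValid(ExternalStore store, String remotePath,
                            String recordedNonce) {
    String current = store.headNonce(remotePath); // HEAD request or equivalent
    return recordedNonce.equals(current);
  }
}
{code}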


> Allow HDFS block replicas to be provided by an external storage system
> --
>
> Key: HDFS-9806
> URL: https://issues.apache.org/jira/browse/HDFS-9806
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Chris Douglas
> Attachments: HDFS-9806-design.001.pdf
>
>
> In addition to heterogeneous media, many applications work with heterogeneous 
> storage systems. The guarantees and semantics provided by these systems are 
> often similar, but not identical to those of 
> [HDFS|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/filesystem/index.html].
>  Any client accessing multiple storage systems is responsible for reasoning 
> about each system independently, and must propagate and renew credentials for 
> each store.
> Remote stores could be mounted under HDFS. Block locations could be mapped to 
> immutable file regions, opaque IDs, or other tokens that represent a 
> consistent view of the data. While correctness for arbitrary operations 
> requires careful coordination between stores, in practice we can provide 
> workable semantics with weaker guarantees.






[jira] [Commented] (HDFS-7240) Object store in HDFS

2015-07-06 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14614705#comment-14614705
 ] 

Thomas Demoor commented on HDFS-7240:
-

[~john.jian.fang] and [~jnp]: 
* Rename is avoided in [HADOOP-9565] by introducing ObjectStore (which extends 
FileSystem) and letting FileOutputCommitter, the Hadoop CLI, ... act on it 
accordingly, i.e. skip the rename (see the sketch after this list). Ozone could 
easily extend ObjectStore and benefit from this.
* [HADOOP-11262] extends DelegateToFileSystem to implement s3a as an 
AbstractFileSystem and works around issues such as modification times for 
directories (cf. Azure).
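
A minimal sketch of the rename-avoidance idea in the first bullet, assuming a marker interface named ObjectStore; the actual HADOOP-9565 API and committer integration may differ:

{code:java}
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/** Hypothetical marker interface; the real HADOOP-9565 API may differ. */
interface ObjectStore {
  // Marker: rename on object stores is copy + delete, so committers avoid it.
}

class CommitPathExample {
  /**
   * Sketch of how a committer could act on the marker: write directly to the
   * final output path on an object store, and keep the temporary-dir + rename
   * protocol only for filesystems with cheap atomic rename.
   */
  static Path outputPathFor(FileSystem fs, Path finalPath, Path tempPath) {
    return (fs instanceof ObjectStore) ? finalPath : tempPath;
  }
}
{code}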

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.





[jira] [Commented] (HDFS-7240) Object store in HDFS

2015-06-04 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573635#comment-14573635
 ] 

Thomas Demoor commented on HDFS-7240:
-

Very interesting call yesterday. Might be interesting to have a group 
discussion at Hadoop Summit next week?

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.





[jira] [Commented] (HDFS-7240) Object store in HDFS

2015-06-01 Thread Thomas Demoor (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567258#comment-14567258
 ] 

Thomas Demoor commented on HDFS-7240:
-

Maybe some of the (ongoing) work for the currently supported object stores can 
be reused here (e.g. HADOOP-9565)? I will probably call in.

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
> Attachments: Ozone-architecture-v1.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.


