Oops, retrying now subscribed to more than solely yarn-dev.
-Clay
On Wed, 28 Feb 2018, Clay B. wrote:
+1 (non-binding)
I have walked through the code and find it very compelling as a
user; I really look forward to seeing the Ozone code and HDFS
features mature together. The points which excite me
as an eight-year HDFS user are:
* Excitement for making the datanode a storage technology
  container - this patch clearly brings fresh thought to HDFS
  keeping it from growing stale
* Ability to build upon a shared storage infrastructure for
  diverse loads: I do not want to have "stranded" storage
  capacity or have to manage competing storage systems on the
  same disks (and further I want the metrics datanodes can
  provide me today, so I do not have to instrument two systems
  or evolve their instrumentation separately).
* Looking forward to supporting object-sized files!
* Moves HDFS in the right direction to test out new block
  management techniques for scaling HDFS. I am really excited
  to see the raft integration; I hope it opens a new era in
  Hadoop matching modern systems design with new consistency
  and replication options in our ever distributed ecosystem.
-Clay
On Mon, 26 Feb 2018, Jitendra Pandey wrote:
Dear folks,
We would like to start a vote to merge HDFS-7240
branch into trunk. The context can be reviewed in the
DISCUSSION thread, and in the jiras (See references below).
HDFS-7240 introduces Hadoop Distributed Storage Layer
(HDSL), which is a distributed, replicated block layer.
The old HDFS namespace and NN can be connected to this new
block layer as we have described in HDFS-10419.
We also introduce a key-value namespace called Ozone built
on HDSL.
The code is in a separate module and is turned off by
default. In a secure setup, HDSL and Ozone daemons cannot be
started.
The detailed documentation is available at
https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Distributed+Storage+Layer+and+Applications
I will start with my vote.
+1 (binding)
Discussion Thread:
https://s.apache.org/7240-merge
https://s.apache.org/4sfU
Jiras:
https://issues.apache.org/jira/browse/HDFS-7240
https://issues.apache.org/jira/browse/HDFS-10419
https://issues.apache.org/jira/browse/HDFS-13074
https://issues.apache.org/jira/browse/HDFS-13180
Thanks
jitendra
DISCUSSION THREAD SUMMARY:
On 2/13/18, 6:28 PM, "sanjay Radia"
<sanjayo...@gmail.com> wrote:
Sorry, the formatting got messed up by my email
client. Here it is again.
Dear Hadoop Community Members,
We had multiple community discussions, a few
meetings in smaller groups, and also jira discussions with
respect to this thread. We express our gratitude for your
participation and valuable comments.
The key questions raised were the following:
1) How do the new block storage layer and OzoneFS
benefit HDFS? We were also asked to chalk out a roadmap towards
the goal of a scalable namenode working with the new storage
layer.
2) We were asked to provide a security design.
3) There were questions around stability, given that
ozone brings in a large body of code.
4) Why can't they be separate projects forever,
or be merged in when production ready?
We have responded to all the above questions
with detailed explanations and answers on the jira as well as
in the discussions. We believe that should sufficiently
address the community's concerns.
Please see the summary below:
1) The new code base benefits HDFS scaling and
a roadmap has been provided.
Summary:
- New block storage layer addresses the
scalability of the block layer. We have shown how the existing NN
can be connected to the new block layer and its benefits. We
have shown 2 milestones; the 1st milestone is much simpler than
the 2nd milestone while giving almost the same scaling benefits.
Originally we had proposed only milestone 2, and the
community felt that removing the FSN/BM lock was a fair
amount of work and a simpler solution would be useful.
- We provide a new K-V namespace called Ozone
FS with FileSystem/FileContext plugins to allow the users to
use the new system. BTW Hive and Spark work very well on
KV-namespaces on the cloud. This will facilitate stabilizing
the new block layer.
- The new block layer has a new netty based
protocol engine in the Datanode which, when stabilized, can be
used by the old hdfs block layer. See details below on
sharing of code.
2) Stability impact on the existing HDFS code
base and code separation. The new block layer and the OzoneFS
are in modules that are separate from old HDFS code -
currently there are no calls from HDFS into Ozone except for
DN starting the new block layer module if configured to do
so. It does not add instability (the instability argument has
been raised many times). Over time, as we share code, we will
ensure that the old HDFS continues to remain stable (for
example, we plan to stabilize the new netty based protocol
engine in the new block layer before sharing it with HDFS's
old block layer).
3) In the short term and medium term, the new
system and HDFS will be used side-by-side by users:
side-by-side usage in the short term for testing and side-by-side
in the medium term for actual production use till the new
system has feature parity with old HDFS. During this time,
sharing the DN daemon and admin functions between the two
systems is operationally important:
- Sharing DN daemon to avoid additional
operational daemon lifecycle management
- Common decommissioning of the daemon and
DN: One place to decommission for a node and its storage.
- Replacing failed disks and internal
balancing capacity across disks - this needs to be done for
both the current HDFS blocks and the new block-layer blocks.
- Balancer: we would like to use the same
balancer and provide a common way to balance and common
management of the bandwidth used for balancing.
- Security configuration setup - reuse the
existing setup for DNs rather than a new one for an
independent cluster.
4) Need to easily share the block layer code
between the two systems when used side-by-side. Areas where
sharing code is desired over time:
- Sharing the new block layer's new netty based
protocol engine for old HDFS DNs (a long-standing sore point for
the HDFS block layer).
- Shallow data copy from the old system to the new
system is practical only if within the same project and daemon;
otherwise we have to deal with security settings and coordination
across daemons. Shallow copy is useful as customers migrate
from old to new.
- Shared disk scheduling in the future and in
the short term have a single round robin rather than
independent round robins.
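To illustrate the last point, here is a minimal sketch (not taken from
the HDFS-7240 branch) of why a single round robin differs from
independent ones. The class name RoundRobinVolumeChooser, the disk
names, and the two "hdfs"/"new_layer" callers are hypothetical and used
only for illustration.

```python
from itertools import cycle


class RoundRobinVolumeChooser:
    """Hypothetical chooser that hands out disks in round-robin order."""

    def __init__(self, volumes):
        self._cycle = cycle(list(volumes))

    def choose(self):
        # Return the next disk in the rotation.
        return next(self._cycle)


volumes = ["disk0", "disk1", "disk2"]

# Independent round robins: each block layer keeps its own cursor,
# so both start at disk0 and their writes land on the same disk.
hdfs = RoundRobinVolumeChooser(volumes)
new_layer = RoundRobinVolumeChooser(volumes)
independent = [hdfs.choose(), new_layer.choose(),
               hdfs.choose(), new_layer.choose()]

# Single shared round robin: one cursor for both block layers,
# so consecutive writes are spread across disks regardless of caller.
shared = RoundRobinVolumeChooser(volumes)
shared_picks = [shared.choose() for _ in range(4)]

print(independent)   # ['disk0', 'disk0', 'disk1', 'disk1']
print(shared_picks)  # ['disk0', 'disk1', 'disk2', 'disk0']
```

With independent cursors the two layers can pile onto the same disk at
the same time; a shared cursor is only possible when both layers run in
the same daemon, which is the point being made above.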
While sharing code across projects is
technically possible (anything is possible in software), it
is significantly harder, typically requiring cleaner public
APIs etc. Sharing within a project through internal APIs is
often simpler (such as the protocol engine that we want to
share).
5) Security design, including a threat model
and the solution, has been posted.
6) Temporary Separation and merge later:
Several of the comments in the jira have argued that we
temporarily separate the two code bases for now and then later
merge them when the new code is stable:
- If there is agreement to merge later, why
bother separating now - there need to be good reasons
to separate now. We have addressed the stability and
separation of the new code from the existing code above.
- Merging the new code back into HDFS later
will be harder.
** The code and goals will diverge further.
** We will be taking on extra work to split
and then take extra work to merge.
** The issues raised today will be raised
all the same then.
---------------------------------------------------------------------
To unsubscribe, e-mail:
hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail:
hdfs-dev-h...@hadoop.apache.org