Hi, Wei-Chiu -
I don't know if this is something already in the pipeline for 3.x, but
I'd like to see a mechanism in HDFS that encrypts blocks pre-storage
such that I'd only have to manage keys in one place (NameManager?). If
that capability existed, then I could move blocks around an unsafe
network and/or not have to worry about my worker nodes having
volume-level or whole-disk-level encryption. Even if I have Hadoop
traffic only crossing a LAN that's captive to the cluster, I might still
have to worry about worker nodes being stolen outright or having the
drive(s) taken out of them.
- Jeff
On 6/10/19 8:40 PM, Wei-Chiu Chuang wrote:
Thank you Sudeep for the feedback,
To be more specific, what sort of examples are you looking for?
On another note, I have written some lengthy docs about the
Hadoop code base and internal designs. I should probably make those
public to share the knowledge (or fix my grammar errors, for that matter).
On Mon, Jun 10, 2019 at 12:11 PM Sudeep Singh Thakur
<sudeepthaku...@gmail.com> wrote:
Hi,
Examples are most helpful for developers. Please add as many
examples as we can.
Thanks
Sudeep Thakur
On Mon, Jun 10, 2019, 10:38 PM Wei-Chiu Chuang
<weic...@cloudera.com.invalid> wrote:
Hi!
I am soliciting feedback on HDFS roadmap items and wish-list
items for future Hadoop releases. A community meetup
<https://www.meetup.com/Hadoop-Contributors/events/262055924/?rv=ea1_v2&_xtd=gatlbWFpbF9jbGlja9oAJGJiNTE1ODdkLTY0MDAtNDFiZS1iOTU5LTM5ZWYyMDU1N2Q4Nw>
is happening soon, and perhaps we can use this thread to
converge on things we should talk about there.
I am aware of several major features that have been merged into
trunk, such as RBF and Consistent Standby Serving Reads, as well
as some recent features that were merged into the 3.2.0 release
(storage policy satisfier).
What else should we be doing? I have a laundry list of
supportability improvement projects, mostly about improving
performance or making performance problems easier to diagnose.
I can share the list if folks are interested.
Are there things we should do to make developers' lives easier,
or things that would be nice to have for downstream
applications? I know Sahil Takiar made a series of
improvements in HDFS for Impala recently, and those
improvements are applicable to other downstream projects such as
HBase. Or would it help if we provided more Hadoop API examples?