Re: [DISCUSS] making Ozone a separate Apache project

2020-05-13 Thread anu engineer
+1
—Anu

> On May 13, 2020, at 12:53 AM, Elek, Marton  wrote:
> 
> 
> 
> I would like to start a discussion to make a separate Apache project for Ozone
> 
> 
> 
> ### HISTORY [1]
> 
> * Apache Hadoop Ozone development started on a feature branch of the Hadoop
> repository (HDFS-7240)
> 
> * In October 2017 a discussion was started about merging it into the
> Hadoop main branch
> 
> * After a long discussion it was merged into Hadoop trunk in March 2018
> 
> * During the discussion of the merge, it was suggested multiple times to
> create a separate project for Ozone. But at that time:
>1). Ozone was tightly integrated with Hadoop/HDFS
>2). There was an active plan to use the block layer of Ozone (HDDS or HDSL at
> that time) as the block layer of HDFS
>3). The community of Ozone was a subset of the HDFS community
> 
> * The first beta release of Ozone has just shipped. This seems to be a good
> time, before the first GA, to make a decision about the future.
> 
> 
> 
> ### WHAT HAS BEEN CHANGED
> 
> Over the last few years Ozone has become more and more independent, on both
> the community and the code side. The separation has been suggested again and
> again (for example by Owen [2] and Vinod [3])
> 
> 
> 
> From a COMMUNITY point of view:
> 
> 
>  * Fortunately, more and more new contributors are helping Ozone. Originally
> the Ozone community was a subset of the HDFS project, but a growing part of
> the community now works on Ozone only.
> 
>  * It seems to be easier to _build_ the community as a separate project.
> 
>  * A new, younger project might have different practices (communication,
> committer criteria, development style) compared to an old, mature project
> 
>  * It's easier to communicate (and improve) these standards in a separate
> project with clean boundaries
> 
>  * A separate project/brand can help increase the adoption rate and attract
> more individual contributors (AFAIK this has been seen in Submarine after a
> similar move)
> 
> * The contribution process can be communicated more easily, and we can make
> first-time contributions easier
> 
> 
> 
> From a CODE point of view, Ozone has become more and more independent:
> 
> 
> * Ozone has a different release cycle
> 
> * The code is already separated from the Hadoop code base (apache/hadoop-ozone.git)
> 
> * It has separate CI (GitHub Actions)
> 
> * Ozone uses a different (stricter) coding style (zero tolerance for unit
> test / checkstyle errors)
> 
> * The code itself has become more and more independent from Hadoop at the
> Maven level. Originally it was compiled together with the latest in-tree
> Hadoop snapshot; now it depends on released Hadoop artifacts (RPC, Configuration...)
> 
> * It has started to use multiple versions of Hadoop (on the client side)
> 
> * The volume of resolved issues is already very high on the Ozone side (Ozone
> had slightly more resolved issues than HDFS/YARN/MAPREDUCE/COMMON all together
> in the last 2-3 months)
> 
> 
> Summary: before the first Ozone GA release, it seems to be a good time to
> discuss the long-term future of Ozone. Managing it as a separate TLP
> seems to have more benefits.
> 
> 
> Please let me know what your opinion is...
> 
> Thanks a lot,
> Marton
> 
> 
> 
> 
> 
> [1]: For more details, see: 
> https://github.com/apache/hadoop-ozone/blob/master/HISTORY.md
> 
> [2]: 
> https://lists.apache.org/thread.html/0d0253f6e5fa4f609bd9b917df8e1e4d8848e2b7fdb3099b730095e6%40%3Cprivate.hadoop.apache.org%3E
> 
> [3]: 
> https://lists.apache.org/thread.html/8be74421ea495a62e159f2b15d74627c63ea1f67a2464fa02c85d4aa%40%3Chdfs-dev.hadoop.apache.org%3E
> 
> -
> To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org
> 

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [DISCUSS] Feature branch for HDFS-14978 In-place Erasure Coding Conversion

2020-01-23 Thread Anu Engineer
+1



> On Jan 23, 2020, at 2:51 PM, Jitendra Pandey  
> wrote:
> 
> +1 for the feature branch.
> 
>> On Thu, Jan 23, 2020 at 1:34 PM Wei-Chiu Chuang
>>  wrote:
>> 
>> Hi, we are working on a feature to improve Erasure Coding, and I would like
>> to seek your opinion on creating a feature branch for it (HDFS-14978).
>> 
>> Reasons for a feature branch:
>> (1) It turns out we need to update the NameNode layout version.
>> (2) It's a medium-sized project and we want to get this feature merged in
>> its entirety.
>> 
>> Aravindan Vijayan and I are planning to work on this feature.
>> 
>> Thoughts?
>> 

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [DISCUSS] Remove Ozone and Submarine from Hadoop repo

2019-10-28 Thread Anu Engineer
@Vinod Kumar Vavilapalli 
  Do we need a separate vote thread for this? There are already JIRAs in
place for Ozone code removal, and I gather it is the same for Submarine.
Would it be possible to treat this thread as consensus and act upon the
JIRA itself?

Thanks
Anu


On Sun, Oct 27, 2019 at 6:58 PM 俊平堵  wrote:

> +1.
>
> Thanks,
>
> Junping
>
> Akira Ajisaka  于2019年10月24日周四 下午3:21写道:
>
> > Hi folks,
> >
> > Both Ozone and Apache Submarine have separate repositories.
> > Can we remove these modules from hadoop-trunk?
> >
> > Regards,
> > Akira
> >
>


Re: [DISCUSS] Remove Ozone and Submarine from Hadoop repo

2019-10-24 Thread Anu Engineer
+1 for Ozone. We are in our own repo now. It would be good to remove this
code from Hadoop, otherwise it will confuse new contributors.
I would like to add a git tag to Hadoop, so that people have the ability
to sync back and see the code evolution.
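For the tag, something along these lines should work (the tag name here is
illustrative, not decided):

    # hypothetical: mark the last commit before the Ozone code removal
    git tag -a ozone-removed-from-hadoop -m "Ozone moved to apache/hadoop-ozone.git"
    git push origin ozone-removed-from-hadoop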

--Anu

On Thu, Oct 24, 2019 at 4:03 PM Giovanni Matteo Fumarola <
giovanni.fumar...@gmail.com> wrote:

> +1
>
> Thanks Wei-Chiu for creating HADOOP-16670.
>
> On Thu, Oct 24, 2019 at 12:56 PM Wei-Chiu Chuang 
> wrote:
>
> > +1 filed HADOOP-16670 <
> https://issues.apache.org/jira/browse/HADOOP-16670>
> > for
> > stripping the Submarine code.
> >
> > On Thu, Oct 24, 2019 at 12:14 PM Subru Krishnan 
> wrote:
> >
> > > +1.
> > >
> > > Thanks,
> > > Subru
> > >
> > > On Thu, Oct 24, 2019 at 12:51 AM 张铎(Duo Zhang) 
> > > wrote:
> > >
> > > > +1
> > > >
> > > > Akira Ajisaka  于2019年10月24日周四 下午3:21写道:
> > > >
> > > > > Hi folks,
> > > > >
> > > > > Both Ozone and Apache Submarine have separate repositories.
> > > > > Can we remove these modules from hadoop-trunk?
> > > > >
> > > > > Regards,
> > > > > Akira
> > > > >
> > > >
> > >
> >
>


Re: [VOTE] Release Apache Hadoop Ozone 0.4.1-alpha

2019-10-12 Thread Anu Engineer
+1, Binding.

Verified the KEYS
Built from sources and ran tests:
   - General Ozone command line tests
   - Applications like MR and YARN.
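For anyone repeating the signature verification, it amounts to roughly the
following (artifact names are illustrative; the KEYS file is linked in the RC
announcement below):

    gpg --import KEYS
    gpg --verify hadoop-ozone-0.4.1-alpha.tar.gz.asc hadoop-ozone-0.4.1-alpha.tar.gz
    # compare this digest against the published .sha512 file, if provided
    shasum -a 512 hadoop-ozone-0.4.1-alpha.tar.gz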

--Anu


On Sat, Oct 12, 2019 at 10:25 AM Xiaoyu Yao 
wrote:

> +1 binding. Verified
> * Verify the signature.
> * Build from source.
> * Deploy docker compose in secure mode and verify ACL, sample MR jobs
>
> Thanks,
> Xiaoyu
>
> On Fri, Oct 11, 2019 at 5:37 PM Hanisha Koneru
> 
> wrote:
>
> > Thank you Nanda for putting up the RC.
> >
> > +1 binding.
> >
> > Verified the following:
> >   - Built from source
> >   - Deployed to 5 node cluster and ran smoke tests.
> >   - Ran sanity checks
> >
> > Thanks
> > Hanisha
> >
> > > On Oct 4, 2019, at 10:42 AM, Nanda kumar  wrote:
> > >
> > > Hi Folks,
> > >
> > > I have put together RC0 for Apache Hadoop Ozone 0.4.1-alpha.
> > >
> > > The artifacts are at:
> > > https://home.apache.org/~nanda/ozone/release/0.4.1/RC0/
> > >
> > > The maven artifacts are staged at:
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1238/
> > >
> > > The RC tag in git is at:
> > > https://github.com/apache/hadoop/tree/ozone-0.4.1-alpha-RC0
> > >
> > > And the public key used for signing the artifacts can be found at:
> > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > >
> > > This release contains 363 fixes/improvements [1].
> > > Thanks to everyone who put in the effort to make this happen.
> > >
> > > *The vote will run for 7 days, ending on October 11th at 11:59 pm IST.*
> > > Note: This release is alpha quality; it’s not recommended for use in
> > > production, but we believe that it’s stable enough to try out the
> > > feature set and collect feedback.
> > >
> > >
> > > [1] https://s.apache.org/yfudc
> > >
> > > Thanks,
> > > Team Ozone
> >
> >
>


Re: [DISCUSS] Separate Hadoop Core trunk and Hadoop Ozone trunk source tree

2019-09-17 Thread Anu Engineer
+1
—Anu

> On Sep 17, 2019, at 2:49 AM, Elek, Marton  wrote:
> 
> 
> 
> TLDR; I propose to move Ozone related code out from Hadoop trunk and store it 
> in a separated *Hadoop* git repository apache/hadoop-ozone.git
> 
> 
> 
> 
> When Ozone was adopted as a new Hadoop subproject it was proposed[1] to be
> part of the source tree but with a separate release cadence, mainly because
> it had hadoop-trunk/SNAPSHOT as a compile-time dependency.
> 
> Over the last few Ozone releases this dependency has been removed to provide
> more stable releases. Instead of using the latest trunk/SNAPSHOT build of
> Hadoop, Ozone uses the latest stable Hadoop (3.2.0 as of now).
> 
> As we no longer have a strict dependency between Hadoop trunk SNAPSHOT and
> Ozone trunk, I propose to separate the two code bases from each other by
> creating a new Hadoop git repository (apache/hadoop-ozone.git):
> 
> By moving Ozone to a separate git repository:
> 
> * It would be easier to contribute and understand the build (as of now we 
> always need `-f pom.ozone.xml` as a Maven parameter)
> * It would be possible to adjust build process without breaking Hadoop/Ozone 
> builds.
> * It would be possible to use a different README/.asf.yaml/GitHub template
> for Hadoop Ozone and core Hadoop. (For example, the current GitHub template
> [2] has a link to the contribution guideline [3]. Ozone has an extended
> version [4] of this guideline with additional information.)
> * Testing would be safer, as it won't be possible to change core Hadoop
> and Hadoop Ozone in the same patch.
> * It would be easier to cut branches for Hadoop releases (based on the
> original consensus, Ozone should be removed from all the release branches
> after creating release branches from trunk)
> 
> 
> What do you think?
> 
> Thanks,
> Marton
> 
> [1]: 
> https://lists.apache.org/thread.html/c85e5263dcc0ca1d13cbbe3bcfb53236784a39111b8c353f60582eb4@%3Chdfs-dev.hadoop.apache.org%3E
> [2]: 
> https://github.com/apache/hadoop/blob/trunk/.github/pull_request_template.md
> [3]: https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
> [4]: 
> https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute+to+Ozone
> 
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> 

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [DISCUSS] ARM/aarch64 support for Hadoop

2019-09-05 Thread Anu Engineer
ng
> >> > for the jobs but I did a little search, according to:
> >> >
> >> >
> >>
> https://packages.ubuntu.com/search?keywords=protobuf-compiler=names
> >> > &
> >> >
> >> >
> >>
> https://packages.ubuntu.com/search?suite=default=all=any=libprotoc-dev=names
> >> > it both said that the version of libprotc-dev and protobuf-compiler
> >> > available for ubuntu 18.04 is 3.0.0
> >> >
> >> >
> >> > On Wed, Sep 4, 2019 at 4:39 PM Ayush Saxena 
> wrote:
> >> >
> >> >> Thanx Vinay for the initiative, Makes sense to add support for
> >> different
> >> >> architectures.
> >> >>
> >> >> +1, for the branch idea.
> >> >> Good Luck!!!
> >> >>
> >> >> -Ayush
> >> >>
> >> >> > On 03-Sep-2019, at 6:19 AM, 张铎(Duo Zhang) 
> >> >> wrote:
> >> >> >
> >> >> > For HBase, we purged all the protobuf related things from the
> public
> >> >> API,
> >> >> > and then upgraded to a shaded and relocated version of protobuf. We
> >> have
> >> >> > created a repo for this:
> >> >> >
> >> >> > https://github.com/apache/hbase-thirdparty
> >> >> >
> >> >> > But since the hadoop dependencies still pull in the protobuf 2.5
> >> jars,
> >> >> our
> >> >> > coprocessors are still on protobuf 2.5. Recently we have opened a
> >> >> discuss
> >> >> > on how to deal with the upgrading of coprocessor. Glad to see that
> >> the
> >> >> > hadoop community is also willing to solve the problem.
> >> >> >

Re: [DISCUSS] ARM/aarch64 support for Hadoop

2019-09-02 Thread Anu Engineer
+1 for the branch idea. Just FYI, your biggest problem is proving that
Hadoop and the downstream projects work correctly after you upgrade core
components like Protobuf.
So while branching and working on a branch is easy, merging back after you
upgrade some of these core components is insanely hard. You might want to
make sure that community buys into upgrading these components in the trunk.
That way we will get testing and downstream components will notice when
things break.

That said, I have lobbied for the upgrade of Protobuf for a really long
time; I have argued that 2.5 is out of support and we cannot stay on that
branch forever; or we need to take ownership of the Protobuf 2.5 code base.
It has been rightly pointed out to me that while all the arguments I make are
correct, it is a very complicated task to upgrade Protobuf, and the worst
part is we will not even know what breaks until downstream projects pick up
these changes and work against us.

If we work off Hadoop version 3 — and assume that we have "shading" in
place for all deployments — it might be possible to get there; still a
daunting task.

So best of luck with the branch approach — but please remember, merging
back will be hard. Just my 2 cents.
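As a quick sanity check, a downstream project can see which protobuf artifacts
it actually pulls in transitively from Hadoop with something like:

    # run in the downstream project's source tree
    mvn dependency:tree -Dincludes=com.google.protobuf:protobuf-java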

— Anu




On Sun, Sep 1, 2019 at 7:40 PM Zhenyu Zheng 
wrote:

> Hi,
>
> Thanks Vinay for bringing this up, and thanks Sheng for the idea. A separate
> branch with its own ARM CI seems like a really good idea.
> By doing this we won't break any of the ongoing development in trunk, and
> a CI can be a very good way to show what the current problems are and what
> has been fixed; it will also provide a very good view for contributors who
> are interested in working on this. We can finally merge the branch back to
> trunk once the community thinks it is good enough and stable enough. We can
> donate ARM machines to the existing CI system for the job.
>
> I wonder if this approach is possible?
>
> BR,
>
> On Thu, Aug 29, 2019 at 11:29 AM Sheng Liu  wrote:
>
> > Hi,
> >
> > Thanks Vinay for bringing this up. I am a member of the "OpenLab" community
> > mentioned by Vinay. I am working on building and testing Hadoop components
> > on aarch64 servers these days. Besides the missing ARM platform
> > dependencies (issues #1 #2 #3) mentioned by Vinay, other similar issues
> > have also been found, such as the "PhantomJS" dependency package also
> > missing for aarch64.
> >
> > To promote ARM support for Hadoop, we have discussed and hope to add
> > an ARM-specific CI to the Hadoop repo. We are not sure whether there is
> > any potential effect or conflict on the trunk branch, so maybe creating
> > an ARM-specific branch for doing this work is a better choice. What do
> > you think?
> >
> > Hope to hear thoughts from you :)
> >
> > BR,
> > Liu sheng
> >
> > Vinayakumar B  于2019年8月27日周二 上午5:34写道:
> >
> > > Hi Folks,
> > >
> > > ARM has lately become well known for its processing capability and has
> > > the potential to run big-data workloads.
> > > Many users have been moving to ARM machines due to their low cost.
> > >
> > > In the past there were attempts to compile Hadoop on ARM (Raspberry Pi)
> > > for experimental purposes. Today the ARM architecture is taking on some
> > > of the server-side processing as well, so there is (or will be) a real
> > > need for Hadoop to support the ARM architecture.
> > >
> > > There are a bunch of users who are trying to build Hadoop on ARM,
> > > trying to add ARM CI to Hadoop, and facing issues [1]. Also some
> > >
> > > As of today, Hadoop does not compile on ARM due to the issues below,
> > > found from testing done in OpenLab [2].
> > >
> > > 1. Protobuf :
> > > ---
> > >  The Hadoop project (and some downstream projects) has stuck to protobuf
> > > 2.5.0 for backward-compatibility reasons. Protobuf 2.5.0 is no longer
> > > maintained by the community, while protobuf 3.x is being widely and
> > > actively adopted; protobuf 3.x still provides wire compatibility for
> > > proto2 messages. Because of some compilation issues in the generated
> > > Java code, which can induce problems downstream, the protobuf upgrade
> > > from 2.5.0 was not taken up.
> > > From 3.0.0 onwards, Hadoop supports shading of libraries to avoid
> > > classpath problems in downstream projects.
> > > There are patches available to fix the compilation in Hadoop, but we
> > > need to find a way to upgrade protobuf to the latest version and still
> > > protect downstream classpaths using the shading feature of the Hadoop
> > > build.
> > >
> > >  There is a JIRA for the protobuf upgrade [3], created even before shade
> > > support was added to Hadoop. Now we need to revisit the JIRA and continue
> > > exploring possibilities.
> > >
> > > 2. leveldbjni:
> > > ---
> > > The current leveldbjni used in YARN does not support the ARM
> > > architecture; we need to check whether any future version supports ARM
> > > and whether Hadoop can upgrade to that version.
> > 

Re: [DISCUSS] A unified and open Hadoop community sync up schedule?

2019-06-11 Thread Anu Engineer
For Ozone, we have started using the wiki itself as the agenda, and after
the meeting is over we convert it into the meeting notes.
Here is an example; the project owner can edit and maintain it, it is about
10 minutes of work, and it allows anyone to add items to the agenda too.

https://cwiki.apache.org/confluence/display/HADOOP/2019-06-10+Meeting+notes

--Anu

On Tue, Jun 11, 2019 at 10:20 AM Yufei Gu  wrote:

> +1 for this idea. Thanks Wangda for bringing this up.
>
> Some comments to share:
>
>- The agenda needs to be posted ahead of the meeting, and any interested
>party should be welcome to contribute topics.
>- We should encourage more people to attend. That's the whole point of the
>meeting.
>- Hopefully this can mitigate the situation where some patches wait for
>review forever, which turns away new contributors.
>- 30 minutes per session sounds a little short; we can try it out and see
>if an extension is needed.
>
> Best,
>
> Yufei
>
> `This is not a contribution`
>
>
> On Fri, Jun 7, 2019 at 4:39 PM Wangda Tan  wrote:
>
> > Hi Hadoop-devs,
> >
> > Previously we had a regular YARN community sync-up (1 hr, biweekly, but not
> > open to the public). Recently, because of changes in our schedules, fewer
> > folks have shown up at the sync-up over the last several months.
> >
> > I saw the K8s community did a pretty good job of running their SIG
> > meetings; there are regular meetings for different topics, with notes,
> > agendas, etc., such as
> >
> >
> https://docs.google.com/document/d/13mwye7nvrmV11q9_Eg77z-1w3X7Q1GTbslpml4J7F3A/edit
> >
> >
> > For the Hadoop community, there are fewer such regular meetings open to
> > the public, except for the Ozone project and offline meetups or
> > Birds-of-a-Feather sessions at Hadoop/DataWorks Summit. Recently a few
> > folks joined DataWorks Summit at Washington DC and Barcelona, and lots
> > (50+) of folks joined the Ozone/Hadoop/YARN BoFs, asking (good) questions
> > about the roadmaps. I think it is important to open such conversations to
> > the public and let more folks/companies join.
> >
> > A small group of community members discussed and wrote a short proposal
> > about the form, time, and topics of the community sync-ups; thanks to
> > everybody who contributed to the proposal! Please feel free to add
> > your thoughts to the Proposal Google doc
> > <
> >
> https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
> > >
> > .
> >
> > Especially for the following parts:
> > - If you are interested in running any of the community sync-ups, please
> > add your name to the table inside the proposal. We need more volunteers
> > to help run the sync-ups in different timezones.
> > - Please add suggestions on the time, frequency, and themes, and feel
> > free to share your thoughts on whether we should do sync-ups for other
> > topics not covered by the proposal.
> >
> > Link to the Proposal Google doc
> > <
> >
> https://docs.google.com/document/d/1GfNpYKhNUERAEH7m3yx6OfleoF3MqoQk3nJ7xqHD9nY/edit#
> > >
> >
> > Thanks,
> > Wangda Tan
> >
>


Re: VOTE: Hadoop Ozone 0.4.0-alpha RC2

2019-05-07 Thread Anu Engineer
+1 (Binding)

-- Built from sources.
-- Ran smoke tests and verified them.

--Anu


On Sun, May 5, 2019 at 8:05 PM Xiaoyu Yao  wrote:

> +1 Binding. Thanks all who contributed to the release.
>
> + Download sources and verify signature.
> + Build from source and ran docker-based ad-hot security tests.
> ++ From 1 datanode scale to 3 datanodes, verify certificates were
> correctly issued when security enabled
> ++ Smoke test for both non-secure and secure mode.
> ++ Put/Get/Delete/Rename Key with
> +++ Kerberos testing
> +++ Delegation token testing with DTUtil CLI and MR jobs.
> +++ S3 token.
>
> I just have one minor question about the expanded source code, which points to
> hadoop-3.3.0-SNAPSHOT-src-with-hdds/hadoop-ozone. But in
> hadoop-ozone/pom.xml, we explicitly declare a dependency on Hadoop 3.2.0.
> I understand we just take the trunk source code (3.3.0-SNAPSHOT up to the
> ozone-0.4 RC) here; should we fix this by giving the git hash of the trunk,
> or clarify it to avoid confusion?
> This might be done by just updating the name of the binaries without resetting
> the release itself.
>
> -Xiaoyu
>
>
> On 5/3/19, 4:07 PM, "Dinesh Chitlangia" 
> wrote:
>
> +1 (non-binding)
>
> - Built from sources and ran smoke test
> - Verified all checksums
> - Toggled audit log and verified audit parser tool
>
> Thanks Ajay for organizing the release.
>
> Cheers,
> Dinesh
>
>
>
> On 5/3/19, 5:42 PM, "Eric Yang"  wrote:
>
> +1
>
> On 4/29/19, 9:05 PM, "Ajay Kumar" 
> wrote:
>
> Hi All,
>
>
>
> We have created the third release candidate (RC2) for Apache
> Hadoop Ozone 0.4.0-alpha.
>
>
>
> This release contains security payload for Ozone. Below are
> some important features in it:
>
>
>
>   *   Hadoop Delegation Tokens and Block Tokens supported for
> Ozone.
>   *   Transparent Data Encryption (TDE) Support - Allows data
> blocks to be encrypted-at-rest.
>   *   Kerberos support for Ozone.
>   *   Certificate Infrastructure for Ozone  - Tokens use PKI
> instead of shared secrets.
>   *   Datanode to Datanode communication secured via mutual
> TLS.
>   *   Ability to secure an Ozone cluster that works with YARN, Hive,
> and Spark.
>   *   Skaffold support to deploy Ozone clusters on K8s.
>   *   Support S3 Authentication Mechanisms like - S3 v4
> Authentication protocol.
>   *   S3 Gateway supports Multipart upload.
>   *   S3A file system is tested and supported.
>   *   Support for Tracing and Profiling for all Ozone
> components.
>   *   Audit Support - including Audit Parser tools.
>   *   Apache Ranger Support in Ozone.
>   *   Extensive failure testing for Ozone.
>
> The RC artifacts are available at
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc2/
>
>
>
> The RC tag in git is ozone-0.4.0-alpha-RC2 (git hash
> 4ea602c1ee7b5e1a5560c6cbd096de4b140f776b)
>
>
>
> Please try out<
> https://cwiki.apache.org/confluence/display/HADOOP/Running+via+Apache+Release>,
> vote, or just give us feedback.
>
>
>
> The vote will run for 5 days, ending on May 4, 2019, 04:00 UTC.
>
>
>
> Thank you very much,
>
> Ajay
>
>
>
>
> -
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>


Re: VOTE: Hadoop Ozone 0.4.0-alpha RC1

2019-04-19 Thread Anu Engineer
+1 (Binding)

-- Verified the checksums.
-- Built from sources.
-- Sniff tested the functionality.

--Anu


On Mon, Apr 15, 2019 at 4:09 PM Ajay Kumar 
wrote:

> Hi all,
>
> We have created the second release candidate (RC1) for Apache Hadoop Ozone
> 0.4.0-alpha.
>
> This release contains security payload for Ozone. Below are some important
> features in it:
>
>   *   Hadoop Delegation Tokens and Block Tokens supported for Ozone.
>   *   Transparent Data Encryption (TDE) Support - Allows data blocks to be
> encrypted-at-rest.
>   *   Kerberos support for Ozone.
>   *   Certificate Infrastructure for Ozone  - Tokens use PKI instead of
> shared secrets.
>   *   Datanode to Datanode communication secured via mutual TLS.
>   *   Ability to secure an Ozone cluster that works with YARN, Hive, and Spark.
>   *   Skaffold support to deploy Ozone clusters on K8s.
>   *   Support S3 Authentication Mechanisms like - S3 v4 Authentication
> protocol.
>   *   S3 Gateway supports Multipart upload.
>   *   S3A file system is tested and supported.
>   *   Support for Tracing and Profiling for all Ozone components.
>   *   Audit Support - including Audit Parser tools.
>   *   Apache Ranger Support in Ozone.
>   *   Extensive failure testing for Ozone.
>
> The RC artifacts are available at
> https://home.apache.org/~ajay/ozone-0.4.0-alpha-rc1
>
> The RC tag in git is ozone-0.4.0-alpha-RC1 (git hash
> d673e16d14bb9377f27c9017e2ffc1bcb03eebfb)
>
> Please try out<
> https://cwiki.apache.org/confluence/display/HADOOP/Running+via+Apache+Release>,
> vote, or just give us feedback.
>
> The vote will run for 5 days, ending on April 20, 2019, 19:00 UTC.
>
> Thank you very much,
>
> Ajay
>
>
>


Re: [DISCUSS] Move to gitbox

2018-12-10 Thread Anu Engineer
+1
--Anu


On 12/10/18, 6:38 PM, "Vinayakumar B"  wrote:

+1

-Vinay

On Mon, 10 Dec 2018, 1:22 pm Elek, Marton 
> Thanks Akira,
>
> +1 (non-binding)
>
> I think it's better to do it now at a planned date.
>
> If I understand correctly, the only bigger task here is to update all the
> Jenkins jobs. (I am happy to help/contribute where I can.)
>
>
> Marton
>
> On 12/8/18 6:25 AM, Akira Ajisaka wrote:
> > Hi all,
> >
> > The Apache Hadoop git repository is on the git-wip-us server, which will be
> > decommissioned.
> > If there are no objections, I'll file a JIRA ticket with INFRA to
> > migrate to https://gitbox.apache.org/ and update the documentation.
> >
> > According to ASF infra team, the timeframe is as follows:
> >
> >> - December 9th 2018 -> January 9th 2019: Voluntary (coordinated)
> relocation
> >> - January 9th -> February 6th: Mandated (coordinated) relocation
> >> - February 7th: All remaining repositories are mass migrated.
> >> This timeline may change to accommodate various scenarios.
> >
> > If we reach consensus by January 9th, I can file a ticket with INFRA and
> > migrate it.
> > Even if we cannot reach consensus, the repository will be migrated by
> > February 7th.
> >
> > Regards,
> > Akira
> >
> > -
> > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> >
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>



-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop Ozone 0.3.0-alpha (RC1)

2018-11-19 Thread Anu Engineer
+1. (Binding)

Thanks for getting this release done. Verified the signatures and S3 Gateway.

--Anu


On 11/16/18, 5:15 AM, "Shashikant Banerjee"  wrote:

+1 (non-binding).

  - Verified signatures
  - Verified checksums
  - Checked LICENSE/NOTICE files
  - Built from source
  - Ran smoke tests.

Thanks Marton for putting up the release together.

Thanks
Shashi

On 11/14/18, 10:44 PM, "Elek, Marton"  wrote:

Hi all,

I've created the second release candidate (RC1) for Apache Hadoop Ozone
0.3.0-alpha including one more fix on top of the previous RC0 (HDDS-854)

This is the second release of Apache Hadoop Ozone. Notable changes since
the first release:

* A new S3-compatible REST server has been added. Ozone can be used from any
S3-compatible tool (HDDS-434)
* Ozone Hadoop file system URL prefix is renamed from o3:// to o3fs://
(HDDS-651)
* Extensive testing and stability improvements of OzoneFs.
* Spark, YARN and Hive support and stability improvements.
* Improved Pipeline handling and recovery.
* Separated/dedicated classpath definitions for all the Ozone
components. (HDDS-447)

The RC artifacts are available from:
https://home.apache.org/~elek/ozone-0.3.0-alpha-rc1/

The RC tag in git is: ozone-0.3.0-alpha-RC1 (ebbf459e6a6)

Please try it out, vote, or just give us feedback.

The vote will run for 5 days, ending on November 19, 2018 18:00 UTC.


Thank you very much,
Marton


PS:

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs from ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d
4. open localhost:9874 or localhost:9876



The easiest way to try it out from the source:

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha && docker-compose up -d



The easiest way to test basic functionality (with acceptance tests):

1. mvn  install -DskipTests -Pdist -Dmaven.javadoc.skip=true -Phdds
-DskipShade -am -pl :hadoop-ozone-dist
2. cd hadoop-ozone/dist/target/ozone-0.3.0-alpha/smoketest
3. ./test.sh

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org




-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




Re: [VOTE] Release Apache Hadoop Ozone 0.2.1-alpha (RC0)

2018-09-25 Thread Anu Engineer
Hi Marton,

+1 (binding)

1. Verified the Signature
2. Verified the Checksums - MD5 and Sha*
3. Build from Sources.
4. Ran all RPC and REST commands against the cluster via Robot.
5. Tested the OzoneFS functionality

Thank you very much for creating the first release of Ozone.

--Anu


On 9/19/18, 2:49 PM, "Elek, Marton"  wrote:

Hi all,

After the recent discussion about the first Ozone release I've created 
the first release candidate (RC0) for Apache Hadoop Ozone 0.2.1-alpha.

This release is alpha quality: it’s not recommended for production use,
but we believe that it’s stable enough to try out the feature set and
collect feedback.

The RC artifacts are available from: 
https://home.apache.org/~elek/ozone-0.2.1-alpha-rc0/

The RC tag in git is: ozone-0.2.1-alpha-RC0 (968082ffa5d)

Please try the release and vote; the vote will run for the usual 5 
working days, ending on September 26, 2018 10pm UTC time.

The easiest way to try it out is:

1. Download the binary artifact
2. Read the docs at ./docs/index.html
3. TLDR; cd compose/ozone && docker-compose up -d


Please try it out, vote, or just give us feedback.

Thank you very much,
Marton

ps: Next week we will have a BoF session at ApacheCon North America in
Montreal on Monday evening. Please join if you are interested, need
support to try out the package, or just have any feedback.


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org





Re: [VOTE] Release Apache Hadoop 2.8.5 (RC0)

2018-09-22 Thread Anu Engineer
I believe that you need to regenerate the site using the ‘hugo’ command (hugo is
a static-site builder), then commit and push the generated files.
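Roughly, assuming the hadoop-site repo is already cloned on the asf-site
branch (see the wiki for the exact steps):

    cd hadoop-site
    hugo                               # regenerate the static site
    git add -A
    git commit -m "Publish site for the new release"
    git push origin asf-site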

Thanks
Anu


On 9/22/18, 9:56 AM, "俊平堵"  wrote:

Marton, thanks for your reply. It works now, but after the git changes I
haven’t seen the Apache Hadoop website get refreshed. It seems to need
some manual steps to refresh the website; if so, can you also update
the wiki?

Thanks,

Junping

Elek, Marton 于2018年9月20日 周四下午1:40写道:

> Please try
>
> git clone https://gitbox.apache.org/repos/asf/hadoop-site.git -b asf-site
>
> (It seems git tries to check out master instead of the branch).
>
> I updated the wiki, sorry for the inconvenience.
>
> Marton
>
> On 9/18/18 8:05 PM, 俊平堵 wrote:
> > Hey Marton,
> >   The new release web-site actually doesn't work for me.  When I
> > follow your steps in wiki, and hit the issue during git clone repository
> > (writable) for hadoop-site as below:
> >
> > git clone https://gitbox.apache.org/repos/asf/hadoop-site.git
> > Cloning into 'hadoop-site'...
> > remote: Counting objects: 252414, done.
> > remote: Compressing objects: 100% (29625/29625), done.
> > remote: Total 252414 (delta 219617), reused 252211 (delta 219422)
> > Receiving objects: 100% (252414/252414), 98.78 MiB | 3.32 MiB/s, done.
> > Resolving deltas: 100% (219617/219617), done.
> > warning: remote HEAD refers to nonexistent ref, unable to checkout.
> >
> > Can you check above repository is correct for clone?
> > I can clone readable repository (https://github.com/apache/hadoop-site)
> > successfully though but cannot push back changes which is expected.
> >
> > Thanks,
> >
> > Junping
> >
> > Elek, Marton mailto:e...@apache.org>>于2018年9月17日
> > 周一上午6:15写道:
> >
> > Hi Junping,
> >
> > Thank you to work on this release.
> >
> > This release is the first release after the hadoop site change, and 
I
> > would like to be sure that everything works fine.
> >
> > Unfortunately I didn't get permission to edit the old wiki, but this is
> > the definition of the site update on the new wiki:
> >
> >
> 
https://cwiki.apache.org/confluence/display/HADOOP/How+to+generate+and+push+ASF+web+site+after+HADOOP-14163
> >
> > Please let me know if something is not working for you...
> >
> > Thanks,
> > Marton
> >
> >
> > On 09/10/2018 02:00 PM, 俊平堵 wrote:
> >  > Hi all,
> >  >
> >  >   I've created the first release candidate (RC0) for Apache
> >  > Hadoop 2.8.5. This is our next point release to follow up 2.8.4.
> It
> >  > includes 33 important fixes and improvements.
> >  >
> >  >
> >  >  The RC artifacts are available at:
> >  > http://home.apache.org/~junping_du/hadoop-2.8.5-RC0
> >  >
> >  >
> >  >  The RC tag in git is: release-2.8.5-RC0
> >  >
> >  >
> >  >
> >  >  The maven artifacts are available via repository.apache.org
> > <
> >  > http://repository.apache.org> at:
> >  >
> >  >
> >
> https://repository.apache.org/content/repositories/orgapachehadoop-1140
> >  >
> >  >
> >  >  Please try the release and vote; the vote will run for the
> > usual 5 working
> >  > days, ending on 9/15/2018 PST time.
> >  >
> >  >
> >  > Thanks,
> >  >
> >  >
> >  > Junping
> >  >
> >
>




Re: [DISCUSS] Alpha Release of Ozone

2018-08-08 Thread Anu Engineer
Thanks for reporting this issue. I have filed a JIRA to address it.

https://issues.apache.org/jira/browse/HDDS-341

>So, consider this as a report. IMHO, cutting an Ozone release prior to a
>Hadoop release is ill-advised given the distribution impact and the requirements
>of the merge vote.

The Ozone release is being planned to address issues like these; in my mind, if
we go through a release exercise, we will be able to identify all Ozone- and
Hadoop-related build and release issues.
Ozone will benefit tremendously from a release exercise and the community
review that comes with it.
 
Thanks
Anu


On 8/8/18, 1:19 PM, "Allen Wittenauer"  wrote:



> On Aug 8, 2018, at 12:56 PM, Anu Engineer  
wrote:
> 
>> Has anyone verified that a Hadoop release doesn't have _any_ of the 
extra ozone bits that are sprinkled outside the maven modules?
> As far as I know that is the state; we have had multiple Hadoop releases
after Ozone was merged. So far no one has reported Ozone bits leaking into
Hadoop. If we find something like that, it would be a bug.

There hasn't been a release from a branch where Ozone has been merged 
yet. The first one will be 3.2.0.  Running create-release off of trunk 
presently shows bits of Ozone in dev-support, hadoop-dist, and elsewhere in the 
Hadoop source tar ball. 

So, consider this as a report. IMHO, cutting an Ozone release prior to
a Hadoop release is ill-advised given the distribution impact and the requirements
of the merge vote.





Re: [DISCUSS] Alpha Release of Ozone

2018-08-06 Thread Anu Engineer
+1,  It will allow many users to get a first look at Ozone/HDDS. 

Thanks
Anu


On 8/6/18, 10:34 AM, "Elek, Marton"  wrote:

Hi All,

I would like to discuss creating an Alpha release for Ozone. The core
functionality of Ozone is complete, but there are two missing features:
Security and HA; work on these features is progressing in branches
HDDS-4 and HDDS-151. Right now, Ozone can handle millions of keys and
has a Hadoop-compatible file system, which allows applications like
Hive, Spark, and YARN to use Ozone.

Having an Alpha release of Ozone will help in getting some early 
feedback (this release will be marked as an Alpha -- and not production 
ready).

Going through a complete release cycle will help us flesh out the Ozone
release process, update user documentation, and nail down deployment models.

Please share your thoughts on the Alpha release (over mail or in
HDDS-214). As voted on by the community earlier, the Ozone release will be
independent of Hadoop releases.

Thanks a lot,
Marton Elek




-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org





Re: [VOTE] reset/force push to clean up inadvertent merge commit pushed to trunk

2018-07-06 Thread Anu Engineer
@Sunil G +1, Thanks for fixing this issue.

--Anu


On 7/6/18, 11:12 AM, "Sunil G"  wrote:

I just checked.  YARN-7556 and YARN-7451 can be cherry-picked.
I cherry-picked them locally and compiled. Things are good.

I can push this now, which will restore trunk to its original state.
I can do this if there are no objections.

- Sunil

On Fri, Jul 6, 2018 at 11:10 AM Arpit Agarwal 
wrote:

> afaict YARN-8435 is still in trunk. YARN-7556 and YARN-7451 are not.
>
>
> From: Giovanni Matteo Fumarola 
> Date: Friday, July 6, 2018 at 10:59 AM
> To: Vinod Kumar Vavilapalli 
> Cc: Anu Engineer , Arpit Agarwal <
> aagar...@hortonworks.com>, "su...@apache.org" , "
> yarn-dev@hadoop.apache.org" , "
> hdfs-...@hadoop.apache.org" , "
> common-...@hadoop.apache.org" , "
> mapreduce-...@hadoop.apache.org" 
> Subject: Re: [VOTE] reset/force push to clean up inadvertent merge commit
> pushed to trunk
>
> Everything seems OK except that the 3 commits (YARN-8435, YARN-7556, YARN-7451)
> are no longer in trunk due to the revert.
>
> Haibo/Robert, if you can recommit your patches, I will commit mine
> subsequently to preserve the original order.
>
> (My apology for the mess I did with the merge commit)
>
> On Fri, Jul 6, 2018 at 10:42 AM, Vinod Kumar Vavilapalli <
> vino...@apache.org> wrote:
> I will add that the branch also successfully compiles.
>
> Let's just move forward as is, unblock commits and just fix things if
> anything is broken.
>
> +Vinod
>
> > On Jul 6, 2018, at 10:30 AM, Anu Engineer <aengin...@hortonworks.com> wrote:
> >
> > Hi All,
> >
> > [ Thanks to Arpit for working offline and verifying that branch is
> indeed good.]
> >
> > I want to summarize what I know of this issue and also solicit other
> points of view.
> >
> > We reverted the commit(c163d1797) from the branch, as soon as we noticed
> it. That is, we have made no other commits after the merge commit.
> >
> > We used the following command to revert
> > git revert -m 1 c163d1797ade0f47d35b4a44381b8ef1dfec5b60
> >
> > Giovanni's branch had three commits + merge, The JIRAs he had were
> YARN-7451, YARN-7556, YARN-8435.
> >
> > The issue seems to be the revert of merge has some diffs. I am not a
> YARN developer, so the only problem is to look at the revert and see if
> there were any spurious edits in Giovanni's original commit + merge.
> > If there are none, we don't need a reset/force push.  But if we find an
> issue I am more than willing to go the force commit route.
> >
> > The revert takes the trunk back to the point of the first commit from
> Giovanni which is YARN-8435. His branch was also rewriting the order of
> commits which we have lost due to the revert.
> >
> > Based on what I know so far, I am -1 on the force push.
> >
> > In other words, I am trying to understand why we need the force push. I
> have left a similar comment in JIRA (
> https://issues.apache.org/jira/browse/INFRA-16727) too.
> >
> >
> > Thanks
> > Anu
> >
> >
> >On 7/6/18, 10:24 AM, "Arpit Agarwal" <aagar...@hortonworks.com> wrote:
> >
> >-1 for the force push. Nothing is broken in trunk. The history looks
> ugly for two commits and we can live with it.
> >
> >The revert restored the branch to Giovanni's intent. i.e. only
> YARN-8435 is applied. Verified there is no delta between hashes 0d9804d 
and
> 39ad989 (HEAD).
> >
> >39ad989 2018-07-05 aengineer@ o {apache/trunk} Revert "Merge branch
> 't...
> >c163d17 2018-07-05 gifuma@apa M─┐ Merge branch 'trunk' of
> https://git-...
> >99febe7 2018-07-05 rkanter@ap │ o YARN-7451. Add missing tests to
> veri...
> >1726247 2018-07-05 haibochen@ │ o YARN-7556. Fair scheduler
> configurat...
> >0d9804d 2018-07-05 gifuma@apa o │ YARN-8435. Fix NPE when the same
> cli...
> >71df8c2 2018-07-05 nanda@apac o─┘ HDDS-212. Introduce
> NodeStateManager...
> >
> >Regards,
> >Arpit
> >
> >
> >On 7/5/18, 2:37 PM, "Subru Krishnan" <su...@apache.org> wrote:
> >

Re: [VOTE] reset/force push to clean up inadvertent merge commit pushed to trunk

2018-07-06 Thread Anu Engineer
Hi All,

[Thanks to Arpit for working offline and verifying that the branch is indeed good.]

I want to summarize what I know of this issue and also solicit other points of 
view.

We reverted the commit (c163d1797) from the branch as soon as we noticed it.
That is, we have made no other commits after the merge commit.

We used the following command to revert:
git revert -m 1 c163d1797ade0f47d35b4a44381b8ef1dfec5b60

Giovanni's branch had three commits plus the merge; the JIRAs were
YARN-7451, YARN-7556, and YARN-8435.

The issue seems to be that the revert of the merge has some diffs. I am not a
YARN developer, so the only remaining task is to look at the revert and see
whether there were any spurious edits in Giovanni's original commits + merge.
If there are none, we don't need a reset/force push. But if we find an issue
I am more than willing to go the force-push route.

The revert takes trunk back to the point of the first commit from Giovanni,
which is YARN-8435. His branch was also rewriting the order of commits, which
we have lost due to the revert.
 
Based on what I know so far, I am -1 on the force push.

In other words, I am trying to understand why we need the force push. I have 
left a similar comment in JIRA 
(https://issues.apache.org/jira/browse/INFRA-16727) too.


Thanks
Anu


On 7/6/18, 10:24 AM, "Arpit Agarwal"  wrote:

-1 for the force push. Nothing is broken in trunk. The history looks ugly 
for two commits and we can live with it.

The revert restored the branch to Giovanni's intent. i.e. only YARN-8435 is 
applied. Verified there is no delta between hashes 0d9804d and 39ad989 (HEAD).

39ad989 2018-07-05 aengineer@ o {apache/trunk} Revert "Merge branch 't...
c163d17 2018-07-05 gifuma@apa M─┐ Merge branch 'trunk' of https://git-...
99febe7 2018-07-05 rkanter@ap │ o YARN-7451. Add missing tests to veri...
1726247 2018-07-05 haibochen@ │ o YARN-7556. Fair scheduler configurat...
0d9804d 2018-07-05 gifuma@apa o │ YARN-8435. Fix NPE when the same cli...
71df8c2 2018-07-05 nanda@apac o─┘ HDDS-212. Introduce NodeStateManager...

Regards,
Arpit


On 7/5/18, 2:37 PM, "Subru Krishnan"  wrote:

Folks,

There was a merge commit accidentally pushed to trunk, you can find the
details in the mail thread [1].

I have raised an INFRA ticket [2] to reset/force push to clean up trunk.

Can we have a quick vote for INFRA sign-off to proceed as this is 
blocking
all commits?

Thanks,
Subru

[1]

http://mail-archives.apache.org/mod_mbox/hadoop-yarn-dev/201807.mbox/%3CCAHqguubKBqwfUMwhtJuSD7X1Bgfro_P6FV%2BhhFhMMYRaxFsF9Q%40mail.gmail.com%3E
[2] https://issues.apache.org/jira/browse/INFRA-16727



-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org




Re: Merge branch commit in trunk by mistake

2018-07-05 Thread Anu Engineer
I ran “git revert -m 1 c163d1797ade0f47d35b4a44381b8ef1dfec5b60”

That will remove all changes from Giovanni’s branch (there are 3 YARN commits).
I am presuming that he can recommit the dropped changes directly into trunk.

I do not know of a better way that avoids losing the changes from his branch.
I am open to force pushing if that is needed.
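For reference, the two options discussed in this thread look roughly like this
(the second hash is the last pre-merge commit from Arpit's listing elsewhere in
the thread; a force push would also need INFRA to unblock force pushes on trunk):

    # option 1 (what was done): revert the merge commit, keeping history;
    # -m selects which parent line to treat as the mainline
    git revert -m 1 c163d1797ade0f47d35b4a44381b8ef1dfec5b60

    # option 2 (what the INFRA ticket asks for): rewrite history
    # git reset --hard 71df8c2    # last commit before the merge
    # git push --force origin trunk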

--Anu


On 7/5/18, 2:20 PM, "Wangda Tan"  wrote:

Adding back hdfs/common/mr-dev again to cc list.

Here's the last merge revert commit:

https://github.com/apache/hadoop/commit/39ad98903a5f042573b97a2e5438bc57af7cc7a1


On Thu, Jul 5, 2018 at 2:17 PM Wangda Tan  wrote:

> It looks like the latest revert is not correct; many commits got
> reverted.
>
> Dealing with merge commit revert is different from reverting a normal
> commit: https://www.christianengvall.se/undo-pushed-merge-git/
>
> We have to do a force reset; right now trunk is a complete mess.
>
>
>
> On Thu, Jul 5, 2018 at 2:10 PM Vinod Kumar Vavilapalli 

> wrote:
>
>> What is broken due to this merge commit?
>>
>> +Vinod
>>
>> > On Jul 5, 2018, at 2:03 PM, Arun Suresh  wrote:
>> >
>> > I agree with Sean, to be honest.. it is disruptive.
>> > Also, we have to kind of lock down the repo till it is completed..
>> >
>> > I recommend we be careful and try not to get into this situation 
again..
>> >
>> > -1 on force pushing..
>> >
>> > Cheers
>> > -Arun
>> >
>> > On Thu, Jul 5, 2018, 1:55 PM Sean Busbey  wrote:
>> >
>> >> If we need a vote, please have a thread with either DISCUSS or
>> >> preferably VOTE in the subject so folks are more likely to see it.
>> >>
>> >> that said, I'm -1 (non-binding). force pushes are extremely
>> >> disruptive. there's no way to know who's updated their local git repo
>> >> to include these changes in the last few hours. if a merge commit is
>> >> so disruptive that we need to subject folks to the inconvenience of a
>> >> force push then we should have more tooling in place to avoid them
>> >> (like client side git hooks for all committers).
>> >>
>> >> On Thu, Jul 5, 2018 at 3:36 PM, Wangda Tan 
>> wrote:
>> >>> +1 for force reset the branch.
>> >>>
>> >>> On Thu, Jul 5, 2018 at 12:14 PM Subru Krishnan 
>> wrote:
>> >>>
>>  Looking at the merge commit, I feel it's better to reset/force push
>>  especially since this is still the latest commit on trunk.
>> 
>>  I have raised an INFRA ticket requesting the same:
>>  https://issues.apache.org/jira/browse/INFRA-16727
>> 
>>  -S
>> 
>>  On Thu, Jul 5, 2018 at 11:45 AM, Sean Busbey
>> >> 
>>  wrote:
>> 
>> > FYI, no images make it through ASF mailing lists. I presume the
>> image
>> >> was
>> > of the git history? If that's correct, here's what that looks like
>> in
>> >> a
>> > paste:
>> >
>> > https://paste.apache.org/eRix
>> >
>> > There are no force pushes on trunk, so backing the change out would
>>  require
>> > the PMC asking INFRA to unblock force pushes for a period of time.
>> >
>> > Probably the merge commit isn't a big enough deal to do that. There
>> >> was a
>> > merge commit ~5 months ago for when YARN-6592 merged into trunk.
>> >
>> > So I'd say just try to avoid doing it in the future?
>> >
>> > -busbey
>> >
>> > On Thu, Jul 5, 2018 at 1:31 PM, Giovanni Matteo Fumarola <
>> > giovanni.fumar...@gmail.com> wrote:
>> >
>> >> Hi folks,
>> >>
>> >> After I pushed something on trunk a merge commit showed up in the
>> > history. *My
>> >> bad*.
>> >>
>> >>
>> >>
>> >> Since it was one of my first patches, I ran a few tests on my
>> >> machine
>> >> before checking in.
>> >> While I was running all the tests, someone else checked in. I
>> >> correctly
>> >> pulled all the new changes.
>> >>
>> >> Even before I did the "git push" there was no merge commit in my
>>  history.
>> >>
>> >> Can someone help me reverting this change?
>> >>
>> >> Thanks
>> >> Giovanni
>> >>
>> >>
>> >>
>> >
>> >
>> > --
>> > busbey
>> >
>> 
>> >>
>> >>
>> >>
>> >> --
>> >> busbey
>> >>
>>
>>



-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Re: [DISCUSS]: securing ASF Hadoop releases out of the box

2018-07-05 Thread Anu Engineer
+1 on the non-routable idea. We like it so much that we added it to the Ozone
roadmap.
https://issues.apache.org/jira/browse/HDDS-231

If there is consensus on bringing this to Hadoop in general, we can build this
feature in Hadoop Common.

--Anu


On 7/5/18, 1:09 PM, "Sean Busbey"  wrote:

I really, really like the approach of defaulting to only non-routable
IPs allowed. It seems like a good tradeoff for complexity of
implementation, pain to reconfigure, and level of protection.

On Thu, Jul 5, 2018 at 2:25 PM, Todd Lipcon  
wrote:
> The approach we took in Apache Kudu is that, if Kerberos hasn't been
> enabled, we default to a whitelist of subnets. The default whitelist is
> 127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16,169.254.0.0/16 which
> matches the IANA "non-routeable IP" subnet list.
>
> In other words, out-of-the-box, you get a deployment that works fine 
within
> a typical LAN environment, but won't allow some remote hacker to locate
> your cluster and access your data. We thought this was a nice balance
> between "works out of the box without lots of configuration" and "decent
> security". In my opinion a "localhost-only by default" would be be overly
> restrictive since I'd usually be deploying on some datacenter or EC2
> machine and then trying to access it from a client on my laptop.
>
> We released this first a bit over a year ago if my memory serves me, and
> we've had relatively few complaints or questions about it. We also made
> sure that the error message that comes back to clients is pretty
> reasonable, indicating the specific configuration that is disallowing
> access, so if people hit the issue on upgrade they had a clear idea what 
is
> going on.
>
> Of course it's not foolproof, since as Eric says, you're still likely open
> to the entirety of your corporation, and you may not want that, but as he
> also pointed out, that might be true even if you enable Kerberos
> authentication.
>
> -Todd
>
> On Thu, Jul 5, 2018 at 11:38 AM, Eric Yang  wrote:
>
>> The Hadoop default configuration aims for user friendliness to increase
>> adoption, with security features enabled one by one.  This approach is most
>> problematic for security because the system can be compromised before all
>> security features are turned on.
>> Larry's proposal will add some safety by reminding system admins if security
>> is disabled.  However, reducing the number of knobs in the security configs is
>> likely required to make the system secure enough for the banner idea to work
>> without writing too much guessing logic to determine whether the UI is secured.
>> Penetration tests can provide better insight into what hasn't been secured, to
>> improve the next release.  Thankfully most Hadoop vendors have done this
>> work periodically to help the community secure Hadoop.
>> improve the next release.  Thankfully most Hadoop vendors have done this
>> work periodically to help the community secure Hadoop.
>>
>> Plenty of companies advertise: if you want security, use
>> Kerberos.  This statement is not entirely true.  Kerberos makes security
>> more difficult to crack for external parties, but it shouldn't be the only
>> method of securing Hadoop.  When the Kerberos environment is larger than the
>> Hadoop cluster, anyone within the Kerberos environment can access the Hadoop
>> cluster freely, without restriction.  In large-scale enterprises, or for some
>> cloud vendors that sublet their resources, this might not be acceptable.
>> cloud vendors that sublet their resources, this might not be acceptable.
>>
>> From my point of view, a secure Hadoop release must default all settings
>> to localhost only and allow users to add more hosts through an authorized
>> whitelist of servers.  This will keep the security perimeter in check.  All
>> wildcard ACLs will need to be removed or default to the current user/current
>> host only.  Proxy user/host ACL lists must be enforced on HTTP channels.
>> This is basically realigning the default configuration to a single-node
>> cluster or firewalled configuration.
>>
>> Regards,
>> Eric
>>
>> On 7/5/18, 8:24 AM, "larry mccay"  wrote:
>>
>> Hi Steve -
>>
>> This is a long overdue DISCUSS thread!
>>
>> Perhaps the UIs can very visibly state (in red) "WARNING: UNSECURED 
UI
>> ACCESS - OPEN TO COMPROMISE" - maybe even force a click through the
>> warning
>> to get to the page like SSL exceptions in the browser do?
>> Similar tactic for UI access without SSL?
>> A new AuthenticationFilter can be added to the filter chains that
>> blocks
>> API calls unless explicitly configured to be open and obvious log a
>> similar
>> message?
>>
>> thanks,
>>
>> --larry
>>
>>
>>
>>
>> On Wed, Jul 4, 2018 at 11:58 AM, Steve Loughran <
>> ste...@hortonworks.com>
>> wrote:
>>
>> > Bitcoins are profitable enough to justify writing 

Re: Merge branch commit in trunk by mistake

2018-07-05 Thread Anu Engineer
Based on conversations with Giovanni and Subru, I have pushed a revert for this 
merge.

Thanks
Anu


On 7/5/18, 12:55 PM, "Giovanni Matteo Fumarola"  
wrote:

+ common-dev and hdfs-dev as fyi.

Thanks Subru and Sean for the answer.

On Thu, Jul 5, 2018 at 12:14 PM, Subru Krishnan  wrote:

> Looking at the merge commit, I feel it's better to reset/force push
> especially since this is still the latest commit on trunk.
>
> I have raised an INFRA ticket requesting the same:
> https://issues.apache.org/jira/browse/INFRA-16727
>
> -S
>
> On Thu, Jul 5, 2018 at 11:45 AM, Sean Busbey 
> wrote:
>
>> FYI, no images make it through ASF mailing lists. I presume the image was
>> of the git history? If that's correct, here's what that looks like in a
>> paste:
>>
>> https://paste.apache.org/eRix
>>
>> There are no force pushes on trunk, so backing the change out would
>> require
>> the PMC asking INFRA to unblock force pushes for a period of time.
>>
>> Probably the merge commit isn't a big enough deal to do that. There was a
>> merge commit ~5 months ago for when YARN-6592 merged into trunk.
>>
>> So I'd say just try to avoid doing it in the future?
>>
>> -busbey
>>
>> On Thu, Jul 5, 2018 at 1:31 PM, Giovanni Matteo Fumarola <
>> giovanni.fumar...@gmail.com> wrote:
>>
>> > Hi folks,
>> >
>> > After I pushed something on trunk a merge commit showed up in the
>> history. *My
>> > bad*.
>> >
>> >
>> >
>> > Since it was one of my first patches, I ran a few tests on my machine
>> > before checking in.
>> > While I was running all the tests, someone else checked in, and I
>> > correctly pulled all the new changes.
>> >
>> > Even right before I did the "git push", there was no merge commit in my
>> > history.
>> >
>> > Can someone help me revert this change?
>> >
>> > Thanks
>> > Giovanni
>> >
>> >
>> >
>>
>>
>> --
>> busbey
>>
>
>




Re: [VOTE] Merging branch HDFS-7240 to trunk

2018-03-02 Thread Anu Engineer
Hi Owen,

  >> 1. It is hard to tell what has changed. git rebase -i tells me the
  >> branch has 722 commits. The rebase failed with a conflict. It would
  >> really help if you rebased to current trunk.

Thanks for the comments. I have merged trunk to HDFS-7240 branch. 
Hopefully, this makes it easy to look at the changes; I have committed the 
change required to fix the conflict as a separate commit to make it easy for 
you to see.

Thanks
Anu


On 3/2/18, 4:42 PM, "Wangda Tan"  wrote:

I like the idea of same source / same release, putting Ozone's source under
a different directory.

As Owen mentioned, it is going to be important for all parties to keep a
regular, shorter release cycle for Hadoop, e.g. 3-4 months between minor
releases. Users can try features and give feedback to stabilize features
earlier; developers can be happier since their efforts reach users soon
after features get merged. In addition, if features are merged to trunk
after reasonable testing/review, Andrew's concern may not be a problem
anymore:

bq. Finally, I earnestly believe that Ozone/HDSL itself would benefit from
being a separate project. Ozone could release faster and iterate more
quickly if it wasn't hampered by Hadoop's release schedule and security and
compatibility requirements.

Thanks,
Wangda


On Fri, Mar 2, 2018 at 4:24 PM, Owen O'Malley 
wrote:

> On Thu, Mar 1, 2018 at 11:03 PM, Andrew Wang 
> wrote:
>
> Owen mentioned making a Hadoop subproject; we'd have to
> > hash out what exactly this means (I assume a separate repo still managed
> by
> > the Hadoop project), but I think we could make this work if it's more
> > attractive than incubation or a new TLP.
>
>
> Ok, there are multiple levels of sub-projects that all make sense:
>
>- Same source tree, same releases - examples like HDFS & YARN
>    - Same master branch, separate releases and release branches - Hive's
>    Storage API vs Hive. It is in the source tree for the master branch,
>    but has distinct releases and release branches.
>- Separate source, separate release - Apache Commons.
>
> There are advantages and disadvantages to each. I'd propose that we use
> the same source, same release pattern for Ozone. Note that we tried and
> later reverted doing Common, HDFS, and YARN as separate source, separate
> release because it was too much trouble. I like Daryn's idea of putting it
> as a top-level directory in Hadoop and making sure that nothing in Common,
> HDFS, or YARN depends on it. That way if a Release Manager doesn't think
> it is ready for release, it can be trivially removed before the release.
>
> One thing about using the same releases: Sanjay and Jitendra are signing
> up to make much more regular bugfix and minor releases in the near future.
> For example, they'll need to make 3.2 relatively soon to get it released
> and then 3.3 somewhere in the next 3 to 6 months. That would be good for
> the project. Hadoop needs more regular releases and fewer big-bang
> releases.
>
> .. Owen
>




Namenode RPC default port reverted to 8020

2018-02-06 Thread Anu Engineer
Hi All,

I wanted to bring to your attention that HDFS-12990 has been committed to trunk 
and branch-3.0.1.

This change reverts the Namenode RPC port to the familiar 8020, making it the 
same as in the Apache Hadoop 2.x series.
In the Hadoop 3.0.0 release, the default port is 9820. If you have deployed 
Hadoop 3.0.0, then please reconfigure the Namenode port to 8020.
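
For illustration, a client-side setting written for Hadoop 2.x keeps working 
once the default is back to 8020; here is a quick way to sanity-check the 
configured address (the hostname below is hypothetical):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;

public class DefaultFsCheck {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // fs.defaultFS carries the NameNode RPC address used by clients.
    conf.set("fs.defaultFS", "hdfs://nn1.example.com:8020");
    URI fs = URI.create(conf.get("fs.defaultFS"));
    System.out.println(fs.getHost() + ":" + fs.getPort()); // nn1.example.com:8020
  }
}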

If you have not deployed Hadoop 3.0.0, we recommend waiting for Hadoop 3.0.1, 
which is planned to be released in the next two weeks.

Thanks
Anu


Re: Apache Hadoop 3.0.1 Release plan

2018-02-01 Thread Anu Engineer
Hi Eddy,

Thanks for driving this release. Just a quick question: do we have time to 
close this issue?
https://issues.apache.org/jira/browse/HDFS-12990

or are we abandoning it? I believe that this is the last window for us to fix 
this issue.

Should we have a call and get this resolved one way or another?

Thanks
Anu

On 2/1/18, 10:51 AM, "Lei Xu"  wrote:

Hi, All

I just cut branch-3.0.1 from branch-3.0.  Please make sure all patches
targeted to 3.0.1 being checked in both branch-3.0 and branch-3.0.1.

Thanks!
Eddy

On Tue, Jan 9, 2018 at 11:17 AM, Lei Xu  wrote:
> Hi, All
>
> We released Apache Hadoop 3.0.0 in December [1]. To further
> improve the quality of the release, we plan to cut the branch-3.0.1 branch
> tomorrow in preparation for the Apache Hadoop 3.0.1 release. The focus
> of 3.0.1 will be fixing blockers (3), critical bugs (1), and bug fixes
> [2].  No new features or improvements should be included.
>
> We plan to cut branch-3.0.1 tomorrow (Jan 10th) and vote for RC on Feb
> 1st, targeting for Feb 9th release.
>
> Please feel free to share your insights.
>
> [1] https://www.mail-archive.com/general@hadoop.apache.org/msg07757.html
> [2] https://issues.apache.org/jira/issues/?filter=12342842
>
> Best,
> --
> Lei (Eddy) Xu
> Software Engineer, Cloudera



-- 
Lei (Eddy) Xu
Software Engineer, Cloudera




Re: [VOTE] Release Apache Hadoop 2.9.0 (RC2)

2017-11-13 Thread Anu Engineer
-1 (binding)

Thank you for all the hard work on the 2.9 series. Unfortunately, this is one 
of the times I have to -1 a release.

It looks like HADOOP-14840 added a dependency on “oj! Algorithms - version 
43.0”, but we have only added the bare name “oj! Algorithms - version 43.0” 
to “LICENSE.txt”. The right addition to LICENSE.txt should contain the 
original MIT License text, especially “Copyright (c) 2003-2017 Optimatika”.

Please take a look at https://github.com/optimatika/ojAlgo/blob/master/LICENSE

I am a +1 after this is fixed.

Thanks 
Anu




On 11/13/17, 9:50 AM, "Sunil G"  wrote:

+1 (binding)

Deployed cluster built from source.



   - Tested a few cases in an HA cluster and tried to do failover by using
   rmadmin commands etc. This seems to work fine, including submitting apps.
   - I also tested many MR apps and all are running fine w/o any issues.
   - Also tested the sanity of the features below (works fine):
      - Application priority
      - Application timeout
   - Tested basic NodeLabel scenarios:
      - Added some labels to a couple of nodes
      - Verified the old UI for labels
      - Submitted apps to a labelled cluster and it works fine
      - Also performed a few CLI commands related to node labels
   - Verified the new YARN UI and accessed various pages while the cluster
   was in use. It seems fine to me.


Thanks all folks who participated in this release, appreciate the same!

- Sunil


On Mon, Nov 13, 2017 at 3:01 AM Subru Krishnan  wrote:

> Hi Folks,
>
> Apache Hadoop 2.9.0 is the first release of the Hadoop 2.9 line and will be
> the starting release for the Apache Hadoop 2.9.x line. It includes 30 new
> features with 500+ subtasks, 407 improvements, and 790 bug fixes (newly
> fixed issues since 2.8.2).
>
> More information about the 2.9.0 release plan can be found here:
> https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
>
> New RC is available at: http://home.apache.org/~asuresh/hadoop-2.9.0-RC2/
>
> The RC tag in git is: release-2.9.0-RC2, and the latest commit id is:
> 1eb05c1dd48fbc9e4b375a76f2046a59103bbeb1.
>
> The maven artifacts are available via repository.apache.org at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1067/
>
> Please try the release and vote; the vote will run for the usual 5 days,
> ending on Friday, 17th November 2017, at 2pm PT.
>
> We want to give a big shout out to Sunil, Varun, Rohith, Wangda, Vrushali
> and Inigo for the extensive testing/validation which helped prepare for
> RC2. Do report your results in this vote as it'll be very useful to the
> entire community.
>
> Thanks,
> -Subru/Arun
>




Re: 答复: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk

2017-10-20 Thread Anu Engineer
Hi Steve,

In addition to everything Weiwei mentioned (chapter 3 of the user guide), if 
you really want to drill down into the REST protocol, you might want to apply 
this patch and build Ozone.

https://issues.apache.org/jira/browse/HDFS-12690

This will generate an Open API (https://www.openapis.org, http://swagger.io) 
based specification, which can be accessed from the KSM UI or just as a JSON 
file. Unfortunately, this patch is still in code review, so you will have to 
apply the patch and build it yourself.

Thanks
Anu


On 10/20/17, 6:09 AM, "Yang Weiwei"  wrote:

Hi Steve


The code is available in the HDFS-7240 feature branch, public git repo 
here.

I am not sure if there is a "public" API for object stores, but the design doc
uses the most common syntax, so I believe it should be compliant. You can find 
the REST API doc here (with some example usages), and the command-line API 
here.


Looking forward to your feedback!


--Weiwei



From: Steve Loughran 
Sent: 20 October 2017 11:49
To: Yang Weiwei
Cc: hdfs-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; 
yarn-dev@hadoop.apache.org; common-...@hadoop.apache.org
Subject: Re: [DISCUSSION] Merging HDFS-7240 Object Store (Ozone) to trunk


Wow, big piece of work

1. Where is a PR/branch on github with rendered docs for us to look at?
2. Have you made any public API changes related to object stores? That's 
probably something I'll have opinions on more than implementation details.

thanks

> On 19 Oct 2017, at 02:54, Yang Weiwei  wrote:
>
> Hello everyone,
>
>
> I would like to start this thread to discuss merging Ozone (HDFS-7240) to 
trunk. This feature implements an object store which can co-exist with HDFS. 
Ozone is disabled by default. We have tested Ozone with cluster sizes varying 
from 1 to 100 data nodes.
>
>
>
> The merge payload includes the following:
>
>  1.  All services, management scripts
>  2.  Object store APIs, exposed via both REST and RPC
>  3.  Master service UIs, command line interfaces
>  4.  Pluggable pipeline Integration
>  5.  Ozone File System (Hadoop compatible file system implementation, 
passes all FileSystem contract tests)
>  6.  Corona - a load generator for Ozone.
>  7.  Essential documentation added to Hadoop site.
>  8.  Version specific Ozone Documentation, accessible via service UI.
>  9.  Docker support for ozone, which enables faster development cycles.
>
>
> To build Ozone and run it using docker, please follow the instructions in 
this wiki page: 
https://cwiki.apache.org/confluence/display/HADOOP/Dev+cluster+with+docker



>
>
> We have built a passionate and diverse community to drive this feature's 
development. As a team, we have made significant progress in the past 3 years 
since the first JIRA for HDFS-7240 was opened in Oct 2014. So far, we have 
resolved almost 400 JIRAs, with 20+ contributors/committers from different 
countries and affiliations. We also want to thank the large number of 
community members who were supportive of our efforts and who contributed 
ideas and participated in the design of Ozone.
>
>
> Please share your thoughts, thanks!
>
>
> -- Weiwei Yang







Re: [DISCUSS] Looking to Apache Hadoop 3.1 release

2017-09-06 Thread Anu Engineer
Hi Wangda,

We are planning to start the Ozone merge discussion by the end of this month. I 
am hopeful that it will be merged pretty soon after that. 
Please add Ozone to the list of features that are being tracked for Apache 
Hadoop 3.1. 

We would love to release Ozone as an alpha feature in Hadoop 3.1.

Thanks
Anu


On 9/6/17, 2:26 PM, "Arun Suresh"  wrote:

>Thanks for starting this Wangda.
>
>I would also like to add:
>- YARN-5972: Support Pausing/Freezing of opportunistic containers
>
>Cheers
>-Arun
>
>On Wed, Sep 6, 2017 at 1:49 PM, Steve Loughran 
>wrote:
>
>>
>> > On 6 Sep 2017, at 19:13, Wangda Tan  wrote:
>> >
>> > Hi all,
>> >
>> > As we discussed on [1], there were proposals from Steve / Vinod etc to
>> have
>> > a faster cadence of releases and to start thinking of a Hadoop 3.1
>> release
>> > earlier than March 2018 as is currently proposed.
>> >
>> > I think this is a good idea. I'd like to start the process sooner, and
>> > establish timeline etc so that we can be ready when 3.0.0 GA is out. With
>> > this we can also establish faster cadence for future Hadoop 3.x releases.
>> >
>> > To this end, I propose to target Hadoop 3.1.0 for a release by mid Jan
>> > 2018. (About 4.5 months from now and 2.5 months after 3.0-GA, instead of
>> > 6.5 months from now).
>> >
>> > I'd also want to take this opportunity to come up with a more elaborate
>> > release plan to avoid some of the confusion we had with 3.0 beta. General
>> > proposal for the timeline (per this other proposal [2])
>> > - Feature freeze date - all features should be merged by Dec 15, 2017.
>> > - Code freeze date - blockers/critical only, no more improvements or
>> > non-blocker/critical bug fixes: Jan 1, 2018.
>> > - Release date: Jan 15, 2018
>> >
>> > Following is a list of features on my radar which could be candidates
>> for a
>> > 3.1 release:
>> > - YARN-5734, Dynamic scheduler queue configuration. (Owner: Jonathan
>> Hung)
>> > - YARN-5881, Add absolute resource configuration to CapacityScheduler.
>> > (Owner: Sunil)
>> > - YARN-5673, Container-executor rewrite for better security,
>> extensibility
>> > and portability. (Owner Varun Vasudev)
>> > - YARN-6223, GPU isolation. (Owner: Wangda)
>> >
>> > And from email [3] mentioned by Andrew, there are several other HDFS
>> > features that people want released with 3.1 as well, assuming they fit
>> > the timelines:
>> > - Storage Policy Satisfier
>> > - HDFS tiered storage
>> >
>> > Please let me know if I missed any features targeted to 3.1 per this
>> > timeline.
>>
>>
>> HADOOP-13786 : S3Guard committer, which also adds resilience to failures
>> talking to S3 (we barely have any today),
>>
>> >
>> > And I want to volunteer myself as release manager of 3.1.0 release.
>> Please
>> > let me know if you have any suggestions/concerns.
>>
>> well volunteered :)
>>
>> >
>> > Thanks,
>> > Wangda Tan
>> >
>> > [1] http://markmail.org/message/hwar5f5ap654ck5o?q=
>> > Branch+merges+and+3%2E0%2E0-beta1+scope
>> > [2] http://markmail.org/message/hwar5f5ap654ck5o?q=Branch+
>> > merges+and+3%2E0%2E0-beta1+scope#query:Branch%20merges%
>> > 20and%203.0.0-beta1%20scope+page:1+mid:2hqqkhl2dymcikf5+state:results
>> > [3] http://markmail.org/message/h35obzqrh3ag6dgn?q=Branch+merge
>> > s+and+3%2E0%2E0-beta1+scope


Re: DISCUSS: Hadoop Compatability Guidelines

2017-09-05 Thread Anu Engineer
Could you please attach the PDFs to the JIRA? I think the mailer is stripping 
them from the mail.

Thanks
Anu





On 9/5/17, 9:44 AM, "Daniel Templeton"  wrote:

>Resending with a broader audience, and reattaching the PDFs.
>
>Daniel
>
>On 9/4/17 9:01 AM, Daniel Templeton wrote:
>> All, in prep for Hadoop 3 beta 1 I've been working on updating the 
>> compatibility guidelines on HADOOP-13714.  I think the initial doc is 
>> more or less complete, so I'd like to open the discussion up to the 
>> broader Hadoop community.
>>
>> In the new guidelines, I have drawn some lines in the sand regarding 
>> compatibility between releases.  In some cases these lines are more 
>> restrictive than the current practices.  The intent with the new 
>> guidelines is not to limit progress by restricting what goes into a 
>> release, but rather to drive release numbering to keep in line with 
>> the reality of the code.
>>
>> Please have a read and provide feedback on the JIRA.  I'm sure there 
>> are more than a couple of areas that could be improved.  If you'd 
>> rather not read markdown from a diff patch, I've attached PDFs of the 
>> two modified docs.
>>
>> Thanks!
>> Daniel
>
>




Re: LinkedIn Dynamometer Tool (was About 2.7.4 Release)

2017-07-20 Thread Anu Engineer
Hi Erik,

Looking forward to the release of this tool. Thank you very much for the 
contribution.

Had a couple of questions about how the tool works.

1. Would you be able to provide the traces along with this tool? In other 
words, would I be able to use this out of the box, or do I have to build up 
traces myself? 

2. Could you explain how the “fake out DNs into thinking they are storing 
data” part works? Or I can be patient and read your blog post.

Thanks
Anu






On 7/20/17, 10:42 AM, "Erik Krogen"  wrote:

>forking off of the 2.7.4 release thread to answer this question about
>Dynamometer
>
>Dynamometer is a tool developed at LinkedIn for scale testing HDFS,
>specifically the NameNode. We have been using it for some time now and have
>recently been making some enhancements to ease of use and reproducibility.
>We hope to post a blog post sometime in the not-too-distant future, and
>also to open source it. I can provide some details here given that we have
>been leveraging it as part of our 2.7.4 release / upgrade process (in
>addition to previous upgrades).
>
>The basic idea is to get full-scale black-box testing of the HDFS NN while
>using significantly less (~10%) hardware than a real cluster of that size
>would require. We use real NN images from our at-scale clusters paired with
>some logic to fake out DNs into thinking they are storing data when they
>are not, allowing us to stuff more DNs onto each machine. Since we use a
>real image, we can replay real traces (collected from audit logs) to
>compare actual production performance vs. performance on this simulated
>cluster (with additional tuning, different version, etc.). We leverage YARN
>to manage setting up this cluster and to replay the traces.
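>
>To give a flavor of the "fake storage" trick (an illustrative sketch, not
>Dynamometer's actual code): Hadoop's HDFS test jar ships a SimulatedFSDataset
>that can be plugged into a DataNode so that blocks are acknowledged and
>reported without any bytes hitting disk:
>
>import org.apache.hadoop.conf.Configuration;
>
>public class FakeStorageSketch {
>  public static void main(String[] args) {
>    Configuration conf = new Configuration();
>    // Swap the DataNode's storage implementation for a simulated one;
>    // block metadata is tracked in memory, so many DataNodes can be
>    // packed onto a single machine.
>    conf.set("dfs.datanode.fsdataset.factory",
>        "org.apache.hadoop.hdfs.server.datanode.SimulatedFSDataset$Factory");
>    System.out.println(conf.get("dfs.datanode.fsdataset.factory"));
>  }
>}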
>
>Happy to answer questions.
>
>Erik
>
>On Wed, Jul 19, 2017 at 5:05 PM, Konstantin Shvachko 
>wrote:
>
>> Hi Tianyi,
>>
>> Glad you are interested in Dynamometer. Erik (CC-ed) is actively working
>> on this project right now, I'll let him elaborate.
>> Erik, you should probably respond on Apache dev list, as I think it could
>> be interesting for other people as well, asince we planned to open source
>> it. You can fork the "About 2.7.4 Release" thread with a new subject and
>> give some details about Dynamometer there.
>>
>> Thanks,
>> --Konstantin
>>
>> On Wed, Jul 19, 2017 at 1:40 AM, 何天一  wrote:
>>
>>> Hi, Shvachko.
>>>
>>> You mentioned an internal tool called Dynamometer to test NameNode
>>> performance earlier in the 2.7.4 release thread.
>>> I wonder if you could share some ideas behind the tool. Or is there a
>>> plan to bring Dynamometer to the open source community?
>>>
>>> Thanks.
>>>
>>> BR,
>>> Tianyi
>>>
>>> On Fri, Jul 14, 2017 at 8:45 AM Konstantin Shvachko 
>>> wrote:
>>>
 Hi everybody.

 We have been doing some internal testing of Hadoop 2.7.4. The testing is
 going well.
 Did not find any major issues on our workloads.
 Used an internal tool called Dynamometer to check NameNode performance on
 real cluster traces. Good.
 Overall test cluster performance looks good.
 Some more testing is still going on.

 I plan to build an RC next week, if there are no objections.

 Thanks,
 --Konst

 On Thu, Jun 15, 2017 at 4:42 PM, Konstantin Shvachko <
 shv.had...@gmail.com>
 wrote:

 > Hey guys.
 >
 > An update on 2.7.4 progress.
 > We are down to 4 blockers. There is some work remaining on those.
 > https://issues.apache.org/jira/browse/HDFS-11896?filter=12340814
 > Would be good if people could follow up on review comments.
 >
 > I looked through nightly Jenkins build results for 2.7.4 both on Apache
 > Jenkins and internal.
 > Some tests fail intermittently, but there are no consistent failures. I
 > filed HDFS-11985 to track some of them.
 > https://issues.apache.org/jira/browse/HDFS-11985
 > I do not currently consider these failures as blockers. LMK if some of
 > them are.
 >
 > We started internal testing of branch-2.7 on one of our smallish (100+
 > nodes) test clusters.
 > Will update on the results.
 >
 > There is a plan to enable BigTop for 2.7.4 testing.
 >
 > Akira, Brahma thank you for setting up a wiki page for 2.7.4 release.
 > Thank you everybody for contributing to this effort.
 >
 > Regards,
 > --Konstantin
 >
 >
 > On Tue, May 30, 2017 at 12:08 AM, Akira Ajisaka 
 > wrote:
 >
 >> Sure.
 >> If you want to edit the wiki, please tell me your ASF confluence
 account.
 >>
 >> -Akira
 >>
 >> On 2017/05/30 15:31, Rohith Sharma K S wrote:
 >>
  >>> A couple more JIRAs need to be backported for the 2.7.4 release. These
  >>> will solve RM HA instability issues.
 >>> https://issues.apache.org/jira/browse/YARN-5333
 >>> 

Re: Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-04-17 Thread Anu Engineer
Hi Allen, 

https://issues.apache.org/jira/browse/INFRA-13902

That happened with the ozone branch too. It was an inadvertent force push. 
Infra has advised us to force-push the latest branch if you have it.

Thanks
Anu


On 4/17/17, 7:10 AM, "Allen Wittenauer"  wrote:

>Looks like someone reset HEAD back to Mar 31. 
>
>Sent from my iPad
>
>> On Apr 16, 2017, at 12:08 AM, Apache Jenkins Server 
>>  wrote:
>> 
>> For more details, see 
>> https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/378/
>> 
>> 
>> 
>> 
>> 
>> -1 overall
>> 
>> 
>> The following subsystems voted -1:
>>docker
>> 
>> 
>> Powered by Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org
>> 
>> 
>> 



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha2 RC0

2017-01-20 Thread Anu Engineer
Hi Andrew, 

Thank you for all the hard work. I am really excited to see us making progress 
towards a 3.0 release.

+1 (Non-Binding)

1. Deployed the downloaded bits on a 4-node cluster with 1 Namenode and 3 
datanodes.
2. Verified all normal HDFS operations like create directory, create file, 
delete file, etc.
3. Ran MapReduce jobs - Pi and Wordcount.
4. Verified that the Hadoop version command output is correct.

Thanks
Anu


On 1/20/17, 2:36 PM, "Andrew Wang"  wrote:

>Hi all,
>
>With heartfelt thanks to many contributors, the RC0 for 3.0.0-alpha2 is
>ready.
>
>3.0.0-alpha2 is the second alpha in the planned 3.0.0 release line leading
>up to a 3.0.0 GA. It comprises 857 fixes, improvements, and new features
>since alpha1 was released on September 3rd, 2016.
>
>More information about the 3.0.0 release plan can be found here:
>
>https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.0.0+release
>
>The artifacts can be found here:
>
>http://home.apache.org/~wang/3.0.0-alpha2-RC0/
>
>This vote will run 5 days, ending on 01/25/2017 at 2PM pacific.
>
>I ran basic validation with a local pseudo cluster and a Pi job. RAT output
>was clean.
>
>My +1 to start.
>
>Thanks,
>Andrew



Re: [VOTE] Merge HADOOP-13341

2016-09-09 Thread Anu Engineer
>  SUBCOMMAND is one of:
>
>
>Clients:
>   cacheadmin   configure the HDFS cache
>   classpath    prints the class path needed to get the hadoop jar
>                and the required libraries
>   crypto   configure HDFS encryption zones
>   ...
>
>Daemons:
>   balancer run a cluster balancing utility
>   datanode run a DFS datanode
>   namenode run the DFS name node
>...
>---snip---


Absolutely, that is great output: very clear, and it provides a very good user 
experience.

Thanks
Anu


On 9/9/16, 3:06 PM, "Allen Wittenauer" <a...@effectivemachines.com> wrote:

>
>> On Sep 9, 2016, at 2:15 PM, Anu Engineer <aengin...@hortonworks.com> wrote:
>> 
>> +1, thanks for the effort. It brings a world of consistency to the hadoop 
>> vars; and, as usual, reading your bash code was very educational.
>
>   Thanks!
>
>   There's still a handful of HDFS and MAPRED vars that begin with HADOOP, 
> but those should be trivial to knock out after a pattern has been established.
>
>> I had a minor suggestion though. Since we have classified the _OPTS into 
>> client and daemon opts, for new people it is hard to know which of these 
>> subcommands are daemons vs. client commands.  Maybe we can add a special 
>> char in the help message to indicate which are daemons, or just document it? 
>> The only way I know right now is to look at the appropriate script and see 
>> if HADOOP_SUBCMD_SUPPORTDAEMONIZATION is set to true.
>
>
>   That's a great suggestion.  Would it be better if the usage output was 
> more like:
>
>---snip---
>Usage: hdfs [OPTIONS] SUBCOMMAND [SUBCOMMAND OPTIONS]
>
>  OPTIONS is none or any of:
>
>--buildpaths   attempt to add class files from build tree
>--config dir   Hadoop config directory
>--daemon (start|status|stop)   operate on a daemon
>--debugturn on shell script debug mode
>--help usage information
>--hostnames list[,of,host,names]   hosts to use in worker mode
>--hosts filename   list of hosts to use in worker mode
>--loglevel level   set the log4j level for this command
>--workers  turn on worker mode
>
>  SUBCOMMAND is one of:
>
>
>Clients:
>   cacheadmin   configure the HDFS cache
>   classpath    prints the class path needed to get the hadoop jar
>                and the required libraries
>   crypto   configure HDFS encryption zones
>   ...
>
>Daemons:
>   balancer run a cluster balancing utility
>   datanode run a DFS datanode
>   namenode run the DFS name node
>...
>---snip---
>
>   We do something similar in Apache Yetus, and it shouldn't be too hard to 
> do in Apache Hadoop. We couldn't read SUPPORTDAEMONIZATION to place things, 
> but as long as people put their new commands in the correct section in 
> hadoop_usage, it should work.
>
>





Re: [VOTE] Merge HADOOP-13341

2016-09-09 Thread Anu Engineer
+1, thanks for the effort. It brings a world of consistency to the hadoop 
vars; and, as usual, reading your bash code was very educational.

I had a minor suggestion though. Since we have classified the _OPTS into 
client and daemon opts, for new people it is hard to know which of these 
subcommands are daemons vs. client commands. Maybe we can add a special char 
in the help message to indicate which are daemons, or just document it? The 
only way I know right now is to look at the appropriate script and see if 
HADOOP_SUBCMD_SUPPORTDAEMONIZATION is set to true.

On 9/7/16, 6:44 AM, "Allen Wittenauer"  wrote:

>
>   I’d like to call for a vote to run for 5 days (ending Mon, Sep 12, 2016, 
> at 7AM PT) to merge the HADOOP-13341 feature branch into trunk. This branch was 
> developed exclusively by me.  As usual with large shell script changes, it's 
> been broken up into several smaller commits to make it easier to read.  The 
> core of the functionality is almost entirely in hadoop-functions.sh with the 
> majority of the rest of the new additions either being documentation or test 
> code. In addition, large swaths of code is removed from the hadoop, hdfs, 
> mapred, and yarn executables.
>
>   Here's a quick summary:
>
>* makes the rules around _OPTS consistent across all the projects
>* makes it possible to provide custom _OPTS for every hadoop, hdfs, mapred, 
>and yarn subcommand
>* with the exception of deprecations, removes all of the custom daemon _OPTS 
>handling sprinkled around the hadoop, hdfs, mapred, and yarn subcommands
>* removes the custom handling of HADOOP_CLIENT_OPTS and makes it 
>consistent for non-daemon subcommands
>* makes the _USER blocker consistent with _OPTS as well as providing better 
>documentation around this feature's existence.  Note that this is an 
>incompatible change against -alpha1.
>* by consolidating all of this code, makes it possible to finally fix a good 
>chunk of the "directory name containing spaces blows up the bash code" 
>problems that have been around since the beginning of the project
>
>   Thanks!
>
>



Re: [DISCUSS] Increased use of feature branches

2016-06-10 Thread Anu Engineer
I actively work on two branches (Diskbalancer and Ozone) and I agree with most 
of what Sangjin said. There is an overhead in working with branches; there are 
both technical costs and administrative issues which discourage developers 
from using branches.

I think the biggest issue with branch-based development is the fact that other 
developers do not use a branch. If a small feature appears as a series of 
commits to “datanode.java”, the branch-based developer ends up paying the 
price of rebasing many times. If everyone followed a model of branch + pull 
request, other branches would not have to deal with continual rebasing onto 
trunk commits. If we are moving to branch-based development, we should 
probably move to that model for most development, to avoid this tax on the 
people who actually end up working in the branches.

I do have a question in my mind though. What is being proposed is that we move 
active development to branches if the feature is small or incomplete, while 
keeping the trunk open for check-ins. One of the biggest reasons we check in 
to trunk and not to branch-2 is that a change will break backward 
compatibility. So do we have an expectation of backward compatibility through 
the 3.0-alpha series? (I personally vote no, since 3.0 is experimental at this 
stage.) If we decide to support some sort of backward compatibility, then 
willy-nilly committing to trunk while still maintaining the expectation that 
we can release alphas from 3.0 does not look possible.

And then comes the question: once 3.0 becomes official, where do we check in a 
change that would break something? This leads us back to trunk being the 
unstable branch and 3.0 being the new “branch-2”.

One more point: if we are always going to use a branch, then we are looking at 
a model similar to a git + pull request workflow. If that is so, would it make 
sense to modify the rules to make these branches easier to merge? Say, for 
example, if all commits in a branch have followed the same review and check-in 
policy as trunk, and commits have been made only after a sign-off from a 
committer, would it be possible to merge with a 3-day voting period instead of 
7, or to treat the merge just like today's commits to trunk but with 2 people 
signing off?

What I am suggesting is reducing the administrative overhead of using a 
branch, to encourage the use of branching. Right now it feels like Apache's 
process encourages committing directly to trunk rather than to a branch.

Thanks
Anu


On 6/10/16, 10:50 AM, "sjl...@gmail.com on behalf of Sangjin Lee" 
 wrote:

>Having worked on a major feature in a feature branch, I have some thoughts
>and observations on feature branch development.
>
>IMO feature branch development v. direct commits to trunk in piecemeal is
>really a choice of *granularity*. Do we want a series of fine-grained state
>changes on trunk or fewer coarse-grained chunks of commits on trunk?
>
>This makes me favor a branch-based development model for any "decent-sized"
>features (we'll need to define "decent-sized" of course). Once you have
>coarse-grained changes, it's easier to reason about what made what release
>and in what state. As importantly, it makes it easier to back out a
>complete feature fairly easily if that becomes necessary. My totally
>unscientific suggestion may be: if a feature takes more than a dozen commits
>and longer than a month, we should probably have a bias towards a feature
>branch.
>
>Branch-based development also makes you go faster if your feature is
>larger. I wouldn't do it the other way for timeline service v.2, for example.
>
>That said, feature branches don't come for free. Now the onus is on the
>feature developer to constantly rebase onto trunk to keep the branch
>reasonably integrated. More logistics are involved for the feature
>developer. Another big question is, when a feature branch gets big and it's
>time to merge, would it get as much scrutiny as a series of individual
>commits? Since the merge can be big, you kind of have to rely on the
>feature committers and those who help them.
>
>In terms of integrating/stabilizing, I don't think branch development
>necessarily makes it harder. It is again granularity. In case of direct
>commits on trunk, you do a lot more fine-grained integrations. In case of
>branch development, you do far fewer coarse-grained integrations via
>rebasing. If more people are doing branch-based development, it makes
>rebasing easier to manage too.
>
>Going back to the related topic of where to release (trunk v. branch-X), I
>think that is more of a proxy for the real question of "how do we maintain
>quality and stability of the trunk?". Even if we release from the trunk, if
>our bar for merging to trunk is low, the quality will not improve
>automatically. So I think we ought to tackle the quality question first.
>
>My 2 cents.
>
>
>On Fri, Jun 10, 2016 at 8:57 AM, Zhe Zhang 

Re: [VOTE] Release Apache Hadoop 2.6.2

2015-10-26 Thread Anu Engineer
+1 (Non-binding)

- Downloaded 2.6.1 and created a cluster with a namenode and a bunch of datanodes.
- Verified that the rolling upgrade and rollback options work correctly in moving 
from 2.6.1 to 2.6.2.

—Anu




On 10/22/15, 2:14 PM, "sjl...@gmail.com on behalf of Sangjin Lee" 
 wrote:

>Hi all,
>
>I have created a release candidate (RC0) for Hadoop 2.6.2.
>
>The RC is available at: http://people.apache.org/~sjlee/hadoop-2.6.2-RC0/
>
>The RC tag in git is: release-2.6.2-RC0
>
>The list of JIRAs committed for 2.6.2:
>https://issues.apache.org/jira/browse/YARN-4101?jql=project%20in%20(HADOOP%2C%20HDFS%2C%20YARN%2C%20MAPREDUCE)%20AND%20fixVersion%20%3D%202.6.2
>
>The maven artifacts are staged at
>https://repository.apache.org/content/repositories/orgapachehadoop-1022/
>
>Please try out the release candidate and vote. The vote will run for 5 days.
>
>Thanks,
>Sangjin