Re: Hadoop 3.1.0 release discussion

2018-01-30 Thread Gangumalla, Uma
Hi Wangda,

Daryn has provided his feedback, and we would like to handle some of it
before the merge.
That will take a couple more days for us to go through review cycles, etc.
For the next couple of days I will be traveling, so I will not be able to put
my effort into it. Meanwhile, Rakesh/Surendra will update here if there are
any changes to the plan.

So, to conclude: you can proceed with cutting the branch, and we will merge
after the reviews are closed; that merge may go into 3.2.

Regards,
Uma

From: Wangda Tan 
Date: Tuesday, January 30, 2018 at 6:24 AM
To: Uma Gangumalla 
Cc: "hdfs-...@hadoop.apache.org" , 
"yarn-...@hadoop.apache.org" , 
"common-dev@hadoop.apache.org" , 
"mapreduce-...@hadoop.apache.org" , Vinod 
Kumar Vavilapalli 
Subject: Re: Hadoop 3.1.0 release discussion

Thanks for the update.

On Tue, Jan 30, 2018 at 4:51 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
Hi Wangda,

Sorry that we have not started vote on 29th. Daryn is reviewing the branch and 
he needs a day for his finalized review, then we have to prioritize the 
comments and decide.
Will keep updated here.

Regards,
Uma


On 1/28/18, 5:31 PM, "Wangda Tan" <wheele...@gmail.com> wrote:

Hi Uma,

Thanks, I saw HDFS-13050 has been resolved 4 hours ago, I don't see any
other blockers under HDFS-10285. I think you guys should be able to start
voting thread in time for merging to trunk.

- Wangda

On Mon, Jan 29, 2018 at 3:12 AM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:

> Hi Wangda,
>
>
>
> Sorry for the delay.
>
>
>
> >>* (Uma) HDFS-10285: HDFS SPS. There're two remaining blockers:
> HDFS-12995/HDFS-13050. @Uma could you update what's the ETA of the two
> JIRAs?
>
> We have only one blocker now that is HDFS-13050 and we finished the key
> implementation from HDFS-12995, by HDFS-13075.
>
> We are planning to start vote by tomorrow (29th PST time). So, we request
> you to give us time for running vote. We will keep SPS off by default. So,
> interested users only can enable explicitly.
>
>
>
> Regards,
>
> Uma
>
>
>
> *From: *Wangda Tan <wheele...@gmail.com>
> *Date: *Friday, January 26, 2018 at 5:21 PM
> *To: *Uma Gangumalla <uma.ganguma...@intel.com>
> *Cc: *"hdfs-...@hadoop.apache.org" <hdfs-...@hadoop.apache.org>,
> "yarn-...@hadoop.apache.org" <yarn-...@hadoop.apache.org>,
> "common-dev@hadoop.apache.org" <common-dev@hadoop.apache.org>,
> "mapreduce-...@hadoop.apache.org" <mapreduce-...@hadoop.apache.org>, Vinod
> Kumar Vavilapalli <vino...@hortonworks.com>
> *Subject: *Re: Hadoop 3.1.0 release discussion
>
>
>
> Hi All,
>
>
>
> Just a reminder about feature freeze date.
>
>
>
> Feature freeze date for 3.1.0 release is Jan 30 PST (about 4 days from
> today), If you've any features which live in a branch and targeted to
> 3.1.0, please reply this email thread. Ideally, we should finish branch
> merging before feature freeze date.
>
>
>
> Here's an updated 3.1.0 feature status:
>
>
>
> 1. Merged features:
>
> * (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
>
> * (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works
> end-to-end.
>
> * (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
>
> * (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename
> commits.
>
> * (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf
> Queues While Doing Queue Mapping.
>
> * (Chris Douglas) HDFS-9806: HDFS Tiered Storage.
>
> * (Zhankun) YARN-5983: FPGA support. Majority implementations completed
> and merged to trunk. Except for UI/documentation.
>
>
>
> 2. Features close to finish:
>
> * (Uma) HDFS-10285: HDFS SPS. There're two remaining
> blockers: HDFS-12995/HDFS-13050. @Uma could you update what's the ETA of
> the two JIRAs?
>
> * (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest and
> anti-affinity support. (Voting thread started).
>
>
>
> 3. Tentative features:
>
&

Re: Hadoop 3.1.0 release discussion

2018-01-30 Thread Gangumalla, Uma
Hi Wangda,

Sorry that we have not started the vote on the 29th. Daryn is reviewing the
branch and needs a day to finalize his review; then we have to prioritize the
comments and decide.
We will keep this thread updated.

Regards,
Uma


On 1/28/18, 5:31 PM, "Wangda Tan"  wrote:

Hi Uma,

Thanks, I saw HDFS-13050 has been resolved 4 hours ago, I don't see any
other blockers under HDFS-10285. I think you guys should be able to start
voting thread in time for merging to trunk.

- Wangda

On Mon, Jan 29, 2018 at 3:12 AM, Gangumalla, Uma 
wrote:

> Hi Wangda,
>
>
>
> Sorry for the delay.
>
>
>
> >>* (Uma) HDFS-10285: HDFS SPS. There're two remaining blockers:
> HDFS-12995/HDFS-13050. @Uma could you update what's the ETA of the two
> JIRAs?
>
> We have only one blocker now that is HDFS-13050 and we finished the key
> implementation from HDFS-12995, by HDFS-13075.
>
> We are planning to start vote by tomorrow (29th PST time). So, we request
> you to give us time for running vote. We will keep SPS off by default. So,
> interested users only can enable explicitly.
>
>
>
> Regards,
>
> Uma
>
>
>
> *From: *Wangda Tan 
> *Date: *Friday, January 26, 2018 at 5:21 PM
> *To: *Uma Gangumalla 
> *Cc: *"hdfs-...@hadoop.apache.org" , "
> yarn-...@hadoop.apache.org" , "
> common-dev@hadoop.apache.org" , "
> mapreduce-...@hadoop.apache.org" , Vinod
> Kumar Vavilapalli 
> *Subject: *Re: Hadoop 3.1.0 release discussion
>
>
>
> Hi All,
>
>
>
> Just a reminder about feature freeze date.
>
>
>
> Feature freeze date for 3.1.0 release is Jan 30 PST (about 4 days from
> today), If you've any features which live in a branch and targeted to
> 3.1.0, please reply this email thread. Ideally, we should finish branch
> merging before feature freeze date.
>
>
>
> Here's an updated 3.1.0 feature status:
>
>
>
> 1. Merged features:
>
> * (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
>
> * (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works
> end-to-end.
>
> * (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
>
> * (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename
> commits.
>
> * (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf
> Queues While Doing Queue Mapping.
>
> * (Chris Douglas) HDFS-9806: HDFS Tiered Storage.
>
> * (Zhankun) YARN-5983: FPGA support. Majority implementations completed
> and merged to trunk. Except for UI/documentation.
>
>
>
> 2. Features close to finish:
>
> * (Uma) HDFS-10285: HDFS SPS. There're two remaining
> blockers: HDFS-12995/HDFS-13050. @Uma could you update what's the ETA of
> the two JIRAs?
>
> * (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest and
> anti-affinity support. (Voting thread started).
>
>
>
> 3. Tentative features:
>
> * (Arun Suresh). YARN-5972: Support pausing/freezing opportunistic
> containers. Only one pending patch. Plan to finish before Jan 7th.
>
> * (Haibo Chen). YARN-1011: Resource overcommitment. Looks challenging to
> be done before Jan 2018.
>
> * (Anu): HDFS-7240: Ozone. Given the discussion on HDFS-7240. Looks
> challenging to be done before Jan 2018.
>
> * (Varun V) YARN-5673: container-executor write. Given security
> refactoring of c-e (YARN-6623) is already landed, IMHO other stuff may be
> moved to 3.2.
>
>
>
> Thanks,
>
> Wangda
>
>
>
>
>
> On Mon, Jan 22, 2018 at 1:49 PM, Gangumalla, Uma 

> wrote:
>
> Sure, Wangda.
>
> Regards,
> Uma
>
>
> On 1/18/18, 10:19 AM, "Wangda Tan"  wrote:
>
> Thanks Uma,
>
> Could you update this thread once the merge vote started?
>
> Best,
> Wangda
>
> On Wed, Jan 17, 2018 at 4:30 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
>
> > HI Wangda,
> >
> >  Thank you for the head-up mail.
> >  We are in the branch (HDFS-10285) and trying to push the tasks
> 

Re: Hadoop 3.1.0 release discussion

2018-01-28 Thread Gangumalla, Uma
Hi Wangda,

Sorry for the delay.

>>* (Uma) HDFS-10285: HDFS SPS. There're two remaining blockers: 
>>HDFS-12995/HDFS-13050. @Uma could you update what's the ETA of the two JIRAs?
We have only one blocker now, HDFS-13050, and we have finished the key
implementation from HDFS-12995 via HDFS-13075.
We are planning to start the vote by tomorrow (the 29th, PST), so we request
that you give us time to run the vote. We will keep SPS off by default, so
only interested users can enable it explicitly.

Regards,
Uma

From: Wangda Tan 
Date: Friday, January 26, 2018 at 5:21 PM
To: Uma Gangumalla 
Cc: "hdfs-...@hadoop.apache.org" , 
"yarn-...@hadoop.apache.org" , 
"common-dev@hadoop.apache.org" , 
"mapreduce-...@hadoop.apache.org" , Vinod 
Kumar Vavilapalli 
Subject: Re: Hadoop 3.1.0 release discussion

Hi All,

Just a reminder about feature freeze date.

Feature freeze date for 3.1.0 release is Jan 30 PST (about 4 days from today), 
If you've any features which live in a branch and targeted to 3.1.0, please 
reply this email thread. Ideally, we should finish branch merging before 
feature freeze date.

Here's an updated 3.1.0 feature status:

1. Merged features:
* (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
* (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works 
end-to-end.
* (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
* (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename commits.
* (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf Queues 
While Doing Queue Mapping.
* (Chris Douglas) HDFS-9806: HDFS Tiered Storage.
* (Zhankun) YARN-5983: FPGA support. Majority implementations completed and 
merged to trunk. Except for UI/documentation.

2. Features close to finish:
* (Uma) HDFS-10285: HDFS SPS. There're two remaining blockers: 
HDFS-12995/HDFS-13050. @Uma could you update what's the ETA of the two JIRAs?
* (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest and 
anti-affinity support. (Voting thread started).

3. Tentative features:
* (Arun Suresh). YARN-5972: Support pausing/freezing opportunistic containers. 
Only one pending patch. Plan to finish before Jan 7th.
* (Haibo Chen). YARN-1011: Resource overcommitment. Looks challenging to be 
done before Jan 2018.
* (Anu): HDFS-7240: Ozone. Given the discussion on HDFS-7240. Looks challenging 
to be done before Jan 2018.
* (Varun V) YARN-5673: container-executor write. Given security refactoring of 
c-e (YARN-6623) is already landed, IMHO other stuff may be moved to 3.2.

Thanks,
Wangda


On Mon, Jan 22, 2018 at 1:49 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
Sure, Wangda.

Regards,
Uma

On 1/18/18, 10:19 AM, "Wangda Tan" <wheele...@gmail.com> wrote:

Thanks Uma,

Could you update this thread once the merge vote started?

Best,
Wangda

On Wed, Jan 17, 2018 at 4:30 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:

> HI Wangda,
>
>  Thank you for the head-up mail.
>  We are in the branch (HDFS-10285) and trying to push the tasks sooner
> before the deadline.
>
> Regards,
> Uma
>
> On 1/17/18, 11:35 AM, "Wangda Tan" <wheele...@gmail.com> wrote:
>
> Hi All,
>
> Since we're fast approaching previously proposed feature freeze date
> (Jan
> 30, about 13 days from today). If you've any features which live in a
> branch and targeted to 3.1.0, please reply this email thread. Ideally,
> we
> should finish branch merging before feature freeze date.
>
> Here's an updated 3.1.0 feature status:
>
> 1. Merged & Completed features:
> * (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
> * (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works
> end-to-end.
> * (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
> * (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename
> commits.
> * (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf
> Queues While Doing Queue Mapping.
> * (Chris Douglas) HDFS-9806: HDFS Tiered Storage.
>
> 2. Features close to finish:
> * (Zhankun) YARN-5983: FPGA support. Majority implementations
> completed and
> merged to trunk. Except for UI/documentation.
> * (Uma) HDFS-10285: HDFS SPS. Majority implementations are done, some
> discussions going on about implementation.
> * (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest 
and
> anti-affinity support. Close to finish, on track to b

Re: Hadoop 3.1.0 release discussion

2018-01-21 Thread Gangumalla, Uma
Sure, Wangda.

Regards,
Uma

On 1/18/18, 10:19 AM, "Wangda Tan"  wrote:

Thanks Uma,

Could you update this thread once the merge vote started?

Best,
Wangda

On Wed, Jan 17, 2018 at 4:30 PM, Gangumalla, Uma 
wrote:

> HI Wangda,
>
>  Thank you for the head-up mail.
>  We are in the branch (HDFS-10285) and trying to push the tasks sooner
> before the deadline.
>
> Regards,
> Uma
>
> On 1/17/18, 11:35 AM, "Wangda Tan"  wrote:
>
> Hi All,
>
> Since we're fast approaching previously proposed feature freeze date
> (Jan
> 30, about 13 days from today). If you've any features which live in a
> branch and targeted to 3.1.0, please reply this email thread. Ideally,
> we
> should finish branch merging before feature freeze date.
>
> Here's an updated 3.1.0 feature status:
>
> 1. Merged & Completed features:
> * (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
> * (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works
> end-to-end.
> * (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
> * (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename
> commits.
> * (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf
> Queues While Doing Queue Mapping.
> * (Chris Douglas) HDFS-9806: HDFS Tiered Storage.
>
> 2. Features close to finish:
> * (Zhankun) YARN-5983: FPGA support. Majority implementations
> completed and
> merged to trunk. Except for UI/documentation.
> * (Uma) HDFS-10285: HDFS SPS. Majority implementations are done, some
> discussions going on about implementation.
> * (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest 
and
> anti-affinity support. Close to finish, on track to be merged before
> Jan 30.
>
> 3. Tentative features:
> * (Arun Suresh). YARN-5972: Support pausing/freezing opportunistic
> containers. Only one pending patch. Plan to finish before Jan 7th.
> * (Haibo Chen). YARN-1011: Resource overcommitment. Looks challenging
> to be
> done before Jan 2018.
> * (Anu): HDFS-7240: Ozone. Given the discussion on HDFS-7240. Looks
> challenging to be done before Jan 2018.
> * (Varun V) YARN-5673: container-executor write. Given security
> refactoring
> of c-e (YARN-6623) is already landed, IMHO other stuff may be moved to
> 3.2.
>
> Thanks,
> Wangda
>
>
>
>
> On Fri, Dec 15, 2017 at 1:20 PM, Wangda Tan 
> wrote:
>
> > Hi all,
> >
> > Congratulations on the 3.0.0-GA release!
> >
> > As we discussed in the previous email thread [1], I'd like to 
restart
> > 3.1.0 release plans.
> >
> > a) Quick summary:
> > a.1 Release status
> > We started 3.1 release discussion on Sep 6, 2017 [1]. As of today,
> > there’re 232 patches loaded on 3.1.0 alone [2], besides 6 open
> blockers and
> > 22 open critical issues.
> >
> > a.2 Release date update
> > Considering delays of 3.0-GA release by month-and-a-half, I propose
> to
> > move the dates as follows
> >  - feature freeze date from Dec 15, 2017, to Jan 30, 2018 - last
> date for
> > any branches to get merged too;
> >  - code freeze (blockers & critical only) date to Feb 08, 2018;
> >  - release voting start by Feb 18, 2018, leaving time for at least
> two RCx
> >  - release date from Jan 15, 2018, to Feb 28, 2018;
> >
> > Unlike before, I added an additional milestone for
> release-vote-start so
> > that we can account for voting time-period also.
> >
> > This overall is still 5 1/2 months of release-timeline unlike the
> faster
> > cadence we hoped for, but this, in my opinion, is the best-updated
> timeline
> > given the delays of the final release of 3.0-GA.
> >
> > b) Individual feature status:
> > I spoke to several feature owners and checked the status of
> un-finished
> > features, following are status of features planned to 3.1.0:
> >
> > b.1 Merged & Completed feat

Re: Hadoop 3.1.0 release discussion

2018-01-17 Thread Gangumalla, Uma
Hi Wangda,

 Thank you for the heads-up mail.
 We are working in the branch (HDFS-10285) and trying to push the tasks in
before the deadline.

Regards,
Uma

On 1/17/18, 11:35 AM, "Wangda Tan"  wrote:

Hi All,

Since we're fast approaching previously proposed feature freeze date (Jan
30, about 13 days from today). If you've any features which live in a
branch and targeted to 3.1.0, please reply this email thread. Ideally, we
should finish branch merging before feature freeze date.

Here's an updated 3.1.0 feature status:

1. Merged & Completed features:
* (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
* (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works
end-to-end.
* (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
* (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename commits.
* (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf
Queues While Doing Queue Mapping.
* (Chris Douglas) HDFS-9806: HDFS Tiered Storage.

2. Features close to finish:
* (Zhankun) YARN-5983: FPGA support. Majority implementations completed and
merged to trunk. Except for UI/documentation.
* (Uma) HDFS-10285: HDFS SPS. Majority implementations are done, some
discussions going on about implementation.
* (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest and
anti-affinity support. Close to finish, on track to be merged before Jan 30.

3. Tentative features:
* (Arun Suresh). YARN-5972: Support pausing/freezing opportunistic
containers. Only one pending patch. Plan to finish before Jan 7th.
* (Haibo Chen). YARN-1011: Resource overcommitment. Looks challenging to be
done before Jan 2018.
* (Anu): HDFS-7240: Ozone. Given the discussion on HDFS-7240. Looks
challenging to be done before Jan 2018.
* (Varun V) YARN-5673: container-executor write. Given security refactoring
of c-e (YARN-6623) is already landed, IMHO other stuff may be moved to 3.2.

Thanks,
Wangda




On Fri, Dec 15, 2017 at 1:20 PM, Wangda Tan  wrote:

> Hi all,
>
> Congratulations on the 3.0.0-GA release!
>
> As we discussed in the previous email thread [1], I'd like to restart
> 3.1.0 release plans.
>
> a) Quick summary:
> a.1 Release status
> We started 3.1 release discussion on Sep 6, 2017 [1]. As of today,
> there’re 232 patches loaded on 3.1.0 alone [2], besides 6 open blockers 
and
> 22 open critical issues.
>
> a.2 Release date update
> Considering delays of 3.0-GA release by month-and-a-half, I propose to
> move the dates as follows
>  - feature freeze date from Dec 15, 2017, to Jan 30, 2018 - last date for
> any branches to get merged too;
>  - code freeze (blockers & critical only) date to Feb 08, 2018;
>  - release voting start by Feb 18, 2018, leaving time for at least two RCx
>  - release date from Jan 15, 2018, to Feb 28, 2018;
>
> Unlike before, I added an additional milestone for release-vote-start so
> that we can account for voting time-period also.
>
> This overall is still 5 1/2 months of release-timeline unlike the faster
> cadence we hoped for, but this, in my opinion, is the best-updated 
timeline
> given the delays of the final release of 3.0-GA.
>
> b) Individual feature status:
> I spoke to several feature owners and checked the status of un-finished
> features, following are status of features planned to 3.1.0:
>
> b.1 Merged & Completed features:
> * (Sunil) YARN-5881: Support absolute value in CapacityScheduler.
> * (Wangda) YARN-6223: GPU support on YARN. Features in trunk and works
> end-to-end.
> * (Jian) YARN-5079,YARN-4793,YARN-4757,YARN-6419 YARN native services.
> * (Steve Loughran): HADOOP-13786: S3Guard committer for zero-rename
> commits.
> * (Suma): YARN-7117: Capacity Scheduler: Support Auto Creation of Leaf
> Queues While Doing Queue Mapping.
>
> b.2 Features close to finish:
> * (Chris Douglas) HDFS-9806: HDFS Tiered Storage. Being voting now.
> * (Zhankun) YARN-5983: FPGA support. Majority implementations completed
> and merged to trunk. Except for UI/documentation.
> * (Uma) HDFS-10285: HDFS SPS. Majority implementations are done, some
> discussions going on about implementation.
>
> b.3 Tentative features:
> * (Arun Suresh). YARN-5972: Support pausing/freezing opportunistic
> containers. Only one pending patch. Plan to finish before Jan 7th.
> * (Haibo Chen). YARN-1011: Resource overcommitment. Looks challenging to
> be done before Jan 2018.
> * (Arun Suresh / Kostas / Wangda). YARN-6592: New SchedulingRequest and
> anti-affinity support. Tentative will figure out by Jan 1st.
> * (Anu): HDFS-7240: Ozone. Given the discussion on HDFS-

Re: [VOTE] Release Apache Hadoop 3.0.0 RC1

2017-12-13 Thread Gangumalla, Uma
Here is my +1 (binding) too.
Sorry for the late vote.

- Verified signatures of the source tarball.
- Built from source.
- Set up a 2-node test cluster.
- Tested via HDFS commands and the Java API: wrote a bunch of files and read
  them back.
- Ran a basic MR job.
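
(For reference, a minimal sketch of the kind of write/read round-trip check
mentioned above, using the standard HDFS FileSystem Java API. The path and
payload below are made up for illustration, and it assumes fs.defaultFS in
the client configuration points at the test cluster.)

    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsRoundTripCheck {
      public static void main(String[] args) throws Exception {
        // Picks up core-site.xml/hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
          Path p = new Path("/tmp/rc-check/file-0");   // hypothetical test path
          byte[] payload = "hello hadoop 3".getBytes(StandardCharsets.UTF_8);

          // Write the file (overwrite if it exists).
          try (FSDataOutputStream out = fs.create(p, true)) {
            out.write(payload);
          }

          // Read it back and compare.
          byte[] readBack = new byte[payload.length];
          try (FSDataInputStream in = fs.open(p)) {
            in.readFully(readBack);
          }
          System.out.println("round trip ok: " + Arrays.equals(payload, readBack));
        }
      }
    }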

Thanks, Andrew and others, for the hard work in getting Hadoop 3.0 out.

Regards,
Uma

On 12/13/17, 1:05 PM, "Andrew Wang"  wrote:

Hi folks,

To close this out, the vote passes successfully with 13 binding +1s, 5
non-binding +1s, and no -1s. Thanks everyone for voting! I'll work on
staging.

I'm hoping we can address YARN-7588 and any remaining rolling upgrade
issues in 3.0.x maintenance releases. Beyond a wiki page, it would be
really great to get JIRAs filed and targeted for tracking as soon as
possible.

Vinod, what do you think we need to do regarding caveating rolling upgrade
support? We haven't advertised rolling upgrade support between major
releases outside of dev lists and JIRA. As a new major release, our compat
guidelines allow us to break compatibility, so I don't think it's expected
by users.

Best,
Andrew

On Wed, Dec 13, 2017 at 12:37 PM, Vinod Kumar Vavilapalli <
vino...@apache.org> wrote:

> I was waiting for Daniel to post the minutes from YARN meetup to talk
> about this. Anyways, in that discussion, we identified a bunch of key
> upgrade related scenarios that no-one seems to have validated - atleast
> from the representation in the YARN meetup. I'm going to create a 
wiki-page
> listing all these scenarios.
>
> But back to the bug that Junping raised. At this point, we don't have a
> clear path towards running 2.x applications on 3.0.0 clusters. So, our
> claim of rolling-upgrades already working is not accurate.
>
> One of the two options that Junping proposed should be pursued before we
> close the release. I'm in favor of calling out rolling-upgrade support be
> with-drawn or caveated and push for progress instead of blocking the
> release.
>
> Thanks
> +Vinod
>
> > On Dec 12, 2017, at 5:44 PM, Junping Du  wrote:
> >
> > Thanks Andrew for pushing new RC for 3.0.0. I was out last week, just
> get chance to validate new RC now.
> >
> > Basically, I found two critical issues with the same rolling-upgrade
> > scenario where HADOOP-15059 was found previously:
> > HDFS-12920, we changed value format for some hdfs configurations that
> old version MR client doesn't understand when fetching these
> configurations. Some quick workarounds are to add old value (without time
> unit) in hdfs-site.xml to override new default values but will generate
> many annoying warnings. I provided my fix suggestions on the JIRA already
> for more discussion.
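
(A hedged illustration of the incompatibility and workaround described above,
not code from the thread: the key "some.hdfs.interval" is a placeholder, not
a real HDFS property. Newer clients parse time-unit suffixes via
Configuration.getTimeDuration(), while an older client calling getLong() on
the same value fails, which is why overriding the default with a plain number
in hdfs-site.xml works around it.)

    import java.util.concurrent.TimeUnit;
    import org.apache.hadoop.conf.Configuration;

    public class TimeUnitConfigSketch {
      public static void main(String[] args) {
        Configuration conf = new Configuration(false);
        // Placeholder key; new-style value carrying a time unit.
        conf.set("some.hdfs.interval", "30s");

        // A unit-aware client parses the suffix explicitly.
        long millis = conf.getTimeDuration("some.hdfs.interval", 0, TimeUnit.MILLISECONDS);
        System.out.println("parsed as millis: " + millis);   // 30000

        // An older client still calling getLong() cannot parse "30s"; hence
        // the workaround of overriding the value with a plain number such as
        // "30000" in hdfs-site.xml.
        try {
          conf.getLong("some.hdfs.interval", 0);
        } catch (NumberFormatException e) {
          System.out.println("old-style lookup fails: " + e);
        }
      }
    }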
> > The other one is YARN-7646. After we workaround HDFS-12920, will hit the
> issue that old version MR AppMaster cannot communicate with new version of
> YARN RM - could be related to resource profile changes from YARN side but
> root cause are still in investigation.
> >
> > The first issue may not belong to a blocker given we can workaround this
> without code change. I am not sure if we can workaround 2nd issue so far.
> If not, we may have to fix this or compromise with withdrawing support of
> rolling upgrade or calling it a stable release.
> >
> >
> > Thanks,
> >
> > Junping
> >
> > 
> > From: Robert Kanter 
> > Sent: Tuesday, December 12, 2017 3:10 PM
> > To: Arun Suresh
> > Cc: Andrew Wang; Lei Xu; Wei-Chiu Chuang; Ajay Kumar; Xiao Chen; Aaron
> T. Myers; common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org;
> yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> > Subject: Re: [VOTE] Release Apache Hadoop 3.0.0 RC1
> >
> > +1 (binding)
> >
> > + Downloaded the binary release
> > + Deployed on a 3 node cluster on CentOS 7.3
> > + Ran some MR jobs, clicked around the UI, etc
> > + Ran some CLI commands (yarn logs, etc)
> >
> > Good job everyone on Hadoop 3!
> >
> >
> > - Robert
> >
> > On Tue, Dec 12, 2017 at 1:56 PM, Arun Suresh  wrote:
> >
> >> +1 (binding)
> >>
> >> - Verified signatures of the source tarball.
> >> - built from source - using the docker build environment.
> >> - set up a pseudo-distributed test cluster.
> >> - ran basic HDFS commands
> >> - ran some basic MR jobs
> >>
> >> Cheers
> >> -Arun
> >>
> >> On Tue, Dec 12, 2017 at 1:52 PM, Andrew Wang 
> >> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> As a reminder, this vote closes tomorrow at 12:31pm, so please give it
> a
> >>> whack if you have time. There are already enough binding +1s to pass
> this
> >>> vote, but it'd be great to get additional v

Re: [DISCUSS] Branches and versions for Hadoop 3

2017-08-25 Thread Gangumalla, Uma
Plan looks good to me.

+1

Regards,
Uma

On 8/25/17, 10:36 AM, "Andrew Wang"  wrote:

>Hi folks,
>
>With 3.0.0-beta1 fast approaching, I wanted to go over the proposed
>branching strategy.
>
>In the early 2.x days, moving trunk immediately to 3.0.0 was a mistake.
>branch-2 and trunk were virtually identical, and increased backport
>complexity. Until we need to make incompatible changes, there's no need
>for
>a Hadoop 4.0 version.
>
>Thus, here's a proposal of branches and versions:
>
>trunk: 3.1.0-SNAPSHOT
>branch-3.0: 3.0.0-beta1-SNAPSHOT
>branch-2 and etc: remain as is
>
>LMK questions/comments/etc. Appreciate your attentiveness; I'm hoping to
>build consensus quickly since we have a number of open VOTEs for branch
>merges.
>
>Thanks,
>Andrew


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] HADOOP-12756 - Aliyun OSS Support branch merge

2016-10-05 Thread Gangumalla, Uma
+1 (binding)

Regards,
Uma

On 9/27/16, 7:35 PM, "Zheng, Kai"  wrote:

>Hi all,
>
>I would like to propose a merge vote for HADOOP-12756 branch to trunk.
>This branch develops support for Aliyun OSS (another cloud storage) in
>Hadoop.
>
>The voting starts now and will run for 7 days till Oct 5, 2016 07:00 PM
>PDT.
>
>Aliyun OSS is widely used among China's cloud users, and currently it is
>not easy to access data in Aliyun OSS from Hadoop. The branch develops a
>new module hadoop-aliyun and provides support for accessing data in
>Aliyun OSS cloud storage, which will enable more use cases and bring
>better use experience for Hadoop users. Like the existing s3a support,
>AliyunOSSFileSystem a new implementation of FileSystem backed by Aliyun
>OSS is provided. During the implementation, the contributors refer to the
>s3a support, keeping the same coding style and project structure.
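
(A hedged sketch, not from the proposal, of how such a FileSystem-backed
connector is typically used from client code: the bucket name below is a
placeholder, and the OSS credentials/endpoint are assumed to be configured
elsewhere, e.g. in core-site.xml, just as with the s3a connector.)

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class OssListingSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        URI uri = URI.create("oss://my-bucket/");   // "my-bucket" is a placeholder
        // Resolves to the OSS-backed FileSystem implementation when the
        // hadoop-aliyun module is on the classpath and configured.
        try (FileSystem fs = FileSystem.get(uri, conf)) {
          for (FileStatus status : fs.listStatus(new Path(uri))) {
            System.out.println(status.getPath());
          }
        }
      }
    }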
>
>. The updated architecture document is here.
>   
>[https://issues.apache.org/jira/secure/attachment/12829541/Aliyun-OSS-inte
>gration-v2.pdf]
>
>. The merge patch that is a diff against trunk is posted here, which
>builds cleanly with manual testing results posted in HADOOP-13584.
>   
>[https://issues.apache.org/jira/secure/attachment/12829738/HADOOP-13584.00
>4.patch]
>
>. The user documentation is also provided as part of the module.
>
>HADOOP-12756 has a set of sub-tasks and they are ordered in the same
>sequence as they were committed to HADOOP-12756. Hopefully this will make
>it easier for reviewing.
>
>What I want to emphasize is: this is a fundamental implementation aiming
>at guaranteeing functionality and stability. The major functionality has
>been running in production environments for some while. There're
>definitely performance optimizations that we can do like the community
>have done for the existing s3a and azure supports. Merging this to trunk
>would serve as a very good beginning for the following optimizations
>aligning with the related efforts together.
>
>The new hadoop-aliyun modlue is made possible owing to many people.
>Thanks to the contributors Mingfei Shi, Genmao Yu and Ling Zhou; thanks
>to Cheng Hao, Steve Loughran, Chris Nauroth, Yi Liu, Lei (Eddy) Xu, Uma
>Maheswara Rao G and Allen Wittenauer for their kind reviewing and
>guidance. Also thanks Arpit Agarwal, Andrew Wang and Anu Engineer for the
>great process discussions to bring this up.
>
>Please kindly vote. Thanks in advance!
>
>Regards,
>Kai
>


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 3.0.0-alpha1 RC0

2016-08-31 Thread Gangumalla, Uma
+1 (binding).

Overall it's a great effort, Andrew. Thank you for putting in all the energy.

- Downloaded and built.
- Ran some sample jobs.

I would love to see all these efforts lead to a GA from Hadoop 3.x soon.

Regards,
Uma


On 8/30/16, 8:51 AM, "Andrew Wang"  wrote:

>Hi all,
>
>Thanks to the combined work of many, many contributors, here's an RC0 for
>3.0.0-alpha1:
>
>http://home.apache.org/~wang/3.0.0-alpha1-RC0/
>
>alpha1 is the first in a series of planned alpha releases leading up to
>GA.
>The objective is to get an artifact out to downstreams for testing and to
>iterate quickly based on their feedback. So, please keep that in mind when
>voting; hopefully most issues can be addressed by future alphas rather
>than
>future RCs.
>
>Sorry for getting this out on a Tuesday, but I'd still like this vote to
>run the normal 5 days, thus ending Saturday (9/3) at 9AM PDT. I'll extend
>if we lack the votes.
>
>Please try it out and let me know what you think.
>
>Best,
>Andrew


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Release numbering for 3.x leading up to GA

2016-05-12 Thread Gangumalla, Uma
Thanks Andrew for driving. Sounds good. Go ahead please.

Good luck :-)

Regards,
Uma

On 5/12/16, 10:52 AM, "Andrew Wang"  wrote:

>Hi all,
>
>Sounds like we have general agreement on this release numbering scheme for
>3.x.
>
>I'm going to attempt some mvn and JIRA invocations to get the version
>numbers lined up for alpha1, wish me luck.
>
>Best,
>Andrew
>
>On Tue, May 3, 2016 at 9:52 AM, Roman Shaposhnik 
>wrote:
>
>> On Tue, May 3, 2016 at 8:18 AM, Karthik Kambatla 
>> wrote:
>> > The naming scheme sounds good. Since we want to start out sooner, I am
>> > assuming we are not limiting ourselves to two alphas as the email
>>might
>> > indicate.
>> >
>> > Also, as the release manager, can you elaborate on your definitions of
>> > alpha and beta? Specifically, when do we expect downstream projects to
>> try
>> > and integrate and when we expect Hadoop users to try out the bits?
>>
>> Not to speak of all the downstream PMC,s but Bigtop project will jump
>> on the first alpha the same way we jumped on the first alpha back
>> in the 1 -> 2 transition period.
>>
>> Given that Bigtop currently integrates quite a bit of Hadoop ecosystem
>> that work is going to produce valuable feedback that we plan to
>>communicate
>> to the individual PMCs. What PMCs do with that feedback, of course, will
>> be up to them (obviously Bigtop can't take the ownership of issues that
>> go outside of integration work between projects in the Hadoop ecoystem).
>>
>> Thanks,
>> Roman.
>>


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] Set minimum version of Hadoop 3 to JDK8 (HADOOP-11858)

2016-05-11 Thread Gangumalla, Uma
+1

Regards,
Uma

On 5/10/16, 2:24 PM, "Andrew Wang"  wrote:

>+1
>
>On Tue, May 10, 2016 at 12:36 PM, Ravi Prakash 
>wrote:
>
>> +1. Thanks for driving this Akira
>>
>> On Tue, May 10, 2016 at 10:25 AM, Tsuyoshi Ozawa 
>>wrote:
>>
>> > > Before cutting 3.0.0-alpha RC, I'd like to drop JDK7 support in
>>trunk.
>> >
>> > Sounds good. To do so, we need to check the blockers of 3.0.0-alpha
>> > RC, especially upgrading all dependencies which use refractions at
>> > first.
>> >
>> > Thanks,
>> > - Tsuyoshi
>> >
>> > On Tue, May 10, 2016 at 8:32 AM, Akira AJISAKA
>> >  wrote:
>> > > Hi developers,
>> > >
>> > > Before cutting 3.0.0-alpha RC, I'd like to drop JDK7 support in
>>trunk.
>> > > Given this is a critical change, I'm thinking we should get the
>> consensus
>> > > first.
>> > >
>> > > One concern I think is, when the minimum version is set to JDK8, we
>> need
>> > to
>> > > configure Jenkins to disable multi JDK test only in trunk.
>> > >
>> > > Any thoughts?
>> > >
>> > > Thanks,
>> > > Akira
>> > >
>> > > 
>>-
>> > > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
>> > > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
>> > >
>> >
>> > -
>> > To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
>> > For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
>> >
>> >
>>


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Accept Chimera as new Apache Commons Component

2016-03-23 Thread Gangumalla, Uma
+1 (non-binding)


+common-dev@hadoop
On 3/21/16, 6:59 PM, "Gangumalla, Uma"  wrote:

>+1 (non-binding)
>
>
>Regards,
>Uma
>
>On 3/21/16, 1:45 AM, "Benedikt Ritter"  wrote:
>
>>Hi all,
>>
>>after long discussions I think we have gathered enough information to
>>decide whether we want to accept the Chimera project as a new Apache
>>Commons component.
>>
>>Proposed name: Apache Commons Crypto
>>Proposal text:
>>https://github.com/intel-hadoop/chimera/blob/master/PROPOSAL.html
>>Initial Code Base:  https://github.com/intel-hadoop/chimera/
>>Initial Committers (Names in alphabetical order):
>>- Aaron T. Myers (a...@apache.org, Apache Hadoop PMC, one of the original
>>Crypto dev team in Apache Hadoop)
>>- Andrew Wang (w...@apache.org, Apache Hadoop PMC, one of the original
>>Crypto dev team in Apache Hadoop)
>>- Chris Nauroth (cnaur...@apache.org, Apache Hadoop PMC and active
>>reviewer)
>>- Colin P. McCabe (cmcc...@apache.org, Apache Hadoop PMC, one of the
>>original Crypto dev team in Apache Hadoop)
>>- Dapeng Sun (s...@apache.org, Apache Sentry Committer, Chimera
>>contributor)
>>- Dian Fu (dia...@apache.org, Apache Sqoop Committer, Chimera
>>contributor)
>>- Dong Chen (do...@apache.org, Apache Hive Committer,interested on
>>Chimera)
>>- Ferdinand Xu (x...@apache.org, Apache Hive Committer, Chimera
>>contributor)
>>- Haifeng Chen (haifengc...@apache.org, Chimera lead and code
>>contributor)
>>- Marcelo Vanzin (Apache Spark Committer, Chimera contributor)
>>- Uma Maheswara Rao G (umamah...@apache.org, Apache Hadoop PMC, One of
>>the
>>original Crypto dev/review team in Apache Hadoop)
>>- Yi Liu (y...@apache.org, Apache Hadoop PMC, One of the original Crypto
>>dev/review team in Apache Hadoop)
>>
>>Please review the proposal and vote.
>>This vote will close no sooner than 72 hours from now, i.e. after 0900
>>GMT 24-Mar 2016
>>
>>  [ ] +1 Accept Chimera as new Apache Commons Component
>>  [ ] +0 OK, but...
>>  [ ] -0 OK, but really should fix...
>>  [ ] -1 I oppose this because...
>>
>>Thank you!
>>Benedikt
>
>
>-
>To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>For additional commands, e-mail: dev-h...@commons.apache.org
>



Re: Branch policy question

2016-03-23 Thread Gangumalla, Uma
Thanks, Chris N, for digging back into these details.

The main concern in this question started with "Since it's nearly impossible
for me to get timely reviews for some build and script changes …". So even
with a CTR process, review is still needed at some point. One question I have
here: does CTR track reviews at the level of each commit, even when they
happen later? Or do we just commit and then review in bulk? I understand now
that it may be the branch manager's decision. I feel that a bulk review may
be problematic if it happens only when the merge vote starts. I am sure that
reviewing the merge patch (a bigger patch containing all the branch changes)
is not as effective as reviewing smaller, per-commit patches. It is fine if
there is a way for a reviewer to look at each commit later, even after it has
been committed under the CTR process. But if commits accumulate without
reviews and the whole change has to be reviewed at once at merge time,
reviewing it in bulk may be much harder. In the case under discussion now, it
is also much harder to find reviewers for a bulk patch. Does it make sense to
leave JIRAs open until they get reviewed, even if the patch is committed to
the branch under CTR (the JIRA state could show something like "In Review")?
That way, patches would get reviewed before coming to the merge vote.

Regards,
Uma  

On 3/23/16, 6:56 PM, "larry mccay"  wrote:

>Thanks for digging that up, Chris.
>That is completely what I would have expected but began questioning it
>given this thread.
>
>I think that Allen's use of a feature branch for this effort makes sense
>and that he should have the freedom to choose his commit policy for the
>branch.
>The tricky part will be getting the reviews at the end but I would imagine
>that it can be managed with some documentation, code review, tests and
>instructions.
>
>On Wed, Mar 23, 2016 at 5:20 PM, Chris Nauroth 
>wrote:
>
>> It's interesting to go back to the change in bylaws in 2011 that
>> introduced the requirement for 3 binding +1s on a branch merge [1].  The
>> text of that resolution suggests that it's supportive of
>> commit-then-review if that's what the developers on the branch want to
>>do.
>>
>> "Branches' commit requirements are determined by the branch maintainer
>>and
>> in this situation are often set up as commit-then-review."
>>
>> It would also be very much against the spirit of that resolution to
>> combine it all down into a single patch file and get a single +1.
>>
>> "As such, there is no way to guarantee that the entire change set
>>offered
>> for trunk merge has had a second pair of eyes on it.  Therefore, it is
>> prudent to give that final merge heightened scrutiny, particularly since
>> these branches often extensively affect critical parts of the system.
>> Requiring three binding +1s does not slow down the branch development
>> process, but does provide a better chance of catching bugs before they
>> make their way to trunk."
>>
>> --Chris Nauroth
>>
>> [1] https://s.apache.org/iW1F
>>
>>
>>
>> On 3/23/16, 2:11 PM, "Steve Loughran"  wrote:
>>
>> >
>> >> On 22 Mar 2016, at 18:23, Andrew Wang 
>>wrote:
>> >>
>> >> A branch sounds fine, but how are we going to get 3 +1's to merge
>>it? If
>> >> it's hard to find one reviewer, seems even harder to find two.
>> >
>> >Given that only one +1 is needed to merge a non-branch patch, he could
>>in
>> >theory convert the entire branch into a single .patch for review. Not
>> >that I'd encourage that, just observing that its possible
>> >
>> >
>> >>
>> >> On Tue, Mar 22, 2016 at 10:56 AM, Allen Wittenauer <
>> >> allenwittena...@yahoo.com.invalid> wrote:
>> >>
>> >>>
>>  On Mar 22, 2016, at 10:49 AM, larry mccay 
>> wrote:
>> 
>>  That sounds like a reasonable approach and valid use of branches to
>> me.
>> 
>>  Perhaps a set of functional tests could be provided/identified that
>> would
>>  help the review process by showing backward compatibility along
>>with
>> new
>>  extensions for things like dynamic commands?
>> 
>> >>>
>> >>>This is going into trunk, so no need for backward
>>compatibility.
>> >>>
>> >
>> >
>>
>>



Re: Branch policy question

2016-03-22 Thread Gangumalla, Uma
> is it possible for me to setup a branch, self review+commit to that
>branch, then request a branch merge?
Basically, this is something like a Commit-Then-Review (here, review later)
process, right? I have not seen us follow this approach here (not sure if I
missed some branches that did). Even when the original author's code quality
is good, there is always a chance of missing something. So peer review is
important, because another pair of eyes can catch issues the original author
might have overlooked (the general advantage of review :-)). In this case, a
branch merge needs three +1s. If we have difficulty getting one +1, then I am
afraid we may have even more difficulty getting reviewers at merge time,
because the code base is much larger than a normal patch. Sometimes we ask
contributors to split patches into multiple JIRAs when a patch becomes large.
It is better to find some reviewers for the branch, so that creating the
branch can turn into a healthy merge later.

Colin's suggestion sounds good to me. How about providing more details and
finding some reviewers (whoever is more familiar with that area, etc.)?

If this is a general question about branch policy, my answer is "no" to
"self review+commit to that branch, then request a branch merge". But for
special cases like this, where we are sure we may not have enough reviewers
for the branch, having a dev mailing-list discussion about the JIRA/branch
proposal and working out how to go about the changes may be a better idea
than going ahead, finishing the work, and then raising a merge vote thread
(something like what you did now :-)).

Just my thoughts on this discussion.

Thanks & Regards,
Uma

On 3/22/16, 9:14 AM, "Allen Wittenauer"  wrote:

>Since it's nearly impossible for me to get timely reviews for some build
>and script changes, is it possible for me to setup a branch, self
>review+commit to that branch, then request a branch merge?



Re: Style checking related to getters

2016-02-29 Thread Gangumalla, Uma
+1 for disabling them.

Regards,
Uma

On 2/29/16, 11:16 AM, "Andrew Wang"  wrote:

>Hi Kai,
>
>Could you file a JIRA and post patch to disable that checkstyle rule? You
>can look at HADOOP-12713 for an example. Ping me and I'll review.
>
>Best,
>Andrew
>
>On Sun, Feb 28, 2016 at 11:28 PM, Zheng, Kai  wrote:
>
>> Hi,
>>
>> I'm wondering if we could get rid of the style checking in getters like
>> the following (from HDFS-9733). It's annoying because it's a common Java
>> practice and widely used in the project.
>>
>>
>> void setBlockLocations(LocatedBlocks blockLocations) {:42:
>> 'blockLocations' hides a field.
>>
>> void setTimeout(int timeout) {:25: 'timeout' hides a field.
>>
>> void setLocatedBlocks(List locatedBlocks) {:46:
>> 'locatedBlocks' hides a field.
>>
>> void setRemaining(long remaining) {:28: 'remaining' hides a field.
>>
>> void setBytesPerCRC(int bytesPerCRC) {:29: 'bytesPerCRC' hides a field.
>>
>> void setCrcType(DataChecksum.Type crcType) {:39: 'crcType' hides a
>>field.
>>
>> void setCrcPerBlock(long crcPerBlock) {:30: 'crcPerBlock' hides a field.
>>
>> void setRefetchBlocks(boolean refetchBlocks) {:35: 'refetchBlocks'
>>hides a
>> field.
>>
>> void setLastRetriedIndex(int lastRetriedIndex) {:34: 'lastRetriedIndex'
>> hides a field.
>>
>> Regards,
>> Kai
>>
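
(For illustration, and not part of the original mail: the pattern that
triggers these warnings is the conventional Java setter idiom, where the
parameter intentionally shares the field's name. The class below is made up;
the rule involved is checkstyle's hidden-field check.)

    // Hypothetical class showing the flagged pattern only.
    class ChecksumState {
      private int timeout;
      private long remaining;

      // Checkstyle reports "'timeout' hides a field" here, even though
      // assigning via "this.timeout" is the standard setter form.
      void setTimeout(int timeout) {
        this.timeout = timeout;
      }

      void setRemaining(long remaining) {
        this.remaining = remaining;
      }
    }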



Re: [crypto][chimera] Next steps

2016-02-23 Thread Gangumalla, Uma
Thanks, all, for the valuable feedback and discussion.
Here are my replies to some of the questions:
[Mark wrote]
It depends. I care less about the quality of the code than I do about
the community that comes with it / forms around it. A strong community
can fix code issues. Great code can't save a weak community.
[uma] Nice point. Fully agreed.


[Jochen wrote]
Therefore, I suggest that you provide at least fallback
implementations in pure Java, which are being used, if the JNI based
stuff is not available (for whatever reason).
[Uma] Thank you for the suggestion, Jochen. If I understand your point
right, yes, that is already there in Hadoop:
HADOOP-10735 (Fall back AesCtrCryptoCodec implementation from OpenSSL to JCE
if no native support).

The same should be there with Chimera/Apache Crypto.
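
(A hedged sketch of the fallback idea discussed above; this is not Hadoop's
or Chimera's actual code, and "NativeOpenSslProvider" is a made-up provider
name standing in for an OpenSSL/JNI-backed implementation. The point is
simply: try the native-backed cipher first, and fall back to plain JCE when
it is unavailable.)

    import javax.crypto.Cipher;

    public final class CipherFallbackSketch {
      static Cipher newAesCtrCipher() throws Exception {
        try {
          // Hypothetical native/OpenSSL-backed provider; fails if not installed.
          return Cipher.getInstance("AES/CTR/NoPadding", "NativeOpenSslProvider");
        } catch (Exception nativeUnavailable) {
          // Fall back to the default JCE provider, as HADOOP-10735 describes.
          return Cipher.getInstance("AES/CTR/NoPadding");
        }
      }
    }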


[Benedikt]
I still have concerns about the IP, since this seems to be an Intel
codebase. I do not have the necessary experience to say what would be the
right way here. My gut feeling tells me, that we should go through the
incubator. WDYT?
And [Jochen wrote]
"An Intel codebase" is not a problem as such. Question is: "Available
under what license?"

[Uma] We will fix any IP issues soon. If you look at the code, all the file
license headers carry the Apache License.
The current repo and package structure do carry the Intel name. I will check
with Haifeng on resolving this.


[Jochen wrote]
So, have the Chimera project attempt to resolve them quickly. If
possible: Fine. If not: We still have the Incubator as a possibility.
[Uma] Agreed. We will resolve these points soon.


Regards,
Uma

 
On 2/23/16, 1:18 AM, "Mark Thomas"  wrote:

>On 23/02/2016 09:12, sebb wrote:
>> On 23 February 2016 at 07:34, Benedikt Ritter 
>>wrote:
>>> I'm confused. None of the other PMC members has expressed whether he
>>>or she
>>> want's the see Chimera/crypto joining Apache Commons, yet we're already
>>> discussing how JNI bindings should be handled.
>>>
>>> I'd like to see:
>>> 1) a clear statement whether Chimera/crypto should become part of
>>>Apache
>>> Commons. Do we need a vote for that?
>> 
>> Yes, of course.
>> 
>> However that decision clearly depends on at least some of the design
>> aspects of the code.
>> If it were written entirely in C or Fortran, it would not be a
>> suitable candidate.
>> 
>>> 2) Discuss a plan on how to do that (I've described a possible plan
>>>[1])
>>> 3) After that is clear: discuss design details regarding the component.
>> 
>> Some design details impact on the decision.
>> 
>> Indeed even for pure Java code the code quality has a bearing on
>> whether Commons would/could want to take it.
>> Would we want a large code base with no unit-tests, no build
>> mechanism, and no comments?
>
>It depends. I care less about the quality of the code than I do about
>the community that comes with it / forms around it. A strong community
>can fix code issues. Great code can't save a weak community.
>
>How about creating a new sandbox component, let folks start work and see
>how the community develops?
>
>Mark
>
>
>> 
>>> Thanks! :-)
>>> Benedikt
>>>
>>> [1] http://markmail.org/message/74j4el6bpfpt4evs
>>>
>>> 2016-02-23 3:03 GMT+01:00 Xu, Cheng A :
>>>
 At this point, it has just Java interfaces only.

 -Original Message-
 From: Colin P. McCabe [mailto:cmcc...@apache.org]
 Sent: Tuesday, February 23, 2016 1:29 AM
 To: Hadoop Common
 Cc: Commons Developers List
 Subject: Re: [crypto][chimera] Next steps

 I would highly recommend shading this library when it is used in
 Hadoop and/or Spark, to prevent version skew problems between Hadoop
 and Spark like we have had in the past.

 What is the strategy for handling JNI components?  I think at a
 minimum, we should include the version number in the native library
 name to avoid problems when deploying multiple versions of Chimera.
 This is something that has been problematic in Hadoop with
 libhadoop.so.

 Is this library going to have Scala interfaces as well as Java ones,
 or just Java?

 cheers,
 Colin

 On Sat, Feb 20, 2016 at 3:15 AM, Benedikt Ritter 
 wrote:
> Hi,
>
> I'd like to discuss the next steps for moving the Chimera component
>to
> Apache Commons. So far, none of the other PMC members has expressed
>his
 or
> her thoughts about this. If nobody brings up objections about moving
>the
> component to Apache Commons, I'm assuming lazy consensus about this.
>
> So the next steps would be:
> - decide on a name for the new component (my proposal was Apache
>Commons
> Crypto)
> - move code to an Apache repo (probably git?!)
> - request a Jira project
> - setup maven build
> - setup project website
> - work on an initial release under Apache Commons coordinates
>
> Anything missing?
>
> Regards,
> Benedikt

Re: [crypto][chimera] Next steps

2016-02-22 Thread Gangumalla, Uma

>All files should follow the Commons Maven naming scheme to make it easy to
>reach from Maven, Ivy and so on.
>This will be commons-crypto-1.0.jar for example.
Sure. Thanks Gary. We will follow the naming convention here from Commons.

Regards,
Uma

On 2/22/16, 1:20 PM, "Gary Gregory"  wrote:

>All files should follow the Commons Maven naming scheme to make it easy to
>reach from Maven, Ivy and so on.
>
>This will be commons-crypto-1.0.jar for example.
>
>Gary
>
>On Mon, Feb 22, 2016 at 1:06 PM, Gangumalla, Uma
>
>wrote:
>
>> >I would highly recommend shading this library when it is used in
>> Hadoop and/or Spark, to prevent version skew problems between Hadoop
>> and Spark like we have had in the past.
>> [uma]Ha. This avoids multiple jars versions issues. Agreed IMO.
>>
>> >I think at a
>> minimum, we should include the version number in the native library
>> name to avoid problems when deploying multiple versions of Chimera.
>> This is something that has been problematic in Hadoop with
>> libhadoop.so.
>> [uma]I think this is very valid suggestion. We can maintain version
>>number
>> with native lib. Also here target is to bundle libchimera.so along with
>> jars. Ideally it should be less confusion, but its good idea to have
>> version number along.
>>
>> >Is this library going to have Scala interfaces as well as Java ones,
>> or just Java?
>> [uma] Currently it is focussing on java. If there is a demand for Scala
>> specifically may be we can think on that.
>>
>> Regards,
>> Uma
>>
>> On 2/22/16, 9:28 AM, "Colin P. McCabe"  wrote:
>>
>> >I would highly recommend shading this library when it is used in
>> >Hadoop and/or Spark, to prevent version skew problems between Hadoop
>> >and Spark like we have had in the past.
>> >
>> >What is the strategy for handling JNI components?  I think at a
>> >minimum, we should include the version number in the native library
>> >name to avoid problems when deploying multiple versions of Chimera.
>> >This is something that has been problematic in Hadoop with
>> >libhadoop.so.
>> >
>> >Is this library going to have Scala interfaces as well as Java ones,
>> >or just Java?
>> >
>> >cheers,
>> >Colin
>> >
>> >On Sat, Feb 20, 2016 at 3:15 AM, Benedikt Ritter 
>> >wrote:
>> >> Hi,
>> >>
>> >> I'd like to discuss the next steps for moving the Chimera component
>>to
>> >> Apache Commons. So far, none of the other PMC members has expressed
>>his
>> >>or
>> >> her thoughts about this. If nobody brings up objections about moving
>>the
>> >> component to Apache Commons, I'm assuming lazy consensus about this.
>> >>
>> >> So the next steps would be:
>> >> - decide on a name for the new component (my proposal was Apache
>>Commons
>> >> Crypto)
>> >> - move code to an Apache repo (probably git?!)
>> >> - request a Jira project
>> >> - setup maven build
>> >> - setup project website
>> >> - work on an initial release under Apache Commons coordinates
>> >>
>> >> Anything missing?
>> >>
>> >> Regards,
>> >> Benedikt
>> >>
>> >> --
>> >> http://home.apache.org/~britter/
>> >> http://twitter.com/BenediktRitter
>> >> http://github.com/britter
>>
>>
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>>
>>
>
>
>-- 
>E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
>Java Persistence with Hibernate, Second Edition
><http://www.manning.com/bauer3/>
>JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
>Spring Batch in Action <http://www.manning.com/templier/>
>Blog: http://garygregory.wordpress.com
>Home: http://garygregory.com/
>Tweet! http://twitter.com/GaryGregory



Re: [crypto][chimera] Next steps

2016-02-22 Thread Gangumalla, Uma
>I would highly recommend shading this library when it is used in
Hadoop and/or Spark, to prevent version skew problems between Hadoop
and Spark like we have had in the past.
[uma] Ha, this avoids multiple-jar-version issues. Agreed, IMO.

>I think at a
minimum, we should include the version number in the native library
name to avoid problems when deploying multiple versions of Chimera.
This is something that has been problematic in Hadoop with
libhadoop.so.
[uma] I think this is a very valid suggestion. We can maintain a version
number with the native lib. Also, the target here is to bundle libchimera.so
along with the jars. Ideally there should be less confusion, but it is a good
idea to carry the version number along.

>Is this library going to have Scala interfaces as well as Java ones,
or just Java?
[uma] Currently it is focusing on Java. If there is a specific demand for
Scala, maybe we can think about that.

Regards,
Uma

On 2/22/16, 9:28 AM, "Colin P. McCabe"  wrote:

>I would highly recommend shading this library when it is used in
>Hadoop and/or Spark, to prevent version skew problems between Hadoop
>and Spark like we have had in the past.
>
>What is the strategy for handling JNI components?  I think at a
>minimum, we should include the version number in the native library
>name to avoid problems when deploying multiple versions of Chimera.
>This is something that has been problematic in Hadoop with
>libhadoop.so.
>
>Is this library going to have Scala interfaces as well as Java ones,
>or just Java?
>
>cheers,
>Colin
>
>On Sat, Feb 20, 2016 at 3:15 AM, Benedikt Ritter 
>wrote:
>> Hi,
>>
>> I'd like to discuss the next steps for moving the Chimera component to
>> Apache Commons. So far, none of the other PMC members has expressed his
>>or
>> her thoughts about this. If nobody brings up objections about moving the
>> component to Apache Commons, I'm assuming lazy consensus about this.
>>
>> So the next steps would be:
>> - decide on a name for the new component (my proposal was Apache Commons
>> Crypto)
>> - move code to an Apache repo (probably git?!)
>> - request a Jira project
>> - setup maven build
>> - setup project website
>> - work on an initial release under Apache Commons coordinates
>>
>> Anything missing?
>>
>> Regards,
>> Benedikt
>>
>> --
>> http://home.apache.org/~britter/
>> http://twitter.com/BenediktRitter
>> http://github.com/britter



Re: Looking to a Hadoop 3 release

2016-02-18 Thread Gangumalla, Uma
Yes. I think starting the 3.0 release with alphas is a good idea, so it gets
some time to reach beta or GA.

+1 for the plan.

For compatibility purposes, and since they are the current stable versions,
we should continue the 2.x releases anyway.

Thanks Andrew for starting the thread.

Regards,
Uma

On 2/18/16, 3:04 PM, "Andrew Wang"  wrote:

>Hi Kihwal,
>
>I think there's still value in continuing the 2.x releases. 3.x comes with
>the incompatible bump to a JDK8 runtime, and also the fact that 3.x won't
>be beta or GA for some number of months. In the meanwhile, it'd be good to
>keep putting out regular, stable 2.x releases.
>
>Best,
>Andrew
>
>
>On Thu, Feb 18, 2016 at 2:50 PM, Kihwal Lee 
>wrote:
>
>> Moving Hadoop 3 forward sounds fine. If EC is one of the main
>>motivations,
>> are we getting rid of branch-2.8?
>>
>> Kihwal
>>
>>   From: Andrew Wang 
>>  To: "common-dev@hadoop.apache.org" 
>> Cc: "yarn-...@hadoop.apache.org" ; "
>> mapreduce-...@hadoop.apache.org" ;
>> hdfs-dev 
>>  Sent: Thursday, February 18, 2016 4:35 PM
>>  Subject: Re: Looking to a Hadoop 3 release
>>
>> Hi all,
>>
>> Reviving this thread. I've seen renewed interest in a trunk release
>>since
>> HDFS erasure coding has not yet made it to branch-2. Along with JDK8,
>>the
>> shell script rewrite, and many other improvements, I think it's time to
>> revisit Hadoop 3.0 release plans.
>>
>> My overall plan is still the same as in my original email: a series of
>> regular alpha releases leading up to beta and GA. Alpha releases make it
>> easier for downstreams to integrate with our code, and making them
>>regular
>> means features can be included when they are ready.
>>
>> I know there are some incompatible changes waiting in the wings
>> (i.e. HDFS-6984 making FileStatus a PB rather than Writable, some of
>> HADOOP-9991 bumping dependency versions) that would be good to get in.
>>If
>> you have changes like this, please set the target version to 3.0.0 and
>>mark
>> them "Incompatible". We can use this JIRA query to track:
>>
>>
>> 
>>https://issues.apache.org/jira/issues/?jql=project%20in%20(HADOOP%2C%20HD
>>FS%2C%20YARN%2C%20MAPREDUCE)%20and%20%22Target%20Version%2Fs%22%20%3D%20%
>>223.0.0%22%20and%20resolution%3D%22unresolved%22%20and%20%22Hadoop%20Flag
>>s%22%3D%22Incompatible%20change%22%20order%20by%20priority
>>
>> There's some release-related stuff that needs to be sorted out (namely,
>>the
>> new CHANGES.txt and release note generation from Yetus), but I'd
>> tentatively like to roll the first alpha a month out, so third week of
>> March.
>>
>> Best,
>> Andrew
>>
>> On Mon, Mar 9, 2015 at 7:23 PM, Raymie Stata 
>>wrote:
>>
>> > Avoiding the use of JDK8 language features (and, presumably, APIs)
>> > means you've abandoned #1, i.e., you haven't (really) bumped the JDK
>> > source version to JDK8.
>> >
>> > Also, note that releasing from trunk is a way of achieving #3, it's
>> > not a way of abandoning it.
>> >
>> >
>> >
>> > On Mon, Mar 9, 2015 at 7:10 PM, Andrew Wang 
>> > wrote:
>> > > Hi Raymie,
>> > >
>> > > Konst proposed just releasing off of trunk rather than cutting a
>> > branch-2,
>> > > and there was general agreement there. So, consider #3 abandoned.
>>1&2
>> can
>> > > be achieved at the same time, we just need to avoid using JDK8
>>language
>> > > features in trunk so things can be backported.
>> > >
>> > > Best,
>> > > Andrew
>> > >
>> > > On Mon, Mar 9, 2015 at 7:01 PM, Raymie Stata 
>> > wrote:
>> > >
>> > >> In this (and the related threads), I see the following three
>> > requirements:
>> > >>
>> > >> 1. "Bump the source JDK version to JDK8" (ie, drop JDK7 support).
>> > >>
>> > >> 2. "We'll still be releasing 2.x releases for a while, with similar
>> > >> feature sets as 3.x."
>> > >>
>> > >> 3. Avoid the "risk of split-brain behavior" by "minimize
>>backporting
>> > >> headaches. Pulling trunk > branch-2 > branch-2.x is already
>>tedious.
>> > >> Adding a branch-3, branch-3.x would be obnoxious."
>> > >>
>> > >> These three cannot be achieved at the same time.  Which do we
>>abandon?
>> > >>
>> > >>
>> > >> On Mon, Mar 9, 2015 at 12:45 PM, sanjay Radia
>>
>> > >> wrote:
>> > >> >
>> > >> >> On Mar 5, 2015, at 3:21 PM, Siddharth Seth 
>> wrote:
>> > >> >>
>> > >> >> 2) Simplification of configs - potentially separating client
>>side
>> > >> configs
>> > >> >> and those used by daemons. This is another source of perpetual
>> > confusion
>> > >> >> for users.
>> > >> > + 1 on this.
>> > >> >
>> > >> > sanjay
>> > >>
>> >
>>
>>
>>



Re: Chimera as new component in Apache Commons

2016-02-17 Thread Gangumalla, Uma
Hi Benedikt,

Thanks for offering the great help.

Benedikt Wrote:

I'm no crypto expert but I can help with the Apache Commons related tasks,
like moving the code over to Apache Commons, setting up the maven build,
publishing the project website etc.
[UMA] Thank you. We would love to work with you on the further steps,
based on your guidance on these aspects.

Benedikt Wrote:

I'd love to see you moving Chimera here.

[UMA] Thanks for the acceptance. :-)

Benedikt Wrote:

1. There are no Apache Commons sub-communities. There is only the
Apache Commons community. This means there won't be a separate mailing list
for the new component. It is important to understand that we are one
community maintaining a number of components, not a group of sub-communities.
[UMA] Got it. Thanks for the information.


Benedikt Wrote:

2. The Apache Commons versioning guidelines are very restrictive [1]. We
put great effort into binary compatibility. This is because we expect our
components to be reused by a lot of other projects, and we try our best to
avoid jar hell. Often this means that larger refactorings simply cannot
be implemented since they would break BC. This is usually not a problem for
the major components, but it may be a problem for a young component.
[UMA] Right. 
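To make the BC concern concrete, a hedged illustration (the class below is hypothetical, not part of any Commons component): changing an existing public signature is binary-incompatible even when callers could fix it by recompiling.

```java
// Hypothetical API, for illustration only.
public final class CryptoCipher {
    // 1.0 shipped:   public void init(byte[] key, byte[] iv)
    // 1.1 changes it to take a java.security.Key instead:
    public void init(java.security.Key key, byte[] iv) {
        // Any jar compiled against 1.0 still invokes init([B[B)V and now fails
        // at runtime with NoSuchMethodError, even though recompiling against
        // 1.1 might only need a one-line change.
    }
}
```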


Benedikt Wrote:

3. Apache Commons components usually have a (boring) descriptive name
rather than a fancy one. This is the reason why we renamed Apache Commons
Sanselan to Apache Commons Imaging. People should be able to tell just by
looking at the name of a component what that component is about. IMHO
Chimera falls into the fancy-name category, so maybe we will discuss that
name.
[UMA] Ok. The name can be self-descriptive. No issues on this.

Regards,
Uma (An Apache Hadoop PMC member)





On 2/16/16, 11:56 PM, "Benedikt Ritter"  wrote:

>Hello Uma,
>
>welcome to the Apache Commons dev list. It's great to see that two
>projects
>get together to share code via Apache Commons.
>
>2016-02-16 22:36 GMT+01:00 Gangumalla, Uma :
>
>> Hi Devs,
>>
>>
>>
>> Recently we worked with the Spark community to implement shuffle
>> encryption. While implementing that, we realized that much of the
>> encryption code in Apache Hadoop and in the new implementation would have
>> to be duplicated. This led to the idea of creating a separate reusable
>> library, named Chimera (https://github.com/intel-hadoop/chimera). It is an
>> optimized cryptographic library. It provides a Java API at both the cipher
>> level and the stream level to help developers implement high-performance
>> AES encryption/decryption with minimal code and effort.
>>
>>
>>
>> We know that Java already has Cipher implementations, so why do we need
>> this optimized cryptographic library:
>>
>> 1. Performance is critical for encryption and decryption. The JDK Cipher
>> implementations of AES are not yet optimized for modern hardware. For
>> example, the optimized implementation is 17x+ faster than the JDK 6
>> implementation for some modes such as CBC decryption, CTR and GCM. Even
>> with the optimizations included in JDK 7 and JDK 8, there is still a 5x
>> to 6x gap to the most optimized implementations.
>>
>
>That sounds pretty useful! :-)
>
>
>>
>> 2. A Java stream-based API over cryptographic data streams. The Cipher
>> API is powerful, but a lot of code needs to be written for layered
>> stream-processing applications, a design pattern that is very common in
>> modern applications such as Hadoop or Spark.
>>
>>
>>
>> Chimera was originally based on the Hadoop crypto code but has been
>> improved and generalized a lot to support a wider scope of data
>> encryption needs for more components in the community. The
>> encryption-related code in Hadoop was developed about a year ago and has
>> been running well so far, so we feel that this part of the code is
>> already stable enough.
>>
>>
>>
>> So, we propose to contribute this Chimera (optimized encryption library)
>> code to Apache Commons, and we would like it to have independent release
>> cycles like any other module in Apache Commons. The module basically
>> provides Java-based interfaces for encryption-based IO, and it will
>> include native AES-NI encryption integration code.
>>
>>
>>
>> We already discussed this proposal on the Apache Hadoop dev lists, and
>> the conclusion of that discussion was positive about contributing this
>> module to Apache Commons.
>>
>>
>>
>> We need your help and support in adopting this code as an Apache
>> Commons component and in establishing its own development
>> communi

Chimera as new component in Apache Commons

2016-02-16 Thread Gangumalla, Uma
Hi Devs,



Recently we worked with the Spark community to implement shuffle encryption. 
While implementing that, we realized that much of the encryption code in Apache 
Hadoop and in the new implementation would have to be duplicated. This led to the 
idea of creating a separate reusable library, named Chimera 
(https://github.com/intel-hadoop/chimera). It is an optimized cryptographic 
library. It provides a Java API at both the cipher level and the stream level to 
help developers implement high-performance AES encryption/decryption with minimal 
code and effort.



We know that Java already has Cipher implementations, so why do we need this 
optimized cryptographic library:

1. Performance is critical for encryption and decryption. The JDK Cipher 
implementations of AES are not yet optimized for modern hardware. For example, 
the optimized implementation is 17x+ faster than the JDK 6 implementation for 
some modes such as CBC decryption, CTR and GCM. Even with the optimizations 
included in JDK 7 and JDK 8, there is still a 5x to 6x gap to the most optimized 
implementations.

2. A Java stream-based API over cryptographic data streams. The Cipher API is 
powerful, but a lot of code needs to be written for layered stream-processing 
applications, a design pattern that is very common in modern applications such as 
Hadoop or Spark.
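As a rough sketch of the per-application boilerplate that point 2 refers to, here is what stream-layer AES encryption looks like with only the JDK APIs (the class and method names are illustrative; a shared stream-level API would wrap this setup, and ideally an optimized native cipher, behind a single call):

```java
import java.io.OutputStream;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Illustrative only: every application using the raw JDK Cipher API repeats
// this kind of cipher setup, IV handling, and stream wiring.
final class JdkStreamEncryption {
    static OutputStream encrypting(OutputStream sink, byte[] key) throws Exception {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);                  // caller must also persist/transmit the IV
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        sink.write(iv);                                    // simplistic: prepend the IV to the output
        return new CipherOutputStream(sink, cipher);       // encrypts transparently on write()
    }
}
```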



Chimera was originally based on the Hadoop crypto code but has been improved and 
generalized a lot to support a wider scope of data encryption needs for more 
components in the community. The encryption-related code in Hadoop was developed 
about a year ago and has been running well so far, so we feel that this part of 
the code is already stable enough.



So, we propose to contribute this Chimera (optimized encryption library) code 
to Apache Commons, and we would like it to have independent release cycles like 
any other module in Apache Commons. The module basically provides Java-based 
interfaces for encryption-based IO, and it will include native AES-NI encryption 
integration code.



We already discussed this proposal on the Apache Hadoop dev lists, and the 
conclusion of that discussion was positive about contributing this module to 
Apache Commons.



We need your help and support in adopting this code as an Apache Commons 
component and in establishing its own development community (of course, we can 
discuss these factors further in this thread). Hadoop and Spark will be the two 
most visible projects using it, and we expect more projects will use it as well.



Once the Apache Commons PMC agrees to place this module under Commons, I will 
work on gathering interested developers and establishing the Chimera development 
community as the next step. Please help with the process.



Regards,

Uma (An Apache Hadoop PMC member)


Re: Hadoop encryption module as Apache Chimera incubator project

2016-02-11 Thread Gangumalla, Uma
Thanks Haifeng. I was just waiting to see if there were any more comments. If
there are no further objections, I will initiate a discussion thread in Apache
Commons in a day's time and will also cc hadoop common.

Regards,
Uma

On 2/11/16, 6:13 PM, "Chen, Haifeng"  wrote:

>Thanks to all the folks participating in this discussion and providing valuable
>suggestions and options.
>
>I suggest we take it forward to make a proposal in Apache Commons
>community. 
>
>Thanks,
>Haifeng
>
>-Original Message-
>From: Chen, Haifeng [mailto:haifeng.c...@intel.com]
>Sent: Friday, February 5, 2016 10:06 AM
>To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org
>Subject: RE: Hadoop encryption module as Apache Chimera incubator project
>
>> [Chirs] Yes, but even if the artifact is widely consumed, as a TLP it
>>would need to sustain a community. If the scope is too narrow, then it
>>will quickly fall into maintenance mode, its contributors will move on,
>>and it will retire to the attic. Alone, I doubt its viability as a TLP.
>>So as a first option, donating only this code to Apache Commons would
>>accomplish some immediate goals in a sustainable forum.
>Totally agree. As a TLP it needs a sufficiently broad scope and roadmap to
>sustain a development community.
>
>Thanks,
>Haifeng
>
>-Original Message-
>From: Chris Douglas [mailto:cdoug...@apache.org]
>Sent: Friday, February 5, 2016 6:28 AM
>To: common-dev@hadoop.apache.org
>Cc: hdfs-...@hadoop.apache.org
>Subject: Re: Hadoop encryption module as Apache Chimera incubator project
>
>On Thu, Feb 4, 2016 at 12:06 PM, Gangumalla, Uma
> wrote:
>
>> [UMA] Ok. Great. You are right. I have cc'ed hadoop common. (You
>> mean to cc Apache Commons as well?)
>
>I meant, if you start a discussion with Apache Commons, please CC
>common-dev@hadoop to coordinate.
>
>> [UMA] Right now the encryption libraries are the only ones we have
>> planned, and we see a lot of interest from other projects such as Spark
>> in using them. One challenge I see in bringing a lot of other common
>> code into this project is that it would all have different requirements
>> and possibly different expected release timelines. Some projects may
>> want to use only the encryption interfaces and not the rest. As these
>> are completely independent pieces of code, it may be better to scope
>> them out clearly.
>
>Yes, but even if the artifact is widely consumed, as a TLP it would need
>to sustain a community. If the scope is too narrow, then it will quickly
>fall into maintenance mode, its contributors will move on, and it will
>retire to the attic. Alone, I doubt its viability as a TLP. So as a first
>option, donating only this code to Apache Commons would accomplish some
>immediate goals in a sustainable forum.
>
>APR has a similar scope. As a second option, that may also be a
>reasonable home, particularly if some of the native bits could integrate
>with APR.
>
>If the scope is broader, the effort could sustain prolonged development.
>The current code is developing a strategy for packing native libraries on
>multiple platforms, a capability that, say, the native compression codecs
>(AFAIK) still lack. While java.nio is improving, many projects would
>benefit from a better, native interface to the filesystem (e.g.,
>NativeIO). We could avoid duplicating effort and collaborate on a common
>library.
>
>As a third option, Hadoop already implements some useful native
>libraries, which is why a subproject might be a sound course. That would
>enable the subproject to coordinate with Hadoop on migrating its native
>functionality to a separable, reusable component, then move to a TLP when
>we can rely on it exclusively (if it has a well-defined, independent
>community). It could control its release cadence and limit its
>dependencies.
>
>Finally, this is beside the point if nobody is interested in doing the
>work on such a project. It's rude to pull code out of Hadoop and donate
>it to another project so Spark can avoid a dependency, but this instance
>seems reasonable to me. -C
>
>[1] https://apr.apache.org/
>
>> On 2/3/16, 6:46 PM, "Chen, Haifeng"  wrote:
>>
>>>Thanks Chris.
>>>
>>>>> I went through the repository, and now understand the reasoning
>>>>>that would locate this code in Apache Commons. This isn't proposing
>>>>>to extract much of the implementation and it takes none of the
>>>>>integration. It's limited to interfaces to crypto libraries and
>>>>>streams/configuration.
>>>Exactly.
>>>
>>>>> Chimera would be a boutique TLP, unless we wa

Re: Hadoop encryption module as Apache Chimera incubator project

2016-02-04 Thread Gangumalla, Uma
roject requires too many dependencies and
>carries too much historical baggage for other projects to rely on.
>I agree with Colin/Steve: we don't want this to grow into another
>guava-like dependency that creates more work in conflicts than it saves
>in implementation...
>
>Would it make sense to also package some of the compression libraries,
>and maybe some of the text processing from MapReduce? Evolving some of
>this code to a common library with few/no dependencies would be generally
>useful. As a subproject, it could have a broader scope that could evolve
>into a viable TLP. If the encryption libraries are the only ones you're
>interested in pulling out, then Apache Commons does seem like a better
>target than a separate project. -C
>
>
>On Wed, Feb 3, 2016 at 1:49 AM, Chris Douglas  wrote:
>> On Wed, Feb 3, 2016 at 12:48 AM, Gangumalla, Uma
>>  wrote:
>>>>For a shared, fundamental piece of code like this, I do think Apache
>>>>Commons might be the best direction to try as a first effort. In that
>>>>direction, we still need to work with the Apache Commons community on
>>>>buying in and accepting the proposal.
>>> Make sense.
>>
>> Makes sense how?
>>
>>> For this we should define independent release cycles for the
>>> project, and it would just be placed under the Hadoop tree if we all
>>> conclude with this option at the end.
>>
>> Yes.
>>
>>> [Chris]
>>>>If Chimera is not successful as an independent project or stalls,
>>>>Hadoop and/or Spark and/or $project will have to reabsorb it as
>>>>maintainers.
>>>>
>>> I am not so strong on this point. If we assume the project would be
>>> unsuccessful, it could be unsuccessful (less maintained) even under
>>> Hadoop; other projects depending on this piece would then get less
>>> support either way. Of course, right now we feel this piece of code is
>>> very important, and we expect it can be successful as an independent
>>> project, irrespective of whether it lives outside Hadoop or inside.
>>> So, I feel this point should not really sway the discussion.
>>
>> Sure; code can idle anywhere, but that wasn't the point I was after.
>> You propose to extract code from Hadoop, but if Chimera fails then
>> what recourse do we have among the other projects taking a dependency
>> on it? Splitting off another project is feasible, but Chimera should
>> be sustainable before this PMC can divest itself of responsibility for
>> security libraries. That's a pretty low bar.
>>
>> Bundling the library with the jar is helpful; I've used that before.
>> It should prefer (updated) libraries from the environment, if
>> configured. Otherwise it's a pain (or impossible) for ops to patch
>> security bugs. -C
>>
>>>>-Original Message-
>>>>From: Colin P. McCabe [mailto:cmcc...@apache.org]
>>>>Sent: Wednesday, February 3, 2016 4:56 AM
>>>>To: hdfs-...@hadoop.apache.org
>>>>Subject: Re: Hadoop encryption module as Apache Chimera incubator
>>>>project
>>>>
>>>>It's great to see interest in improving this functionality.  I think
>>>>Chimera could be successful as an Apache project.  I don't have a
>>>>strong opinion one way or the other as to whether it belongs as part
>>>>of Hadoop or separate.
>>>>
>>>>I do think there will be some challenges splitting this functionality
>>>>out into a separate jar, because of the way our CLASSPATH works right
>>>>now.
>>>>For example, let's say that Hadoop depends on Chimera 1.2 and Spark
>>>>depends on Chimera 1.1.  Now Spark jobs have two different versions
>>>>fighting it out on the classpath, similar to the situation with Guava
>>>>and other libraries.  Perhaps if Chimera adopts a policy of strong
>>>>backwards compatibility, we can just always use the latest jar, but
>>>>it still seems likely that there will be problems.  There are various
>>>>classpath isolation ideas that could help here, but they are big
>>>>projects in their own right and we don't have a clear timeline for
>>>>them.  If this does end up being a separate jar, we may need to shade
>>>>it to avoid all these issues.
>>>>
>>>>Bundling the JNI glue code in the jar itself is an interesting idea,
>>>>which we have talked about before for libhadoop.so
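A hedged sketch of that bundling idea, following Chris's caveat above that an operator-installed library should win over the copy shipped inside the jar (the class, resource path, and library name are hypothetical):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class BundledNativeLoader {
    static void load() {
        try {
            System.loadLibrary("chimera");                 // 1. prefer the host-installed library,
            return;                                        //    so ops can patch security bugs
        } catch (UnsatisfiedLinkError ignored) {
            // fall through to the copy bundled in the jar
        }
        String resource = "/native/" + System.mapLibraryName("chimera");
        try (InputStream in = BundledNativeLoader.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new UnsatisfiedLinkError("No bundled native library at " + resource);
            }
            Path tmp = Files.createTempFile("chimera-", ".lib");
            tmp.toFile().deleteOnExit();
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            System.load(tmp.toAbsolutePath().toString()); // 2. fall back to the extracted bundled copy
        } catch (IOException e) {
            throw new UnsatisfiedLinkError("Failed to extract bundled native library: " + e);
        }
    }
}
```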

Re: [Release thread] 2.8.0 release activities

2016-02-03 Thread Gangumalla, Uma
Thanks Vinod. +1 for 2.8 release start.

Regards,
Uma

On 2/3/16, 3:53 PM, "Vinod Kumar Vavilapalli"  wrote:

>Seems like all the features listed in the Roadmap wiki are in. I'm going
>to try cutting an RC this weekend for a first/non-stable release off of
>branch-2.8.
>
>Let me know if anyone has any objections/concerns.
>
>Thanks
>+Vinod
>
>> On Nov 25, 2015, at 5:59 PM, Vinod Kumar Vavilapalli
>> wrote:
>> 
>> Branch-2.8 is created.
>> 
>> As mentioned before, the goal on branch-2.8 is to put improvements /
>>fixes to existing features with a goal of converging on an alpha release
>>soon.
>> 
>> Thanks
>> +Vinod
>> 
>> 
>>> On Nov 25, 2015, at 5:30 PM, Vinod Kumar Vavilapalli
>>> wrote:
>>> 
>>> Forking threads now in order to track all things related to the
>>>release.
>>> 
>>> Creating the branch now.
>>> 
>>> Thanks
>>> +Vinod
>>> 
>>> 
 On Nov 25, 2015, at 11:37 AM, Vinod Kumar Vavilapalli
 wrote:
 
 I think we've converged at a high level w.r.t 2.8. And as I just sent
out an email, I updated the Roadmap wiki reflecting the same:
https://wiki.apache.org/hadoop/Roadmap

 
 I plan to create a 2.8 branch EOD today.
 
 The goal for all of us should be to restrict improvements & fixes to
only (a) the feature-set documented under 2.8 in the RoadMap wiki and
(b) other minor features that are already in 2.8.
 
 Thanks
 +Vinod
 
 
> On Nov 11, 2015, at 12:13 PM, Vinod Kumar Vavilapalli
>mailto:vino...@hortonworks.com>> wrote:
> 
> - Cut a branch about two weeks from now
> - Do an RC mid next month (leaving ~4weeks since branch-cut)
> - As with 2.7.x series, the first release will still be called as
>early / alpha release in the interest of
>   - gaining downstream adoption
>   - wider testing,
>   - yet reserving our right to fix any inadvertent incompatibilities
>introduced.
 
>>> 
>> 
>



Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk

2015-09-22 Thread Gangumalla, Uma
+1 

Great addition to HDFS. Thanks to all the contributors for the nice work.

Regards,
Uma

On 9/22/15, 3:40 PM, "Zhe Zhang"  wrote:

>Hi,
>
>I'd like to propose a vote to merge the HDFS-7285 feature branch back to
>trunk. Since November 2014 we have been designing and developing this
>feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
>committed approximately 210 patches.
>
>The HDFS-7285 feature branch was created to support the first phase of
>HDFS
>erasure coding (HDFS-EC). The objective of HDFS-EC is to significantly
>reduce storage space usage in HDFS clusters. Instead of always creating 3
>replicas of each block with 200% storage space overhead, HDFS-EC provides
>data durability through parity data blocks. With most EC configurations,
>the storage overhead is no more than 50%. Based on profiling results of
>production clusters, we decided to support EC with the striped block
>layout
>in the first phase, so that small files can be better handled. This means
>dividing each logical HDFS file block into smaller units (striping cells)
>and spreading them on a set of DataNodes in round-robin fashion. Parity
>cells are generated for each stripe of original data cells. We have made
>changes to NameNode, client, and DataNode to generalize the block concept
>and handle the mapping between a logical file block and its internal
>storage blocks. For further details please see the design doc on
>HDFS-7285.
>HADOOP-11264 focuses on providing flexible and high-performance codec
>calculation support.
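To make the overhead numbers above concrete: 3-way replication stores two extra copies of every block (200% overhead), while a Reed-Solomon 6+3 layout, used here only as an example configuration, adds 3 parity cells per 6 data cells (50% overhead). A trivial sketch of the arithmetic:

```java
// Illustrative arithmetic only: overhead = extra bytes stored / original bytes.
final class StorageOverhead {
    public static void main(String[] args) {
        double replication = (3 - 1) / 1.0;   // 3 replicas => 2 extra copies => 200%
        double rs63 = 3 / 6.0;                // RS(6,3): 3 parity cells per 6 data cells => 50%
        System.out.printf("3x replication: %.0f%%, RS(6,3): %.0f%%%n",
                replication * 100, rs63 * 100);
    }
}
```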
>
>The nightly Jenkins job of the branch has reported several successful
>runs,
>and doesn't show new flaky tests compared with trunk. We have posted
>several versions of the test plan including both unit testing and cluster
>testing, and have executed most tests in the plan. The most basic
>functionalities have been extensively tested and verified in several real
>clusters with different hardware configurations; results have been very
>stable. We have created follow-on tasks for more advanced error handling
>and optimization under the umbrella HDFS-8031. We also plan to implement
>or
>harden the integration of EC with existing features such as WebHDFS,
>snapshot, append, truncate, hflush, hsync, and so forth.
>
>Development of this feature has been a collaboration across many companies
>and institutions. I'd like to thank J. Andreina, Takanobu Asanuma,
>Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi
>Liu,
>Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo
>Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng
>for their code contributions and reviews. Andrew and Kai Zheng also made
>fundamental contributions to the initial design. Rui Li, Gao Rui, Kai
>Sasaki, Kai Zheng and many other contributors have made great efforts in
>system testing. Many thanks go to Weihua Jiang for proposing the JIRA, and
>ATM, Todd Lipcon, Silvius Rus, Suresh, as well as many others for
>providing
>helpful feedback.
>
>Following the community convention, this vote will last for 7 days (ending
>September 29th). Votes from Hadoop committers are binding but non-binding
>votes are very welcome as well. And here's my non-binding +1.
>
>Thanks,
>---
>Zhe Zhang



Re: [YETUS] Yetus TLP approved

2015-09-17 Thread Gangumalla, Uma
Congratulations! Great effort, Sean and team!

Regards,
Uma

On 9/17/15, 8:59 AM, "Sean Busbey"  wrote:

>Hi Folks!
>
>At yesterday's ASF board meeting the Apache Yetus TLP was approved.
>There's
>still some ASF Infra work to get done[1] before we can start transitioning
>our mailing list, jira, and code over.
>
>Thanks to all the folks in Hadoop who've helped us along this process. I
>look forward to our communities maintaining a healthy working relationship
>in the future.
>
>[1]: https://issues.apache.org/jira/browse/INFRA-10447
>
>-- 
>Sean



RE: [VOTE] Migration from subversion to git for version control

2014-08-10 Thread Gangumalla, Uma
+1

Regards,
Uma

-Original Message-
From: Karthik Kambatla [mailto:ka...@cloudera.com] 
Sent: Saturday, August 09, 2014 8:27 AM
To: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
Subject: [VOTE] Migration from subversion to git for version control

I have put together this proposal based on recent discussion on this topic.

Please vote on the proposal. The vote runs for 7 days.

   1. Migrate from subversion to git for version control.
   2. Force-push to be disabled on trunk and branch-* branches. Applying
   changes from any of trunk/branch-* to any of branch-* should be through
   "git cherry-pick -x".
   3. Force-push on feature-branches is allowed. Before pulling in a
   feature, the feature-branch should be rebased on latest trunk and the
   changes applied to trunk through "git rebase --onto" or "git cherry-pick
   ".
   4. Every time a feature branch is rebased on trunk, a tag that
   identifies the state before the rebase needs to be created (e.g.
   tag_feature_JIRA-2454_2014-08-07_rebase). These tags can be deleted once
   the feature is pulled into trunk and the tags are no longer useful.
   5. The relevance/use of tags stay the same after the migration.

Thanks
Karthik

PS: Per Andrew Wang, this should be an "Adoption of New Codebase" kind of vote 
and will be a Lazy 2/3 majority of PMC members.


RE: [VOTE] Change by-laws on release votes: 5 days instead of 7

2014-06-29 Thread Gangumalla, Uma
+1

Regards,
Uma

-Original Message-
From: Arun C Murthy [mailto:a...@hortonworks.com] 
Sent: Tuesday, June 24, 2014 2:24 PM
To: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
Subject: [VOTE] Change by-laws on release votes: 5 days instead of 7 

Folks,

 As discussed, I'd like to call a vote on changing our by-laws to change 
release votes from 7 days to 5.

 I've attached the change to by-laws I'm proposing.

 Please vote; the vote will run for the usual period of 7 days.

thanks,
Arun



[main]$ svn diff
Index: author/src/documentation/content/xdocs/bylaws.xml
===
--- author/src/documentation/content/xdocs/bylaws.xml   (revision 1605015)
+++ author/src/documentation/content/xdocs/bylaws.xml   (working copy)
@@ -344,7 +344,16 @@
 Votes are open for a period of 7 days to allow all active
 voters time to consider the vote. Votes relating to code
 changes are not subject to a strict timetable but should be
-made as timely as possible.
+made as timely as possible.
+
+ 
+  Product Release - Vote Timeframe
+   Release votes, alone, run for a period of 5 days. All other
+ votes are subject to the above timeframe of 7 days.
+ 
+   
+   
+


 


RE: [DISCUSS] Change by-laws on release votes: 5 days instead of 7

2014-06-24 Thread Gangumalla, Uma
Thanks Arun. 

+1 

Regards,
Uma

-Original Message-
From: Arun C. Murthy [mailto:a...@hortonworks.com] 
Sent: Saturday, June 21, 2014 11:07 PM
To: hdfs-...@hadoop.apache.org
Cc: common-dev@hadoop.apache.org; yarn-...@hadoop.apache.org; 
mapreduce-...@hadoop.apache.org
Subject: Re: [DISCUSS] Change by-laws on release votes: 5 days instead of 7

Uma,

 Voting periods are defined in *minimum* terms, so it already covers what you'd 
like to see i.e. the vote can continue longer.

thanks,
Arun

> On Jun 21, 2014, at 2:19 AM, "Gangumalla, Uma"  
> wrote:
> 
> How about proposing a 5-day vote and giving the RM the chance to extend the vote 
> by 2 more days (7 days total) if the RC did not receive enough votes within 
> 5 days? If an RC receives enough votes in 5 days, the RM can close the vote.
> One advantage I can see of 7-day voting is that it covers all the weekdays and 
> the weekend, so if someone wants to test over the weekend (due to weekday 
> schedules), it gives them that chance. 
> 
> Regards,
> Uma
> 
> -Original Message-
> From: Arun C Murthy [mailto:a...@hortonworks.com]
> Sent: Saturday, June 21, 2014 11:25 AM
> To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org; 
> yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> Subject: [DISCUSS] Change by-laws on release votes: 5 days instead of 
> 7
> 
> Folks,
> 
> I'd like to propose we change our by-laws to reduce our voting periods on new 
> releases from 7 days to 5.
> 
> Currently, it just takes too long to turn around releases; particularly if we 
> have critical security fixes etc.
> 
> Thoughts?
> 
> thanks,
> Arun
> 
> 



RE: [DISCUSS] Change by-laws on release votes: 5 days instead of 7

2014-06-21 Thread Gangumalla, Uma
How about proposing a 5-day vote and giving the RM the chance to extend the vote by 
2 more days (7 days total) if the RC did not receive enough votes within 5 days? 
If an RC receives enough votes in 5 days, the RM can close the vote.
One advantage I can see of 7-day voting is that it covers all the weekdays and the 
weekend, so if someone wants to test over the weekend (due to weekday schedules), 
it gives them that chance. 

Regards,
Uma

-Original Message-
From: Arun C Murthy [mailto:a...@hortonworks.com] 
Sent: Saturday, June 21, 2014 11:25 AM
To: hdfs-...@hadoop.apache.org; common-dev@hadoop.apache.org; 
yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
Subject: [DISCUSS] Change by-laws on release votes: 5 days instead of 7

Folks,

 I'd like to propose we change our by-laws to reduce our voting periods on new 
releases from 7 days to 5.

 Currently, it just takes too long to turn around releases; particularly if we 
have critical security fixes etc.

 Thoughts?

thanks,
Arun




RE: hadoop-2.5 - June end?

2014-06-11 Thread Gangumalla, Uma
Yes, Suresh.

I have merged HDFS-2006 (Extended Attributes) to branch-2, so it will be 
included in the 2.5 release.

Regards,
Uma

-Original Message-
From: Suresh Srinivas [mailto:sur...@hortonworks.com] 
Sent: Tuesday, June 10, 2014 10:15 PM
To: mapreduce-...@hadoop.apache.org
Cc: common-dev@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org
Subject: Re: hadoop-2.5 - June end?

We should also include the extended attributes feature for HDFS from HDFS-2006 in 
release 2.5.


On Mon, Jun 9, 2014 at 9:39 AM, Arun C Murthy  wrote:

> Folks,
>
>  As you can see from the Roadmap wiki, it looks like several items are 
> still a bit away from being ready.
>
>  I think rather than wait for them, it will be useful to create an 
> intermediate release (2.5) this month - I think ATS security is pretty 
> close, so we can ship that. I'm thinking of creating hadoop-2.5 by end 
> of the month, with a branch a couple of weeks prior.
>
>  Thoughts?
>
> thanks,
> Arun
>
>
>



--
http://hortonworks.com/download/
