Re: [DISCUSS] fate of branch-2.9

2020-08-26 Thread Wei-Chiu Chuang
Bumping this thread after 6 months.

Is anyone still interested in the 2.9 release line? Or are we good to start
the EOL process? 2.9.2 was released in Nov 2018.

I'd really like to see the community converge on fewer release lines and
make more frequent releases in each line.

Thanks,
Weichiu


On Fri, Mar 6, 2020 at 5:47 PM Wei-Chiu Chuang  wrote:

> I think that's a great suggestion.
> Currently, we make 1 minor release per year, and each minor release
> brings in 1,000 to 2,000 commits compared with the previous one.
> I can totally understand it is a big bite for users to swallow. Having a
> more frequent release cycle, plus LTS and non-LTS releases, should help with
> this. (Of course we will need to make release preparation much easier;
> it is currently a pain.)
>
> I am happy to discuss the release model further on the dev ML. LTS vs.
> non-LTS is one suggestion.
>
> Another similar issue: in the past Hadoop strove to maintain
> compatibility. However, this is no longer sustainable as more CVEs
> come from our dependencies: netty, jetty, jackson, etc.
> In many cases, updating the dependencies brings breaking changes. More
> recently, especially in Hadoop 3.x, I started making the effort to update
> dependencies much more frequently. How do users feel about this change?
>
> On Thu, Mar 5, 2020 at 7:58 AM Igor Dvorzhak 
> wrote:
>
>> Maybe Hadoop would benefit from adopting a release and support strategy
>> similar to Java's? I.e. designate some releases as LTS and support them
>> for 2 (?) years (it seems that the 2.7.x branch was the de facto LTS),
>> while other non-LTS releases would be supported for 6 months (or until
>> the next release). This should reduce the maintenance cost of non-LTS
>> releases and give conservative users the stability they want by letting
>> them wait for a new LTS release and upgrade to it.
>>
>> On Thu, Mar 5, 2020 at 1:26 AM Rupert Mazzucco 
>> wrote:
>>
>>> After recently jumping from 2.7.7 to 2.10 without issue myself, I vote
>>> for keeping only the 2.10 line.
>>> It would seem all other 2.x branches can upgrade to a 2.10.x easily if
>>> they feel like upgrading at all,
>>> unlike a jump to 3.x, which may require more planning.
>>>
>>> I also vote for having only one main 3.x branch. Why are there 3.1.x and
>>> 3.2.x seemingly competing,
>>> and now 3.3.x? For a community that does not have the resources to
>>> manage multiple release lines,
>>> you guys sure like to multiply release lines a lot.
>>>
>>> Cheers
>>> Rupert
>>>
>>> On Wed, Mar 4, 2020 at 7:40 PM Wei-Chiu Chuang wrote:
>>>
>>>> Forwarding the discussion thread from the dev mailing lists to the user
>>>> mailing lists.
>>>>
>>>> I'd like to get an idea of how many users are still on Hadoop 2.9.
>>>> Please share your thoughts.
>>>>
>>>> On Mon, Mar 2, 2020 at 6:30 PM Sree Vaddi wrote:
>>>>
>>>>> +1
>>>>>
>>>>> Sent from Yahoo Mail on Android
>>>>>
>>>>> On Mon, Mar 2, 2020 at 5:12 PM, Wei-Chiu Chuang wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Following the discussion to end branch-2.8, I want to start a
>>>>> discussion around what's next with branch-2.9. I am hesitant to use
>>>>> the words "end of life" but consider these facts:
>>>>>
>>>>> * 2.9.0 was released Dec 17, 2017.
>>>>> * 2.9.2, the last 2.9.x release, went out Nov 19, 2018, which is more
>>>>> than 15 months ago.
>>>>> * No one seems to be interested in being the release manager for 2.9.3.
>>>>> * Most if not all of the active Hadoop contributors are using Hadoop
>>>>> 2.10 or Hadoop 3.x.
>>>>> * We as a community do not have the cycles to manage multiple release
>>>>> lines, especially since Hadoop 3.3.0 is coming out soon.
>>>>>
>>>>> It is perhaps time to gradually reduce our footprint in Hadoop 2.x,
>>>>> and encourage people to upgrade to Hadoop 3.x.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>>


Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread Wei-Chiu Chuang
Thanks Brahma,

Eric, do you have a target Hadoop 3 release line in mind?

The "unofficial" plan here at Cloudera is to rebase our current dev
codebase from Hadoop 3.1.1 to 3.3 some time later. The Hadoop 3.1 code line
will approach its 3rd anniversary by this year's end so perhaps we can
start to sunset it.

On Wed, Aug 26, 2020 at 10:51 AM Brahma Reddy Battula 
wrote:

> One more update from me.
>
> We didn't face any issues with YARN. For HDFS, have a look at the
> following JIRAs.
>
> https://issues.apache.org/jira/browse/HDFS-13596
> https://issues.apache.org/jira/browse/HDFS-14396
> https://issues.apache.org/jira/browse/HDFS-14509
>
> The following JIRA is an incompatible change for ACL commands. Only
> hadoop-3 clients will work against a hadoop-3 server during the upgrade.
>
> https://issues.apache.org/jira/browse/HDFS-6984
>
>
>
> On Wed, Aug 26, 2020 at 11:06 PM Brahma Reddy Battula 
> wrote:
>
> >
> > Hi Eric,
> >
> > check the following references for the same.
> >
> > 01/02/2020 Didi talked about their large scale HDFS cluster upgrade
> > experience.
> >
> > Slides:
> > https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy
> >
> > Recording:
> >
> > https://cloudera.zoom.us/rec/share/7MF_dLX0339OY5391xvkZP8NLrXieaa8gyZK-fYJnUkGOUUXvaUh5cl_6AVYetQl
> >
> > Didi studied two upgrade approaches from the community documentation:
> > express upgrade and rolling upgrade. Rolling upgrade was selected.
> >
> > Yahoo Japan was trying out an upgrade from hadoop-2.6 to hadoop-3.2.1
> >
> > https://techblog.yahoo.co.jp/entry/20191206786320/
> >
> > On Wed, Aug 26, 2020 at 6:56 PM epa...@apache.org 
> > wrote:
> >
> >> Hello. Just a reminder that today I would like to invite you all to
> >> discuss your
> >> experiences migrating from Hadoop 2 to Hadoop 3.
> >>
> >> -Eric
> >>
> >> On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org <
> >> epa...@apache.org> wrote:
> >>
> >> Hello everyone!
> >>
> >> We are considering migrating to Hadoop 3, and we would be very interested to
> >> hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
> >> and can provide insights, please kindly consider attending the following:
> >>
> >> Date: Wednesday, Aug 26, 2020
> >> Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
> >> Location: Zoom: https://cloudera.zoom.us/j/880548968
> >>
> >> Hope to see you there!
> >>
> >> Thank you!
> >> Eric Payne
> >> @ Verizon Media
> >>
> >> -
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >>
> >>
> >
> > --
> >
> >
> >
> > --Brahma Reddy Battula
> >
>
>
> --
>
>
>
> --Brahma Reddy Battula
>


Re: Mandarin Hadoop online sync this week

2020-08-26 Thread Wei-Chiu Chuang
This week's summary:

8/26 Mandarin online sync

Weichiu, Xiaoiao, Baoloongmao, Hui, wuweiwei, Leon Gao, Lisheng Sun,
Jinglun, zhoubin86

Leon shared a DataNode improvement proposal at Uber.

Disks have different storage densities. The goal is to balance disk IO
among disks of different sizes.

Problem: the archive disks' IO utilization is very low, and they want to use
them more.

The proposed change will be based on HSM, with quite minimal changes.

Cold data is in GCS. There is a simple scheme to copy cold data to GCS. The
data in GCS is not intended to be readily accessible, so don't worry about
the scheme change.

Jinglun shared solutions to an operational problem: NameNode QPS dropped,
with waiting time above 1 second and processing time above 400 ms.
Solutions: (1) migrate a directory to a new namespace; (2) use RBF to hash
a directory out to multiple namespaces, reducing the pressure on a
particular NN.
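
As a rough illustration of the second solution, the sketch below spreads the children of one directory over several namespaces with a plain modulo hash. This is not RBF's actual resolver; the class name `HashSpreadSketch`, the namespace ids `ns1` and `ns2`, and the sample paths are all hypothetical, chosen only to show the idea of a deterministic hash-based placement.

```java
import java.util.Arrays;
import java.util.List;

class HashSpreadSketch {
    // Pick a namespace for a child entry by hashing its name.
    // String.hashCode() is specified by the JDK, so the choice is stable across runs.
    static String pickNamespace(String childName, List<String> namespaces) {
        int index = Math.floorMod(childName.hashCode(), namespaces.size());
        return namespaces.get(index);
    }

    public static void main(String[] args) {
        // Hypothetical namespace ids and child names, for illustration only.
        List<String> namespaces = Arrays.asList("ns1", "ns2");
        for (String child : Arrays.asList("user-a", "user-b", "user-c")) {
            System.out.println("/data/" + child + " -> " + pickNamespace(child, namespaces));
        }
    }
}
```

Because each child always maps to the same namespace, requests for one hot directory end up divided among several NameNodes instead of landing on one.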

Baoloongmao suggested we can port Ozone features into Hadoop Common. For
example, Java-based configuration is a powerful feature which can benefit
Hadoop as well.


On Tue, Aug 25, 2020 at 9:47 AM Wei-Chiu Chuang  wrote:

> Hello,
>
> There hasn't been a Mandarin online sync for quite some time. I'd like to
> call for one this week:
>
> Date/time:
>
> 8/27 Thursday Beijing Time 1PM
> 8/26 Wednesday US Pacific Time 10PM
>
> Link:
> https://cloudera.zoom.us/j/880548968
>
> Past sync summary:
>
> https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit
>
>
>


Re: [DISCUSS] GitHub PR link auto-posting to JIRA?

2020-08-26 Thread Ayush Saxena
Hi Mingliang,
I think this issue has been there for a couple of months; it used to work
earlier, IIRC.
I checked a bit, and I think ASF-GITHUB-BOT didn't have permissions. For
now I have added it as HDFS-Contributor-1 (temporarily), and I just saw one
notification from GitHub on HDFS-15025.
Can you check if that solves the issue?

-Ayush

On Thu, 27 Aug 2020 at 03:41, Mingliang Liu  wrote:

> Hi,
>
> I found that a GitHub PR will not show up under "links" on the JIRA even
> if the PR subject starts with a JIRA number.
>
> Is this a known issue? I see this works for HBase projects, but not Hadoop.
>
> Thanks,
>


[jira] [Created] (HADOOP-17231) empty getDefaultExtension() is ignored

2020-08-26 Thread Ruslan Dautkhanov (Jira)
Ruslan Dautkhanov created HADOOP-17231:
--

 Summary: empty getDefaultExtension() is ignored
 Key: HADOOP-17231
 URL: https://issues.apache.org/jira/browse/HADOOP-17231
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 3.1.3, 3.2.0
Reporter: Ruslan Dautkhanov


Use case - source files are gz-compressed but have no extensions.

Attempting to auto-decompress them through 
{code:scala}
package com.my.codec.test

import org.apache.hadoop.io.compress.GzipCodec

// Gzip codec that claims files with no extension at all
class GZCodec extends GzipCodec {
  override def getDefaultExtension(): String = ""
}
{code}
(notice the empty getDefaultExtension) and then setting *io.compression.codecs* 
to com.my.codec.test.GZCodec has no effect. 

Similar tests with a one-character extension do work, so only the 
empty-string getDefaultExtension case is broken. 

I guess the issue is somewhere in 
[https://github.com/c9n/hadoop/blob/master/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/CompressionCodecFactory.java#L109]
 

but it's not obvious. 
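
One possible explanation can be sketched with a simplified model. It assumes the factory keeps codecs in a sorted map keyed by the reversed extension and picks the greatest key below the reversed file name; this is an assumption about the lookup, not verified against the affected versions. The class `SuffixLookupSketch`, the codec labels, and the file names are hypothetical.

```java
import java.util.SortedMap;
import java.util.TreeMap;

class SuffixLookupSketch {
    // Codec labels keyed by REVERSED extension, e.g. ".deflate" is stored as "etalfed.".
    private final SortedMap<String, String> codecs = new TreeMap<>();

    void register(String extension, String codecName) {
        codecs.put(new StringBuilder(extension).reverse().toString(), codecName);
    }

    // Take the greatest key strictly below the reversed file name and
    // accept it only if it is a prefix of that reversed name.
    String lookup(String fileName) {
        String reversed = new StringBuilder(fileName).reverse().toString();
        SortedMap<String, String> head = codecs.headMap(reversed);
        if (head.isEmpty()) {
            return null;
        }
        String candidate = head.lastKey();
        return reversed.startsWith(candidate) ? codecs.get(candidate) : null;
    }

    public static void main(String[] args) {
        SuffixLookupSketch factory = new SuffixLookupSketch();
        factory.register("", "custom-gzip");      // empty extension
        factory.register(".deflate", "deflate");  // any other registered codec
        System.out.println(factory.lookup("events-log"));
        System.out.println(factory.lookup("part.deflate"));
    }
}
```

In this model the empty key is only selected when no other registered key sorts below the reversed file name. Any such key becomes the single candidate, fails the prefix check, and the lookup returns nothing, so the empty extension is never tried. If the real factory behaves this way, that would explain why a non-empty extension works while the empty one appears to be ignored.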

Folks have built some workarounds using custom readers, for example, 
 # [https://daynebatten.com/2015/11/override-hadoop-compression-codec-file-extension/]
 # [https://stackoverflow.com/questions/52011697/how-to-read-a-compressed-gzip-file-without-extension-in-spark?rq=1]
 

Hopefully it would be an easy fix to support empty getDefaultExtension? 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread epa...@apache.org
Thank you! This will be very helpful. And I really appreciate your 
participation in today's meeting.
-Eric

On Wednesday, August 26, 2020, 12:36:38 PM CDT, Brahma Reddy Battula 
 wrote: 

Hi Eric,

check the following references for the same.

01/02/2020 Didi talked about their large scale HDFS cluster upgrade
experience.

Slides:
https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy

Recording:
https://cloudera.zoom.us/rec/share/7MF_dLX0339OY5391xvkZP8NLrXieaa8gyZK-fYJnUkGOUUXvaUh5cl_6AVYetQl

Didi studied two upgrade approaches from the community documentation:
express upgrade and rolling upgrade. Rolling upgrade was selected.

Yahoo Japan was trying out an upgrade from hadoop-2.6 to hadoop-3.2.1

https://techblog.yahoo.co.jp/entry/20191206786320/

On Wed, Aug 26, 2020 at 6:56 PM epa...@apache.org  wrote:

> Hello. Just a reminder that today I would like to invite you all to
> discuss your
> experiences migrating from Hadoop 2 to Hadoop 3.
>
> -Eric
>
> On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org <
> epa...@apache.org> wrote:
>
> Hello everyone!
>
> We are considering migrating to Hadoop 3, and we would be very interested
> to
> hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
> and can provide insights, please kindly consider attending the following:
>
> Date: Wednesday, Aug 26, 2020
> Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
> Location: Zoom: https://cloudera.zoom.us/j/880548968
>
> Hope to see you there!
>
> Thank you!
> Eric Payne
> @ Verizon Media

>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>

-- 



--Brahma Reddy Battula





Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2020-08-26 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/38/

No changes


[DISCUSS] GitHub PR link auto-posting to JIRA?

2020-08-26 Thread Mingliang Liu
Hi,

I found that a GitHub PR will not show up under "links" on the JIRA even if
the PR subject starts with a JIRA number.

Is this a known issue? I see this works for HBase projects, but not Hadoop.

Thanks,


Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread Brahma Reddy Battula
One more update from me.

We didn't face any issues with YARN. For HDFS, have a look at the
following JIRAs.

https://issues.apache.org/jira/browse/HDFS-13596
https://issues.apache.org/jira/browse/HDFS-14396
https://issues.apache.org/jira/browse/HDFS-14509

The following JIRA is an incompatible change for ACL commands. Only hadoop-3
clients will work against a hadoop-3 server during the upgrade.

https://issues.apache.org/jira/browse/HDFS-6984



On Wed, Aug 26, 2020 at 11:06 PM Brahma Reddy Battula 
wrote:

>
> Hi Eric,
>
> check the following references for the same.
>
> 01/02/2020 Didi talked about their large scale HDFS cluster upgrade
> experience.
>
> Slides:
> https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy
>
> Recording:
> https://cloudera.zoom.us/rec/share/7MF_dLX0339OY5391xvkZP8NLrXieaa8gyZK-fYJnUkGOUUXvaUh5cl_6AVYetQl
>
> Didi studied two upgrade approaches from the community documentation:
> express upgrade and rolling upgrade. Rolling upgrade was selected.
>
> Yahoo Japan was trying out an upgrade from hadoop-2.6 to hadoop-3.2.1
>
> https://techblog.yahoo.co.jp/entry/20191206786320/
>
> On Wed, Aug 26, 2020 at 6:56 PM epa...@apache.org 
> wrote:
>
>> Hello. Just a reminder that today I would like to invite you all to
>> discuss your
>> experiences migrating from Hadoop 2 to Hadoop 3.
>>
>> -Eric
>>
>> On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org <
>> epa...@apache.org> wrote:
>>
>> Hello everyone!
>>
>> We are considering migrating to Hadoop 3, and we would be very interested to
>> hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
>> and can provide insights, please kindly consider attending the following:
>>
>> Date: Wednesday, Aug 26, 2020
>> Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
>> Location: Zoom: https://cloudera.zoom.us/j/880548968
>>
>> Hope to see you there!
>>
>> Thank you!
>> Eric Payne
>> @ Verizon Media
>>
>> -
>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>>
>>
>
> --
>
>
>
> --Brahma Reddy Battula
>


-- 



--Brahma Reddy Battula


Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread Brahma Reddy Battula
Hi Eric,

check the following references for the same.

01/02/2020 Didi talked about their large scale HDFS cluster upgrade
experience.

Slides:
https://drive.google.com/open?id=1iwJ1asalYfgnOCBuE-RfeG-NpSocjIcy

Recording:
https://cloudera.zoom.us/rec/share/7MF_dLX0339OY5391xvkZP8NLrXieaa8gyZK-fYJnUkGOUUXvaUh5cl_6AVYetQl

Didi studied two upgrade approaches from the community documentation:
express upgrade and rolling upgrade. Rolling upgrade was selected.

Yahoo Japan was trying out an upgrade from hadoop-2.6 to hadoop-3.2.1

https://techblog.yahoo.co.jp/entry/20191206786320/

On Wed, Aug 26, 2020 at 6:56 PM epa...@apache.org  wrote:

> Hello. Just a reminder that today I would like to invite you all to
> discuss your
> experiences migrating from Hadoop 2 to Hadoop 3.
>
> -Eric
>
> On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org <
> epa...@apache.org> wrote:
>
> Hello everyone!
>
> We are considering migrating to Hadoop 3, and we would be very interested
> to
> hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
> and can provide insights, please kindly consider attending the following:
>
> Date: Wednesday, Aug 26, 2020
> Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
> Location: Zoom: https://cloudera.zoom.us/j/880548968
>
> Hope to see you there!
>
> Thank you!
> Eric Payne
> @ Verizon Media
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>

-- 



--Brahma Reddy Battula


[jira] [Created] (HADOOP-17230) JAVA_HOME with spaces not supported

2020-08-26 Thread Wayne Seguin (Jira)
Wayne Seguin created HADOOP-17230:
-

 Summary: JAVA_HOME with spaces not supported
 Key: HADOOP-17230
 URL: https://issues.apache.org/jira/browse/HADOOP-17230
 Project: Hadoop Common
  Issue Type: Bug
  Components: common
Affects Versions: 3.3.0
 Environment: Windows 10

Hadoop 3.1.0, 3.2.1, 3.3.0, etc
Reporter: Wayne Seguin
 Attachments: image-2020-08-26-12-24-04-118.png

When running on Windows, if JAVA_HOME contains a space (which is frequent, 
since the default Java install path is "C:\Program Files\Java"), Hadoop 
fails to run. 

!image-2020-08-26-12-24-04-118.png!
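
The attached screenshot is not reproduced here, so the exact failure is not shown. As a generic illustration of why a space in the path can break a launcher, the sketch below compares an unquoted expansion (the command line is split on whitespace) with passing the executable path as a single argument. It does not model Hadoop's actual scripts; the class name `JavaHomeSpacesSketch` and the `jdk` directory are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class JavaHomeSpacesSketch {
    // Models an unquoted expansion: the whole command line is split on whitespace.
    static List<String> unquoted(String javaHome) {
        return Arrays.asList((javaHome + "\\bin\\java.exe -version").split("\\s+"));
    }

    // Models a quoted expansion: the executable path stays one argument.
    static List<String> quoted(String javaHome) {
        return Arrays.asList(javaHome + "\\bin\\java.exe", "-version");
    }

    public static void main(String[] args) {
        String javaHome = "C:\\Program Files\\Java\\jdk"; // hypothetical install directory
        System.out.println(unquoted(javaHome)); // path is cut at the space
        System.out.println(quoted(javaHome));   // path survives intact
    }
}
```

With the unquoted form the first token is only `C:\Program`, which is not an executable, so the launch fails before Java is ever started.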






[jira] [Resolved] (HADOOP-17224) Install Intel ISA-L library in Dockerfile

2020-08-26 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma resolved HADOOP-17224.
---
Fix Version/s: 3.4.0
   Resolution: Fixed

Merged to trunk. Thanks for reviewing it, [~iwasakims].

> Install Intel ISA-L library in Dockerfile
> -
>
> Key: HADOOP-17224
> URL: https://issues.apache.org/jira/browse/HADOOP-17224
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Takanobu Asanuma
>Assignee: Takanobu Asanuma
>Priority: Major
> Fix For: 3.4.0
>
>
> Currently, there is no ISA-L library in the Docker container, and Jenkins 
> skips the native tests, TestNativeRSRawCoder and TestNativeXORRawCoder.






Re: [Virtual MEETUP]: Migration to Hadoop 3

2020-08-26 Thread epa...@apache.org
Hello. Just a reminder that today I would like to invite you all to discuss your
experiences migrating from Hadoop 2 to Hadoop 3.

-Eric

On Monday, August 24, 2020, 1:58:37 PM CDT, epa...@apache.org 
 wrote: 

Hello everyone!

We are considering migrating to Hadoop 3, and we would be very interested to
hear about your experiences. If you have migrated from Hadoop 2 to Hadoop 3
and can provide insights, please kindly consider attending the following:

Date: Wednesday, Aug 26, 2020
Time: 10:00 A.M. PDT / 12:00 P.M. CDT / 01:00 P.M. EDT / 05:00 P.M. GMT
Location: Zoom: https://cloudera.zoom.us/j/880548968

Hope to see you there!

Thank you!
Eric Payne
@ Verizon Media
