[jira] [Reopened] (HADOOP-15547) WASB: improve listStatus performance

2018-08-31 Thread Thomas Marquardt (JIRA)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Marquardt reopened HADOOP-15547:
---

Reactivating for branch-2 backport.

> WASB: improve listStatus performance
> 
>
> Key: HADOOP-15547
> URL: https://issues.apache.org/jira/browse/HADOOP-15547
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 2.9.1, 3.0.2
>Reporter: Thomas Marquardt
>Assignee: Thomas Marquardt
>Priority: Major
> Fix For: 3.1.1
>
> Attachments: HADOOP-15547-004.patch, HADOOP-15547-004.patch, 
> HADOOP-15547.001.patch, HADOOP-15547.002.patch, HADOOP-15547.003.patch
>
>
> The WASB implementation of FileSystem.listStatus is very slow because it uses 
> an O(n!) algorithm to remove duplicates, and it uses too much memory because 
> of the extra conversion from BlobListItem to FileMetadata to FileStatus.  It 
> takes over 30 minutes to list 700,000 files.
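
As a rough illustration of the duplicate-removal cost described in the issue, here is a minimal sketch (not the attached patch) of dropping duplicate listing entries in a single pass with a hash-based set. The class name ListingDedup and the use of plain String keys are assumptions made for this example; the real WASB code works with its own metadata types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of hadoop-azure.
class ListingDedup {
    // Returns the keys in first-seen order with duplicates dropped.
    // Each key is hashed once, so the pass is expected O(n) in the
    // number of entries.
    static List<String> dedupe(List<String> keys) {
        Set<String> seen = new LinkedHashSet<>(keys);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<String> listing =
            Arrays.asList("dir/a", "dir/b", "dir/a", "dir/c", "dir/b");
        System.out.println(dedupe(listing)); // prints [dir/a, dir/b, dir/c]
    }
}
```

A LinkedHashSet is used here only so that the listing order is preserved; whether order matters for listStatus callers is not stated in the issue.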



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Wangda Tan
+1, thanks for working on this, Marton!

Best,
Wangda


Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Arpit Agarwal
+1

Thanks for initiating this Marton.


On 8/31/18, 1:07 AM, "Elek, Marton"  wrote:

Bumping this thread one last time.

I have the following proposal:

1. I will request a new git repository, hadoop-site.git, and import the 
new site into it (it has exactly the same content as the existing site).

2. I will ask infra to use the new repository as the source of 
hadoop.apache.org

3. For the next two months, I will manually sync all changes (release 
announcements, new committers) from git back to the svn site.

IN CASE OF ANY PROBLEM we can switch back to svn without difficulty.

If no-one objects within three days, I'll assume lazy consensus and 
start with this plan. Please comment if you have objections.

Again: this allows immediate fallback at any time, as the svn repo will be kept 
as is (and I will keep it up to date for the next 2 months).

Thanks,
Marton


On 06/21/2018 09:00 PM, Elek, Marton wrote:
> 
> Thank you very much for bumping this thread.
> 
> 
> About [2]: (just for clarification) the content of the proposed 
> website is exactly the same as the old one.
> 
> About [1]: I believe that "mvn site" is perfect for the 
> documentation, but for website creation there are simpler and more 
> powerful tools.
> 
> Hugo is simpler than jekyll: just one binary, without 
> dependencies, that works everywhere (mac, linux, windows).
> 
> Hugo is much more powerful than "mvn site": it is easier to create/use 
> a more modern layout/theme, and easier to handle the content (for example, 
> new release announcements could be generated as part of the release 
> process).
> 
> I think it's very low risk to try out a new approach for the site (and 
> easy to roll back in case of problems).
> 
> Marton
> 
> ps: I just updated the patch/preview site with the recent releases:
> 
> ***
> * http://hadoop.anzix.net *
> ***
> 
> On 06/21/2018 01:27 AM, Vinod Kumar Vavilapalli wrote:
>> Got pinged about this offline.
>>
>> Thanks for keeping at it, Marton!
>>
>> I think there are two road-blocks here
>>   (1) Is the mechanism by which the website is built good enough - 
>> mvn-site / hugo etc.?
>>   (2) Is the new website good enough?
>>
>> For (1), I think we just need more committer attention, so that we get 
>> feedback rapidly and get it in.
>>
>> For (2), how about we do it in a different way in the interest of 
>> progress?
>>   - We create a hadoop.apache.org/new-site/ where this new site goes.
>>   - We then modify the existing web-site to say that there is a new 
>> site/experience that folks can click on a link and navigate to
>>   - As this new website matures and gets feedback & fixes, we finally 
>> pull the plug at a later point of time when we think we are good to go.
>>
>> Thoughts?
>>
>> +Vinod
>>
>>> On Feb 16, 2018, at 3:10 AM, Elek, Marton  wrote:
>>>
>>> Hi,
>>>
>>> I would like to bump this thread up.
>>>
>>> TL;DR: There is a proposed version of a new hadoop site, which is 
>>> available here: https://elek.github.io/hadoop-site-proposal/ and 
>>> https://issues.apache.org/jira/browse/HADOOP-14163
>>>
>>> Please let me know what you think about it.
>>>
>>>
>>> Longer version:
>>>
>>> This thread was started a long time ago, with the aim of moving to a more 
>>> modern hadoop site.
>>>
>>> The goals were:
>>>
>>> 1. To make it easier to manage (the release entries could be 
>>> created by a script as part of the release process)
>>> 2. To use a better look-and-feel
>>> 3. To move it out of svn and into git
>>>
>>> I proposed to:
>>>
>>> 1. Move the existing site to git and generate it with hugo (which is 
>>> a single, standalone binary)
>>> 2. Move both the rendered and source branches to git.
>>> 3. (Create a jenkins job to generate the site automatically)
>>>
>>> NOTE: this is just about the Forrest-based hadoop.apache.org, NOT about 
>>> the documentation, which is generated by mvn-site (as before).
>>>
>>>
>>> I received a lot of valuable feedback and improved the proposed site 
>>> according to the comments. Allen had some concerns about the 
>>> technologies used (hugo vs. mvn-site), and I answered all the questions, 
>>> explaining why I think mvn-site is best for documentation and hugo is 
>>> best for generating the site.
>>>
>>>
>>> I would like to finish this effort/jira: I would like to start a 
>>> discussion about using this proposed version and approach as the new 
>>> site of Apache Hadoop. Please let me know what you think.
>>>
>>>
>>> Thanks a lot,
>>> Marton
>>>
   

Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Brahma Reddy Battula
+1

It would be better to link to the new version from the old version.


Brahma Reddy Battula


Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Sangjin Lee
+1. Thanks for the work, Marton!


Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Vinod Kumar Vavilapalli
Is there no way to host the new site and the old site concurrently? And link 
back & forth?

+Vinod



[jira] [Created] (HADOOP-15710) ABFS checkException to map 403 to AccessDeniedException

2018-08-31 Thread Steve Loughran (JIRA)
Steve Loughran created HADOOP-15710:
---

 Summary: ABFS checkException to map 403 to AccessDeniedException
 Key: HADOOP-15710
 URL: https://issues.apache.org/jira/browse/HADOOP-15710
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Affects Versions: HADOOP-15407
Reporter: Steve Loughran


When you can't authenticate to ABFS, you get a 403 exception back. This should be 
translated into an access denied exception for better clarity/handling.
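
A minimal sketch of the kind of translation being asked for, assuming the HTTP status code of the failed request is available at the point where the exception is checked. The class StatusTranslator, its method signature, and the example path are hypothetical and only for illustration; java.nio.file.AccessDeniedException is the standard JDK class.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.nio.file.AccessDeniedException;

// Hypothetical helper; the name and signature are illustrative only and
// are not taken from the ABFS code.
class StatusTranslator {
    // Maps the HTTP status code of a failed store request to an IOException.
    // 403 becomes AccessDeniedException (keeping the original as its cause);
    // any other status returns the original exception unchanged.
    static IOException translate(int statusCode, String path, IOException cause) {
        if (statusCode == HttpURLConnection.HTTP_FORBIDDEN) {
            AccessDeniedException denied =
                new AccessDeniedException(path, null, cause.getMessage());
            denied.initCause(cause);
            return denied;
        }
        return cause;
    }

    public static void main(String[] args) {
        IOException raw = new IOException("server returned 403");
        IOException mapped = translate(403, "abfs://container/dir", raw);
        System.out.println(mapped.getClass().getSimpleName());
    }
}
```

Because AccessDeniedException is itself an IOException, callers that already catch IOException keep working, while callers that care about authorization failures can catch the more specific type.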






Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread larry mccay
+1 from me



[jira] [Created] (HADOOP-15709) Move S3Guard LocalMetadataStore constants to org.apache.hadoop.fs.s3a.Constants

2018-08-31 Thread Gabor Bota (JIRA)
Gabor Bota created HADOOP-15709:
---

 Summary: Move S3Guard LocalMetadataStore constants to 
org.apache.hadoop.fs.s3a.Constants
 Key: HADOOP-15709
 URL: https://issues.apache.org/jira/browse/HADOOP-15709
 Project: Hadoop Common
  Issue Type: Sub-task
Affects Versions: 3.2
Reporter: Gabor Bota
Assignee: Gabor Bota


Move the following constants from 
{{org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore}} to  
{{org.apache.hadoop.fs.s3a.Constants}} (where they should be):
* DEFAULT_MAX_RECORDS
* DEFAULT_CACHE_ENTRY_TTL_MSEC
* CONF_MAX_RECORDS
* CONF_CACHE_ENTRY_TTL






review for IPC client change needed

2018-08-31 Thread Steve Loughran
Hi

While we wait for Jenkins to return, there's a patch for IPC client shutdown 
which needs some review by people with experience of that IPC client code:

https://issues.apache.org/jira/browse/HADOOP-10219

The IPC code is a key area, which is why it's sensitive, yet its shutdown logic 
is known to be broken and a source of timeouts in shutdown hooks.

Can anyone with experience in this area take a look? Otherwise, those of us with 
superficial experience will be doing the voting for you.

thanks

-steve


Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Steve Loughran



sounds good to me

+1






[jira] [Created] (HADOOP-15708) Reading values from Configuration before adding deprecations make it impossible to read value with deprecated key

2018-08-31 Thread Szilard Nemeth (JIRA)
Szilard Nemeth created HADOOP-15708:
---

 Summary: Reading values from Configuration before adding 
deprecations make it impossible to read value with deprecated key
 Key: HADOOP-15708
 URL: https://issues.apache.org/jira/browse/HADOOP-15708
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Szilard Nemeth
Assignee: Szilard Nemeth


Hadoop Common contains a widely used Configuration class.
This class can handle deprecations of properties, e.g. if property 'A' gets 
deprecated with an alternative property key 'B', users can access property 
values with keys 'A' and 'B'.
Unfortunately, this does not work in one case.
When a config file (for instance, XML) is specified and a property is read with 
the config.get() method, the config is loaded from the file at that point. 
If the XML config refers to a deprecated key, and the deprecation mapping is 
only specified after some config value has already been retrieved, then the 
config value can be retrieved with neither the deprecated key nor the new key.
The attached patch contains a testcase that reproduces this incorrect behavior.

Here is an outline of what the testcase does:
1. Creates an XML config file with a deprecated property
2. Adds the config to the Configuration object
3. Retrieves the config with its deprecated key (it does not really matter 
which property the user gets, could be any)
4. Specifies the deprecation rules including the one defined in the config
5. Prints and asserts the property retrieved from the config with both the 
deprecated and the new property keys. 

For reference, here is the log of one execution that actually shows what the 
issue is:

{noformat}
Loaded items: 1
Looked up property value with name hadoop.zk.address: null
Looked up property value with name yarn.resourcemanager.zk-address: 
dummyZkAddress
Contents of config file: [, , 
yarn.resourcemanager.zk-addressdummyZkAddress,
 ]
Looked up property value with name hadoop.zk.address: null
2018-08-31 10:10:06,484 INFO  Configuration.deprecation 
(Configuration.java:logDeprecation(1397)) - yarn.resourcemanager.zk-address is 
deprecated. Instead, use hadoop.zk.address
Looked up property value with name hadoop.zk.address: null
Looked up property value with name hadoop.zk.address: null

java.lang.AssertionError: 
Expected :dummyZkAddress
Actual   :null
{noformat}
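
The ordering problem can be illustrated with a small toy model. This is NOT Hadoop's Configuration class, and whether Configuration works this way internally is an assumption; the sketch only shows one way in which a lazy load that happens before the deprecation mapping is registered produces the symptom in the log above. The class ToyConfig and the key some.other.key are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: this is NOT org.apache.hadoop.conf.Configuration.
class ToyConfig {
    // Deprecated key -> replacement key, shared by all instances.
    static final Map<String, String> DEPRECATIONS = new HashMap<>();

    private final Map<String, String> fileContents; // stands in for the XML file
    private Map<String, String> loaded;             // filled on the first get()

    ToyConfig(Map<String, String> fileContents) {
        this.fileContents = fileContents;
    }

    // Lazy load: deprecated keys are renamed using only the mappings that
    // are registered at the moment of loading.
    private void loadIfNeeded() {
        if (loaded != null) {
            return;
        }
        loaded = new HashMap<>();
        for (Map.Entry<String, String> e : fileContents.entrySet()) {
            String key = DEPRECATIONS.getOrDefault(e.getKey(), e.getKey());
            loaded.put(key, e.getValue());
        }
    }

    // Lookup: the requested name is translated with the current mappings.
    String get(String name) {
        loadIfNeeded();
        return loaded.get(DEPRECATIONS.getOrDefault(name, name));
    }

    public static void main(String[] args) {
        Map<String, String> file = new HashMap<>();
        file.put("yarn.resourcemanager.zk-address", "dummyZkAddress");
        ToyConfig conf = new ToyConfig(file);

        conf.get("some.other.key"); // triggers the load before any mapping exists
        DEPRECATIONS.put("yarn.resourcemanager.zk-address", "hadoop.zk.address");

        System.out.println(conf.get("hadoop.zk.address"));               // null
        System.out.println(conf.get("yarn.resourcemanager.zk-address")); // null
    }
}
```

In this toy, registering the mapping before the first get() makes both keys resolve to dummyZkAddress, which matches the behavior the issue expects.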







Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-08-31 Thread Elek, Marton

Bumping this thread one last time.

I have the following proposal:

1. I will request a new git repository hadoop-site.git and import the 
new site there (it has exactly the same content as the existing site).


2. I will ask infra to use the new repository as the source of 
hadoop.apache.org


3. For the next two months I will manually sync all changes (release 
announcements, new committers) from git back to the svn site.


IN CASE OF ANY PROBLEM we can easily switch back to svn.

If no-one objects within three days, I'll assume lazy consensus and 
start with this plan. Please comment if you have objections.


Again: this allows immediate fallback at any time, as the svn repo will be 
kept as is (and I will keep it up to date for the next 2 months).


Thanks,
Marton


On 06/21/2018 09:00 PM, Elek, Marton wrote:


Thank you very much for bumping this thread.


About [2]: (just to clarify) the content of the proposed website is 
exactly the same as the old one.


About [1]: I believe that "mvn site" is perfect for the documentation, 
but for website creation there are simpler and more powerful tools.


Hugo is simpler than jekyll: just one binary, without dependencies, 
that works everywhere (mac, linux, windows).


Hugo is much more powerful than "mvn site": it is easier to create/use a 
more modern layout/theme, and easier to handle the content (for example, 
new release announcements could be generated as part of the release 
process).


I think it's very low risk to try out a new approach for the site (and 
easy to roll back in case of problems).


Marton

ps: I just updated the patch/preview site with the recent releases:

***
* http://hadoop.anzix.net *
***

On 06/21/2018 01:27 AM, Vinod Kumar Vavilapalli wrote:

Got pinged about this offline.

Thanks for keeping at it, Marton!

I think there are two road-blocks here:
  (1) Is the mechanism by which the website is built good enough - 
mvn-site / hugo etc?

  (2) Is the new website good enough?

For (1), I just think we need more committer attention to get 
feedback rapidly and get it in.


For (2), how about we do it in a different way in the interest of 
progress?

  - We create a hadoop.apache.org/new-site/ where this new site goes.
  - We then modify the existing web-site to say that there is a new 
site/experience that folks can navigate to by clicking a link
  - As this new website matures and gets feedback & fixes, we finally 
pull the plug at a later point in time when we think we are good to go.


Thoughts?

+Vinod


On Feb 16, 2018, at 3:10 AM, Elek, Marton  wrote:

Hi,

I would like to bump this thread up.

TLDR; There is a proposed version of a new hadoop site which is 
available from here: https://elek.github.io/hadoop-site-proposal/ and 
https://issues.apache.org/jira/browse/HADOOP-14163


Please let me know what you think about it.


Longer version:

This thread started a long time ago, with the aim of a more modern hadoop site:

Goals were:

1. To make it easier to manage it (the release entries could be 
created by a script as part of the release process)

2. To use a better look-and-feel
3. Move it out from svn to git

I proposed to:

1. Move the existing site to git and generate it with hugo (which is 
a single, standalone binary)

2. Move both the rendered and source branches to git.
3. (Create a jenkins job to generate the site automatically)

NOTE: this is just about the forrest-based hadoop.apache.org, NOT about 
the documentation, which is generated by mvn-site (as before)



I got a lot of valuable feedback and improved the proposed site 
according to the comments. Allen had some concerns about the 
technologies used (hugo vs. mvn-site), and I answered all the questions, 
explaining why I think mvn-site is best for documentation and hugo is 
best for generating the site.



I would like to finish this effort/jira: I would like to start a 
discussion about using this proposed version and approach as a new 
site of Apache Hadoop. Please let me know what you think.



Thanks a lot,
Marton

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org







-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

