Re: [VOTE] Hadoop 3.2.x EOL

2023-12-05 Thread Mingliang Liu
+1

On Tue, Dec 5, 2023 at 8:09 PM Xiaoqiao He  wrote:

> Dear Hadoop devs,
>
> Given the feedback from the discussion thread [1], I'd like to start
> an official thread for the community to vote on release line 3.2 EOL.
>
> It will include:
> a. An official announcement that no further regular Hadoop 3.2.x
> releases will be made.
> b. Issues that target 3.2.5 will not be fixed.
>
> This vote will run for 7 days and conclude by Dec 13, 2023.
>
> I’ll start with my +1.
>
> Best Regards,
> - He Xiaoqiao
>
> [1] https://lists.apache.org/thread/bbf546c6jz0og3xcl9l3qfjo93b65szr
>


Re: [ANNOUNCE] New Hadoop Committer - Simbarashe Dzinamarira

2023-10-03 Thread Mingliang Liu
Congratulations!

On Tue, Oct 3, 2023 at 8:55 AM Ayush Saxena  wrote:

> Congratulations!!!
>
> -Ayush
>
> > On 03-Oct-2023, at 5:42 AM, Erik Krogen  wrote:
> >
> > Congratulations Simba! Thanks for the great work you've done on making
> HDFS
> > more scalable!
> >
> >> On Mon, Oct 2, 2023 at 4:31 PM Iñigo Goiri  wrote:
> >>
> >> I am pleased to announce that Simbarashe Dzinamarira has been elected
> as a
> >> committer on the Apache Hadoop project.
> >> We appreciate all of Simbarashe's work, and look forward to his
> continued
> >> contributions.
> >>
> >> Congratulations and welcome!
> >>
> >> Best Regards,
> >> Inigo Goiri
> >> (On behalf of the Apache Hadoop PMC)
> >>
>


[jira] [Resolved] (HADOOP-18592) Sasl connection failure should log remote address

2023-02-01 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-18592.

Fix Version/s: 3.4.0
   3.3.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to branch-3.3 and trunk. Thank you for your contribution, 
[~vjasani]. Thank you all for the review.

> Sasl connection failure should log remote address
> -
>
> Key: HADOOP-18592
> URL: https://issues.apache.org/jira/browse/HADOOP-18592
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> If the Sasl connection fails with some generic error, we miss logging the 
> remote server that the client was trying to connect to.
> Sample log:
> {code:java}
> 2023-01-12 00:22:28,148 WARN  [20%2C1673404849949,1] ipc.Client - Exception 
> encountered while connecting to the server 
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>     at 
> org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
>     at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:141)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>     at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1950)
>     at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
>     at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
>     at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
> ...
> ... {code}
> We should log the remote server address.
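
For illustration, a minimal sketch of the kind of change the issue asks for. It is a hedged sketch only: the method shape follows the ipc.Client frames visible in the stack trace above, and all names are illustrative rather than the actual patch (which lives in the linked pull request).

{code:java}
// Hedged sketch: rethrow with the remote address attached so the WARN log
// "Exception encountered while connecting to the server" becomes actionable.
// "server" stands for the connection's remote InetSocketAddress; the real
// field and method names may differ in the committed change.
private AuthMethod setupSaslConnection(IpcStreams streams) throws IOException {
  try {
    return saslRpcClient.saslConnect(streams);
  } catch (IOException e) {
    throw new IOException("Sasl connection to " + server + " failed", e);
  }
}
{code}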






Disable auto-posting comment on pull requests to JIRA?

2023-01-03 Thread Mingliang Liu
Hi all,

I find that any new comment on a pull request (PR) gets auto-posted to
the JIRA as a comment. As we are moving to GitHub for code reviews, the
JIRA is more about the problem definition and the discussion of solution and
design. Comments happening in the PR are likely scoped to the
implementation and can become stale quickly. Auto-posting those comments
back to JIRA makes the JIRA too verbose.

I propose we disable this auto-posting, along with the QA reports posted by
the GitHub bot. The comment section in JIRA will then be more concise and
easier to revisit.

Thoughts?


Re: [DISCUSS] JIRA Public Signup Disabled

2022-11-24 Thread Mingliang Liu
Thanks Ayush for taking care of this. I think option 1 sounds good.

> On Nov 22, 2022, at 2:51 PM, Ayush Saxena  wrote:
> 
> Hi Folks,
> Just drawing attention to the recent change from Infra, which
> prevents new people from creating a Jira account, in order to prevent spam
> on JIRA.
> New Jira account creation requests now need to be routed via the PMC of
> the project.
> So, we have 2 options that I can think of:
> 1. Update the contribution guidelines to route such requests to private@
> 2. Create a dedicated ML for it, as a couple of projects I know of did.
> 
> The Infra page: https://infra.apache.org/jira-guidelines.html
> 
> Let me know what folks think; if I hear nothing, I will go with the 1st option
> and update the contribution guidelines by next week or the week after.
> 
> -Ayush





[jira] [Reopened] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reopened HADOOP-17728:


> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The Cleaner thread will block when removing a reference from the 
> ReferenceQueue unless `queue.enqueue` is called.
> 
> As shown below, cleanUp currently calls ReferenceQueue.remove(), with the call 
> chain:
> *StatisticsDataReferenceCleaner#queue.remove() -> 
> ReferenceQueue.remove(0) -> lock.wait(0)*
> But lock.notifyAll is called only on queue.enqueue, so the Cleaner thread 
> can block indefinitely.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}
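
For context, a minimal standalone sketch of the blocking behavior described above: `ReferenceQueue.remove(0)` (the no-timeout form) waits on the queue's internal lock and only wakes when a reference is enqueued, while `remove(timeout)` returns null once the timeout elapses. This is a self-contained demo, not Hadoop code.

{code:java}
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

public class ReferenceQueueBlockDemo {
  public static void main(String[] args) throws InterruptedException {
    ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Object referent = new Object();
    WeakReference<Object> ref = new WeakReference<>(referent, queue);

    // remove(timeout) returns null after ~100 ms when nothing is enqueued,
    // so the caller is never parked forever.
    System.out.println("timed remove: " + queue.remove(100));

    // remove() is remove(0): it blocks in lock.wait(0) until enqueue()
    // calls notifyAll -- the hang described in this issue.
    referent = null; // drop the strong reference
    System.gc();     // hint only; if GC does not enqueue, this line blocks
    System.out.println("blocking remove: " + queue.remove());
  }
}
{code}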






[jira] [Resolved] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-17728.

Fix Version/s: 3.2.3
   3.4.0
   3.3.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to branch-3.1 and above. Thanks for reporting and contributing, 
[~kaifeiYi]. Thanks for the review, [~ste...@apache.org].

> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The Cleaner thread will block when removing a reference from the 
> ReferenceQueue unless `queue.enqueue` is called.
> 
> As shown below, cleanUp currently calls ReferenceQueue.remove(), with the call 
> chain:
> *StatisticsDataReferenceCleaner#queue.remove() -> 
> ReferenceQueue.remove(0) -> lock.wait(0)*
> But lock.notifyAll is called only on queue.enqueue, so the Cleaner thread 
> can block indefinitely.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






Re: Hadoop 3.1 branch broken?

2021-03-24 Thread Mingliang Liu
Thank you Akira and Wei-Chiu for quick response.

On Wed, Mar 24, 2021 at 12:39 AM Akira Ajisaka  wrote:

> > Should we move to JDK11?
>
> In trunk, JDK11 is already installed in the docker image and the
> pre-commit job compiles Hadoop in both Java 8 and Java 11:
> https://issues.apache.org/jira/browse/HADOOP-16888
> AFAIK there are still some compile errors in branch-3.1 and branch-3.2.
>
> On Wed, Mar 24, 2021 at 4:17 PM Wei-Chiu Chuang 
> wrote:
> >
> > JDK9 is end of support. Should we move to JDK11?
> >
> > On Wed, Mar 24, 2021 at 2:51 PM Akira Ajisaka 
> wrote:
> >>
> >> In branch-3.1, JDK 9 is installed and the default java version is set
> >> to 9. In other branches, the default java version is 8. I'll backport
> >> some related JIRAs to branch-3.2.
> >>
> >> -Akira
> >>
> >> On Wed, Mar 24, 2021 at 3:27 PM Mingliang Liu 
> wrote:
> >> >
> >> > Hi,
> >> >
> >> > From this build log <
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2799/4/console>
> for PR 2799 in branch-3.1, I see there are failures complaining the
> sun.misc.Cleaner class is not found.
> >> >
> >> > Did we upgrade the JDK version, or any build file for that branch? I
> can check this later, but it would be great if you can share how to fix it.
> >> >
> >> > Thanks,
> >>
> >>
>


Hadoop 3.1 branch broken?

2021-03-24 Thread Mingliang Liu
Hi,

From this build log 
<https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2799/4/console> for 
PR 2799 in branch-3.1, I see there are failures complaining that the 
sun.misc.Cleaner class is not found.
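
For context, sun.misc.Cleaner was removed in JDK 9, so building branch-3.1 with a newer default JDK produces exactly this error. A small self-contained probe (not Hadoop code) illustrating the difference:

{code:java}
// Probe for sun.misc.Cleaner, which exists on JDK 8 but was removed in
// JDK 9+ (java.lang.ref.Cleaner is the supported replacement there).
public final class CleanerProbe {
  public static void main(String[] args) {
    try {
      Class.forName("sun.misc.Cleaner");
      System.out.println("sun.misc.Cleaner present: JDK 8 or earlier");
    } catch (ClassNotFoundException e) {
      System.out.println("sun.misc.Cleaner missing: JDK 9+, so code "
          + "compiled against it fails like the build log above");
    }
  }
}
{code}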

Did we upgrade the JDK version or change any build file for that branch? I can 
check this later, but it would be great if you can share how to fix it.

Thanks,

[ANNOUNCE] Mukund Thakur is a new Apache Hadoop Committer

2021-02-02 Thread Mingliang Liu
Hi all,

It's my pleasure to announce that Mukund Thakur has been elected as a
committer on the Apache Hadoop project, recognizing his continued
contributions to the project. He has accepted the invitation.


His involvement in the Hadoop object store modules, Hadoop Common, and the
DistCp tool is a great addition to the Hadoop project. His reviews are
comprehensive, with a high standard for approving a pull request. Granting
him committership will enable better productivity.


Please join me in congratulating him. Welcome aboard, Mukund!



Mingliang Liu

(on behalf of the Apache Hadoop PMC)


[jira] [Resolved] (HADOOP-16355) ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store

2021-01-25 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-16355.

Resolution: Abandoned

As documented by [HADOOP-17480], AWS S3 is now consistent, so S3Guard is not 
needed.

> ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store
> --
>
> Key: HADOOP-16355
> URL: https://issues.apache.org/jira/browse/HADOOP-16355
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>    Reporter: Mingliang Liu
>Priority: Major
>
> When S3Guard was proposed, there were a couple of valid reasons to choose 
> DynamoDB as its default backend store: 0) seamless integration as part of the 
> AWS ecosystem, e.g. client library; 1) it's a managed web service with zero 
> operational cost, highly available and infinitely scalable; 2) it's performant, 
> with single-digit-millisecond latency; 3) it's proven by Netflix's S3mper (not 
> actively maintained) and EMRFS (closed source). As the store is pluggable, it's 
> possible to implement {{MetadataStore}} with another backend store without 
> changing semantics, besides the null and in-memory local ones.
> Here we propose {{ZookeeperMetadataStore}}, which uses Zookeeper as the S3Guard 
> backend store. Its main motivation is to provide a new MetadataStore option 
> which:
>  # can be easily integrated, as Zookeeper is heavily used in the Hadoop 
> community
>  # offers affordable performance, as both the client and the Zookeeper ensemble 
> are usually "local" in a Hadoop cluster (ZK/HBase/Hive etc)
>  # removes the DynamoDB dependency
> Obviously, not all use cases will prefer this to the default DynamoDB store. 
> For example, ZK might not scale well if there are dozens of S3 buckets and each 
> has millions of objects. Our use case targets HBase storing HFiles on S3 
> instead of HDFS. A total solution for HBase on S3 must combine HBOSS (see 
> HBASE-22149), for recovering atomicity of metadata operations like rename, with 
> S3Guard, for consistent enumeration of and access to object store bucket 
> metadata. We would like to use Zookeeper as the backend store for both.






Re: [VOTE] Moving Ozone to a separated Apache project

2020-09-29 Thread Mingliang Liu
+1

On Tue, Sep 29, 2020 at 3:44 PM Wangda Tan  wrote:

> +1,
>
> Thanks,
> Wangda Tan
>
> On Tue, Sep 29, 2020 at 10:10 AM Aravindan Vijayan
>  wrote:
>
> > +1, thank you Marton.
> >
> > On Tue, Sep 29, 2020 at 9:17 AM Bharat Viswanadham 
> > wrote:
> >
> > > +1
> > > Thank You @Elek, Marton  for driving this.
> > >
> > >
> > > Thanks,
> > > Bharat
> > >
> > >
> > > On Mon, Sep 28, 2020 at 10:54 AM Vivek Ratnavel <
> > vivekratna...@apache.org>
> > > wrote:
> > >
> > > > +1 for moving Ozone to a separated Top-Level Apache Project.
> > > >
> > > > Thanks,
> > > > Vivek Subramanian
> > > >
> > > > On Mon, Sep 28, 2020 at 8:30 AM Hanisha Koneru
> > > > 
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Thanks,
> > > > > Hanisha
> > > > >
> > > > > > On Sep 27, 2020, at 11:48 PM, Akira Ajisaka wrote:
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > Thanks,
> > > > > > Akira
> > > > > >
> > > > > > On Fri, Sep 25, 2020 at 3:00 PM Elek, Marton <e...@apache.org> wrote:
> > > > > >>
> > > > > >> Hi all,
> > > > > >>
> > > > > >> Thank you for all the feedback and requests,
> > > > > >>
> > > > > >> As we discussed in the previous thread(s) [1], Ozone is proposed
> > to
> > > > be a
> > > > > >> separated Apache Top Level Project (TLP)
> > > > > >>
> > > > > >> The proposal with all the details, motivation and history is
> here:
> > > > > >>
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Hadoop+subproject+to+Apache+TLP+proposal
> > > > > >>
> > > > > >> This voting runs for 7 days and will be concluded at 2nd of
> > October,
> > > > 6AM
> > > > > >> GMT.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Marton Elek
> > > > > >>
> > > > > >> [1]:
> > > > > >>
> > > > >
> > > >
> > >
> >
> https://lists.apache.org/thread.html/rc6c79463330b3e993e24a564c6817aca1d290f186a1206c43ff0436a%40%3Chdfs-dev.hadoop.apache.org%3E
> > > > > >>
> > > > > >>
> > > >
> > >
> >
> >
> > --
> > Thanks & Regards,
> > Aravindan
> >
>


Re: [VOTE] Release Apache Hadoop 2.10.1 (RC0)

2020-09-21 Thread Mingliang Liu
+1 (binding)

1. Download binary and check signature / checksum successfully
2. Create a 3 node cluster in Docker containers and start the HDFS/YARN
services
3. Verify the running version
4. Run simple HDFS/YARN client/admin commands and verify the output
5. Run example programs wordcount and grep
6. Check NN/DN/RM status and service logs

Thanks,

On Mon, Sep 21, 2020 at 12:34 AM Vinayakumar B 
wrote:

> Apologies for delayed voting.
>
> +1 (Binding)
>
> 1. Verified the signature. Signatures are good.
> 2. Verified the sha512 checksum
> (I tried to use the sha512sum tool to verify, but the ".sha512"
> files don't have the expected format. I guess these files were regenerated
> using "gpg --print-md", not 'sha512sum --tag' as used by the create-release
> scripts. It would be easy to verify with the tool if the .sha512 files
> had the proper format.)
> 3. The src tar has the "patchprocess" directory; we may need to backport the
> fix in trunk to branch-2.10 to avoid this in the future.
> 4. Deployed 3 node docker cluster and ran basic jobs. All-Ok.
>
> -Vinay
>
>
> On Mon, Sep 21, 2020 at 5:24 AM Wei-Chiu Chuang 
> wrote:
>
> > +1 (binding)
> >
> > I did a security scan for the 2.10.1 RC0 and it looks fine to me.
> >
> > Checked recent critical/blocker HDFS issues that are not in 2.10.1. It
> > looks mostly fine. Most of them are Hadoop 3.x features (EC, ... etc) but
> > there is one worth attention:
> >
> >
> >    1. HDFS-14674  [SBN read] Got an unexpected txid when tail editlog.
> >       But looking at the jira, it doesn't apply to 2.x so I think we are
> >       good there.
> >    2. I wanted to do an API compat check but didn't finish it yet. If
> >       someone can do it quickly that would be great. (Does anyone know of
> >       a cloud service that we can quickly do a Java API compat check?)
> >
> > Cheers,
> > Wei-Chiu
> >
> > On Sun, Sep 20, 2020 at 9:25 AM Sunil Govindan 
> wrote:
> >
> > > +1 (binding)
> > >
> > > - verified checksum and sign. Shows as a Good signature from "Masatake
> > > Iwasaki (CODE SIGNING KEY) "
> > > - built from source
> > > - ran basic MR job and looks good
> > > - UI also seems fine
> > >
> > > Thanks,
> > > Sunil
> > >
> > > On Sun, Sep 20, 2020 at 11:38 AM Masatake Iwasaki <
> > > iwasak...@oss.nttdata.co.jp> wrote:
> > >
> > > > The RC0 got 2 binding +1's and 2 non-binging +1's [1].
> > > >
> > > > Based on the discussion about release vote [2],
> > > > bylaws[3] defines the periods in minimum terms.
> > > > We can extend it if there is not enough activity.
> > > >
> > > > I would like to extend the period to 7 days,
> > > > until Monday September 21 at 10:00 am PDT.
> > > >
> > > > I will appreciate additional votes.
> > > >
> > > > Thanks,
> > > > Masatake Iwasaki
> > > >
> > > > [1]
> > > >
> > >
> >
> https://lists.apache.org/thread.html/r16a7f36315a0673c7d522c41065e7ef9c9ee15c76ffcb5db80931002%40%3Ccommon-dev.hadoop.apache.org%3E
> > > > [2]
> > > >
> > >
> >
> https://lists.apache.org/thread.html/e392b902273ee0c14ba34d72c44630e05f54cb3976109af510592ea2%401403330080%40%3Ccommon-dev.hadoop.apache.org%3E
> > > > [3] https://hadoop.apache.org/bylaws.html
> > > >
> > > > On 2020/09/15 2:59, Masatake Iwasaki wrote:
> > > > > Hi folks,
> > > > >
> > > > > This is the first release candidate for the second release of
> Apache
> > > > Hadoop 2.10.
> > > > > It contains 218 fixes/improvements since 2.10.0 [1].
> > > > >
> > > > > The RC0 artifacts are at:
> > > > > http://home.apache.org/~iwasakims/hadoop-2.10.1-RC0/
> > > > >
> > > > > RC tag is release-2.10.1-RC0:
> > > > > https://github.com/apache/hadoop/tree/release-2.10.1-RC0
> > > > >
> > > > > The maven artifacts are hosted here:
> > > > >
> > >
> https://repository.apache.org/content/repositories/orgapachehadoop-1279/
> > > > >
> > > > > My public key is available here:
> > > > > https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> > > > >
> > > > > The vote will run for 5 days, until Saturday, September 19 at 10:00
> > am
> > > > PDT.
> > > > >
> > > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/issues/?jql=project%20in%20(HDFS%2C%20YARN%2C%20HADOOP%2C%20MAPREDUCE)%20AND%20resolution%20%3D%20Fixed%20AND%20fixVersion%20%3D%202.10.1
> > > > >
> > > > > Thanks,
> > > > > Masatake Iwasaki
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: Unstable Unit Tests in Trunk

2020-09-09 Thread Mingliang Liu
Thanks Eric.

I also see some intermittent test failures in recent builds.

Maybe we can mark those failing tests as @Flaky, and/or add the RetryRule()
to them. Thoughts?
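
For illustration, a minimal JUnit 4 sketch of the kind of RetryRule proposed here; the class name and retry policy are hypothetical, not an existing Hadoop utility.

{code:java}
import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;

/** Hypothetical retry rule: rerun a flaky test up to N times before failing. */
public class RetryRule implements TestRule {
  private final int retries;

  public RetryRule(int retries) { this.retries = retries; }

  @Override
  public Statement apply(Statement base, Description description) {
    return new Statement() {
      @Override
      public void evaluate() throws Throwable {
        Throwable last = null;
        for (int i = 0; i < retries; i++) {
          try {
            base.evaluate(); // run the test body
            return;          // pass on the first success
          } catch (Throwable t) {
            last = t;
            System.err.println(description + " failed, attempt " + (i + 1));
          }
        }
        throw last;          // all attempts failed
      }
    };
  }
}
{code}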

On Tue, Sep 1, 2020 at 10:48 AM Eric Badger
 wrote:

> While putting up patches for HADOOP-17169
>  I noticed that the
> unit tests in trunk, specifically in HDFS, are incredibly unstable. Every
> time I put up a new patch, 4-8 unit tests failed with failures that were
> completely unrelated to the patch. I'm pretty confident in that since the
> patch is simply changing variable names. I also ran the unit tests locally
> and they would pass (or fail intermittently).
>
> Is there an effort to stabilize the unit tests? I don't know if these are
> bugs or if they're bad tests. But in either case, it's bad for the
> stability of the project.
>
> Eric
>


-- 
L


Re: How to manually trigger a PreCommit build for a github PR?

2020-09-09 Thread Mingliang Liu
> you mean to say rerun jenkins, that means it ran once, you want to run it
again without any code changes?
Yes.

Thank you Ayush!

On Tue, Sep 8, 2020 at 6:20 PM Ayush Saxena  wrote:

> Hi Mingliang,
> If by running Pre-Commit without any code change you mean rerunning
> Jenkins (it ran once and you want to run it again without code changes),
> then you can go to that PR build and click on replay. For example,
> once logged in, if you click on the link below:
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2281/2/
>
> You can see an option 'replay' on the left, Clicking on it will rerun the
> build.
>
> Secondly, if the last build isn't available, then I think creating an empty
> commit is the only way, as Dinesh also suggested.
>
> -Ayush
>
> On Wed, 9 Sep 2020 at 03:43, Mingliang Liu  wrote:
>
> > Thanks Dinesh. This is very helpful.
> >
> > I will add this to the wiki page
> > <https://cwiki.apache.org/confluence/display/HADOOP/GitHub+Integration>
> if
> > this is the suggested way of doing it.
> >
> >
> >
> > On Tue, Sep 8, 2020 at 2:54 PM Dinesh Chitlangia <
> dchitlan...@cloudera.com
> > >
> > wrote:
> >
> > > You could try doing an empty commit.
> > >
> > > git commit --allow-empty -m 'trigger new CI check' && git push
> > >
> > >
> > > Thanks,
> > > Dinesh
> > >
> > >
> > >
> > > On Tue, Sep 8, 2020 at 5:39 PM Mingliang Liu 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> To trigger a PreCommit build without code change, I can make the JIRA
> > >> status "Patch Available" and provide the JIRA number to "Build With
> > >> Parameters" link
> > >> <
> > >>
> >
> https://ci-hadoop.apache.org/view/Hadoop/job/PreCommit-HADOOP-Build/build?delay=0sec
> > >> >
> > >> .
> > >>
> > >> Not sure how to do that for a PR without a real commit to the PR
> branch?
> > >>
> > >> Thanks,
> > >>
> > >
> >
>


Re: How to manually trigger a PreCommit build for a github PR?

2020-09-08 Thread Mingliang Liu
Thanks Dinesh. This is very helpful.

I will add this to the wiki page
<https://cwiki.apache.org/confluence/display/HADOOP/GitHub+Integration> if
this is the suggested way of doing it.



On Tue, Sep 8, 2020 at 2:54 PM Dinesh Chitlangia 
wrote:

> You could try doing an empty commit.
>
> git commit --allow-empty -m 'trigger new CI check' && git push
>
>
> Thanks,
> Dinesh
>
>
>
> On Tue, Sep 8, 2020 at 5:39 PM Mingliang Liu  wrote:
>
>> Hi,
>>
>> To trigger a PreCommit build without code change, I can make the JIRA
>> status "Patch Available" and provide the JIRA number to "Build With
>> Parameters" link
>> <
>> https://ci-hadoop.apache.org/view/Hadoop/job/PreCommit-HADOOP-Build/build?delay=0sec
>> >
>> .
>>
>> Not sure how to do that for a PR without a real commit to the PR branch?
>>
>> Thanks,
>>
>


[jira] [Created] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-17252:
--

 Summary: Website to link to latest Hadoop wiki
 Key: HADOOP-17252
 URL: https://issues.apache.org/jira/browse/HADOOP-17252
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Mingliang Liu


Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
Shall we update that to the latest one: 
https://cwiki.apache.org/confluence/display/HADOOP2/Home






How to manually trigger a PreCommit build for a github PR?

2020-09-08 Thread Mingliang Liu
Hi,

To trigger a PreCommit build without code change, I can make the JIRA
status "Patch Available" and provide the JIRA number to "Build With
Parameters" link

.

Not sure how to do that for a PR without a real commit to the PR branch?

Thanks,


Re: [DISCUSS] Hadoop 2.10.1 release

2020-08-31 Thread Mingliang Liu
I can see how I can help, but I cannot take the RM role this time.

Thanks,

On Mon, Aug 31, 2020 at 12:15 PM Wei-Chiu Chuang
 wrote:

> Hello,
>
> I see that Masatake graciously agreed to volunteer with the Hadoop 2.10.1
> release work in the 2.9 branch EOL discussion thread
> https://s.apache.org/hadoop2.9eold
>
> Anyone else likes to contribute also?
>
> Thanks
>


-- 
L


Re: [VOTE] End of Life Hadoop 2.9

2020-08-31 Thread Mingliang Liu
+1 (binding)

p.s. We should still maintain 2.10 for Hadoop 2 users and encourage them to
upgrade to Hadoop 3.

On Mon, Aug 31, 2020 at 12:10 PM Wei-Chiu Chuang  wrote:

> Dear fellow Hadoop developers,
>
> Given the overwhelming feedback from the discussion thread
> https://s.apache.org/hadoop2.9eold, I'd like to start an official vote
> thread for the community to vote and start the 2.9 EOL process.
>
> What this entails:
>
> (1) an official announcement that no further regular Hadoop 2.9.x releases
> will be made after 2.9.2 (which was GA on 11/19/2019)
> (2) resolve JIRAs that specifically target 2.9.3 as won't fix.
>
>
> This vote will run for 7 days and will conclude by September 7th, 12:00pm
> pacific time.
> Committers are eligible to cast binding votes. Non-committers are welcomed
> to cast non-binding votes.
>
> Here is my vote, +1
>


-- 
L


[jira] [Resolved] (HADOOP-17159) Make UGI support forceful relogin from keytab ignoring the last login time

2020-08-27 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-17159.

Fix Version/s: 2.10.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to 2.10.1 and 3.1.5+ see "Fix Version/s". Thank you for your 
contribution, [~sandeep.guggilam]

> Make UGI support forceful relogin from keytab ignoring the last login time
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.2.2, 2.10.1, 3.3.1, 3.4.0, 3.1.5
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently we have a relogin() method in UGI which attempts to log in only if 
> no login was attempted in the last 10 minutes (or a configured amount of time).
> We should also provide a forceful relogin, irrespective of that time window, 
> which the client can choose to use if needed. Consider the scenario below:
>  # The SASL server is reimaged and new keytabs are fetched, refreshing the 
> password
>  # A SASL client connection to the server then fails when it tries with the 
> cached service ticket
>  # In such scenarios we should log out to clear the cached service tickets and 
> then log back in. But since the current relogin() doesn't guarantee a login, 
> it could cause an issue
>  # A forceful relogin after the logout would help in this case
>  
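
For illustration, a hedged sketch of the recovery flow described above, using the forceful relogin this issue proposes. The method name follows the issue summary and should be checked against the committed patch; callServer() is a placeholder.

{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

public class SaslReconnectSketch {
  static void reconnectAfterServerReimage() throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getLoginUser();
    try {
      callServer(ugi);            // fails: cached service ticket is stale
    } catch (IOException e) {
      // reloginFromKeytab() can be a no-op inside the min-login window;
      // the forceful variant ignores the last login time.
      ugi.forceReloginFromKeytab();
      callServer(ugi);            // retry with fresh credentials
    }
  }

  static void callServer(UserGroupInformation ugi) throws IOException {
    // placeholder for a SASL-authenticated RPC call
  }
}
{code}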






Re: [DISCUSS] fate of branch-2.9

2020-08-27 Thread Mingliang Liu
> are there any Hadoop branch-2 releases planned, ever? If so I'll need to
backport my s3a directory compatibility patch to whatever is still live.

The branch-2 is gone. I think you mean branch-2.10, Steve.

Many HBase users are still using Hadoop 2, so I hope Hadoop 2.10.x releases
still come out at least every 12 months. If there is no volunteer for the
2.10.1 RM role, I can see how I can help.

Thanks,

On Thu, Aug 27, 2020 at 8:55 AM John Zhuge  wrote:

> +1
>
> On Thu, Aug 27, 2020 at 6:01 AM Ayush Saxena  wrote:
>
> > +1
> >
> > -Ayush
> >
> > > On 27-Aug-2020, at 6:24 PM, Steve Loughran wrote:
> > >
> > > 
> > >
> > > +1
> > >
> > > are there any Hadoop branch-2 releases planned, ever? If so I'll need
> to
> > backport my s3a directory compatibility patch to whatever is still live.
> > >
> > >
> > >> On Thu, 27 Aug 2020 at 06:55, Wei-Chiu Chuang 
> > wrote:
> > >> Bump up this thread after 6 months.
> > >>
> > >> Is anyone still interested in the 2.9 release line? Or are we good to
> > start
> > >> the EOL process? The 2.9.2 was released in Nov 2018.
> > >>
> > >> I'd really like to see the community to converge to fewer release
> lines
> > and
> > >> make more frequent releases in each line.
> > >>
> > >> Thanks,
> > >> Weichiu
> > >>
> > >>
> > >> On Fri, Mar 6, 2020 at 5:47 PM Wei-Chiu Chuang 
> > wrote:
> > >>
> > >> > I think that's a great suggestion.
> > >> > Currently, we make 1 minor release per year, and within each minor
> > release
> > >> > we bring up 1 thousand to 2 thousand commits in it compared with the
> > >> > previous one.
> > >> > I can totally understand it is a big bite for users to swallow.
> > Having a
> > >> > more frequent release cycle, plus LTS and non-LTS releases should
> > help with
> > >> > this. (Of course we will need to make the release preparation much
> > easier,
> > >> > which is currently a pain)
> > >> >
> > >> > I am happy to discuss the release model further in the dev ML. LTS
> > v.s.
> > >> > non-LTS is one suggestion.
> > >> >
> > >> > Another similar issue: In the past Hadoop strived to
> > >> > maintain compatibility. However, this is no longer sustainable as
> > more CVEs
> > >> > coming from our dependencies: netty, jetty, jackson ... etc.
> > >> > In many cases, updating the dependencies brings breaking changes.
> More
> > >> > recently, especially in Hadoop 3.x, I started to make the effort to
> > update
> > >> > dependencies much more frequently. How do users feel about this
> > change?
> > >> >
> > >> > On Thu, Mar 5, 2020 at 7:58 AM Igor Dvorzhak wrote:
> > >> >
> > >> >> Maybe Hadoop will benefit from adopting a similar release and
> support
> > >> >> strategy as Java? I.e. designate some releases as LTS and support
> > them for
> > >> >> 2 (?) years (it seems that 2.7.x branch was de-facto LTS), other
> > non-LTS
> > >> >> releases will be supported for 6 months (or until next release).
> This
> > >> >> should allow to reduce maintenance cost of non-LTS release and
> > provide
> > >> >> conservative users desired stability by allowing them to wait for
> > new LTS
> > >> >> release and upgrading to it.
> > >> >>
> > >> >> On Thu, Mar 5, 2020 at 1:26 AM Rupert Mazzucco <
> > rupert.mazzu...@gmail.com>
> > >> >> wrote:
> > >> >>
> > >> >>> After recently jumping from 2.7.7 to 2.10 without issue myself, I
> > vote
> > >> >>> for keeping only the 2.10 line.
> > >> >>> It would seem all other 2.x branches can upgrade to a 2.10.x
> easily
> > if
> > >> >>> they feel like upgrading at all,
> > >> >>> unlike a jump to 3.x, which may require more planning.
> > >> >>>
> > >> >>> I also vote for having only one main 3.x branch. Why are there
> > 3.1.x and
> > >> >>> 3.2.x seemingly competing,
> > >> >>> and now 3.3.x? For a community that does not have the resources to
> > >> >>> manage multiple release lines,
> > >> >>> you guys sure like to multiply release lines a lot.
> > >> >>>
> > >> >>> Cheers
> > >> >>> Rupert
> > >> >>>
> > >> >>> Am Mi., 4. März 2020 um 19:40 Uhr schrieb Wei-Chiu Chuang
> > >> >>> :
> > >> >>>
> > >>  Forwarding the discussion thread from the dev mailing lists to
> the
> > user
> > >>  mailing lists.
> > >> 
> > >>  I'd like to get an idea of how many users are still on Hadoop
> 2.9.
> > >>  Please share your thoughts.
> > >> 
> > >>  On Mon, Mar 2, 2020 at 6:30 PM Sree Vaddi
> > >>   wrote:
> > >> 
> > >> > +1
> > >> >
> > >> > Sent from Yahoo Mail on Android
> > >> >
> > >> > On Mon, Mar 2, 2020 at 5:12 PM, Wei-Chiu Chuang <weic...@apache.org>
> > >> > wrote:
> > >> > Hi,
> > >> >
> > >> > Following the discussion to end branch-2.8, I want to start a
> > >> > discussion
> > >> > around what's next with branch-2.9. I am hesitant to use the
> word
> > "end
> > >> > of
> > >> > life" but consider these facts:
> > >> >
> > >> > * 2.9.0 was released Dec 17, 2017.
> > >> > * 2.9.2, the last 2.9.x release, went out Nov 19 

Re: [DISCUSS] GitHub PR link auto-posting to JIRA?

2020-08-27 Thread Mingliang Liu
Thank you very much Ayush!

In HDFS-15025, the ASF GitHub Bot is able to edit the "Time Tracking"
fields on the right side. This indicates the permission should work now. I
do not have a new PR filed. We can check incoming PRs and see if they get
posted to JIRA automatically.

On Wed, Aug 26, 2020 at 9:40 PM Ayush Saxena  wrote:

> Hi Mingliang,
> I think this issue has been there for a couple of months; it used to work
> earlier, IIRC.
> I tried checking a bit. I think ASF-GITHUB-BOT didn't have permissions, so
> for now I added it as HDFS-Contributor-1 (temporarily), and I just saw one
> notification on HDFS-15025 from GitHub.
> Can you check if that solves the issue?
>
> -Ayush
>
> On Thu, 27 Aug 2020 at 03:41, Mingliang Liu  wrote:
>
>> Hi,
>>
>> I found that GitHub PR will not show up as "links" of the JIRA even if the
>> PR subject starts with a JIRA number.
>>
>> Is this a known issue? I see this works for HBase projects, but not
>> Hadoop.
>>
>> Thanks,
>>
>


Re: [DISCUSS] fate of branch-2.9

2020-08-27 Thread Mingliang Liu
+1 for putting 2.9 lines to EOL.

Let's focus on 2.10 releases for Hadoop 2. Also, is there any plan for
2.10.1? It has been 11 months since the first 2.10 release.

Thanks,

On Wed, Aug 26, 2020 at 10:57 PM Wei-Chiu Chuang  wrote:

> Bump up this thread after 6 months.
>
> Is anyone still interested in the 2.9 release line? Or are we good to start
> the EOL process? The 2.9.2 was released in Nov 2018.
>
> I'd really like to see the community to converge to fewer release lines and
> make more frequent releases in each line.
>
> Thanks,
> Weichiu
>
>
> On Fri, Mar 6, 2020 at 5:47 PM Wei-Chiu Chuang  wrote:
>
> > I think that's a great suggestion.
> > Currently, we make 1 minor release per year, and within each minor
> release
> > we bring up 1 thousand to 2 thousand commits in it compared with the
> > previous one.
> > I can totally understand it is a big bite for users to swallow. Having a
> > more frequent release cycle, plus LTS and non-LTS releases should help
> with
> > this. (Of course we will need to make the release preparation much
> easier,
> > which is currently a pain)
> >
> > I am happy to discuss the release model further in the dev ML. LTS v.s.
> > non-LTS is one suggestion.
> >
> > Another similar issue: In the past Hadoop strived to
> > maintain compatibility. However, this is no longer sustainable as more
> CVEs
> > coming from our dependencies: netty, jetty, jackson ... etc.
> > In many cases, updating the dependencies brings breaking changes. More
> > recently, especially in Hadoop 3.x, I started to make the effort to
> update
> > dependencies much more frequently. How do users feel about this change?
> >
> > On Thu, Mar 5, 2020 at 7:58 AM Igor Dvorzhak 
> > wrote:
> >
> >> Maybe Hadoop will benefit from adopting a similar release and support
> >> strategy as Java? I.e. designate some releases as LTS and support them
> for
> >> 2 (?) years (it seems that 2.7.x branch was de-facto LTS), other non-LTS
> >> releases will be supported for 6 months (or until next release). This
> >> should allow to reduce maintenance cost of non-LTS release and provide
> >> conservative users desired stability by allowing them to wait for new
> LTS
> >> release and upgrading to it.
> >>
> >> On Thu, Mar 5, 2020 at 1:26 AM Rupert Mazzucco <
> rupert.mazzu...@gmail.com>
> >> wrote:
> >>
> >>> After recently jumping from 2.7.7 to 2.10 without issue myself, I vote
> >>> for keeping only the 2.10 line.
> >>> It would seem all other 2.x branches can upgrade to a 2.10.x easily if
> >>> they feel like upgrading at all,
> >>> unlike a jump to 3.x, which may require more planning.
> >>>
> >>> I also vote for having only one main 3.x branch. Why are there 3.1.x
> and
> >>> 3.2.x seemingly competing,
> >>> and now 3.3.x? For a community that does not have the resources to
> >>> manage multiple release lines,
> >>> you guys sure like to multiply release lines a lot.
> >>>
> >>> Cheers
> >>> Rupert
> >>>
> >>> Am Mi., 4. März 2020 um 19:40 Uhr schrieb Wei-Chiu Chuang
> >>> :
> >>>
>  Forwarding the discussion thread from the dev mailing lists to the
> user
>  mailing lists.
> 
>  I'd like to get an idea of how many users are still on Hadoop 2.9.
>  Please share your thoughts.
> 
>  On Mon, Mar 2, 2020 at 6:30 PM Sree Vaddi
>   wrote:
> 
> > +1
> >
> > Sent from Yahoo Mail on Android
> >
> > On Mon, Mar 2, 2020 at 5:12 PM, Wei-Chiu Chuang wrote:
> > Hi,
> >
> > Following the discussion to end branch-2.8, I want to start a
> > discussion
> > around what's next with branch-2.9. I am hesitant to use the word
> "end
> > of
> > life" but consider these facts:
> >
> > * 2.9.0 was released Dec 17, 2017.
> > * 2.9.2, the last 2.9.x release, went out Nov 19 2018, which is more
> > than
> > 15 months ago.
> > * no one seems to be interested in being the release manager for
> 2.9.3.
> > * Most if not all of the active Hadoop contributors are using Hadoop
> > 2.10
> > or Hadoop 3.x.
> > * We as a community do not have the cycle to manage multiple release
> > line,
> > especially since Hadoop 3.3.0 is coming out soon.
> >
> > It is perhaps the time to gradually reduce our footprint in Hadoop
> > 2.x, and
> > encourage people to upgrade to Hadoop 3.x
> >
> > Thoughts?
> >
> >
>


-- 
L


[DISCUSS] GitHub PR link auto-posting to JIRA?

2020-08-26 Thread Mingliang Liu
Hi,

I found that GitHub PR will not show up as "links" of the JIRA even if the
PR subject starts with a JIRA number.

Is this a known issue? I see this works for HBase projects, but not Hadoop.

Thanks,


[jira] [Created] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-17184:
--

 Summary: Add --mvn-custom-repos parameter to yetus calls
 Key: HADOOP-17184
 URL: https://issues.apache.org/jira/browse/HADOOP-17184
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Mingliang Liu


In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see 
that the QA build fails with unrelated errors.
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}

As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding the {{--mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory when executing 
PR validations.

This change mimics that for the Hadoop project.

CC: [~aajisaka]







[jira] [Resolved] (HADOOP-17052) NetUtils.connect() throws unchecked exception (UnresolvedAddressException) causing clients to abort

2020-06-01 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-17052.

Fix Version/s: 3.4.0
   3.3.1
   2.10.1
   3.2.2
   3.1.4
   2.9.3
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to all branches listed in "Fix Version/s". Thanks [~dhegde] for 
reporting and providing a fix; thanks [~hemanthboyina] for the review.

> NetUtils.connect() throws unchecked exception (UnresolvedAddressException) 
> causing clients to abort
> ---
>
> Key: HADOOP-17052
> URL: https://issues.apache.org/jira/browse/HADOOP-17052
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.10.0, 2.9.2, 3.2.1, 3.1.3
>Reporter: Dhiraj Hegde
>Assignee: Dhiraj Hegde
>Priority: Major
> Fix For: 2.9.3, 3.1.4, 3.2.2, 2.10.1, 3.3.1, 3.4.0
>
> Attachments: read_failure.log, write_failure1.log, write_failure2.log
>
>
> Hadoop components are increasingly being deployed on VMs and containers. One 
> aspect of this environment is that DNS is dynamic. Hostname records get 
> modified (or deleted/recreated) as a container in Kubernetes (or even VM) is 
> being created/recreated. In such dynamic environments, the initial DNS 
> resolution request might briefly return a resolution failure, as the DNS 
> client doesn't always get the latest records. This has been observed in 
> Kubernetes in particular. In such cases NetUtils.connect() appears to throw 
> java.nio.channels.UnresolvedAddressException. In much of the Hadoop code (like 
> DFSInputStream and DFSOutputStream), the code is designed to retry on 
> IOException. However, since UnresolvedAddressException is not a child of 
> IOException, no retry happens and the code aborts immediately. It is much 
> better if NetUtils.connect() throws java.net.UnknownHostException as that is 
> derived from IOException and the code will treat this as a retry-able error.
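
For illustration, a hedged sketch of the proposed translation. The real change lives in org.apache.hadoop.net.NetUtils; this is a simplified standalone version of the same idea.

{code:java}
import java.io.IOException;
import java.net.SocketAddress;
import java.net.UnknownHostException;
import java.nio.channels.SocketChannel;
import java.nio.channels.UnresolvedAddressException;

final class ConnectSketch {
  // Translate the unchecked UnresolvedAddressException into the checked,
  // IOException-derived UnknownHostException so callers' retry loops fire.
  static void connect(SocketChannel ch, SocketAddress endpoint) throws IOException {
    try {
      ch.connect(endpoint);
    } catch (UnresolvedAddressException e) {
      UnknownHostException uhe =
          new UnknownHostException("Unresolved address: " + endpoint);
      uhe.initCause(e);
      throw uhe;
    }
  }
}
{code}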






Re: [DISCUSS] making Ozone a separate Apache project

2020-05-18 Thread Mingliang Liu
+1

On Mon, May 18, 2020 at 12:37 AM Elek, Marton  wrote:

>
>
> > One question, for the committers who contributed to Ozone before and got
> > the committer-role in the past (like me), will they carry the
> > committer-role to the new repo?
>
>
> In short: yes.
>
>
> In more details:
>
> This discussion (if there is an agreement) should be followed by a next
> discussion + vote about a very specific proposal which should contain
> all the technical information (including committer list)
>
> I support the the same approach what we followed with Submarine:
>
> ALL the existing (Hadoop) committers should have a free / opt-in
> opportunity to be a committer in Ozone.
>
> (After proposal is created on the wiki, you can add your name, or
> request to be added. But as the initial list can be created based on
> statistics from the Jira, your name can be already there ;-) )
>
>
>
> Marton
>
>
>

-- 
L


Re: [Hadoop-3.3 Release update]- branch-3.3 has created

2020-04-24 Thread Mingliang Liu
Brahma,

What about https://issues.apache.org/jira/browse/HADOOP-17007?

Thanks,

On Fri, Apr 24, 2020 at 11:07 AM Brahma Reddy Battula 
wrote:

> Ok. Done. Branch created.
>
> Following blockers are pending, will closely track this.
>
> https://issues.apache.org/jira/browse/HDFS-15287 ( Open: Under discussion
> )
> https://issues.apache.org/jira/browse/YARN-10194 ( Patch Available)
> https://issues.apache.org/jira/browse/HDFS-15286 ( Patch Available)
> https://issues.apache.org/jira/browse/YARN-9898 ( Patch Available)
>
>
> On Fri, Apr 24, 2020 at 7:42 PM Wei-Chiu Chuang
>  wrote:
>
> > +1 we should have the branch ASAP.
> >
> > On Wed, Apr 22, 2020 at 11:07 PM Akira Ajisaka 
> > wrote:
> >
> > > > Since blockers are not closed, I didn't cut the branch because
> > > multiple branches might confuse people or somebody might forget to commit.
> > >
> > > The current situation is already confusing. The 3.3.1 version already
> > > exists in JIRA, so some committers wrongly commit non-critical issues
> to
> > > branch-3.3 and set the fix version to 3.3.1.
> > > I think now we should cut branch-3.3.0 and freeze source code except
> the
> > > blockers.
> > >
> > > -Akira
> > >
> > > On Tue, Apr 21, 2020 at 3:05 PM Brahma Reddy Battula <
> bra...@apache.org>
> > > wrote:
> > >
> > >> Sure, I will do that.
> > >>
> > >> Since blockers are not closed, I didn't cut the branch because
> > >> multiple branches might confuse people or somebody might forget to
> > >> commit. Shall I wait till this weekend to create it?
> > >>
> > >> On Mon, Apr 20, 2020 at 11:57 AM Akira Ajisaka 
> > >> wrote:
> > >>
> > >>> Hi Brahma,
> > >>>
> > >>> Thank you for preparing the release.
> > >>> Could you cut branch-3.3.0? I would like to backport some fixes for
> > >>> 3.3.1 and not for 3.3.0.
> > >>>
> > >>> Thanks and regards,
> > >>> Akira
> > >>>
> > >>> On Fri, Apr 17, 2020 at 11:11 AM Brahma Reddy Battula <
> > bra...@apache.org>
> > >>> wrote:
> > >>>
> >  Hi All,
> > 
> >  we are down to two blockers issues now (YARN-10194 and YARN-9848)
> > which
> >  are in patch available state.Hopefully we can out the RC soon.
> > 
> >  thanks to @Prabhu Joseph 
> > ,@masakate,@akira
> >  and @Wei-Chiu Chuang   and others for helping
> >  resloving the blockers.
> > 
> > 
> > 
> >  On Tue, Apr 14, 2020 at 10:49 PM Brahma Reddy Battula <
> >  bra...@apache.org> wrote:
> > 
> > >
> > > @Prabhu Joseph 
> > > >>> Have committed the YARN blocker YARN-10219 to trunk and
> > > cherry-picked it to branch-3.3. Right now, there are two blocker Jiras,
> > > YARN-10233 and HADOOP-16982,
> > > which I will help to review and commit. Thanks.
> > >
> > > Looks like you committed YARN-10219. Noted YARN-10233 and HADOOP-16982
> > > as blockers. (We have shipped many releases without YARN-10233; it's
> > > not newly introduced.) Thanks
> > >
> > > @Vinod Kumar Vavilapalli  ,@adam Antal,
> > >
> > > I noted YARN-9848 as a blocker as you mentioned above.
> > >
> > > @All,
> > >
> > > Currently following four blockers are pending for 3.3.0 RC.
> > >
> > > HADOOP-16963,YARN-10233,HADOOP-16982 and YARN-9848.
> > >
> > >
> > >
> > > On Tue, Apr 14, 2020 at 8:11 PM Vinod Kumar Vavilapalli <
> > > vino...@apache.org> wrote:
> > >
> > >> Looks like a really bad bug to me.
> > >>
> > >> +1 for revert and +1 for making that a 3.3.0 blocker. I think
> should
> > >> also revert it in a 3.2 maintenance release too.
> > >>
> > >> Thanks
> > >> +Vinod
> > >>
> > >> > On Apr 14, 2020, at 5:03 PM, Adam Antal <
> adam.an...@cloudera.com
> > .INVALID>
> > >> wrote:
> > >> >
> > >> > Hi everyone,
> > >> >
> > >> > Sorry for coming a bit late with this, but there's also one jira
> > >> that can
> > >> > have potential impact on clusters and we should talk about it.
> > >> >
> > >> > Steven Rand found this problem earlier and commented to
> > >> > https://issues.apache.org/jira/browse/YARN-4946.
> > >> > The bug has impact on the RM state store: the RM does not delete
> > >> apps - see
> > >> > more details in his comment here:
> > >> >
> > >>
> >
> https://issues.apache.org/jira/browse/YARN-4946?focusedCommentId=16898599&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16898599
> > >> > .
> > >> > (FYI He also created
> > >> https://issues.apache.org/jira/browse/YARN-9848 with
> > >> > the revert task).
> > >> >
> > >> > It might not be an actual blocker, but since there wasn't any
> > >> consensus
> > >> > about a follow up action, I thought we should decide how to
> > proceed
> > >> before
> > >> > release 3.3.0.
> > >> >
> > >> > Regards,
> > >> > Adam
> > >> >
> > >> > On Tue, Apr 14, 2020 at 9:35 AM Prabhu Joseph <
> > >> 

[jira] [Resolved] (HADOOP-17013) this bug is bla bla bla

2020-04-24 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-17013.

   Fix Version/s: (was: 0.24.0)
Target Version/s:   (was: 3.2.1)
  Resolution: Invalid

This is not fun, [~islam.saied]. Please stop doing this.

> this bug is bla bla bla
> ---
>
> Key: HADOOP-17013
> URL: https://issues.apache.org/jira/browse/HADOOP-17013
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: bin
>Affects Versions: 3.2.1
>Reporter: islam
>Priority: Major
>  Labels: bulk-closed
>







[jira] [Resolved] (HADOOP-16853) ITestS3GuardOutOfBandOperations failing on versioned S3 buckets

2020-02-24 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-16853.

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to {{trunk}}. Thanks [~ste...@apache.org] for reporting and fixing. 
Thanks [~gabor.bota] for the review.

> ITestS3GuardOutOfBandOperations failing on versioned S3 buckets
> ---
>
> Key: HADOOP-16853
> URL: https://issues.apache.org/jira/browse/HADOOP-16853
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.3.0
>
>
> org.apache.hadoop.fs.s3a.ITestS3GuardOutOfBandOperations.testListingDelete[auth=true]
> is failing because the deleted file can still be read when the s3guard entry 
> has the versionId.
> Proposed: if the FS is versioned and the file status has a versionId, then we 
> switch to tests which assert the file is readable, rather than tests which 
> assert it isn't there.






[jira] [Resolved] (HADOOP-16827) TestHarFileSystem.testInheritedMethodsImplemented broken

2020-01-24 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-16827.

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> TestHarFileSystem.testInheritedMethodsImplemented broken
> 
>
> Key: HADOOP-16827
> URL: https://issues.apache.org/jira/browse/HADOOP-16827
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, test
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 3.3.0
>
>
> Caused by HADOOP-16759.
> I'm surprised this didn't surface earlier, or embarrassed that, if it did, I 
> somehow missed it. Will fix.
>  
> Will also review the checksum FS to make sure the change has gone through 
> there too. 
>  
> Given I was using the IDE to refactor, it should have all been automatic.






[jira] [Resolved] (HADOOP-16759) Filesystem openFile() builder to take a FileStatus param

2020-01-21 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-16759.

Fix Version/s: 3.3.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to the {{trunk}} branch. I see major conflicts when backporting to 
3.2. [~ste...@apache.org], do you plan to bulk-backport the applicable patches 
some time later, or will you upload a new patch for branch-3.2/3.1? Thanks!

> Filesystem openFile() builder to take a FileStatus param
> 
>
> Key: HADOOP-16759
> URL: https://issues.apache.org/jira/browse/HADOOP-16759
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs, fs/azure, fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 3.3.0
>
>
> Let us pass in a file status to openFile() so that S3A & ABFS will skip their 
> own HEAD requests just to check that the file is there and is a normal file, 
> and to get its length + etag:
> {code:java}
> CompletableFuture<FSDataInputStream> streamF = fs.openFile(stat.getPath())
>     .withFileStatus(stat)
>     .build();
> {code}
> Code opening files off a listing of everything in a directory can eliminate a 
> lot of requests here.
> Also: change the specification of openFile's completable future to say 
> "the returned stream may only raise FNFE or access restrictions on the first 
> read".
> That is: it's not just potentially an async open; it's possibly lazily 
> evaluated entirely. 
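
For illustration, a hedged sketch of that listing-driven pattern with the new builder parameter; API usage follows the description in this issue, with error handling elided.

{code:java}
import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

class OpenFromListing {
  static void readAll(FileSystem fs, Path dir) throws Exception {
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(dir, true);
    while (it.hasNext()) {
      LocatedFileStatus st = it.next();
      // withFileStatus(st) lets S3A/ABFS skip their own HEAD probe per file.
      CompletableFuture<FSDataInputStream> f =
          fs.openFile(st.getPath()).withFileStatus(st).build();
      try (FSDataInputStream in = f.get()) {
        in.read(); // open may be lazy; FNFE can surface on this first read
      }
    }
  }
}
{code}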






[jira] [Created] (HADOOP-16758) Refine testing.md to tell user better how to use auth-keys.xml

2019-12-11 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-16758:
--

 Summary: Refine testing.md to tell user better how to use 
auth-keys.xml
 Key: HADOOP-16758
 URL: https://issues.apache.org/jira/browse/HADOOP-16758
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


I have a small patch to refine {{testing.md}} regarding `auth-keys.xml`. I 
think this might be applicable to others.






[jira] [Created] (HADOOP-16757) Increase timeout unit test rule for ITestDynamoDBMetadataStore

2019-12-10 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-16757:
--

 Summary: Increase timeout unit test rule for 
ITestDynamoDBMetadataStore
 Key: HADOOP-16757
 URL: https://issues.apache.org/jira/browse/HADOOP-16757
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Mingliang Liu


Not sure if this is a good proposal, but I saw a few cases where some 
integration test methods in {{ITestDynamoDBMetadataStore}} simply timed out. 
Specifically, the one that keeps failing for me today is 
{{testAncestorOverwriteConflict}}. I increased the timeout to 20s and it passes 
happily. Am I on a VPN and a slow home network? I'm afraid so.

The timeout rule, as inherited from the base class {{HadoopTestBase}}, is 10s by 
default. Though that 10s default can be overridden in the base class via the 
system property {{test.default.timeout}}, that is system-wide, affecting all 
other tests. Changing that value for the sake of one test is no better than 
overriding it explicitly in {{ITestDynamoDBMetadataStore}}. I think doubling it 
to 20s would not be crazy, considering we are testing against a remote web 
service and sometimes create and destroy tables.

{code}
  @Rule
  public Timeout timeout = new Timeout(20 * 1000);
{code}
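For comparison, the system-wide knob mentioned above would be used roughly as 
follows (a sketch, assuming the property takes milliseconds); it affects every 
test, which is why the explicit rule above seems preferable:

{code}
mvn verify -Dtest.default.timeout=20000
{code}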

Thoughts?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16735) Make it clearer in config default that EnvironmentVariableCredentialsProvider supports AWS_SESSION_TOKEN

2019-11-28 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-16735:
--

 Summary: Make it clearer in config default that 
EnvironmentVariableCredentialsProvider supports AWS_SESSION_TOKEN
 Key: HADOOP-16735
 URL: https://issues.apache.org/jira/browse/HADOOP-16735
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation, fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


In the great doc {{hadoop-aws/tools/hadoop-aws/index.html}}, users can find that 
authenticating via the AWS environment variables supports a session token. 
However, the config description in core-default.xml does not make this clear.
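For reference, the environment-variable login looks like the following (these 
are the standard variable names the SDK provider reads; the session token is 
only needed for temporary/STS credentials):

{code}
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...
hadoop fs -ls s3a://your-bucket/
{code}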



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16732) S3Guard to support encrypted DynamoDB table

2019-11-28 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-16732:
--

 Summary: S3Guard to support encrypted DynamoDB table
 Key: HADOOP-16732
 URL: https://issues.apache.org/jira/browse/HADOOP-16732
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Reporter: Mingliang Liu


S3Guard does not yet support [encrypted DynamoDB 
table|https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/encryption.tutorial.html].
 We can provide an option to enable an encrypted DynamoDB table so that data at 
rest is encrypted. S3Guard data in DynamoDB is usually not sensitive since it 
mirrors the S3 namespace, but sometimes even that is a concern. By default the 
option would be disabled.
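A possible shape for the switch, as a sketch only (the final property names 
would be decided in the patch):

{code}
<property>
  <name>fs.s3a.s3guard.ddb.table.sse.enabled</name>
  <value>true</value>
</property>
{code}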



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16722) S3GuardTool to support FilterFileSystem

2019-11-19 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-16722:
--

 Summary: S3GuardTool to support FilterFileSystem
 Key: HADOOP-16722
 URL: https://issues.apache.org/jira/browse/HADOOP-16722
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/s3
Reporter: Mingliang Liu


Currently S3GuardTool operates directly against an S3AFileSystem 
implementation. There are cases where {{fs.hboss.fs.s3a.impl}} is a 
FilterFileSystem wrapping another implementation. For example, [HBASE-23314] 
made {{HBaseObjectStoreSemantics}} extend {{FilterFileSystem}} instead of 
{{FileSystem}}. S3GuardTool could use the {{FilterFileSystem::getRawFileSystem}} 
method to get at the wrapped S3AFileSystem implementation. Without this support, 
a simple S3GuardTool run against HBOSS gets a confusing error like 
"s3a://mybucket is not a S3A file system".
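A sketch of the proposed unwrapping (illustrative, not a committed change):

{code}
FileSystem fs = FileSystem.get(fsURI, conf);
// Unwrap any FilterFileSystem layers (e.g. HBOSS) to reach the real store.
while (fs instanceof FilterFileSystem) {
  fs = ((FilterFileSystem) fs).getRawFileSystem();
}
if (!(fs instanceof S3AFileSystem)) {
  throw new IllegalArgumentException(fsURI + " is not a S3A file system");
}
{code}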



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-16355) ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store

2019-06-07 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-16355:
--

 Summary: ZookeeperMetadataStore: Use Zookeeper as S3Guard backend 
store
 Key: HADOOP-16355
 URL: https://issues.apache.org/jira/browse/HADOOP-16355
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs
Reporter: Mingliang Liu


When S3Guard was proposed, there were a couple of valid reasons to choose 
DynamoDB as its default backend store: 0) seamless integration as part of the 
AWS ecosystem, e.g. the client library 1) it's a managed web service with zero 
operational cost, highly available and infinitely scalable 2) it's performant, 
with single-digit latency 3) it's proven by Netflix's S3mper (not actively 
maintained) and EMRFS (closed source). As the store is pluggable, it's possible 
to implement {{MetadataStore}} with another backend store without changing 
semantics, besides the null and in-memory local ones.

Here we propose {{ZookeeperMetadataStore}} which uses Zookeeper as S3Guard 
backend store. Its main motivation is to provide a new MetadataStore option 
which:
 # can be easily integrated as Zookeeper is heavily used in Hadoop community
 # affordable performance as both client and Zookeeper ensemble are usually 
"local" in a Hadoop cluster (ZK/HBase/Hive etc)
 # removes DynamoDB dependency
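Since the {{MetadataStore}} is pluggable, wiring the new store in would be a 
pure configuration change, roughly as below (the class name is the one proposed 
here, not yet existing):

{code}
<property>
  <name>fs.s3a.metadatastore.impl</name>
  <value>org.apache.hadoop.fs.s3a.s3guard.ZookeeperMetadataStore</value>
</property>
{code}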

Obviously not all use cases will prefer this to the default DynamoDB store. For 
example, ZK might not scale well if there are dozens of S3 buckets and each has 
millions of objects.

Our use case is targeting HBase to store HFiles on S3 instead of HDFS. A total 
solution for HBase on S3 must be HBOSS (see HBASE-22149) for recovering 
atomicity of metadata operations like rename, and S3Guard for consistent 
enumeration and access to object store bucket metadata. We would like to use 
Zookeeper as backend store for both.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: HADOOP-14163 proposal for new hadoop.apache.org

2018-09-04 Thread Mingliang Liu
bout
> >>> the documentation which is generated by mvn-site (as before)
> >>>
> >>>
> >>> I got multiple valuable feedback and I improved the proposed site
> >>> according to the comments. Allen had some concerns about the used
> >>> technologies (hugo vs. mvn-site) and I answered all the questions why
> >>> I think mvn-site is the best for documentation and hugo is best for
> >>> generating site.
> >>>
> >>>
> >>> I would like to finish this effort/jira: I would like to start a
> >>> discussion about using this proposed version and approach as a new
> >>> site of Apache Hadoop. Please let me know what you think.
> >>>
> >>>
> >>> Thanks a lot,
> >>> Marton
> >>>
> >>> -
> >>> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >>>
> >>
> >>
> >> -
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >>
> >
> > -
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
>
>

-- 
Mingliang Liu


Re: [VOTE] Release Apache Hadoop 2.8.4 (RC0)

2018-05-12 Thread Mingliang Liu
+1 (binding)

1. Download src file, and check md5
2. Build package from src
3. Create a 6 node cluster in Docker containers and start the HDFS/YARN
services (The script is caochong <https://github.com/weiqingy/caochong>).
4. Verify the running version
5. Run simple HDFS/YARN client/admin commands and verify the output
6. Run example programs wordcount and grep
7. Check NN/DN/RM status and logs

Thanks Junping for driving this!

On Tue, May 8, 2018 at 10:41 AM, 俊平堵 <junping...@apache.org> wrote:

> Hi all,
>  I've created the first release candidate (RC0) for Apache Hadoop
> 2.8.4. This is our next maint release to follow up 2.8.3. It includes 77
> important fixes and improvements.
>
> The RC artifacts are available at:
> http://home.apache.org/~junping_du/hadoop-2.8.4-RC0
>
> The RC tag in git is: release-2.8.4-RC0
>
> The maven artifacts are available via repository.apache.org<
> http://repository.apache.org> at:
> https://repository.apache.org/content/repositories/orgapachehadoop-1118
>
> Please try the release and vote; the vote will run for the usual 5
> working days, ending on 5/14/2018 PST time.
>
> Thanks,
>
> Junping
>



-- 
Mingliang Liu


Re: [DISCUSS] official docker image(s) for hadoop

2017-09-13 Thread Mingliang Liu
> It would be very helpful for testing the RC.
For testing and voting, I have been using docker containers for a while, see 
code at: https://github.com/weiqingy/caochong 


> TL;DR: I propose to create official hadoop images and upload them to the 
> dockerhub
I’m +1 on this idea. The “official” docker image basically means a commitment 
to maintain well-documented and broadly tested images, which does not seem to be 
a burden for us.

Ceph has a community docker project (https://github.com/ceph/ceph-docker) and I 
think our scope here is similar to it.

Mingliang

> On Sep 13, 2017, at 11:39 AM, Yufei Gu  wrote:
> 
> It would be very helpful for testing the RC. To vote a RC, committers and
> PMCs usually spend lots of time to compile, deploy the RC, do several
> sanity tests, then +1 for the RC. The docker image potentially saves the
> compilation and deployment time, and people can do more tests.
> 
> Best,
> 
> Yufei
> 
> On Wed, Sep 13, 2017 at 11:19 AM, Wangda Tan  wrote:
> 
>> +1 to add Hadoop docker image for easier testing / prototyping, it gonna be
>> super helpful!
>> 
>> Thanks,
>> Wangda
>> 
>> On Wed, Sep 13, 2017 at 10:48 AM, Miklos Szegedi <
>> miklos.szeg...@cloudera.com> wrote:
>> 
>>> Marton, thank you for working on this. I think Official Docker images for
>>> Hadoop would be very useful for a lot of reasons. I think that it is
>> better
>>> to have a coordinated effort with production ready base images with
>>> dependent images for prototyping. Does anyone else have an opinion about
>>> this?
>>> 
>>> Thank you,
>>> Miklos
>>> 
>>> On Fri, Sep 8, 2017 at 5:45 AM, Marton, Elek  wrote:
>>> 
 
 TL;DR: I propose to create official hadoop images and upload them to
>> the
 dockerhub.
 
 GOAL/SCOPE: I would like improve the existing documentation with
 easy-to-use docker based recipes to start hadoop clusters with various
 configuration.
 
 The images also could be used to test experimental features. For
>> example
 ozone could be tested easily with these compose file and configuration:
 
 https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6
 
 Or even the configuration could be included in the compose file:
 
 https://github.com/elek/hadoop/blob/docker-2.8.0/example/doc
 ker-compose.yaml
 
 I would like to create separated example compose files for federation,
>>> ha,
 metrics usage, etc. to make it easier to try out and understand the
 features.
 
 CONTEXT: There is an existing Jira https://issues.apache.org/jira
 /browse/HADOOP-13397
 But it’s about a tool to generate production quality docker images
 (multiple types, in a flexible way). If no objections, I will create a
 separated issue to create simplified docker images for rapid
>> prototyping
 and investigating new features. And register the branch to the
>> dockerhub
>>> to
 create the images automatically.
 
 MY BACKGROUND: I am working with docker based hadoop/spark clusters
>> quite
 a while and run them succesfully in different environments (kubernetes,
 docker-swarm, nomad-based scheduling, etc.) My work is available from
>>> here:
 https://github.com/flokkr but they could handle more complex use cases
 (eg. instrumenting java processes with btrace, or read/reload
>>> configuration
 from consul).
 And IMHO in the official hadoop documentation it’s better to suggest
>> to
 use official apache docker images and not external ones (which could be
 changed).
 
 Please let me know if you have any comments.
 
 Marton
 
 -
 To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
 For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
 
 
>>> 
>> 



Re: [VOTE] Release Apache Hadoop 2.8.2 (RC0)

2017-09-10 Thread Mingliang Liu
Thanks Junping for doing this!

+1 (non-binding)

- Download the hadoop-2.8.2-src.tar.gz file and checked the md5 value
- Build package using maven (skipping tests) with Java 8
- Spin up a test cluster in Docker containers having 1 master node (NN/RM) and 
3 slave nodes (DN/NM)
- Operate the basic HDFS/YARN operations from command line, both client and 
admin
- Check NN/RM Web UI
- Run distcp to copy files from/to local and HDFS
- Run hadoop mapreduce examples: grep and wordcount
- Check the HDFS service logs

All looked good to me.

Mingliang

> On Sep 10, 2017, at 5:00 PM, Junping Du  wrote:
> 
> Hi folks,
> With fix of HADOOP-14842 get in, I've created our first release candidate 
> (RC0) for Apache Hadoop 2.8.2.
> 
> Apache Hadoop 2.8.2 is the first stable release of Hadoop 2.8 line and 
> will be the latest stable/production release for Apache Hadoop - it includes 
> 305 new fixed issues since 2.8.1 and 63 fixes are marked as blocker/critical 
> issues.
> 
>  More information about the 2.8.2 release plan can be found here: 
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
> 
>  New RC is available at: 
> http://home.apache.org/~junping_du/hadoop-2.8.2-RC0
> 
>  The RC tag in git is: release-2.8.2-RC0, and the latest commit id is: 
> e6597fe3000b06847d2bf55f2bab81770f4b2505
> 
>  The maven artifacts are available via repository.apache.org at: 
> https://repository.apache.org/content/repositories/orgapachehadoop-1062
> 
>  Please try the release and vote; the vote will run for the usual 5 days, 
> ending on 09/15/2017 5pm PST time.
> 
> Thanks,
> 
> Junping
> 


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Merge HADOOP-13345 (S3Guard feature branch)

2017-08-24 Thread Mingliang Liu
Thanks Andrew. Arpit also told me about this but I forgot to bring it up here.

Best,

> On Aug 24, 2017, at 10:59 AM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> 
> FYI that committer +1s are binding on merges, so Sean and Mingliang's +1s
> can be upgraded to binding.
> 
> On Thu, Aug 24, 2017 at 6:09 AM, Kihwal Lee <kih...@oath.com.invalid> wrote:
> 
>> +1 (binding)
>> Great work guys!
>> 
>> On Thu, Aug 24, 2017 at 5:01 AM, Steve Loughran <ste...@hortonworks.com>
>> wrote:
>> 
>>> 
>>> On 23 Aug 2017, at 19:21, Aaron Fabbri <fab...@cloudera.com> wrote:
>>> 
>>> 
>>> On Tue, Aug 22, 2017 at 10:24 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>>> video being processed: https://www.youtube.com/watch?v=oIe5Zl2YsLE&feature=youtu.be
>>> 
>>> 
>>> Awesome demo Steve, thanks for doing this.  Particularly glad to see
>> folks
>>> using and extending the failure injection client.
>>> 
>>> The HADOOP-13786 iteration turns on throttle event generation. All the
>> new
>>> committer stuff is ready for it, but all the existing S3A FS ops react
>> to a
>>> throttle exception by failing, when they need to just back off a bit.
>> This
>>> complicates testing as I have to explicitly turn off fault injection for
>>> setup & teardown
>>> 
>>> 
>>> Demoing the CLI tool was great as well.
>>> 
>>> 
>>> I'm going to have to do another iteration on that CLI tool post-merge, as
>>> I had one big problem: working out if the bucket and all the binding
>>> settings meant it was "guarded". I think we'll need to track what issues
>>> like that crop up in the field and add the diagnostics/other options.
>>> 
>>> +I think another one that'd be useful would be to enum all s3guard DDB
>>> tables in a region/globally & list their allocated IOPs. I know the AWS
>> UI
>>> can list tables by region, but you need to look around every region to
>> find
>>> out if you've accidentally created one. If you enum all table & look for
>> a
>>> s3guard version marker, then you can identify tables.
>>> 
>>> Wanted to mention two things:
>>> 
>>> 1. Authoritative mode is not fully implemented yet with Dynamo (it needs
>>> to persist an extra bit for directories).  I do have an auth-mode patch
>>> (done for a hackathon) that I need to post which shows large performance
>>> improvements over what S3Guard has today.  As you said, we don't consider
>>> authoritative mode ready for production yet: we want to play with it more
>>> and improve the prune algorithm first.  Authoritative mode can be thought
>>> of as a nice bonus in the future: The main goal of S3Guard v1 is to fix
>> the
>>> get / list consistency issues you mentioned, which it does well.
>>> 
>>> 
>>> we need to call that out in the release notes.
>>> 
>>> 2. Also wanted to thank Lei (Eddy) Xu, he was very active during early
>>> design and contributed some patches as well.
>>> 
>>> 
>>> good point. Lei: you will get a special mention the next time I do the
>> demo
>>> 
>>> 
>>> Again, great demo, enjoyed it!
>>> 
>>> -AF
>>> 
>>> 
>>> its actually quite hard to show any benefits of s3guard on the command
>>> line, so I've ended up showing some scala tests where I turn on the
>>> (bundled) inconsistent AWS client to show how you then need to enable
>>> s3guard to make the stack traces go away
>>> 
>>> 
>>> On 22 Aug 2017, at 11:17, Steve Loughran <ste...@hortonworks.com> wrote:
>>> 
>>> +1 (binding)
>>> 
>>> I'm happy with it; it's a great piece of work by (in no particular
>> order):
>>> Chris Nauroth, Aaron Fabbri, Sean McRory & Mingliang Liu. plus a few bits
>>> in the corners where I got to break things while they were all asleep.
>> Also
>>> deserving a mention: Thomas Demoor & Ewan Higgs @ WDC for consultancy on
>>> the corners of S3, everyone who tested in (including our QA team), Sanjay
>>> Radia, & others.
>>> 
>>> I've already done a couple of iterations of fixing checkstyles & code

Re: [VOTE] Merge HADOOP-13345 (S3Guard feature branch)

2017-08-19 Thread Mingliang Liu
+1 (non-binding)

I also worked on this project from start to finish and I really enjoyed the 
collaboration in community. The feature is to solve the very important and 
challenging consistency problem as stated in the design doc. All patches were 
reviewed by feature/trunk committers and we have been testing it with real 
world applications. Overall I think it is now production ready. Most 
contributors of this project are active in community and I believe the future 
work and code maintenance will be well addressed.

Thanks,

> On Aug 17, 2017, at 3:07 PM, Aaron Fabbri  wrote:
> 
> Hello,
> 
> I'd like to open a vote (7 days, ending August 24 at 3:10 PST) to merge the
> HADOOP-13345 feature branch into trunk.
> 
> This branch contains the new S3Guard feature which adds metadata
> consistency features to the S3A client.  Formatted site documentation can
> be found here:
> 
> https://github.com/apache/hadoop/blob/HADOOP-13345/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/s3guard.md
> 
> The current patch against trunk is posted here:
> 
> https://issues.apache.org/jira/browse/HADOOP-13998
> 
> The branch modifies the s3a portion of the hadoop-tools/hadoop-aws module:
> 
> - The feature is off by default, and care has been taken to insure it has
> no impact when disabled.
> - S3Guard can be enabled with the production database which is backed by
> DynamoDB, or with a local, in-memory implementation that facilitates
> integration testing without having to pay for a database.
> - getFileStatus() as well as directory listing consistency has been
> implemented and thoroughly tested, including delete tracking.
> - Convenient Maven profiles for testing with and without S3Guard.
> - New failure injection code and integration tests that exercise it.  We
> use timers and a wrapper around the Amazon SDK client object to force
> consistency delays to occur.  This allows us to assert that S3Guard works
> as advertised.  This will be extended with more types of failure injection
> to continue hardening the S3A client.
> 
> Outside of hadoop-tools/hadoop-aws's s3a directory there are some minor
> changes:
> 
> - core-default.xml defaults and documentation for s3guard parameters.
> - A couple additional FS contract test cases around rename.
> - More goodies in LambdaTestUtils
> - A new CLI tool for inspecting and manipulating S3Guard features,
> including the backing MetadataStore database.
> 
> This branch has seen extensive testing as well as use in production.  This
> branch makes significant improvements to S3A's test toolkit as well.
> 
> Performance is typically on par with, and in some cases better than, the
> existing S3A code without S3Guard enabled.
> 
> This feature was developed with contributions and feedback from many
> people.  I'd like to thank everyone who worked on HADOOP-13345 as well as
> all of those who contributed feedback and work on the original design
> document.
> 
> This is the first major Apache Hadoop project I've worked on from start to
> finish, and I've really enjoyed it.  Please shout if I've missed anything
> important here or in the VOTE process.
> 
> Cheers,
> Aaron Fabbri


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [VOTE] Release Apache Hadoop 2.7.4 (RC0)

2017-08-02 Thread Mingliang Liu
Thanks Konstantin!

+1 (non-binding)

I have tried several steps including:
- Download source package and check the checksums
- Build from source using Java 8
- Deploy a test cluster using docker containers w/ 5 DNs and 1 NN
- Run basic HDFS client/admin operations (e.g. read/write files, dfsadmin)
- Run NNThroughputBenchmark (using the new -fs option)
- Run DistCp and example apps (wordcount, sort)

All looked good to me.

> On Jul 29, 2017, at 4:29 PM, Konstantin Shvachko  wrote:
> 
> Hi everybody,
> 
> Here is the next release of Apache Hadoop 2.7 line. The previous stable
> release 2.7.3 was available since 25 August, 2016.
> Release 2.7.4 includes 264 issues fixed after release 2.7.3, which are
> critical bug fixes and major optimizations. See more details in Release
> Note:
> http://home.apache.org/~shv/hadoop-2.7.4-RC0/releasenotes.html
> 
> The RC0 is available at: http://home.apache.org/~shv/hadoop-2.7.4-RC0/
> 
> Please give it a try and vote on this thread. The vote will run for 5 days
> ending 08/04/2017.
> 
> Please note that my up to date public key are available from:
> https://dist.apache.org/repos/dist/release/hadoop/common/KEYS
> Please don't forget to refresh the page if you've been there recently.
> There are other place on Apache sites, which may contain my outdated key.
> 
> Thanks,
> --Konstantin


-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14601) Azure: Reuse ObjectMapper

2017-06-27 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14601:
--

 Summary: Azure: Reuse ObjectMapper
 Key: HADOOP-14601
 URL: https://issues.apache.org/jira/browse/HADOOP-14601
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/azure
Reporter: Mingliang Liu
Assignee: Mingliang Liu


Currently there are a few places in the {{hadoop-azure}} module that create an 
{{ObjectMapper}} for each request/call. We should re-use the object mapper for 
performance reasons.

The general caveat is about thread safety; I think the change will be safe 
though.
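A minimal sketch of the intended change (Jackson's {{ObjectMapper}} is 
thread-safe once configured, so a shared static instance suffices, assuming no 
per-call reconfiguration is needed):

{code}
// Before (per call): new ObjectMapper().writeValueAsString(obj);
// After: one shared, pre-configured instance.
private static final ObjectMapper MAPPER = new ObjectMapper();

String toJson(Object obj) throws JsonProcessingException {
  return MAPPER.writeValueAsString(obj);
}
{code}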



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14594) ITestS3AFileOperationCost::testFakeDirectoryDeletion to uncomment metric assertions

2017-06-26 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14594:
--

 Summary: ITestS3AFileOperationCost::testFakeDirectoryDeletion to 
uncomment metric assertions
 Key: HADOOP-14594
 URL: https://issues.apache.org/jira/browse/HADOOP-14594
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Mingliang Liu
Assignee: Mingliang Liu


Per the discussions in [HADOOP-14255] and [HADOOP-13222], we can uncomment the 
metric assertions in the tests and delete the TODO comment.

See the attached patch for more details.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14536) Update azure-storage sdk to version 5.2.0

2017-06-16 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14536:
--

 Summary: Update azure-storage sdk to version 5.2.0
 Key: HADOOP-14536
 URL: https://issues.apache.org/jira/browse/HADOOP-14536
 Project: Hadoop Common
  Issue Type: Improvement
  Components: fs/azure
Affects Versions: 3.0.0-alpha3
Reporter: Mingliang Liu


Update WASB driver to use the latest version (5.2.0) of SDK for Microsoft Azure 
Storage Clients. We are currently using version 4.2.0 of the SDK.

Azure Storage Clients changes between 4.2 and 5.2:

 * Fixed Exists() calls on Shares and Directories to now populate metadata. 
This was already being done for Files.
 * Changed blob constants to support up to 256 MB on put blob for block blobs. 
The default value for put blob threshold has also been updated to half of the 
maximum, or 128 MB currently.
 * Fixed a bug that prevented setting content MD5 to true when creating a new 
file.
 * Fixed a bug where access conditions, options, and operation context were not 
being passed when calling openWriteExisting() on a page blob or a file.
 * Fixed a bug where an exception was being thrown on a range get of a blob or 
file when the options disableContentMD5Validation is set to false and 
useTransactionalContentMD5 is set to true and there is no overall MD5.
 * Fixed a bug where retries were happening immediately if a socket exception 
was thrown.
 * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
This check has been moved to create() and uploadProperties() in CloudFileShare.
 * Prefix support for listing files and directories.
 * Added support for setting public access when creating a blob container
 * The public access setting on a blob container is now a container property 
returned from downloadProperties.
 * Add Message now modifies the PopReceipt, Id, NextVisibleTime, InsertionTime, 
and ExpirationTime properties of its CloudQueueMessage parameter.
 * Populate content MD5 for range gets on Blobs and Files.
 * Added support in Page Blob for incremental copy.
 * Added large BlockBlob upload support. Blocks can now support sizes up to 100 
MB.
 * Added a new, memory-optimized upload strategy for the upload* APIs. This 
algorithm only applies for blocks greater than 4MB and when storeBlobContentMD5 
and Client-Side Encryption are disabled.
 * getQualifiedUri() has been deprecated for Blobs. Please use 
getSnapshotQualifiedUri() instead. This new function will return the blob 
including the snapshot (if present) and no SAS token.
 * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
getSnapshotQualifiedStorageUri() instead. This new function will return the 
blob including the snapshot (if present) and no SAS token.
 * Fixed a bug where copying from a blob that included a SAS token and a 
snapshot omitted the SAS token.
 * Fixed a bug in client-side encryption for tables that was preventing the 
Java client from decrypting entities encrypted with the .NET client, and vice 
versa.
 * Added support for server-side encryption.
 * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
to support retrieving a blob without knowing its type.
 * Fixed a bug in the retry policies where 300 status codes were being retried 
when they shouldn't be.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14516) Update WASB driver to use the latest version (5.2.0) of SDK for Microsoft Azure Storage Clients

2017-06-09 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-14516.

Resolution: Duplicate

Closing as duplicate. Please see [HADOOP-14490] and comment there. Thanks,

> Update WASB driver to use the latest version (5.2.0) of SDK for Microsoft 
> Azure Storage Clients
> ---
>
> Key: HADOOP-14516
> URL: https://issues.apache.org/jira/browse/HADOOP-14516
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs/azure
>Affects Versions: 3.0.0-alpha3
>Reporter: Georgi Chalakov
>
> Update WASB driver to use the latest version (5.2.0) of SDK for Microsoft 
> Azure Storage Clients. We are currently using version 4.2.0 of the SDK.
> Azure Storage Clients changes between 4.2 and 5.2:
>  * Fixed Exists() calls on Shares and Directories to now populate metadata. 
> This was already being done for Files.
>  * Changed blob constants to support up to 256 MB on put blob for block 
> blobs. The default value for put blob threshold has also been updated to half 
> of the maximum, or 128 MB currently.
>  * Fixed a bug that prevented setting content MD5 to true when creating a new 
> file.
>  * Fixed a bug where access conditions, options, and operation context were 
> not being passed when calling openWriteExisting() on a page blob or a file.
>  * Fixed a bug where an exception was being thrown on a range get of a blob 
> or file when the options disableContentMD5Validation is set to false and 
> useTransactionalContentMD5 is set to true and there is no overall MD5.
>  * Fixed a bug where retries were happening immediately if a socket exception 
> was thrown.
>  * In CloudFileShareProperties, setShareQuota() no longer asserts in bounds. 
> This check has been moved to create() and uploadProperties() in 
> CloudFileShare.
>  * Prefix support for listing files and directories.
>  * Added support for setting public access when creating a blob container
>  * The public access setting on a blob container is now a container property 
> returned from downloadProperties.
>  * Add Message now modifies the PopReceipt, Id, NextVisibleTime, 
> InsertionTime, and ExpirationTime properties of its CloudQueueMessage 
> parameter.
>  * Populate content MD5 for range gets on Blobs and Files.
>  * Added support in Page Blob for incremental copy.
>  * Added large BlockBlob upload support. Blocks can now support sizes up to 
> 100 MB.
>  * Added a new, memory-optimized upload strategy for the upload* APIs. This 
> algorithm only applies for blocks greater than 4MB and when 
> storeBlobContentMD5 and Client-Side Encryption are disabled.
>  * getQualifiedUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedUri() instead. This new function will return the blob 
> including the snapshot (if present) and no SAS token.
>  * getQualifiedStorageUri() has been deprecated for Blobs. Please use 
> getSnapshotQualifiedStorageUri() instead. This new function will return the 
> blob including the snapshot (if present) and no SAS token.
>  * Fixed a bug where copying from a blob that included a SAS token and a 
> snapshot omitted the SAS token.
>  * Fixed a bug in client-side encryption for tables that was preventing the 
> Java client from decrypting entities encrypted with the .NET client, and vice 
> versa.
>  * Added support for server-side encryption.
>  * Added support for getBlobReferenceFromServer methods on CloudBlobContainer 
> to support retrieving a blob without knowing its type.
>  * Fixed a bug in the retry policies where 300 status codes were being 
> retried when they shouldn't be.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14500) Azure: TestFileSystemOperationExceptionHandling{,MultiThreaded} fails

2017-06-06 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14500:
--

 Summary: Azure: 
TestFileSystemOperationExceptionHandling{,MultiThreaded} fails
 Key: HADOOP-14500
 URL: https://issues.apache.org/jira/browse/HADOOP-14500
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/azure, test
Reporter: Mingliang Liu


The following tests fail:
{code}
TestFileSystemOperationExceptionHandling.testSingleThreadBlockBlobSeekScenario
  Expected exception: java.io.FileNotFoundException
TestFileSystemOperationsExceptionHandlingMultiThreaded.testMultiThreadBlockBlobSeekScenario
  Expected exception: java.io.FileNotFoundException
{code}

I did early analysis and found [HADOOP-14478] may be the reason. I think we can 
fix the test itself here.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14498) HADOOP_OPTIONAL_TOOLS not parsed correctly

2017-06-06 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14498:
--

 Summary: HADOOP_OPTIONAL_TOOLS not parsed correctly
 Key: HADOOP-14498
 URL: https://issues.apache.org/jira/browse/HADOOP-14498
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Affects Versions: 3.0.0-alpha1
Reporter: Mingliang Liu



# This will make hadoop-azure not show up in the hadoop classpath, though both 
hadoop-aws and hadoop-azure-datalake are in the classpath.
{code:title=hadoop-env.sh}
export HADOOP_OPTIONAL_TOOLS="hadoop-azure,hadoop-aws,hadoop-azure-datalake"
{code}
# And if we put only hadoop-azure and hadoop-aws, both of them are shown in the 
classpath.
{code:title=hadoop-env.sh}
export HADOOP_OPTIONAL_TOOLS="hadoop-azure,hadoop-aws"
{code}

This makes me guess that, while parsing {{HADOOP_OPTIONAL_TOOLS}}, we assume 
hadoop tool modules have a single "-" in their names, so that 
_hadoop-azure-datalake_ overrides _hadoop-azure_. Or are there other assumptions 
about {{${project.artifactId}}}?

Ping [~aw].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14491) Azure has messed doc structure

2017-06-05 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14491:
--

 Summary: Azure has messed doc structure
 Key: HADOOP-14491
 URL: https://issues.apache.org/jira/browse/HADOOP-14491
 Project: Hadoop Common
  Issue Type: Improvement
  Components: documentation, fs/azure
Reporter: Mingliang Liu
Assignee: Mingliang Liu


# The _WASB Secure mode and configuration_ and _Authorization Support in WASB_ 
sections are missing from the navigation
# _Authorization Support in WASB_ should be a level-3 header instead of level 2
# Some code blocks do not have their format specified
# Sample code indentation is not unified.

Let's use the auto-generated navigation instead of manually updating it, just as 
other documents do.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14490) Upgrade azure-storage sdk version

2017-06-05 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14490:
--

 Summary: Upgrade azure-storage sdk version
 Key: HADOOP-14490
 URL: https://issues.apache.org/jira/browse/HADOOP-14490
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Mingliang Liu


As required by [HADOOP-14478], we're expecting {{BlobInputStream}} to support a 
smarter {{readFully()}} that takes mark hints. This can only be done by bumping 
the SDK version.

cc: [~rajesh.balamohan].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14472) Azure: TestReadAndSeekPageBlobAfterWrite fails intermittently

2017-05-31 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14472:
--

 Summary: Azure: TestReadAndSeekPageBlobAfterWrite fails 
intermittently
 Key: HADOOP-14472
 URL: https://issues.apache.org/jira/browse/HADOOP-14472
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/azure
Reporter: Mingliang Liu


Reported by [HADOOP-14461]
{code}
testManySmallWritesWithHFlush(org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite)
  Time elapsed: 1.051 sec  <<< FAILURE!
java.lang.AssertionError: hflush duration of 13, less than minimum expected of 
20
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.writeAndReadOneFile(TestReadAndSeekPageBlobAfterWrite.java:286)
at 
org.apache.hadoop.fs.azure.TestReadAndSeekPageBlobAfterWrite.testManySmallWritesWithHFlush(TestReadAndSeekPageBlobAfterWrite.java:247)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14460) Azure: fs.azure.account.key.youraccount.blob.core.windows.net -> fs.azure.account.key.youraccount

2017-05-26 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14460:
--

 Summary: Azure: 
fs.azure.account.key.youraccount.blob.core.windows.net -> 
fs.azure.account.key.youraccount
 Key: HADOOP-14460
 URL: https://issues.apache.org/jira/browse/HADOOP-14460
 Project: Hadoop Common
  Issue Type: Bug
  Components: documentation, fs/azure
Reporter: Mingliang Liu
Assignee: Mingliang Liu


In {{SimpleKeyProvider}}, we have the following code for getting the key:
{code}
  protected static final String KEY_ACCOUNT_KEY_PREFIX =
  "fs.azure.account.key.";
...
  protected String getStorageAccountKeyName(String accountName) {
return KEY_ACCOUNT_KEY_PREFIX + accountName;
  }
{code}

While in documentation {{index.md}}, we have:
{code}
<property>
  <name>fs.azure.account.key.youraccount.blob.core.windows.net</name>
  <value>YOUR ACCESS KEY</value>
</property>
{code}
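i.e. judging by the code above (and per this issue's title), the documented 
property name should presumably be the short form:

{code}
<property>
  <name>fs.azure.account.key.youraccount</name>
  <value>YOUR ACCESS KEY</value>
</property>
{code}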



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: s3a hadoop 2.8.0 tests vs 2.7.3

2017-05-26 Thread Mingliang Liu
Hi,

Many S3A tests have been moved to “integration tests”, whose names start 
with “ITestS3A”. Moreover, the Maven phase for those tests is now “verify” 
instead of “test”.

So, you can run "mvn -Dit.test=‘ITestS3A*’ verify" for the integration 
tests (plus unit tests); “mvn test” will run unit tests only. Make sure the 
credentials are provided.

By the way, upgrading from 2.7 to 2.8 is a smart choice from S3A point of view.

https://github.com/apache/hadoop/blob/branch-2.8.1/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md#running-the-tests


L

> On May 26, 2017, at 9:00 AM, Vasu Kulkarni  wrote:
> 
> Sorry resending because i had problems with group subscribe
> 
> Hi,
> 
> I am trying to run hadoop s3a unit tests on 2.8.0 release(using ceph
> radosgw),  I notice that many tests that ran in 2.7.3 have been
> skipped in 2.8.0 hadoop release,
> I am following the configuration options from here that worked here
> for 2.7.3: 
> https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md
> 
> Has any of the configuration options changed for 2.8.0 which are not
> yet documented or the test structure has changed ? Thanks
> 
> on 2.8.0:
> 
> Tests: (mvn test -Dtest=S3a*,TestS3A* )
> Running org.apache.hadoop.fs.s3a.TestS3AExceptionTranslation
> Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.124
> sec - in org.apache.hadoop.fs.s3a.TestS3AExceptionTranslation
> Running org.apache.hadoop.fs.s3a.TestS3AAWSCredentialsProvider
> Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.385
> sec - in org.apache.hadoop.fs.s3a.TestS3AAWSCredentialsProvider
> Running org.apache.hadoop.fs.s3a.TestS3AInputPolicies
> Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.146
> sec - in org.apache.hadoop.fs.s3a.TestS3AInputPolicies
> Running org.apache.hadoop.fs.s3a.TestS3AGetFileStatus
> 
> 
> Tests run: 40, Failures: 0, Errors: 0, Skipped: 0
> 
> 
> 
> on 2.7.3:
> 
> Tests: ( mvn test -Dtest=S3a*,TestS3A*)
> 
> Running org.apache.hadoop.fs.contract.s3a.TestS3AContractMkdir
> Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRootDir
> Running org.apache.hadoop.fs.contract.s3a.TestS3AContractRename
> .
> .
> Running org.apache.hadoop.fs.s3a.scale.TestS3ADeleteManyFiles
> 
> Tests run: 88, Failures: 0, Errors: 0, Skipped: 48
> 
> Thanks
> 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 



[jira] [Created] (HADOOP-14458) TestAliyunOSSFileSystemContract missing imports/

2017-05-25 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14458:
--

 Summary: TestAliyunOSSFileSystemContract missing imports/
 Key: HADOOP-14458
 URL: https://issues.apache.org/jira/browse/HADOOP-14458
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/oss, test
Reporter: Mingliang Liu
Assignee: Mingliang Liu


{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile 
(default-testCompile) on project hadoop-aliyun: Compilation failure: 
Compilation failure:
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[71,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[90,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[91,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[92,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[93,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[95,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[96,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[98,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[99,5]
 cannot find symbol
[ERROR]   symbol:   method assertTrue(java.lang.String,boolean)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[115,7]
 cannot find symbol
[ERROR]   symbol:   method fail(java.lang.String)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[129,7]
 cannot find symbol
[ERROR]   symbol:   method fail(java.lang.String)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[143,7]
 cannot find symbol
[ERROR]   symbol:   method fail(java.lang.String)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss/TestAliyunOSSFileSystemContract.java:[163,7]
 cannot find symbol
[ERROR]   symbol:   method fail(java.lang.String)
[ERROR]   location: class 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract
[ERROR] 
/Users/mliu/Workspace/hadoop/hadoop-tools/hadoop-aliyun/src/test/java/org/apache/hadoop/fs/aliyun/oss
{code}
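All of the failures above are the same missing symbols, 
{{assertTrue(String,boolean)}} and {{fail(String)}}, so the fix is presumably 
just restoring the JUnit static imports in the test, e.g.:

{code}
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;
{code}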

[jira] [Created] (HADOOP-14438) Make ADLS doc of setting up client key up to date

2017-05-19 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14438:
--

 Summary: Make ADLS doc of setting up client key up to date
 Key: HADOOP-14438
 URL: https://issues.apache.org/jira/browse/HADOOP-14438
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/adl
Reporter: Mingliang Liu


In the doc {{hadoop-tools/hadoop-azure-datalake/src/site/markdown/index.md}}, 
we have such a statement:
{code:title=Note down the properties you will need to auth}
...
- Resource: Always https://management.core.windows.net/ , for all customers
{code}
Is the {{Resource}} useful here? It does not seem necessary to me.

{code:title=Adding the service principal to your ADL Account}
- ...
- Select Users under Settings
...
{code}
According to the portal, it should be "Access control (IAM)" under "Settings".



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14363) Inconsistent default block location in FileSystem javadoc

2017-04-28 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14363:
--

 Summary: Inconsistent default block location in FileSystem javadoc
 Key: HADOOP-14363
 URL: https://issues.apache.org/jira/browse/HADOOP-14363
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Mingliang Liu
Assignee: Chen Liang
Priority: Trivial


{code:title=FileSystem::getFileBlockLocations()}
..
   *
   * The default implementation returns an array containing one element:
   * <pre>
   * BlockLocation( { "localhost:50010" },  { "localhost" }, 0, file.getLen())
   * </pre>
   *
   * @param file FilesStatus to get data from
   * @param start offset into the given file
   * @param len length for which to get locations for
   * @throws IOException IO failure
   */
  public BlockLocation[] getFileBlockLocations(FileStatus file,
  long start, long len) throws IOException {
...
String[] name = {"localhost:9866"};
String[] host = {"localhost"};
return new BlockLocation[] {
  new BlockLocation(name, host, 0, file.getLen()) };
  }
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14349) Rename ADLS CONTRACT_ENABLE_KEY

2017-04-24 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14349:
--

 Summary: Rename ADLS CONTRACT_ENABLE_KEY
 Key: HADOOP-14349
 URL: https://issues.apache.org/jira/browse/HADOOP-14349
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/adl
Affects Versions: 2.8.0
Reporter: Mingliang Liu


dfs.adl.test.contract.enable -> fs.adl.test.contract.enable



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14165) Add S3Guard.dirListingUnion in S3AFileSystem#listFiles, listLocatedStatus

2017-04-12 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-14165.

Resolution: Duplicate

Resolving as a duplicate of [HADOOP-13926] and [HADOOP-14266].

> Add S3Guard.dirListingUnion in S3AFileSystem#listFiles, listLocatedStatus
> -
>
> Key: HADOOP-14165
> URL: https://issues.apache.org/jira/browse/HADOOP-14165
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Rajesh Balamohan
>Priority: Minor
>
> {{S3Guard::dirListingUnion}} merges information from backing store and DDB to 
> create consistent view. This needs to be added in 
> {{S3AFileSystem::listFiles}} and {{S3AFileSystem::listLocatedStatus}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14215) DynamoDB client should waitForActive on existing tables

2017-04-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-14215.

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HADOOP-13345

Committed to feature branch. Thanks for your contribution [~mackrorysd]. Thanks 
[~fabbri] for review and [~rajesh.balamohan] for offline discussion.

> DynamoDB client should waitForActive on existing tables
> ---
>
> Key: HADOOP-14215
> URL: https://issues.apache.org/jira/browse/HADOOP-14215
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
>Priority: Critical
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14215-HADOOP-13345.000.patch, 
> HADOOP-14215-HADOOP-13345.001.patch, HADOOP-14215-HADOOP-13345.002.patch
>
>
> I saw a case where 2 separate applications tried to use the same 
> non-pre-existing table with table.create = true at about the same time. One 
> failed with a ResourceInUse exception. If a table does not exist, we attempt 
> to create it and then wait for it to enter the active state. If another 
> application jumps in in the middle of that, the table may already exist, thus 
> bypassing our call to waitForActive(), and it may then try to use the table 
> immediately.
> While we're at it, let's also make sure that the race condition where a table 
> might get created between checking if it exists and attempting to create it 
> is handled gracefully.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14282) S3Guard: DynamoDBMetadata::prune() should self interrupt correctly

2017-04-05 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14282:
--

 Summary: S3Guard: DynamoDBMetadata::prune() should self interrupt 
correctly
 Key: HADOOP-14282
 URL: https://issues.apache.org/jira/browse/HADOOP-14282
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Mingliang Liu
Assignee: Mingliang Liu






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14266) S3Guard: S3AFileSystem::listFiles() to employ MetadataStore

2017-03-31 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14266:
--

 Summary: S3Guard: S3AFileSystem::listFiles() to employ 
MetadataStore
 Key: HADOOP-14266
 URL: https://issues.apache.org/jira/browse/HADOOP-14266
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: HADOOP-13345
Reporter: Mingliang Liu


Similar to [HADOOP-13926], this is to track the effort of employing 
MetadataStore in {{S3AFileSystem::listFiles()}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14255) S3A to delete unnecessary fake directory objects in mkdirs()

2017-03-29 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14255:
--

 Summary: S3A to delete unnecessary fake directory objects in 
mkdirs()
 Key: HADOOP-14255
 URL: https://issues.apache.org/jira/browse/HADOOP-14255
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


In S3AFileSystem, as an optimization, we delete unnecessary fake directory 
objects if that directory contains at least one (nested) file. That is done when 
closing the stream of a newly created file. However, if the directory becomes 
non-empty because we just created an empty subdirectory, we do not delete its 
fake directory object, though that fake directory object has become 
"unnecessary".

So in {{S3AFileSystem::mkdirs()}}, we have a pending TODO:
{quote}
  // TODO: If we have created an empty file at /foo/bar and we then call
  // mkdirs for /foo/bar/baz/roo what happens to the empty file /foo/bar/?
  private boolean innerMkdirs(Path p, FsPermission permission)
{quote}

This JIRA is to fix the TODO: provide consistent behavior for a fake directory 
object between its nested subdirectory and nested file by deleting it.

See related discussion in [HADOOP-14236].
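In other words, {{innerMkdirs()}} should end the same way the write-completion 
path does. A sketch (illustrative only; the helpers named are the existing S3A 
ones):

{code}
// after creating the fake directory object for path p in innerMkdirs():
createFakeDirectory(key);
// ...then drop any parent markers that just became redundant, exactly as
// finishedWrite() already does for newly created files:
deleteUnnecessaryFakeDirectories(p.getParent());
{code}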



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14248) Retire SharedInstanceProfileCredentialsProvider after AWS SDK upgrade

2017-03-27 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14248:
--

 Summary: Retire SharedInstanceProfileCredentialsProvider after AWS 
SDK upgrade
 Key: HADOOP-14248
 URL: https://issues.apache.org/jira/browse/HADOOP-14248
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: 3.0.0-alpha3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


This is from the discussion in [HADOOP-13050].

So [HADOOP-13727] added {{SharedInstanceProfileCredentialsProvider}}, which 
effectively reduces the high number of connections to the EC2 Instance Metadata 
Service caused by {{InstanceProfileCredentialsProvider}}. To prevent the 
throttling problem, that patch defined the new class as a subclass of 
{{InstanceProfileCredentialsProvider}} that enforces creation of only a single 
instance.

Per [HADOOP-13050], we upgraded the AWS Java SDK. Since then, the 
{{InstanceProfileCredentialsProvider}} in SDK code internally enforces a 
singleton. That  confirms that our effort in [HADOOP-13727] makes 100% sense. 
Meanwhile, {{SharedInstanceProfileCredentialsProvider}} can retire gracefully 
in trunk branch.
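
With the upgraded SDK, callers can simply share the SDK's own singleton, e.g. 
(a minimal sketch):
{code}
// The SDK now enforces a single shared instance internally:
AWSCredentialsProvider provider =
    InstanceProfileCredentialsProvider.getInstance();
{code}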



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14247) FileContextMainOperationsBaseTest should clean up test root path

2017-03-27 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14247:
--

 Summary: FileContextMainOperationsBaseTest should clean up test 
root path
 Key: HADOOP-14247
 URL: https://issues.apache.org/jira/browse/HADOOP-14247
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Affects Versions: 2.8.0
Reporter: Mingliang Liu
Assignee: Mingliang Liu






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14236) S3Guard: S3AFileSystem::rename() should move non-listed sub-directory entries in metadata store

2017-03-24 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14236:
--

 Summary: S3Guard: S3AFileSystem::rename() should move non-listed 
sub-directory entries in metadata store
 Key: HADOOP-14236
 URL: https://issues.apache.org/jira/browse/HADOOP-14236
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


After running integration test {{ITestS3AFileSystemContract}}, I found the 
following items are not cleaned up in DynamoDB:
{code}
parent=/mliu-s3guard/user/mliu/s3afilesystemcontract/testRenameDirectoryAsExisting/dir,
 child=subdir
parent=/mliu-s3guard/user/mliu/s3afilesystemcontract/testRenameDirectoryAsExistingNew/newdir/subdir,
 child=file2
{code}
At first I thought it’s similar to [HADOOP-14226] or [HADOOP-14227], and we 
need to be careful when cleaning up test data.

Then I found it's a bug in the code integrating S3Guard with S3AFileSystem: 
for rename we miss the sub-directory items to put (dest) and delete (src). The 
reason is that in S3A we delete fake directory objects once they are no longer 
necessary, e.g. when the directory is non-empty. So when we list the objects to 
rename, the object summaries will only contain _file_ objects. This has two 
consequences after rename:
# there will be leftover items for the src path in the metadata store - the 
leftovers will confuse {{get(Path)}}, which should return null
# we are not persisting the whole subtree for the dest path to the metadata 
store - this will break the DynamoDBMetadataStore invariant: _if a path exists, 
all its ancestors will also exist in the table_.

Existing tests are not complaining about this though. If this is a real bug, 
let’s address it here.
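
For illustration, a rough sketch of one possible direction (an assumption, not 
a final patch; {{childDst}}, {{dstRoot}}, {{dstMetas}} and {{username}} stand 
for the contextual values in the rename loop): while walking the rename 
listing, also synthesize metadata entries for the intermediate directories so 
both the src deletes and the dest puts cover the whole subtree.
{code}
// When recording a renamed file at childDst, also record every ancestor
// directory up to the rename destination root dstRoot, so the complete
// subtree is persisted in the metadata store.
Path dir = childDst.getParent();
while (dir != null && !dir.equals(dstRoot.getParent())) {
  dstMetas.add(new PathMetadata(new S3AFileStatus(false, dir, username)));
  dir = dir.getParent();
}
{code}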



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14227) S3Guard: ITestS3AConcurrentOps is not cleaning up test data

2017-03-23 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14227:
--

 Summary: S3Guard: ITestS3AConcurrentOps is not cleaning up test 
data
 Key: HADOOP-14227
 URL: https://issues.apache.org/jira/browse/HADOOP-14227
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


After running {{ITestS3AConcurrentOps}}, the test data is not cleaned up in 
DynamoDB. There are two reasons:
# The {{ITestS3AConcurrentOps::teardown()}} method is not calling super 
teardown() method to clean up the default test directory.
# The {{auxFs}} is not S3Guard aware even though the {{fs}} to test is. That's 
because the {{auxFs}} is creating a new Configuration object without patching 
in S3Guard options (via {{maybeEnableS3Guard(conf);}}).

This JIRA is to clean up the data after test.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14226) S3Guard: ITestDynamoDBMetadataStoreScale is not cleaning up test data

2017-03-23 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14226:
--

 Summary: S3Guard: ITestDynamoDBMetadataStoreScale is not cleaning 
up test data
 Key: HADOOP-14226
 URL: https://issues.apache.org/jira/browse/HADOOP-14226
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: HADOOP-13345
Reporter: Mingliang Liu
Assignee: Mingliang Liu


After running {{ITestDynamoDBMetadataStoreScale}}, the test data is not 
cleaned up, even though there is a call to {{clearMetadataStore(ms, count);}} 
in the finally clause. The reason is that the internally called method 
{{DynamoDBMetadataStore::deleteSubtree()}} assumes there is an item for the 
parent dest path:
{code}
parent=/fake-bucket, child=moved-here, is_dir=true
{code}

In the DynamoDBMetadataStore implementation, we assume that _if a path exists, 
all its ancestors will also exist in the table_. We need to pre-create the dest 
path to maintain this invariant so that the test data can be cleaned up 
successfully, as sketched below.
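
For example, the test could pre-create the destination parent before moving 
items under it (a rough sketch; {{makeDirStatus()}} is a hypothetical test 
helper, and {{pathsToDelete}}/{{pathsToCreate}} are the collections the test 
already builds):
{code}
// Pre-create the dest path so the "all ancestors exist" invariant holds
// and deleteSubtree() can find the parent item during cleanup.
Path dest = new Path("/fake-bucket/moved-here");
ms.put(new PathMetadata(makeDirStatus(dest)));
ms.move(pathsToDelete, pathsToCreate);
{code}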

I think there may be other tests with the same problem. Let's identify/address 
them separately.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14214) DomainSocketWatcher::add()/delete() should not self interrupt while looping await()

2017-03-22 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14214:
--

 Summary: DomainSocketWatcher::add()/delete() should not self 
interrupt while looping await()
 Key: HADOOP-14214
 URL: https://issues.apache.org/jira/browse/HADOOP-14214
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Mingliang Liu
Assignee: Mingliang Liu





Our Hive team found a TPCDS job whose queries running on LLAP seemed to be 
getting stuck. Dozens of threads were waiting for the 
{{DfsClientShmManager::lock}}, as in the following jstack:
{code}
Thread 251 (IO-Elevator-Thread-5):
  State: WAITING
  Blocked count: 3871
  Wtaited count: 4565
  Waiting on 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@16ead198
  Stack:
sun.misc.Unsafe.park(Native Method)
java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)

java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUninterruptibly(AbstractQueuedSynchronizer.java:1976)

org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager$EndpointShmManager.allocSlot(DfsClientShmManager.java:255)

org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager.allocSlot(DfsClientShmManager.java:434)

org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.allocShmSlot(ShortCircuitCache.java:1017)

org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:476)

org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:784)

org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:718)

org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)

org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1181)

org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1118)
org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1478)
org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1441)
org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:111)

org.apache.orc.impl.RecordReaderUtils$DefaultDataReader.readStripeFooter(RecordReaderUtils.java:166)

org.apache.hadoop.hive.llap.io.metadata.OrcStripeMetadata.(OrcStripeMetadata.java:64)

org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.readStripesMetadata(OrcEncodedDataReader.java:622)
{code}

The thread that is expected to signal those threads is calling the 
{{DomainSocketWatcher::add()}} method, but it gets stuck there handling 
InterruptedException indefinitely. The jstack looks like:
{code}
Thread 44417 (TezTR-257387_2840_12_10_52_0):
  State: RUNNABLE
  Blocked count: 3
  Wtaited count: 5
  Stack:
java.lang.Throwable.fillInStackTrace(Native Method)
java.lang.Throwable.fillInStackTrace(Throwable.java:783)
java.lang.Throwable.(Throwable.java:250)
java.lang.Exception.(Exception.java:54)
java.lang.InterruptedException.(InterruptedException.java:57)

java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2034)

org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:325)

org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager$EndpointShmManager.allocSlot(DfsClientShmManager.java:266)

org.apache.hadoop.hdfs.shortcircuit.DfsClientShmManager.allocSlot(DfsClientShmManager.java:434)

org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.allocShmSlot(ShortCircuitCache.java:1017)

org.apache.hadoop.hdfs.BlockReaderFactory.createShortCircuitReplicaInfo(BlockReaderFactory.java:476)

org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.create(ShortCircuitCache.java:784)

org.apache.hadoop.hdfs.shortcircuit.ShortCircuitCache.fetchOrCreate(ShortCircuitCache.java:718)

org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:422)
org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:333)

org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1181)

org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1118)
org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1478)
org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1441)
org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
{code}
The whole job makes no progress because of this.

The thread in {{DomainSocketWatcher::add()}} is expected to eventually break 
the while loop in which it waits for the newly added entry to be deleted by 
another thread. However, if this thread is ever interrupted, chances are that 
it will hold the lock forever so {{if(!toAdd.contains
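
The interrupt-safe idiom would be to remember the interruption instead of 
re-raising it inside the loop, roughly (a sketch of the idea, not the 
committed patch; {{toAdd}}, {{entry}} and {{processedCond}} are the existing 
fields in DomainSocketWatcher):
{code}
boolean interrupted = false;
try {
  while (toAdd.contains(entry)) {
    try {
      processedCond.await();
    } catch (InterruptedException e) {
      interrupted = true;  // remember it; keep waiting for the watcher thread
    }
  }
} finally {
  if (interrupted) {
    Thread.currentThread().interrupt();  // restore the status for callers
  }
}
{code}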

Re: [VOTE] Release Apache Hadoop 2.8.0 (RC3)

2017-03-17 Thread Mingliang Liu
Thanks Junping for doing this.

+1 (non-binding)

0. Download the src tar.gz file; checked the MD5 checksum
1. Build Hadoop from source successfully
2. Deploy a single node cluster and start the cluster successfully
3. Operate the HDFS from command line: ls, put, distcp, dfsadmin etc
4. Run hadoop mapreduce examples: grep
5. Operate AWS S3 using the S3A scheme from the command line: ls, cat, distcp
6. Check the HDFS service logs

L

> On Mar 17, 2017, at 2:18 AM, Junping Du  wrote:
> 
> Hi all,
> With fix of HDFS-11431 get in, I've created a new release candidate (RC3) 
> for Apache Hadoop 2.8.0.
> 
> This is the next minor release to follow up 2.7.0 which has been released 
> for more than 1 year. It comprises 2,900+ fixes, improvements, and new 
> features. Most of these commits are released for the first time in branch-2.
> 
>  More information about the 2.8.0 release plan can be found here: 
> https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.8+Release
> 
>  New RC is available at: 
> http://home.apache.org/~junping_du/hadoop-2.8.0-RC3
> 
>  The RC tag in git is: release-2.8.0-RC3, and the latest commit id is: 
> 91f2b7a13d1e97be65db92ddabc627cc29ac0009
> 
>  The maven artifacts are available via repository.apache.org at: 
> https://repository.apache.org/content/repositories/orgapachehadoop-1057
> 
>  Please try the release and vote; the vote will run for the usual 5 days, 
> ending on 03/22/2017 PDT time.
> 
> Thanks,
> 
> Junping



[jira] [Created] (HADOOP-14194) Aliyun OSS should not use empty endpoint as default

2017-03-16 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14194:
--

 Summary: Aliyun OSS should not use empty endpoint as default
 Key: HADOOP-14194
 URL: https://issues.apache.org/jira/browse/HADOOP-14194
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/oss
Reporter: Mingliang Liu
Assignee: Xiaobing Zhou


In {{AliyunOSSFileSystemStore::initialize()}}, it retrieves the endPoint, 
using an empty string as the default value.
{code}
String endPoint = conf.getTrimmed(ENDPOINT_KEY, "");
{code}

The plain value is passed to OSSClient without validation. If the endPoint is 
not provided (empty string) or not valid, users will get an exception from the 
Aliyun OSS SDK with a raw message like:
{code}
java.lang.IllegalArgumentException: java.net.URISyntaxException: Expected 
authority at index 8: https://

at com.aliyun.oss.OSSClient.toURI(OSSClient.java:359)
at com.aliyun.oss.OSSClient.setEndpoint(OSSClient.java:313)
at com.aliyun.oss.OSSClient.(OSSClient.java:297)
at 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystemStore.initialize(AliyunOSSFileSystemStore.java:134)
at 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem.initialize(AliyunOSSFileSystem.java:272)
at 
org.apache.hadoop.fs.aliyun.oss.AliyunOSSTestUtils.createTestFileSystem(AliyunOSSTestUtils.java:63)
at 
org.apache.hadoop.fs.aliyun.oss.TestAliyunOSSFileSystemContract.setUp(TestAliyunOSSFileSystemContract.java:47)
at junit.framework.TestCase.runBare(TestCase.java:139)
at junit.framework.TestResult$1.protect(TestResult.java:122)
at junit.framework.TestResult.runProtected(TestResult.java:142)
at junit.framework.TestResult.run(TestResult.java:125)
at junit.framework.TestCase.run(TestCase.java:129)
at junit.framework.TestSuite.runTest(TestSuite.java:255)
at junit.framework.TestSuite.run(TestSuite.java:250)
at 
org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
at 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)
Caused by: java.net.URISyntaxException: Expected authority at index 8: https://
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.failExpecting(URI.java:2854)
at java.net.URI$Parser.parseHierarchical(URI.java:3102)
at java.net.URI$Parser.parse(URI.java:3053)
at java.net.URI.(URI.java:588)
at com.aliyun.oss.OSSClient.toURI(OSSClient.java:357)
{code}

Let's check that endPoint is not null or empty, catch the 
IllegalArgumentException, log it, and wrap the exception with a clearer message 
stating the misconfiguration in endpoint or credentials.
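
Something along these lines (a sketch, not the final patch):
{code}
String endPoint = conf.getTrimmed(ENDPOINT_KEY, "");
if (StringUtils.isEmpty(endPoint)) {
  throw new IllegalArgumentException("Aliyun OSS endpoint should not be "
      + "null or empty. Please set proper endpoint with 'fs.oss.endpoint'.");
}
{code}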



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14192) Aliyun OSS FileSystem contract test should implement getTestBaseDir()

2017-03-16 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14192:
--

 Summary: Aliyun OSS FileSystem contract test should implement 
getTestBaseDir()
 Key: HADOOP-14192
 URL: https://issues.apache.org/jira/browse/HADOOP-14192
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/oss
Reporter: Mingliang Liu
Assignee: Mingliang Liu


[HADOOP-14170] is the recent effort of improving the file system contract 
tests in {{FileSystemContractBaseTest}}: it makes the {{path()}} method final 
and adds a new method {{getTestBaseDir()}} for subclasses to implement. Aliyun 
OSS should override that, as it uses a unique directory (named with the fork 
id) to support parallel tests. Plus, the current {{testWorkingDirectory}} 
override is no longer needed per the changes in 
{{FileSystemContractBaseTest}}.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-13959) S3guard: replace dynamo.describe() call in init with more efficient query

2017-03-15 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-13959.

Resolution: Fixed

As [HADOOP-13985] is committed, I think we can resolve this one. Feel free to 
re-open if necessary.

> S3guard: replace dynamo.describe() call in init with more efficient query
> -
>
> Key: HADOOP-13959
> URL: https://issues.apache.org/jira/browse/HADOOP-13959
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Steve Loughran
>    Assignee: Mingliang Liu
>Priority: Minor
>
> HADOOP-13908 adds initialization when a table isn't created, using the 
> {{describe()}} call.
> AWS document this as inefficient, and throttle it. We should be able to get 
> away with a simple table lookup as the probe



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14180) FileSystem contract tests to replace JUnit 3 with 4

2017-03-13 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14180:
--

 Summary: FileSystem contract tests to replace JUnit 3 with 4
 Key: HADOOP-14180
 URL: https://issues.apache.org/jira/browse/HADOOP-14180
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Mingliang Liu


This is from discussion in [HADOOP-14170], as Steve commented:
{quote}
...it's time to move this to JUnit 4, annotate all tests with @test, and make 
the test cases skip if they don't have the test FS defined. JUnit 3 doesn't 
support Assume, so when I do test runs without the s3n or s3 fs specced, I get 
lots of errors I just ignore.
...Move to Junit 4, and, in our own code, find everywhere we've subclassed a 
method to make the test a no-op, and insert an Assume.assumeTrue(false) in 
there so they skip properly.
{quote}
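
In practice that would look roughly like this (a sketch with a hypothetical 
class name):
{code}
public class ITestSomeFSContract extends FileSystemContractBaseTest {
  @Before
  public void checkTestFS() {
    // JUnit 4: skip the test cleanly instead of erroring out
    // when no test filesystem is configured.
    Assume.assumeTrue("No test filesystem defined", fs != null);
  }
}
{code}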



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14170) FileSystemContractBaseTest is not cleaning up test directory correctly

2017-03-09 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14170:
--

 Summary: FileSystemContractBaseTest is not cleaning up test 
directory correctly
 Key: HADOOP-14170
 URL: https://issues.apache.org/jira/browse/HADOOP-14170
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs
Reporter: Mingliang Liu
Assignee: Mingliang Liu


In the {{FileSystemContractBaseTest::tearDown()}} method, it cleans up the 
{{path("/test")}} directory, which will be qualified as {{/test}} (against the 
root instead of the working directory, because it's absolute):
{code}
  @Override
  protected void tearDown() throws Exception {
try {
  if (fs != null) {
fs.delete(path("/test"), true);
  }
} catch (IOException e) {
  LOG.error("Error deleting /test: " + e, e);
}
  }
{code}
But the tests sometimes use {{path("test")}}, which will be qualified against 
the working directory (e.g. {{/user/bob/test}}).

This makes some tests fail intermittently, e.g. {{ITestS3AFileSystemContract}}. 
Also see the discussion in [HADOOP-13934].
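
One possible shape of the fix (a sketch; the method name is an assumption and 
details may change in the patch):
{code}
// Qualify test paths against a single base dir that subclasses can
// override (e.g. for per-fork unique directories), and delete exactly
// that dir in tearDown().
protected Path getTestBaseDir() {
  return path("/test");
}

@Override
protected void tearDown() throws Exception {
  if (fs != null) {
    fs.delete(getTestBaseDir(), true);
  }
}
{code}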



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14153) ADL module has messed-up doc structure

2017-03-07 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14153:
--

 Summary: ADL module has messed-up doc structure
 Key: HADOOP-14153
 URL: https://issues.apache.org/jira/browse/HADOOP-14153
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/adl
Reporter: Mingliang Liu
Assignee: Mingliang Liu






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-14129) ITestS3ACredentialsInURL sometimes fails

2017-03-01 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reopened HADOOP-14129:


> ITestS3ACredentialsInURL sometimes fails
> 
>
> Key: HADOOP-14129
> URL: https://issues.apache.org/jira/browse/HADOOP-14129
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: HADOOP-13345
>Reporter: Sean Mackrory
>Assignee: Sean Mackrory
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-14129-HADOOP-13345.001.patch, 
> HADOOP-14129-HADOOP-13345.002.patch
>
>
> This test sometimes fails. I believe it's expected that DynamoDB doesn't have 
> access to the credentials if they're embedded in the URL instead of the 
> configuration (and IMO that's fine - since the functionality hasn't been in 
> previous releases and since we want to discourage this practice especially 
> now that there are better alternatives). Weirdly, I only sometimes get this 
> failure on the HADOOP-13345 branch. But if the problem turns out to be what I 
> think it is, a simple Assume should fix it.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14135) Remove URI parameter in AWSCredentialProvider constructors

2017-02-28 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14135:
--

 Summary: Remove URI parameter in AWSCredentialProvider constructors
 Key: HADOOP-14135
 URL: https://issues.apache.org/jira/browse/HADOOP-14135
 Project: Hadoop Common
  Issue Type: Sub-task
Reporter: Mingliang Liu
Assignee: Mingliang Liu


This was from a comment in [HADOOP-13252].
It looks like the URI parameter is not needed for our AWSCredentialProvider 
constructors. The URI parameter used to be useful because we relied on it for 
retrieving user:pass. Now when binding URIs, we have
{code}
S3xLoginHelper.Login creds = getAWSAccessKeys(binding, conf);
credentials.add(new BasicAWSCredentialsProvider(
    creds.getUser(), creds.getPassword()));
{code}
This way, we only need a configuration object (if necessary) for all 
AWSCredentialProvider implementations. The benefit is that, if we create an 
AWSCredentialProvider list for DynamoDB, we don't have to pass down the 
associated file system URI. This might be useful for S3Guard tools.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14130) Simplify DynamoDBClientFactory for creating Amazon DynamoDB clients

2017-02-27 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14130:
--

 Summary: Simplify DynamoDBClientFactory for creating Amazon 
DynamoDB clients
 Key: HADOOP-14130
 URL: https://issues.apache.org/jira/browse/HADOOP-14130
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


So, we are using the deprecated {{AmazonDynamoDBClient}} class to create a 
DynamoDB client instead of the recommended builder. We had a discussion in 
[HADOOP-13345] about preferring region over endpoint for users to specify the 
DynamoDB region (if the associated S3 region is unknown or different). We have 
reported inconsistent behavior in [HADOOP-14027] when the endpoint and the S3 
region differ. We also noticed that {{DynamoDBMetadataStore}} may sometimes log 
a nonsense region. And in [HADOOP-13252], we also felt that the file system URI 
is not needed to create an {{AWSCredentialProvider}}. Consequently, we don't 
need to pass down the file system URI for creating a DynamoDB client.

So this JIRA is to change this, best effort.
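
i.e. move to the recommended builder, roughly (a sketch; {{credentials}}, 
{{region}} and {{awsConf}} are contextual values):
{code}
AmazonDynamoDB ddb = AmazonDynamoDBClientBuilder.standard()
    .withCredentials(credentials)
    .withRegion(region)               // prefer region over raw endpoint
    .withClientConfiguration(awsConf)
    .build();
{code}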



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-13994) explicitly declare the commons-lang3 dependency as 3.4

2017-02-24 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-13994.

Resolution: Fixed

> explicitly declare the commons-lang3 dependency as 3.4
> --
>
> Key: HADOOP-13994
> URL: https://issues.apache.org/jira/browse/HADOOP-13994
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, fs/azure, fs/s3
>Affects Versions: 3.0.0-alpha2, HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-13994-HADOOP-13345-001.patch, 
> HADOOP-13994-HADOOP-13445-001.patch
>
>
> Other people aren't seeing this (yet?), but unless you explicitly exclude v 
> 3.4 of commons-lang3 from the azure build (which HADOOP-13660 does), then the 
> dependency declaration of commons-lang3 v 3.3.2 is creating a resolution 
> conflict. That's a dependency only needed for the local dynamodb & tests.
> I propose to fix this in s3guard by explicitly declaring the version used in 
> the tests to be that of the azure-storage one, excluding that you get for 
> free. It doesn't impact anything shipped in production, but puts the hadoop 
> build in control of what versions of commons-lang are coming in everywhere by 
> way of the commons-config version declared in hadoop-common



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-13994) explicitly declare the commons-lang3 dependency as 3.4

2017-02-24 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reopened HADOOP-13994:


> explicitly declare the commons-lang3 dependency as 3.4
> --
>
> Key: HADOOP-13994
> URL: https://issues.apache.org/jira/browse/HADOOP-13994
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build, fs/azure, fs/s3
>Affects Versions: 3.0.0-alpha2, HADOOP-13345
>Reporter: Steve Loughran
>Assignee: Steve Loughran
> Fix For: HADOOP-13345
>
> Attachments: HADOOP-13994-HADOOP-13345-001.patch, 
> HADOOP-13994-HADOOP-13445-001.patch
>
>
> Other people aren't seeing this (yet?), but unless you explicitly exclude v 
> 3.4 of commons-lang3 from the azure build (which HADOOP-13660 does), then the 
> dependency declaration of commons-lang3 v 3.3.2 is creating a resolution 
> conflict. That's a dependency only needed for the local dynamodb & tests.
> I propose to fix this in s3guard by explicitly declaring the version used in 
> the tests to be that of the azure-storage one, excluding that you get for 
> free. It doesn't impact anything shipped in production, but puts the hadoop 
> build in control of what versions of commons-lang are coming in everywhere by 
> way of the commons-config version declared in hadoop-common



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-14091) AbstractFileSystem implementaion for 'wasbs' scheme

2017-02-23 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-14091.

Resolution: Fixed

Sure, but let's track that in our private channel; this JIRA is for the 
community effort. Thanks

> AbstractFileSystem implementaion for 'wasbs' scheme
> ---
>
> Key: HADOOP-14091
> URL: https://issues.apache.org/jira/browse/HADOOP-14091
> Project: Hadoop Common
>  Issue Type: Task
>  Components: fs/azure
> Environment: humboldt
>Reporter: Varada Hemeswari
>Assignee: Varada Hemeswari
>  Labels: SECURE, WASB
> Fix For: 2.8.0, 3.0.0-alpha3
>
> Attachments: HADOOP-14091.001.patch, HADOOP-14091.002.patch
>
>
> Currently  org.apache.hadoop.fs.azure.Wasb provides AbstractFileSystem 
> implementation for 'wasb' scheme.
> This task refers to providing AbstractFileSystem implementation for 'wasbs' 
> scheme



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14107) ITestS3GuardListConsistency fails intermittently

2017-02-22 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14107:
--

 Summary: ITestS3GuardListConsistency fails intermittently
 Key: HADOOP-14107
 URL: https://issues.apache.org/jira/browse/HADOOP-14107
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: HADOOP-13345
Reporter: Mingliang Liu


{code}
mvn -Dit.test='ITestS3GuardListConsistency' -Dtest=none -Dscale -Ds3guard 
-Ddynamo -q clean verify

---
 T E S T S
---
Running org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.544 sec <<< 
FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency
testListStatusWriteBack(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  
Time elapsed: 3.147 sec  <<< FAILURE!
java.lang.AssertionError: Unexpected number of results from metastore. 
Metastore should only know about /XYZ: 
DirListingMetadata{path=s3a://mliu-s3guard/test/ListStatusWriteBack, 
listMap={s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/XYZ;
 isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}, 
s3a://mliu-s3guard/test/ListStatusWriteBack/123=PathMetadata{fileStatus=S3AFileStatus{path=s3a://mliu-s3guard/test/ListStatusWriteBack/123;
 isDirectory=true; modification_time=0; access_time=0; owner=mliu; group=mliu; 
permission=rwxrwxrwx; isSymlink=false} isEmptyDirectory=true}}, 
isAuthoritative=false}
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.assertTrue(Assert.java:41)
at 
org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testListStatusWriteBack(ITestS3GuardListConsistency.java:127)
{code}

See discussion on the parent JIRA [HADOOP-13345].



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14106) Fix deprecated FileSystem inefficient calls in unit test

2017-02-22 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14106:
--

 Summary: Fix deprecated FileSystem inefficient calls in unit test
 Key: HADOOP-14106
 URL: https://issues.apache.org/jira/browse/HADOOP-14106
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Mingliang Liu


[HADOOP-13321] deprecates FileSystem APIs that promote inefficient call 
patterns. There are existing uses of these patterns in tests, and this JIRA is 
to address them.
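
The typical replacement is a single {{getFileStatus()}} call whose result is 
reused, e.g. (a sketch):
{code}
// Deprecated convenience call (one getFileStatus() behind the scenes):
assertTrue(fs.isFile(path));
// Preferred: fetch the status once and reuse it.
FileStatus status = fs.getFileStatus(path);
assertTrue(status.isFile());
{code}
The warnings currently emitted: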

{code}
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/contract/AbstractFSContractTestBase.java:[372,10]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[249,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[251,16]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[264,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[266,16]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[280,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[282,16]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[288,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[290,16]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[306,14]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[308,16]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[313,14]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[315,16]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[340,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[342,16]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[351,14]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[353,16]
 [deprecation] isDirectory(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[400,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[759,14]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java:[761,16]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/FileSystemContractBaseTest.java:[124,18]
 [deprecation] isFile(Path) in FileSystem has been deprecated
[WARNING] 
/testptch/hadoop/hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs

[jira] [Created] (HADOOP-14102) Relax error message assertion in S3A test ITestS3AEncryptionSSEC::testCreateFileAndReadWithDifferentEncryptionKey

2017-02-21 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14102:
--

 Summary: Relax error message assertion in S3A test 
ITestS3AEncryptionSSEC::testCreateFileAndReadWithDifferentEncryptionKey
 Key: HADOOP-14102
 URL: https://issues.apache.org/jira/browse/HADOOP-14102
 Project: Hadoop Common
  Issue Type: Test
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu
Priority: Trivial


[HADOOP-13075] added support for SSE-KMS and SSE-C in the s3a filesystem, 
along with the integration test 
{{ITestS3AEncryptionSSEC::testCreateFileAndReadWithDifferentEncryptionKey}}. 
For the {{AccessDeniedException}} test case, it assumes the error message 
contains the string _Forbidden (Service: Amazon S3; Status Code: 403;_, which 
is true for the current AWS Java SDK and the current S3AFileSystem code path.

When enabling S3Guard (see the feature JIRA [HADOOP-13345]), the code path 
that fails in {{S3AFileSystem}} changes for that test. Specifically, the 
request w/o S3Guard calls {{getFileStatus()}} and fails with an access denied 
exception containing the _Forbidden_ keyword, while the request w/ S3Guard is 
able to call {{getFileStatus()}} successfully and then fails later, during 
read operations, with an access denied exception containing the _Access 
Denied_ keyword.

The AWS SDK does not provide exactly the same error message for different 
AccessDeniedExceptions. Moreover, the AWS SDK may evolve (we have been 
upgrading the SDK version in a timely manner), so the error message may be 
different in the future. This is to relax the exception message assertion in 
the test.

Thanks [~fabbri] for discussion. See [HADOOP-13075].
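
E.g. assert on the exception type rather than the SDK's message text (a 
sketch; {{fsKeyB}} and {{fileToRead}} stand for the filesystem with a 
different key and the file under test):
{code}
try {
  fsKeyB.getFileStatus(fileToRead);  // or the subsequent read
  fail("Expected an access denied error");
} catch (AccessDeniedException e) {
  // expected; don't pin the assertion to "Forbidden" vs "Access Denied"
}
{code}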



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-14079) Fix breaking link in s3guard.md

2017-02-13 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-14079:
--

 Summary: Fix breaking link in s3guard.md
 Key: HADOOP-14079
 URL: https://issues.apache.org/jira/browse/HADOOP-14079
 Project: Hadoop Common
  Issue Type: Bug
  Components: fs/s3
Affects Versions: HADOOP-13345
Reporter: Mingliang Liu
Assignee: Mingliang Liu
Priority: Trivial


See the initial patch.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13960) Initialize DynamoDBMetadataStore without associated S3AFileSystem

2017-01-08 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-13960:
--

 Summary: Initialize DynamoDBMetadataStore without associated 
S3AFileSystem
 Key: HADOOP-13960
 URL: https://issues.apache.org/jira/browse/HADOOP-13960
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Reporter: Mingliang Liu
Assignee: Mingliang Liu


Per the discussion in [HADOOP-13650], it's helpful to initialize a 
DynamoDBMetadataStore object without an associated S3AFileSystem. In the 
current code, we can achieve this via 
{{DynamoDBMetadataStore#initialize(Configuration)}}. However, users still have 
to provide the associated S3AFileSystem URI in the configuration, either by 
setting {{fs.defaultFS}} in the configuration file or by passing the {{-fs 
s3://bucket}} command line parameter. Setting the default FileSystem in 
configuration seems unnecessary when the command line only manipulates the 
metadata store, e.g. command line tools run against an existing HDFS cluster.

This JIRA is to track the effort of initializing a DynamoDBMetadataStore 
without associating any S3 buckets, so that an S3AFileSystem is not needed. 
Users have to specify in configuration the DynamoDB endpoint (for the region) 
and the table name, along with credentials and AWS client settings.

See [~eddyxu] and [~liuml07]'s comments in [HADOOP-13650] for more details.
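
The intended usage would look roughly like this (a sketch; the config keys are 
the branch-era names and may evolve):
{code}
Configuration conf = new Configuration();
conf.set("fs.s3a.s3guard.ddb.table", "my-metadata-table");
conf.set("fs.s3a.s3guard.ddb.endpoint",
    "dynamodb.us-west-2.amazonaws.com");  // determines the region
MetadataStore ms = new DynamoDBMetadataStore();
ms.initialize(conf);  // no S3AFileSystem or fs.defaultFS required
{code}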



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Reopened] (HADOOP-13946) Document how HDFS updates timestamps in the FS spec; compare with object stores

2017-01-03 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reopened HADOOP-13946:


> Document how HDFS updates timestamps in the FS spec; compare with object 
> stores
> ---
>
> Key: HADOOP-13946
> URL: https://issues.apache.org/jira/browse/HADOOP-13946
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation, fs
>Affects Versions: 2.7.3
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HADOOP-13946-001.patch
>
>
> SPARK-17159 shows that the behavior of when HDFS updates timestamps isn't 
> well documented. Document these in the FS spec.
> I'm not going to add tests for this, as it is so very dependent on FS 
> implementations, as in "POSIX filesystems may behave differently from HDFS". 
> If someone knows what happens there, their contribution is welcome.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-13938) Address javadoc errors in Azure WASB.

2016-12-26 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-13938.

Resolution: Duplicate

> Address javadoc errors in Azure WASB.
> -
>
> Key: HADOOP-13938
> URL: https://issues.apache.org/jira/browse/HADOOP-13938
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: azure, fs/azure
>Affects Versions: 2.8.0
>Reporter: Dushyanth
>
> As observed in HADOOP-13863, there are few javadocs error that are thrown 
> while building WASB in QA builds. This JIRA is created to track the fix for 
> these javadoc errors.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13937) Mock bucket locations in MockS3ClientFactory

2016-12-22 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-13937:
--

 Summary: Mock bucket locations in MockS3ClientFactory
 Key: HADOOP-13937
 URL: https://issues.apache.org/jira/browse/HADOOP-13937
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: HADOOP-13345
Reporter: Mingliang Liu
Assignee: Mingliang Liu
Priority: Minor


Currently the {{MockS3ClientFactory}} does not mock the bucket locations. One 
effect is that {{TestDynamoDBMetadataStore}} will get a null region for the 
bucket.
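
A sketch of the missing stub, assuming Mockito (the factory already returns a 
mocked client):
{code}
AmazonS3 s3 = mock(AmazonS3.class);
// Return a fixed location so TestDynamoDBMetadataStore sees a real
// region instead of null.
when(s3.getBucketLocation(anyString())).thenReturn("us-west-2");
{code}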



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13908) Existing tables may not be initialized correctly in DynamoDBMetadataStore

2016-12-14 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-13908:
--

 Summary: Existing tables may not be initialized correctly in 
DynamoDBMetadataStore
 Key: HADOOP-13908
 URL: https://issues.apache.org/jira/browse/HADOOP-13908
 Project: Hadoop Common
  Issue Type: Sub-task
  Components: fs/s3
Affects Versions: HADOOP-13345
Reporter: Mingliang Liu
Assignee: Mingliang Liu


This was based on discussion in [HADOOP-13455]. Though we should not create 
the table unless the config {{fs.s3a.s3guard.ddb.table.create}} is set to 
true, we still have to get the existing table in 
{{DynamoDBMetadataStore#initialize()}} and wait for it to become active before 
any table/item operations.
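
i.e. roughly (a sketch):
{code}
table = dynamoDB.getTable(tableName);
try {
  // Bind to the existing table and block until it is ACTIVE,
  // whether or not we created it ourselves.
  table.waitForActive();
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  throw new InterruptedIOException("Interrupted while waiting for table "
      + tableName + " to become active");
}
{code}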



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13813) TestDelegationTokenFetcher#testDelegationTokenWithoutRenewer is failing

2016-11-11 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-13813:
--

 Summary: 
TestDelegationTokenFetcher#testDelegationTokenWithoutRenewer is failing
 Key: HADOOP-13813
 URL: https://issues.apache.org/jira/browse/HADOOP-13813
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Mingliang Liu
Assignee: Mingliang Liu


[HADOOP-13720] added more info to the messages printed in 
{{AbstractDelegationTokenSecretManager}} for better supportability, which is 
good. Unfortunately, the unit test 
{{TestDelegationTokenFetcher#testDelegationTokenWithoutRenewer}} that asserts 
the message string was not updated accordingly. The unit test is failing in 
both {{trunk}} and {{branch-2}} branches; see example builds 
[1|https://issues.apache.org/jira/browse/HDFS-11129?focusedCommentId=15657488&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15657488],
 
[2|https://issues.apache.org/jira/browse/HDFS-11130?focusedCommentId=15658086&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15658086],
 and 
[3|https://issues.apache.org/jira/browse/HDFS-7?focusedCommentId=15656939&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15656939].

This JIRA is to track the effort of fixing this.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: Multiple node Hadoop cluster on Docker for test and debugging

2016-11-09 Thread Mingliang Liu
I know a similar tool; it has Ambari and Spark integration as well.

https://github.com/weiqingy/caochong 

Thanks,

L

> On Nov 9, 2016, at 5:38 AM, Sasaki Kai  wrote:
> 
> Hi Hadoop developers
> 
> The other day I created a tool for launching multiple node hadoop cluster on 
> docker container.
> You can easily launch multiple node hadoop cluster from your Hadoop source 
> code. 
> It is useful for testing and debugging. Actually I often use it before 
> submitting a patch to Hadoop project.
> https://github.com/Lewuathe/docker-hadoop-cluster 
> 
> 
> And I also updated to build the latest trunk image automatically and upload 
> onto Docker Hub.
> So you can easily check and test the latest trunk branch in the environment 
> which is more close to actual usage.
> 
> If you already installed docker and docker-compose, what needed is 
> docker-compose.yml like this.
> 
> version: '2'
> 
> services:
>  master:
>image: lewuathe/hadoop-master
>ports:
>  - "9870:9870"
>  - "8088:8088"
>  - "19888:19888"
>  - "8188:8188"
>container_name: "master"
>  slave1:
>image: lewuathe/hadoop-slave
>container_name: "slave1"
>depends_on:
>  - master
>ports:
>  - "9901:9864"
>  - "8041:8042"
>  slave2:
>image: lewuathe/hadoop-slave
>container_name: "slave2"
>depends_on:
>  - master
>ports:
>  - "9902:9864"
>  - "8042:8042"
> 
> The usage in detail is described in the repository.
> https://github.com/Lewuathe/docker-hadoop-cluster/blob/master/README.md 
> 
> 
> I would be glad if you use this tool for developing and debugging and make 
> our development more efficient.
> Please give me any feedbacks to me. Thanks you!
> 
> 
> Kai Sasaki
> mail: lewua...@me.com 
> github: https://github.com/Lewuathe 
> 
> 



Re: [DISCUSS] Pre-commit build in Windows platform

2016-10-30 Thread Mingliang Liu
I also like the idea of having a pre-commit build on Windows, if only the 
pre-commit infrastructure were reliable. If this is a pain, I see little value 
in making Windows unit test failures release blockers.

Thanks,

Sent from my iPhone

> On Oct 28, 2016, at 8:16 AM, Allen Wittenauer  
> wrote:
> 
> 
>> On Oct 27, 2016, at 8:20 PM, Brahma Reddy Battula 
>>  wrote:
>> 
>> As we supporting the Hadoop in windows, I feel, we should have pre-commit 
>> build in windows( atleast in qbt).
> 
> 
>I actually tried to get Apache Yetus testing Apache Hadoop on the 
> hadoop-win box last year.  (This was before qbt mode existed.)  I gave up 
> because the components needed to build trunk weren't installed and it looked 
> like the box itself was ill. I moved on to the Mac build which, while tricky, 
> got it working with a bit of magic.  Then they took the Mac away.  IBM 
> provided access to a PowerPC machine, so I moved onto the ppc64le build. 
> After a lot of hand wringing, we got it up and running and stable enough to 
> show how broken the build is by our usage of leveldbjni.
> 
>It should be noted that getting infra support for non-Linux/x86 builds is 
> a pretty major exercise in frustration.  They don't really have the time, 
> those boxes tend to go down often, and rarely get updated. (The Solaris box 
> is running a really old build of Solaris 10.  So old, that when it came out, 
> I was still employed by Sun...)  If folks are actually serious about adding 
> more platforms, then I'd suggest that some of these big vendors actually 
> cough up some cash to move the precommit infrastructure off of the ASF and 
> onto something more reliable with more platform diversity. 
> -
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> 

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13697) LogLevel#main throws exception if no arguments provided

2016-10-07 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-13697:
--

 Summary: LogLevel#main throws exception if no arguments provided
 Key: HADOOP-13697
 URL: https://issues.apache.org/jira/browse/HADOOP-13697
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 2.9.0
Reporter: Mingliang Liu
Assignee: Mingliang Liu


{code}
root@b9ab37566005:/# hadoop daemonlog

Usage: General options are:
[-getlevel   [-protocol (http|https)]
[-setlevel[-protocol (http|https)]

Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: No 
arguments specified
at org.apache.hadoop.log.LogLevel$CLI.parseArguments(LogLevel.java:138)
at org.apache.hadoop.log.LogLevel$CLI.run(LogLevel.java:106)
at org.apache.hadoop.log.LogLevel.main(LogLevel.java:70)
{code}

I think we can catch the exception in the main method and log an error 
message, instead of throwing the stack trace, which may frustrate users.
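
For example (a sketch of the idea, not the final patch):
{code}
public static void main(String[] args) throws Exception {
  CLI cli = new CLI(new Configuration());
  try {
    System.exit(cli.run(args));
  } catch (HadoopIllegalArgumentException e) {
    // Print a one-line error (usage is already printed) instead of
    // dumping the stack trace at the user.
    System.err.println(e.getMessage());
    System.exit(-1);
  }
}
{code}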



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-11513) Artifact errors with Maven build

2016-10-06 Thread Mingliang Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-11513.

Resolution: Cannot Reproduce

Feel free to re-open if it happens again. Thanks.

> Artifact errors with Maven build
> 
>
> Key: HADOOP-11513
> URL: https://issues.apache.org/jira/browse/HADOOP-11513
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Affects Versions: 2.7.0
>Reporter: Arpit Agarwal
>
> I recently started getting the following errors with _mvn -q clean compile 
> install_ on Linux and OS X.
> {code}
> [ERROR] Artifact: org.xerial.snappy:snappy-java:jar:1.0.4.1 has no file.
> [ERROR] Artifact: xerces:xercesImpl:jar:2.9.1 has no file.
> [ERROR] Artifact: xml-apis:xml-apis:jar:1.3.04 has no file.
> [ERROR] Artifact: xmlenc:xmlenc:jar:0.52 has no file.
> [ERROR] Artifact: org.xerial.snappy:snappy-java:jar:1.0.4.1 has no file.
> [ERROR] Artifact: xerces:xercesImpl:jar:2.9.1 has no file.
> [ERROR] Artifact: xml-apis:xml-apis:jar:1.3.04 has no file.
> [ERROR] Artifact: xmlenc:xmlenc:jar:0.52 has no file.
> {code}
> _mvn --version_ on Linux reports:
> {code}
> Apache Maven 3.2.5 (12a6b3acb947671f09b81f49094c53f426d8cea1; 
> 2014-12-14T09:29:23-08:00)
> Maven home: /home/vagrant/usr/share/maven
> Java version: 1.7.0_65, vendor: Oracle Corporation
> Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "linux", version: "3.13.0-24-generic", arch: "amd64", family: "unix"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



[jira] [Created] (HADOOP-13644) Replace config key literal strings with config key names

2016-09-22 Thread Mingliang Liu (JIRA)
Mingliang Liu created HADOOP-13644:
--

 Summary: Replace config key literal strings with config key names 
 Key: HADOOP-13644
 URL: https://issues.apache.org/jira/browse/HADOOP-13644
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Mingliang Liu
Assignee: Chen Liang
Priority: Minor


There are some places that use config key literal strings instead of config key 
names, e.g.
{code:title=IOUtils.java}
copyBytes(in, out, conf.getInt("io.file.buffer.size", 4096), true);
{code}

We should replace places like this.
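
i.e. use the constants from {{CommonConfigurationKeysPublic}} instead:
{code:title=IOUtils.java}
copyBytes(in, out, conf.getInt(
    CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_KEY,
    CommonConfigurationKeysPublic.IO_FILE_BUFFER_SIZE_DEFAULT), true);
{code}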



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org


