[jira] [Resolved] (HADOOP-18592) Sasl connection failure should log remote address

2023-02-01 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-18592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-18592.

Fix Version/s: 3.4.0
   3.3.9
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to the branch-3.3 and trunk branches. Thank you for your contribution, 
[~vjasani]. Thank you all for the review.

> Sasl connection failure should log remote address
> -
>
> Key: HADOOP-18592
> URL: https://issues.apache.org/jira/browse/HADOOP-18592
> Project: Hadoop Common
>  Issue Type: Improvement
>Affects Versions: 3.3.4
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> If the Sasl connection fails with some generic error, we miss logging the 
> remote server that the client was trying to connect to.
> Sample log:
> {code:java}
> 2023-01-12 00:22:28,148 WARN  [20%2C1673404849949,1] ipc.Client - Exception 
> encountered while connecting to the server 
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
>     at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
>     at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
>     at sun.nio.ch.IOUtil.read(IOUtil.java:197)
>     at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379)
>     at 
> org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
>     at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:141)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>     at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1950)
>     at 
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
>     at 
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:623)
>     at org.apache.hadoop.ipc.Client$Connection.access$2300(Client.java:414)
> ...
> ... {code}
> We should log the remote server address.
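
For illustration, a minimal sketch of the kind of change this issue asks for 
(hypothetical names, not the committed patch): wrap the SASL handshake and 
include the remote address in the warning, so a generic failure identifies the 
server involved.

{code:java}
import java.io.IOException;
import java.net.InetSocketAddress;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch: log the remote server address on a generic SASL failure.
public class SaslConnectLogging {
  private static final Logger LOG = LoggerFactory.getLogger(SaslConnectLogging.class);

  /** Stand-in for the real SASL negotiation step. */
  interface SaslHandshake {
    void run() throws IOException;
  }

  static void saslConnect(InetSocketAddress server, SaslHandshake handshake)
      throws IOException {
    try {
      handshake.run();
    } catch (IOException e) {
      // Including the address turns an anonymous "Connection reset by peer"
      // into a message naming the server the client was talking to.
      LOG.warn("Exception encountered while connecting to the server {}", server, e);
      throw e;
    }
  }
}
{code}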






[jira] [Commented] (HADOOP-17728) Fix issue of the StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17362036#comment-17362036
 ] 

Mingliang Liu commented on HADOOP-17728:


Thanks [~Jim_Brennan]! I would like to keep it open for a while in case there 
are more comments. Apparently when this patch was discussed in the PR, it was 
considered valid. I will follow up on the discussion in [HADOOP-17758].

When reporting a bug, if you find the JIRA related to the cause, you can also 
comment directly on the original JIRA (e.g. this one) instead of opening a new 
JIRA (e.g. HADOOP-17758). That way we can track the context in one place. But 
if the patch in question is not clear, opening a new one is better.



> Fix issue of the StatisticsDataReferenceCleaner cleanUp
> ---
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}
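
As a side note, the blocking described above can be reproduced in isolation; a 
minimal, self-contained sketch of the ReferenceQueue semantics (illustrative 
only, not Hadoop code):

{code:java}
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// ReferenceQueue.remove() (i.e. remove(0)) ends up in lock.wait(0) and returns
// only after the GC enqueues a reference -- so the "blocked" cleaner thread is
// idle waiting for work, not deadlocked.
public class ReferenceQueueBlockingDemo {
  public static void main(String[] args) throws InterruptedException {
    ReferenceQueue<Object> queue = new ReferenceQueue<>();
    Object referent = new Object();
    PhantomReference<Object> ref = new PhantomReference<>(referent, queue);

    referent = null;  // drop the only strong reference to the referent

    Reference<?> removed = null;
    while (removed == null) {
      System.gc();                  // nudge the collector; enqueueing is not guaranteed per call
      removed = queue.remove(100);  // like remove(0), but bounded so the demo cannot hang
    }
    System.out.println("Dequeued the phantom reference: " + (removed == ref));
  }
}
{code}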






[jira] [Commented] (HADOOP-17758) NPE and excessive warnings after HADOOP-17728

2021-06-11 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361509#comment-17361509
 ] 

Mingliang Liu commented on HADOOP-17758:


I have not checked the failure details here carefully, but clearly the NPE is a 
bug. I have reverted the original change in the 3.2/3.3/3.4 branches. Thanks for 
reporting.

> NPE and excessive warnings after HADOOP-17728
> -
>
> Key: HADOOP-17758
> URL: https://issues.apache.org/jira/browse/HADOOP-17758
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> I'm noticing these warnings and NPEs when just running a simple pi test on a 
> one-node cluster:
> {noformat}
> 2021-06-09 21:51:12,334 WARN  
> [org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner] 
> fs.FileSystem (FileSystem.java:run(4025)) - Exception in the cleaner thread 
> but it will continue to run
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:4020)
>   at java.lang.Thread.run(Thread.java:748){noformat}
> This appears to be due to [HADOOP-17728].
> I'm not sure I understand why that change was made. Wasn't it by design that 
> the remove should wait until something is queued?
> [~kaifeiYi] can you please investigate?
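
For context, a hypothetical illustration of this failure mode (not the actual 
FileSystem code): if the cleaner loop swaps the blocking queue.remove() for a 
non-blocking poll(), idle iterations get null back, NPE, and spin.

{code:java}
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

public class CleanerLoopSketch {
  // Broken shape: poll() returns null whenever nothing is enqueued yet.
  static void pollingLoop(ReferenceQueue<Object> queue) {
    while (true) {
      try {
        Reference<?> ref = queue.poll();  // null on idle iterations
        ref.clear();                      // NullPointerException when ref is null
      } catch (Exception e) {
        // The real cleaner logs "Exception in the cleaner thread but it will
        // continue to run" here -- hence the excessive warnings reported above.
      }
    }
  }

  // Original, intended shape: remove() blocks until the GC enqueues a reference.
  static void blockingLoop(ReferenceQueue<Object> queue) throws InterruptedException {
    while (true) {
      Reference<?> ref = queue.remove();
      ref.clear();  // never null here
    }
  }
}
{code}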






[jira] [Updated] (HADOOP-17728) Fix issue of the StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17728:
---
Summary: Fix issue of the StatisticsDataReferenceCleaner cleanUp  (was: 
Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp)

> Fix issue of the StatisticsDataReferenceCleaner cleanUp
> ---
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






[jira] [Commented] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17361505#comment-17361505
 ] 

Mingliang Liu commented on HADOOP-17728:


Per discussion in [HADOOP-17758], I have reverted this and removed the fix 
versions.

> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






[jira] [Reopened] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reopened HADOOP-17728:


> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






[jira] [Updated] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17728:
---
Fix Version/s: (was: 3.3.2)
   (was: 3.2.3)
   (was: 3.4.0)
   Labels: pull-request-available reverted  (was: 
pull-request-available)

> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available, reverted
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






[jira] [Resolved] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-17728.

Fix Version/s: 3.2.3
   3.4.0
   3.3.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to 3.1+. Thanks for reporting and for your contribution, [~kaifeiYi]. 
Thanks for your review, [~ste...@apache.org].

> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






[jira] [Assigned] (HADOOP-17728) Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp

2021-06-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-17728:
--

Assignee: yikf

> Deadlock in FileSystem StatisticsDataReferenceCleaner cleanUp
> -
>
> Key: HADOOP-17728
> URL: https://issues.apache.org/jira/browse/HADOOP-17728
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 3.2.1
>Reporter: yikf
>Assignee: yikf
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The cleaner thread will be blocked when it calls ReferenceQueue.remove() 
> unless `queue.enqueue` is called.
> 
> As shown below, we currently call ReferenceQueue.remove() during cleanUp. 
> The call chain is as follows:
> *StatisticsDataReferenceCleaner#queue.remove() -> ReferenceQueue.remove(0) -> 
> lock.wait(0)*
> But lock.notifyAll is called only from queue.enqueue, so the cleaner thread 
> will be blocked.
>  
> ThreadDump:
> {code:java}
> "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x7f7afc088800 
> nid=0x2119 in Object.wait() [0x7f7b0023]
>java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> - waiting on <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at java.lang.Object.wait(Object.java:502)
> at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
> - locked <0xc00c2f58> (a java.lang.ref.Reference$Lock)
> at 
> java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153){code}






[jira] [Commented] (HADOOP-17222) Create socket address leveraging URI cache

2021-03-31 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17312699#comment-17312699
 ] 

Mingliang Liu commented on HADOOP-17222:


[~sodonnell] I was on call recently so did not have time to review it promptly. 
I just checked and the backport looks great. Thanks!

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DataNode for an HDFS block via this call 
> stack: DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress from 
> the host and port. The method performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling HDFS. HBase is 
> a high-frequency client of HDFS, because HBase read operations often access a 
> small data block (about 64k) instead of the entire HFile. Under such 
> high-frequency access, the NetUtils.createSocketAddr method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a large 
> part of that.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache of InetSocketAddress: the key of the cache is the 
> host and port, and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, a ConcurrentHashMap is used as the cache, 
> and ConcurrentHashMap.get() fetches entries from it. The CPU usage of 
> DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% down to 
> 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
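
A minimal sketch of the caching idea described above (illustrative names and 
size cap, not the committed NetUtils code):

{code:java}
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache InetSocketAddress instances keyed by "host:port" so hot callers skip
// the expensive URI-parsing/construction path on every lookup.
public class SocketAddressCache {
  private static final int MAX_ENTRIES = 1000;  // a hard limit like the one discussed in this thread
  private static final Map<String, InetSocketAddress> CACHE = new ConcurrentHashMap<>();

  public static InetSocketAddress get(String host, int port) {
    String key = host + ":" + port;
    InetSocketAddress addr = CACHE.get(key);
    if (addr == null) {
      addr = new InetSocketAddress(host, port);  // the expensive path
      if (CACHE.size() < MAX_ENTRIES) {          // crude cap: stop caching once full
        CACHE.put(key, addr);
      }
    }
    return addr;
  }
}
{code}

One caveat with any such cache: a cached InetSocketAddress pins the resolved 
IP, so a real implementation needs an invalidation story for when a host's 
address changes.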






[jira] [Commented] (HADOOP-17222) Create socket address leveraging URI cache

2021-03-24 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17308069#comment-17308069
 ] 

Mingliang Liu commented on HADOOP-17222:


Yes thank you!

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DataNode for an HDFS block via this call 
> stack: DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress from 
> the host and port. The method performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling HDFS. HBase is 
> a high-frequency client of HDFS, because HBase read operations often access a 
> small data block (about 64k) instead of the entire HFile. Under such 
> high-frequency access, the NetUtils.createSocketAddr method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a large 
> part of that.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache of InetSocketAddress: the key of the cache is the 
> host and port, and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, a ConcurrentHashMap is used as the cache, 
> and ConcurrentHashMap.get() fetches entries from it. The CPU usage of 
> DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% down to 
> 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]






[jira] [Commented] (HADOOP-17222) Create socket address leveraging URI cache

2021-03-22 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17306606#comment-17306606
 ] 

Mingliang Liu commented on HADOOP-17222:


Thanks [~sodonnell]. 

I guess I do not play with clusters of >1000 nodes nowadays (we have more and 
smaller clusters) and do not have a preference about a >1000 cache size. But I 
totally understand your use case. Since this is just a max size (or capacity) 
rather than the typical real size, I think it makes sense to increase the hard 
limit from 1000 to something like 3000. The number is more of an art, so 
anything above 2000 makes sense, I assume.

Yes, I think this can be backported to old branches like 3.3. I vaguely recall 
that there is no 3.4-specific logic or assumption. I will defer to [~fanrui] 
and [~hexiaoqiao] to confirm. I can help backport. So, is 3.2/3.1 also doable? 

 

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of 
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an 
> example.
>  
> The HDFS client selects the best DataNode for an HDFS block via this call 
> stack: DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress from 
> the host and port. The method performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling HDFS. HBase is 
> a high-frequency client of HDFS, because HBase read operations often access a 
> small data block (about 64k) instead of the entire HFile. Under such 
> high-frequency access, the NetUtils.createSocketAddr method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a large 
> part of that.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache of InetSocketAddress: the key of the cache is the 
> host and port, and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, a ConcurrentHashMap is used as the cache, 
> and ConcurrentHashMap.get() fetches entries from it. The CPU usage of 
> DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% down to 
> 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]






[jira] [Updated] (HADOOP-15566) Support OpenTelemetry

2021-03-16 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-15566:
---
Summary: Support OpenTelemetry  (was: Support Opentracing)

> Support OpenTelemetry
> -
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics, tracing
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Assignee: Siyao Meng
>Priority: Major
>  Labels: security
> Attachments: HADOOP-15566.000.WIP.patch, OpenTracing Support Scope 
> Doc.pdf, Screen Shot 2018-06-29 at 11.59.16 AM.png, ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.






[jira] [Assigned] (HADOOP-16887) [OpenTelemetry] Update doc for tracing

2021-03-16 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-16887:
--

Component/s: tracing
 documentation
   Assignee: Kiran Kumar Maturi
Summary: [OpenTelemetry] Update doc for tracing  (was: [OpenTracing] 
Add doc)

> [OpenTelemetry] Update doc for tracing
> --
>
> Key: HADOOP-16887
> URL: https://issues.apache.org/jira/browse/HADOOP-16887
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: documentation, tracing
>Reporter: Wei-Chiu Chuang
>Assignee: Kiran Kumar Maturi
>Priority: Major
>
> We should remove this doc 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Tracing.html
> and replace it with the OT usage in Hadoop.






[jira] [Updated] (HADOOP-17571) Upgrade com.fasterxml.woodstox:woodstox-core for security reasons

2021-03-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17571:
---
Hadoop Flags: Reviewed
  Resolution: Fixed
  Status: Resolved  (was: Patch Available)

Committed to all target branches (2.10.2+). Thank you [~vjasani] for your 
contribution. Thank you [~aajisaka] for your review.

> Upgrade com.fasterxml.woodstox:woodstox-core for security reasons
> -
>
> Key: HADOOP-17571
> URL: https://issues.apache.org/jira/browse/HADOOP-17571
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0, 3.1.5, 2.10.2, 3.2.3
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Due to security concerns (CVE: sonatype-2018-0624), we should bump up 
> woodstox-core to 5.3.0.






[jira] [Commented] (HADOOP-17571) Upgrade com.fasterxml.woodstox:woodstox-core for security reasons

2021-03-09 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17298551#comment-17298551
 ] 

Mingliang Liu commented on HADOOP-17571:


Thank you for filing this one. I have added you to the "Contributor1" list and 
assigned this Jira to you.

> Upgrade com.fasterxml.woodstox:woodstox-core for security reasons
> -
>
> Key: HADOOP-17571
> URL: https://issues.apache.org/jira/browse/HADOOP-17571
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Due to security concerns (CVE: sonatype-2018-0624), we should bump up 
> woodstox-core to 5.3.0.






[jira] [Assigned] (HADOOP-17571) Upgrade com.fasterxml.woodstox:woodstox-core for security reasons

2021-03-09 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-17571:
--

Assignee: Viraj Jasani

> Upgrade com.fasterxml.woodstox:woodstox-core for security reasons
> -
>
> Key: HADOOP-17571
> URL: https://issues.apache.org/jira/browse/HADOOP-17571
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>
> Due to security concerns (CVE: sonatype-2018-0624), we should bump up 
> woodstox-core to 5.3.0.






[jira] [Commented] (HADOOP-15566) Support Opentracing

2021-02-12 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-15566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17284045#comment-17284045
 ] 

Mingliang Liu commented on HADOOP-15566:


As the newly merged OpenTelemetry project is releasing very soon, I think we 
should move to OpenTelemetry. The current PoC patch has not been updated for a 
while, and the subtasks have not all been started. So my suggestion is to:
 # Update the design doc (v2) to reflect the change to OpenTelemetry
 # Update the subtasks to replace OpenTracing with OpenTelemetry
 # Start working on the subtasks in a feature branch, targeting 3.4.0

[~smeng] and [~weichiu], are you still interested in working on this project? 
[~kiran.maturi] has shown interest to contribute/collaborate. Thanks

> Support Opentracing
> ---
>
> Key: HADOOP-15566
> URL: https://issues.apache.org/jira/browse/HADOOP-15566
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: metrics, tracing
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Assignee: Siyao Meng
>Priority: Major
>  Labels: security
> Attachments: HADOOP-15566.000.WIP.patch, OpenTracing Support Scope 
> Doc.pdf, Screen Shot 2018-06-29 at 11.59.16 AM.png, ss-trace-s3a.png
>
>
> The HTrace incubator project has voted to retire itself and won't be making 
> further releases. The Hadoop project currently has various hooks with HTrace. 
> It seems in some cases (eg HDFS-13702) these hooks have had measurable 
> performance overhead. Given these two factors, I think we should consider 
> removing the HTrace integration. If there is someone willing to do the work, 
> replacing it with OpenTracing might be a better choice since there is an 
> active community.






[jira] [Resolved] (HADOOP-16355) ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store

2021-01-25 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-16355.

Resolution: Abandoned

As documented in [HADOOP-17480], AWS S3 is now consistent and S3Guard is not 
needed.

> ZookeeperMetadataStore: Use Zookeeper as S3Guard backend store
> --
>
> Key: HADOOP-16355
> URL: https://issues.apache.org/jira/browse/HADOOP-16355
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Mingliang Liu
>Priority: Major
>
> When S3Guard was proposed, there were a couple of valid reasons to choose 
> DynamoDB as its default backend store: 0) seamless integration as part of the 
> AWS ecosystem, e.g. client library; 1) it's a managed web service with zero 
> operational cost, highly available and infinitely scalable; 2) it's 
> performant, with single-digit latency; 3) it's proven by Netflix's S3mper 
> (not actively maintained) and EMRFS (closed in both source and usage). As 
> it's pluggable, it's possible to implement {{MetadataStore}} with another 
> backend store without changing semantics, besides the null and in-memory 
> local ones.
> Here we propose {{ZookeeperMetadataStore}}, which uses Zookeeper as the 
> S3Guard backend store. Its main motivation is to provide a new MetadataStore 
> option which:
>  # can be easily integrated, as Zookeeper is heavily used in the Hadoop 
> community
>  # offers affordable performance, as both the client and the Zookeeper 
> ensemble are usually "local" in a Hadoop cluster (ZK/HBase/Hive etc)
>  # removes the DynamoDB dependency
> Obviously, not all use cases will prefer this to the default DynamoDB store. 
> For example, ZK might not scale well if there are dozens of S3 buckets and 
> each has millions of objects. Our use case is targeting HBase storing HFiles 
> on S3 instead of HDFS. A total solution for HBase on S3 must combine HBOSS 
> (see HBASE-22149), for recovering atomicity of metadata operations like 
> rename, and S3Guard, for consistent enumeration of and access to object store 
> bucket metadata. We would like to use Zookeeper as the backend store for both.






[jira] [Commented] (HADOOP-17068) client fails forever when namenode ipaddr changed

2020-10-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17210542#comment-17210542
 ] 

Mingliang Liu commented on HADOOP-17068:


Does this apply to Hadoop 2? If so, could you provide a branch-2 patch? Thanks.

> client fails forever when namenode ipaddr changed
> -
>
> Key: HADOOP-17068
> URL: https://issues.apache.org/jira/browse/HADOOP-17068
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Sean Chow
>Assignee: Sean Chow
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HADOOP-17068.001.patch, HDFS-15390.01.patch
>
>
> For a machine replacement, I replaced my standby namenode with a new ipaddr 
> and kept the same hostname. I also updated the client's hosts file to make it 
> resolve correctly.
> When I try to run a failover to transition to the new namenode (let's say 
> nn2), the client will fail to read or write forever until it's restarted.
> That leaves the yarn nodemanager in a sick state. Even new tasks will 
> encounter this exception too, until all nodemanagers restart.
>  
> {code:java}
> 20/06/02 15:12:25 WARN ipc.Client: Address change detected. Old: 
> nn2-192-168-1-100/192.168.1.100:9000 New: nn2-192-168-1-100/192.168.1.200:9000
> 20/06/02 15:12:25 DEBUG ipc.Client: closing ipc connection to 
> nn2-192-168-1-100/192.168.1.200:9000: Connection refused
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at 
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at 
> org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
> at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:608)
> at 
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:707)
> at 
> org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1517)
> at org.apache.hadoop.ipc.Client.call(Client.java:1440)
> at org.apache.hadoop.ipc.Client.call(Client.java:1401)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
> at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
> at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:193)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
> {code}
>  
> We can see the client has {{Address change detected}}, but it still fails. I 
> found out that's because when {{updateAddress()}} returns true, 
> {{handleConnectionFailure()}} throws an exception that breaks the next retry 
> with the right ipaddr.
> Client.java: setupConnection()
> {code:java}
> } catch (ConnectTimeoutException toe) {
>   /* Check for an address change and update the local reference.
>* Reset the failure counter if the address was changed
>*/
>   if (updateAddress()) {
> timeoutFailures = ioFailures = 0;
>   }
>   handleConnectionTimeout(timeoutFailures++,
>   maxRetriesOnSocketTimeouts, toe);
> } catch (IOException ie) {
>   if (updateAddress()) {
> timeoutFailures = ioFailures = 0;
>   }
> // Because the namenode ip changed in updateAddress(), the old namenode
> // ip address cannot be accessed now. handleConnectionFailure will throw an
> // exception, so the next retry never has a chance to use the right server
> // updated in updateAddress().
>   handleConnectionFailure(ioFailures++, ie);
> }
> {code}
>  
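
A hypothetical sketch of the fix direction described above (not the committed 
patch): when updateAddress() reports a changed address, retry immediately 
instead of routing through the failure handler that rethrows.

{code:java}
import java.io.IOException;

public class RetryOnAddressChangeSketch {
  interface Transport {
    void connect() throws IOException;
    boolean updateAddress();  // true if the hostname now resolves to a new address
  }

  static void setupConnection(Transport transport, int maxRetries) throws IOException {
    int ioFailures = 0;
    while (true) {
      try {
        transport.connect();
        return;
      } catch (IOException ie) {
        if (transport.updateAddress()) {
          ioFailures = 0;  // reset counters after an address change...
          continue;        // ...and retry right away with the refreshed address
        }
        if (++ioFailures > maxRetries) {
          throw ie;  // only give up when the address is stable and retries are spent
        }
      }
    }
  }
}
{code}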






[jira] [Comment Edited] (HADOOP-17276) Extend CallerContext to make it include many items

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202490#comment-17202490
 ] 

Mingliang Liu edited comment on HADOOP-17276 at 9/26/20, 2:07 AM:
--

So, for the code reviews, I'll leave comments about the implementation in the 
pull request. Allow me to sort out my understanding here:
 # This is a compatible change because it does not enforce any format on the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of a 
{{key:value,key:value}} format for the sake of flexibility and simplicity. The 
audit log per se uses a {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit the audit format, people 
will think of better ways, e.g. JSON. But compatibility - not only code 
setting it but also existing tools parsing it - is always one of the key 
concerns.
 # To make it easier to append more information to the caller context in an 
organized way, in this JIRA we propose that the user can call 
{{builder.append(key, value)}} multiple times to append information to the 
context. The CallerContext object, once constructed, is still immutable as it 
is today. So we are not changing any existing behavior in this proposal.

Given the above points, I would be +1 on this.

For the implementation, one point is to not allow "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself uses = to separate key/values in every top-level field, and "\t" to 
separate top-level fields.

CC: [~jitendra] [~stevel] [~inigoiri] and [~aajisaka]

Thanks,

[EDIT]: tagged wrong Goiri user, fixed


was (Author: liuml07):
So, for the code reviews, I'll leave comments about the implementation in the 
pull request. Allow me to sort out my understanding here:
 # This is a compatible change because it does not enforce any format on the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of a 
{{key:value,key:value}} format for the sake of flexibility and simplicity. The 
audit log per se uses a {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit the audit format, people 
will think of better ways, e.g. JSON. But compatibility - not only code 
setting it but also existing tools parsing it - is always one of the key 
concerns.
 # To make it easier to append more information to the caller context in an 
organized way, in this JIRA we propose that the user can call 
{{builder.append(key, value)}} multiple times to append information to the 
context. The CallerContext object, once constructed, is still immutable as it 
is today. So we are not changing any existing behavior in this proposal.

Given the above points, I would be +1 on this.

For the implementation, one point is to not allow "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself uses = to separate key/values in every top-level field, and "\t" to 
separate top-level fields.

CC: [~jitendra] [~stevel] [~inigoiri] and [~aajisaka]

Thanks,

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Now the context is a string. We need to extend the CallerContext because the 
> context may contain many items.
> Items include:
> * router ip
> * MR or CLI
> * etc
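
A minimal sketch of the appendable-context idea (illustrative, not the actual 
o.a.h.ipc.CallerContext API): a builder that appends key:value items and 
yields an immutable context string.

{code:java}
import java.util.StringJoiner;

public class CallerContextSketch {
  public static final class Builder {
    // "," and ":" are illustrative separators; per the review discussion,
    // "=" and "\t" must be rejected because the audit log uses them to
    // delimit its own fields.
    private final StringJoiner items = new StringJoiner(",");

    public Builder append(String key, String value) {
      items.add(key + ":" + value);
      return this;
    }

    public String build() {
      return items.toString();  // the built context string is immutable
    }
  }

  public static void main(String[] args) {
    String context = new Builder()
        .append("routerIp", "192.168.1.1")
        .append("clientType", "CLI")
        .build();
    System.out.println(context);  // routerIp:192.168.1.1,clientType:CLI
  }
}
{code}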






[jira] [Comment Edited] (HADOOP-17276) Extend CallerContext to make it include many items

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202490#comment-17202490
 ] 

Mingliang Liu edited comment on HADOOP-17276 at 9/26/20, 2:06 AM:
--

So, for the code reviews, I'll leave comments about the implementation in the 
pull request. Allow me to sort out my understanding here:

# This is a compatible change because it does not enforce any format on the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of a 
{{key:value,key:value}} format for the sake of flexibility and simplicity. The 
audit log per se uses a {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit the audit format, people 
will think of better ways, e.g. JSON. But compatibility - not only code 
setting it but also existing tools parsing it - is always one of the key 
concerns.
# To make it easier to append more information to the caller context in an 
organized way, in this JIRA we propose that the user can call 
{{builder.append(key, value)}} multiple times to append information to the 
context. The CallerContext object, once constructed, is still immutable as it 
is today. So we are not changing any existing behavior in this proposal.

Given the above points, I would be +1 on this.

For the implementation, one point is to not allow "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself uses = to separate key/values in every top-level field, and "\t" to 
separate top-level fields.

CC: [~jitendra] [~stevel] [~[~inigoiri] and [~aajisaka]

Thanks,


was (Author: liuml07):
So, for the code reviews, I'll leave comments about the implementation in the 
pull request. Allow me to sort out my understanding here:

# This is a compatible change because it does not enforce any format on the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of a 
{{key:value,key:value}} format for the sake of flexibility and simplicity. The 
audit log per se uses a {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit the audit format, people 
will think of better ways, e.g. JSON. But compatibility - not only code 
setting it but also existing tools parsing it - is always one of the key 
concerns.
# To make it easier to append more information to the caller context in an 
organized way, in this JIRA we propose that the user can call 
{{builder.append(key, value)}} multiple times to append information to the 
context. The CallerContext object, once constructed, is still immutable as it 
is today. So we are not changing any existing behavior in this proposal.

Given the above points, I would be +1 on this.

For the implementation, one point is to not allow "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself uses = to separate key/values in every top-level field, and "\t" to 
separate top-level fields.

CC: [~jitendra] [~stevel] [~goiri] and [~aajisaka]

Thanks,

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Now the context is a string. We need to extend the CallerContext because the 
> context may contain many items.
> Items include:
> * router ip
> * MR or CLI
> * etc






[jira] [Comment Edited] (HADOOP-17276) Extend CallerContext to make it include many items

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202490#comment-17202490
 ] 

Mingliang Liu edited comment on HADOOP-17276 at 9/26/20, 2:06 AM:
--

So, for the code reviews, I'll leave comments about the implementation in the 
pull request. Allow me to sort out my understanding here:
 # This is a compatible change because it does not enforce any format on the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of a 
{{key:value,key:value}} format for the sake of flexibility and simplicity. The 
audit log per se uses a {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit the audit format, people 
will think of better ways, e.g. JSON. But compatibility - not only code 
setting it but also existing tools parsing it - is always one of the key 
concerns.
 # To make it easier to append more information to the caller context in an 
organized way, in this JIRA we propose that the user can call 
{{builder.append(key, value)}} multiple times to append information to the 
context. The CallerContext object, once constructed, is still immutable as it 
is today. So we are not changing any existing behavior in this proposal.

Given the above points, I would be +1 on this.

For the implementation, one point is to not allow "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself uses = to separate key/values in every top-level field, and "\t" to 
separate top-level fields.

CC: [~jitendra] [~stevel] [~inigoiri] and [~aajisaka]

Thanks,


was (Author: liuml07):
So, for the code reviews, I'll leave comments about the implementation in the 
pull request. Allow me to sort out my understanding here:

# This is a compatible change because it does not enforce any format on the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of a 
{{key:value,key:value}} format for the sake of flexibility and simplicity. The 
audit log per se uses a {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit the audit format, people 
will think of better ways, e.g. JSON. But compatibility - not only code 
setting it but also existing tools parsing it - is always one of the key 
concerns.
# To make it easier to append more information to the caller context in an 
organized way, in this JIRA we propose that the user can call 
{{builder.append(key, value)}} multiple times to append information to the 
context. The CallerContext object, once constructed, is still immutable as it 
is today. So we are not changing any existing behavior in this proposal.

Given the above points, I would be +1 on this.

For the implementation, one point is to not allow "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself uses = to separate key/values in every top-level field, and "\t" to 
separate top-level fields.

CC: [~jitendra] [~stevel] [~[~inigoiri] and [~aajisaka]

Thanks,

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Now the context is a string. We need to extend the CallerContext because the 
> context may contain many items.
> Items include:
> * router ip
> * MR or CLI
> * etc






[jira] [Commented] (HADOOP-17276) Extend CallerContext to make it include many items

2020-09-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17202490#comment-17202490
 ] 

Mingliang Liu commented on HADOOP-17276:


So, for the code reviews, I'll leave comments about implementation in the pull 
request. Allow me sort my understanding out here:

# This is compatible change because it does not enforce any format of the 
existing caller context. This is very important to confirm. The initial 
{{CallerContext}} design used a plain string format instead of 
{{key:value,key:value}} format for sake of flexibility and simplicity. Audit 
log per se is using {{key=value}} format and was not designed with nested 
structure in mind. I believe every time we revisit audit format, people would 
think of better ways e.g. JSON. But compatibility - not only code setting it 
but also existing tools parsing it - is always one of the key concern.
# Here for easier appending more information into the caller context in an 
organized way, in this JIRA we propose that user can call multiple times 
{{builder.append(key, value)}} to append information into the context. The 
CallContext object once constructed is still immutable as it is. So we are not 
changing any existing behavior in this proposal.

Given above points, I would be +1 on this.

For the implementation, one point is not allowing "=" or "\t" as the separator 
of the key/value items in the context string. This is because the audit log 
itself is using = to separate key/values in every top-level field, and using 
"\t" to separate top-level fields.

CC: [~jitendra] [~stevel] [~goiri] and [~aajisaka]

Thanks,

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Hui Fei
>Assignee: Hui Fei
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Now the context is a string. We need to extend the CallerContext because the
> context may contain many items.
> Items include 
> * router ip
> * MR or CLI
> * etc



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17267) Add debug-level logs in Filesystem#close

2020-09-23 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17201047#comment-17201047
 ] 

Mingliang Liu commented on HADOOP-17267:


I have added you to the Hadoop Contributors list and assigned this JIRA to
you, [~klcopp].

> Add debug-level logs in Filesystem#close
> 
>
> Key: HADOOP-17267
> URL: https://issues.apache.org/jira/browse/HADOOP-17267
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> HDFS reuses the same cached FileSystem object across the file system. If the 
> client calls FileSystem.close(), closeAllForUgi(), or closeAll() (if it 
> applies to the instance) anywhere in the system it purges the cache of that 
> FS instance, and trying to use the instance results in an IOException: 
> FileSystem closed.
> It would be a great help to clients to see where and when a given FS instance 
> was closed. I.e. in close(), closeAllForUgi(), or closeAll(), it would be 
> great to see a DEBUG-level log of
>  * calling method name, class, file name/line number
>  * FileSystem object's identity hash (FileSystem.close() only)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-17267) Add debug-level logs in Filesystem#close

2020-09-23 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-17267:
--

Assignee: Karen Coppage

> Add debug-level logs in Filesystem#close
> 
>
> Key: HADOOP-17267
> URL: https://issues.apache.org/jira/browse/HADOOP-17267
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.0
>Reporter: Karen Coppage
>Assignee: Karen Coppage
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> HDFS reuses the same cached FileSystem object across the file system. If the 
> client calls FileSystem.close(), closeAllForUgi(), or closeAll() (if it 
> applies to the instance) anywhere in the system it purges the cache of that 
> FS instance, and trying to use the instance results in an IOException: 
> FileSystem closed.
> It would be a great help to clients to see where and when a given FS instance 
> was closed. I.e. in close(), closeAllForUgi(), or closeAll(), it would be 
> great to see a DEBUG-level log of
>  * calling method name, class, file name/line number
>  * FileSystem object's identity hash (FileSystem.close() only)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17267) Add debug-level logs in Filesystem#close

2020-09-20 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17198929#comment-17198929
 ] 

Mingliang Liu commented on HADOOP-17267:


This is useful debugging information. Mind providing a patch?
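A minimal sketch of what such a DEBUG log could look like, assuming SLF4J and
a generic closeable class (purely illustrative, not the eventual patch):
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Minimal sketch: record who closed the object, plus its identity hash.
// Illustration only; not the code committed for HADOOP-17267.
public class TrackedCloseable implements AutoCloseable {
  private static final Logger LOG =
      LoggerFactory.getLogger(TrackedCloseable.class);

  @Override
  public void close() {
    if (LOG.isDebugEnabled()) {
      // Index 0 is getStackTrace() itself, index 1 is this close() frame,
      // and index 2 is the caller we want to record.
      StackTraceElement caller = Thread.currentThread().getStackTrace()[2];
      LOG.debug("close() on object {} called by {}.{} ({}:{})",
          System.identityHashCode(this), caller.getClassName(),
          caller.getMethodName(), caller.getFileName(),
          caller.getLineNumber());
    }
    // ... release underlying resources here ...
  }
}
{code}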

> Add debug-level logs in Filesystem#close
> 
>
> Key: HADOOP-17267
> URL: https://issues.apache.org/jira/browse/HADOOP-17267
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.0
>Reporter: Karen Coppage
>Priority: Minor
>
> HDFS reuses the same cached FileSystem object across the file system. If the 
> client calls FileSystem.close(), closeAllForUgi(), or closeAll() (if it 
> applies to the instance) anywhere in the system it purges the cache of that 
> FS instance, and trying to use the instance results in an IOException: 
> FileSystem closed.
> It would be a great help to clients to see where and when a given FS instance 
> was closed. I.e. in close(), closeAllForUgi(), or closeAll(), it would be 
> great to see a DEBUG-level log of
>  * calling method name, class, file name/line number
>  * FileSystem object's identity hash (FileSystem.close() only)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Release Note: 
The DFS client can use the newly added URI cache when creating socket
addresses for read operations. By default it is disabled. When enabled,
creating a socket address will use a cached URI object based on host:port to
reduce the frequency of URI object creation.

To enable it, set the following config key to true:

  <property>
    <name>dfs.client.read.uri.cache.enabled</name>
    <value>true</value>
  </property>


  was:
DFS client can use the newly added URI cache when creating socket address for 
read operations. When enabled, creating socket address will use cached URI 
object based on host:port to reduce the frequency of URI object creation.

To enable it, set the following config key to true:

  <property>
    <name>dfs.client.read.uri.cache.enabled</name>
    <value>true</value>
  </property>



>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-11 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17194605#comment-17194605
 ] 

Mingliang Liu commented on HADOOP-17222:


[~fanrui] I have merged this to the {{trunk}} branch. I also added the Release
Note for this JIRA. Please review and update the "Release Note" field if it
applies. Thanks for your contribution!
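For readers skimming the quoted description below: the change memoizes the
expensive address creation. A minimal sketch, assuming a plain
{{ConcurrentHashMap}} keyed by host:port (the committed patch may differ in
naming and in its eviction strategy):
{code:java}
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of caching socket addresses keyed by "host:port".
// Illustration only; not the code committed for HADOOP-17222.
public final class SocketAddrCacheSketch {
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  public static InetSocketAddress createSocketAddr(String host, int port) {
    // On a cache hit, computeIfAbsent returns the stored address and skips
    // the URI parsing and address construction seen in the flame graphs.
    return CACHE.computeIfAbsent(host + ":" + port,
        k -> new InetSocketAddress(host, port));
  }

  private SocketAddrCacheSketch() {
  }
}
{code}
Note that {{InetSocketAddress}} resolves the hostname at construction time, so
a long-lived cache also pins DNS results; that is why an expiration policy was
discussed during review.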

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-11 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Release Note: 
The DFS client can use the newly added URI cache when creating socket
addresses for read operations. When enabled, creating a socket address will
use a cached URI object based on host:port to reduce the frequency of URI
object creation.

To enable it, set the following config key to true:

  <property>
    <name>dfs.client.read.uri.cache.enabled</name>
    <value>true</value>
  </property>


>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-10 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address leveraging URI cache

2020-09-10 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Summary:  Create socket address leveraging URI cache  (was: Create socket 
address combined with cache to speed up hdfs client choose DataNode)

>  Create socket address leveraging URI cache
> ---
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17254) Upgrade hbase to 1.4.13 on branch-2.10

2020-09-10 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193707#comment-17193707
 ] 

Mingliang Liu commented on HADOOP-17254:


+1

Thanks!

> Upgrade hbase to 1.4.13 on branch-2.10
> --
>
> Key: HADOOP-17254
> URL: https://issues.apache.org/jira/browse/HADOOP-17254
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> hbase.version must be updated to address CVE-2018-8025 on branch-2.10.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17254) Upgrade hbase to 1.2.6.1 on branch-2.10

2020-09-09 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193275#comment-17193275
 ] 

Mingliang Liu commented on HADOOP-17254:


Sorry, I do not know what the hbase version is for the build, but the latest
HBase 1.x version is 1.4.13 and versions prior to 1.3 have reached EoL. CC:
[~stack]

> Upgrade hbase to 1.2.6.1 on branch-2.10
> ---
>
> Key: HADOOP-17254
> URL: https://issues.apache.org/jira/browse/HADOOP-17254
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Masatake Iwasaki
>Assignee: Masatake Iwasaki
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17181) Handle transient stream read failures in FileSystem contract tests

2020-09-09 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17193153#comment-17193153
 ] 

Mingliang Liu commented on HADOOP-17181:


Added 3.4.0 as the fix version as well.

> Handle transient stream read failures in FileSystem contract tests
> --
>
> Key: HADOOP-17181
> URL: https://issues.apache.org/jira/browse/HADOOP-17181
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Seen 2x recently, failure in ITestS3AContractUnbuffer as not enough data came 
> back in the read. 
> The contract test assumes that stream.read() will return everything, but it 
> could be some buffering problem. Proposed: switch to ReadFully to see if it 
> is a quirk of the read/get or is something actually wrong with the production 
> code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17181) Handle transient stream read failures in FileSystem contract tests

2020-09-09 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17181:
---
Fix Version/s: 3.4.0

> Handle transient stream read failures in FileSystem contract tests
> --
>
> Key: HADOOP-17181
> URL: https://issues.apache.org/jira/browse/HADOOP-17181
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Affects Versions: 3.3.0
>Reporter: Steve Loughran
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.3.1, 3.4.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Seen 2x recently, failure in ITestS3AContractUnbuffer as not enough data came 
> back in the read. 
> The contract test assumes that stream.read() will return everything, but it 
> could be some buffering problem. Proposed: switch to ReadFully to see if it 
> is a quirk of the read/get or is something actually wrong with the production 
> code.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192606#comment-17192606
 ] 

Mingliang Liu commented on HADOOP-17252:


Thanks, [~aajisaka]! If the current wiki page is
https://cwiki.apache.org/confluence/display/HADOOP then I think the current
website is fine, because the link it has already redirects there
automatically.

I will close this JIRA shortly.

> Website to link to latest Hadoop wiki
> -
>
> Key: HADOOP-17252
> URL: https://issues.apache.org/jira/browse/HADOOP-17252
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Reporter: Mingliang Liu
>Priority: Major
>
> Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
> Shall we update that to the latest one: 
> https://cwiki.apache.org/confluence/display/HADOOP2/Home
> Or am I confused which one is latest, https://wiki.apache.org/hadoop or 
> https://cwiki.apache.org/confluence/display/HADOOP2?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17252:
---
Description: 
Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
Shall we update that to the latest one: 
https://cwiki.apache.org/confluence/display/HADOOP2/Home

Or am I confused which one is latest, https://wiki.apache.org/hadoop or 
https://cwiki.apache.org/confluence/display/HADOOP2?

  was:Currently the website links to the [old 
wiki|https://wiki.apache.org/hadoop]. Shall we update that to the latest one: 
https://cwiki.apache.org/confluence/display/HADOOP2/Home


> Website to link to latest Hadoop wiki
> -
>
> Key: HADOOP-17252
> URL: https://issues.apache.org/jira/browse/HADOOP-17252
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Reporter: Mingliang Liu
>Priority: Major
>
> Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
> Shall we update that to the latest one: 
> https://cwiki.apache.org/confluence/display/HADOOP2/Home
> Or am I confused which one is latest, https://wiki.apache.org/hadoop or 
> https://cwiki.apache.org/confluence/display/HADOOP2?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192502#comment-17192502
 ] 

Mingliang Liu commented on HADOOP-17252:


CC [~aajisaka]

> Website to link to latest Hadoop wiki
> -
>
> Key: HADOOP-17252
> URL: https://issues.apache.org/jira/browse/HADOOP-17252
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Reporter: Mingliang Liu
>Priority: Major
>
> Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
> Shall we update that to the latest one: 
> https://cwiki.apache.org/confluence/display/HADOOP2/Home



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-17252:
--

 Summary: Website to link to latest Hadoop wiki
 Key: HADOOP-17252
 URL: https://issues.apache.org/jira/browse/HADOOP-17252
 Project: Hadoop Common
  Issue Type: Bug
Reporter: Mingliang Liu


Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
Shall we update that to the latest one: 
https://cwiki.apache.org/confluence/display/HADOOP2/Home



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17252:
---
Component/s: site

> Website to link to latest Hadoop wiki
> -
>
> Key: HADOOP-17252
> URL: https://issues.apache.org/jira/browse/HADOOP-17252
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Reporter: Mingliang Liu
>Priority: Major
>
> Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
> Shall we update that to the latest one: 
> https://cwiki.apache.org/confluence/display/HADOOP2/Home



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17252) Website to link to latest Hadoop wiki

2020-09-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192500#comment-17192500
 ] 

Mingliang Liu commented on HADOOP-17252:


Code: 
https://github.com/apache/hadoop-site/blob/asf-site/layouts/partials/navbar.html#L31


> Website to link to latest Hadoop wiki
> -
>
> Key: HADOOP-17252
> URL: https://issues.apache.org/jira/browse/HADOOP-17252
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: site
>Reporter: Mingliang Liu
>Priority: Major
>
> Currently the website links to the [old wiki|https://wiki.apache.org/hadoop]. 
> Shall we update that to the latest one: 
> https://cwiki.apache.org/confluence/display/HADOOP2/Home



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-09-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17222:
---
Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
>  Labels: pull-request-available
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17234) Add .asf.yaml to allow github and jira integration

2020-08-28 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186780#comment-17186780
 ] 

Mingliang Liu commented on HADOOP-17234:


Great work, thank you!

> Add .asf.yaml to allow github and jira integration
> --
>
> Key: HADOOP-17234
> URL: https://issues.apache.org/jira/browse/HADOOP-17234
> Project: Hadoop Common
>  Issue Type: Task
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Major
> Fix For: 3.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> As of now the default for github is set only to worklog, To enable link and 
> label, We need to add this.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Resolved] (HADOOP-17159) Make UGI support forceful relogin from keytab ignoring the last login time

2020-08-27 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu resolved HADOOP-17159.

Fix Version/s: 2.10.1
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to 2.10.1 and 3.1.5+; see "Fix Version/s". Thank you for your
contribution, [~sandeep.guggilam].

> Make UGI support forceful relogin from keytab ignoring the last login time
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.2.2, 2.10.1, 3.3.1, 3.4.0, 3.1.5
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently we have a relogin() method in UGI which attempts to log in if no
> login was attempted in the last 10 minutes, or a configured amount of time.
> We should also have a provision for a forceful relogin, irrespective of that
> time window, which the client can choose to use if needed. Consider the
> below scenario:
>  # The SASL server is reimaged and new keytabs are fetched after refreshing
> the password
>  # The SASL client connection to the server would fail when it tries with
> the cached service ticket
>  # In such scenarios we should try to log out to clear the cached service
> tickets and then log back in. But since the current relogin() doesn't
> guarantee a login, it could cause an issue
>  # A forceful relogin in this case would help after logout
>  
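
For the recovery flow in the scenario above, client usage could look like the
following minimal sketch. The method names are assumptions based on this
JIRA's pull request; check the committed {{UserGroupInformation}} API for the
exact signatures:
{code:java}
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

// Minimal sketch of the logout-then-forced-relogin recovery described above.
public final class SaslRecoverySketch {
  public static void recoverAfterAuthFailure() throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getLoginUser();
    ugi.logoutUserFromKeytab();   // clear cached (now invalid) service tickets
    ugi.forceReloginFromKeytab(); // relogin even within the 10-minute window
  }

  private SaslRecoverySketch() {
  }
}
{code}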



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-17232) Erasure Coding: Typo in document

2020-08-27 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186040#comment-17186040
 ] 

Mingliang Liu edited comment on HADOOP-17232 at 8/27/20, 6:35 PM:
--

The "set" happens twice in the sentence, existing sentence has a simple syntax 
error (missing "the", and "use" should be "uses"), and it can be shorter (by 
using "Otherwise"). So overall I propose the following revised version:
{code}
The ERASURECODING\_POLICY is the name of the policy for the file. If an erasure 
coding policy is set on that file, it will return the name of the policy. 
Otherwise, it will return \"Replicated\" which means it uses the replication 
storage strategy.
{code}


was (Author: liuml07):
Since the "set" happens twice in the sentence and existing sentence has a 
simple syntax error (missing "the", and "use" should be "uses), I propose the 
following revised version:
{code}
The ERASURECODING\_POLICY is name of the policy for the file. If an erasure 
coding policy is set on that file, it will return the name of the policy. If no 
erasure coding policy is set, it will return \"Replicated\" which means it uses 
the replication storage strategy.
{code}

> Erasure Coding: Typo in document
> 
>
> Key: HADOOP-17232
> URL: https://issues.apache.org/jira/browse/HADOOP-17232
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Trivial
> Attachments: HADOOP-17232.001.patch
>
>
> When review ec document and code, find the typo.
> Change "a erasure code" to "an erasure code"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17232) Erasure Coding: Typo in document

2020-08-27 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186040#comment-17186040
 ] 

Mingliang Liu commented on HADOOP-17232:


Since the "set" happens twice in the sentence and existing sentence has a 
simple syntax error (missing "the", and "use" should be "uses), I propose the 
following revised version:
{code}
The ERASURECODING\_POLICY is name of the policy for the file. If an erasure 
coding policy is set on that file, it will return the name of the policy. If no 
erasure coding policy is set, it will return \"Replicated\" which means it uses 
the replication storage strategy.
{code}

> Erasure Coding: Typo in document
> 
>
> Key: HADOOP-17232
> URL: https://issues.apache.org/jira/browse/HADOOP-17232
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Trivial
> Attachments: HADOOP-17232.001.patch
>
>
> When review ec document and code, find the typo.
> Change "a erasure code" to "an erasure code"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17232) Erasure Coding: Typo in document

2020-08-27 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186037#comment-17186037
 ] 

Mingliang Liu commented on HADOOP-17232:


yeah, other than that I'm +1

4 binding +1s are way more than enough for a typo fix. I'll leave it to you
guys to commit :)

> Erasure Coding: Typo in document
> 
>
> Key: HADOOP-17232
> URL: https://issues.apache.org/jira/browse/HADOOP-17232
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Trivial
> Attachments: HADOOP-17232.001.patch
>
>
> When review ec document and code, find the typo.
> Change "a erasure code" to "an erasure code"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-17232) Erasure Coding: Typo in document

2020-08-27 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186033#comment-17186033
 ] 

Mingliang Liu edited comment on HADOOP-17232 at 8/27/20, 6:20 PM:
--

Not a native English speaker, but the past participle of "set" is also "set", right?

{code}
If an erasure coding policy is setted on that file,...
{code}
should be
{code}
If an erasure coding policy is set on that file,
{code}


was (Author: liuml07):
Not native English speaker, but set's part participle is also set right?

{code}
If an erasure coding policy is setted on that file,...
{code}
should be
{code}
If an erasure coding policy is set on that file,
{code}

> Erasure Coding: Typo in document
> 
>
> Key: HADOOP-17232
> URL: https://issues.apache.org/jira/browse/HADOOP-17232
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Trivial
> Attachments: HADOOP-17232.001.patch
>
>
> When review ec document and code, find the typo.
> Change "a erasure code" to "an erasure code"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17232) Erasure Coding: Typo in document

2020-08-27 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17186033#comment-17186033
 ] 

Mingliang Liu commented on HADOOP-17232:


Not native English speaker, but set's part participle is also set right?

{code}
If an erasure coding policy is setted on that file,...
{code}
should be
{code}
If an erasure coding policy is set on that file,
{code}

> Erasure Coding: Typo in document
> 
>
> Key: HADOOP-17232
> URL: https://issues.apache.org/jira/browse/HADOOP-17232
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 3.3.0
>Reporter: Fei Hui
>Assignee: Fei Hui
>Priority: Trivial
> Attachments: HADOOP-17232.001.patch
>
>
> When review ec document and code, find the typo.
> Change "a erasure code" to "an erasure code"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17159) Make UGI support forceful relogin from keytab ignoring the last login time

2020-08-27 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17159:
---
Summary: Make UGI support forceful relogin from keytab ignoring the last 
login time  (was: Ability for forceful relogin in UserGroupInformation class)

> Make UGI support forceful relogin from keytab ignoring the last login time
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
>
> Currently we have a relogin() method in UGI which attempts to log in if no
> login was attempted in the last 10 minutes, or a configured amount of time.
> We should also have a provision for a forceful relogin, irrespective of that
> time window, which the client can choose to use if needed. Consider the
> below scenario:
>  # The SASL server is reimaged and new keytabs are fetched after refreshing
> the password
>  # The SASL client connection to the server would fail when it tries with
> the cached service ticket
>  # In such scenarios we should try to log out to clear the cached service
> tickets and then log back in. But since the current relogin() doesn't
> guarantee a login, it could cause an issue
>  # A forceful relogin in this case would help after logout
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-26 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17185014#comment-17185014
 ] 

Mingliang Liu commented on HADOOP-17222:


I’m thinking two new configs seem a bit of overkill - unrelated to this, but
Hadoop has too many configs and we seldom clean them up...

A cache max size of 1000 is more than enough for me.

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the HDFS client gets this benefit; all callers of
> NetUtils.createSocketAddr will benefit. The HDFS client is just used as an
> example.
> 
> The HDFS client selects the best DN for an HDFS block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. There are some heavier operations in the 
> NetUtils.createSocketAddr method, for example: URI.create(target), so 
> NetUtils.createSocketAddr takes more time to execute.
> The following is my performance report. The report is based on HBase calling 
> hdfs. HBase is a high-frequency access client for hdfs, because HBase read 
> operations often access a small DataBlock (about 64k) instead of the entire 
> HFile. In the case of high frequency access, the NetUtils.createSocketAddr 
> method is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates InetSocketAddress based on host and port. 
> Here we can add Cache to InetSocketAddress. The key of Cache is host and 
> port, and the value is InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-26 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17184970#comment-17184970
 ] 

Mingliang Liu commented on HADOOP-17222:


Yes, an eviction policy for the cache is good for long-running clients and large 
or dynamic HDFS clusters. I don't have a good estimate, but the expiration time 
could be up to 12 or even 24 hours.
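As a sketch of what such an eviction policy could look like, assuming Guava's {{CacheBuilder}} (the 1000-entry cap and 12-hour expiry are just the numbers floated in this thread, not committed defaults):

{code:java}
import java.net.InetSocketAddress;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Illustrative sketch: a bounded, expiring cache so long-running clients
// eventually re-resolve addresses (e.g. after a DataNode changes its IP).
public final class ExpiringAddrCacheSketch {
  private final Cache<String, InetSocketAddress> cache = CacheBuilder.newBuilder()
      .maximumSize(1000)                    // cap discussed above
      .expireAfterWrite(12, TimeUnit.HOURS) // expiry window discussed above
      .build();

  public InetSocketAddress get(String host, int port) throws ExecutionException {
    return cache.get(host + ":" + port, () -> new InetSocketAddress(host, port));
  }
}
{code}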

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr will. The hdfs client is just used as an example.
>  
> The hdfs client selects the best DN for an hdfs block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. It performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling hdfs. HBase 
> is a high-frequency client of hdfs, because its read operations often access 
> a small DataBlock (about 64k) instead of an entire HFile. Under such 
> high-frequency access, NetUtils.createSocketAddr is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache in front of it: the cache key is the host and port, 
> and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17226) Failure of ITestAssumeRole.testRestrictedCommitActions

2020-08-25 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17226:
---
Description: 
The counter of progress callbacks for parallel commits was 1, not two. As the 
callback wasn't using an atomic long, this is likely a (rare) race condition.

Seen in a HADOOP-16830 test run; it looks unrelated, but I'll add the fix there 
anyway so I can see if it goes away.



  was:
Counter of progress callbacks of parallel commits was 1, not too. As the 
callback wasn't using an atomic long, likely to be a (rare) race condition.

see in HADOOP-16830 test run; looks unrelated but I'll add the fix there anyway 
so I can see if it goes away.




> Failure of ITestAssumeRole.testRestrictedCommitActions 
> ---
>
> Key: HADOOP-17226
> URL: https://issues.apache.org/jira/browse/HADOOP-17226
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3, test
>Affects Versions: 3.4.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Minor
>
> The counter of progress callbacks for parallel commits was 1, not two. As the 
> callback wasn't using an atomic long, this is likely a (rare) race condition.
> Seen in a HADOOP-16830 test run; it looks unrelated, but I'll add the fix 
> there anyway so I can see if it goes away.
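To illustrate the fix being described, a hedged sketch of the counter change (the class and member names here are made up for this example):

{code:java}
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: with a plain long, two parallel commits doing "count++"
// can lose an update (read-modify-write race); AtomicLong cannot.
public class ProgressCallbackCounter {
  private final AtomicLong callbacks = new AtomicLong();

  public void onProgress() {
    callbacks.incrementAndGet(); // atomic, so no callback is lost
  }

  public long count() {
    return callbacks.get();
  }
}
{code}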



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183807#comment-17183807
 ] 

Mingliang Liu commented on HADOOP-17222:


{quote}
it will affect all callers of `NetUtils.createSocketAddr` and not just the hdfs 
client, right?
{quote}

Yes, that is true...

Would it be overkill if we:
# in hadoop-common's NetUtils, have two APIs: {{createSocketAddr(...) \{ 
return createSocketAddr(..., false);\}}} and {{createSocketAddr(..., 
useCacheIfPresent)}}
# in HDFS, add a new config that decides whether {{useCacheIfPresent}} is 
false or true

If we make only the DFS code path use this by default, perhaps point 2 is not 
necessary. I'm also fine with that.
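For illustration only (signatures abbreviated, not the real NetUtils code), the two-API shape could look like:

{code:java}
import java.net.InetSocketAddress;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: existing callers keep today's behavior; callers that opt in
// (in HDFS, driven by a new config key) pass useCacheIfPresent = true.
public final class NetUtilsSketch {
  private static final ConcurrentHashMap<String, InetSocketAddress> CACHE =
      new ConcurrentHashMap<>();

  public static InetSocketAddress createSocketAddr(String target) {
    return createSocketAddr(target, false); // existing callers: no cache
  }

  public static InetSocketAddress createSocketAddr(String target,
      boolean useCacheIfPresent) {
    return useCacheIfPresent
        ? CACHE.computeIfAbsent(target, NetUtilsSketch::parse)
        : parse(target);
  }

  private static InetSocketAddress parse(String target) {
    int i = target.lastIndexOf(':'); // assumes "host:port"; stands in for the
    return new InetSocketAddress(target.substring(0, i),  // real URI parsing
        Integer.parseInt(target.substring(i + 1)));
  }
}
{code}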

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr will. The hdfs client is just used as an example.
>  
> The hdfs client selects the best DN for an hdfs block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. It performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling hdfs. HBase 
> is a high-frequency client of hdfs, because its read operations often access 
> a small DataBlock (about 64k) instead of an entire HFile. Under such 
> high-frequency access, NetUtils.createSocketAddr is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache in front of it: the cache key is the host and port, 
> and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17159) Ability for forceful relogin in UserGroupInformation class

2020-08-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183803#comment-17183803
 ] 

Mingliang Liu commented on HADOOP-17159:


[~sandeep.guggilam] Could you provide a new PR (reusing this JIRA) for 
branch-2.10? I saw major conflicts there. Thank you.

> Ability for forceful relogin in UserGroupInformation class
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
>
> Currently we have a relogin() method in UGI which attempts to log in if no 
> login was attempted in the last 10 minutes (or a configured amount of time).
> We should also have a provision for a forceful relogin, irrespective of that 
> time window, which the client can choose to use if needed. Consider the 
> scenario below:
>  # The SASL server is reimaged and new keytabs are fetched, refreshing the 
> password
>  # A SASL client connection to the server fails when it tries with the 
> cached service ticket
>  # In such scenarios we should log out to clear the service tickets in the 
> cache and then log back in. But since the current relogin() doesn't 
> guarantee a login, it could cause an issue
>  # A forceful relogin after the logout would help here
>  
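To make the recovery pattern above concrete, a hedged sketch of the client-side usage (the method name below is assumed from this JIRA's title; check the final patch for the actual API):

{code:java}
import java.io.IOException;

import org.apache.hadoop.security.UserGroupInformation;

// Hedged sketch: on a SASL failure caused by a stale service ticket, force a
// fresh login even if the usual relogin time window has not elapsed.
public class SaslRecoverySketch {
  static void recoverAfterSaslFailure() throws IOException {
    UserGroupInformation ugi = UserGroupInformation.getLoginUser();
    // Assumed method per this JIRA's title; a plain reloginFromKeytab() may
    // no-op if the minimum time between logins has not passed.
    ugi.forceReloginFromKeytab();
  }
}
{code}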



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17159) Ability for forceful relogin in UserGroupInformation class

2020-08-25 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17159:
---
Fix Version/s: 3.1.5
   3.4.0
   3.3.1
   3.2.2

> Ability for forceful relogin in UserGroupInformation class
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
>
> Currently we have a relogin() method in UGI which attempts to log in if no 
> login was attempted in the last 10 minutes (or a configured amount of time).
> We should also have a provision for a forceful relogin, irrespective of that 
> time window, which the client can choose to use if needed. Consider the 
> scenario below:
>  # The SASL server is reimaged and new keytabs are fetched, refreshing the 
> password
>  # A SASL client connection to the server fails when it tries with the 
> cached service ticket
>  # In such scenarios we should log out to clear the service tickets in the 
> cache and then log back in. But since the current relogin() doesn't 
> guarantee a login, it could cause an issue
>  # A forceful relogin after the logout would help here
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183785#comment-17183785
 ] 

Mingliang Liu commented on HADOOP-17222:


[~fanrui] I have added you to Hadoop Contributor list in JIRA, and also 
assigned this JIRA to you. Thanks,

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr will. The hdfs client is just used as an example.
>  
> The hdfs client selects the best DN for an hdfs block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. It performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling hdfs. HBase 
> is a high-frequency client of hdfs, because its read operations often access 
> a small DataBlock (about 64k) instead of an entire HFile. Under such 
> high-frequency access, NetUtils.createSocketAddr is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache in front of it: the cache key is the host and port, 
> and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-25 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-17222:
--

Assignee: fanrui

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Assignee: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr will. The hdfs client is just used as an example.
>  
> The hdfs client selects the best DN for an hdfs block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. It performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling hdfs. HBase 
> is a high-frequency client of hdfs, because its read operations often access 
> a small DataBlock (about 64k) instead of an entire HFile. Under such 
> high-frequency access, NetUtils.createSocketAddr is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache in front of it: the cache key is the host and port, 
> and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-25 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183783#comment-17183783
 ] 

Mingliang Liu commented on HADOOP-17222:


Yes, I totally see the value of caching. Those are amazing results.

OK, caching the URI instead of {{InetSocketAddress}} makes more sense. I think 
this will be safer and more generic. Still, I'd suggest providing a toggle for 
enabling this feature, with it disabled by default. Or at least provide two 
APIs, where callers "opt in" to the new one and existing use cases stay 
"opted out" by default.

Thanks!
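To make the "cache the URI" variant concrete, a minimal sketch (illustrative only; the scheme prefix below is a placeholder, not what NetUtils actually uses):

{code:java}
import java.net.URI;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative: cache the parsed URI (the expensive step) rather than the
// InetSocketAddress, leaving address resolution behavior untouched.
public final class UriCacheSketch {
  private static final ConcurrentHashMap<String, URI> CACHE =
      new ConcurrentHashMap<>();

  public static URI toUri(String host, int port) {
    return CACHE.computeIfAbsent(host + ":" + port,
        k -> URI.create("dummy://" + k)); // placeholder scheme
  }
}
{code}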

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr will. The hdfs client is just used as an example.
>  
> The hdfs client selects the best DN for an hdfs block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. It performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling hdfs. HBase 
> is a high-frequency client of hdfs, because its read operations often access 
> a small DataBlock (about 64k) instead of an entire HFile. Under such 
> high-frequency access, NetUtils.createSocketAddr is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache in front of it: the cache key is the host and port, 
> and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17222) Create socket address combined with cache to speed up hdfs client choose DataNode

2020-08-24 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183606#comment-17183606
 ] 

Mingliang Liu commented on HADOOP-17222:


Hi [~fanrui] The performance improvement seems awesome. Thanks for sharing.

One fundamental question I have is: will this work for cases where the DN host 
changes its IP? This happens a lot in containerized environments, where the 
hostname is fixed (keeping either externalDNS or StatefulSet Pod identity) but 
the IP can change on each restart.

> Create socket address combined with cache to speed up hdfs client choose 
> DataNode
> -
>
> Key: HADOOP-17222
> URL: https://issues.apache.org/jira/browse/HADOOP-17222
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: common, hdfs-client
> Environment: HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
>Reporter: fanrui
>Priority: Major
> Attachments: After Optimization remark.png, After optimization.svg, 
> Before Optimization remark.png, Before optimization.svg
>
>
> Note: not only the hdfs client gets this benefit; all callers of 
> NetUtils.createSocketAddr will. The hdfs client is just used as an example.
>  
> The hdfs client selects the best DN for an hdfs block. Method call stack:
> DFSInputStream.chooseDataNode -> getBestNodeDNAddrPair -> 
> NetUtils.createSocketAddr
> NetUtils.createSocketAddr creates the corresponding InetSocketAddress based 
> on the host and port. It performs some heavy operations, for example 
> URI.create(target), so it takes a relatively long time to execute.
> The following is my performance report, based on HBase calling hdfs. HBase 
> is a high-frequency client of hdfs, because its read operations often access 
> a small DataBlock (about 64k) instead of an entire HFile. Under such 
> high-frequency access, NetUtils.createSocketAddr is time-consuming.
> h3. Test Environment:
>  
> {code:java}
> HBase version: 2.1.0
> JVM: -Xmx2g -Xms2g 
> hadoop hdfs version: 2.7.4
> disk:SSD
> OS:CentOS Linux release 7.4.1708 (Core)
> JMH Benchmark: @Fork(value = 1) 
> @Warmup(iterations = 300) 
> @Measurement(iterations = 300)
> {code}
> h4. Before Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 4.86% of the entire CPU, and the creation of URIs accounts for a larger 
> proportion.
> !Before Optimization remark.png!
> h3. Optimization ideas:
> NetUtils.createSocketAddr creates an InetSocketAddress from a host and port. 
> Here we can add a cache in front of it: the cache key is the host and port, 
> and the value is the InetSocketAddress.
> h4. After Optimization FlameGraph:
> In the figure, we can see that DFSInputStream.getBestNodeDNAddrPair accounts 
> for 0.54% of the entire CPU. Here, ConcurrentHashMap is used as the Cache, 
> and the ConcurrentHashMap.get() method gets data from the Cache. The CPU 
> usage of DFSInputStream.getBestNodeDNAddrPair has been optimized from 4.86% 
> to 0.54%.
> !After Optimization remark.png!
> h3. Original FlameGraph link:
> [Before 
> Optimization|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]
> [After Optimization 
> FlameGraph|https://drive.google.com/file/d/133L5m75u2tu_KgKfGHZLEUzGR0XAfUl6/view?usp=sharing]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17197) Decrease size of s3a dependencies

2020-08-10 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175041#comment-17175041
 ] 

Mingliang Liu commented on HADOOP-17197:


{quote}
Use the shade plugin to publish an S3A Uber-jar that contains S3A + shaded SDK 
dependencies for the AWS services that S3A actually needs
This would be in addition to the existing S3A jar
{quote}
This seems a reasonable request. I'd like to try it when it's available.

> Decrease size of s3a dependencies
> -
>
> Key: HADOOP-17197
> URL: https://issues.apache.org/jira/browse/HADOOP-17197
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sahil Takiar
>Priority: Major
>
> S3A currently has a dependency on the aws-java-sdk-bundle, which includes the 
> SDKs for all AWS services. The jar file for the current version is about 120 
> MB, but continues to grow (the latest is about 170 MB). Organic growth is 
> expected as more and more AWS services are created.
> The aws-java-sdk-bundle jar file is shaded as well, so it includes all 
> transitive dependencies.
> It would be nice if S3A could depend on smaller jar files in order to 
> decrease the size of jar files pulled in transitively by clients. Decreasing 
> the size of dependencies is particularly important for Docker files, where 
> image pull times can be affected by image size.
> One solution here would be for S3A to publish its own shaded jar which 
> includes the SDKs for all needed AWS Services (e.g. S3, DynamoDB, etc.) along 
> with the transitive dependencies for the individual SDKs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17197) Decrease size of s3a dependencies

2020-08-10 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175040#comment-17175040
 ] 

Mingliang Liu commented on HADOOP-17197:


I'm not fond of moving back to individual jar files from the current SDK bundle 
jar. The benefit of a smaller dependency size cannot justify all the headaches 
we've had (for both S3A developers and users).

> Decrease size of s3a dependencies
> -
>
> Key: HADOOP-17197
> URL: https://issues.apache.org/jira/browse/HADOOP-17197
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: fs/s3
>Reporter: Sahil Takiar
>Priority: Major
>
> S3A currently has a dependency on the aws-java-sdk-bundle, which includes the 
> SDKs for all AWS services. The jar file for the current version is about 120 
> MB, but continues to grow (the latest is about 170 MB). Organic growth is 
> expected as more and more AWS services are created.
> The aws-java-sdk-bundle jar file is shaded as well, so it includes all 
> transitive dependencies.
> It would be nice if S3A could depend on smaller jar files in order to 
> decrease the size of jar files pulled in transitively by clients. Decreasing 
> the size of dependencies is particularly important for Docker files, where 
> image pull times can be affected by image size.
> One solution here would be for S3A to publish its own shaded jar which 
> includes the SDKs for all needed AWS Services (e.g. S3, DynamoDB, etc.) along 
> with the transitive dependencies for the individual SDKs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17182) Dead links in breadcrumbs

2020-08-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17173593#comment-17173593
 ] 

Mingliang Liu commented on HADOOP-17182:


Committed to {{trunk}}. Thanks for reporting and fixing this, [~aajisaka]. 
Thanks for the review, [~ayushsaxena].

> Dead links in breadcrumbs
> -
>
> Key: HADOOP-17182
> URL: https://issues.apache.org/jira/browse/HADOOP-17182
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.4.0
>
>
> In the breadcrumbs, most of the links are dead.
> Can we remove the breadcrumbs?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17182) Dead links in breadcrumbs

2020-08-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17182:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Dead links in breadcrumbs
> -
>
> Key: HADOOP-17182
> URL: https://issues.apache.org/jira/browse/HADOOP-17182
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: documentation
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.4.0
>
>
> In the breadcrumbs, most of the links are dead.
> Can we remove the breadcrumbs?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17164) UGI loginUserFromKeytab doesn't set the last login time

2020-08-05 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17164:
---
Fix Version/s: 2.10.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Merged to all fixed versions. Thanks [~sandeep.guggilam] for reporting this and 
filing a patch. Thanks [~ste...@apache.org] for the helpful review.

> UGI loginUserFromKeytab doesn't set the last login time
> ---
>
> Key: HADOOP-17164
> URL: https://issues.apache.org/jira/browse/HADOOP-17164
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.1.4, 3.2.2, 2.10.1, 3.3.1, 3.4.0
>
> Attachments: HADOOP-17164-branch-2.10.001.patch, 
> HADOOP-17164-branch-2.10.002.patch, HADOOP-17164.001.patch
>
>
> UGI's initial login from keytab doesn't set the last login time, as a result 
> of which a relogin can happen even before the configured minimum number of 
> seconds to wait before a relogin has elapsed.
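A hedged sketch of the gist of the bug (not the actual UGI code; names here are illustrative):

{code:java}
// Illustrative only: a time-gated relogin works only if the initial login
// records its time; the bug is that the initial login skipped this step.
public class LoginTimeGateSketch {
  private long lastLoginMs = 0L;
  private final long minMsBeforeRelogin = 60_000L; // configured minimum

  void loginFromKeytab() {
    // ... perform the Kerberos login ...
    lastLoginMs = System.currentTimeMillis(); // the step the bug omitted
  }

  boolean hasSufficientTimeElapsed() {
    return System.currentTimeMillis() - lastLoginMs >= minMsBeforeRelogin;
  }
}
{code}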



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14056) Update maven-javadoc-plugin to 2.10.4

2020-08-04 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171232#comment-17171232
 ] 

Mingliang Liu commented on HADOOP-14056:


Thank you [~aajisaka] very much. I can help review.

(In the above comment I made "run" a link to the build, but I'm glad you found 
another one in GitHub.)

> Update maven-javadoc-plugin to 2.10.4
> -
>
> Key: HADOOP-14056
> URL: https://issues.apache.org/jira/browse/HADOOP-14056
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.0.0-alpha4, 2.10.1
>
> Attachments: HADOOP-14056.01.patch
>
>
> I'm seeing the following warning in OpenJDK 9.
> {noformat}
> [INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-minikdc 
> ---
> [WARNING] Unable to find the javadoc version: Unrecognized version of 
> Javadoc: 'java version "9-ea"
> Java(TM) SE Runtime Environment (build 9-ea+154)
> Java HotSpot(TM) 64-Bit Server VM (build 9-ea+154, mixed mode)
> ' near index 37
> (?s).*?([0-9]+\.[0-9]+)(\.([0-9]+))?.*
>  ^
> [WARNING] Using the Java the version instead of, i.e. 0.0
> {noformat}
> Need to update this to 2.10.4. (MJAVADOC-441)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14056) Update maven-javadoc-plugin to 2.10.4

2020-08-04 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171208#comment-17171208
 ] 

Mingliang Liu commented on HADOOP-14056:


[~aajisaka] In one recent branch-2.10 PreCommit 
[run|https://issues.apache.org/jira/browse/HADOOP-17164?focusedCommentId=17171199=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17171199]
 for HADOOP-17164, [~sandeep.guggilam] and I found errors like:
{code}
[ERROR] Plugin org.apache.maven.plugins:maven-javadoc-plugin:2.8.1 or one of 
its dependencies could not be resolved: Failed to read artifact descriptor for 
org.apache.maven.plugins:maven-javadoc-plugin:jar:2.8.1: Failure to transfer 
org.apache.maven.plugins:maven-javadoc-plugin:pom:2.8.1 from 
https://repo.maven.apache.org/maven2 was cached in the local repository, 
resolution will not be reattempted until the update interval of central has 
elapsed or updates are forced. Original error: Could not transfer artifact 
org.apache.maven.plugins:maven-javadoc-plugin:pom:2.8.1 from/to central 
(https://repo.maven.apache.org/maven2): Received fatal alert: protocol_version 
-> [Help 1]
{code}

I think it has something to do with this cherry-pick. Could you have a look? 
Thanks!


> Update maven-javadoc-plugin to 2.10.4
> -
>
> Key: HADOOP-14056
> URL: https://issues.apache.org/jira/browse/HADOOP-14056
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.0.0-alpha4, 2.10.1
>
> Attachments: HADOOP-14056.01.patch
>
>
> I'm seeing the following warning in OpenJDK 9.
> {noformat}
> [INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-minikdc 
> ---
> [WARNING] Unable to find the javadoc version: Unrecognized version of 
> Javadoc: 'java version "9-ea"
> Java(TM) SE Runtime Environment (build 9-ea+154)
> Java HotSpot(TM) 64-Bit Server VM (build 9-ea+154, mixed mode)
> ' near index 37
> (?s).*?([0-9]+\.[0-9]+)(\.([0-9]+))?.*
>  ^
> [WARNING] Using the Java the version instead of, i.e. 0.0
> {noformat}
> Need to update this to 2.10.4. (MJAVADOC-441)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17179) [JDK 11] Fix javadoc error in Java API link detection

2020-08-04 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17171171#comment-17171171
 ] 

Mingliang Liu commented on HADOOP-17179:


Committed to {{trunk}} branch.

> [JDK 11] Fix javadoc error  in Java API link detection
> --
>
> Key: HADOOP-17179
> URL: https://issues.apache.org/jira/browse/HADOOP-17179
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.4.0
>
>
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:javadoc-no-fork 
> (default-cli) on project hadoop-hdfs-rbf: An error has occurred in Javadoc 
> report generation: 
> [ERROR] Exit code: 1 - javadoc: warning - You have specified the HTML version 
> as HTML 4.01 by using the -html4 option.
> [ERROR] The default is currently HTML5 and the support for HTML 4.01 will be 
> removed
> [ERROR] in a future release. To suppress this warning, please ensure that any 
> HTML constructs
> [ERROR] in your comments are valid in HTML5, and remove the -html4 option.
> [ERROR] javadoc: error - The code being documented uses modules but the 
> packages defined in https://docs.oracle.com/javase/8/docs/api/ are in the 
> unnamed module.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17184:
---
Description: 
In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build fails with unrelated errors.

One example:
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-yarn-applications-mawo-core: Failed to install metadata 
org.apache.hadoop.applications.mawo:hadoop-yarn-applications-mawo-core:3.4.0-SNAPSHOT/maven-metadata.xml:
 Could not read metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/applications/mawo/hadoop-yarn-applications-mawo-core/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 input contained no data -> [Help 1]
{code}
Another example:
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}
As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding the {{\-\-mvn-custom-repos}} parameter, 
yetus will use a custom .m2 directory for executions for PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]

  was:
In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build fails with unrelated errors.

One example:
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-yarn-applications-mawo-core: Failed to install metadata 
org.apache.hadoop.applications.mawo:hadoop-yarn-applications-mawo-core:3.4.0-SNAPSHOT/maven-metadata.xml:
 Could not read metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/applications/mawo/hadoop-yarn-applications-mawo-core/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 input contained no data -> [Help 1]
{code}

Another example:
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}
As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding {{--mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
for PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]


> Add --mvn-custom-repos parameter to yetus calls
> ---
>
> Key: HADOOP-17184
> URL: https://issues.apache.org/jira/browse/HADOOP-17184
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
>
> In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see 
> the QA build fails with unrelated errors.
> One example:
> {code:java}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-yarn-applications-mawo-core: Failed to install metadata 
> org.apache.hadoop.applications.mawo:hadoop-yarn-applications-mawo-core:3.4.0-SNAPSHOT/maven-metadata.xml:
>  Could not read metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/applications/mawo/hadoop-yarn-applications-mawo-core/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  input contained no data -> [Help 1]
> {code}
> Another example:
> {code:java}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-project: Failed to install metadata 
> org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
> parse metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  in epilog non whitespace content is not allowed but got n (position: END_TAG 
> seen ...\nn... @21:2) -> [Help 1]
> {code}
> As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
> uses a shared .m2 repository. By adding the {{\-\-mvn-custom-repos}} parameter, 
> yetus will use a custom .m2 directory for executions for PR validations.

[jira] [Updated] (HADOOP-17179) [JDK 11] Fix javadoc error in Java API link detection

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17179:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> [JDK 11] Fix javadoc error  in Java API link detection
> --
>
> Key: HADOOP-17179
> URL: https://issues.apache.org/jira/browse/HADOOP-17179
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Fix For: 3.4.0
>
>
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:javadoc-no-fork 
> (default-cli) on project hadoop-hdfs-rbf: An error has occurred in Javadoc 
> report generation: 
> [ERROR] Exit code: 1 - javadoc: warning - You have specified the HTML version 
> as HTML 4.01 by using the -html4 option.
> [ERROR] The default is currently HTML5 and the support for HTML 4.01 will be 
> removed
> [ERROR] in a future release. To suppress this warning, please ensure that any 
> HTML constructs
> [ERROR] in your comments are valid in HTML5, and remove the -html4 option.
> [ERROR] javadoc: error - The code being documented uses modules but the 
> packages defined in https://docs.oracle.com/javase/8/docs/api/ are in the 
> unnamed module.
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17184:
---
Status: Patch Available  (was: Open)

> Add --mvn-custom-repos parameter to yetus calls
> ---
>
> Key: HADOOP-17184
> URL: https://issues.apache.org/jira/browse/HADOOP-17184
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
>
> In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see 
> the QA build fails with unrelated errors.
> One example:
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-yarn-applications-mawo-core: Failed to install metadata 
> org.apache.hadoop.applications.mawo:hadoop-yarn-applications-mawo-core:3.4.0-SNAPSHOT/maven-metadata.xml:
>  Could not read metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/applications/mawo/hadoop-yarn-applications-mawo-core/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  input contained no data -> [Help 1]
> {code}
> Another example:
> {code:java}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-project: Failed to install metadata 
> org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
> parse metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  in epilog non whitespace content is not allowed but got n (position: END_TAG 
> seen ...\nn... @21:2) -> [Help 1]
> {code}
> As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
> uses a shared .m2 repository. By adding {{--mvn-custom-repos}} and 
> {{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
> for PR validations.
> This is a change to mimic that for the Hadoop project.
> CC: [~aajisaka]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17184:
---
Description: 
In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build fails with unrelated errors.

One example:
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-yarn-applications-mawo-core: Failed to install metadata 
org.apache.hadoop.applications.mawo:hadoop-yarn-applications-mawo-core:3.4.0-SNAPSHOT/maven-metadata.xml:
 Could not read metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/applications/mawo/hadoop-yarn-applications-mawo-core/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 input contained no data -> [Help 1]
{code}

Another example:
{code:java}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}
As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding {{--mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
for PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]

  was:
In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build fails with unrelated errors.
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}

As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding {{\-\-mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
for PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]



> Add --mvn-custom-repos parameter to yetus calls
> ---
>
> Key: HADOOP-17184
> URL: https://issues.apache.org/jira/browse/HADOOP-17184
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
>
> In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see 
> the QA build fails with unrelated errors.
> One example:
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-yarn-applications-mawo-core: Failed to install metadata 
> org.apache.hadoop.applications.mawo:hadoop-yarn-applications-mawo-core:3.4.0-SNAPSHOT/maven-metadata.xml:
>  Could not read metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/applications/mawo/hadoop-yarn-applications-mawo-core/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  input contained no data -> [Help 1]
> {code}
> Another example:
> {code:java}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-project: Failed to install metadata 
> org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
> parse metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  in epilog non whitespace content is not allowed but got n (position: END_TAG 
> seen ...\nn... @21:2) -> [Help 1]
> {code}
> As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
> uses a shared .m2 repository. By adding {{--mvn-custom-repos}} and 
> {{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
> for PR validations.
> This is a change to mimic that for the Hadoop project.
> CC: [~aajisaka]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17184:
---
Description: 
In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build fails with unrelated errors.
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}

As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding {{\-\-mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
for PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]


  was:
In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build fails with unrelated errors.
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}

As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding {{--mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory for executions 
for PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]



> Add --mvn-custom-repos parameter to yetus calls
> ---
>
> Key: HADOOP-17184
> URL: https://issues.apache.org/jira/browse/HADOOP-17184
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Mingliang Liu
>Priority: Major
>
> In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see 
> the QA build fails with unrelated errors.
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-project: Failed to install metadata 
> org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
> parse metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  in epilog non whitespace content is not allowed but got n (position: END_TAG 
> seen ...\nn... @21:2) -> [Help 1]
> {code}
> As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
> uses a shared .m2 repository. By adding the {{\-\-mvn-custom-repos}} and 
> {{--jenkins}} parameters, yetus will use a custom .m2 directory when executing 
> PR validations.
> This is a change to mimic that for the Hadoop project.
> CC: [~aajisaka]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-17184:
--

Assignee: Mingliang Liu

> Add --mvn-custom-repos parameter to yetus calls
> ---
>
> Key: HADOOP-17184
> URL: https://issues.apache.org/jira/browse/HADOOP-17184
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: build
>Reporter: Mingliang Liu
>Assignee: Mingliang Liu
>Priority: Major
>
> In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see 
> the QA build failing with unrelated errors.
> {code}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
> on project hadoop-project: Failed to install metadata 
> org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
> parse metadata 
> /home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
>  in epilog non whitespace content is not allowed but got n (position: END_TAG 
> seen ...\nn... @21:2) -> [Help 1]
> {code}
> As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
> uses a shared .m2 repository. By adding the {{\-\-mvn-custom-repos}} and 
> {{--jenkins}} parameters, yetus will use a custom .m2 directory when executing 
> PR validations.
> This is a change to mimic that for the Hadoop project.
> CC: [~aajisaka]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Created] (HADOOP-17184) Add --mvn-custom-repos parameter to yetus calls

2020-08-04 Thread Mingliang Liu (Jira)
Mingliang Liu created HADOOP-17184:
--

 Summary: Add --mvn-custom-repos parameter to yetus calls
 Key: HADOOP-17184
 URL: https://issues.apache.org/jira/browse/HADOOP-17184
 Project: Hadoop Common
  Issue Type: Bug
  Components: build
Reporter: Mingliang Liu


In my pull request [#2188|https://github.com/apache/hadoop/pull/2188], I see the 
QA build failing with unrelated errors.
{code}
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-install-plugin:2.5.1:install (default-install) 
on project hadoop-project: Failed to install metadata 
org.apache.hadoop:hadoop-project:3.4.0-SNAPSHOT/maven-metadata.xml: Could not 
parse metadata 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-project/3.4.0-SNAPSHOT/maven-metadata-local.xml:
 in epilog non whitespace content is not allowed but got n (position: END_TAG 
seen ...\nn... @21:2) -> [Help 1]
{code}

As reported by HBASE-22474 and HBASE-22801, PreCommit validation from yetus 
uses a shared .m2 repository. By adding the {{--mvn-custom-repos}} and 
{{--jenkins}} parameters, yetus will use a custom .m2 directory when executing 
PR validations.

This is a change to mimic that for the Hadoop project.

CC: [~aajisaka]
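For reference, a hedged sketch of what the resulting precommit invocation could look like. Only the {{--jenkins}} and {{--mvn-custom-repos}} flags come from this issue; the script name and remaining arguments are assumptions for illustration:

{code}
# Hypothetical Yetus precommit invocation; only the two flags below are
# taken from this issue, everything else is illustrative.
test-patch.sh \
  --jenkins \
  --mvn-custom-repos \
  ... other project-specific options ...
{code}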




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17164) UGI loginUserFromKeytab doesn't set the last login time

2020-08-04 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17164:
---
Fix Version/s: 3.4.0
   3.3.1
   3.2.2
   3.1.4
 Hadoop Flags: Reviewed

I have committed this to 3.x branches.

[~sandeep.guggilam] Could you file a new PR for {{branch-2.10}}? I see major 
conflicts when backporting this from 3.x.

> UGI loginUserFromKeytab doesn't set the last login time
> ---
>
> Key: HADOOP-17164
> URL: https://issues.apache.org/jira/browse/HADOOP-17164
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
> Fix For: 3.1.4, 3.2.2, 3.3.1, 3.4.0
>
> Attachments: HADOOP-17164.001.patch
>
>
> UGI's initial login from keytab doesn't set the last login time. As a result, 
> a relogin can happen even before the configured minimum number of seconds to 
> wait before relogin has elapsed
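To make the failure mode concrete, here is a minimal, self-contained sketch of the time-window check (the names are assumptions loosely modeled on UGI, not its exact internals):

{code:java}
// Hedged sketch, not UGI's actual code: if the initial keytab login never
// records a last-login timestamp, lastLoginMs stays 0 and the guard below
// always passes, allowing an immediate relogin.
public class ReloginWindowSketch {
  private long lastLoginMs = 0L;                  // never set by the initial login (the bug)
  private final long minSecondsBeforeRelogin = 60L;

  boolean hasSufficientTimeElapsed(long nowMs) {
    return nowMs - lastLoginMs >= minSecondsBeforeRelogin * 1000L;
  }

  public static void main(String[] args) {
    ReloginWindowSketch s = new ReloginWindowSketch();
    // Right after a login this should print false, but it prints true:
    System.out.println(s.hasSufficientTimeElapsed(System.currentTimeMillis()));
  }
}
{code}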



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-15891) Provide Regex Based Mount Point In Inode Tree

2020-08-03 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-15891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-15891:
---
Component/s: (was: fs)
 viewfs

> Provide Regex Based Mount Point In Inode Tree
> -
>
> Key: HADOOP-15891
> URL: https://issues.apache.org/jira/browse/HADOOP-15891
> Project: Hadoop Common
>  Issue Type: New Feature
>  Components: viewfs
>Reporter: zhenzhao wang
>Assignee: zhenzhao wang
>Priority: Major
> Attachments: HADOOP-15891.015.patch, HDFS-13948.001.patch, 
> HDFS-13948.002.patch, HDFS-13948.003.patch, HDFS-13948.004.patch, 
> HDFS-13948.005.patch, HDFS-13948.006.patch, HDFS-13948.007.patch, 
> HDFS-13948.008.patch, HDFS-13948.009.patch, HDFS-13948.011.patch, 
> HDFS-13948.012.patch, HDFS-13948.013.patch, HDFS-13948.014.patch, HDFS-13948_ 
> Regex Link Type In Mont Table-V0.pdf, HDFS-13948_ Regex Link Type In Mount 
> Table-v1.pdf
>
>
> This jira is created to support regex based mount points in the Inode Tree. We 
> noticed that mount points only support fixed target paths. However, we might 
> have use cases where the target needs to refer to some fields from the source. 
> e.g. We might want a mapping of /cluster1/user1 => /cluster1-dc1/user-nn-user1, 
> where we refer to the `cluster` and `user` fields in the source to construct 
> the target. It's impossible to achieve this with the current link type. Though 
> we could set up a one-to-one mapping, the mount table would become bloated if 
> we have thousands of users. Besides, a regex mapping would give us more 
> flexibility. So we are going to build a regex based mount point whose target 
> can refer to groups from the source regex mapping. 
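As a self-contained illustration of the capture-group substitution described above (plain {{java.util.regex}}; the patch's actual mount-table configuration syntax may differ):

{code:java}
// Illustrative only: a regex source with named groups driving target
// substitution for the /cluster1/user1 => /cluster1-dc1/user-nn-user1 example.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexMountSketch {
  public static void main(String[] args) {
    Pattern src = Pattern.compile("^/(?<cluster>\\w+)/(?<user>\\w+)$");
    Matcher m = src.matcher("/cluster1/user1");
    if (m.matches()) {
      String target = "/" + m.group("cluster") + "-dc1/user-nn-" + m.group("user");
      System.out.println(target);  // prints /cluster1-dc1/user-nn-user1
    }
  }
}
{code}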



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14040) Use shaded aws-sdk uber-JAR 1.11.86

2020-07-29 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167577#comment-17167577
 ] 

Mingliang Liu commented on HADOOP-14040:


[~yzhangal] I filed [HDFS-15499]. Thanks!

> Use shaded aws-sdk uber-JAR 1.11.86
> ---
>
> Key: HADOOP-14040
> URL: https://issues.apache.org/jira/browse/HADOOP-14040
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HADOOP-14040-HADOOP-13345.001.patch, 
> HADOOP-14040-branch-2-001.patch, HADOOP-14040-branch-2.002.patch, 
> HADOOP-14040.001.patch
>
>
> AWS SDK now has a (v. large) uberjar shading all dependencies.
> This ensures that AWS dependency changes (e.g. json) don't cause problems 
> downstream in things like HBase, enabling backporting if desired.
> This will let us address the org.json "don't be evil" problem: this SDK version 
> doesn't have those files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14040) Use shaded aws-sdk uber-JAR 1.11.86

2020-07-29 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167546#comment-17167546
 ] 

Mingliang Liu commented on HADOOP-14040:


[~yzhangal] Have you filed a new JIRA for tracking that? I think this JIRA is 
already released, so we need a new one to fix that. I can file one if you 
suggest. Thanks!

> Use shaded aws-sdk uber-JAR 1.11.86
> ---
>
> Key: HADOOP-14040
> URL: https://issues.apache.org/jira/browse/HADOOP-14040
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HADOOP-14040-HADOOP-13345.001.patch, 
> HADOOP-14040-branch-2-001.patch, HADOOP-14040-branch-2.002.patch, 
> HADOOP-14040.001.patch
>
>
> AWS SDK now has a (v. large) uberjar shading all dependencies.
> This ensures that AWS dependency changes (e.g. json) don't cause problems 
> downstream in things like HBase, enabling backporting if desired.
> This will let us address the org.json "don't be evil" problem: this SDK version 
> doesn't have those files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-14040) Use shaded aws-sdk uber-JAR 1.11.86

2020-07-29 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17167351#comment-17167351
 ] 

Mingliang Liu commented on HADOOP-14040:


[~yzhangal] I do not think we ignored that on purpose. The `aws-java-sdk-s3` 
reference is in the httpfs module because httpfs explicitly excludes this S3 
library. I think we need to update it there as well. Thanks!
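For context, the shaded uber-JAR this issue adopts is consumed as a single Maven dependency; a hedged sketch (the coordinates are the well-known bundle artifact, with the version taken from the issue title):

{code}
<!-- Hedged sketch: the shaded uber-JAR that replaces per-service SDK jars. -->
<dependency>
  <groupId>com.amazonaws</groupId>
  <artifactId>aws-java-sdk-bundle</artifactId>
  <version>1.11.86</version>
</dependency>
{code}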

> Use shaded aws-sdk uber-JAR 1.11.86
> ---
>
> Key: HADOOP-14040
> URL: https://issues.apache.org/jira/browse/HADOOP-14040
> Project: Hadoop Common
>  Issue Type: Sub-task
>  Components: build, fs/s3
>Affects Versions: 2.9.0
>Reporter: Steve Loughran
>Assignee: Steve Loughran
>Priority: Major
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HADOOP-14040-HADOOP-13345.001.patch, 
> HADOOP-14040-branch-2-001.patch, HADOOP-14040-branch-2.002.patch, 
> HADOOP-14040.001.patch
>
>
> AWS SDK now has a (v. large) uberjar shading all dependencies.
> This ensures that AWS dependency changes (e.g. json) don't cause problems 
> downstream in things like HBase, enabling backporting if desired.
> This will let us address the org.json "don't be evil" problem: this SDK version 
> doesn't have those files.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17159) Ability for forceful relogin in UserGroupInformation class

2020-07-28 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166749#comment-17166749
 ] 

Mingliang Liu commented on HADOOP-17159:


This request makes sense to me. I have assigned this JIRA to you 
[~sandeep.guggilam]
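To make the intended usage concrete, a hedged sketch of the client-side recovery flow from the scenario described below. {{forceReloginFromKeytab()}} names the capability this JIRA proposes (it is not an existing API here), and {{makeSaslCall()}} is a hypothetical application RPC:

{code:java}
// Hedged sketch of the proposed recovery flow; not UGI's API as it exists today.
import java.io.IOException;
import javax.security.sasl.SaslException;
import org.apache.hadoop.security.UserGroupInformation;

abstract class SaslRecoverySketch {
  abstract void makeSaslCall() throws IOException;   // stand-in for the real RPC

  void callWithRecovery(UserGroupInformation ugi) throws IOException {
    try {
      makeSaslCall();
    } catch (SaslException e) {
      ugi.logoutUserFromKeytab();     // clear the cached (now invalid) service ticket
      ugi.forceReloginFromKeytab();   // proposed: relogin regardless of the time window
      makeSaslCall();                 // retry with fresh credentials
    }
  }
}
{code}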

> Ability for forceful relogin in UserGroupInformation class
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
>
> Currently we have a relogin() method in UGI which attempts to log in only if 
> no login was attempted in the last 10 minutes (or a configured amount of time).
> We should also have a provision for a forceful relogin, irrespective of the 
> time window, that the client can choose to use if needed. Consider the below 
> scenario:
> # The SASL server is reimaged and new keytabs are fetched along with a 
> refreshed password
> # The SASL client connection to the server would fail when it tries with the 
> cached service ticket
> # In such scenarios we should log out to clear the cached service tickets and 
> then try to log in again. But since the current relogin() doesn't guarantee a 
> login, it could cause an issue
> # A forceful relogin in this case would help after the logout
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Assigned] (HADOOP-17159) Ability for forceful relogin in UserGroupInformation class

2020-07-28 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu reassigned HADOOP-17159:
--

Assignee: Sandeep Guggilam

> Ability for forceful relogin in UserGroupInformation class
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3
>Reporter: Sandeep Guggilam
>Assignee: Sandeep Guggilam
>Priority: Major
>
> Currently we have a relogin() method in UGI which attempts to log in only if 
> no login was attempted in the last 10 minutes (or a configured amount of time).
> We should also have a provision for a forceful relogin, irrespective of the 
> time window, that the client can choose to use if needed. Consider the below 
> scenario:
> # The SASL server is reimaged and new keytabs are fetched along with a 
> refreshed password
> # The SASL client connection to the server would fail when it tries with the 
> cached service ticket
> # In such scenarios we should log out to clear the cached service tickets and 
> then try to log in again. But since the current relogin() doesn't guarantee a 
> login, it could cause an issue
> # A forceful relogin in this case would help after the logout
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17159) Ability for forceful relogin in UserGroupInformation class

2020-07-28 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166738#comment-17166738
 ] 

Mingliang Liu commented on HADOOP-17159:


"Affects Version/s:" is 2.7.7 or newer? Since 2.7/2.8 has EoL, did you check if 
this is affecting any active version e.g. 2.10.0, [~sandeep.guggilam] Thanks,

> Ability for forceful relogin in UserGroupInformation class
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Sandeep Guggilam
>Priority: Major
>
> Currently we have a relogin() method in UGI which attempts to log in only if 
> no login was attempted in the last 10 minutes (or a configured amount of time).
> We should also have a provision for a forceful relogin, irrespective of the 
> time window, that the client can choose to use if needed. Consider the below 
> scenario:
> # The SASL server is reimaged and new keytabs are fetched along with a 
> refreshed password
> # The SASL client connection to the server would fail when it tries with the 
> cached service ticket
> # In such scenarios we should log out to clear the cached service tickets and 
> then try to log in again. But since the current relogin() doesn't guarantee a 
> login, it could cause an issue
> # A forceful relogin in this case would help after the logout
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17159) Ability for forceful relogin in UserGroupInformation class

2020-07-28 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17159:
---
   Fix Version/s: (was: 3.1.5)
  (was: 3.4.0)
  (was: 3.3.1)
  (was: 2.10.1)
  (was: 3.2.2)
Target Version/s: 3.1.4, 3.2.2, 2.10.1, 3.3.1, 3.4.0

Moving values from the "Fixed version" field (which should be updated when 
committing this change) to the "Target version" field.

> Ability for forceful relogin in UserGroupInformation class
> --
>
> Key: HADOOP-17159
> URL: https://issues.apache.org/jira/browse/HADOOP-17159
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: security
>Reporter: Sandeep Guggilam
>Priority: Major
>
> Currently we have a relogin() method in UGI which attempts to login if there 
> is no login attempted in the last 10 minutes or configured amount of time
> We should also have provision for doing a forceful relogin irrespective of 
> the time window that the client can choose to use it if needed . Consider the 
> below scenario:
>  # SASL Server is reimaged and new keytabs are fetched with refreshing the 
> password
>  # SASL client connection to the server would fail when it tries with the 
> cached service ticket
>  # We should try to logout to clear the service tickets in cache and then try 
> to login back in such scenarios. But since the current relogin() doesn't 
> guarantee a login, it could cause an issue
>  # A forceful relogin in this case would help after logout
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-16998) WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException

2020-07-17 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17159757#comment-17159757
 ] 

Mingliang Liu commented on HADOOP-16998:


Yes [~ayushtkn]. I have updated the "fixed versions". Thanks!

> WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException
> --
>
> Key: HADOOP-16998
> URL: https://issues.apache.org/jira/browse/HADOOP-16998
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HADOOP-16998.patch
>
>
> During HFile creation, when close() is called on the OutputStream at the end, 
> there is some pending data to be flushed. When this flush happens, an 
> Exception is thrown back from Storage. The Azure-storage SDK layer will throw 
> back an IOE. (Even if it is a StorageException thrown from the Storage, the 
> SDK converts it to an IOE.) But in HBase, we end up getting an 
> IllegalArgumentException, which causes the RS to get aborted. If we got back 
> an IOE, the flush would get retried instead of aborting the RS.
> The reason is this:
> NativeAzureFsOutputStream uses the Azure-storage SDK's BlobOutputStreamInternal. 
> But the BlobOutputStreamInternal is wrapped within a SyncableDataOutputStream, 
> which is a FilterOutputStream. During the close op, NativeAzureFsOutputStream 
> calls close on SyncableDataOutputStream, which uses the below method from 
> FilterOutputStream
> {code}
> public void close() throws IOException {
>   try (OutputStream ostream = out) {
>   flush();
>   }
> }
> {code}
> Here the flush call causes an IOE to be thrown. The implicit finally then 
> issues a close call on ostream (which is an instance of BlobOutputStreamInternal).
> When BlobOutputStreamInternal#close() is called, if any exception has already 
> occurred on that stream, it will throw back the same 
> Exception
> {code}
> public synchronized void close() throws IOException {
>   try {
>   // if the user has already closed the stream, this will throw a 
> STREAM_CLOSED exception
>   // if an exception was thrown by any thread in the 
> threadExecutor, realize it now
>   this.checkStreamState();
>   ...
> }
> private void checkStreamState() throws IOException {
>   if (this.lastError != null) {
>   throw this.lastError;
>   }
> }
> {code}
> So here both the try and finally blocks get Exceptions, and Java uses 
> Throwable#addSuppressed().
> Within this method, if both Exceptions are the same object, it throws back an 
> IllegalArgumentException
> {code}
> public final synchronized void addSuppressed(Throwable exception) {
>   if (exception == this)
>  throw new 
> IllegalArgumentException(SELF_SUPPRESSION_MESSAGE, exception);
>   
> }
> {code}
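The self-suppression path is easy to reproduce outside Azure; a minimal, self-contained sketch using only the JDK (nothing here is WASB code):

{code:java}
// The body and close() throw the *same* exception object, so try-with-resources
// calls e.addSuppressed(e) and the JDK raises IllegalArgumentException instead.
import java.io.IOException;
import java.io.OutputStream;

public class SelfSuppressionDemo {
  static final IOException SAME = new IOException("flush failed");

  public static void main(String[] args) {
    try (OutputStream out = new OutputStream() {
      @Override public void write(int b) throws IOException { throw SAME; }
      @Override public void flush() throws IOException { throw SAME; }
      @Override public void close() throws IOException { throw SAME; }
    }) {
      out.flush();
    } catch (Exception e) {
      // Prints: java.lang.IllegalArgumentException: Self-suppression not permitted
      System.out.println(e);
    }
  }
}
{code}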



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16998) WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException

2020-07-17 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-16998:
---
Fix Version/s: 3.4.0

> WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException
> --
>
> Key: HADOOP-16998
> URL: https://issues.apache.org/jira/browse/HADOOP-16998
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HADOOP-16998.patch
>
>
> During HFile creation, when close() is called on the OutputStream at the end, 
> there is some pending data to be flushed. When this flush happens, an 
> Exception is thrown back from Storage. The Azure-storage SDK layer will throw 
> back an IOE. (Even if it is a StorageException thrown from the Storage, the 
> SDK converts it to an IOE.) But in HBase, we end up getting an 
> IllegalArgumentException, which causes the RS to get aborted. If we got back 
> an IOE, the flush would get retried instead of aborting the RS.
> The reason is this:
> NativeAzureFsOutputStream uses the Azure-storage SDK's BlobOutputStreamInternal. 
> But the BlobOutputStreamInternal is wrapped within a SyncableDataOutputStream, 
> which is a FilterOutputStream. During the close op, NativeAzureFsOutputStream 
> calls close on SyncableDataOutputStream, which uses the below method from 
> FilterOutputStream
> {code}
> public void close() throws IOException {
>   try (OutputStream ostream = out) {
>   flush();
>   }
> }
> {code}
> Here the flush call causes an IOE to be thrown. The implicit finally then 
> issues a close call on ostream (which is an instance of BlobOutputStreamInternal).
> When BlobOutputStreamInternal#close() is called, if any exception has already 
> occurred on that stream, it will throw back the same 
> Exception
> {code}
> public synchronized void close() throws IOException {
>   try {
>   // if the user has already closed the stream, this will throw a 
> STREAM_CLOSED exception
>   // if an exception was thrown by any thread in the 
> threadExecutor, realize it now
>   this.checkStreamState();
>   ...
> }
> private void checkStreamState() throws IOException {
>   if (this.lastError != null) {
>   throw this.lastError;
>   }
> }
> {code}
> So here both the try and finally blocks get Exceptions, and Java uses 
> Throwable#addSuppressed().
> Within this method, if both Exceptions are the same object, it throws back an 
> IllegalArgumentException
> {code}
> public final synchronized void addSuppressed(Throwable exception) {
>   if (exception == this)
>  throw new 
> IllegalArgumentException(SELF_SUPPRESSION_MESSAGE, exception);
>   
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-16998) WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException

2020-07-17 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-16998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-16998:
---
Hadoop Flags: Reviewed

> WASB : NativeAzureFsOutputStream#close() throwing IllegalArgumentException
> --
>
> Key: HADOOP-16998
> URL: https://issues.apache.org/jira/browse/HADOOP-16998
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/azure
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HADOOP-16998.patch
>
>
> During HFile creation, when close() is called on the OutputStream at the end, 
> there is some pending data to be flushed. When this flush happens, an 
> Exception is thrown back from Storage. The Azure-storage SDK layer will throw 
> back an IOE. (Even if it is a StorageException thrown from the Storage, the 
> SDK converts it to an IOE.) But in HBase, we end up getting an 
> IllegalArgumentException, which causes the RS to get aborted. If we got back 
> an IOE, the flush would get retried instead of aborting the RS.
> The reason is this:
> NativeAzureFsOutputStream uses the Azure-storage SDK's BlobOutputStreamInternal. 
> But the BlobOutputStreamInternal is wrapped within a SyncableDataOutputStream, 
> which is a FilterOutputStream. During the close op, NativeAzureFsOutputStream 
> calls close on SyncableDataOutputStream, which uses the below method from 
> FilterOutputStream
> {code}
> public void close() throws IOException {
>   try (OutputStream ostream = out) {
>   flush();
>   }
> }
> {code}
> Here the flush call causes an IOE to be thrown. The implicit finally then 
> issues a close call on ostream (which is an instance of BlobOutputStreamInternal).
> When BlobOutputStreamInternal#close() is called, if any exception has already 
> occurred on that stream, it will throw back the same 
> Exception
> {code}
> public synchronized void close() throws IOException {
>   try {
>   // if the user has already closed the stream, this will throw a 
> STREAM_CLOSED exception
>   // if an exception was thrown by any thread in the 
> threadExecutor, realize it now
>   this.checkStreamState();
>   ...
> }
> private void checkStreamState() throws IOException {
>   if (this.lastError != null) {
>   throw this.lastError;
>   }
> }
> {code}
> So here both the try and finally blocks get Exceptions, and Java uses 
> Throwable#addSuppressed().
> Within this method, if both Exceptions are the same object, it throws back an 
> IllegalArgumentException
> {code}
> public final synchronized void addSuppressed(Throwable exception) {
>   if (exception == this)
>  throw new 
> IllegalArgumentException(SELF_SUPPRESSION_MESSAGE, exception);
>   
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-10 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132911#comment-17132911
 ] 

Mingliang Liu commented on HADOOP-17059:


{quote}
 dropping Java 7 support in Hadoop 2.x is the best choice 
{quote}

I'm totally with that. But branch-2.10 is assumed to be the last 2.x release, 
and I guess it will live for a very long time. Some downstream projects may not 
like it if we upgrade the JDK version in a minor release (e.g. 2.10.1) :)

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 2.9.3, 3.2.2, 2.10.1, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HADOOP-17059-branch-2.10-00.patch, HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-10 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17059:
---
Fix Version/s: 2.10.1
   2.9.3
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed the branch-2 patch into the {{branch-2.10}} and {{branch-2.9}} 
branches. Thanks [~hemanthboyina] for filing and fixing. Thanks [~ayushtkn] for 
the discussion.

Thanks [~aajisaka] for the review and help with the commit. Is there anything I 
can do next time when {{branch-2.10}} has PreCommit problems like this? I 
usually retry and hope for the best. :)

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 2.9.3, 3.2.2, 2.10.1, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HADOOP-17059-branch-2.10-00.patch, HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128658#comment-17128658
 ] 

Mingliang Liu commented on HADOOP-17059:


{code}
Plugin org.codehaus.mojo:native-maven-plugin:1.0-alpha-8 or one of its 
dependencies could not be resolved: Failed to read artifact descriptor for 
org.codehaus.mojo:native-maven-plugin:jar:1.0-alpha-8: Could not transfer 
artifact org.codehaus.mojo:native-maven-plugin:pom:1.0-alpha-8 from/to central 
(https://repo.maven.apache.org/maven2): Received fatal alert: protocol_version 
-> [Help 1]
{code}
Something seems wrong with the precommit build system. [~aajisaka], have you 
seen this? Thanks,

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HADOOP-17059-branch-2.10-00.patch, HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128592#comment-17128592
 ] 

Mingliang Liu commented on HADOOP-17059:


+1 on  [^HADOOP-17059-branch-2.10-00.patch]. Will commit after we have a good 
PreCommit run. Thanks [~hemanthboyina] for updating.

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HADOOP-17059-branch-2.10-00.patch, HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17047) TODO comments exist in trunk while the related issues are already fixed.

2020-06-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17047:
---
Fix Version/s: 3.1.5
   3.3.1
   2.10.1
   3.2.2
   2.9.3

Backported into other branches; see the fixed versions.

> TODO comments exist in trunk while the related issues are already fixed.
> 
>
> Key: HADOOP-17047
> URL: https://issues.apache.org/jira/browse/HADOOP-17047
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Rungroj Maipradit
>Assignee: Rungroj Maipradit
>Priority: Trivial
> Fix For: 2.9.3, 3.2.2, 2.10.1, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HADOOP-17047.001.patch, HADOOP-17047.001.patch, 
> HADOOP-17047.002.patch, HADOOP-17047.003.patch
>
>
> In a research project, we analyzed the source code of Hadoop looking for 
> comments with on-hold SATDs (self-admitted technical debt) that could be 
> fixed already. An on-hold SATD is a TODO/FIXME comment blocked by an issue. 
> If this blocking issue is already resolved, the related todo can be 
> implemented (or sometimes it is already implemented, but the comment is left 
> in the code, causing confusion). As we found a few instances of these in 
> Hadoop, we decided to collect them in a ticket, so they are documented and 
> can be addressed sooner or later.
> A list of code comments that mention already closed issues.
>  * A code comment suggests making the setJobConf method deprecated along with 
> the mapred package (HADOOP-1230). HADOOP-1230 has been closed a long time ago, 
> but the method is still not annotated as deprecated.
> {code:java}
>  /**
>* This code is to support backward compatibility and break the compile  
>* time dependency of core on mapred.
>* This should be made deprecated along with the mapred package 
> HADOOP-1230. 
>* Should be removed when mapred package is removed.
>*/ {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java#L88]
>  * A comment mentions that the return type of the getDefaultFileSystem method 
> should be changed to AFS when HADOOP-6223 is completed.
>  Indeed, this change was done in the related commit of HADOOP-6223: 
> ([https://github.com/apache/hadoop/commit/3f371a0a644181b204111ee4e12c995fc7b5e5f5#diff-cd86a2b9ce3efd2232c2ace0e9084508L395)]
>  Thus, the comment could be removed.
> {code:java}
> @InterfaceStability.Unstable /* return type will change to AFS once
> HADOOP-6223 is completed */
> {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileContext.java#L512]
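As a toy illustration of the first item, the annotation the comment asks for looks like this (the class and signature here are stand-ins, not ReflectionUtils' real code):

{code:java}
// Hedged, self-contained illustration only; not ReflectionUtils' actual code.
public class DeprecationSketch {
  /** Bridge kept for mapred backward compatibility; see HADOOP-1230. */
  @Deprecated
  static void setJobConf(Object target, Object conf) {
    // the real method wires conf into target via reflection
  }

  public static void main(String[] args) {
    // Callers outside this class would now see a deprecation warning.
    setJobConf(new Object(), new Object());
  }
}
{code}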



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17047) TODO comments exist in trunk while the related issues are already fixed.

2020-06-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17047:
---
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

Committed to {{trunk}}. Thanks [~rungroj] for filing and providing a patch. 
Thanks [~aajisaka] for review and discussion.

> TODO comments exist in trunk while the related issues are already fixed.
> 
>
> Key: HADOOP-17047
> URL: https://issues.apache.org/jira/browse/HADOOP-17047
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Rungroj Maipradit
>Assignee: Rungroj Maipradit
>Priority: Trivial
> Fix For: 3.4.0
>
> Attachments: HADOOP-17047.001.patch, HADOOP-17047.001.patch, 
> HADOOP-17047.002.patch, HADOOP-17047.003.patch
>
>
> In a research project, we analyzed the source code of Hadoop looking for 
> comments with on-hold SATDs (self-admitted technical debt) that could be 
> fixed already. An on-hold SATD is a TODO/FIXME comment blocked by an issue. 
> If this blocking issue is already resolved, the related todo can be 
> implemented (or sometimes it is already implemented, but the comment is left 
> in the code, causing confusion). As we found a few instances of these in 
> Hadoop, we decided to collect them in a ticket, so they are documented and 
> can be addressed sooner or later.
> A list of code comments that mention already closed issues.
>  * A code comment suggests making the setJobConf method deprecated along with 
> the mapred package (HADOOP-1230). HADOOP-1230 has been closed a long time ago, 
> but the method is still not annotated as deprecated.
> {code:java}
>  /**
>* This code is to support backward compatibility and break the compile  
>* time dependency of core on mapred.
>* This should be made deprecated along with the mapred package 
> HADOOP-1230. 
>* Should be removed when mapred package is removed.
>*/ {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java#L88]
>  * A comment mentions that the return type of the getDefaultFileSystem method 
> should be changed to AFS when HADOOP-6223 is completed.
>  Indeed, this change was done in the related commit of HADOOP-6223: 
> ([https://github.com/apache/hadoop/commit/3f371a0a644181b204111ee4e12c995fc7b5e5f5#diff-cd86a2b9ce3efd2232c2ace0e9084508L395)]
>  Thus, the comment could be removed.
> {code:java}
> @InterfaceStability.Unstable /* return type will change to AFS once
> HADOOP-6223 is completed */
> {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileContext.java#L512]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-08 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17059:
---
  Component/s: viewfs
Fix Version/s: 3.1.5
   3.4.0
   3.3.1
   3.2.2
 Hadoop Flags: Reviewed

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: viewfs
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Fix For: 3.2.2, 3.3.1, 3.4.0, 3.1.5
>
> Attachments: HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-08 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128498#comment-17128498
 ] 

Mingliang Liu commented on HADOOP-17059:


Committed to the {{trunk}}, {{branch-3.3}}, {{branch-3.2}} and {{branch-3.1}} 
branches. In {{branch-2.10}}, we cannot use Java 8 features like lambdas. 
[~hemanthboyina] Could you provide a separate patch for {{branch-2.10}}? It's 
mainly about the test.

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17064) Drop MRv1 binary compatibility in 4.0.0

2020-06-06 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17064:
---
Priority: Major  (was: Trivial)

> Drop MRv1 binary compatibility in 4.0.0
> ---
>
> Key: HADOOP-17064
> URL: https://issues.apache.org/jira/browse/HADOOP-17064
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Rungroj Maipradit
>Priority: Major
>
> A code comment suggests making the setJobConf method deprecated along with 
> the mapred package (HADOOP-1230). HADOOP-1230 has been closed a long time ago, but 
> the method is still not annotated as deprecated.
> {code:java}
>  /**
>* This code is to support backward compatibility and break the compile  
>* time dependency of core on mapred.
>* This should be made deprecated along with the mapred package 
> HADOOP-1230. 
>* Should be removed when mapred package is removed.
>*/ {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java#L88]
> From the previous discussion, it seems that this method is still required as 
> long as we ensure binary compatibility with MRv1. 
>  
> https://issues.apache.org/jira/browse/HADOOP-17047?focusedCommentId=17111702=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17111702
> Mingliang Liu suggested dropping MRv1 binary compatibility in 4.0.0
>  
> https://issues.apache.org/jira/browse/HADOOP-17047?focusedCommentId=17112442=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17112442



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Updated] (HADOOP-17064) Drop MRv1 binary compatibility in 4.0.0

2020-06-06 Thread Mingliang Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mingliang Liu updated HADOOP-17064:
---
Labels: incompatible  (was: )

> Drop MRv1 binary compatibility in 4.0.0
> ---
>
> Key: HADOOP-17064
> URL: https://issues.apache.org/jira/browse/HADOOP-17064
> Project: Hadoop Common
>  Issue Type: Sub-task
>Reporter: Rungroj Maipradit
>Priority: Major
>  Labels: incompatible
>
> A code comment suggests making the setJobConf method deprecated along with 
> the mapred package (HADOOP-1230). HADOOP-1230 has been closed a long time ago, but 
> the method is still not annotated as deprecated.
> {code:java}
>  /**
>* This code is to support backward compatibility and break the compile  
>* time dependency of core on mapred.
>* This should be made deprecated along with the mapred package 
> HADOOP-1230. 
>* Should be removed when mapred package is removed.
>*/ {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java#L88]
> From the previous discussion, it seems that this method is still required as 
> long as we ensure binary compatibility with MRv1. 
>  
> https://issues.apache.org/jira/browse/HADOOP-17047?focusedCommentId=17111702=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17111702
> Mingliang Liu suggested dropping MRv1 binary compatibility in 4.0.0
>  
> https://issues.apache.org/jira/browse/HADOOP-17047?focusedCommentId=17112442=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17112442



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17059) ArrayIndexOfboundsException in ViewFileSystem#listStatus

2020-06-06 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127423#comment-17127423
 ] 

Mingliang Liu commented on HADOOP-17059:


Thanks for reporting, [~hemanthboyina]. It should not throw 
{{ArrayIndexOutOfBoundsException}} here, so a fix is required.

The {{UGI::getPrimaryGroupName()}} method throws an exception if the primary 
group name is not found. I'm not sure if this IOE is too strict. In HADOOP-15167 
the proposed change is to use the name of the user itself as the group (in case 
of no primary group). I think that patch should go in. CC: [~brahmareddy]

Since HADOOP-15167 is still pending and it is actually targeting another aspect 
of this problem, I think we can commit this one first. [~ayushtkn] Thoughts?
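To sketch the defensive check under discussion (the fallback to the user's own name follows the HADOOP-15167 idea mentioned above; this is illustrative, not the committed patch):

{code:java}
import org.apache.hadoop.security.UserGroupInformation;

final class PrimaryGroupSketch {
  // Hedged sketch: avoid the AIOBE by falling back to the user's own name
  // when the UGI reports no groups at all.
  static String primaryGroup(UserGroupInformation ugi) {
    String[] groups = ugi.getGroupNames();
    return groups.length > 0 ? groups[0] : ugi.getShortUserName();
  }
}
{code}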

> ArrayIndexOfboundsException in ViewFileSystem#listStatus
> 
>
> Key: HADOOP-17059
> URL: https://issues.apache.org/jira/browse/HADOOP-17059
> Project: Hadoop Common
>  Issue Type: Bug
>Reporter: hemanthboyina
>Assignee: hemanthboyina
>Priority: Major
> Attachments: HADOOP-17059.001.patch
>
>
> In ViewFileSystem#listStatus, we get the group names of the UGI. If the group 
> names don't exist, it will throw an AIOBE
> {code:java}
> else {
>   result[i++] = new FileStatus(0, true, 0, 0,
> creationTime, creationTime, PERMISSION_555,
> ugi.getShortUserName(), ugi.getGroupNames()[0],
> new Path(inode.fullPath).makeQualified(
> myUri, null));
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-17047) TODO comments exist in trunk while the related issues are already fixed.

2020-06-06 Thread Mingliang Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17127421#comment-17127421
 ] 

Mingliang Liu commented on HADOOP-17047:


The error seems unrelated, and I have triggered another PreCommit run.

> TODO comments exist in trunk while the related issues are already fixed.
> 
>
> Key: HADOOP-17047
> URL: https://issues.apache.org/jira/browse/HADOOP-17047
> Project: Hadoop Common
>  Issue Type: Improvement
>Reporter: Rungroj Maipradit
>Assignee: Rungroj Maipradit
>Priority: Trivial
> Attachments: HADOOP-17047.001.patch, HADOOP-17047.001.patch, 
> HADOOP-17047.002.patch
>
>
> In a research project, we analyzed the source code of Hadoop looking for 
> comments with on-hold SATDs (self-admitted technical debt) that could be 
> fixed already. An on-hold SATD is a TODO/FIXME comment blocked by an issue. 
> If this blocking issue is already resolved, the related todo can be 
> implemented (or sometimes it is already implemented, but the comment is left 
> in the code, causing confusion). As we found a few instances of these in 
> Hadoop, we decided to collect them in a ticket, so they are documented and 
> can be addressed sooner or later.
> A list of code comments that mention already closed issues.
>  * A code comment suggests making the setJobConf method deprecated along with 
> the mapred package (HADOOP-1230). HADOOP-1230 has been closed a long time ago, 
> but the method is still not annotated as deprecated.
> {code:java}
>  /**
>* This code is to support backward compatibility and break the compile  
>* time dependency of core on mapred.
>* This should be made deprecated along with the mapred package 
> HADOOP-1230. 
>* Should be removed when mapred package is removed.
>*/ {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/ReflectionUtils.java#L88]
>  * A comment mentions that the return type of the getDefaultFileSystem method 
> should be changed to AFS when HADOOP-6223 is completed.
>  Indeed, this change was done in the related commit of HADOOP-6223: 
> ([https://github.com/apache/hadoop/commit/3f371a0a644181b204111ee4e12c995fc7b5e5f5#diff-cd86a2b9ce3efd2232c2ace0e9084508L395)]
>  Thus, the comment could be removed.
> {code:java}
> @InterfaceStability.Unstable /* return type will change to AFS once
> HADOOP-6223 is completed */
> {code}
> Comment location: 
> [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileContext.java#L512]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org


