[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-12-02 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275613#comment-16275613
 ] 

Ping Liu commented on HADOOP-14600:
---

This is great to hear!  Finally, this gets in.  Thanks [~chris.douglas]!

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
> Reporter: Steve Loughran
> Assignee: Ping Liu
> Fix For: 3.1.0
>
> Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, 
> HADOOP-14600.003.patch, HADOOP-14600.004.patch, HADOOP-14600.005.patch, 
> HADOOP-14600.006.patch, HADOOP-14600.007.patch, HADOOP-14600.008.patch, 
> HADOOP-14600.009.patch, TestRawLocalFileSystemContract.java, 
> command_line_test_result__linux.txt, command_line_test_result__windows.txt
>
>
> Reported in SPARK-21137. A {{FileSystem.listStatus}} call really crawls 
> against the local FS, because the {{FileStatus.getPermission}} call forces 
> {{DeprecatedRawLocalFileStatus}} to spawn a process to read the real UGI 
> values.
> That is: what is a field lookup or even a no-op for every other FS is a 
> process exec/spawn on the local FS, with all the costs. This gets expensive 
> if you have many files.
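
To make the cost difference in the description concrete, here is a minimal sketch that reads a file's permission string two ways: in-process through {{java.nio.file}}, and by spawning a child process. It is not Hadoop's code. The class and method names are made up for illustration, {{ls -ld}} is only an assumed stand-in for whatever command the local FS shells out to, and the sketch assumes a POSIX file system.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical illustration class; not part of Hadoop.
class PermissionLookupSketch {

  /** In-process lookup: a single attribute read, no child process. POSIX only. */
  static String permissionsInProcess(Path p) throws IOException {
    return PosixFilePermissions.toString(Files.getPosixFilePermissions(p));
  }

  /** Process-based lookup: spawns "ls -ld" (assumed command) and parses its first column. */
  static String permissionsViaProcess(Path p) throws IOException, InterruptedException {
    Process proc = new ProcessBuilder("ls", "-ld", p.toString())
        .redirectErrorStream(true)
        .start();
    String line;
    try (BufferedReader r = new BufferedReader(
        new InputStreamReader(proc.getInputStream(), StandardCharsets.UTF_8))) {
      line = r.readLine();
    }
    proc.waitFor();
    if (line == null || line.length() < 10) {
      throw new IOException("Unexpected ls output: " + line);
    }
    // Output looks like "-rw------- 1 user group ..."; skip the file-type
    // character and keep the nine permission characters.
    return line.substring(1, 10);
  }

  public static void main(String[] args) throws Exception {
    Path f = Files.createTempFile("perm-sketch", ".tmp");
    try {
      System.out.println("in-process:  " + permissionsInProcess(f));
      System.out.println("via process: " + permissionsViaProcess(f));
    } finally {
      Files.delete(f);
    }
  }
}
```

Both methods should agree for an ordinary file, but the second one pays for a process spawn on every call, which is the per-file cost the description refers to when a directory holds many files.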



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-30 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273970#comment-16273970
 ] 

Ping Liu edited comment on HADOOP-14600 at 12/1/17 5:47 AM:


Just verified.  There is no error!

I had omitted {{-Pnative}} from the Maven build; that profile is required to 
generate the JNI native code.  After rebuilding with {{-Pnative}}, things look 
good.  I tried the patch in IntelliJ on both Windows and Linux and confirmed 
that execution reaches the test cases.

I also tested from the command line.  I am attaching the command-line test 
results from both Windows and Linux (see attachments: 
{{command_line_test_result__linux.txt}}, 
{{command_line_test_result__windows.txt}}).

cc: [~chris.douglas], [~steve_l]


was (Author: myapachejira):
Just verified.  There is no error!

I had omitted {{-Pnative}} from the Maven build; that profile is required to 
generate the JNI native code.  Now things look good.  I tried the patch in 
IntelliJ on both Windows and Linux and confirmed that execution reaches the 
test cases.

I also tested from the command line.  I am attaching the command-line test 
results from both Windows and Linux (see attachments: 
{{command_line_test_result__linux.txt}}, 
{{command_line_test_result__windows.txt}}).

cc: [~chris.douglas], [~steve_l]




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-30 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273970#comment-16273970
 ] 

Ping Liu commented on HADOOP-14600:
---

Just verified.  There is no error!

I had omitted {{-Pnative}} from the Maven build; that profile is required to 
generate the JNI native code.  Now things look good.  I tried the patch in 
IntelliJ on both Windows and Linux and confirmed that execution reaches the 
test cases.

I also tested from the command line.  I am attaching the command-line test 
results from both Windows and Linux (see attachments: 
{{command_line_test_result__linux.txt}}, 
{{command_line_test_result__windows.txt}}).

cc: [~chris.douglas], [~steve_l]




[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-30 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: command_line_test_result__linux.txt
command_line_test_result__windows.txt




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-27 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16268098#comment-16268098
 ] 

Ping Liu commented on HADOOP-14600:
---

Yes, Chris.  I am verifying the patch.  I found an issue tonight in my Linux 
environment.  In TestRawLocalFileSystemContract.testPermission(), the native 
call failed with {{java.lang.UnsatisfiedLinkError: 
org.apache.hadoop.io.nativeio.NativeIO$POSIX.stat(Ljava/lang/String;)Lorg/apache/hadoop/io/nativeio/NativeIO$POSIX$Stat;}}.
  I'll look into it further.  I suspect it is due to my last change.  I'll come 
back with an update.
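
For readers unfamiliar with this error: calling a {{native}} method when no loaded library provides it raises {{java.lang.UnsatisfiedLinkError}}. A later comment in this thread traces the failure to a build made without {{-Pnative}}. The sketch below reproduces the mechanism in isolation and falls back to a pure-Java lookup. The class, the {{nativeStat}} method and the fallback are hypothetical and are not how Hadoop's NativeIO is structured; a POSIX file system is assumed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

// Hypothetical illustration class; not part of Hadoop.
class NativeFallbackSketch {

  // Hypothetical native method. No library implementing it is ever loaded
  // in this sketch, so every call throws UnsatisfiedLinkError.
  private static native String nativeStat(String path);

  /** Reports whether the (hypothetical) native implementation is usable. */
  static boolean nativeAvailable() {
    try {
      nativeStat(".");
      return true;
    } catch (UnsatisfiedLinkError e) {
      return false;
    }
  }

  /** Tries the native path first and falls back to java.nio when it is missing. */
  static String permissions(Path p) throws IOException {
    try {
      return nativeStat(p.toString());
    } catch (UnsatisfiedLinkError e) {
      // Native code was not built or not loaded: use the pure-Java lookup.
      return PosixFilePermissions.toString(Files.getPosixFilePermissions(p));
    }
  }

  public static void main(String[] args) throws IOException {
    Path f = Files.createTempFile("native-sketch", ".tmp");
    try {
      System.out.println("native available: " + nativeAvailable());
      System.out.println("permissions:      " + permissions(f));
    } finally {
      Files.delete(f);
    }
  }
}
```

In a test suite, a check like {{nativeAvailable()}} can be used to skip native-only test cases instead of letting them fail on builds that lack the native library.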




[jira] [Comment Edited] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-22 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263821#comment-16263821
 ] 

Ping Liu edited comment on HADOOP-14600 at 11/23/17 5:01 AM:
-

[~chris.douglas] Finally, this round is green.  That's great!  Do you still 
need me to verify it?  If so, I will try to work on it this weekend.


was (Author: myapachejira):
[~chris.douglas] Finally, this round is green.  That's great!  Do you still 
need me to verify it?  If so, I need to learn how to use "git apply" :) 




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-22 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263821#comment-16263821
 ] 

Ping Liu commented on HADOOP-14600:
---

[~chris.douglas] Finally, this round is green.  That's great!  Do you still 
need me to verify it?  If so, I need to learn how to use "git apply" :) 




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-17 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16257912#comment-16257912
 ] 

Ping Liu commented on HADOOP-14600:
---

[~chris.douglas] excellent catch!  Your correction is perfect.  Yes, 
{{recursive}} is a boolean that tells {{Shell.getSetPermissionCommand()}} 
whether to build the "set permission" command in recursive or non-recursive 
form.  Currently only the non-recursive mode is used, but the recursive mode 
is available if it is needed in the future.
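
As a rough illustration of what a {{recursive}} flag on such a command builder does, here is a minimal sketch. It is a stand-in written for this thread, not the body of {{Shell.getSetPermissionCommand()}}: the class name is hypothetical, only the Unix {{chmod}} form is shown, and the sketch assumes the caller appends the target path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not Hadoop's Shell.
class SetPermissionCommandSketch {

  /**
   * Builds a chmod command line. When recursive is true, "-R" is added so
   * the mode applies to a whole directory tree. The target path is assumed
   * to be appended by the caller.
   */
  static String[] setPermissionCommand(String perm, boolean recursive) {
    List<String> cmd = new ArrayList<>();
    cmd.add("chmod");
    if (recursive) {
      cmd.add("-R");
    }
    cmd.add(perm);
    return cmd.toArray(new String[0]);
  }

  public static void main(String[] args) {
    // prints [chmod, 755]
    System.out.println(Arrays.toString(setPermissionCommand("755", false)));
    // prints [chmod, -R, 755]
    System.out.println(Arrays.toString(setPermissionCommand("755", true)));
  }
}
```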

The other changes look good.  Thanks for the detailed changes!  My only 
question is about the number of spaces used for indentation.  I notice you are 
using two spaces in {{StatUtils}}.  I used two spaces before, since I think it 
saves space, but was told that four spaces should be used for readability 
because two-space indentation looks busy.  I also just read the Oracle/Sun code 
conventions, which say indentation should be four spaces.  Other than 
indentation, everything else looks good.  Thanks Chris!




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-05 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239865#comment-16239865
 ] 

Ping Liu commented on HADOOP-14600:
---

This time the unit test run fails on a different test case 
(TestZKFailoverController.testGracefulFailover).  But again it is unrelated to 
the patch.
cc: [~chris.douglas], [~ste...@apache.org]




[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-04 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.006.patch




[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-04 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: (was: HADOOP-14600.006.patch)




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-04 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16239347#comment-16239347
 ] 

Ping Liu commented on HADOOP-14600:
---

[~chris.douglas] You are right.  {{path}} is not tied to the return value.  It 
should be released regardless of the value of {{ret}}.  I just attached an 
updated patch.




[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-04 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.006.patch




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-11-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235082#comment-16235082
 ] 

Ping Liu commented on HADOOP-14600:
---

Can someone have a look at this?  As I said before, the unit test failure is 
unrelated to this fix.

[~ste...@apache.org]

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
>Priority: Major
> Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, 
> HADOOP-14600.003.patch, HADOOP-14600.004.patch, HADOOP-14600.005.patch, 
> TestRawLocalFileSystemContract.java
>
>
> Reported in SPARK-21137. A {{FileSystem.listStatus}} call really crawls 
> against the local FS, because the {{FileStatus.getPermission}} call forces 
> {{DeprecatedRawLocalFileStatus}} to spawn a process to read the real UGI 
> values.
> That is: what is a field lookup or even a no-op for every other FS is a 
> process exec/spawn on the local FS, with all the costs. This gets expensive 
> if you have many files.






[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-10-04 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.005.patch

* Moved the deprecation suppression annotation in TestRawLocalFileSystemContract 
up to class level; hopefully this will clear the javac warning.
* Again, the JUnit test failures in KDiag.java and TestRaceWhenRelogin.java 
are unrelated to HADOOP-14600.




[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-10-02 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.004.patch

=> Fixed license issue
- added ASF license header to the newly added Helper.java

=> Fixed deprecation issue
- added a suppression annotation in 
TestRawLocalFileSystemContract.testPermission()

=> Fixed leftover test directory and file, which should have been cleaned up
- changed to use the test base directory (getTestBaseDir()) in 
TestRawLocalFileSystemContract.testPermission()
- this directory is automatically cleaned up in tearDown()

Please note that there are three test failures in total, as follows.

{quote}Tests in error: 
  TestKDiag.testKeytabAndPrincipal:162->kdiag:119 ? KerberosAuth Login failure 
f...
  TestKDiag.testFileOutput:186->kdiag:119 ? KerberosAuth Login failure for 
user:...
  TestKDiag.testLoadResource:196->kdiag:119 ? KerberosAuth Login failure for 
use...

Tests run: 3927, Failures: 0, Errors: 3, Skipped: 206{quote}

The failures are all from TestKDiag, which is not related to HADOOP-14600.




[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-29 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16186875#comment-16186875
 ] 

Ping Liu commented on HADOOP-14600:
---

[~ste...@apache.org]

Thanks for your detailed code review!  Really appreciate it.

I just finished making the recommended changes and attached 
*HADOOP-14600.003.patch*.

Details follow.

{{Helper.java (new class):}}

To better support unit testing, I added more testing mechanisms and moved 
the logic into a new class, Helper.java.  Now I can not only check 
permissions but also change them.  I didn't find an existing place to put 
these utilities, so I added this class.  If anyone wants to add other 
common utility methods, the Helper class can be the place.

{{TestRawLocalFileSystemContract.java:}}

With this addition, I improved the tests by adding testPermission() to 
TestRawLocalFileSystemContract, where both loadPermissionInfoByNativeIO() 
and loadPermissionInfoByNonNativeIO() can now be tested directly, as you 
suggested.  I can test it on Linux.  But on Windows, changing the sticky bit 
doesn't take effect.  I guess Windows probably doesn't have a sticky bit 
feature.

{{TestNativeIO.java:}}

Similarly, doStatTest() was simplified by calling the Helper method.

I also improved testStatOnError() by using LambdaTestUtils, improved 
testMultiThreadedStat() by using ExecutorService and Future, and added 
testMultiThreadedStatOnError().
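
The ExecutorService/Future pattern mentioned above can be sketched roughly as follows. This is not the code in TestNativeIO.java: it uses java.nio.file attribute reads as a stand-in for the native stat call, and the class and method names are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; java.nio stands in for the native stat() call. */
final class ConcurrentStatSketch {
  private ConcurrentStatSketch() {}

  /** Stats the same file from several threads and returns the size each one saw. */
  static List<Long> statConcurrently(Path file, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (int i = 0; i < threads; i++) {
        Callable<Long> task =
            () -> Files.readAttributes(file, BasicFileAttributes.class).size();
        futures.add(pool.submit(task));
      }
      List<Long> sizes = new ArrayList<>();
      for (Future<Long> f : futures) {
        // get() rethrows a worker failure as ExecutionException, so an error
        // in any thread fails the caller instead of being silently lost.
        sizes.add(f.get(30, TimeUnit.SECONDS));
      }
      return sizes;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    Path tmp = Files.createTempFile("stat-sketch", ".bin");
    try {
      Files.write(tmp, new byte[42]);
      System.out.println(statConcurrently(tmp, 8));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

Collecting results through Future.get() is what makes an "on error" variant straightforward: pointing the same helper at a missing file surfaces the failure as an ExecutionException in the test thread.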

{{RawLocalFileSystem.java:}}

A minor issue I found in loadPermissionInfo() is that on Windows the group is 
returned together with its domain, so we need to remove the domain.  This is 
the same as removing the domain from domain/user in the existing code.
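
A minimal sketch of the domain stripping described above, assuming the Windows name arrives in the form DOMAIN\name with a backslash separator. The helper name is hypothetical, and this is not the code from the patch.

```java
/** Hypothetical helper for illustration only; not code from the patch. */
final class DomainNames {
  private DomainNames() {}

  /** Returns the part after the first backslash, or the input unchanged. */
  static String stripDomain(String name) {
    int i = name.indexOf('\\');
    return i == -1 ? name : name.substring(i + 1);
  }

  public static void main(String[] args) {
    System.out.println(stripDomain("CORP\\Domain Users"));
    System.out.println(stripDomain("staff"));
  }
}
```

Applying the same function to both owner and group keeps the two values consistent, and names without a domain pass through untouched.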

Lastly, for {{NativeIO.c}}, FindFileOwnerAndPermission is not an MSDN function.  
I found it defined at line 811 of hadoop-common/src/main/winutils/libwinutils.c.

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, 
> HADOOP-14600.003.patch, TestRawLocalFileSystemContract.java
>
>



[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-29 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.003.patch

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, 
> HADOOP-14600.003.patch, TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-09 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160168#comment-16160168
 ] 

Ping Liu commented on HADOOP-14600:
---

The changes have been made.  *HADOOP-14600.002.patch* is attached.  I also 
added unit tests.

CC: [~ste...@apache.org]

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-09 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.002.patch

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, HADOOP-14600.002.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-07 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157223#comment-16157223
 ] 

Ping Liu commented on HADOOP-14600:
---

Excellent comments.  I'm going to make the suggested changes.

None of the tests that timed out use file permissions; they test something 
else.  I should have clarified that.

Thanks Steve!

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-05 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154192#comment-16154192
 ] 

Ping Liu commented on HADOOP-14600:
---

I couldn't set up a local environment to run test-patch, so I went to the 
test results at 
https://builds.apache.org/job/PreCommit-HADOOP-Build/13153/testReport/ linked 
in the table above from [~hadoopqa].

I manually ran all of the following tests.

* TestSFTPFileSystem.testStatFile
* TestDNS.testDefaultDnsServer
* TestRaceWhenRelogin.test
* TestKDiag.testKeytabAndPrincipal
* TestKDiag.testFileOutput
* TestKDiag.testLoadResource

But none of these tests reaches the new method *loadPermissionInfoByNativeIO()* 
in *RawLocalFileSystem*.  *loadPermissionInfoByNativeIO()* is the new code that 
replaces the original *_loadPermissionInfo()_* and is the only change from the 
previous version.

Additionally, I ran "mvn test -Pnative -Dtest=allNative" in my local 
environment and got 3 failures and 5 errors.

Most of them were timeouts.  After I increased the timeout, the majority of the 
tests passed.  TestRPCWaitForProxy.testInterruptedWaitForProxy is the only one 
that still produces an error with the longer timeout.  However, running it 
manually didn't hit the breakpoint in *loadPermissionInfoByNativeIO()* either.

In summary, I didn't find any failing test case that involves the new method, 
*loadPermissionInfoByNativeIO()*.  Please let me know whether this is enough 
verification, or whether there are more tests to run and how to run them.

CC: [~hadoopqa]



> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151289#comment-16151289
 ] 

Ping Liu commented on HADOOP-14600:
---

Yeah, it must be automatically included with MinGW, Visual Studio, or some 
other installation.  Thanks for telling me that!  It's good to know all of 
this, especially when using the command line.

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151255#comment-16151255
 ] 

Ping Liu commented on HADOOP-14600:
---

Thanks [~aw]!  I have both Cygwin and Git, but neither has /usr/bin.  When I 
checked Program Files, I found another Git installation, just as you 
mentioned!  I can try that one.

But you are right.  I got lots of cuts and blood getting mvn test -Dtest=foo 
running on Windows.  I'll try to run test-patch on Linux and then come back to 
test the same scenario on Windows, probably with mvn test -Dtest=foo one test 
at a time as a workaround.

Thanks!

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151090#comment-16151090
 ] 

Ping Liu commented on HADOOP-14600:
---

I am now trying to run the patch test on my Windows machine.  It looks like 
dev-support/bin/test-patch is a Bash script and cannot be run on Windows.  Is 
there a guide on how to run it on Windows, or is the patch test not expected 
to be run on Windows?

CC: [~aw]

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150822#comment-16150822
 ] 

Ping Liu commented on HADOOP-14600:
---

Hi [~jzhuge], thanks for your help!

The patch and the test file are now attached.

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Updated] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ping Liu updated HADOOP-14600:
--
Attachment: HADOOP-14600.001.patch
TestRawLocalFileSystemContract.java

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>Assignee: Ping Liu
> Attachments: HADOOP-14600.001.patch, 
> TestRawLocalFileSystemContract.java
>
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150254#comment-16150254
 ] 

Ping Liu commented on HADOOP-14600:
---

Oops, looks like I don't have permission to attach files.  I'll see if I can 
request the permission from the mailing list.

> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>



[jira] [Commented] (HADOOP-14600) LocatedFileStatus constructor forces RawLocalFS to exec a process to get the permissions

2017-09-01 Thread Ping Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-14600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16150245#comment-16150245
 ] 

Ping Liu commented on HADOOP-14600:
---

I just followed [~steve_l]'s idea and added a native stat() implementation.  
Yes, it is similar to fstat(), but it doesn't need to open the file because it 
doesn't require a file descriptor.
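
As a rough illustration of reading owner, group, and permissions by path and in-process, here is a minimal sketch. The patch does this through a native stat() call; the sketch substitutes java.nio.file's PosixFileAttributes, which is a different mechanism, and its class and method names are hypothetical. It assumes a POSIX file system; on other platforms the attribute read throws UnsupportedOperationException.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFileAttributes;
import java.nio.file.attribute.PosixFilePermissions;

/** Hypothetical sketch; java.nio stands in for the patch's native stat(). */
final class InProcessStatSketch {
  private InProcessStatSketch() {}

  /** Returns "owner group rwxrwxrwx" for a path without starting a child process. */
  static String describe(Path path) throws IOException {
    // Works from a path rather than an open file, like stat(2) and unlike fstat(2).
    PosixFileAttributes attrs =
        Files.readAttributes(path, PosixFileAttributes.class);
    return attrs.owner().getName() + " " + attrs.group().getName() + " "
        + PosixFilePermissions.toString(attrs.permissions());
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("stat-sketch", ".txt");
    try {
      Files.setPosixFilePermissions(tmp, PosixFilePermissions.fromString("rw-r-----"));
      System.out.println(describe(tmp));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```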

There is no longer any need to spawn an extra process to gather the permission 
info.

I did some manual testing on both Windows 10 and Linux (Ubuntu on VirtualBox).  
The change appears to improve performance dramatically on both systems.

{noformat}
Windows
number of files   time (ms) without native IO   time (ms) with native IO
100               14274                         1234
150               19002                         1782
200               21865                         2250
500               timed out                     5125
1000              timed out                     9735
2000              timed out                     18875

Linux
number of files   time (ms) without native IO   time (ms) with native IO
100               4539                          1632
150               6137                          2031
200               7139                          2764
500               15566                         5292
1000              timed out                     7490
2000              timed out                     14040
{noformat}

The test is primitive, but it shows the improvement clearly enough.
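
To put a number on the improvement, the speedup for each run that completed can be computed from the timings above. This is only a small arithmetic sketch with a hypothetical class name. The Linux rows are read as two four-digit columns (for example 4539 and 1632), and rows that timed out are left out. Under that reading the ratios come out at roughly 10x on Windows and a little under 3x on Linux.

```java
/** Hypothetical sketch: speedup ratios from the timings reported above. */
final class SpeedupSketch {
  private SpeedupSketch() {}

  static double speedup(long beforeMs, long afterMs) {
    return (double) beforeMs / afterMs;
  }

  static void print(String os, long[][] rows) {
    for (long[] r : rows) {
      System.out.printf("%s, %d files: %.1fx%n", os, r[0], speedup(r[1], r[2]));
    }
  }

  public static void main(String[] args) {
    // Each row: {files, ms without native IO, ms with native IO}.
    long[][] windows = {{100, 14274, 1234}, {150, 19002, 1782}, {200, 21865, 2250}};
    long[][] linux = {
        {100, 4539, 1632}, {150, 6137, 2031}, {200, 7139, 2764}, {500, 15566, 5292}};
    print("Windows", windows);
    print("Linux", linux);
  }
}
```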

Attached is the patch file: *HADOOP-14600__Patch__20170901.txt*.

For the test, I added testListStatusForPerformance() to 
TestRawLocalFileSystem.java.  It is also attached above.
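
The body of testListStatusForPerformance() is not shown in this thread. As a hypothetical stand-in, a timing harness of the same general shape might look like the sketch below. It uses plain java.nio.file calls rather than Hadoop's FileSystem.listStatus, so its timings are not comparable to the numbers above, and all names are assumptions.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical sketch of a listing benchmark; not the test from the patch. */
final class ListingTimerSketch {
  private ListingTimerSketch() {}

  /** Creates n empty files in dir. */
  static void populate(Path dir, int n) throws IOException {
    for (int i = 0; i < n; i++) {
      Files.createFile(dir.resolve("file-" + i));
    }
  }

  /** Lists dir, reads each entry's attributes, and returns the entry count. */
  static int listWithAttributes(Path dir) throws IOException {
    int count = 0;
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path entry : entries) {
        Files.readAttributes(entry, BasicFileAttributes.class);
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) throws IOException {
    for (int n : new int[] {100, 150, 200, 500, 1000, 2000}) {
      Path dir = Files.createTempDirectory("listing-" + n + "-");
      try {
        populate(dir, n);
        long start = System.nanoTime();
        int seen = listWithAttributes(dir);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(seen + " files: " + elapsedMs + " ms");
      } finally {
        // Remove the files first so the directory itself can be deleted.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
          for (Path entry : entries) {
            Files.delete(entry);
          }
        }
        Files.delete(dir);
      }
    }
  }
}
```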


> LocatedFileStatus constructor forces RawLocalFS to exec a process to get the 
> permissions
> 
>
> Key: HADOOP-14600
> URL: https://issues.apache.org/jira/browse/HADOOP-14600
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 2.7.3
> Environment: file:// in a dir with many files
>Reporter: Steve Loughran
>